Home

Learn SQL

Along with regular expressions, SQL is one of those things I wish I had learned earlier in my career. It is extremely powerful and can make most data manipulation tasks simple. And chances are that the query engine of your favorite DBMS will do a much more efficient job of munging your data than you could ever hope to do.

But It's Just Another Language, Right?

For a good part of my career I had thought of SQL as just a language I would use to pull data or populate some data repository after I had done the REAL work with the data in the language of my choice. Luckily I had the good fortune to learn the errors of my ways from a co-worker. Now he may have gone a teensy weensy bit overboard in that he would fire up his SQL Query tool like most people would fire up a perl or python shell to do mundane tasks and hack on something, but what this did make pretty clear is that one could use the language and your various databsse tools to do a little bit more than yank a bunch of data off the database server.

On top of that, after having to do something beyond very mundane work with databases, it hit me that SQL is probably one of the very few real world applications that has anything to do with all that stuff they taught in my various mathematics classes. Since I don't regularly write compilers all the fun stuff about defining grammers, the pumping lemma, deterministic and nondeterministic automata, etc. are all still around in my head somewhere, but SQL is an actual real live application of set theory. There are, of course, omissions, extensions and the model falls apart in places as is the case with most real world implementations of an abstraction, but if you dig math beyond high school algebra at all and you work in the field of web or enterpise applications, this is one of the few ways you will be able to exercise your grey matter on the clock.

So How Should I Go About It?

The first thing I would recommend is not waiting until you are more than well over a decade into your career to become comfortable with SQL. Second, you do not need to aspire to become some super SQL guru. I've seen a fair share of SQL that still scares me silly.

After getting a better understanind of how SQL could be really useful, and coming to terms with the fact that a good chunk of my career is going to be spent pulling data from and writing data to a database, and seeing that some interesting questions were popping up about SQL in job interviews where I actually thought "Yeah, I want that job.", I picked myself up a copy of O'Reilly's SQL Cookbook. Generally speaking I've been well served by O'Reilly books and having done reasonably well in my discrete math class in college and not having forgotten everything they taught me there I don't think I needed to grab a more introductory SQL book that walked you through tables and joining stuff and what NULL is and what have you. I was interested in getting a good primer on how to write queries to do useful things and that did it for me. I still refer to that book pretty often. It is one of my more used books.

After that, for me, it was a matter of just hacking away on databases in a SQL editor. One of the more useful queries I've used is one to help map IDs from a table with duplicated data into something that has only one entry. Let's say you have a table that is full of information that is related to your customers. You have the company name, some other information and an autogenerated id. You realize that after years of entering data that there are duplicate records and you would like to get that pared down so you have a single id. Previous to my getting more comfortable with SQL I would have done a select all on the table and iterated through the records, entering data I cared about into a hash table with the key being the field I wanted to be unique, in this case the company name.

It turns out that with SQL you can do a lot of munging of this sort of data. For example, if you want to create a map of the company information to get a list of single ID's, you can do this:

select a.dd as tid, b.company, b.id from
(
select company, min(id) as dd  from Members group by company
) as a, Members as b
WHERE a.company = b.company
order by b.company

This query will return a list of the company names and IDs and the minimum ID associated with each company, allowing you to create a single id for each company. If you have data stored in other tables that needs to be updated and you want to normalize the databse, this map is very nice to have.

From this query you can do other related things. For example, if you want to get a simple list of company names and what the unique id will be:

SELECT distinct  a.dd AS tid, b.company
FROM (
select company, min(id) as dd  from Members group by company
) AS a, Members AS b
WHERE a.company = b.company

Just The Tip Of The Iceberg

Hopefully this gives you some faint feeling of the power and usefulness of being reasonably well versed in SQL. There is a TON of things that can be done with it, and there are many considerations that go into constructing queries and manipulating data. Different constructs perform differently on various DBMS's and different queries that return the same data can perform vastly differently. As with most things programming related you need to weigh the pros and cons of getting things done, readable code and code that performs well. This article is nothing more than a vain attempt to push you in the direction of becoming more familiar with the language.