Query Optimization
11 November 2010

I have a certain weekly report I run at work that has become increasingly slow as the tables continue to grow. I knew why this was happening and even knew where the problem lay - a nested select, but it was never high on my priority list. Having some time recently, and becoming increasingly annoyed at the length of time it took this one report to run, I had a go at making it better.

The query itself consisted of a union of two select statements. One of the unions had a curious left join:

select t1.number from table1
left join table2 t2 on t2.number = t1.number
where t2.number is null

The other side of the union used a nested select to achieve the same effect, which was to get all the rows out of table1 for which an entry did not exist in table2.

select t1.number from table1 t1
where t1.number not in (select number from table2)

The left join with a null where clause is an ingenious method to perform a "not in" statement. The nested select was slowing things exponentially as the tables grew. Implementing the join method on the other side of the union cut my query time from 35 seconds to 2.5 seconds. 94% optimization with that one little trick.