The following shows the outline of a complex SELECT
statement to be executed:
SELECT ... FROM ...
WHERE cond AND field IN
(
SELECT field1
FROM table2
WHERE cond
GROUP BY ...
HAVING ...
)
GROUP BY ...
ORDER BY ...
The exact fields, tables, and conditions are not important as
they are usually determined by the business. What is important to
notice, however, is the use of the IN clause with another internal
SELECT statement. This can prove to be very expensive in
terms of query execution time. Depending on your specific data set, a much
faster approach may be to create a temporary table holding the results
from the internal SELECT, then use a greatly simplified
SELECT statement within the IN clause. For example:
SELECT ... FROM ...
WHERE cond AND field IN
( SELECT field1 FROM temp_table )
GROUP BY ...
ORDER BY ...
Now both SELECT statements execute many times faster than
the previous single SELECT statement.
In some tests that I ran on a client's data set, I even found
that the percentage of reduction in query execution time appeared to increase
as the size of the data set was increased. The largest data sets,
with the longest execution times, realized the largest percentage of time reduction.