So, you've probably heard that using
OR
s in your queries is heavy and should be avoided when possible. Here is a live example. We had the following query running on a medium sized table in MYSQL (around 200K rows):
SELECT count(*) FROM interactions p
WHERE (
(p.employer = 69 AND p.flag1 = 1)
OR
(p.employee = 69 AND p.flag2 = 0)
)
Let's think of
interactions
as a table holding the interaction between employers and employees, and this query should tell the user the total number of interactions where user number 69 is involved.
The query looked pretty innoccent on an idle MYSQL server taking around 200ms, but when the server got busy the query execution time was reaching 5-6 seconds.
Altough we did have indexes on
interactions.employer
and
interactions. employee
doing
EXPLAIN
showed that MYSQL was not using them.
After some digging and fiddling and trying we ended up with:
SELECT count(*) as count FROM (
SELECT 1 FROM interactions p
WHERE (p.employer = 69 AND p.flag1 = 1)
UNION ALL
SELECT 1 FROM privatemsg p
WHERE (p.employee = 69 AND p.flag2 = 0)
) t2
This query took on the busy server less then 250 milliseconds and less then 80ms on the idle server. Doing
EXPLAIN
showed that MYSQL was doing two queries, and using the correct index on each one.
After some more digging we noticed that if we
EXPLAIN
the
original query on the idle server the optimizer occasionally converts it to a
UNION
, but for some reason this did not always happen and in any case took about twice the time then the UNION query.
To sum up the results here is a simple table
Query type | Server activity | Query Time |
Using OR | Idle | 200ms |
Using OR | Busy | 5500ms |
Using UNION | Idle | 78ms |
Using UNION | Busy | 189ms |
Conclusion: always consider alternatives to
OR
, but make sure you check them well against real-time examples.
A Further Note: When timing queries in MYSQL alway use the optimizer hint
/*! SQL_NO_CACHE */
to make sure you are not getting results from the query cache (if you have one set up).