For instance say you have table:
- A with 1 million rows
- B with 10 million rows
- C with 100 million rows
So, doing a small range on table A and taking these results to filter out with the other tables I expect the join order to be
A, B, C
Yet, the mySQL optimizer in many cases will join the table in the order of
B, A, C.
This is wrong. I know that the range generated from A is smaller then the range generated from B.
To get around this I use STRAIGHT_JOIN in a global context
SELECT STRAIGHT_JOIN SQL_CALC_FOUND_ROWS A.*, B.*, C.* FROM A,B,C WHERE A.owner_id = 123 AND A.id=B.id AND C.range_id=3 AND C.a_id=B.id ORDER BY A.key_part_2 DESC LIMIT 0,100
How do I know this is the correct order?
Well, order is based on the smallest number of results that will return of the elements involved.
A by itself will return 10 rows
B by itself will return 100 rows
C by itself will return 1000 rows
So A's 10 rows reduces B to 10, then C will reduce A,B to 5 since out of A,B set only 5 rows intersect with the entire C set.
5.0 fix some major deficiencies with the optimizer, but it's far from perfect. Personally, I think its just marginally better then 4.1, although I have not played with it as much as 4.1's.
What would be cool is for mySQL to build a Machine Learning algorithm to make the optimizer learn this stuff over a period of time. Maybe this could be a new type of index that could be made to dynamically act like a BTREE for any desired question. Decision Trees may hold the answer, and new ways of keeping data organized. I think this would be a cool feature for mySQL, which they can use to get customers to pay a subscription fee for.