When these two sort definitions are put together in a single statement a filesort is produced.
Why do we want to avoid filesorts?
Filesorts are bad. 1st they tickle a thread based buffer called sort_buffer_size. Additionally filesorts reads the data twice, unless max_length_for_sort_data limit is reached and as a result the Filesort runs slower to reduce disk I/O. If you want filesorts to run faster at the expense of the disk increase the default max_length_for_sort_data. You can read the filesort algorithm here.
So, here is an example
CREATE TABLE `ABCD` (
`A` int(10) unsigned NOT NULL default '0',
`B` int(10) unsigned NOT NULL default '0',
`C` int(10) unsigned NOT NULL default '0',
`D` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`a`,`b`,`c`,`d`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> explain SELECT * FROM ABCD WHERE a=1 AND b=1 ORDER BY c DESC, d ASC\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ABCD
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: const,const
rows: 2
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
Notice the filesort? So how does one get around this filesort?
Well
Let's define some roles for columns C and D. C is the parent while D is the child.
We want all the latest parents (C)
We want all the oldest children (D)
We require pagination of all the PARENTS (show 10 parents per page) so Queries like this is PRODUCED
SELECT * FROM ABCD WHERE A=? AND B=? ORDER BY C DESC
explain SELECT * FROM ABCD WHERE a=1 AND b=1 ORDER BY c DESC LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ABCD
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: const,const
rows: 2
Extra: Using where; Using index
1 row in set (0.00 sec)
Now
FOREACH($C_parent as $i => $c_id) {
$C_parent[$i] = SELECT SQL_CALC_FOUND_ROWS * FROM ABCD WHERE A=? AND B=? AND C=$c_id ORDER BY D ASC LIMIT 1;
}
So, we changed 1 query into 11 queries (10 parents per page) to make the page load happen faster, by getting rid of the filesort.
What 11 queries is faster then 1? Yes, for this case it is. The reason is because filesorts are SLOOOOW, they chew up a lot of limited resources and they should be avoided. I've see filesorts take close to 50-60% of the query time.
2 comments:
Oh, for pete's sake, make a column E that is the largest # for C's type minus the value of C and have the index (or create an additional index) on that (a,b,e,d).
Yea that would work for ints, but utf8 strings not so much.
Post a Comment