Tuesday, April 12, 2011

Optimizing MySQL filesort with small limit.

"filesort" is the MySQL catchall algorithm for producing ordered results. (For an explanation on how it works, see Sergey Petrunias blog http://s.petrunia.net/blog/?p=24).

For small data sets, "filesort" simply uses a quicksort in memory. However, for large data sets we create temporary (sorted) files, which are later merged.

In the recent MySQL 5.6 release candidate (http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html) this has been optimized for the following kinds of queries (and subqueries):

SELECT ... FROM single_table ... ORDER BY non_index_column [DESC] LIMIT N [OFFSET M];

If there is room in memory for (N + M) rows, we create a bounded priority queue to hold the data. This means the ordered result can be produced with a single table scan, which saves both the writing/reading of temporary files and the merge step.
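As a concrete illustration (using the t1 table defined further down in this post), a query like the one below must fit 1100 rows in the sort buffer before the priority queue can be used, since both the limit and the offset count toward (N + M):

SELECT * FROM t1 ORDER BY f2 LIMIT 100 OFFSET 1000;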

How can you see it in action? Sorry, there is no EXPLAIN output for it (yet). What you can do currently is to inspect the status counters that are associated with filesort, and verify that they are all zero:

FLUSH STATUS;
SHOW SESSION STATUS LIKE 'Sort%';
<run your select ... limit query>
SHOW SESSION STATUS LIKE 'Sort%';
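On a typical server the 'Sort%' group covers Sort_merge_passes, Sort_range, Sort_rows and Sort_scan. Right after FLUSH STATUS they are all zero, and if the priority queue path was taken, the second SHOW should still report zeros. Illustrative output (the exact set of variables can differ between versions):

+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Sort_merge_passes | 0     |
| Sort_range        | 0     |
| Sort_rows         | 0     |
| Sort_scan         | 0     |
+-------------------+-------+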

What kind of speedup can you expect? That depends on your data set and your sort_buffer_size, of course. With the table and query shown below, about 20 million rows, and the default sort buffer size, execution time drops from about 40 seconds to about 10 seconds on my desktop machine:

CREATE TABLE t1(
  f0 int auto_increment PRIMARY KEY,
  f1 int,
  f2 varchar(200)
);

SELECT * FROM t1 ORDER BY f2 LIMIT 100;
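For completeness, one simple way to load a comparable amount of test data is to seed a single row and then keep doubling the table with INSERT ... SELECT. This is only a rough sketch for reproducing the experiment, not the exact data set used for the timings above:

-- Seed a single row.
INSERT INTO t1(f1, f2) VALUES (1, MD5(RAND()));

-- Each execution doubles the row count; repeat until the table
-- holds roughly 20 million rows (around 24 to 25 executions).
INSERT INTO t1(f1, f2)
  SELECT FLOOR(RAND() * 1000000), MD5(RAND()) FROM t1;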