Hi Sergei, Thanks a lot for your answer, this was indeed extremely helpful!
- Do you really records_in_range() implementation like in (*) Yes.
- Are estimates of 10 and 20 rows close to reality? Not at all... The thing is: we can implement records_in_range not as a function with constant complexity and since a full table scan is not faster than a index range lookup in our storage engine (even if the whole table is contained in the range), we wanted to force MySQL to make always an index lookup. If I understand everything correctly, my error in thinking was, that I thought MySQL uses the information returned by records_in_range only to chose between index lookup and fulltable scan (since on a traditional disk architecture a full table scan will be faster than an index scan which scans the whole table). But this is obviously not the case. So we will have to implemented this records_in_range method. Do I understand this correct, that a logarithmic (with respect to the table size) run time will be ok for this method (InnoDB seems to walk two times through the index if I get the comments in the code correct)?
Thanks again and best regards Markus On Mon, May 14, 2012 at 11:20 AM, Sergei Petrunia <psergey@askmonty.org> wrote:
On Mon, May 14, 2012 at 08:40:59AM +0200, Markus Pilman wrote:
Hello all,
At ETH Zurich we are working on a new storage engine, that allows us to test several new architectures for transactional databases. So far we worked with MySQL, but we had massive performance issues. After some investigation we figured out, that MySQL generates different query plans for InnoDB than for our engine. One query which killed our performance was the following (this is a query from the TPC-W benchmark):
SELECT ol2.ol_i_id, SUM(ol2.ol_qty) AS sum_ol FROM order_line ol, order_line ol2, (SELECT o_id FROM orders ORDER BY o_date DESC LIMIT 10000) AS t WHERE ol.ol_o_id = t.o_id AND ol.ol_i_id = 10 AND ol2.ol_o_id = t.o_id AND ol2.ol_i_id <> 10 GROUP BY ol2.ol_i_id ORDER BY sum_ol DESC LIMIT 0,5
MySQL generated the following plan for InnoDB:
+----+-------------+------------+-------+-------------------------------+----------------+---------+-----------------+-------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+-------------------------------+----------------+---------+-----------------+-------+---------------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 10000 | Using temporary; Using filesort | | 1 | PRIMARY | ol | ref | orderline_o_id,orderline_i_id | orderline_o_id | 8 | t.o_id | 1 | Using where | | 1 | PRIMARY | ol2 | ref | orderline_o_id,orderline_i_id | orderline_o_id | 8 | tpcw.ol.OL_O_ID | 1 | Using where | | 2 | DERIVED | orders | index | NULL | orders_o_date | 4 | NULL | 10000 | Using index | +----+-------------+------------+-------+-------------------------------+----------------+---------+-----------------+-------+---------------------------------+
while it generated the following one for our storage engine:
+----+-------------+------------+-------+-------------------------------+----------------+---------+-------+-------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+-------------------------------+----------------+---------+-------+-------+----------------------------------------------+ | 1 | PRIMARY | ol | ref | orderline_o_id,orderline_i_id | orderline_i_id | 5 | const | 10 | Using where; Using temporary; Using filesort | | 1 | PRIMARY | ol2 | range | orderline_o_id,orderline_i_id | orderline_i_id | 5 | NULL | 20 | Using where; Using join buffer | | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 10000 | Using where; Using join buffer | | 2 | DERIVED | orders | index | NULL | orders_o_date | 4 | NULL | 10000 | | +----+-------------+------------+-------+-------------------------------+----------------+---------+-------+-------+----------------------------------------------+ It's difficult to make guesses just on EXPLAINs, but I will try: we see {table=ol, rows=10}, {table=ol2, rows=20} -- this makes me suspect that your storage engine has this function (I recall some skeleton storage engine implementation had this):
ha_rows records_in_range(...) { return 10; } (*)
The WHERE clause has: - ol.ol_i_id = 10 -- this producess ref acess on table `ol`, and the estimate of 10 rows comes from records_in_range() call.
- ol2.ol_i_id <> 10 -- the optimizer converts this to
(-inf < ol2.ol_i_id <10) OR ( 10< ol2.ol_i_id < +inf)
which is two ranges and we get an estimate of 20 rows.
Questions: - Do you really records_in_range() implementation like in (*)
- Are estimates of 10 and 20 rows close to reality?
The second one is obviously a very bad one. So we decided to try with MariaDB, which generates the following query plan:
+------+-------------+------------+-------+-------------------------------+----------------+---------+-----------------+-------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+-------+-------------------------------+----------------+---------+-----------------+-------+---------------------------------+ | 1 | PRIMARY | ol | ref | orderline_o_id,orderline_i_id | orderline_i_id | 5 | const | 10 | Using temporary; Using filesort | | 1 | PRIMARY | ol2 | ref | orderline_o_id,orderline_i_id | orderline_o_id | 8 | test.ol.OL_O_ID | 11 | Using where | | 1 | PRIMARY | <derived2> | ref | key0 | key0 | 8 | test.ol.OL_O_ID | 10 | | | 2 | DERIVED | orders | index | NULL | orders_o_date | 4 | NULL | 10000 | | +------+-------------+------------+-------+-------------------------------+----------------+---------+-----------------+-------+---------------------------------+ table=<derived2>, key=key0 - apparently, this is MariaDB 5.3/5.5, and it's using "derived table with keys" (http://kb.askmonty.org/en/derived-table-with-key-optimization) optimization.
The query plan from MariaDB looks sane to me, and the numbers approve this (the query runs on a middle sized data set about 200 times faster with MariaDB than with MySQL). So we will continue our work with MariaDB. But I have a question to these query plans: why are we getting this differences in MySQL between our storage engine and InnoDB? Is there a feature in our storage engine missing (we first thought we need the ability to support HA_KEYREAD_ONLY - but implementing this feature did not change the query plan)? Or does MySQL some kind of "cheating"? We should understand this issue to be able to present our results we get later (may be we will compare MariaDB and MySQL, but in a paper we would have to explain why MySQL sucks that much).
And btw: good work with MariaDB!! The optimizer seems to do a much better job than MySQL - even with InnoDB/XtraDB (we had to rewrite some queries in MySQL to force it to generate sane query plans - with MariaDB this does not seem to be necessary anymore).
BR Sergei -- Sergei Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog