Re: [Maria-developers] Question about Query plan

14 May 2012


      Ah cool, thanks a lot for this information, that is all very useful!
So we added the implementation of these two hanlder methods on our
todo list. I also did not know about the keyread_time (etc) functions.

Thanks and best regards

Markus

On Mon, May 14, 2012 at 3:58 PM, Sergei Petrunia <psergey@askmonty.org> wrote:
...
On Mon, May 14, 2012 at 03:15:09PM +0200, Markus Pilman wrote:
...
...
- Do you really records_in_range() implementation like in (*)
Yes.
...
- Are estimates of 10 and 20 rows close to reality?
Not at all... The thing is: we can implement records_in_range not as a
function with constant complexity and since a full table scan is not
faster than a index range lookup in our storage engine (even if the
whole table is contained in the range), we wanted to force MySQL to
make always an index lookup. If I understand everything correctly, my
error in thinking was, that I thought MySQL uses the information
returned by records_in_range only to chose between index lookup and
fulltable scan (since on a traditional disk architecture a full table
scan will be faster than an index scan which scans the whole table).
But this is obviously not the case.
As far as I understand you, you're correct. Information returned by
records_in_range() is used in multiple places:
1. It is used to determine whether to make range scan or a full table scan.
2. It affects the order in which tables are joined.
3. (maybe there is something else it affects also)
I think you should be able to force MySQL into doing index lookups instead of
table scans by returing appropriate costs from
handler->keyread_time() and handler->read_time() as opposed to scan_time().
...
So we will have to implemented this records_in_range method. Do I
understand this correct, that a logarithmic (with respect to the table
size) run time will be ok for this method (InnoDB seems to walk two
times through the index if I get the comments in the code correct)?
I think logarithmic time should be ok.
Note that records_in_range() can be called multiple times if you have multiple
ranges. For example, for query like
SELECT * FROM tbl WHERE t.key < 11 OR t.key IN (15, 17, 19, 22)
you will get these calls:
 records_in_range(t.key < 11)
 records_in_range(t.key=15)
 records_in_range(t.key=17)
 records_in_range(t.key=19)
 records_in_range(t.key=22)
If you prefer to get all ranges at once, check out
handler::multi_range_read_info_const() in sql/multi_range_read.cc - you can
implement that function and get access to a  disjoint-and-sorted list of ranges.
(however, records_in_range() will still need to be implemented because there
are certain cases where records_in_range() is called directly, not through
multi_range_read_info_const()).
BR
 Sergei
--
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog