Hi Christian,

On Thu, Feb 20, 2014 at 06:31:45PM -0500, Christian Rober wrote:
I work for TokuTek as a storage engine developer and I have been using the relatively new optimizer trace functionality. I have found the results very helpful when combined with analyzing the source code, but there are a few questions I want to ask. Here is some output from the trace for a recent SELECT query on an InnoDB table:
At the end of the trace, it shows the index chosen by what I presume is the optimizer:
... "considered_execution_plans": [ { "plan_prefix": [], "table": "`reports` `a`", "best_access_path": { "considered_access_paths": [ { "access_type": "ref", "index": "reports_group_id", "rows": 1.55e6, "cost": 411057, "chosen": true }, ... ] }, "cost_for_plan": 411057, "rows_for_plan": 1.55e6, "chosen": true } ] }, ...
Here are the cost results for some rejected alternative indexes (note: the row count for two of them is similar to that of the chosen index):
... "analyzing_range_alternatives": { "range_scan_alternatives" ... "rows": 2377102, "cost": 2.81e6, ... "rows": 1554164, "cost": 1.84e6, ... "rows": 2346664, "cost": 2.78e6, ... "rows": 1554164, "cost": 1.84e6, ...
Here is the cost analysis for a table scan, from the beginning of the trace (still not as good as the chosen index above, but close on cost):
... "table": "`reports` `a`", "range_analysis": { "table_scan": { "rows": 3108329, "cost": 655076 }, ...
The following questions relate to the fact that the table scan estimate is dominated by its large row count, and that the table scan was probably rejected in favor of the chosen index for that very reason:
1. How is the row count more than merely informational to the optimizer? I had the impression that the row count did not contribute to the optimizer's decision as directly as the estimated cost did, or that it was already factored into the final cost measurement.

2. Related to that question: what is cost, exactly? Is there a unit of measurement (e.g. time, disk seeks, etc.)?
The cost is "random IO disk seeks". However, a lot of parts of the cost model do not distinguish between random and sequential disk io, do not account for presence of caches, etc. Another important constant is (1/TIME_FOR_COMPARE). This is a cost of evaluating the WHERE clause (or some part of it).
3. Is cost more important than row count or equally important?
The decision about which query plan to use is made based on cost. However, the join optimizer's cost calculations put a big penalty on query plans that produce lots of rows. See sql_select.cc, best_extension_by_limited_search():

  ...
  /* Compute the cost of extending the plan with 's' */
  current_record_count= record_count * position->records_read;
  current_read_time=    read_time
                        + position->read_time
                        + current_record_count / (double) TIME_FOR_COMPARE;
  ...

Note the last part, current_record_count/TIME_FOR_COMPARE: read_time is measured in "# random disk reads", so adding current_record_count/10 to it puts a big penalty on query plans that return a lot of rows.
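To make that concrete with the numbers from your trace (rough back-of-the-envelope arithmetic: just the /10 row penalty added to the per-table costs, ignoring the other terms):

  ref on reports_group_id:  411057 + 1550000/10  =  566057
  full table scan:          655076 + 3108329/10  ≈  965909

So even though the raw access costs (411057 vs 655076) are not that far apart, the per-row penalty keeps the table scan well behind the chosen ref plan.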
4. What is the genesis of the cost variable displayed in the trace? I assume it is just a guess on the respective storage engine's part as to how expensive (in terms of disk access, processing, time, etc.) getting the rows will be for the query.
Cost numbers for basic operations (range scan, full table scan) come from the storage engine. Cost numbers for more complex query plans (joins, index_merge, etc.) come from cost calculations done at the SQL layer.
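For illustration, here is a rough standalone sketch (not actual server code) of the kind of per-table estimates a storage engine reports to the SQL layer through its handler cost methods (scan_time(), read_time()). The statistics and the one-seek-per-page / one-seek-per-row models below are made up for the example:

  #include <cstdint>
  #include <cstdio>

  /* Made-up per-table statistic a storage engine might keep. */
  struct example_stats
  {
    double data_pages;    /* pages occupied by the table */
  };

  /* Cost of a full table scan, in "random disk seek" units
     (roughly what handler::scan_time() reports). */
  static double example_scan_cost(const example_stats &s)
  {
    return s.data_pages;                     /* crude model: one seek per page */
  }

  /* Cost of fetching `rows` rows through `ranges` index ranges
     (roughly what handler::read_time() reports). */
  static double example_range_cost(uint32_t ranges, uint64_t rows)
  {
    return double(ranges) + double(rows);    /* crude model: one seek per row */
  }

  int main()
  {
    example_stats t= { 3200.0 };             /* made-up table of 3200 pages */
    std::printf("scan: %g  range: %g\n",
                example_scan_cost(t),
                example_range_cost(4, 1000));
    return 0;
  }

The SQL layer then combines these per-table numbers with its own arithmetic (such as the row penalty in best_extension_by_limited_search() above) when it costs whole query plans.

BR,
 Sergei

--
Sergei Petrunia, Software Developer
MariaDB | Skype: sergefp | Blog: http://s.petrunia.net/blog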