Hello All, I work for TokuTek as a Storage Engine developer and I have been using the, relatively, new optimizer trace functionality. I have found the results very helpful when combined with analyzing the source code, but there a few questions I want to ask. Here is some output from the trace for a recent SELECT query on an InnoDB table: At the end of the trace, it shows the index chosen by, what I presume, is the optimizer: ... "considered_execution_plans": [ { "plan_prefix": [], "table": "`reports` `a`", "best_access_path": { "considered_access_paths": [ { "access_type": "ref", "index": "reports_group_id", "rows": 1.55e6, "cost": 411057, "chosen": true }, ... ] }, "cost_for_plan": 411057, "rows_for_plan": 1.55e6, "chosen": true } ] }, ... Here are the cost results for some rejected alternative indexes, (note: the row count for two of them is similar to the chosen index): ... "analyzing_range_alternatives": { "range_scan_alternatives" ... "rows": 2377102, "cost": 2.81e6, ... "rows": 1554164, "cost": 1.84e6, ... "rows": 2346664, "cost": 2.78e6, ... "rows": 1554164, "cost": 1.84e6, ... Here is the cost analysis for a table scan, from the beginning of the trace (still not as good as the above chosen index, but close on cost): ... "table": "`reports` `a`", "range_analysis": { "table_scan": { "rows": 3108329, "cost": 655076 }, ... Related to the fact that the table scan analysis is dominated by the large row count, and probably rejected as a valid plan compared to the chosen index for that very reason: 1. How is the row count more than merely informational to the optimizer? I had the impression the row count did not directly contribute to the optimizer's decision as much as the estimated cost, or that the row count was already accounted for/factored into in the final cost measurement. 2. Related to that question, what is cost exactly? Is there a unit of measurement (ex: Time, Disk Seeks, etc.)? 3. Is cost more important than row count or equally important? 4. What is the genesis of the cost variable displayed in the trace? I assume it is just a guess on the respective storage engine's part of how expensive (in terms of disk access, processing, time, etc.) getting the rows will be for the query. thanks in advance, Christian