Hi!
"Sergei" == Sergei Golubchik <serg@askmonty.org> writes:
<cut>
+ Data is clustered on this key. This means that when you read the key + you also get the row data in the same block.
Sergei> perhaps, better to say that "you get the row data without an extra seek Sergei> (or even without an extra disk read)". I added the comment. Sergei> A question: what about memory-based (and SSD) engines? Could/Should one Sergei> consider all indexes clustered? No, as there is still a high cost for fetching and managing an extra disk block. SSD are faster than disk, but not unlimited fast. For SSD, we need to add a different factor for the time to fetch a block. The storage engine can already take this into account a bit by adjusting the times for scan / key reads if things are on ssd. I have been discussing with igor for making all the timing constants variables to allow us to do real time tuning with them. (Good for finding better values for the constants that we have now). <cut>
- @retval TRUE Primary key (if there is one) is clustered - key covering all fields - @retval FALSE otherwise + Check if the primary key (if there is one) is a clustered key covering + all fields. This means:
Sergei> I wouldn't say that it "checks if the primary key is clustered", but Sergei> something like "historical method that returns TRUE if everything below Sergei> holds:" I don't like to say 'historical' as there is no plans to remove this code, only rename it... I did change the start to: Check if the primary key (if there is one) is a clustered key covering all fields and that the primary key is also stored as part of all other keys and used as a reference key.
+ + - Data is stored together with the primary key (no secondary lookup + needed to find the row data). The optimizer uses this to find out + the cost of fetching data. + - The primary key is part of each secondary key and is used + to find the row data in the primary index when reading trough + secondary indexes. + - When doing a HA_KEYREAD_ONLY we get also all the primary key parts + into the row. This is critical property used by index_merge.
Sergei> and here "all the above is usually true for engines that store the row Sergei> data in the primary key index (e.g. in a b-tree), and use the primary Sergei> key value as a position()" Added <cut>
+++ b/sql/sql_select.cc 2011-05-18 16:26:30 +0000 @@ -16525,9 +16527,9 @@ test_if_skip_sort_order(JOIN_TAB *tab,OR */ DBUG_ASSERT (ref_key != (int) nr);
- bool is_covering= table->covering_keys.is_set(nr) || - (nr == table->s->primary_key && - table->file->primary_key_is_clustered()); + bool is_covering= (table->covering_keys.is_set(nr) || + (table->file->index_flags(nr, 0, 1) & + HA_CLUSTERED_INDEX));
Sergei> See, that's what I said. It make sense to set covering_keys bitmap Sergei> for all clustered indexes once, when the table is opened, and then you Sergei> not only get this condition simpler, but probably will also cover other Sergei> cases where table->covering_keys is checked (or may be checked in the Sergei> future). Looking at how covering keys is used in many places, I am not sure that change is straightforward or will not cause wrong timing results. This is because in many cases we don't want to use a clustered key when we are trying to do index-only reads as the covering key is much slower than any other key. I agree that your idea has some merits, but I would not like to try to do it just now. Something you can look at when you are ready with 5.5... Regards, Monty