Hi, Michael! On May 18, Michael Widenius wrote:
=== modified file 'sql/handler.h' --- a/sql/handler.h 2011-05-16 11:05:45 +0000 +++ b/sql/handler.h 2011-05-18 16:26:30 +0000 @@ -161,8 +161,11 @@ */ #define HA_KEY_SCAN_NOT_ROR 128 #define HA_DO_INDEX_COND_PUSHDOWN 256 /* Supports Index Condition Pushdown */ - - +/* + Data is clustered on this key. This means that when you read the key + you also get the row data in the same block.
perhaps, better to say that "you get the row data without an extra seek (or even without an extra disk read)". A question: what about memory-based (and SSD) engines? Could/Should one consider all indexes clustered?
+*/ +#define HA_CLUSTERED_INDEX 512
/* bits in alter_table_flags: @@ -2311,9 +2314,22 @@ class handler :public Sql_alloc
/* - @retval TRUE Primary key (if there is one) is clustered - key covering all fields - @retval FALSE otherwise + Check if the primary key (if there is one) is a clustered key covering + all fields. This means:
I wouldn't say that it "checks if the primary key is clustered", but something like "historical method that returns TRUE if everything below holds:"
+ + - Data is stored together with the primary key (no secondary lookup + needed to find the row data). The optimizer uses this to find out + the cost of fetching data. + - The primary key is part of each secondary key and is used + to find the row data in the primary index when reading trough + secondary indexes. + - When doing a HA_KEYREAD_ONLY we get also all the primary key parts + into the row. This is critical property used by index_merge.
and here "all the above is usually true for engines that store the row data in the primary key index (e.g. in a b-tree), and use the primary key value as a position()"
+ + For a clustered primary key, index_flags() returns also HA_CLUSTERED_INDEX + + @retval TRUE yes + @retval FALSE No. */ virtual bool primary_key_is_clustered() { return FALSE; } virtual int cmp_ref(const uchar *ref1, const uchar *ref2)
=== modified file 'sql/sql_select.cc' --- a/sql/sql_select.cc 2011-05-16 12:07:04 +0000 +++ b/sql/sql_select.cc 2011-05-18 16:26:30 +0000 @@ -16525,9 +16527,9 @@ test_if_skip_sort_order(JOIN_TAB *tab,OR */ DBUG_ASSERT (ref_key != (int) nr);
- bool is_covering= table->covering_keys.is_set(nr) || - (nr == table->s->primary_key && - table->file->primary_key_is_clustered()); + bool is_covering= (table->covering_keys.is_set(nr) || + (table->file->index_flags(nr, 0, 1) & + HA_CLUSTERED_INDEX));
See, that's what I said. It make sense to set covering_keys bitmap for all clustered indexes once, when the table is opened, and then you not only get this condition simpler, but probably will also cover other cases where table->covering_keys is checked (or may be checked in the future).
/* Don't use an index scan with ORDER BY without limit. === modified file 'sql/sql_table.cc' --- a/sql/sql_table.cc 2011-05-16 11:05:45 +0000 +++ b/sql/sql_table.cc 2011-05-18 16:26:30 +0000 @@ -7983,7 +7983,8 @@ copy_data_between_tables(TABLE *from,TAB
if (order) { - if (to->s->primary_key != MAX_KEY && to->file->primary_key_is_clustered()) + if (to->s->primary_key != MAX_KEY && + to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX)
Hmm, good. It took me a while to come up with a realistic use case where to->file->primary_key_is_clustered() is false, but to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX is true. Example: new InnoDB optimization that makes it to use internal generated primary key when user creates a PK of, say, VARCHAR(255). That is, in this optimization InnoDB treat this key as UNIQUE but not PRIMARY. Not completely unrealistic idea. In this case to->s->primary_key != MAX_KEY is true, there is a primary key. to->file->primary_key_is_clustered() is false, but to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX is true. Heh, this is a good example, perhaps we need to apply it as a test to other code that uses primary_key_is_clustered() and see if the condition is used correctly.
{ char warn_buff[MYSQL_ERRMSG_SIZE]; my_snprintf(warn_buff, sizeof(warn_buff),
Regards, Sergei