Re: MDEV-27576 Use reverse index for max/min optimization
Hi Yuchen,
commit c4c4bd8e47e5bfd4d0a650e286ea9daa3e6276f0 Author: Yuchen Pei <ycp@mariadb.com> Date: Thu Nov 2 11:55:42 2023 +1100
MDEV-27576 Use reverse index for max/min optimization
We add a field in st_table_ref to indicate that the key used is a descending index, which will flip the functions and flags used in get_index_max_value() and get_index_min_value(), that allows correct optimization for max/min for descending index.
Generally the patch looks good. I have only a few cosmetic comments. Please rename the test file from mdev_27576.test to something more descriptive like desc_index_min_max.test (following desc_index_range.test) Also please rename the table in the test to t1.
diff --git a/sql/sql_select.h b/sql/sql_select.h index e0bbadadc69..45d7527280a 100644 --- a/sql/sql_select.h +++ b/sql/sql_select.h @@ -126,6 +126,7 @@ typedef struct st_table_ref uint key_parts; ///< num of ... uint key_length; ///< length of key_buff int key; ///< key no + bool reverse; ///< key is reverse uchar *key_buff; ///< value to look for with key uchar *key_buff2; ///< key_buff+key_length store_key **key_copy; //
It's actually not "key is reverse" but "last used key part is reverse"... also this member is never set for the most common scenario of TABLE_REF use, which is ref access. The member is set in a rather trivial logic in find_key_for_maxmin(), and then it's used in get_index_min_value() and get_index_max_value(). I suggest that we do without this member. get_index_min_value() and get_index_max_value() can check: bool reverse= table->key_info[ref->key].key_part[ref->key_part]->key_part_flag & HA_REVERSE_SORT
diff --git a/sql/opt_sum.cc b/sql/opt_sum.cc index 794ec40fc66..1f94a4f4dc7 100644 --- a/sql/opt_sum.cc +++ b/sql/opt_sum.cc @@ -114,7 +114,8 @@ static int get_index_min_value(TABLE *table, TABLE_REF *ref, int error;
if (!ref->key_length) - error= table->file->ha_index_first(table->record[0]); + error= ref->reverse ? table->file->ha_index_last(table->record[0]) : + table->file->ha_index_first(table->record[0]);
Please add { ... } as the if-branch is a multi-line statement now.
else { /* @@ -206,8 +210,12 @@ static int get_index_max_value(TABLE *table, TABLE_REF *ref, uint range_fl) table->file->ha_index_read_map(table->record[0], ref->key_buff, make_prev_keypart_map(ref->key_parts), range_fl & NEAR_MAX ? - HA_READ_BEFORE_KEY : - HA_READ_PREFIX_LAST_OR_PREV) : + (ref->reverse ? HA_READ_AFTER_KEY : + HA_READ_BEFORE_KEY) : + (ref->reverse ? HA_READ_KEY_OR_NEXT : + HA_READ_PREFIX_LAST_OR_PREV)) : + ref->reverse ? + table->file->ha_index_first(table->record[0]) : table->file->ha_index_last(table->record[0]));
Multiple levels of x?y:z statements are hard to read. Please change to if (ref->key_length) { ... } else { ... } BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Blog: http://petrunia.net
Hi Sergey,
[... 14 lines elided]
Generally the patch looks good. I have only a few cosmetic comments.
Thanks for the comments. I've updated my patch and the commit is in my latest comment in the ticket. I understand this task has fix version 11.5 so no rush.
Please rename the test file from mdev_27576.test to something more descriptive like desc_index_min_max.test (following desc_index_range.test)
Done.
Also please rename the table in the test to t1.
Done, but why?
diff --git a/sql/sql_select.h b/sql/sql_select.h index e0bbadadc69..45d7527280a 100644 --- a/sql/sql_select.h +++ b/sql/sql_select.h @@ -126,6 +126,7 @@ typedef struct st_table_ref uint key_parts; ///< num of ... uint key_length; ///< length of key_buff int key; ///< key no + bool reverse; ///< key is reverse uchar *key_buff; ///< value to look for with key uchar *key_buff2; ///< key_buff+key_length store_key **key_copy; //
It's actually not "key is reverse" but "last used key part is reverse"... also this member is never set for the most common scenario of TABLE_REF use, which is ref access.
The member is set in a rather trivial logic in find_key_for_maxmin(), and then it's used in get_index_min_value() and get_index_max_value().
I suggest that we do without this member. get_index_min_value() and get_index_max_value() can check:
bool reverse= table->key_info[ref->key].key_part[ref->key_part]->key_part_flag & HA_REVERSE_SORT
I tried this, but it did not work. For example it will not work on --8<---------------cut here---------------start------------->8--- create or replace table t (a int, key(a desc)) engine=innodb; insert into t select seq * 2 from seq_1_to_100 order by rand(1); select min(a) from t; --8<---------------cut here---------------end--------------->8--- because the ref->key_parts is mutated, see (A) below: --8<---------------cut here---------------start------------->8--- static bool find_key_for_maxmin(bool max_fl, TABLE_REF *ref, Field* field, COND *cond, uint *range_fl, uint *prefix_len) { // [... 37 lines elided] if (field->eq(part->field)) { ref->key= idx; ref->key_length= 0; ref->key_parts= 0; key_part_map key_part_used= 0; *range_fl= NO_MIN_RANGE | NO_MAX_RANGE; if (matching_cond(max_fl, ref, keyinfo, part, cond, &key_part_used, range_fl, prefix_len) && !(key_part_to_use & ~key_part_used)) { if (!max_fl && key_part_used == key_part_to_use && part->null_bit) { // [... 17 lines elided] ref->key_parts++; // (A) DBUG_ASSERT(ref->key_parts == jdx+1); // [... 15 lines elided] } --8<---------------cut here---------------end--------------->8--- But I agree there's no need to add a field to st_table_ref, as it is only set and used in a block inside opt_sum_query(). So I used a local variable called reverse instead, just like range_fl and prefix_len, and passed it to find_key_fo_maxmin(), get_index_max_value() and get_index_min_value(). I moved the update to reverse just before the DBUG_RETURN(TRUE) in find_key_for_maxmin() so that it is only updated when needed.
diff --git a/sql/opt_sum.cc b/sql/opt_sum.cc index 794ec40fc66..1f94a4f4dc7 100644 --- a/sql/opt_sum.cc +++ b/sql/opt_sum.cc @@ -114,7 +114,8 @@ static int get_index_min_value(TABLE *table, TABLE_REF *ref, int error;
if (!ref->key_length) - error= table->file->ha_index_first(table->record[0]); + error= ref->reverse ? table->file->ha_index_last(table->record[0]) : + table->file->ha_index_first(table->record[0]);
Please add { ... } as the if-branch is a multi-line statement now.
Done.
else { /* @@ -206,8 +210,12 @@ static int get_index_max_value(TABLE *table, TABLE_REF *ref, uint range_fl) table->file->ha_index_read_map(table->record[0], ref->key_buff, make_prev_keypart_map(ref->key_parts), range_fl & NEAR_MAX ? - HA_READ_BEFORE_KEY : - HA_READ_PREFIX_LAST_OR_PREV) : + (ref->reverse ? HA_READ_AFTER_KEY : + HA_READ_BEFORE_KEY) : + (ref->reverse ? HA_READ_KEY_OR_NEXT : + HA_READ_PREFIX_LAST_OR_PREV)) : + ref->reverse ? + table->file->ha_index_first(table->record[0]) : table->file->ha_index_last(table->record[0]));
Multiple levels of x?y:z statements are hard to read.
Please change to
if (ref->key_length) { ... } else { ... }
Done.
BR Sergei
Best, Yuchen
Hi Yuchen, On Thu, Nov 09, 2023 at 11:57:35AM +1100, Yuchen Pei wrote:
Hi Sergey,
[... 14 lines elided]
Generally the patch looks good. I have only a few cosmetic comments.
Thanks for the comments. I've updated my patch and the commit is in my latest comment in the ticket. I understand this task has fix version 11.5 so no rush.
Please rename the test file from mdev_27576.test to something more descriptive like desc_index_min_max.test (following desc_index_range.test)
Done.
Also please rename the table in the test to t1.
Done, but why?
It's a convention to name tables t1,t2 ... One may argue if it has a lot of merit. At least one of them is that one known which table names would not conflict with tables used in the test.
diff --git a/sql/sql_select.h b/sql/sql_select.h index e0bbadadc69..45d7527280a 100644 --- a/sql/sql_select.h +++ b/sql/sql_select.h @@ -126,6 +126,7 @@ typedef struct st_table_ref uint key_parts; ///< num of ... uint key_length; ///< length of key_buff int key; ///< key no + bool reverse; ///< key is reverse uchar *key_buff; ///< value to look for with key uchar *key_buff2; ///< key_buff+key_length store_key **key_copy; //
It's actually not "key is reverse" but "last used key part is reverse"... also this member is never set for the most common scenario of TABLE_REF use, which is ref access.
The member is set in a rather trivial logic in find_key_for_maxmin(), and then it's used in get_index_min_value() and get_index_max_value().
I suggest that we do without this member. get_index_min_value() and get_index_max_value() can check:
bool reverse= table->key_info[ref->key].key_part[ref->key_part]->key_part_flag & HA_REVERSE_SORT
I tried this, but it did not work. For example it will not work on ...
I still think it could work, I've just made a typo, it should be table->key_info[ref->key].key_part[ref->key_parts-1].key_part_flag but it's fine if a local variable is passed around. Please apply the attached diff fixing coding style. After that let's create a preview tree and submit the feature for testing. BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Blog: http://petrunia.net
participants (2)
-
Sergey Petrunia
-
Yuchen Pei