[Commits] 33ffd3aeb5f: MDEV-24353: Adding GROUP BY slows down a query
revision-id: 33ffd3aeb5ff753ecdcd035114bbfd3acbbcafc2 (mariadb-10.5.2-320-g33ffd3aeb5f) parent(s): e9f33b7760539e2ba60fb236fab8eaf0ce53ca4a author: Varun Gupta committer: Varun Gupta timestamp: 2020-12-09 19:31:30 +0530 message: MDEV-24353: Adding GROUP BY slows down a query A heuristic in best_access_path says that if for an index ref access involved key parts which are greater than equal to that for range access, then range access should not be considered. The assumption made by this heuristic does not hold when the range optimizer opted to use the group-by min-max optimization. So the fix here would be to not consider the heuristic if the range optimizer picked the usage of group-by min-max optimization. --- mysql-test/main/group_min_max.result | 23 ++++++++++++++++++++++- mysql-test/main/group_min_max.test | 18 ++++++++++++++++++ sql/sql_select.cc | 7 +++++-- 3 files changed, 45 insertions(+), 3 deletions(-) diff --git a/mysql-test/main/group_min_max.result b/mysql-test/main/group_min_max.result index 4f32db780fd..5f51be715fe 100644 --- a/mysql-test/main/group_min_max.result +++ b/mysql-test/main/group_min_max.result @@ -2664,7 +2664,7 @@ a b 3 13 explain extended select sql_buffer_result a, max(b)+1 from t1 where a = 0 group by a; id select_type table type possible_keys key key_len ref rows filtered Extra -1 SIMPLE t1 ref a,index a 5 const 15 20.00 Using index; Using temporary +1 SIMPLE t1 range a,index a 5 NULL 3 100.00 Using where; Using index for group-by; Using temporary Warnings: Note 1003 select sql_buffer_result `test`.`t1`.`a` AS `a`,max(`test`.`t1`.`b`) + 1 AS `max(b)+1` from `test`.`t1` where `test`.`t1`.`a` = 0 group by `test`.`t1`.`a` drop table t1; @@ -4027,3 +4027,24 @@ drop table t1; # # End of 10.1 tests # +# +# MDEV-24353: Adding GROUP BY slows down a query +# +CREATE TABLE t1 (p int NOT NULL, a int NOT NULL, PRIMARY KEY (p,a)); +insert into t1 select 2,seq from seq_0_to_1000; +EXPLAIN select MIN(a) from t1 where p = 2 group by p; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 range PRIMARY PRIMARY 4 NULL 10 Using where; Using index for group-by +SELECT MIN(a) from t1 where p = 2 group by p; +MIN(a) +0 +EXPLAIN select MIN(a) from t1 group by p; +id select_type table type possible_keys key key_len ref rows Extra +1 SIMPLE t1 range NULL PRIMARY 4 NULL 10 Using index for group-by +SELECT MIN(a) from t1 where p = 2; +MIN(a) +0 +drop table t1; +# +# End of 10.6 tests +# diff --git a/mysql-test/main/group_min_max.test b/mysql-test/main/group_min_max.test index 526552dda92..1db479b1c74 100644 --- a/mysql-test/main/group_min_max.test +++ b/mysql-test/main/group_min_max.test @@ -4,6 +4,7 @@ # --source include/default_optimizer_switch.inc +--source include/have_sequence.inc # # TODO: @@ -1689,3 +1690,20 @@ drop table t1; --echo # --echo # End of 10.1 tests --echo # + +--echo # +--echo # MDEV-24353: Adding GROUP BY slows down a query +--echo # + +CREATE TABLE t1 (p int NOT NULL, a int NOT NULL, PRIMARY KEY (p,a)); +insert into t1 select 2,seq from seq_0_to_1000; + +EXPLAIN select MIN(a) from t1 where p = 2 group by p; +SELECT MIN(a) from t1 where p = 2 group by p; +EXPLAIN select MIN(a) from t1 group by p; +SELECT MIN(a) from t1 where p = 2; +drop table t1; + +--echo # +--echo # End of 10.6 tests +--echo # diff --git a/sql/sql_select.cc b/sql/sql_select.cc index bf3d4fd9bb1..060ac427a3c 100644 --- a/sql/sql_select.cc +++ b/sql/sql_select.cc @@ -7926,7 +7926,8 @@ best_access_path(JOIN *join, access is to use the same index IDX, with the same or more key parts. (note: it is not clear how this rule is/should be extended to index_merge quick selects). Also if we have a hash join we prefer that - over a table scan + over a table scan. This heuristic is no more allowed for + if the range optimizer opted to use the group-by min-max optimization. (3) See above note about InnoDB. (4) NOT ("FORCE INDEX(...)" is used for table and there is 'ref' access path, but there is no quick select) @@ -7944,7 +7945,9 @@ best_access_path(JOIN *join, Json_writer_object trace_access_scan(thd); if ((records >= s->found_records || best > s->read_time) && // (1) !(best_key && best_key->key == MAX_KEY) && // (2) - !(s->quick && best_key && s->quick->index == best_key->key && // (2) + !(s->quick && + s->quick->get_type() != QUICK_SELECT_I::QS_TYPE_GROUP_MIN_MAX && + best_key && s->quick->index == best_key->key && // (2) best_max_key_part >= s->table->opt_range[best_key->key].key_parts) &&// (2) !((s->table->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX) && // (3) ! s->table->covering_keys.is_clear_all() && best_key && !s->quick) &&// (3)
Hi Varun, Please find some input below. Ok to push when addressed. On Wed, Dec 09, 2020 at 07:33:42PM +0530, varun wrote:
revision-id: 33ffd3aeb5ff753ecdcd035114bbfd3acbbcafc2 (mariadb-10.5.2-320-g33ffd3aeb5f) parent(s): e9f33b7760539e2ba60fb236fab8eaf0ce53ca4a author: Varun Gupta committer: Varun Gupta timestamp: 2020-12-09 19:31:30 +0530 message:
MDEV-24353: Adding GROUP BY slows down a query
A heuristic in best_access_path says that if for an index ref access involved key parts which are greater than equal to that for range access, then range access should not be considered. The assumption made by this heuristic does not hold when the range optimizer opted to use the group-by min-max optimization. So the fix here would be to not consider the heuristic if the range optimizer picked the usage of group-by min-max optimization.
--- mysql-test/main/group_min_max.result | 23 ++++++++++++++++++++++- mysql-test/main/group_min_max.test | 18 ++++++++++++++++++ sql/sql_select.cc | 7 +++++-- 3 files changed, 45 insertions(+), 3 deletions(-) ... diff --git a/sql/sql_select.cc b/sql/sql_select.cc index bf3d4fd9bb1..060ac427a3c 100644 --- a/sql/sql_select.cc +++ b/sql/sql_select.cc @@ -7926,7 +7926,8 @@ best_access_path(JOIN *join, access is to use the same index IDX, with the same or more key parts. (note: it is not clear how this rule is/should be extended to index_merge quick selects). Also if we have a hash join we prefer that - over a table scan + over a table scan. This heuristic is no more allowed for + if the range optimizer opted to use the group-by min-max optimization.
"no more allowed for if the range optimizer" is bad wording. Please re-word, at least into something "The heuristict doesn't apply if the quick select is a group-by min-max quick select".
(3) See above note about InnoDB. (4) NOT ("FORCE INDEX(...)" is used for table and there is 'ref' access path, but there is no quick select) @@ -7944,7 +7945,9 @@ best_access_path(JOIN *join, Json_writer_object trace_access_scan(thd); if ((records >= s->found_records || best > s->read_time) && // (1) !(best_key && best_key->key == MAX_KEY) && // (2) - !(s->quick && best_key && s->quick->index == best_key->key && // (2) + !(s->quick &&
Please follow the pattern and add "// (2)" at the each line occupied by the check for (2).
+ s->quick->get_type() != QUICK_SELECT_I::QS_TYPE_GROUP_MIN_MAX && + best_key && s->quick->index == best_key->key && // (2) best_max_key_part >= s->table->opt_range[best_key->key].key_parts) &&// (2) !((s->table->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX) && // (3) ! s->table->covering_keys.is_clear_all() && best_key && !s->quick) &&// (3)
BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
participants (2)
-
Sergey Petrunia
-
varun