Hi Varun,
On Wed, Apr 24, 2019 at 01:31:38PM +0530, Varun wrote:
revision-id: 88a80e92dc444ce30718f3e08d3ab66fb02bcea4 (mariadb-10.3.10-284-g88a80e92dc4)
parent(s): 3032cd8e91f1e1ead8b6f941e75cd29e473e7eaa
author: Varun Gupta
committer: Varun Gupta
timestamp: 2019-04-24 13:31:24 +0530
message:
MDEV-9959: A serious MariaDB server performance bug
Step #2: If any field in the select list of the derived tables is present in the group by list also , then we are again guaranteed
that ref access to the derived table would always produce one row per key.
...
diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc
index c52005e7683..aa645ac00ca 100644
--- a/sql/sql_lex.cc
+++ b/sql/sql_lex.cc
@@ -7617,6 +7617,68 @@ Item *st_select_lex::build_cond_for_grouping_fields(THD *thd, Item *cond,
}
+/**
+ Check if any any item in the group by list is also present in the select_list
+ @retval true: All elements common between select and group by list
+*/
+
+void st_select_lex::is_group_by_prefix(KEY *keyinfo)
I think the name of the function is poor - "is_something()" hints at hints at boolean return
value, and I would say it doesn't imply any side effects.
Something like 'mark_derived_index_stats()' would be better, I think.
+{
+ uint key_parts= keyinfo->usable_key_parts;
+ KEY_PART_INFO *key_part_info= keyinfo->key_part;
+ bool found= FALSE;
+
+ if (key_parts < group_list.elements)
+ return;
+
+ uint matched_fields=0, i, j;
+ Item *item;
+
+ for (i= 0; i < key_parts; key_part_info++, i++)
+ {
+ uint fld_idx= key_part_info->fieldnr - 1;
+ item= join->fields_list.elem(fld_idx);
+ for (ORDER *order= group_list.first; order; order= order->next)
+ {
+ Item *ord_item= order->item[0]->real_item();
+ Item_equal *item_equal= ord_item->get_item_equal();
+
+ if (item_equal)
+ {
+ Item_equal_fields_iterator it(*item_equal);
+ Item *equal_item;
+ while ((equal_item= it++))
+ {
+ if (equal_item->eq(item, 0))
+ {
+ matched_fields++;
+ found= TRUE;
+ break;
+ }
+ }
+ }
+ else
+ {
+ if (item->eq(ord_item, 0))
+ {
+ matched_fields++;
+ found= TRUE;
+ }
+ }
+ if (found)
+ break;
+ }
+
+ if (matched_fields == group_list.elements)
+ {
+ for (j=matched_fields - 1; j < key_parts; j++)
+ keyinfo->rec_per_key[j]= 1;
This doesn't always work. An example:
create table t1 (a int, b int, c int, d int);
insert into t1 select a,a,a,a from one_k;
create table t2 as select * from t1;
# Note the "a, a AS b" in the select list. This is intentional.
explain select *
from
t2,
(select a, a as b from t1 group by a,b) TBL
where
t2.a=TBL.a and t2.b=TBL.b;
Here, the select output is not guaranteed to be unique (as it contains only one
of two GROUP BY columns). But I see the execution to reac the above line,
assigning keyinfo->rec_per_key[j]= 1.
+ return;
+ }
+ found= FALSE;
+ }
+}
+
int set_statement_var_if_exists(THD *thd, const char *var_name,
size_t var_name_length, ulonglong value)
{
A general comment: do you think we should expose this in optimizer trace?
At the moment, Derived-with-keys optimization doesn't print anything there
(neither in MariaDB nor in MySQL). One can see the generated KEYUSEs when they
are printed.
But perhaps we could add a node that would print that a derived table was
added, with this cardinality, and index with these keypart cardinalities.
This would make debugging easier. Any thoughts?
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog