Re: [Maria-developers] [Commits] 88a80e92dc4: MDEV-9959: A serious MariaDB server performance bug

27 Apr 2019

Hi Varun,

On Wed, Apr 24, 2019 at 01:31:38PM +0530, Varun wrote:
...
revision-id: 88a80e92dc444ce30718f3e08d3ab66fb02bcea4 (mariadb-10.3.10-284-g88a80e92dc4)
parent(s): 3032cd8e91f1e1ead8b6f941e75cd29e473e7eaa
author: Varun Gupta
committer: Varun Gupta
timestamp: 2019-04-24 13:31:24 +0530
message:
MDEV-9959: A serious MariaDB server performance bug
Step #2: If any field in the select list of the derived tables is present in the group by list also , then we are again guaranteed
that ref access to the derived table would always produce one row per key.
...
...

diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc
index c52005e7683..aa645ac00ca 100644
--- a/sql/sql_lex.cc
+++ b/sql/sql_lex.cc
@@ -7617,6 +7617,68 @@ Item *st_select_lex::build_cond_for_grouping_fields(THD *thd, Item *cond,
 }
+/**
+  Check if any any item in the group by list is also present in the select_list
+  @retval true: All elements common between select and group by list
+*/
+
+void st_select_lex::is_group_by_prefix(KEY *keyinfo)
I think the name of the function is poor - "is_something()" hints at hints at boolean return
value, and I would say it doesn't imply any side effects.

Something like 'mark_derived_index_stats()' would be better, I think.
...
+{
+  uint key_parts= keyinfo->usable_key_parts;
+  KEY_PART_INFO *key_part_info= keyinfo->key_part;
+  bool found= FALSE;
+
+  if (key_parts < group_list.elements)
+    return;
+
+  uint matched_fields=0, i, j;
+  Item *item;
+
+  for (i= 0; i < key_parts; key_part_info++, i++)
+  {
+    uint fld_idx= key_part_info->fieldnr - 1;
+    item= join->fields_list.elem(fld_idx);
+    for (ORDER *order= group_list.first; order; order= order->next)
+    {
+      Item *ord_item= order->item[0]->real_item();
+      Item_equal *item_equal= ord_item->get_item_equal();
+
+      if (item_equal)
+      {
+        Item_equal_fields_iterator it(*item_equal);
+        Item *equal_item;
+        while ((equal_item= it++))
+        {
+          if (equal_item->eq(item, 0))
+          {
+            matched_fields++;
+            found= TRUE;
+            break;
+          }
+        }
+      }
+      else
+      {
+        if (item->eq(ord_item, 0))
+        {
+          matched_fields++;
+          found= TRUE;
+        }
+      }
+      if (found)
+        break;
+    }
+
+    if (matched_fields == group_list.elements)
+    {
+      for (j=matched_fields - 1; j < key_parts; j++)
+        keyinfo->rec_per_key[j]= 1;
This doesn't always work. An example:

create table t1 (a int, b int, c int, d int);
insert into t1 select a,a,a,a from one_k;

create table t2 as select * from t1;

# Note the "a, a AS b" in the select list. This is intentional.

explain select * 
from 
  t2,
  (select a, a as b from t1 group by a,b) TBL
where
  t2.a=TBL.a and t2.b=TBL.b;

Here, the select output is not guaranteed to be unique (as it contains only one
of two GROUP BY columns). But I see the execution to reac the above line,
assigning keyinfo->rec_per_key[j]= 1.
...
+      return;
+    }
+    found= FALSE;
+  }
+}
+
 int set_statement_var_if_exists(THD *thd, const char *var_name,
                                 size_t var_name_length, ulonglong value)
 {
A general comment: do you think we should expose this in optimizer trace? 

At the moment, Derived-with-keys optimization doesn't print anything there
(neither in MariaDB nor in MySQL).  One can see the generated KEYUSEs when they
are printed.

But perhaps we could add a node that would print that a derived table was
added, with this cardinality, and index with these keypart cardinalities. 

This would make debugging easier. Any thoughts?

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog

    

Re: [Maria-developers] [Commits] 88a80e92dc4: MDEV-9959: A serious MariaDB server performance bug

Sergey Petrunia