[Maria-developers] MDEV-17399: JSON_TABLE: Incorrect code with table elimination
Hi Alexey,
diff --git a/sql/opt_table_elimination.cc b/sql/opt_table_elimination.cc index 3958797ec44..f2497d524ec 100644 --- a/sql/opt_table_elimination.cc +++ b/sql/opt_table_elimination.cc @@ -637,6 +637,16 @@ void eliminate_tables(JOIN *join) List_iterator<Item> it(join->fields_list); while ((item= it++)) used_tables |= item->used_tables(); + + { + List_iterator<TABLE_LIST> it(*join->join_list); + TABLE_LIST *tbl; + while ((tbl= it++)) + { + if (tbl->table_function) + used_tables|= tbl->table_function->used_tables(); + } + }
This only walks the tables that are "at the top level" of the join. if a JSON_TABLE(...) is inside a nested outer join, it will not be found. Please walk select_lex->leaf_tables instead. Please add the testcase (I provide one below). Please also add a comment, clarifying what is being done, something like: Table function JSON_TABLE() can have references to other tables. Do not eliminate the tables that JSON_TABLE() refers to. Note: the JSON_TABLE itself cannot be eliminated as it doesn't have unique keys. == Testcase == create table t20 (a int not null); create table t21 (a int not null primary key, js varchar(100)); insert into t20 select seq from seq_1_to_100; insert into t21 select a, '{"a":100}' from t20; create table t31(a int); create table t32(b int); insert into t31 values (1); insert into t32 values (1); explain select t20.a, jt1.ab from t20 left join t21 on t20.a=t21.a join (t31 left join (t32 join JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) AS jt1) on t31.a<3); BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
I've rewritten it using the json->select_lex->leaf_tables as you suggested. So the JSON_TABLE is calculated properly and all the dependencies seem to be marked. Still your testcase query fails the same way, so needs some more investigation. To me it looks more like optimizer issue than the JSON_TABLE-s. Best regards. HF On Fri, Mar 12, 2021 at 2:21 AM Sergey Petrunia <sergey@mariadb.com> wrote:
Hi Alexey,
diff --git a/sql/opt_table_elimination.cc b/sql/opt_table_elimination.cc index 3958797ec44..f2497d524ec 100644 --- a/sql/opt_table_elimination.cc +++ b/sql/opt_table_elimination.cc @@ -637,6 +637,16 @@ void eliminate_tables(JOIN *join) List_iterator<Item> it(join->fields_list); while ((item= it++)) used_tables |= item->used_tables(); + + { + List_iterator<TABLE_LIST> it(*join->join_list); + TABLE_LIST *tbl; + while ((tbl= it++)) + { + if (tbl->table_function) + used_tables|= tbl->table_function->used_tables(); + } + }
This only walks the tables that are "at the top level" of the join. if a JSON_TABLE(...) is inside a nested outer join, it will not be found.
Please walk select_lex->leaf_tables instead. Please add the testcase (I provide one below).
Please also add a comment, clarifying what is being done, something like:
Table function JSON_TABLE() can have references to other tables. Do not eliminate the tables that JSON_TABLE() refers to. Note: the JSON_TABLE itself cannot be eliminated as it doesn't have unique keys.
== Testcase ==
create table t20 (a int not null); create table t21 (a int not null primary key, js varchar(100));
insert into t20 select seq from seq_1_to_100; insert into t21 select a, '{"a":100}' from t20;
create table t31(a int); create table t32(b int); insert into t31 values (1); insert into t32 values (1);
explain select t20.a, jt1.ab from t20 left join t21 on t20.a=t21.a join (t31 left join (t32 join JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) AS jt1) on t31.a<3);
BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
On Sat, Mar 13, 2021 at 04:01:56PM +0400, Alexey Botchkov wrote:
I've rewritten it using the json->select_lex->leaf_tables as you suggested. So the JSON_TABLE is calculated properly and all the dependencies seem to be marked. Still your testcase query fails the same way, so needs some more investigation. To me it looks more like optimizer issue than the JSON_TABLE-s.
Can you construct an example that would demonstrate the issue without use of JSON_TABLE? (No). I've looked - it looks like the optimizer is unable to handle certain kinds of dependencies... which didn't exist before JSON_TABLE was added. == Short == The optimizer fails to construct a complete join order: 1. It builds a "dead-end" join prefix, which cannot be expanded to a complete join order. 2. Join prefix pruning removes other join prefixes from consideration. == Long == Table dependencies: t20 -> {} t21 -> t20 t31 -> {} t32 -> t31 jt -> t20 t21 t31 jt depends on: t31, because it's a left join t21, because it refers to it in its argument t20 due to transitive clouse of the dependency relation. t21 depends on t20. The optimizer constructs a prefix: t31, t32 This is a "dead end", it cannot be extended to a full join order: we cannot add t20 or t21 because we've entered the join nest: t1 left join (t32 ...) and must add all tables in the nest before we add any table that's not part of the nest. The table in the nest is "JSON_TABLE(...) as jt", but we cannot add it, because it needs t21 to be added first. Why doesn't the optimizer construct a join order starting from e.g. t20 t21 t31 ... The issue is in pruning. The join orders starting from t20 are pruned away because they look less promising than the join orders starting from t31. (Does MySQL-8 have this issue? No, they don't. They have a found_plan_with_allowed_sj member, and this logic: do not do pruning if there was no complete join order constructed, yet.) BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
A patch for this issue: http://lists.askmonty.org/pipermail/commits/2021-March/014500.html I'll need to check one more thing. On Sun, Mar 14, 2021 at 08:42:40PM +0300, Sergey Petrunia wrote:
On Sat, Mar 13, 2021 at 04:01:56PM +0400, Alexey Botchkov wrote:
I've rewritten it using the json->select_lex->leaf_tables as you suggested. So the JSON_TABLE is calculated properly and all the dependencies seem to be marked. Still your testcase query fails the same way, so needs some more investigation. To me it looks more like optimizer issue than the JSON_TABLE-s.
Can you construct an example that would demonstrate the issue without use of JSON_TABLE? (No). I've looked - it looks like the optimizer is unable to handle certain kinds of dependencies... which didn't exist before JSON_TABLE was added.
== Short ==
The optimizer fails to construct a complete join order: 1. It builds a "dead-end" join prefix, which cannot be expanded to a complete join order. 2. Join prefix pruning removes other join prefixes from consideration.
== Long == Table dependencies:
t20 -> {} t21 -> t20 t31 -> {} t32 -> t31 jt -> t20 t21 t31
jt depends on: t31, because it's a left join t21, because it refers to it in its argument t20 due to transitive clouse of the dependency relation. t21 depends on t20.
The optimizer constructs a prefix:
t31, t32
This is a "dead end", it cannot be extended to a full join order:
we cannot add t20 or t21 because we've entered the join nest:
t1 left join (t32 ...)
and must add all tables in the nest before we add any table that's not part of the nest. The table in the nest is "JSON_TABLE(...) as jt", but we cannot add it, because it needs t21 to be added first.
Why doesn't the optimizer construct a join order starting from e.g.
t20 t21 t31 ...
The issue is in pruning. The join orders starting from t20 are pruned away because they look less promising than the join orders starting from t31.
(Does MySQL-8 have this issue? No, they don't. They have a found_plan_with_allowed_sj member, and this logic: do not do pruning if there was no complete join order constructed, yet.)
BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
-- BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
participants (2)
-
Alexey Botchkov
-
Sergey Petrunia