Re: [Maria-developers] New patch on the JSON_TABLE.
Hi Alexey, On Mon, Sep 14, 2020 at 03:38:38PM +0400, Alexey Botchkov wrote:
Hi, Sergei!.
so the branch name is 'bb-mdef17399-hf'. It has slightly changed since last patch so please pay attention.
There i present the patch that resolves issues that you mentioned in your last comments. One important is not fixed though. That is: CREATE TABLE t1(id INT, f1 JSON); INSERT INTO t1 VALUES (1, '{\"1\": 1}'), (2, '{\"1\": 2}'), (3, '{\"1\": 3}'), (4, '{\"1\": 4}'), (5, '{\"1\": 5}'), (6, '{\"1\": 6}');
SELECT * FROM t1 WHERE id IN (SELECT id FROM t1 as tt2, JSON_TABLE(f1, "$" COLUMNS (jf FOR ORDINALITY)) AS tbl);
That SELECT crashes in the optimizer, as the greedy_search() can't find any satisfying plan. I got rid of the crash with this line: @@ -9520,7 +9528,7 @@ best_extension_by_limited_search(JOIN *join, table_map real_table_bit= s->table->map; if ((remaining_tables & real_table_bit) && (allowed_tables & real_table_bit) && - !(remaining_tables & s->dependent) && +// !(remaining_tables & s->dependent) &&
But i don't mean it's an acceptable fix.
It seems to me the problem is in the optimizer, so i'd like to ask your opinion. Maybe you just know the answer at once.
After some debugging I see that the following happens: 1. Execution reaches Table_function_json_table::setup It computes m_dep_tables= m_json->used_tables(); this is =1 (In the subquery, table tt2 has map=1) 2. Then, subquery is converted into a semi-join. That is, it is merged into its parent subquery. There, we have: table t1: map=1 table tt2: map=2 table JSON_TABLE(...) AS tbl: map=3 3. The subquery was uncorrelated, so SJ-Materialization is a possible option and the join optimizer attempts to construct a join order for the subquery tables. However it can't succeed, as the table "JSON_TABLE(..) AS tbl" is set to be dependent on table with map=1, which is not put into the join order because it is not a part of subquery. As a result, we get assertion failure when the optimizer fails to produce any join orders. The probem here is on step #2. When the subquery's tables are moved to the parent query, they get new values of TABLE::map, and also all their attributes that contain table maps need to be updated. This done here in convert_subq_to_sj() function: ``` /* n. Walk through child's tables and adjust table->map */ List_iterator_fast<TABLE_LIST> si(subq_lex->leaf_tables); while ((tl= si++)) { tl->set_tablenr(table_no); if (tl->is_jtbm()) { tl->jtbm_table_no= table_no; Item *dummy= tl->jtbm_subselect; tl->jtbm_subselect->fix_after_pullout(parent_lex, &dummy, true); ``` In that code one can see: if the subquery has table that has 'is_jtbm()==true', then ... some processing is done to re-compute its attributes after the table has been pulled up into the parent select. The same should be done for TABLE_LIST elements that are JSON_TABLE(..). If you'll introduce a new function, fix_after_pullout() is a good name. It is already used for Item-derived classes.
If not, i'll dig into it. And you're welcome to observe the new patch.
Best regards. HF
-- BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Hi again, Sergey! So i pumped out the patch where all mentioned problems were fixed. The branch name bb-17399-hf. Including that optimizer problem and the OPTIMIZER_TRACE So loo On Wed, Sep 23, 2020 at 5:06 PM Sergey Petrunia <sergey@mariadb.com> wrote:
Hi Alexey,
On Mon, Sep 14, 2020 at 03:38:38PM +0400, Alexey Botchkov wrote:
Hi, Sergei!.
so the branch name is 'bb-mdef17399-hf'. It has slightly changed since last patch so please pay attention.
There i present the patch that resolves issues that you mentioned in your last comments. One important is not fixed though. That is: CREATE TABLE t1(id INT, f1 JSON); INSERT INTO t1 VALUES (1, '{\"1\": 1}'), (2, '{\"1\": 2}'), (3, '{\"1\": 3}'), (4, '{\"1\": 4}'), (5, '{\"1\": 5}'), (6, '{\"1\": 6}');
SELECT * FROM t1 WHERE id IN (SELECT id FROM t1 as tt2, JSON_TABLE(f1, "$" COLUMNS (jf FOR ORDINALITY)) AS tbl);
That SELECT crashes in the optimizer, as the greedy_search() can't find any satisfying plan. I got rid of the crash with this line: @@ -9520,7 +9528,7 @@ best_extension_by_limited_search(JOIN *join, table_map real_table_bit= s->table->map; if ((remaining_tables & real_table_bit) && (allowed_tables & real_table_bit) && - !(remaining_tables & s->dependent) && +// !(remaining_tables & s->dependent) &&
But i don't mean it's an acceptable fix.
It seems to me the problem is in the optimizer, so i'd like to ask your opinion. Maybe you just know the answer at once.
After some debugging I see that the following happens:
1. Execution reaches Table_function_json_table::setup
It computes
m_dep_tables= m_json->used_tables();
this is =1 (In the subquery, table tt2 has map=1)
2. Then, subquery is converted into a semi-join. That is, it is merged into its parent subquery. There, we have:
table t1: map=1 table tt2: map=2 table JSON_TABLE(...) AS tbl: map=3
3. The subquery was uncorrelated, so SJ-Materialization is a possible option and the join optimizer attempts to construct a join order for the subquery tables.
However it can't succeed, as the table "JSON_TABLE(..) AS tbl" is set to be dependent on table with map=1, which is not put into the join order because it is not a part of subquery. As a result, we get assertion failure when the optimizer fails to produce any join orders.
The probem here is on step #2. When the subquery's tables are moved to the parent query, they get new values of TABLE::map, and also all their attributes that contain table maps need to be updated.
This done here in convert_subq_to_sj() function:
``` /* n. Walk through child's tables and adjust table->map */ List_iterator_fast<TABLE_LIST> si(subq_lex->leaf_tables); while ((tl= si++)) { tl->set_tablenr(table_no); if (tl->is_jtbm()) { tl->jtbm_table_no= table_no; Item *dummy= tl->jtbm_subselect; tl->jtbm_subselect->fix_after_pullout(parent_lex, &dummy, true); ```
In that code one can see: if the subquery has table that has 'is_jtbm()==true', then ... some processing is done to re-compute its attributes after the table has been pulled up into the parent select.
The same should be done for TABLE_LIST elements that are JSON_TABLE(..).
If you'll introduce a new function, fix_after_pullout() is a good name. It is already used for Item-derived classes.
If not, i'll dig into it. And you're welcome to observe the new patch.
Best regards. HF
-- BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
participants (2)
-
Alexey Botchkov
-
Sergey Petrunia