Re: [Maria-developers] MDEV-17399: JSON_TABLE, commit 85757ecbef44484270e385a8d52998c67f72d11a
Hi Alexey, First reactions to the JSON_TABLE patch: == CI Build is failing == Please check http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.6-mdev17399-hf the JSON_TABLE tests are failing. == On testcases == I see now the patch has tests in 3 files: 1. mysql-test/main/json_table_mysql.test 2. mysql-test/main/json_table_mysql2.test 3. mysql-test/suite/json/t/json_table.test #1 and #2 are both fixed copies of MySQL's test. Please remove one of them (the one that has less coverage). Then, can we have both files in the same testsuite? == On Table_function_json_table::ready_for_lookup() == Now that we have end_lateral_table, do we need to use Table_function_json_table::ready_for_lookup? This looks like two solutions for [almost] the same problem? I see that this example from json_table_mysql.test still uses the code path with ready_for_lookup() : --error ER_BAD_FIELD_ERROR UPDATE t1, JSON_TABLE(t1.c1,'$[*]' COLUMNS (a INT PATH '$.a')) AS jt1 SET jt1.a=1; Here, the name resolution for "jt1.a" is done without adjustment for end_lateral_table. Do we need the ready_for_lookup() trick here? (If we do, are we fine with m_setup_done to be set to true but never reset back to false? How will this work with PS?) == On opt_sum.cc == The diff has this: diff --git a/sql/opt_sum.cc b/sql/opt_sum.cc index af2d9ddc2e7..d021c651196 100644 --- a/sql/opt_sum.cc +++ b/sql/opt_sum.cc @@ -299,6 +299,7 @@ int opt_sum_query(THD *thd, tl->schema_table) { maybe_exact_count&= MY_TEST(!tl->schema_table && +//mdev-17399 !tl->table_function && (tl->table->file->ha_table_flags() & HA_HAS_RECORDS)); is_exact_count= FALSE; Please remove this if it is no longer needed. == On table dependencies: table elimination == create table t20 (a int not null); create table t21 (a int not null primary key, js varchar(100)); insert into t20 values (1),(2); insert into t21 values (1, '{"a":100}'); explain select t20.a, jt1.ab from t20 left join t21 on t20.a=t21.a join JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) AS jt1; +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | 1 | SIMPLE | t20 | ALL | NULL | NULL | NULL | NULL | 2 | | | 1 | SIMPLE | jt1 | ALL | NULL | NULL | NULL | NULL | 40 | Table function: json_table | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ Here we can see an apparently invalid query plan: table t21 was eleminated even if JSON_TABLE uses it. == On table dependencies: regular subqueries == But it doesn't work for regular subqueries either (and this is unexpected): insert into t21 values (3, '{"a":300}'); explain select a, (select jt1.ab from JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) as jt1 ) from t21; +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | 1 | PRIMARY | t21 | ALL | NULL | NULL | NULL | NULL | 2 | | | 2 | SUBQUERY | jt1 | ALL | NULL | NULL | NULL | NULL | 40 | Table function: json_table | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ Note the 'SUBQUERY', that is, the subquery is considered uncorrelated? This needs to be investigated. BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
On Mon, Feb 22, 2021 at 09:59:54PM +0300, Sergey Petrunia wrote:
== On table dependencies: regular subqueries ==
But it doesn't work for regular subqueries either (and this is unexpected):
insert into t21 values (3, '{"a":300}');
explain select a, (select jt1.ab from JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) as jt1 ) from t21;
+------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | 1 | PRIMARY | t21 | ALL | NULL | NULL | NULL | NULL | 2 | | | 2 | SUBQUERY | jt1 | ALL | NULL | NULL | NULL | NULL | 40 | Table function: json_table | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
Note the 'SUBQUERY', that is, the subquery is considered uncorrelated? This needs to be investigated.
Ok, I see that the subquery is first marked as correlated but then marked as non-correlated here: #0 st_select_lex::update_correlated_cache #1 st_select_lex::optimize_unflattened_subqueries #2 JOIN::optimize_unflattened_subqueries #3 JOIN::optimize_stage2 #4 JOIN::optimize_inner Apparently, st_select_lex::update_correlated_cache doesn't yet account for Table Functions. Please fix. BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
On Mon, Feb 22, 2021 at 09:59:54PM +0300, Sergey Petrunia wrote:
== On table dependencies: table elimination ==
create table t20 (a int not null); create table t21 (a int not null primary key, js varchar(100));
insert into t20 values (1),(2); insert into t21 values (1, '{"a":100}');
explain select t20.a, jt1.ab from t20 left join t21 on t20.a=t21.a join JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) AS jt1;
+------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | 1 | SIMPLE | t20 | ALL | NULL | NULL | NULL | NULL | 2 | | | 1 | SIMPLE | jt1 | ALL | NULL | NULL | NULL | NULL | 40 | Table function: json_table | +------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
Here we can see an apparently invalid query plan: table t21 was eleminated even if JSON_TABLE uses it.
We can do with a simple rule: "do not eliminate tables that are used as argument for any table function". I think this should be achieved as follows: in eliminate_tables(), look at the code that collects a bitmap of tables used in the select list (to prevent them from being eliminated). /* Add tables referred to from the select list */ List_iterator<Item> it(join->fields_list); while ((item= it++)) used_tables |= item->used_tables(); Right below this, add a loop which does the same for table functions. BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
Ok, fixed that as you said. See the last commit to the bb-10.6-mdev17399-hf for details Best regards. HF On Tue, Feb 23, 2021 at 3:57 PM Sergey Petrunia <sergey@mariadb.com> wrote:
On Mon, Feb 22, 2021 at 09:59:54PM +0300, Sergey Petrunia wrote:
== On table dependencies: table elimination ==
create table t20 (a int not null); create table t21 (a int not null primary key, js varchar(100));
insert into t20 values (1),(2); insert into t21 values (1, '{"a":100}');
explain select t20.a, jt1.ab from t20 left join t21 on t20.a=t21.a join JSON_TABLE(t21.js,'$' COLUMNS (ab INT PATH '$.a')) AS jt1;
+------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
| 1 | SIMPLE | t20 | ALL | NULL | NULL | NULL | NULL | 2 | | | 1 | SIMPLE | jt1 | ALL | NULL | NULL | NULL | NULL | 40 | Table function: json_table |
+------+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
Here we can see an apparently invalid query plan: table t21 was
eleminated
even if JSON_TABLE uses it.
We can do with a simple rule: "do not eliminate tables that are used as argument for any table function".
I think this should be achieved as follows: in eliminate_tables(), look at the code that collects a bitmap of tables used in the select list (to prevent them from being eliminated).
/* Add tables referred to from the select list */ List_iterator<Item> it(join->fields_list); while ((item= it++)) used_tables |= item->used_tables();
Right below this, add a loop which does the same for table functions.
BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
participants (2)
-
Alexey Botchkov
-
Sergey Petrunia