Hi again, Sergey!

So i pumped out the patch where all mentioned problems were fixed.
The branch name bb-17399-hf.
Including that optimizer problem and the OPTIMIZER_TRACE

So loo



On Wed, Sep 23, 2020 at 5:06 PM Sergey Petrunia <sergey@mariadb.com> wrote:
Hi Alexey,

On Mon, Sep 14, 2020 at 03:38:38PM +0400, Alexey Botchkov wrote:
> Hi, Sergei!.
>
> so the branch name is 'bb-mdef17399-hf'.
> It has slightly changed since last patch so please pay attention.
>
> There i present the patch that resolves issues that you mentioned
> in your last comments.
> One important is not fixed though.
> That is:
> CREATE TABLE t1(id INT, f1 JSON);
> INSERT INTO t1 VALUES (1, '{\"1\": 1}'), (2, '{\"1\": 2}'), (3, '{\"1\":
> 3}'),
> (4, '{\"1\": 4}'), (5, '{\"1\": 5}'), (6, '{\"1\": 6}');
>
> SELECT * FROM t1 WHERE id IN
>   (SELECT id FROM t1 as tt2,
>       JSON_TABLE(f1, "$" COLUMNS (jf FOR ORDINALITY)) AS tbl);
>
> That SELECT crashes in the optimizer, as the greedy_search() can't find any
> satisfying plan.
> I got rid of the crash with this line:
> @@ -9520,7 +9528,7 @@ best_extension_by_limited_search(JOIN      *join,
>      table_map real_table_bit= s->table->map;
>      if ((remaining_tables & real_table_bit) &&
>          (allowed_tables & real_table_bit) &&
> -        !(remaining_tables & s->dependent) &&
> +//        !(remaining_tables & s->dependent) &&
>
> But i don't mean it's an acceptable fix.
>
> It seems to me the problem is in the optimizer, so i'd like to ask your
> opinion.
> Maybe you just know the answer at once.

After some debugging I see that the following happens:

1. Execution reaches Table_function_json_table::setup

It computes

   m_dep_tables= m_json->used_tables();

this is =1 (In the subquery, table tt2 has map=1)

2. Then, subquery is converted into a semi-join. That is, it is merged into its
parent subquery. There, we have:

table t1: map=1
table tt2:  map=2
table JSON_TABLE(...) AS tbl: map=3

3.  The subquery was uncorrelated, so SJ-Materialization is a possible option and
the join optimizer attempts to construct a join order for the subquery tables.

However it can't succeed, as the table "JSON_TABLE(..) AS tbl" is set to be
dependent on table with map=1, which is not put into the join order because it
is not a part of subquery.
As a result, we get assertion failure when the optimizer fails to produce any
join orders.

The probem here is on step #2.
When the subquery's tables are moved to the parent query, they get new values
of TABLE::map, and also all their attributes that contain table maps need to
be updated.

This done here in convert_subq_to_sj() function:

```
  /* n. Walk through child's tables and adjust table->map */
  List_iterator_fast<TABLE_LIST> si(subq_lex->leaf_tables);
  while ((tl= si++))
  {
    tl->set_tablenr(table_no);
    if (tl->is_jtbm())
    {
      tl->jtbm_table_no= table_no;
      Item *dummy= tl->jtbm_subselect;
      tl->jtbm_subselect->fix_after_pullout(parent_lex, &dummy, true);
```

In that code one can see: if the subquery has table that has 'is_jtbm()==true',
then ... some processing is done to re-compute its attributes after the table
has been pulled up into the parent select.

The same should be done for TABLE_LIST elements that are JSON_TABLE(..).

If you'll introduce a new function, fix_after_pullout() is a good name. It is
already used for Item-derived classes.

> If not, i'll dig into it.
> And you're welcome to observe the new patch.
>
> Best regards.
> HF

--
BR
 Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog