On Mon, 15 Nov 2021 at 13:52, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita!

On Nov 14, Nikita Malyavin wrote:
> revision-id: a5fca9a6e30 (mariadb-10.5.2-478-ga5fca9a6e30)
> parent(s): ad6e7b87107
> author: Nikita Malyavin
> committer: Nikita Malyavin
> timestamp: 2021-01-27 17:28:05 +1000
> message:
>
> MENT-651 [6/8] store cache managers in a list

again, a couple of lines with a description would be good here

I think it should be just a fixup for [4/8] ALTER ONLINE TABLE.
 
> diff --git a/mysql-test/suite/binlog/t/online_alter.test b/mysql-test/suite/binlog/t/online_alter.test
> index 804e847d008..4ed2db74bd6 100644
> --- a/mysql-test/suite/binlog/t/online_alter.test
> +++ b/mysql-test/suite/binlog/t/online_alter.test
> @@ -218,7 +219,7 @@ set autocommit = 0;
>  start transaction;
>  update t1 set b= 7007 where a = 7;
>  --error ER_DUP_ENTRY
> -update t1 set a= 8 where a = 8 or a = 9;
> +update t1 set a= 8, b= 8008 where a = 8 or a = 9 order by a;

why did you change tests for what is, as far as I understand, just an
optimization?

That's the point, that it's not! 

The thing is, in case of an error, the table is invalidated. As far as I remember, it could happen even if the table could've just evicted from the cache (in case of a long transaction involving many tables).

So when committing, corresponding TABLE_SHARE may be inaccessible, by different reasons.

So it turns to practically correct that handlerton can't depend from TABLE or TABLE_SHARE.

>  commit;
>  set autocommit = 1;

> diff --git a/sql/handler.cc b/sql/handler.cc
> index 8b95653e69f..8b22167e729 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -6660,7 +6660,16 @@ static int binlog_log_row_online_alter(TABLE* table,
>    THD *thd= table->in_use;

>    if (!table->online_alter_cache)
> -    table->online_alter_cache= thd->binlog_setup_cache_data();
> +  {
> +    auto *cache_mngr= online_alter_binlog_get_cache_mngr(thd, table);
> +    // Use transaction cache directly, if it is not multi-transaction mode
> +    table->online_alter_cache= binlog_get_cache_data(cache_mngr,
> +                                        !thd->in_multi_stmt_transaction_mode());
> +
> +    trans_register_ha(thd, false, binlog_hton, 0);
> +    if (thd->in_multi_stmt_transaction_mode())
> +      trans_register_ha(thd, true, binlog_hton, 0);

I still don't understand that.
Why do you need a handlerton, why do you fake a transaction?
Why do you call it faking a transaction??
I need a handlerton.
I am not sure if this was a right place for this hunk. Looks more like a "Fix running without binlog" part.

And why do you need to search a list in THD, instead of storing
cache_mngr in TABLE_SHARE?

See above I guess
 
> +  }

>    // We need to log all columns for the case if alter table changes primary key.
>    table->use_all_columns();
> diff --git a/sql/log.cc b/sql/log.cc
> index 6c678624230..c58096e6d9a 100644
> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -2258,14 +2298,18 @@ static int binlog_rollback(handlerton *hton, THD *thd, bool all)
>  {
>    DBUG_ENTER("binlog_rollback");

> -  for (TABLE *table= thd->open_tables; table; table= table->next)
> -  {
> -    if (!table->online_alter_cache)
> -      continue;
> -    table->online_alter_cache->reset();
> -    delete table->online_alter_cache;
> -    table->online_alter_cache= NULL;
> -  }
> +  bool is_ending_trans= ending_trans(thd, all);
> +
> +  /*
> +    This is a crucial moment that we are running through
> +    thd->online_alter_cache_list, and not through thd->open_tables to cleanup
> +    stmt cache, though both have it. The reason is that the tables can be closed
> +    to that moment in case of an error.
> +    The same reason applies to the fact we don't store cache_mngr in the table
> +    itself -- because it can happen to be not existing.
> +    Still in case if tables are left opened
> +   */
> +  binlog_online_alter_cleanup(thd->online_alter_cache_list, is_ending_trans);

still don't understand that comment either :(

Still in case of table left opened....
we'll never know what I meant.

I suppose, in the case if tables are left opened, it still is error-prone. I guess TABLE_SHARE is not evicted and TABLE is left half-initialized, that's why I was lucky to have correct results after COMMIT.
But in case of an error, table is definitely got garbaged, so COMMIT might be empty, because the trx_cache would be lost.


I guess I could have a following behavior in early versions:
1. Store one only transaction cache in TABLE.
2. On commit, append trx cache into online binlog.
3. On rollback do nothing.

This is incorrect, because it does not handle errors -- in case of error we should discard only statement changes.
The better implementation would be:
1. Store BOTH statement and transaction caches in TABLE
2. On statement success, append stmt cache to trx cache
3. On stmt failure, do nothing (except stmt cache is discarded, but that's on every step)
4. On COMMIT, append trx cache into online binlog
5. On rollback, do nothing

It will fail because when error happens, TABLE can be deleted, reopened, etc. It anyway can be reopened or something. We can even flush it!

So we CANNOT store the caches (at least trx cache) in TABLE! Neither we can in TABLE_SHARE.
This data should not be flushed.

The only correct place turns out to be a handlerton! And note how consistent it is with an idea where the transaction data should be stored by design!

Sincerely,
Nikita.