Hi, Nikita, On Sep 29, Nikita Malyavin wrote:
revision-id: a71b13433f9 (mariadb-11.2.1-7-ga71b13433f9) parent(s): 883e7e5f7e7 author: Nikita Malyavin committer: Nikita Malyavin timestamp: 2023-09-22 17:55:14 +0400 message:
MDEV-32126 Assertion fails upon online ALTER and binary log enabled
Assertion `!writer.checksum_len || writer.remains == 0' fails upon concurrent online ALTER and transactions with failing statements and binary log enabled. Also another assertion, `pos != (~(my_off_t) 0)', fails in my_seek, upon reinit_io_cache, on a simplified test. This means that IO_CACHE wasn't properly initialized, or had an error before.
The overall problem is a deep interference with the effect of an installed binlog_hton.
As I wrote in Jira, please, provide a better explanation here. I had to debug the issue myself and as far as I can see, the problem happens because you register binlog_hton for a transaction, but doesn't prepare the cache manager for it. This can be fixed simply with @@ -2279,6 +2279,12 @@ int binlog_log_row_online_alter(TABLE* table, const ucha> trans_register_ha(thd, false, binlog_hton, 0); if (thd->in_multi_stmt_transaction_mode()) trans_register_ha(thd, true, binlog_hton, 0); + if (mysql_bin_log.is_open()) + { + binlog_cache_mngr *cache_mngr= thd->binlog_get_cache_mngr(); + if (cache_mngr && cache_mngr->trx_cache.get_prev_position() == MY_OFF_T_UNDEF) + thd->binlog_set_stmt_begin(); + } } which solves all cases from the bug report. Don't get me wrong, I don't suggest this fix, it's kind of hackish, and I liked how other pieces of the puzzle fell into a place after a separate handlerton, like savepoints and online_alter_cache_list. I'm just saying that "magical interference with the binlog_hton" doesn't explain the problem. At all.
Solution: Extract online alter operations into a separate handlerton.
--- a/sql/CMakeLists.txt +++ b/sql/CMakeLists.txt @@ -218,6 +218,8 @@ MYSQL_ADD_PLUGIN(partition ha_partition.cc STORAGE_ENGINE DEFAULT STATIC_ONLY RECOMPILE_FOR_EMBEDDED) MYSQL_ADD_PLUGIN(sql_sequence ha_sequence.cc STORAGE_ENGINE MANDATORY STATIC_ONLY RECOMPILE_FOR_EMBEDDED) +MYSQL_ADD_PLUGIN(online_alter_log log.cc STORAGE_ENGINE MANDATORY STATIC_ONLY +NOT_EMBEDDED)
better add it next to builtin_maria_binlog_plugin, instead of via MYSQL_ADD_PLUGIN.
ADD_LIBRARY(sql STATIC ${SQL_SOURCE}) MAYBE_DISABLE_IPO(sql) diff --git a/sql/log.cc b/sql/log.cc index ee78844464e..132ec254adf 100644 --- a/sql/log.cc +++ b/sql/log.cc @@ -12522,3 +12512,61 @@ void wsrep_register_binlog_handler(THD *thd, bool trx) }
#endif /* WITH_WSREP */ + +static int online_alter_close_connection(handlerton *hton, THD *thd) +{ + DBUG_ASSERT(thd->online_alter_cache_list.empty()); + return 0; +} + +handlerton *online_alter_hton; + +int online_alter_log_init(void *p) +{ + online_alter_hton= (handlerton *)p; + online_alter_hton->db_type= DB_TYPE_ONLINE_ALTER; + online_alter_hton->savepoint_offset= sizeof(my_off_t); + online_alter_hton->close_connection= online_alter_close_connection; + + online_alter_hton->savepoint_set= // Done by online_alter_savepoint_set + [](handlerton *, THD *, void *){ return 0; }; + online_alter_hton->savepoint_rollback= // Done by online_alter_savepoint_rollback + [](handlerton *, THD *, void *){ return 0; }; + online_alter_hton->savepoint_rollback_can_release_mdl= + [](handlerton *hton, THD *thd){ return true; }; + + online_alter_hton->commit= [](handlerton *, THD *thd, bool all) + { return binlog_online_alter_end_trans(thd, all, true); }; + online_alter_hton->rollback= [](handlerton *, THD *thd, bool all) + { return binlog_online_alter_end_trans(thd, all, false); }; + online_alter_hton->commit_by_xid= [](handlerton *hton, XID *xid) + { return binlog_online_alter_end_trans(current_thd, true, true); }; + online_alter_hton->rollback_by_xid= [](handlerton *hton, XID *xid) + { return binlog_online_alter_end_trans(current_thd, true, false); };
should it be so? commit_by_xid/rollback_by_xid take XID as an argument ad you ignore it. If commit_by_xid/rollback_by_xid shouldn't be called for online alter, don't set them at all. Or set to something that always fails (and add an assert too).
+ + online_alter_hton->drop_table= [](handlerton *, const char*) { return -1; };
yeah. Like this.
+ online_alter_hton->flags= HTON_NOT_USER_SELECTABLE | HTON_HIDDEN + | HTON_NO_ROLLBACK;
better remove HTON_NO_ROLLBACK. As far as I can see it doesn't do anything here anyway, so the only effect it has - everyone reading the code will stop and thing "why HTON_NO_ROLLBACK is here?" Takes at least 5-10 minutes to look through all places where HTON_NO_ROLLBACK is tested to see that online_alter_hton cannot be there.
+ return 0; +} + +struct st_mysql_storage_engine online_alter_storage_engine= +{ MYSQL_HANDLERTON_INTERFACE_VERSION }; + +maria_declare_plugin(online_alter_log) +{ + MYSQL_STORAGE_ENGINE_PLUGIN, + &online_alter_storage_engine, + "online_alter_log", + "MariaDB PLC",
"plc" lowercase
+ "This is a pseudo storage engine to represent the online alter log in a transaction", + PLUGIN_LICENSE_GPL, + online_alter_log_init, + NULL, + 0x0100, // 1.0 + NULL, // no status vars + NULL, // no sysvars + "1.0", + MariaDB_PLUGIN_MATURITY_STABLE +} +maria_declare_plugin_end;
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org