Hi, Nikita,
On Aug 08, Nikita Malyavin wrote:
> revision-id: 88a77860a8b (mariadb-11.0.1-194-g88a77860a8b)
> parent(s): 73a677f0306
> author: Nikita Malyavin
> committer: Nikita Malyavin
> timestamp: 2023-08-07 22:45:11 +0400
> message:
>
> MDEV-31804 Assertion `thd->m_transaction_psi == __null' fails
>
> ... upon replicating online ALTER
>
> When an online event is applied and slave_exec_mode is idempotent,
> Write_rows_log_event::do_before_row_operations had reset
> thd->lex->sql_command to SQLCOM_REPLACE.
>
> This led to that a statement was detected as a row-type during
> binlogging, and was logged as not standalone.
>
> So the corresponding Gtid_log_event, when applied on replica, did not
> exit early and created a new PSI transaction. Hence the difference
> with non-online ALTER.
Nice find.
I vaguely believe I've seen in one of the commits I've reviewed a change
that makes every row event set thd->lex->sql_command to the
corresponding SQLCOM_INSERT/UPDATE/DELETE. But be it was in an unpushed
commit.
Anyway, I think it's a bit risky to assume that row events never change
thd->lex->sql_command (and never will). Perhaps you should simply
restore it after the applying part? Like
--- a/sql/sql_table.cc
+++ b/sql/sql_table.cc
@@ -12181,6 +12181,7 @@ copy_data_between_tables(THD *thd, TABLE *from, TABLE *>
if (error)
from->s->tdc->flush_unused(1); // to free the binlog
to->pos_in_table_list= NULL; // Safety
+ thd->lex->sql_command= SQLCOM_ALTER_TABLE;
}
else if (online) // error was on copy stage
{
Regards,
Sergei
VP of MariaDB Server Engineering
and security@mariadb.org