Re: [Maria-developers] f1db957c5da: MDEV-21469: Implement crash-safe binary logging of the user XA
Hi, Andrei! On Mar 16, Andrei Elkin wrote:
On Mar 14, Sujatha wrote:
revision-id: f1db957c5da (mariadb-10.5.2-247-gf1db957c5da) parent(s): dc92235f21e author: Sujatha <sujatha.sivakumar@mariadb.com> committer: Andrei Elkin <andrei.elkin@mariadb.com> timestamp: 2020-12-21 16:10:46 +0200 message:
MDEV-21469: Implement crash-safe binary logging of the user XA ... diff --git a/sql/log.cc b/sql/log.cc index 731bb3e98f0..c69b8518cf4 100644 --- a/sql/log.cc +++ b/sql/log.cc @@ -2019,28 +2027,49 @@ static int binlog_xa_recover_dummy(handlerton *hton __attribute__((unused)), return 0; }
- +/* + The function invokes binlog_commit() and returns its result + when it has not yet called it already. + binlog_cache_mngr::completed_by_xid remembers the fact of + the 1st of maximum two subsequent calls. +*/ static int binlog_commit_by_xid(handlerton *hton, XID *xid) { + int rc= 0; THD *thd= current_thd; - - (void) thd->binlog_setup_trx_data(); + binlog_cache_mngr *cache_mngr= thd->binlog_setup_trx_data();
DBUG_ASSERT(thd->lex->sql_command == SQLCOM_XA_COMMIT);
- return binlog_commit(hton, thd, TRUE); -} + if (!cache_mngr->completed_by_xid)
On what code path can you have completed_by_xid == false here?
Let me answer broadly, the purpose of `completed_by_xid' is to make sure binlog_hton->commit_by_xid() does not invoke lower level binlog_commit() for the 2nd time at the following function.
I understand that. And I mean that if commit_by_xid is always true, then binlog_hton->commit_by_xid can be a noop, because you always invoke binlog_commit_by_xid (or binlog_rollback_by_xid) explicitly.
I leave only two essential statements prepared with explanatory comments:
ha_commit_or_rollback_by_xid() { // binlog_commit() has to be called as first hton ...
binlog_hton->commit_by_xid(binlog_hton, xid);
// ... So the above raises the flag (sets up a fence) to block binlog_hton // run for the 2nd time in the loop here:
plugin_foreach(NULL, commit ? xacommit_handlerton : xarollback_handlerton, MYSQL_STORAGE_ENGINE_PLUGIN, &xaop);
Now I am thinking a (much) better option exists to block that 2nd time invocation of binlog_commit() with merely testing the type of hton in
static my_bool xacommit_handlerton(THD *unused1, plugin_ref plugin, void *arg) { handlerton *hton= plugin_hton(plugin); - if (hton->recover) + if (hton->recover && hton != binlog_hton) { hton->commit_by_xid();
and the same to xarollback_handlerton. I am now committed to turn to it.
No, see above. Ideally, binlog won't have a hton and it won't be in the list of engines at all. But if that's a too big step to make now, then just make binlog_hton->commit_by_xid noop (or NULL, if that works) and invoke binlog_commit_by_xid() explicitly by name.
+ { + rc= binlog_commit(hton, thd, TRUE); + cache_mngr->completed_by_xid= true; + }
+ return rc; +}
@@ -10333,14 +10386,108 @@ start_binlog_background_thread() ... + +/** + Performs recovery based on transaction coordinator log for 2pc. At the + time of crash, if the binary log was in active state, then recovery for + "implicit" 'xid's and explicit 'XA' transactions is initiated, + otherwise merely the gtid binlog state is updated. + For 'xid' and 'XA' based recovery the following steps are performed. + + Identify the active binlog checkpoint file. + Scan the binary log from the beginning. + From GTID_LIST and GTID_EVENTs reconstruct the gtid binlog state. + Prepare a list of 'xid's for recovery. + Prepare a list of explicit 'XA' transactions for recovery. + Recover the 'xid' transactions. + The explicit 'XA' transaction recovery is initiated once all the server + components are initialized. Please check 'execute_xa_for_recovery()'.
why explicit XA recovery is delayed?
It has to wait for at least until after init_update_queries(); in init_server_components() without which
Better move init_update_queries() to happen before recovery. As far as I can see, init_update_queries() only sets flags in the array of commands, it does not depend on anything and can be done anytime. Ideally all the recovery could happen in one place, not split over two separate functions that are invoked at separate points in time during the startup.
@@ -10561,6 +10734,219 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name, + { + enable_apply_event= true; + // partially engine-prepared XA is first cleaned out prior replay + thd->lex->sql_command= SQLCOM_XA_ROLLBACK; + ha_commit_or_rollback_by_xid(&gev->xid, 0); + } + else + --recover_xa_count; + } + } + else if (gev->flags2 & Gtid_log_event::FL_COMPLETED_XA) + { + if (member->state == XA_COMPLETE && + member->in_engine_prepare > 0) + enable_apply_event= true;
why? you cannot replay a fully prepared and partially committed transaction
We might make use of slave_exec_mode = IDEMPOTENT but only ROW format events are subject to the mode.
Yes, so why you're trying to replay it instead of just completing the commit? Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
participants (1)
-
Sergei Golubchik