Howdy Kristian.
Before doing mark_start_commit(), check that there is no pending deadlock kill. If there is a pending kill, we won't commit (we will abort, roll back, and retry). Then we should not mark the commit as started, since that could potentially make the following GCO start too early, before we completed the commit after the retry.
This condition could trigger in some corner cases, where InnoDB would take temporarily table/row locks that are released again immediately, not held until the transaction commits. This happens with dict_stats updates and possibly auto-increment locks.
Such locks can be passed to thd_rpl_deadlock_check() and cause a deadlock kill to be scheduled in the background. But since the blocking locks are held only temporarily, they can be released before the background kill happens. This way, the kill can be delayed until after mark_start_commit() has been called. Thus we need to check the synchronous indication rgi->killed_for_retry, not just the asynchroneous thd->killed.
I think I understood what's going on, also thanks to a verbose Jira ticket description.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org> --- sql/rpl_parallel.cc | 20 ++++++++++++++++---- sql/rpl_rli.cc | 17 +++++++++++++++++ 2 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc index 1cfdf96ee3b..9c4222d7817 100644 --- a/sql/rpl_parallel.cc +++ b/sql/rpl_parallel.cc @@ -1450,11 +1450,23 @@ handle_rpl_parallel_thread(void *arg) after mark_start_commit(), we have to unmark, which has at least a theoretical possibility of leaving a window where it looks like all transactions in a GCO have started committing, while in fact one - will need to rollback and retry. This is not supposed to be possible - (since there is a deadlock, at least one transaction should be - blocked from reaching commit), but this seems a fragile ensurance, - and there were historically a number of subtle bugs in this area. + will need to rollback and retry. + + Normally this will not happen, since the kill is there to resolve a + deadlock that is preventing at least one transaction from proceeding. + One case it can happen is with InnoDB dict stats update, which can + temporarily cause transactions to block each other, but locks are + released immediately, they don't linger until commit. There could be + other similar cases, there were historically a number of subtle bugs + in this area. + + But once we start the commit, we can expect that no new lock + conflicts will be introduced. So by handling any lingering deadlock + kill at this point just before mark_start_commit(), we should be + robust even towards spurious deadlock kills. */ + if (rgi->killed_for_retry != rpl_group_info::RETRY_KILL_NONE) + wait_for_pending_deadlock_kill(thd, rgi);
Assuming my understanding, please correct me if anything, I left a question on this block in https://github.com/MariaDB/server/commit/df0c36a354ffde2766ebed2615642c74b79... quoted further. ... this guard would not let a victim of an optimistic conflict proceed until it clears out itself it is such a victim. For instance in the binlog sequence like T_1, T_2, D_3 where T_2 is the victim, D_3 is a DDL, the victim T_2 won't anymore release the next gco member D_3 into its execution. But what if at this point T_2 is only going to be marked in rgi->killed_for_retry by T_1? That is it is apparently able, with this patch as well, having ...NONE killed status go through. T_1 would reach this very point later and when that is done before T_2 finds itself killed, T_1 would release D_3, and afterward T_2 would finally see itself KILLed into retry. Cheers, Andrei