andrei.elkin@pp.inet.fi writes:
commit 3cebb54e6387a7eace1757c82ed0efd6e11590b9 Author: Andrei Elkin <andrei.elkin@mariadb.com> Date: Fri Feb 9 15:00:23 2018 +0200
MDEV-12746 rpl.rpl_parallel_optimistic_nobinlog fails committing out of order at retry
The 2nd issue was out of order commit by transactions that are ordered after an erroring out transaction. The latter was possible in the test thanks to the above retry failure. Out-of-order committing was done despite the slave was stopping.
Let me see if I understand this correctly: Normally, a transaction in optimistic parallel replication cannot commit out of order because it either waits for the prior transaction to commit first, or, if the prior transaction has stopped with an error earlier, then the following transaction will not even start. But in this case, during retry, there was missing the check that the prior transaction stopped with an error. Therefore, the following transaction's wait_for_prior_commit() does nothing, and it wrongly goes ahead and runs and even commits, even though an earlier transaction failed and replication is stopping. Correct? That's a nice catch of a subtle, but potentially serious bug. I wonder how this is possible though. As I recall, when a transaction fails, this failure is marked in the wait_for_prior_commit data structures. So the retrying transaction should have gotten an error in wait_for_prior_commit() and aborted the retries. So why does this not happen in this case?
The test failures were of two sorts. One is that the number of potential temporary errors actually exceeds the default value of the slave retry option. Increased @@global.slave_transaction_retries fixes that part.
Hm, this shouldn't be needed, I think. The retry waits for all prior transactions to commit first, so after the first retry no further retries should be caused by other parallel replication workers. So I would expect this problem to be a side-effect of the other, real issue, and this increase should not be needed (and might hide another problem later, if left in).
diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc index 34f9113f7fe..9f1d23a1161 100644 --- a/sql/rpl_parallel.cc +++ b/sql/rpl_parallel.cc
@@ -745,7 +761,25 @@ retry_event_group(rpl_group_info *rgi, rpl_parallel_thread *rpt, for (;;) { mysql_mutex_lock(&entry->LOCK_parallel_entry); - register_wait_for_prior_event_group_commit(rgi, entry); + if (entry->stop_on_error_sub_id == (uint64) ULONGLONG_MAX || +#ifndef DBUG_OFF + (DBUG_EVALUATE_IF("simulate_mdev_12746", 1, 0)) || +#endif + rgi->gtid_sub_id < entry->stop_on_error_sub_id) + { + register_wait_for_prior_event_group_commit(rgi, entry); + } + else + { + /* + An earlier transaction by another worker may have not had not + have the current one in subsequent committers list and thus + the actual induced error status could not have been sumbitted. + So it should be set by ourselves now. + */
This comment is hard to understand - what do you mean here?
+ if (!rgi->worker_error) + rgi->worker_error= 1; + }
I would have expected an abort (goto err) here (or well just below, after unlocking the mutex). Is this not needed, and if so, why not?
@@ -716,12 +724,20 @@ retry_event_group(rpl_group_info *rgi, rpl_parallel_thread *rpt, unregistering (and later re-registering) the wait. */ if(thd->wait_for_commit_ptr) - thd->wait_for_commit_ptr->unregister_wait_for_prior_commit(); + thd->wait_for_commit_ptr->unregister_wait_for_prior_commit();
Indentation? - Kristian.