Michael Widenius <monty@askmonty.org> writes:
+ if (list->wakeup_subsequent_commits_running) + { + mysql_mutex_lock(&list->LOCK_wait_commit); + list->wakeup_subsequent_commits_running= false; + mysql_mutex_unlock(&list->LOCK_wait_commit);
Why do we need the mutex above?
I checked this again. We _do_ need a full memory barrier here (memory barrier is implied in taking a mutex). Otherwise the compiler or CPU could re-order the setting of the wakeup_subsequent_commits_running flag with the reads and writes done in the list manipulations. This could cause the two threads to corrupt the list as they both manipulate it at the same time. But after carefully checking, it seems to me a barrier would be enough, we do not need to lock and unlock the mutex. Unfortunately, we do not have any good mechanisms for doing memory barriers in the MariaDB source currently, AFAIK. I will put a comment here that the lock/unlock is only needed for the memory barrier, and could be changed later. Then we can re-visit it when the more critical things have been fixed (and if we get memory barrier primitives in the source tree). What do you think of this suggestion? [I was wondering why I did it like this in the first place. Due to chaining between the waiter's mutex and the waitees mutex, the extra lock/unlock does prevent that wakeup_subsequent_commits2() can change the flag in the middle of unregister_wait_for_prior_commit2() doing its operation. Maybe that is why I did it this way. But I could not actually think of any problems from such a change in the middle. Or maybe I did it this way to emphasise that the code conceptually is locking both the waitee and the waiter - but to prevent deadlocks, it temporarily releases the waitee lock in the middle. So instead of LOCK a; LOCK b; UNLOCK b; UNLOCK a; we inject a temporary unlock of a: LOCK a; [UNLOCK a]; LOCK b; UNLOCK b; [LOCK a]; UNLOCK a; But it does not really make things any clearer.]
The only place where we test this variable, as far as I can find, is in wait_for_commit::register_wait_for_prior_commit()
The critical place is in wait_for_commit::unregister_wait_for_prior_commit2()
This thread doesn't know if the other thread is just before mysql_mutex_lock(&waitee->LOCK_wait_commit), inside the lock or after the lock. So when we reset the variable should not matter.
What we need to ensure is that - All the reads of the list in thread 1 see only writes from other threads that happened _before_ the variable was reset. - All changes to the list in thread 2 happen only _after_ the variable was reset in thread 1. So a memory barrier is needed, but as you say, lock/unlock should not be needed.
But somehow this sounds strange. What happens if the above variable is reset just before mysql_mutex_lock(&waitee->LOCK_wait_commit) in wait_for_commit::register_wait_for_prior_commit() ? Isn't the waitee added to the list when it shouldn't?
That should be ok. The caller ensures that wakeup_subsequent_commits2() will be called _after_ the last call to register_wait_for_prior_commit() has completed. This is done in this code in handle_rpl_parallel_thread(): mysql_mutex_lock(&entry->LOCK_parallel_entry); if (entry->last_committed_sub_id < event_gtid_sub_id) { entry->last_committed_sub_id= event_gtid_sub_id; mysql_cond_broadcast(&entry->COND_parallel_entry); } mysql_mutex_unlock(&entry->LOCK_parallel_entry); rgi->commit_orderer.wakeup_subsequent_commits(); and mysql_mutex_lock(&entry->LOCK_parallel_entry); if (wait_for_sub_id > entry->last_committed_sub_id) { wait_for_commit *waitee= &rgi->wait_commit_group_info->commit_orderer; rgi->commit_orderer.register_wait_for_prior_commit(waitee); } mysql_mutex_unlock(&entry->LOCK_parallel_entry); Thanks, - Kristian.