Hi!
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Kristian> Michael Widenius <monty@askmonty.org> writes:
+ if (list->wakeup_subsequent_commits_running) + { + mysql_mutex_lock(&list->LOCK_wait_commit); + list->wakeup_subsequent_commits_running= false; + mysql_mutex_unlock(&list->LOCK_wait_commit);
Why do we need the mutex above?
Kristian> I checked this again. Kristian> We _do_ need a full memory barrier here (memory barrier is implied in taking a Kristian> mutex). Otherwise the compiler or CPU could re-order the setting of the Kristian> wakeup_subsequent_commits_running flag with the reads and writes done in the Kristian> list manipulations. This could cause the two threads to corrupt the list as Kristian> they both manipulate it at the same time. This I would like to understand better. (It always nice to know what the cpu/compiler is really doing).. In this case the code is looping over a list and just setting list->wakeup_subsequent_commits_running= false; I don't see how the compiler or CPU could reorder the code to things in different order (as we are not updating anything else than this flag in the loop). This is especially true as setting of cur->wakeup_subsequent_commits_running= true; is done within a mutex and that is the last write to this element of the list. So there is a barrier between we set it and potentially clear it. As the 'clear' may now happen 'any time' (from other threads point of view) I don't see why it needs to be protected. Kristian> But after carefully checking, it seems to me a barrier would be enough, we do Kristian> not need to lock and unlock the mutex. Kristian> Unfortunately, we do not have any good mechanisms for doing memory barriers in Kristian> the MariaDB source currently, AFAIK. I will put a comment here that the Kristian> lock/unlock is only needed for the memory barrier, and could be changed Kristian> later. Then we can re-visit it when the more critical things have been fixed Kristian> (and if we get memory barrier primitives in the source tree). What do you Kristian> think of this suggestion? That is fine with me. Kristian> [I was wondering why I did it like this in the first place. Kristian> Due to chaining between the waiter's mutex and the waitees mutex, the extra Kristian> lock/unlock does prevent that wakeup_subsequent_commits2() can change the flag Kristian> in the middle of unregister_wait_for_prior_commit2() doing its Kristian> operation. Maybe that is why I did it this way. But I could not actually think Kristian> of any problems from such a change in the middle. Kristian> Or maybe I did it this way to emphasise that the code conceptually is locking Kristian> both the waitee and the waiter - but to prevent deadlocks, it temporarily Kristian> releases the waitee lock in the middle. So instead of Kristian> LOCK a; LOCK b; UNLOCK b; UNLOCK a; Kristian> we inject a temporary unlock of a: Kristian> LOCK a; [UNLOCK a]; LOCK b; UNLOCK b; [LOCK a]; UNLOCK a; Kristian> But it does not really make things any clearer.]
The only place where we test this variable, as far as I can find, is in wait_for_commit::register_wait_for_prior_commit()
Kristian> The critical place is in wait_for_commit::unregister_wait_for_prior_commit2()
This thread doesn't know if the other thread is just before mysql_mutex_lock(&waitee->LOCK_wait_commit), inside the lock or after the lock. So when we reset the variable should not matter.
The same is true for unregister_wait_for_prior_commit2. Kristian> What we need to ensure is that Kristian> - All the reads of the list in thread 1 see only writes from other threads Kristian> that happened _before_ the variable was reset. Kristian> - All changes to the list in thread 2 happen only _after_ the variable was Kristian> reset in thread 1. Kristian> So a memory barrier is needed, but as you say, lock/unlock should not be Kristian> needed. I still don't understand the current code fully. The issue is that unregister_wait_for_prior_commit2() can be called by thread 2 just before or just after thread 1 is resetting wakeup_subsequent_commits_running. There is not other variables involved. This means that in unregister_wait_for_prior_commit2() if thread 2 is there just before list->wakeup_subsequent_commits_running is set to false, we will go into this code: if (loc_waitee->wakeup_subsequent_commits_running) { /* When a wakeup is running, we cannot safely remove ourselves from the list without corrupting it. Instead we can just wait, as wakeup is already in progress and will thus be immediate. See comments on wakeup_subsequent_commits2() for more details. */ mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit); while (waiting_for_commit) mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit); } I don't however see any code in :queue_for_group_commit() that will signal COND_wait_commit. What code do we have that will wake up the above code ? Looking at wait_for_commit::wakeup() that makes this: mysql_mutex_lock(&LOCK_wait_commit); waiting_for_commit= false; mysql_cond_signal(&COND_wait_commit); mysql_mutex_unlock(&LOCK_wait_commit); However, in wait_for_commit::register_wait_for_prior_commit() We set wait_for_commit() without a mutex protection ? Is this right? I assume that the code works because things are called in a specific order and some threads can't call other functions until at some certain point when things are safe. Like that wakeup() for a thread can never be called when that thread is just before the first mutex in wait_for_commit::register_wait_for_prior_commit() as in this case 'waiting_for_cond' variable may be set to false (in wakeup()) even if it should be true. Regards, Monty