Sergey Vojtovich <svoj@mariadb.org> writes:
Never seen this before. InnoDB of 5.5 and InnoDB/XtraDB of 10.0 both have:
UNIV_INTERN void mutex_set_waiters( /*==============*/ ib_mutex_t* mutex, /*!< in: mutex */ ulint n) /*!< in: value to set */ { volatile ulint* ptr; /* declared volatile to ensure that the value is stored to memory */ ut_ad(mutex);
ptr = &(mutex->waiters);
os_wmb;
*ptr = n; /* Here we assume that the write of a single word in memory is atomic */ }
But I confirm that I see this in XtraDB of 5.5.
Interesting. Clearly I got confused between 5.5/10.0 and the XtraDB/InnoDB distinction. There might be more to learn here. In fact, I was looking mostly at the assembler output, to be sure I correctly understood all the #ifdef magic and actual instructions generated, and only refering back to source code for the description. Looks like storage/innobase have a __sync_lock_test_and_set() (with a full barrier on x86) in the mutex_test_and_set() instead. But making sure that all the different combinations are correct looks like it could be a lot of fun...
Agree. What we basically do is: lock_word= 0; // atomic store with release barrier isync; if (waiters) // non-atomic load
But that's wrong, isn't it? Doesn't the isync need to go _after_ the conditional branch? isync discards any following speculated execution. So if you load a value and make a conditional jump that depends on the value loaded, then no instructions can start until after the load completed. Because the load must complete before the conditional can be decided. This effectively gives LoadLoad and LoadStore order. But isync _before_ a conditional - I am not sure that will do anything. But I guess, this was your point, as well ;-)
Here we need to ensure that store completes before load. But: - there is no isync on x86 - isync doesn't guarantee StoreLoad order (there are many articles about this)
Agree.
So full memory barrier seem to be the only option.
Yes...
2. In a user hang I investigated in detail, a thread was waiting on log_sys->mutex, but no other thread was holding that mutex (consistent with lost wakeup). The server error monitor thread was itself stuck on a mutex, which would prevent sync_arr_wake_threads_if_sema_free() from hiding the problem. Strange... Monty should have fixed this. Error monitor thread should call log_get_lsn_nowait(), which basically does trylock. Do you happen to have call trace?
This investigation was some time ago (maybe 1-2 months). It seems likely that this was a version before Monty's log_get_lsn_nowait() (eg. 10.0.14 has it, but not 10.0.13).
Do you have a link to articles that define full memory barrier for mutex
Hm, I don't really have a reference to point to. Well, I believe the posix thread definition says that a number of operations ensure full memory ordering across them.
lock/unlock? Probably that was quite outdated writing. See e.g.:
Outdated, perhaps, but aquire/release is AFAIK a relatively new concept. Mutexes, and InnoDB source code predates that? I'm not sure that C++ coming up with some strange new semantics for their language has much bearing on legacy code? But I agree it would be nice to have some references about "old-style" mutexes implying full memory barrier, so that it's not just me...
http://en.cppreference.com/w/cpp/atomic/memory_order
...Atomic load with memory_order_acquire or stronger is an acquire operation. The lock() operation on a Mutex is also an acquire operation...
...Atomic store with memory_order_release or stronger is a release operation. The unlock() operation on a Mutex is also a release operation...
Interesting... so C++ defines a "Mutex" with different semantics than what is usually understood with eg. pthread_mutex...
Full memory barriers are way too heavy even for mutexes. All we need to to is to ensure that: - all loads following lock are performed after lock (acquire) - all stores preceding unlock are completed before unlock (release)
Are you sure? Note that if this is true, then it means that there is _no_ way to get a StoreLoad barrier using only normal mutex operations. That seems odd... I know I have seen, and written, code that depends on lock/unlock being full barrier. How can we be sure that such code doesn't also exist in InnoDB? Though I agree that full barrier is a lot more expensive than just LoadLoad or StoreStore, and best avoided if possible (I even blogged about that not so long ago). - Kristian.