Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Hm, I don't really have a reference to point to. Well, I believe the posix thread definition says that a number of operations ensure full memory ordering across them.
I guess I was thinking of this: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_... It is somewhat wague: "The following functions synchronize memory with respect to other threads: ...". I thought some more about this. Consider a simple example: THREAD A THREAD B lock(M) foo() unlock(M) lock(M) bar() unlock(M) Here, I would expect that bar() in B can see all changes made by foo() in A, because B locked the mutex after A released it. This requires that stores in foo() are completed before loads in bar() begin, effectively a full barrier. However, this can be implemented in different ways. It doesn't have to be done with an explicit full barrier CPU instruction, as long as the correct semantics is preserved. So let's consider unlock() with release semantics, and lock() with aquire semantics: unlock(M): __sync_lock_release(M->lock_word) lock(M): while (__sync_lock_test_and_set(M->lock_word, 1)) /* wait for lock */ ; The release barrier ensures that stores in foo() will complete before the store of lock_word. The acquire barrier in the wait loop ensures that loads in bar() cannot start until a load in the while loop has seen the store to lock_word. Thus, loads in bar() become ordered after stores in foo(), effectively a full memory barrier even though no explicit full barrier instruction was needed. So you may be right that the implementation needs no explicit full memory barrier in theory... On the other hand, I have implemented code that _does_ assume a full barrier for each operation individually (see wait_for_commit::wakeup_subsequent_commits2() in 10.0 for an example). Others could have done the same. Reviewing all InnoDB code to make sure it is not the case, it does not really seem feasible... The burden will be on the one who changes InnoDB locking primitives to prove that such change is safe with respect to the expectations of the code that uses those primitives. This is made all the more challenging by the semantics being somewhat unclear, and the main platform x86 mostly doing "full barrier" by default on atomic operations like test-and-set. - Kristian.