Alex Yurchenko <alexey.yurchenko@codership.com> writes:
On Mon, 29 Mar 2010 00:02:09 +0200, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
The way I understood the above is that global mutex is taken in InnoDB prepare() solely to synchronize binlog and InnoDB commits. Is that so? If
Yes.
it is, then it is precisely the thing we want to achieve, but instead of locking the global mutex in InnoDB prepare() we'll be doing it in redundancy_service->pre_commit() as discussed earlier:
innodb->prepare();
if (redundancy_service->pre_commit() == SUCCESS) // locks commit_order mtx
{
    innodb->commit();
    redundancy_service->post_commit(); // unlocks commit_order mtx
}
...
Yes. This approach will prevent group commit in InnoDB, since here innodb->commit() does its fsync() under a global mutex.
This way the global lock in innodb->prepare() can be naturally removed without any additional provisions. Am I missing something?
Agree that this removes the need for innodb to take its lock in prepare() and release in commit().
On the other hand, if we can reduce the amount of commit ordering operations to the absolute minimum, as you suggest below, it would only benefit performance. I'm just not sure about names. Essentially this means splitting commit() into 2 parts: the one that absolutely must be run under commit_order mutex protection and another that can be run outside of the critical section. I guess in that setup all actual IO can easily go into the 2nd part.
Yes (I did not think long about the names, probably better names can be devised).
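To make the split concrete, here is a minimal C++ sketch of a commit divided into an ordered part and an unordered part. All names in it (Engine, commit_ordered, commit_finish, CommitCoordinator, the ticket counter) are hypothetical and chosen for illustration only; they are not existing MySQL or InnoDB interfaces. The real IO is replaced by a toy engine that just records what it was asked to do.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical engine interface; these are not real MySQL handlerton calls.
struct Engine {
    virtual ~Engine() = default;
    // Part 1: cheap in-memory step that fixes the commit order.
    virtual void commit_ordered(std::uint64_t ticket) = 0;
    // Part 2: the expensive step (where an fsync() would go).
    virtual void commit_finish(std::uint64_t ticket) = 0;
};

// Toy engine that only records the tickets it saw in each part.
struct RecordingEngine : Engine {
    std::vector<std::uint64_t> ordered;
    std::vector<std::uint64_t> finished;
    std::mutex finished_mtx;  // part 2 may be entered concurrently

    void commit_ordered(std::uint64_t ticket) override {
        ordered.push_back(ticket);  // caller holds the commit-order mutex
    }
    void commit_finish(std::uint64_t ticket) override {
        std::lock_guard<std::mutex> guard(finished_mtx);
        finished.push_back(ticket);
    }
};

class CommitCoordinator {
public:
    explicit CommitCoordinator(std::vector<Engine*> engines)
        : engines_(std::move(engines)) {}

    // Commits one transaction and returns its position in the commit order.
    std::uint64_t commit() {
        std::uint64_t ticket;
        {
            // Critical section: only the ordering step is serialized.
            std::lock_guard<std::mutex> guard(commit_order_mtx_);
            ticket = ++last_ticket_;
            for (Engine* e : engines_) e->commit_ordered(ticket);
        }
        // Outside the mutex: slow IO of different transactions can overlap.
        for (Engine* e : engines_) e->commit_finish(ticket);
        return ticket;
    }

private:
    std::mutex commit_order_mtx_;
    std::uint64_t last_ticket_ = 0;
    std::vector<Engine*> engines_;
};
```

With this shape the order in which commit_ordered() is called is fixed under the mutex, while commit_finish() calls from concurrent transactions may complete in any order. That is the property that would leave room for group commit.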
lock(global_commit_order_mutex)
fix_binlog_or_redundancy_service_commit_order()
for (each storage engine)
    engine->fix_commit_order()
unlock(global_commit_order_mutex)
What I'd like to correct here is that ordering is needed at least in redundancy service. You need global trx ID. And I believe storage engines won't be able to do without it either - otherwise we'll need to deal with holes in commit sequence during recovery.
Yes.
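As an illustration of what "holes in commit sequence" means for recovery, the following C++ sketch finds the end of the gap-free prefix of global transaction IDs that an engine has durably committed. The function name last_contiguous and the idea of handing recovery a plain list of IDs are assumptions made for this example; nothing here reflects how InnoDB or any redundancy service actually stores or scans this information.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: given the global trx IDs found committed after a
// crash, return the last ID of the gap-free run that starts right after
// `last_applied`. Anything beyond the returned ID sits behind a hole.
std::uint64_t last_contiguous(std::vector<std::uint64_t> committed,
                              std::uint64_t last_applied) {
    std::sort(committed.begin(), committed.end());
    std::uint64_t expected = last_applied + 1;
    for (std::uint64_t id : committed) {
        if (id < expected) continue;  // already covered, or a duplicate
        if (id != expected) break;    // hole: `expected` is missing
        ++expected;
    }
    return expected - 1;
}
```

For example, if IDs 1, 2, 4 and 5 are found committed, the function returns 2: transaction 3 is missing, so 4 and 5 cannot simply be treated as part of a consistent prefix. If engines commit strictly in global ID order, this situation should not arise in the first place.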
Also, I'd suggest moving the global_commit_order_mutex into what goes by "fix_binlog_or_redundancy_service_commit_order()" in the above pseudocode (the name is misleading: the redundancy service determines the order, it does not have to fix it). Locking it outside may seriously reduce concurrency.
Agree (in fact, though I did not say so explicitly, I thought of the entire pseudocode above as being implemented inside the redundancy service plugin).

 - Kristian.