Alex Yurchenko <alexey.yurchenko@codership.com> writes:
> Yes, the idea of this model is that the main purpose of redundancy is durability, which, depending on the plugin, can be of a much higher degree than a flush to disk (e.g. binlog to a remote machine with point-in-time recovery ability).
Yes.
> There is a subtle moment here however: an asynchronous redundancy plugin won't give you that durability. So, again, there are

Well, an asynchronous replication with a local binlog, like current MySQL replication, could provide such durability, by replaying from the local binlog.

> several ways to go about that: some applications may be willing to trade some durability (the few last transactions) for speed. Or you can still have an option to ensure durability at the engine level. Maybe something else.
Agreed. There will be flexibility and various configuration options.
> This is an interesting option indeed. However: 1) the mapping itself should be durable, so every plugin must provide a way to recover it in the case of a crash. 2) the global transaction ID had better be an integral part of the server state, otherwise it will complicate state snapshot transfer. E.g. it will be insufficient to just use mysqldump or copy the db files; you'll need to carry the global transaction ID along.
> The above issues are of course resolvable one way or another. It is just not obvious which is easier: fixing the engine or implementing a workaround (if you consider it in its entirety).
Well, we want to fix all engines we can, and leave an option for backwards compatibility with engines we cannot. The problem is not so much whether it is easy to fix the engine; the question is whether it is possible at all. We have to pre_commit the transaction in the engine before we know the global transaction ID. And once the transaction is pre-committed, we cannot add anything to it, so we cannot add the global transaction ID. Or is there a way around this?
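
To make the ordering concrete, here is a minimal compilable sketch of the workable order (hypothetical types and names, not the actual server or plugin API): the redundancy service hands out the global transaction ID under commit_order_lock before the engine pre-commits, so the ID can still be made part of the transaction:

    #include <cstdint>
    #include <cstdio>
    #include <mutex>

    // Hypothetical stand-ins for the real server/plugin types.
    struct RedundancyService {
      std::mutex commit_order_lock;   // serializes the commit order
      uint64_t next_global_trx_id = 1;

      // Replicate the changeset and assign the global transaction ID;
      // the lock stays held until post_commit() so engine commits
      // happen in the same order as the IDs were handed out.
      uint64_t pre_commit() {
        commit_order_lock.lock();
        return next_global_trx_id++;
      }
      void post_commit() { commit_order_lock.unlock(); }
    };

    struct Engine {
      // The engine can persist the ID only if it learns it *before*
      // its own pre_commit; afterwards nothing can be added.
      void pre_commit(uint64_t global_trx_id) {
        std::printf("engine pre_commit, global_trx_id=%llu\n",
                    (unsigned long long)global_trx_id);
      }
      void commit() { std::printf("engine commit\n"); }
    };

    int main() {
      RedundancyService rs;
      Engine engine;

      uint64_t id = rs.pre_commit();  // ID exists from here on
      engine.pre_commit(id);          // ID recorded inside the trx
      engine.commit();
      rs.post_commit();               // release the order lock
      return 0;
    }

Reversing the first two calls is exactly the problem case: engine.pre_commit() would run with no ID available, and there would be no later step at which it could be injected.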
> Ok, I might have been too rash here. The (semi)synchronous guarantee is that you don't send OK to the client until the change has been replicated to (at least some) other nodes. So technically you can do replication after the local commit. Still, there are 3 pros to calling the redundancy service before commit, as I mentioned before: 1) it is required for consistent ordering in the multi-master case
Right (I think with multi-master you mean synchronous replication like Galera. If so, this has probably been a source of slight confusion for me, as I think of multi-master replication as MySQL-style asynchronous replication with dual masters). A rough sketch of the semi-sync timing follows after this list.
> 2) it gives you durability with synchronous plugins
Yes.
> 3) if you go with asynchronous replication, why not start doing it really asynchronously, that is, ASAP?
Yes, my thought as well ;-)
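
As a rough illustration of the (semi)synchronous guarantee described above (a sketch with hypothetical names, not the actual semi-sync plugin code): the local commit can complete first, but the OK to the client is held back until a replica acknowledges the change:

    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <thread>

    // Hypothetical acknowledgement channel from one replica.
    struct AckChannel {
      std::mutex m;
      std::condition_variable cv;
      bool acked = false;

      void send_ack() {
        { std::lock_guard<std::mutex> g(m); acked = true; }
        cv.notify_one();
      }
      void wait_ack() {
        std::unique_lock<std::mutex> g(m);
        cv.wait(g, [this] { return acked; });
      }
    };

    int main() {
      AckChannel ack;

      // Replica thread: receives the change and acknowledges it.
      std::thread replica([&ack] {
        std::printf("replica: change applied, sending ack\n");
        ack.send_ack();
      });

      std::printf("master: local commit done\n");  // commit first...
      ack.wait_ack();                              // ...then block
      std::printf("master: OK sent to client\n");  // only now reply

      replica.join();
      return 0;
    }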
> Note that if you call the redundancy service last, you still have to hold commit_order_lock for the duration of the call, so you don't win anything this way.
Yes. I think it makes sense for the primary redundancy service, the one that provides durability and defines the global transaction ID, to be invoked first.
>> By the way, is it even necessary to have redundancy_service->post_commit()? It seems to me that it probably is not needed?
> During local transaction execution some resources will inevitably be allocated for the transaction by the redundancy service (row modifications, etc.), there might be some buffers shared between the engine and the redundancy service (opaque to the redundancy service, indeed), and finally, in the above code, redundancy_service->pre_commit() locks commit_order_lock. The purpose of redundancy_service->post_commit() is to release those resources and the lock.
Yes. Makes sense.
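
For concreteness, a minimal sketch of that lifecycle (hypothetical names again): pre_commit() acquires the order lock and the per-transaction resources, and post_commit() is the one place where both get released, with the engine commit happening in between:

    #include <cstdio>
    #include <mutex>
    #include <vector>

    // Hypothetical per-transaction state held by the redundancy service.
    struct TrxContext {
      std::vector<char> row_buffer;   // row modifications, shared buffers
    };

    struct RedundancyService {
      std::mutex commit_order_lock;

      TrxContext *pre_commit() {
        commit_order_lock.lock();     // taken in pre_commit()...
        return new TrxContext();      // ...together with trx resources
      }
      void post_commit(TrxContext *ctx) {
        delete ctx;                   // free per-transaction resources
        commit_order_lock.unlock();   // ...and release the lock here
      }
    };

    int main() {
      RedundancyService rs;
      TrxContext *ctx = rs.pre_commit();
      std::printf("engine commit happens here, lock still held\n");
      rs.post_commit(ctx);  // without this, the resources and
                            // commit_order_lock would simply leak
      return 0;
    }

- Kristian.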