Robert Hodges <robert.hodges@continuent.com> writes:
Right. I thought about that problem a lot in the Tungsten parallel apply design and ended up with an approach that allows workers to diverge by several minutes or longer. This enables Tungsten to maintain good throughput even in the face of lumpy workloads that contain transactions
So are replication domains "shards"? My definition of a shard in this context is a causally independent stream of transactions, which is effectively a partial order within the fully serialized log. That's an excellent feature. Assuming that's what you have done, how do you handle operations like CREATE USER that are global in effect?
Yes, it sounds like replication domains are basically the same as shards. In MariaDB, my approach is that we will try to do some amount of parallelisation automatically and completely transparently to all applications (this is the in-order parallel replication). If that is not sufficient, the user can additionally help by splitting their load into replication domains, e.g. by putting the "lumps" in a separate domain, which allows other transactions to execute ahead of them.

When splitting into separate domains, the burden falls on the user/application to ensure that the different domains can replicate independently. So for something like CREATE USER or CREATE TABLE, it will be necessary to ensure manually that all slaves have replicated the statement with global effect before running dependent transactions in a separate domain on the master. One way to ensure this is to run MASTER_GTID_WAIT() on all slaves with the @@LAST_GTID of the statement from the master.
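To make that last step concrete, here is a rough sketch in Python of the check that such a wait amounts to. It is a simplified model for illustration only, not MariaDB's implementation: the helper names (parse_gtid, parse_gtid_list, has_reached) and the example GTID values are made up, and it assumes the domain-server-sequence GTID format, comparing sequence numbers within each domain. The SQL in the comments shows where @@last_gtid and MASTER_GTID_WAIT() would be used.

```python
from typing import Dict, NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a GTID written as 'domain-server-sequence', e.g. '0-1-100'."""
    domain, server, seq = (int(part) for part in text.strip().split("-"))
    return Gtid(domain, server, seq)


def parse_gtid_list(text: str) -> Dict[int, Gtid]:
    """Parse a comma-separated GTID list into {domain_id: Gtid}."""
    position = {}
    for item in text.split(","):
        if item.strip():
            gtid = parse_gtid(item)
            position[gtid.domain_id] = gtid
    return position


def has_reached(slave_pos: str, target: str) -> bool:
    """Simplified model: has the slave applied every GTID in `target`?

    Only the sequence number within each domain is compared; other
    domains in the slave position are ignored.
    """
    current = parse_gtid_list(slave_pos)
    for domain_id, wanted in parse_gtid_list(target).items():
        got = current.get(domain_id)
        if got is None or got.seq_no < wanted.seq_no:
            return False
    return True


# On the master, after the statement with global effect:
#   CREATE USER ...;
#   SELECT @@last_gtid;                        -- say it returns '0-1-100'
# On each slave, before running dependent work in another domain:
#   SELECT MASTER_GTID_WAIT('0-1-100', 10);    -- 0 on success, -1 on timeout
global_stmt_gtid = "0-1-100"
print(has_reached("0-1-99,1-1-57", global_stmt_gtid))   # slave still behind in domain 0
print(has_reached("0-1-100,1-1-57", global_stmt_gtid))  # safe to proceed
```

The point of the model is that the wait only concerns the domain of the global statement; how far the slave has advanced in other domains does not matter.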
(Just point me to docs or your blog if you wrote it up. I would love to learn more.)
Docs are here:

https://mariadb.com/kb/en/mariadb/documentation/replication-cluster-multi-ma...
https://mariadb.com/kb/en/mariadb/documentation/replication-cluster-multi-ma...

I wrote some stuff on my blog:

http://kristiannielsen.livejournal.com/18435.html
http://kristiannielsen.livejournal.com/16826.html
http://kristiannielsen.livejournal.com/17008.html
http://kristiannielsen.livejournal.com/17238.html
http://kristiannielsen.livejournal.com/18308.html

I notice that I wrote mostly about global transaction ID and less about parallel replication. Well, the two are strongly interdependent, and e.g. the replication domains are well explained in my writings on GTID, I hope. Some features of parallel replication can be used even without GTID, though.

- Kristian.