On Tue, 16 Mar 2010 13:20:40 +0100, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Alex Yurchenko <alexey.yurchenko@codership.com> writes:
On Mon, 15 Mar 2010 10:57:41 +0100, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
What I am wondering at the moment is if the concept of global transaction ID should be a part of the new API, or if it is really an implementation detail of the redundancy service.
I'd go about it in the following way. We have an SQL server proper, and it has a state (the database). It is this state of the server that we want to be redundant (replicate, log, whatever). A particular server state is identified by a global transaction ID. From here it follows that the global transaction ID should be the same regardless of the plugin.
It is also quite clear that each plugin will be using its own ID format internally. E.g. binlogger will obviously be using file offsets and Galera will be using 64-bit signed integers. Plugins will then just have to implement their own mapping to the ID defined in the API, which in most cases will be trivial.
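A minimal sketch of such a mapping, to make the idea concrete. Everything here is hypothetical: the API-level ID is assumed to be a plain integer, and `BinlogPosition`, `BinlogIdMap` and `galera_to_global` are illustrative names, not part of any existing server or plugin API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BinlogPosition:
    """Plugin-internal ID (assumed shape): binlog file name plus byte offset."""
    file: str
    offset: int


@dataclass
class BinlogIdMap:
    """Maps API-level global transaction IDs (assumed to be integers)
    to the binlogger's internal positions."""
    positions: dict = field(default_factory=dict)

    def record(self, global_id: int, pos: BinlogPosition) -> None:
        # Remember where the transaction with this global ID was written.
        self.positions[global_id] = pos

    def lookup(self, global_id: int) -> BinlogPosition:
        # Translate a global ID back into a file offset.
        return self.positions[global_id]


def galera_to_global(seqno: int) -> int:
    """For a plugin whose internal ID is already an integer sequence
    number, the mapping is the identity (the trivial case)."""
    return seqno
```

A binlogger-style plugin would call `record()` at commit time and `lookup()` when asked to serve events starting from a given global ID.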
Having a unified global transaction ID is unbelievably good, especially when you have cascading replication, where each cascade can use its own plugin. It is so good that you will never ever have any troubles with it, and no troubles with global transaction ID amounts to nirvana. ;)
Hm.
So in such a cascading replication scenario, the changeset would actually keep its identity in the form of the global transaction ID?
So if on master1, the user does
BEGIN; INSERT INTO t1 VALUES (...); COMMIT;
this might get global transaction ID (UUID_master1, 100)
This might get replicated to a slave1 with multiple masters. The slave1 might then end up with three changesets, the one from master1, another from master2, and a third made by the user directly on slave1:
(UUID_master1, 100) (UUID_master2, 200) (UUID_slave1, 50)
So what if we now want to cascade replicate from slave1 (now as a master) to slave2? Would slave2 then see the same three global transaction IDs?
(UUID_master1, 100) (UUID_master2, 200) (UUID_slave1, 50)
That does not seem to work, does it? Seems to me slave1 would need to assign each changeset a new global transaction id in order for slave2 to know in which order to apply the changesets? In particular, whether to apply (UUID_slave1, 50) before or after (UUID_master1, 100).
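The ordering problem can be sketched in a few lines. This is only an illustration under the assumption that an ID is a (source, counter) pair as in the example above; `SourceId` and `compare` are hypothetical names.

```python
from typing import NamedTuple, Optional


class SourceId(NamedTuple):
    source: str  # e.g. "UUID_master1"
    seqno: int   # counter local to that source


def compare(a: SourceId, b: SourceId) -> Optional[int]:
    """Return -1, 0 or 1 when both IDs come from the same source.

    Return None when the sources differ: the pair format alone
    defines no order between changesets from different servers.
    """
    if a.source != b.source:
        return None
    return (a.seqno > b.seqno) - (a.seqno < b.seqno)
```

With this format, `compare(SourceId("UUID_slave1", 50), SourceId("UUID_master1", 100))` has no answer, which is exactly the information slave2 would be missing.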
So I think I misunderstood you here?
Or did you mean that the _format_ of the global transaction ID should be the same across all plugins, so that in a cascading replication scenario where servers are using different replication plugins, the IDs can be treated uniformly?
- Kristian.
Yes, you have misunderstood me: it is the value of the global transaction ID that stays constant (and the format too, of course) ;)

First of all, your example doesn't work precisely because you have chosen a global trx ID format, (source, id_on_source), that is linearly incomparable.

Second, let's forget for a moment about the global transaction ID format and exact implementation, and just remember that you can build a monotonic gapless sequence out of the IDs. Suppose that (UUID_master1, 100) has ID1, (UUID_master2, 200) has ID2 and (UUID_slave1, 50) has ID3, and that they are ordered ID1 < ID2 < ID3 without gaps. So slave1 has ID1, ID2, ID3. Slave2 will see the same, as will everybody else. Suppose slave2 crashes/reboots after it has applied ID1. Now it can connect to ANY node of the cluster and say "hey, I need events starting at ID2". And every node will know where to start from, because ID2 means the same trx on every node.

All of this concerned a single Replication Set (RS). You're probably envisioning master1 and master2 modifying disjoint (or maybe even the same) sets of data independently, and slave1 aggregating changes from both of them. The masters don't see each other's changes, so they can't mutually order their changesets; only slave1 can. How to go about that?

Well, the trick here is that master1 and master2 in this case are not really members of the same replication cluster. They don't replicate to each other, right? So they have their own individual RS and their own global transaction ID sequences, which are indeed incomparable. slave1 participates in both clusters, but can we say that the db on slave1 is a replica of master1 or master2? Well, it depends. If master1 modifies db1 and master2 modifies db2, then we just have 2 independent master-slave clusters happening to share the same physical hardware as a slave.
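To illustrate the single-RS case described above (slave2 rejoining after applying only ID1), here is a toy sketch. It assumes the global ID is simply the 1-based position in a gapless log; `Node`, `events_from` and `catch_up` are hypothetical names and not a proposal for the actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # log[i] holds the changeset whose global ID is i + 1 (gapless, monotonic).
    log: list = field(default_factory=list)

    def apply(self, changeset: str) -> int:
        """Apply a changeset and return the global ID it received."""
        self.log.append(changeset)
        return len(self.log)

    def last_applied(self) -> int:
        return len(self.log)

    def events_from(self, start_id: int) -> list:
        """Return (id, changeset) pairs starting at start_id."""
        tail = self.log[start_id - 1:]
        return [(start_id + i, cs) for i, cs in enumerate(tail)]


def catch_up(node: Node, donor: Node) -> None:
    """Ask any donor in the same RS for everything after our last ID."""
    for _gtid, changeset in donor.events_from(node.last_applied() + 1):
        node.apply(changeset)
```

Because every node in the RS assigns the same ID to the same changeset, `catch_up(slave2, donor)` gives the same result whichever node is picked as `donor`.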
If master1 and master2 modify the same db independently, then, strictly speaking, we don't have a case of db replication here, and slave1 will order the changesets and assign its own ID sequence to them.

To summarize, there can be various esoteric setups, and the RS concept is the key to understanding the scope of the global transaction ID there.

Regards,
Alex

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011