[Maria-developers] GTIDs, events sequence and alternate futures
Kristian,
I can't find the answer to my questions in the code, and I don't know whether there is any documentation I can use (is there such a thing?), so I'm writing my questions here.
As I understand it, a GTID is constructed as <domain id>-<server id>-<sequence number>. Is it guaranteed that sequence numbers are always increasing inside the domain id? So let's say two servers have GTID 0-1-100; if they then have alternate futures, one server will have GTIDs 0-1-101, 0-1-102 and so on, while the second server will have 0-2-101, 0-2-102 etc. Is that correct?
BTW, where exactly in the code is the sequence number increased?
Thank you,
Pavel
Pavel Ivanov <pivanof@google.com> writes:
As I understand GTID is constructed as <domain id>-<server id>-<sequence number>. Is it guaranteed that sequence numbers are always increasing inside the domain id?
The intention is that the user/DBA should configure things so that there is only ever one active master for each domain_id. If this is done correctly (e.g. all slaves are made read-only, no fiddling with manual queries on slaves), then sequence numbers will be unique globally per domain_id, and server_id is actually unnecessary.
This is because if some server is first a slave, and then is promoted to master, it will continue the sequence numbering from the point of the last replicated GTID.
However, it is easy for novice users/DBAs to get this wrong. Maybe they just fiddle a little on a slave with some manual queries or something. One can usually get away with this with old-style replication, so I tried to make it work also for GTID, even though the recommended operation is one-master-per-domain. If a user transaction is done on a slave in parallel with an active master, then there is no way to avoid duplicate sequence numbers. The server_id then ensures that we can still distinguish the GTIDs of the different events.
Basically, the code tries to make sequence numbers unique per domain, but also tries to avoid relying on this uniqueness for correct operation.
(I have thought of making a strict mode for GTID, which would make the slave give an error if it detects a non-unique sequence number. This would allow an experienced DBA to enforce one-master-per-domain and catch misconfigurations (by setting strict mode), and also allow novice users to not concern themselves with this (by not setting strict mode).)
So let's say two servers have GTID 0-1-100 and if they have alternate futures then one server will have GTIDs 0-1-101, 0-1-102 and so on, while second server will have 0-2-101, 0-2-102 etc. Is that correct?
Yes (if I understand you correctly). Each server will continue the sequence numbering from whichever GTID was last seen.
BTW, where in the code is sequence number increased exactly?
In MYSQL_BIN_LOG::write_gtid_event().
- Kristian.
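The numbering behaviour described in this reply can be illustrated with a small simulation. This is only a sketch and not MariaDB code: the `Gtid` tuple and the `Server` class are hypothetical names, and the model assumes a single domain in which each server simply continues from the sequence number of the last GTID it has seen.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int

    @classmethod
    def parse(cls, text: str) -> "Gtid":
        # "0-1-100" -> Gtid(domain_id=0, server_id=1, seq_no=100)
        domain_id, server_id, seq_no = (int(part) for part in text.split("-"))
        return cls(domain_id, server_id, seq_no)

    def __str__(self) -> str:
        return f"{self.domain_id}-{self.server_id}-{self.seq_no}"


class Server:
    """Toy model of one server's view of a single replication domain."""

    def __init__(self, server_id: int, domain_id: int = 0) -> None:
        self.server_id = server_id
        self.domain_id = domain_id
        self.last_seq_no = 0

    def replicate(self, gtid: Gtid) -> None:
        # Applying a replicated event moves the counter to that event's seq_no.
        self.last_seq_no = gtid.seq_no

    def commit_local(self) -> Gtid:
        # A local transaction continues from the last GTID seen,
        # but is stamped with this server's own server_id.
        self.last_seq_no += 1
        return Gtid(self.domain_id, self.server_id, self.last_seq_no)


# Both servers have seen 0-1-100, then each accepts writes independently.
common = Gtid.parse("0-1-100")
old_master, promoted_slave = Server(1), Server(2)
old_master.replicate(common)
promoted_slave.replicate(common)

future_a = [str(old_master.commit_local()) for _ in range(2)]
future_b = [str(promoted_slave.commit_local()) for _ in range(2)]
print(future_a)
print(future_b)
```

The two lists share sequence numbers 101 and 102 but differ in server id, which is the alternate-futures situation from the question: the sequence number alone is no longer unique, and only the server id tells the events apart.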
Thank you. I need this info to make sure that our tools work properly. They need to determine which slave is further along in the replication stream than the others (which apparently can be achieved by a simple comparison of sequence numbers) and to detect whether slaves have alternate futures. The latter can happen not only with novice DBAs and bad setups, but also, e.g., when an experienced DBA with the SUPER privilege makes some changes on the master and a failover happens at the same time...
BTW, I have related questions. What will happen with MariaDB if the semi-sync master is enabled and some transaction is completed, but all slaves get disconnected before it receives any semi-sync acks? Will this transaction be rolled back? Or will it be left committed, basically creating an alternate future?
When all flushing to disk is turned off (and maybe with some other conditions I don't know about), is it possible to get into a situation where the binlogs and the InnoDB state are out of sync? And I don't mean after a power outage (without flushing, that can obviously lead to any kind of bad result), but after a mysqld crash. And was the answer to this question different for earlier MariaDB/MySQL versions?
On Wed, May 1, 2013 at 12:27 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
[...]
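A tool of the kind Pavel describes might compare slave positions roughly as in the sketch below. It assumes that each slave's position and recent GTID history have already been fetched as GTID strings for one domain; the function names `furthest_slave` and `find_divergence` are hypothetical. Comparing sequence numbers alone only orders slaves that share the same history, so the divergence check looks for the same sequence number carrying different server ids.

```python
def parse_gtid(text):
    """Split 'domain-server-seq' into a tuple of three ints."""
    domain_id, server_id, seq_no = (int(part) for part in text.split("-"))
    return domain_id, server_id, seq_no


def furthest_slave(positions):
    """Return the name of the slave with the highest sequence number.

    `positions` maps slave name -> current GTID string in one domain.
    The answer is only meaningful if the slaves have not diverged.
    """
    return max(positions, key=lambda name: parse_gtid(positions[name])[2])


def find_divergence(history_a, history_b):
    """Return the first sequence number at which two GTID histories disagree.

    A disagreement means the same (domain, seq_no) was produced by different
    server ids, which is the signature of an alternate future. Returns None
    if the histories agree wherever they overlap.
    """
    origin_a = {(d, n): s for d, s, n in map(parse_gtid, history_a)}
    for d, s, n in sorted(map(parse_gtid, history_b), key=lambda g: (g[0], g[2])):
        # Sequence numbers missing from history_a are treated as agreeing.
        if origin_a.get((d, n), s) != s:
            return n
    return None


positions = {"slave1": "0-1-100", "slave2": "0-1-102"}
ahead = furthest_slave(positions)

slave1_history = ["0-1-100", "0-1-101", "0-1-102"]
slave2_history = ["0-1-100", "0-2-101", "0-2-102"]
diverged_at = find_divergence(slave1_history, slave2_history)
print(ahead, diverged_at)
```

Note the limitation: if one server has written extra events that the other has no counterpart for, the histories do not overlap there and this check reports no divergence.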
Pavel Ivanov <pivanof@google.com> writes:
What will happen with MariaDB if semi-sync master is enabled, some transaction is completed, but before it gets any semi-sync acks all slaves get disconnected? Will this transaction be rolled back? Or will it be left committed, basically creating an alternate future?
It can not be rolled back. Before slaves can send semi-sync acks, the transaction must first be written to the binlog. Once a transaction has been written to the binlog, it can not be rolled back.
When all flushing to disk is turned off (and maybe with some other conditions I don't know) is it possible to get into a situation where binlogs and InnoDB state get out of sync? And I don't mean after a power outage (without flushing it obviously can lead to any kind of bad results), but after a mysqld crash.
With sync_binlog=0 and innodb_flush_log_at_trx_commit=2, InnoDB and the binlog should still be crash-safe and in sync, because the data is written to the kernel with the write(2) system call and so will survive a mysqld crash. If the kernel crashes (or there is a power outage), fsync() is required, as you say.
I think innodb_flush_log_at_trx_commit=0 is not crash-safe.
And was the answer to this question different for earlier MariaDB/MySQL versions?
I do not know of any difference from earlier versions, except for the odd bug fix. And of course really old versions (pre-5.0?) have no XA between InnoDB and binlog.
- Kristian.
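The distinction between a mysqld crash and a kernel crash can be seen in a small experiment that has nothing MariaDB-specific in it. In this sketch a child process hands data to the kernel with `os.write` (a thin wrapper over write(2)) and then exits abruptly without calling fsync() or closing the file; the parent can still read the data, because it has already reached the kernel. The file name and payload are arbitrary choices for the illustration, and the experiment says nothing about power loss or kernel crashes, where fsync() would be needed.

```python
import os
import subprocess
import sys
import tempfile

# Code run in the child process: write, then die without fsync or close.
CHILD = r"""
import os, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
os.write(fd, b"commit 0-1-101\n")   # data is handed to the kernel here
os._exit(1)                          # simulated crash: no fsync, no cleanup
"""


def write_then_crash(path):
    """Run the child, let it 'crash', and return what the file holds afterwards."""
    result = subprocess.run([sys.executable, "-c", CHILD, path])
    with open(path, "rb") as f:
        return result.returncode, f.read()


with tempfile.TemporaryDirectory() as tmp:
    returncode, contents = write_then_crash(os.path.join(tmp, "fake-binlog"))

print(returncode, contents)
```

The child's exit status is non-zero, yet the record it wrote is still readable from the file afterwards.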
Thank you, Kristian.
On Wed, May 1, 2013 at 8:19 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Pavel Ivanov <pivanof@google.com> writes:
What will happen with MariaDB if semi-sync master is enabled, some transaction is completed, but before it gets any semi-sync acks all slaves get disconnected? Will this transaction be rolled back? Or will it be left committed, basically creating an alternate future?
It can not be rolled back. Before slaves can send semi-sync acks, the transaction must first be written to the binlog. Once a transaction has been written to the binlog, it can not be rolled back.
Here you go, one more way to get an alternate future, even for experienced DBAs. ;-) Although it's not an actual problem, because the client doesn't get an acknowledgment of the transaction in such a case. But it's still something that needs additional scripting or even manual intervention to clean up.
Pavel