Pavel Ivanov <pivanof@google.com> writes:
I was thinking about your words above and they make me wonder: if we have a master and it doesn't write its current state into rpl_slave_state, then we shut down the master, and while it's down we fail over to another server. Then the first server comes back and should start to replicate from the new master. But how will it know which GTID to start from if it doesn't have any state in rpl_slave_state?
It uses what it has in the binlog (I call this the "binlog state"). The value of @@GLOBAL.gtid_pos is obtained as follows:

 - First take the row with the highest sub_id from mysql.rpl_slave_state, per domain_id.
 - Then look in the binlog state. If we have something there with higher seq_no, then it overrides what was in rpl_slave_state.

So basically, if the last transaction originated on this server, it uses the binlog state; if the last transaction was replicated, it uses mysql.rpl_slave_state.
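To make the two steps above concrete, here is a minimal Python sketch of the merge. It is only an illustration of the rule as described, not the server code: the type names (SlaveStateRow, Gtid), the function names, and the idea of passing both states in as plain lists are all assumptions made for the example. The only thing taken from MariaDB is the textual GTID form domain_id-server_id-seq_no.

```python
from typing import Dict, Iterable, NamedTuple


class SlaveStateRow(NamedTuple):
    """Hypothetical model of one row in mysql.rpl_slave_state."""
    domain_id: int
    sub_id: int
    server_id: int
    seq_no: int


class Gtid(NamedTuple):
    """Hypothetical model of one entry in the binlog state."""
    domain_id: int
    server_id: int
    seq_no: int


def compute_gtid_pos(slave_rows: Iterable[SlaveStateRow],
                     binlog_state: Iterable[Gtid]) -> Dict[int, Gtid]:
    # Step 1: per domain_id, keep the row with the highest sub_id.
    latest: Dict[int, SlaveStateRow] = {}
    for row in slave_rows:
        cur = latest.get(row.domain_id)
        if cur is None or row.sub_id > cur.sub_id:
            latest[row.domain_id] = row
    pos = {d: Gtid(r.domain_id, r.server_id, r.seq_no)
           for d, r in latest.items()}

    # Step 2: a binlog-state entry with a higher seq_no overrides
    # what came from rpl_slave_state for that domain.
    for gtid in binlog_state:
        cur_gtid = pos.get(gtid.domain_id)
        if cur_gtid is None or gtid.seq_no > cur_gtid.seq_no:
            pos[gtid.domain_id] = gtid
    return pos


def format_gtid_pos(pos: Dict[int, Gtid]) -> str:
    # Sorted by domain_id here purely for readable output.
    return ",".join(f"{g.domain_id}-{g.server_id}-{g.seq_no}"
                    for g in sorted(pos.values()))
```

With rows for domain 0 at seq_no 11 (replicated from server 2) and a binlog-state entry 0-1-12 (written locally by server 1), the result for domain 0 is 0-1-12, i.e. the binlog state wins because the last transaction originated on this server.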
I can guess that with a graceful shutdown there's also a file <log-bin>.state which seems to contain the last state of the binlog. BTW, can you tell more about this file, and how it participates in the replication? Will it be used to restore gtid_pos after shutdown?
<log-bin>.state contains the current binlog state at the time of shutdown. This is used to be able to output the correct Gtid_list event at the start of the new binlog created at server startup, and also to initialise the sequence number counter for new events.
But anyway, that file seems to be written only at shutdown, so if mysqld crashes, both rpl_slave_state and <log-bin>.state won't exist. Will MariaDB restore its gtid_pos by scanning binlogs?
Yes. This uses the existing crash recovery framework. If the binlog was not closed cleanly ("we crashed"), the old binlog is scanned, and any non-committed InnoDB transactions are either committed or rolled back as appropriate. And now with GTID, the binlog state is also restored during the scan.
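As a rough illustration of restoring the binlog state during such a scan, here is a small Python sketch. It assumes, purely for the example, that the scan yields GTIDs as (domain_id, server_id, seq_no) tuples in log order and that remembering the last GTID seen per domain is enough; the function names are hypothetical and the real recovery code may track more than this.

```python
from typing import Dict, Iterable, Tuple

GtidTuple = Tuple[int, int, int]  # (domain_id, server_id, seq_no)


def rebuild_binlog_state(gtids_in_log_order: Iterable[GtidTuple]) -> Dict[int, GtidTuple]:
    # Walk the binlog front to back; a later GTID in a domain
    # replaces the earlier one, so the last one seen is kept.
    state: Dict[int, GtidTuple] = {}
    for domain_id, server_id, seq_no in gtids_in_log_order:
        state[domain_id] = (domain_id, server_id, seq_no)
    return state


def next_seq_no(state: Dict[int, GtidTuple], domain_id: int) -> int:
    # Assumed rule for seeding the sequence number counter for new
    # events: one past the last seq_no seen, or 1 for a new domain.
    last = state.get(domain_id)
    return 1 if last is None else last[2] + 1
```

The same recovered state would then serve the two purposes mentioned for <log-bin>.state: describing what the new binlog starts from, and seeding the sequence number counter.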
Another question I have: I see with multiple domains gtid_pos is not sorted by domain id. I would guess it's sorted by the order of last
Yeah, I just did not think of it; it is probably in whatever order the internal hash table spits out stuff. Sorting it is a good idea, I'll add it to my ToDo.

 - Kristian.