So, you keep insisting that it's important to have information about server_id of all servers that ever were a master for the database even with gtid_strict_mode = 1. :( I don't agree with that, because I believe this information can be useful only in circular replication or something similar. But okay, I understand your point...

Pavel

On Sun, Sep 15, 2013 at 5:46 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Pavel Ivanov <pivanof@google.com> writes:
3. Set gtid_binlog_state and gtid_slave_pos to '0-3-10' on S1 and to '0-1-1' on S2. Try to start slave on S2. Now I get error "slave has diverged". What gives? It's not diverged, it's just behind.
Yes, it is diverged.
S1 has binlog state '0-3-10'. This means that the only GTIDs it ever contained were 0-3-1, 0-3-2, 0-3-3, ..., 0-3-10.
But S2 has applied GTID 0-1-1, which never existed on S1, nor can it in the future without violating strict mode.
If S2 had been behind S1, then S1 must have had 0-1-1 in its binlog at some point, so the binlog state would have been '0-1-5,0-3-10' or something like that. With such a binlog state, the error message would have been "slave too old".
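To make the distinction between the two errors concrete, here is a minimal sketch of the reasoning above. It is a simplified model, not the server's actual code: `parse_gtid_list` and `classify_slave_position` are hypothetical helpers, and the sketch assumes the binlog state records, per domain and server_id, the last sequence number that server ever logged. The `gtid_in_binlog` flag is also an assumption, standing in for "the event is still physically present in the master's binlog files".

```python
def parse_gtid_list(text):
    """Parse 'domain-server-seq,domain-server-seq' into {(domain, server_id): seq_no}."""
    state = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server_id, seq_no = (int(part) for part in item.split("-"))
        state[(domain, server_id)] = seq_no
    return state


def classify_slave_position(master_binlog_state, slave_gtid, gtid_in_binlog=False):
    """Classify a slave's requested start GTID against the master's binlog state.

    Hypothetical model of the argument in this message only.
    """
    state = parse_gtid_list(master_binlog_state)
    domain, server_id, seq_no = (int(part) for part in slave_gtid.split("-"))
    last_seen = state.get((domain, server_id))

    if last_seen is None:
        # The master never logged anything from this server_id, so the
        # slave applied a GTID that never existed on the master.
        return "slave has diverged"
    if seq_no > last_seen:
        # Slave claims a position past anything the master knows about;
        # this case is not discussed in the message, so it is left out.
        raise NotImplementedError("case not covered by this sketch")
    if not gtid_in_binlog:
        # The GTID is within the recorded history but is no longer
        # available in the binlog files (purged, restored, ...).
        return "slave too old"
    return "ok"


print(classify_slave_position("0-3-10", "0-1-1"))
print(classify_slave_position("0-1-5,0-3-10", "0-1-1"))
```

With binlog state '0-3-10' the lookup for server_id 1 finds nothing, which is the "diverged" case; with '0-1-5,0-3-10' the same request falls inside the recorded history and becomes "slave too old".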
4. Now execute a couple transactions on S1, its gtid_current_pos is 0-1-12 now. Start slave on S2 (remember -- its gtid_current_pos is 0-1-1). And now I see even more confusing "The binlog on the master is missing the GTID 0-1-1 requested by the slave (even though both a prior and a subsequent sequence number does exist)". I'm sorry, which prior sequence number exists?
OK, that wording is a bit unfortunate.
The point is that there is a hole in the binlog of S1 at the point of GTID 0-1-1: GTIDs 0-1-11 and 0-1-12 exist, but GTIDs 0-1-1, 0-1-2, ..., 0-1-10 do not exist in the binlog of S1 (and they never existed, according to the binlog state).
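The hole described above can be illustrated with a short sketch. This is only a model of the argument, not the server's implementation: `parse_gtid` and `describe_gap` are hypothetical names, and the sketch assumes that "prior" and "subsequent" are judged among events with the same domain and server_id as the requested GTID.

```python
def parse_gtid(text):
    """Split 'domain-server-seq' into a tuple of three integers."""
    domain, server_id, seq_no = (int(part) for part in text.split("-"))
    return domain, server_id, seq_no


def describe_gap(binlog_gtids, requested):
    """Report whether `requested` is present and what surrounds it.

    Only GTIDs with the same (domain, server_id) as `requested` are
    considered; that scoping is an assumption of this sketch.
    """
    domain, server_id, seq_no = parse_gtid(requested)
    seqs = {
        n
        for d, s, n in map(parse_gtid, binlog_gtids)
        if (d, s) == (domain, server_id)
    }
    return {
        "present": seq_no in seqs,
        "has_prior": any(n < seq_no for n in seqs),
        "has_subsequent": any(n > seq_no for n in seqs),
    }


# S1's binlog after the two new transactions, as described in the message.
gap = describe_gap(["0-1-11", "0-1-12"], "0-1-1")
print(gap)
```

For the scenario in this thread the requested GTID is missing and only a subsequent sequence number exists, which is why a message claiming that a prior one also exists is misleading here.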
Let me change the error message text to:
"The binlog on the master is missing the GTID %u-%u-%llu requested by the slave (even though a subsequent sequence number does exist), and GTID strict mode is enabled"
to avoid the confusion in cases like this one, where the hole is at the very start of the sequence numbers.
Again, if the intention was that GTID 0-1-1 did exist in the binlog history but was removed by a purge, a restore, or whatever, the binlog state should have been '0-1-X,0-3-10'. Then the error message would have been 'slave too old'.
- Kristian.