Pavel Ivanov <pivanof@google.com> writes:
I've realized that the way slaves are currently processed on the master allows them to connect even if they request a non-existent GTID.
What happens here is that S3 requests GTID 0-2-3 from S1. S1 has in its binlog:

    0-1-1
    0-1-2
    0-1-3
    0-2-4

So there is a "hole" in the binlog of S1: a transaction has gone missing. However, the code allows S3 to start replicating with 0-2-4 as the first event, because we can be sure that this is the first event we _do_ have that follows the requested 0-2-3.

Now, if S1 had only had 0-1-1, 0-1-2 and 0-1-3 in its binlog, then S3 would not be allowed to connect. This is mainly to protect against the case where no further 0-2-* events ever appear, which would cause S3 to skip events forever while waiting for such an event.
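As a rough illustration only (this is not the actual server code), the rule described above could be modelled as follows. The names Gtid, parse_gtid and find_start_index are hypothetical, and the sketch assumes that the requested GTID is matched only against binlog events with the same domain id and server id, as in the S1/S3 example.

```python
from typing import List, Optional, Tuple

# A GTID written "D-S-N" is modelled as (domain_id, server_id, seq_no).
Gtid = Tuple[int, int, int]


def parse_gtid(text: str) -> Gtid:
    """Parse a GTID string such as "0-2-3" into a tuple of integers."""
    domain, server, seq = (int(part) for part in text.split("-"))
    return (domain, server, seq)


def find_start_index(binlog: List[Gtid], requested: Gtid) -> Optional[int]:
    """Return the index of the first binlog event to send to the slave,
    or None if the connection should be refused (simplified model)."""
    domain, server, seq = requested
    for i, (d, s, n) in enumerate(binlog):
        if d != domain or s != server:
            continue  # event from another domain/server: not a match
        if n == seq:
            return i + 1  # exact match: resume right after it
        if n > seq:
            return i  # hole: first event we do have after the request
    # No matching or later event from that server: refuse, since the
    # slave might otherwise skip events forever waiting for one.
    return None


binlog = [parse_gtid(g) for g in ["0-1-1", "0-1-2", "0-1-3", "0-2-4"]]
start = find_start_index(binlog, parse_gtid("0-2-3"))
refused = find_start_index(binlog[:3], parse_gtid("0-2-3"))
```

In this model, the request for 0-2-3 is answered starting from 0-2-4, while the same request against a binlog containing only the 0-1-* events is refused.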
Is this "works as intended" behaviour that will be different in "strict mode", or did you not want such things to happen even in non-strict mode?
I am not sure. But my immediate impression is that this is the most consistent behaviour.

In MariaDB GTID, we keep track of only the last applied GTID (within each domain), and rely on the binlog sequence being identical between different servers. In this particular example we could detect that this was violated, but that is somewhat accidental. If S3 had been stopped one event earlier or later, we would not have been able to detect the error. So catching this error case does not really seem to buy much in general.

Also, when using options like --replicate-wild-ignore-table, holes can easily appear, and allowing a slave to connect "in the middle of a hole" seems reasonable.

But I am open to arguments for the opposite.

 - Kristian.