Re: [Maria-developers] MariaDB allows for slave to connect with non-existent GTID

7 May 2013

Pavel Ivanov <pivanof@google.com> writes:
...
> I'd say if S3 stopped one event earlier then there would have been no
> error at all. If S3 stopped one event later then sure it wouldn't be
> possible to detect the error, but it will be detected in strict mode.

Ah, that is a good point.
...
> But what I'm not feeling comfortable with is if S3 is stopped as it is
> and if it tries to connect to S1 immediately it will cause error. Also
> if there was no failover to S2 and S2 didn't author any new GTIDs then
> it will cause error as well. It looks like difference between error
> and non-error is very vague and fragile.

Right, so in fact, from this it appears that the most consistent
behaviour is to give an error in this case (slave requests 0-2-3, master is
missing this but has 0-2-4). Especially so in strict mode.
...
...
>> Also, when using stuff like --replicate-wild-ignore-table, holes can easily
>> appear, and allowing a slave to connect "in the middle of a hole" seems
>> reasonable.
>
> So what you are saying is when stuff like
> --replicate-wild-ignore-table is used slave will have holes in binlogs
> compared to master. But in that case slaves won't ever have GTID that
> is missing on master. But if we have 2nd slave with different table

Agreed, it is still an indication that something is not configured right if
a slave requests something that is missing on the master.

...
> filtering it will have different holes in binlogs. In this case if we
> failover and make this 2nd slave master then it's quite possible that
> 1st slave will connect to new master with GTID that does not exist
> there. I see how this is kind of valid situation from MariaDB point of
> view, but I don't see how it makes sense to do this in real life.
> So I see your point and I can't argue that this behavior should change
> by default (except that it probably won't make any sense for anybody
> to use such feature), but we would really like this situation to be
> detected and replication to be stopped either in "gtid strict mode" or
> in some other mode that we could turn on.
...
From your arguments above I'm leaning more towards giving an error now.
I think this is what I'll do (further comments welcome though):

1. In GTID strict mode, give an error.

2. In non-strict mode, do as current code.

The main use case for (2) will be to recover from the error in
(1). Temporarily clear GTID strict mode, replicate across the problematic
point, re-enable strict mode.
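
To make the two cases concrete, here is a rough Python sketch of the connect
check as described above. It is only a toy model, not the server code: the
names Gtid, GtidNotInBinlog and events_to_send are made up for illustration,
the binlog is just a list, and the non-strict branch assumes that "do as
current code" means starting from the next higher sequence number in the
domain. Purged binlogs and multiple domains are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gtid:
    """A GTID written as domain-server-sequence, e.g. 0-2-3."""
    domain_id: int
    server_id: int
    seq_no: int

    @classmethod
    def parse(cls, text):
        domain, server, seq = (int(part) for part in text.split("-"))
        return cls(domain, server, seq)


class GtidNotInBinlog(Exception):
    """Hypothetical error: the requested start position is not on the master."""


def events_to_send(requested, master_binlog, strict):
    """Return the GTIDs the master would send to a connecting slave.

    Assumption: in non-strict mode the slave may connect "in the middle of
    a hole" and simply receives everything after the requested sequence
    number in the same domain.
    """
    in_domain = [g for g in master_binlog if g.domain_id == requested.domain_id]
    if strict and requested not in in_domain:
        raise GtidNotInBinlog(
            f"slave requested {requested}, which is missing on the master"
        )
    return [g for g in in_domain if g.seq_no > requested.seq_no]


# The case from this thread: slave requests 0-2-3, master only has 0-2-4.
master = [Gtid.parse(t) for t in ("0-1-1", "0-1-2", "0-2-4", "0-2-5")]
requested = Gtid.parse("0-2-3")

try:
    events_to_send(requested, master, strict=True)    # (1) strict: error
except GtidNotInBinlog as err:
    print("strict mode:", err)
    # Recovery as in (2): drop to non-strict just to get past the bad point.
    resumed = events_to_send(requested, master, strict=False)
    print("non-strict mode resumes from:", resumed[0])
```

In this model the DBA sees the error first and only then chooses to continue,
which is the point of using (2) as the recovery path for (1).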

I think it is important to have a clear overall strategy for handling all
these different error cases. I think it is taking shape. We will have a strict
mode, which will be the recommended mode. And it will generally give an error
as soon as incorrect/dodgy usage is detected. And non-strict mode will
generally try to handle things without error.

And people can use non-strict mode if they think they know not to make
mistakes, or prefer some inconsistencies to having to deal with errors (but
then they should not complain if they shoot themselves in the foot).

And in general, in strict mode, if you get an error, you can handle it by
temporarily switching to non-strict mode to get past the error point. But then
at least you get to know about the potential problem and have a chance to
react to it.

I will put this on the queue to implement.

Thanks for the comments!

 - Kristian.

Kristian Nielsen