Jan Lindström <jplindst@mariadb.org> writes:
I found this disturbing and do not fully follow what kind of holes are possible.
For example, if you use some of the many filtering options, like --replicate-ignore-*, then there will be GTIDs on the master that are not in the binlog on the slave. This slave can itself be a master, and it will have "holes" relative to the original master (but not relative to the sub-cluster it is the master of). So note that one person's missing transaction is another person's deliberate filtering.
These GTIDs can be used by human users to start slaves at a particular position. How do you know that there is really a hole in the GTID numbers, rather than that you started the slave from an incorrect position?
You can always use the contents of the binlogs to determine this. You can search the binlogs for your GTID and determine whether it was a) logged in an earlier binlog that has since been purged, b) found in the binlog, c) a "hole" due to filtering or similar, or d) not yet present in the binlog at all (not yet received from the master, or part of a completely alternate future).
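To make the four cases concrete, here is a minimal sketch in Python. It is only a toy model, not how the server does it: the binlog for one replication domain is assumed to be a plain list of sequence numbers, and `purged_through` (the highest sequence number known to have been purged) is a hypothetical input that a real tool would have to derive from the binlog files themselves. The names `Gtid`, `parse_gtid` and `classify` are made up for illustration.

```python
from typing import Iterable, NamedTuple


class Gtid(NamedTuple):
    """A MariaDB-style GTID of the form domain-server-sequence."""
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a string such as '0-1-5' into its three numeric parts."""
    domain, server, seq = (int(part) for part in text.strip().split("-"))
    return Gtid(domain, server, seq)


def classify(gtid: Gtid, logged: Iterable[int], purged_through: int = 0) -> str:
    """Classify a GTID against a simplified model of one domain's binlog.

    logged         -- sequence numbers still present in the binlogs (assumed)
    purged_through -- highest sequence number known to be purged (assumed)
    """
    seqs = set(logged)
    if gtid.seq_no in seqs:
        return "found"            # case b
    if gtid.seq_no <= purged_through:
        return "purged"           # case a
    if not seqs or gtid.seq_no > max(seqs):
        return "not_yet_logged"   # case d
    return "hole"                 # case c
```

Note that the sequence number alone cannot tell "not yet received" apart from "alternate future" in case d; separating those would need more information than this sketch models.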
If you set the starting point to a real hole, what happens? Is replication started from the next real GTID or from the beginning?
Usually I would consider it an error for a slave to try to start from a hole. In GTID strict mode, we give an error. But in non-strict mode, replication is allowed to start from the next GTID, so that we remain compatible with all existing usage of replication. I have very carefully tried to make sure that we can still do everything correctly, the same as if we enforced monotonic sequence numbers with no holes.
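The difference between the two modes can be sketched like this. Again a toy model in Python under stated assumptions, not the server code: the master's binlog for one domain is a list of sequence numbers, the slave asks to resume after `requested_seq`, and the function returns the first sequence number the master would send. `GtidHoleError` and `first_seq_to_send` are hypothetical names.

```python
import bisect
from typing import Iterable, Optional


class GtidHoleError(Exception):
    """Raised when a slave asks to start from a hole in strict mode."""


def first_seq_to_send(requested_seq: int, logged: Iterable[int],
                      strict: bool) -> Optional[int]:
    """Return the first sequence number to send after requested_seq.

    Only positions inside the modeled binlog range are handled; purged or
    not-yet-logged positions are outside the scope of this sketch.
    Returns None when the slave is already at the end of the binlog.
    """
    seqs = sorted(set(logged))
    if not seqs or not (seqs[0] <= requested_seq <= seqs[-1]):
        raise ValueError("position is outside the modeled binlog range")

    # Index of the first logged sequence number greater than the request.
    i = bisect.bisect_right(seqs, requested_seq)
    is_hole = seqs[i - 1] != requested_seq

    if is_hole and strict:
        raise GtidHoleError(f"sequence number {requested_seq} is a hole")

    # Non-strict mode (or an exact match): continue from the next GTID.
    return seqs[i] if i < len(seqs) else None
```

With a binlog of [3, 4, 6, 7], a request to start from 5 raises an error in the strict case and resumes at 6 in the non-strict case, while a request to start from 4 resumes at 6 in both.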
Users could use GTID as a way to verify the slave state consistency using their own software. If the actual implementation does not allow this, it makes creating cluster replication state monitoring software very hard to implement.
What exactly is it that the implementation does not allow, or which is hard to implement?
I also found alternate futures disturbing. In the database field, that would mean that one server is in one state and another is in a different state, which leads to a situation where you do not know which one is consistent, or whether both are inconsistent.
Yeah, welcome to the scary world of MySQL replication. Some people do crazy stuff with it and certainly have "alternate futures" as daily normal operation. MariaDB GTID needs to support both that kind of anarchy, and also disciplined setups where alternate futures are considered a severe error to be avoided. This is one of the things that makes the problem so hard. Any constructive help in reaching this goal is appreciated.

 - Kristian.