Kristian, salute.

Let me jump at once to the high-level specification; afterwards I will remark on or delve into specific parts of the text. Your last reply made it explicit that you mean a totally strict setup on the master (p.(2) of the following list):

KN> (1). We want the master to "forget about the past" with respect to a given domain. This is easy. All that is needed is to rotate the binlog and omit the domain from the GTID_LIST event at the start of the new binlog. Because when the master searches back for a given GTID in the binlog, it stops when it sees a GTID_LIST event without that domain.

KN> (2). We want to prevent a user accidentally putting the server into an inconsistent state with an incorrect DELETE DOMAIN command. This is ensured by the requirement that all existing binlog files are free of that domain. Should a slave later, incorrectly, try to access that domain, it will receive the wrong error (that it is diverged rather than that the necessary binlog file has been purged), but at least it _will_ get an error as it should, not silently corrupt replication.

In contrast, I thought of a "liberal" setup provisioned by "the user must know what he is doing". And I did so seeing no other way to help out the MDEV-12012 use case. Indeed, when the undesired domain events reside in the very last binlog file and the history behind the last file is still important for the user, your 4-step strict protocol of
1. FLUSH BINARY LOGS, note the new GTID position.
2. Ensure that all slaves are past the problematic point with MASTER_GTID_WAIT(<pos>). After this, the old erroneous binlog files are no longer needed.
3. PURGE BINARY LOGS to remove the erroneous logs.
4. FLUSH BINARY LOG DELETE DOMAIN domain
might be equivalent to RESET MASTER, as the 'erroneous' log file is the last one. That's why I was content without p.3 and with a p.4 that does not necessarily error out.

Naturally I am fine with the strictness of 1-4. But I can't say on the user's behalf whether the new unyielding (that is, always erroring out) delete-domain FLUSH LOGS would always satisfy him. To dramatize the MDEV-12012 case with a complication: what if p.2 cannot be ensured, say, due to another, temporarily stopped slave which (for simplicity) does not care about the domain being deleted? On the one hand we can't purge the master's binlogs (the stopped-slave constraint); on the other, p.4 alone suffices for either slave (though the stopped one may need reconfiguration to filter out the deleted domain's events).

If my concern is practical, we may consider an *optionally* strict delete-domain FLUSH LOGS. The erroring-out version would maintain strict GTID semantics on the master. The liberal one would cover the above case as well. And the user would get to choose.
andrei.elkin@pp.inet.fi writes:
Let me propose methods to clean the master of unused GTID domains. I would be glad to hear your opinions, dear colleagues.
So a bit of background: The central idea in MariaDB GTID is the sequence of events that created the current master state. This is an abstract concept. Conceptually, the current state of this server is defined as executing a specific sequence of events (in practice it might have been restored from a backup or something). Abstractly, the server's binlog is exactly this sequence of events (in practice the early part probably no longer exists or possibly never did). The sequence is multi-streamed (one stream per domain). Everything (in GTID, but also in parallel replication and group commit) is based on the assumption that each stream in the binlog sequence is strictly ordered, at least on a single given server.
It is important to understand that it is the actual sequence of events that matters, conceptually. The actual GTID format of D-S-N is only an implementation detail that allows the code to work correctly. The sequence is defined by the binlog, not by the particular sequence numbers in GTID or other details.
When a slave connects to our master server, it presents its current position as a single event within each stream. By the above, this is sufficient to reliably find the correct position in the binlog to restart the slave from.
Because MariaDB replication is async, we cannot in general prevent different servers from erroneously ending up with different binlog sequences. However, we can ensure a consistent view of the sequence on a single server, and we can try to detect and flag any inconsistencies between servers as they are noticed.
This is why it is necessary to give an error if a slave presents a position containing an event that is not in the master's binlog. The master cannot know if this is because the slave is ahead (the event in question will arrive later on the master), or because replication has diverged (the event will never arrive on the master, and the replication position is not well defined). It is a central goal in GTID to avoid, as much as possible, silent incorrect operation in replication.
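To make the last three paragraphs concrete, here is a minimal sketch in Python of how a master could turn a slave's per-domain position into resume points, and why it must refuse a position it cannot find. It is a toy model only: the binlog is a flat list, and the names `Gtid`, `PositionNotInBinlog` and `find_resume_points` are hypothetical, not taken from the server code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gtid:
    """A GTID in the D-S-N format: domain id, server id, sequence number."""
    domain: int
    server: int
    seq_no: int

    @classmethod
    def parse(cls, text):
        d, s, n = (int(part) for part in text.split("-"))
        return cls(d, s, n)


class PositionNotInBinlog(Exception):
    """The slave presented a GTID that the master's binlog does not contain."""


def find_resume_points(binlog, slave_pos):
    """Return {domain: index in binlog just past the slave's last event}.

    binlog is a list of Gtid in commit order (a toy stand-in for the binlog).
    slave_pos maps each domain to the last Gtid the slave applied in it.
    """
    resume = {}
    for domain, gtid in slave_pos.items():
        if gtid not in binlog:
            # The master cannot tell "slave is ahead" from "diverged",
            # so the only safe reaction is an error.
            raise PositionNotInBinlog(f"{gtid.domain}-{gtid.server}-{gtid.seq_no}")
        resume[domain] = binlog.index(gtid) + 1
    return resume


binlog = [Gtid.parse(t) for t in ("0-1-1", "1-1-1", "0-1-2", "1-1-2")]
slave_pos = {0: Gtid.parse("0-1-1"), 1: Gtid.parse("1-1-2")}
print(find_resume_points(binlog, slave_pos))  # {0: 1, 1: 4}
```

Note that the lookup uses only membership and order in the binlog, never a comparison of sequence numbers, which mirrors the point that the sequence is defined by the binlog itself.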
With that explained, now onto some concrete comments/answers:
In these cases the old default domain-id is, from the user's perspective, permanently in the past. Its events have already been replicated, and no new ones will be generated or replicated.
But from the point of view of GTID semantics, the binlog sequence is still defined by this past, and in an inconsistent (and hence incorrect) way.
Therefore such a domain may conceptually be cleaned away from both the master's and the slaves' states.
So as you say, the erroneous state must be fixed for GTID to work correctly. One way is to discard the entire incorrect binlog with RESET MASTER. But this discussion is about fixing the binlog in-place, by (conceptually) replacing it with a variant which does not contain the problematic past.
The idea looks quite sane; I only could not grasp why the presence of the domains being deleted in the very first binlog's GTID_LIST_LOG_EVENT list warrants throwing an error. Maybe we should leave it up to the user, Kristian? That is, to decide which domain is garbage regardless of the binlog state history.
DELETE DOMAIN d1 replaces the conceptual binlog sequence with one in which domain d1 never existed. If there were actual binlog files containing events in d1, this would be a grave inconsistency.
For example, if an existing slave was still replicating events in d1 and a temporary network error caused it to reconnect to the master, it would fail to reconnect. A replicating slave without knowledge of d1 might start re-applying any events encountered. Basically, after DELETE DOMAIN d1, any binlog file containing d1 is invalid and useless, so it seems appropriate to require the user to PURGE BINARY LOGS to remove them first.
'Invalid and useless' is fair as long as the user opts for the strict semantics. But his actual practice may demand flexibility, I hope my example above is relevant.
SET @@SESSION.gtid_seq_no=18446744073709551615;
CREATE TABLE IF NOT EXISTS `table_dummy` (a INT);
SHOW LOCAL VARIABLES LIKE '%gtid_binlog_pos%';
11-1-18446744073709551615
SET @@SESSION.gtid_seq_no=0;
DROP TABLE `table_dummy`;
SHOW LOCAL VARIABLES LIKE '%gtid_binlog_pos%';
11-1-0
Ouch. That's a bug. This should give an error; I think it could lead to all kinds of extremely nasty problems :-(
I agree. And you don't just mean the zero sequence number is bogus, do you? There must be some reaction to the wrap-around itself, I believe.
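For illustration, the arithmetic behind the example can be sketched in Python. The value 18446744073709551615 is 2^64 - 1, the largest unsigned 64-bit number, so incrementing it in a 64-bit counter lands on 0. The check below is a hypothetical strict policy written for this discussion, not the server's actual behaviour; `next_seq_no` and `SeqNoError` are made-up names.

```python
MAX_SEQ_NO = 2**64 - 1  # 18446744073709551615, the largest unsigned 64-bit value

# What an unchecked 64-bit counter does at the top of its range:
print((MAX_SEQ_NO + 1) % 2**64)  # 0


class SeqNoError(ValueError):
    """The requested sequence number would break strict ordering."""


def next_seq_no(last, requested=None):
    """Pick the sequence number for the next event group in one domain.

    Hypothetical strict policy: the number must be strictly greater than
    the last one logged and must fit in 64 bits. Both the explicit 0 and
    the silent wrap-around are therefore rejected.
    """
    candidate = last + 1 if requested is None else requested
    if candidate > MAX_SEQ_NO:
        raise SeqNoError("sequence number range of the domain is exhausted")
    if candidate <= last:
        raise SeqNoError(f"seq_no {candidate} is not greater than {last}")
    return candidate
```

Under this policy `next_seq_no(MAX_SEQ_NO)` fails because the range is exhausted, and `next_seq_no(MAX_SEQ_NO, 0)` fails because 0 does not follow the last logged number. What the server should do once the range is exhausted is exactly the open question here.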
1. Leave wrapping around an old domain to the user, via running queries like the above;
2. The binary logger would be made to react to the fact of wrap-around with a binary log rotation (an "internal" FLUSH BINARY LOGS). And the new binlog file won't contain the wrapped-"away" domain (because there are no new event groups in it as yet).
I am not sure I understand you here. Are you suggesting that the GTID sequence wrap-around bug be instead declared a feature, and be documented as the way to delete a domain in the binlog? I do not think that is appropriate.
Let me highlight it a bit more. When the domain range gets filled up on the master, it can't just wrap around and log on, even if correctly starting with sequence number 1. In the presence of slaves, something like your p.2 synchronization would be required before the domain range could be reset and the number 1 reused. But the synchronization (with all slaves) makes the domain obsolete. And your strict semantics would require the p.3 purge at the time the range is reused (otherwise we would have two binlog files with the same GTID). Therefore I think the domain wrap-around relates to the old domain deletion.
As I see it, there are two sides to this.
(1). We want the master to "forget about the past" with respect to a given domain. This is easy. All that is needed is to rotate the binlog and omit the domain from the GTID_LIST event at the start of the new binlog. Because when the master searches back for a given GTID in the binlog, it stops when it sees a GTID_LIST event without that domain.
(2). We want to prevent a user accidentally putting the server into an inconsistent state with an incorrect DELETE DOMAIN command. This is ensured by the requirement that all existing binlog files are free of that domain. Should a slave later, incorrectly, try to access that domain, it will receive the wrong error (that it is diverged rather than that the necessary binlog file has been purged), but at least it _will_ get an error as it should, not silently corrupt replication.
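The stop rule in (1) can be sketched as follows. This is a simplified model under stated assumptions: each binlog file is reduced to the set of domains in its leading GTID_LIST plus a list of (domain, seq_no) events, and `BinlogFile` and `locate` are hypothetical names, not server functions.

```python
from dataclasses import dataclass, field


@dataclass
class BinlogFile:
    name: str
    gtid_list: dict  # domain -> last seq_no logged before this file started
    events: list = field(default_factory=list)  # (domain, seq_no) tuples


def locate(files, domain, seq_no):
    """Search newest-to-oldest; return the name of the file holding the event.

    Returns None when the search hits a file whose GTID_LIST lacks the
    domain: from there back, the master has "forgotten" that domain.
    """
    for f in reversed(files):
        if (domain, seq_no) in f.events:
            return f.name
        if domain not in f.gtid_list:
            return None  # stop: no past for this domain before this file
    return None


files = [
    BinlogFile("bin.1", {}, [(0, 1), (11, 1), (0, 2)]),
    BinlogFile("bin.2", {0: 2, 11: 1}, [(0, 3)]),
    BinlogFile("bin.3", {0: 3}, [(0, 4)]),  # domain 11 omitted on rotation
]
print(locate(files, 0, 2))   # bin.1
print(locate(files, 11, 1))  # None
```

In this example bin.1 still physically contains the event 11-1, yet the search never reaches it. That is the situation (2) is meant to rule out: old files that disagree with the new, domain-free view of the binlog.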
I think the requirement is a reasonable one. The domain was configured incorrectly, the binlog files containing it cannot be used safely with GTID. The procedure to fix it will then be:
1. FLUSH BINARY LOGS, note the new GTID position.
2. Ensure that all slaves are past the problematic point with MASTER_GTID_WAIT(<pos>). After this, the old erroneous binlog files are no longer needed.
3. PURGE BINARY LOGS to remove the erroneous logs.
4. FLUSH BINARY LOG DELETE DOMAIN d
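The four steps can be walked through on a toy model. The sketch below assumes a master reduced to a list of binlog files and a per-domain state; `ToyMaster` and its methods are hypothetical and only mimic the commands named above. The `strict` flag stands in for the optional strictness raised earlier in this thread; step 2 (waiting for the slaves) is not modelled.

```python
class DomainStillInBinlog(Exception):
    """Some existing binlog file still contains events of the domain."""


class ToyMaster:
    def __init__(self):
        self.files = [[]]  # each file is a list of (domain, seq_no) events
        self.state = {}    # domain -> last seq_no logged (the binlog state)

    def log(self, domain, seq_no):
        self.files[-1].append((domain, seq_no))
        self.state[domain] = seq_no

    def flush(self):
        """Step 1: rotate to a new file and report the current position."""
        self.files.append([])
        return dict(self.state)

    def purge_before_current(self):
        """Step 3: drop every file except the newest one."""
        self.files = self.files[-1:]

    def delete_domain(self, domain, strict=True):
        """Step 4: forget the domain and rotate to a file that omits it."""
        offending = [i for i, f in enumerate(self.files)
                     if any(d == domain for d, _ in f)]
        if strict and offending:
            raise DomainStillInBinlog(f"domain {domain} in files {offending}")
        self.state.pop(domain, None)
        self.files.append([])


m = ToyMaster()
m.log(0, 1); m.log(11, 1); m.log(0, 2)
pos = m.flush()           # step 1; pos is what slaves must reach in step 2
m.purge_before_current()  # step 3
m.delete_domain(11)       # step 4 succeeds: no file contains domain 11
print(m.state)            # {0: 2}
```

Calling `delete_domain(11)` before the purge raises `DomainStillInBinlog`, while `delete_domain(11, strict=False)` goes through regardless and leaves the old file in place, which is the trade-off under discussion.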
It is of course an option to not do (2). Just be aware that this goes against the whole philosophy that GTID was designed around - to prioritise consistency and "no silent corruption".
Hope this helps. Of course feel free to ask for more details on any point that is not clear.
- Kristian.
Thank you for discussing it with me! Andrei