Re: [Maria-developers] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

7 Sep 2017

      Kristian, salute.

Let me jump straight to the high-level specification; afterwards I will
remark on, or delve into, specific parts of the text.

Your last reply made it explicit that you mean a totally strict setup
on the master (p.(2) of the following list):

KN> (1). We want the master to "forget about the past" with respect to a
    given domain. This is easy. All that is needed is to rotate the
    binlog and omit the domain from the GTID_LIST event at the start of
    the new binlog. Because when the master searches back for a given
    GTID in the binlog, it stops when it sees a GTID_LIST event without
    that domain.

KN> (2). We want to prevent a user accidentally putting the server into an
    inconsistent state with an incorrect DELETE DOMAIN command. This is
    ensured by the requirement that all existing binlog files are free
    of that domain.  Should a slave later, incorrectly, try to access
    that domain, it will receive the wrong error (that it is diverged
    rather than that the necessary binlog file has been purged), but at
    least it _will_ get an error as it should, not silently corrupt
    replication.
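
To make point (1) concrete, here is a minimal sketch of the backward
search as described above. It is a toy model in Python, not the server
code: the BinlogFile class, the files_scanned_for() helper and the sample
file names are all hypothetical, and the real search also compares
sequence numbers, which is left out here.

```python
from dataclasses import dataclass


@dataclass
class BinlogFile:
    name: str
    # Domains named in the GTID_LIST event at the start of this file.
    gtid_list_domains: frozenset


def files_scanned_for(binlogs, domain):
    """Return the file names visited when searching back for `domain`.

    `binlogs` is ordered oldest first. The walk goes from newest to oldest
    and stops at the first file whose GTID_LIST does not mention the domain.
    """
    scanned = []
    for binlog in reversed(binlogs):
        scanned.append(binlog.name)
        if domain not in binlog.gtid_list_domains:
            break
    return scanned


# Hypothetical history: domain 11 was in use, then omitted at the last rotation.
binlogs = [
    BinlogFile("binlog.000001", frozenset()),
    BinlogFile("binlog.000002", frozenset({0, 11})),
    BinlogFile("binlog.000003", frozenset({0})),
]

print(files_scanned_for(binlogs, 11))
print(files_scanned_for(binlogs, 0))
```

In this model the search for domain 11 never looks past binlog.000003,
while the search for domain 0 walks back through all three files.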

In contrast, I had in mind a "liberal" setup resting on "the user
must know what he is doing". I did so because I saw no other way to help
the MDEV-12012 use case. Indeed, when the undesired domain's events
reside in the very last binlog file, and the history before that file
is still important to the user, your 4-step strict protocol of

    1. FLUSH BINARY LOGS, note the new GTID position.
    2. Ensure that all slaves are past the problematic point with
       MASTER_GTID_WAIT(<pos>). After this, the old erroneous binlog
       files are no longer needed.
    3. PURGE BINARY LOGS to remove the erroneous logs.
    4. FLUSH BINARY LOG DELETE DOMAIN domain

might be equivalent to RESET MASTER, as the 'erroneous' log file is the
last one. That's why I was content without p.3 and with a p.4 that does
not necessarily error out.

Naturally I am fine with the strictness of 1-4. But I can't say on the
user's behalf whether the new unyielding (that is, always erroring out)
delete-domain FLUSH LOGS would always be satisfactory.

To dramatize the MDEV-12012 case with a complication: what if p.2 can't
be ensured, say, because of another, temporarily stopped slave which (for
simplicity) does not care about the domain being deleted?
On the one hand we can't purge the master's binlogs (the stopped slave
constraint); on the other hand p.4 alone suffices for either slave (though
the stopped one may need reconfiguration to filter out the deleted
domain's events).

If my concern is a practical one, we may consider an *optionally* strict
delete-domain FLUSH LOGS. The version that errors out would maintain
strict GTID semantics on the master. The liberal one would cover the above
case as well. And the choice would be left to the user.
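
To illustrate the two flavours, here is a rough Python model of an
optionally strict delete-domain rotation. Everything in it is an
assumption made for illustration: the flush_delete_domain() function, the
DomainStillInBinlog error, the dict layout of a binlog file and the file
naming are hypothetical and do not describe the server code or any
agreed syntax.

```python
class DomainStillInBinlog(Exception):
    """Raised by the strict variant when old files still hold the domain."""


def flush_delete_domain(binlogs, domain, strict=True):
    """Rotate the binlog and omit `domain` from the new file's GTID_LIST.

    `binlogs` is a non-empty list of dicts, oldest first, each with:
      "name"      - file name
      "gtid_list" - set of domains in the file's leading GTID_LIST event
      "events"    - set of domains that have event groups in the file
    Returns a new list with the freshly rotated file appended.
    """
    if strict:
        # Strict semantics: every existing file must be free of the domain.
        offending = [b["name"] for b in binlogs if domain in b["events"]]
        if offending:
            raise DomainStillInBinlog(
                f"domain {domain} still present in {offending}; purge them first"
            )
    # Domains known so far: everything listed or logged in the last file.
    last = binlogs[-1]
    known = last["gtid_list"] | last["events"]
    new_file = {
        # Hypothetical naming; assumes no file has been purged yet.
        "name": f"binlog.{len(binlogs) + 1:06d}",
        "gtid_list": known - {domain},
        "events": set(),
    }
    return binlogs + [new_file]


# The MDEV-12012-like case: the unwanted domain 11 sits in the last file.
history = [
    {"name": "binlog.000001", "gtid_list": set(), "events": {0}},
    {"name": "binlog.000002", "gtid_list": {0}, "events": {0, 11}},
]

liberal = flush_delete_domain(history, 11, strict=False)
print(liberal[-1])

try:
    flush_delete_domain(history, 11, strict=True)
except DomainStillInBinlog as err:
    print("strict variant refused:", err)
```

The liberal call succeeds and keeps the older history; the strict call
refuses until binlog.000002 is purged, which in this example leaves
nothing of domain 11 behind but also costs the user that file.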
...
andrei.elkin@pp.inet.fi writes:
...
Let me propose methods to clean the master of unused GTID domains.
I would be glad to hear your opinions, dear colleagues.

So a bit of background: The central idea in MariaDB GTID is the sequence of
events that created the current master state. This is an abstract concept.
Conceptually, the current state of this server is defined as executing a
specific sequence of events (in practice it might have been restored from a
backup or something). Abstractly, the server's binlog is exactly this
sequence of events (in practice the early part probably no longer exists or
possibly never did). The sequence is multi-streamed (one stream per domain).
Everything (in GTID, but also in parallel replication and group commit) is
based on the assumption that each stream in the binlog sequence is strictly
ordered, at least on a single given server.

It is important to understand that it is the actual sequence of events that
matters, conceptually. The actual GTID format of D-S-N is only an
implementation detail that allows the code to work correctly. The sequence
is defined by the binlog, not by the particular sequence numbers in GTID or
other details.

When a slave connects to our master server, it presents its current position
as a single event within each stream. By the above, this is sufficient to
reliably find the correct position in the binlog to restart the slave from.

Because MariaDB replication is async, we cannot in general prevent different
servers from erroneously ending up with different binlog sequences. However,
we can ensure a consistent view of the sequence on a single server, and we
can try to detect and flag any inconsistencies between servers as they are
noticed.

This is why it is necessary to give an error if a slave presents a position
containing an event that is not in the master's binlog. The master cannot
know if this is because the slave is ahead (the event in question will
arrive later on the master), or because replication has diverged (the event
will never arrive on the master, and the replication position is not well
defined). It is a central goal in GTID to avoid, as much as possible, silent
incorrect operation in replication.
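
The connect-time check described above can be sketched as a small Python
model. It is only an illustration under stated assumptions: the
resume_points() function, the UnknownGtidError exception and the
dict-of-lists representation of the binlog are hypothetical, and purged
files are ignored entirely.

```python
class UnknownGtidError(Exception):
    """The slave presented an event that the master's binlog does not contain."""


def resume_points(master_binlog, slave_pos):
    """Find, per domain, how many events of that stream the slave already has.

    master_binlog: dict mapping domain -> list of (server_id, seq_no) in
                   binlog order (one strictly ordered stream per domain).
    slave_pos:     dict mapping domain -> the single (server_id, seq_no)
                   that the slave applied last in that stream.
    """
    points = {}
    for domain, gtid in slave_pos.items():
        stream = master_binlog.get(domain, [])
        if gtid not in stream:
            # The master cannot tell "slave is ahead" from "diverged",
            # so the only safe reaction is to report an error.
            raise UnknownGtidError(f"{domain}-{gtid[0]}-{gtid[1]} not in binlog")
        points[domain] = stream.index(gtid) + 1
    return points


# Hypothetical master history with two domains.
master = {0: [(1, 1), (1, 2), (1, 3)], 11: [(1, 1)]}

print(resume_points(master, {0: (1, 2), 11: (1, 1)}))

try:
    resume_points(master, {0: (1, 7)})
except UnknownGtidError as err:
    print("refused:", err)
```

One position per stream is enough to locate the restart point, and a
position the master has never logged is rejected rather than guessed at.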
With that explained, now onto some concrete comments/answers:
...
The past default domain-id is actually a permanent past from the user's
perspective in these cases. Its events have already been replicated, and
no new ones will be generated or replicated.

But from the point of view of GTID semantics, the binlog sequence is still
defined by this past, and in an inconsistent (and hence incorrect) way.
...
Therefore such a domain may conceptually be cleaned away from both the
master's and the slaves' states.

So as you say, the erroneous state must be fixed for GTID to work
correctly. One way is to discard the entire incorrect binlog with RESET
MASTER. But this discussion is about fixing the binlog in-place, by
(conceptually) replacing it with a variant which does not contain the
problematic past.
...
The idea looks quite sane; I only could not grasp why the presence of the
domains being deleted in the very first binlog's GTID_LIST_LOG_EVENT list
warrants throwing an error.
Maybe we should leave it to the user, Kristian? That is, let the user decide
which domain is garbage regardless of the binlog state history.

DELETE DOMAIN d1 replaces the conceptual binlog sequence with one in which
domain d1 never existed. If there were actual binlog files containing
events in d1, this would be a grave inconsistency.

For example, if an existing slave was still replicating events in d1, if a
temporary network error caused it to reconnect to the master, it would fail
to reconnect. A slave without knowledge of d1 replicating might start
re-applying any events encountered. Basically, after DELETE DOMAIN d1, any
binlog file containing d1 is invalid and useless, so it seems appropriate to
require the user to PURGE BINARY LOG them first.

'Invalid and useless' is fair as long as the user opts for the strict
semantics. But his actual practice may demand flexibility; I hope my
example above is relevant.
...
...
SET @@SESSION.gtid_seq_no=18446744073709551615;
-- A column list is needed for valid syntax; (id INT) is an arbitrary choice.
CREATE TABLE IF NOT EXISTS `table_dummy` (id INT);
SHOW LOCAL VARIABLES LIKE '%gtid_binlog_pos%';
-- gtid_binlog_pos: 11-1-18446744073709551615
SET @@SESSION.gtid_seq_no=0;
DROP TABLE `table_dummy`;
SHOW LOCAL VARIABLES LIKE '%gtid_binlog_pos%';
-- gtid_binlog_pos: 11-1-0
Ouch. That's a bug. This should give an error, I think that could lead to
all kinds of extremely nasty problems :-(

I agree. And you don't just mean that the zero sequence number is bogus, do
you? There must be some reaction to the wrap-around itself, I believe.
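
For reference, the arithmetic behind the example: 18446744073709551615 is
2^64 - 1, the largest value of an unsigned 64-bit counter. The Python
sketch below contrasts a silently wrapping increment with one that
reports the overflow. It assumes that the sequence number is such a
64-bit counter; both helper functions are hypothetical, and whether the
server ever wraps on its own is not something this thread establishes.

```python
MAX_SEQ_NO = 2**64 - 1  # 18446744073709551615, as used in the example above


def next_seq_no_unchecked(seq_no):
    """Increment modulo 2**64, i.e. wrap silently like a C unsigned 64-bit add."""
    return (seq_no + 1) % 2**64


def next_seq_no_checked(seq_no):
    """Increment, but report exhaustion of the range instead of wrapping."""
    if seq_no >= MAX_SEQ_NO:
        raise OverflowError("sequence numbers of this domain are exhausted")
    return seq_no + 1


print(next_seq_no_unchecked(MAX_SEQ_NO))

try:
    next_seq_no_checked(MAX_SEQ_NO)
except OverflowError as err:
    print("refused:", err)
```

The unchecked variant lands on 0, the same bogus value as in 11-1-0
above; the checked variant is one possible "reaction to the wrap-around".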
...
...
1. Leave wrapping around an old domain to the user, by running
   queries like the ones above;
2. The binary logger would be made to react to the wrap-around
   with a binary log rotation (an "internal" FLUSH BINARY LOG). And the new
   binlog file won't contain the wrapped "away" domain (because there
   are no new event groups of it in there as yet).

I am not sure I understand you here. Are you suggesting that the GTID
sequence wrap-around bug be instead declared a feature, and be documented as
the way to delete a domain in the binlog? I do not think that is
appropriate.

Let me elaborate a bit.
When the domain's sequence range gets filled up on the master, the master
can't just wrap around and log on, even if it correctly restarts with
sequence number 1. In the presence of slaves, something like your p.2
synchronization would be required before the domain range could be reset
and the number 1 reused.

But the synchronization (with all slaves) makes the domain obsolete. And
your strict semantics would require the p.3 purge at the time the range is
reused (otherwise we would have two binlog files with the same GTID).

Therefore I think the domain wrap-around is related to the deletion of
an old domain.
...
As I see it, there are two sides to this.

(1). We want the master to "forget about the past" with respect to a given
domain. This is easy. All that is needed is to rotate the binlog and omit
the domain from the GTID_LIST event at the start of the new binlog. Because
when the master searches back for a given GTID in the binlog, it stops when
it sees a GTID_LIST event without that domain.

(2). We want to prevent a user accidentally putting the server into an
inconsistent state with an incorrect DELETE DOMAIN command. This is ensured
by the requirement that all existing binlog files are free of that domain.
Should a slave later, incorrectly, try to access that domain, it will
receive the wrong error (that it is diverged rather than that the necessary
binlog file has been purged), but at least it _will_ get an error as it
should, not silently corrupt replication.

I think the requirement is a reasonable one. The domain was configured
incorrectly, the binlog files containing it cannot be used safely with GTID.
The procedure to fix it will then be:

1. FLUSH BINARY LOGS, note the new GTID position.
2. Ensure that all slaves are past the problematic point with
   MASTER_GTID_WAIT(<pos>). After this, the old erroneous binlog files are
   no longer needed.
3. PURGE BINARY LOGS to remove the erroneous logs.
4. FLUSH BINARY LOG DELETE DOMAIN d

It is of course an option to not do (2). Just be aware that this goes
against the whole philosophy that GTID was designed around - to prioritise
consistency and "no silent corruption".

Hope this helps. Of course feel free to ask for more details on any point
that is not clear.

- Kristian.

Thank you for discussing it with me!

Andrei
