Hi Kristian, Sorry for the late response.
On 12 Sep 2017, at 10:21, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Simon Mudd <simon.mudd@booking.com> writes:
ids. Obviously once all appropriate bin logs have been purged (naturally by other means) then no special processing will be needed.
Right. Hence my original idea (which was unfortunately never implemented so far). If at some point a domain has been unused for so long that all GTIDs in that domain are gone, it is relatively safe to pretend that the domain never existed.
I would like to understand if you can think of significant use cases where the DBA needs to have active binlog files in the master containing some domain, while simultaneously pretending that this domain never existed.
The only case I can think of would be this: Imagine a replication chain of M[aster] —> S[lave]1, S[lave]2, A[ggregate]1 and A[ggregate]1 —> A[ggregate]2 , A[ggregate]3, …. A1 one has all the data of M but also extra databases where aggregate information is made of the original data in M/S1/… Additionally it might make sense that writes to A1 use a different domain id. If M dies and say A1 happens to be more up to date than S1, S2 then we may want to promote A1 to be the new master, and move S1, S2 under A1, move A2 under A1 (but promote as the aggregate writeable master), and move A3 under A2. This would not be the “desired” setup as probably we’d end up thowing away all the aggregate data on A1. The topology would end up as: A1 [ignore any aggregate database/data] -> S1, S2, A2 and A2 —> A3 In this specific case it may be you really do want to hide the 2 sets of domains and only show one to the S1, S2 boxes, but maintain 2 domains on A2, A3. I have such a setup but have never had to handle such a failure scenario and it may be that there are better ways to handle this but one thing I’m sure about: I wouldn’t want to wipe out the binlogs especially on A1 as they hold valuable information which may not be stored anywhere else. Does that answer your question?
Or if it is more of a general concern, and the inconvenience for users to have to save old binlogs somewhere else than the master's data directory and binlog index (SHOW BINARY LOGS).
If you have to "save binlogs somewhere” you’re doing stuff manually. If you manage a lot of servers that’s really undesirable. ...
I understand the desire to not delete binlog files.
The problem is: If you want to have GTIDs with some domain in your active binlog files, _and_ you also want to pretend that this domain never existed, what does it mean? What is the semantics? It creates a lot of complexities for defining the semantics, for documenting it, for the users to understand it, and for the code to implement it correctly.
Yes. However, it also reflects realities of a slave connecting and trying to get back in sync to a master which may have “more information” (in terms of GTID info) than the slave but which is unable to “serve that information” to the slave. I’ve forgotten now exactly how this is handled by MariaDB but know that in a situation like that with MySQL GTID the sync just won’t work as you can’t afaik tell the slave to “not worry about “uuid:X” or “domain:Y” but just try to sync the rest. And in MySQL GTID at least injecting millions of empty events to get the slave’s GTID state inline with the missing UUID would be well ... rather stupid. In the perfect world this situation never happens. In the real world it does and the DBA often has to just live with inconsistencies between data stored on a master and a slave (which he can fix later) but *let replication flow*. It depends but in my opinion in most cases letting replication flow is more important than having 100% master and slave consistency. The longer the slave is stopped the more differences there are. And when you get in a situation like this you’re very tempted to go back to binlog file plus position, to scan the bin logs with tools like mysqlbinlog and do it the old way like we used to do years ago. This is tedious and error prone but if you’re careful it works fine. The whole idea of GTID is to avoid the DBA ever having to do this…
So basically, I do not understand what is the intended meaning of FLUSH BINARY LOGS DELETE DOMAIN d _and_ at the same time keeping GTIDs with domain d around in active binlog files? In what respects is the domain deleted, and in what respects not?
expire_logs_days allows you to keep bin logs for a time you’re comfortable with. So you can restore from an old system and then roll forward and do point in time recovery or simply catch up with the master. Normally slaves are _well_ ahead of the oldest bin logs. You have a copy of the bin logs on the master and if log_slave_updates is enabled an "equivalent backup” on a number of slaves. You normally don’t expect to use these files, but they’re there if you need them. If you wipe binlogs earlier than the configure expiration then you may need to keep those bin logs somewhere else. That’s just more to manage as the expiry is handled “automatically" for you by MariaDB. So I see the DELETE DOMAIN (MariaDB) or “remove old UUID” (MySQL) type request to be one that means the master will only pretend that it can serve or knows about the remaining domains or UUIDs and if the slaves are sufficiently up to date they really don’t care as their vision will be similar. Such a command would be replicated, right? It has to be for the slaves to change “their view” at the same moment in replication (not necessarily time) as the master. You need to be aware that slave’s won’t be able to request back before this point but that’s our job. This is after all a clean up / housekeeping type problem, and not something you’ll do much. My guess once every few months or once a year and that’s if you have a large number of domains or uuids. So with MariaDB replication I’d not expect this to happen much. For Oracle GTID since there’s a UUID for each master then master switchovers will add a new UUID to the list and if you do a few of these a year it seems quite possible that you may have a gtid_executed containing 20 or 30 UUIDs. That really is messy and in need of a clean up. MariaDB’s situation is likely I think to require a much smaller number of such clean ups but the idea of why this might be needed I think is the same.
For the master, the binlog files are mainly used to stream to connecting slaves. Deleting a domain means replacing the conceptual binlog history with one in which that domain never existed. So that domain will be ignored in a connecting slaves position, assuming it is served by another multi-source master. If a new GTID in that domain appears later, it will be considered the very first GTID ever in that domain.
So consider what happens if there is anyway GTIDs in that domain deeper in the binlog:
1. An already connected slave may be happily replicating those GTIDs. If that slave reconnects (temporary network error for example), it will instead fail with unknown GTID, or perhaps just start silently ignoring all further GTIDs in that domain. This kind of unpredictable behaviour seems bad.
2. Suppose a slave connects with a position without the deleted domain. The master starts reading the binlog from some point. What happens if a GTID is encountered that contains the deleted domain? The slave will start replicating that domain from some arbitrary point that depends on where it happened to be in other domains at the last disconnect. This also seems undesirable.
There may be other scenarios that I did not think about.
The maintenance we’re talking about should happen rather infrequently. So the situations of it going wrong and “causing confusion” are because you don’t control your replication well enough or are too eager to remove domains while adding new ones. In practice I think this is unlikely to happen but if it does it will be user error because they did not understand what they were doing. Appropriate warnings are definitely in order in the documentation of such commands and their usage.
DBAs do not like to remove bin logs “early" as unless you keep a copy somewhere you may lose valuable information, for recovery, for backups etc. Not everyone will be making automatic copies (as MySQL does not provide an automatic way to do this)
Understood. Maybe what is needed is a PURGE BINARY LOGS that removes the entries from the binlog index (SHOW BINARY LOGS), but leaves the files in the file system for the convenience of the sysadmin? (Well, you can just hand-edit binlog.index, but that requires master restart I think).
You’re back to doing stuff manually. Please: no. It doesn’t scale.
The other comment I see mentioned here was “make sure all slaves are up to date”. That’s going to be hard. The master can only be aware of “connected slaves” and if you have intermediate masters, or a
Indeed, the master cannot ensure this. The idea is that the DBA, who decides to delete a domain, must understand that this should not be done if any slave still needs GTIDs from that domain. This is similar to configuring normal binlog purge, where the DBA needs to ensure that binlogs are kept long enough for the needs of the slowest slave.
Yep. The DBA needs to be smart enough to “not worry about the problem” or to understand what he’s doing. So documentation needs warnings, though of course no-one ever reads the docs...
FWIW expiring old domains is good to do. There’s a similar FR for
completely different the problem space is the same. Coming up with a solution which is simple to use and understand and also avoids where that’s possible making mistakes which may break replication is good. So thanks for looking at this.
Indeed. And the input from people like you with strong operational experience is very valuable to end up with a good solution, hence my request for additional input.
Thanks for sharing your idea of things. Most of these things seem unimportant if you manage one or two servers and can shut them down and reconfigure them at your own convenience. However, when you manage more systems and can not shutdown masters the software (MariaDB/MySQL) needs to help you and make this sort of task an easy no-brainer which we can forget about, allowing us to get on with other things. Any help you can provide in solving these problems is certainly appreciated. Simon