Pavel Ivanov <pivanof@google.com> writes:
> Well, I'd argue that accumulating lots of cruft does matter. Let's say it accumulated 10,000 different GTIDs. It will significantly slow down the slave connection initialization and it will blow up binlog size. The first effect could be especially dangerous because if, e.g., 3 slaves connect to the master simultaneously, the master could fully consume 3 CPUs for a prolonged period of time, which may affect its ability to respond to client queries.
> Overall I guess I don't quite like this design decision. It basically means that in the proper configuration special care should be taken to make sure that server_id numbers don't get retired forever but get reused instead...
I do not think it will be a problem. Even with 10000 master failovers (that's roughly 10 times per day, every day for 3 years!), it's just 150kB in each binlog file, plus a single iteration over it during slave connect.

But if it turns out that I am wrong and it is in fact a problem, we can easily add something later that prunes this information. A given (domain_id, server_id, seq_no) GTID is only needed as long as there is any slave that might request to start replicating from this position. After that, it can be safely omitted in subsequent Gtid_list_log_events.

For example, each time we rotate the binlog, we can check for any GTID in the current Gtid_list_log_event that is unchanged from the Gtid_list_log_event of the first unpurged binlog file, and omit any such GTID from the next Gtid_list_log_event we write.

But there is no reason to spend time on that until we know it is a problem.

Maybe the deeper issue is that you would prefer a design where the sequence number is assumed unique by itself (or (domain_id, sequence_number) if using multi-source), and the code is allowed to silently break if the user violates this assumption?

This makes a lot of things simpler, of course. I did think a lot about this, and in the end I decided against it, because of numerous cases where this could make things harder for users who are perhaps not intimately familiar with how GTID works.

I am still hopeful that the current design can give the best of both worlds: something that "just works" in most cases for most users, and still provides what is needed for advanced users who can be expected to know what they are doing.

 - Kristian.
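
To make the pruning idea above concrete, here is a minimal sketch in Python. It is only an illustration, not the server code: the names `GtidList`, `prune_gtid_list` and `gtid_list_size_bytes` are hypothetical, a GTID list is modelled as a plain mapping from (domain_id, server_id) to the last seq_no, and the 16 bytes per entry (4 + 4 + 8) is an assumption about the on-disk encoding used just to get a size on the order of the figure quoted above.

```python
from typing import Dict, Tuple

# Hypothetical model of a Gtid_list_log_event:
# (domain_id, server_id) -> last seq_no logged for that pair.
GtidList = Dict[Tuple[int, int], int]

# Assumed encoding: 4-byte domain_id + 4-byte server_id + 8-byte seq_no.
ASSUMED_BYTES_PER_GTID = 16


def prune_gtid_list(current: GtidList, first_unpurged: GtidList) -> GtidList:
    """Build the GTID list for the next binlog file at rotation time.

    An entry that is identical in the current list and in the list of the
    first unpurged binlog file has not advanced in any binlog that still
    exists, so it is dropped from the next list.
    """
    return {
        key: seq_no
        for key, seq_no in current.items()
        if first_unpurged.get(key) != seq_no
    }


def gtid_list_size_bytes(gtid_list: GtidList) -> int:
    """Rough payload size of a GTID list under the assumed encoding."""
    return len(gtid_list) * ASSUMED_BYTES_PER_GTID


# Example: server 1 was retired long ago, server 2 is still active,
# and server 3 first appeared after the oldest remaining binlog.
first_unpurged = {(0, 1): 100, (0, 2): 200}
current = {(0, 1): 100, (0, 2): 250, (0, 3): 7}
next_list = prune_gtid_list(current, first_unpurged)
```

In this example the entry for server 1 is dropped, because it is the same in both lists, while servers 2 and 3 are kept. Under the assumed encoding, 10000 retained entries would take about 160000 bytes per binlog file.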