So suppose now we have a two-statement transaction:
BEGIN; UPDATE t1 SET a=a+1; UPDATE t2 SET b=b-1; COMMIT
and that slave IO thread reconnects in the middle (between the two updates).
First we receive the BEGIN and the first UPDATE and write it to relay log 1:
Format_Description (slave) Format_Description (master) Gtid_list Binlog_checkpoint GTID BEGIN 0-1-1 Query UPDATE t1 SET a=a+1
Then we disconnect and reconnect and write the remaining events to relay log 2:
Format_Description (slave) # master format description received but skipped # GTID 0-1-1 skipped # Query UPDATE t1 skipped UPDATE t2 SET b=b-1 XID COMMIT
and the SQL thread sees an intact transaction. (I am not sure I got 100% right exactly which non-transaction events are written, but this is the overall idea).
And so, the reason it is ok to skip the format description event from the master in relay log 2 is that this _only_ happens when we know for sure that the same master format description event was already written to an earlier relay log and seen by the slave. Or put another way, if we had not reconnected to the master at this point, we would not have received the master's format description event at this point in the first place, but things would still have worked without it obviously.
Makes sense? Or did I miss some case that could make this cause problems?
Yes, this causes problems because SQL thread sees Format_Description (slave) from relay log 2 and replaces its description_event_for_exec with it. So from this moment it continues to parse relay log events using slave's format description when all events actually created (and understood by IO thread) using master's format description.
When IO thread is reconnecting it rotates relay log and as I said it writes format description event at the beginning of the new file. But it writes an event that it created itself, i.e. not the one that master have sent. And as format description event from master is not written into relay log SQL thread from this point on starts to use format description generated by slave which may be different from the one generated by master. It may lead to a broken replication and SQL
But this must be the same problem with normal replication? Whenever the slave decides to rotate the relay log, it will write a format description event created by itself with no following format description created on the master. So it seems this must work somehow, though I'll frankly admit I do not understand the details of how this works (do you know?)
= 4 then it writes description_event_for_queue into relay log too. Also it ensures that the event has created = 0 and artificial_event set to 1. So SQL thread still gets the master's format description and doesn't rollback the transaction. When IO thread reconnects to master the first event it receives is Rotate. For Rotate event queue_event() executes process_io_rotate(). Inside there if mi->rli.relay_log.description_event_for_queue->binlog_version >= 4 it forcefully replaces description_event_for_queue with new event with binlog_version = 3. Then it does the actual relay log rotation during which description_event_for_queue is not written into the new log file (and it shouldn't as it's not master's at this point anyway). The next event IO thread receives is Format_Description. If it's written to relay log at this point then SQL thread will proceed using master's
Yes, I investigated this. During normal replication (when relay log is rotated automatically due to max_size) slave's format description is written at the beginning of the new realy log file, but right after that there's code that if description_event_for_queue->binlog_version format description. And I believe I tested the situation when IO thread reconnects in the middle of transaction and writes format description from master (I've commented those lines you pointed to) and I didn't see that transaction rolled back, everything was fine. Maybe I messed up with testing, I'll try to retest again...
Another somewhat related question: Gtid_log_event::peek() (as well as Gtid_log_event constructor from const char* buf) is implemented with assumption that Format_description_log_event::common_header_len is always equal to LOG_EVENT_HEADER_LEN. While currently it's true I
Agree, it looks like a bug.
Do you have the possibility to help with this? It is a bit hard for me to test such a fix as I do not have an easy way to generate binlogs with different header lengths, but I think perhaps that your team has such capability?
I can try, although first quick look showed that it can be tricky: peek() is used not only in IO thread, but during scanning binlogs too. I didn't look yet if code has correct description_event available at this point. Pavel