[Maria-developers] Incorrect format description event skipping in queue_event()
Kristian, You have the following comment in the queue_event() in sql/slave.cc: /* Do not queue any format description event that we receive after a reconnect where we are skipping over a partial event group received before the reconnect. (If we queued such an event, and it was the first format_description event after master restart, the slave SQL thread would think that the partial event group before it in the relay log was from a previous master crash and should be rolled back). */ I don't understand which failure scenario you are talking about here and I claim that this bypassing of queuing into relay log is incorrect. My reasoning is: relay log can always be rotated in the middle of event group. During rotation when new relay log file is created format description event is always written at the beginning of the file. So in this case SQL thread will see format description event in the middle of event group and according to my testing it doesn't rollback the active transaction because of that. So when do you think it will rollback? Now, you may say that there's no problem in skipping the format description event because everything works. But it's only for now. When IO thread is reconnecting it rotates relay log and as I said it writes format description event at the beginning of the new file. But it writes an event that it created itself, i.e. not the one that master have sent. And as format description event from master is not written into relay log SQL thread from this point on starts to use format description generated by slave which may be different from the one generated by master. It may lead to a broken replication and SQL thread may not recognize events in the relay log if difference between master's and slave's format descriptions is significant. Another somewhat related question: Gtid_log_event::peek() (as well as Gtid_log_event constructor from const char* buf) is implemented with assumption that Format_description_log_event::common_header_len is always equal to LOG_EVENT_HEADER_LEN. While currently it's true I believe common_header_len can be increased in future MariaDB versions. And when it happens old slave won't be able to replicate from a new master. I'd think this is a bug. What do you think? Pavel
Pavel Ivanov <pivanof@google.com> writes:
You have the following comment in the queue_event() in sql/slave.cc:
/* Do not queue any format description event that we receive after a reconnect where we are skipping over a partial event group received before the reconnect.
(If we queued such an event, and it was the first format_description event after master restart, the slave SQL thread would think that the partial event group before it in the relay log was from a previous master crash and should be rolled back). */
I don't understand which failure scenario you are talking about here and I claim that this bypassing of queuing into relay log is incorrect.
It is this code, in Format_description_log_event::do_apply_event(): /* As a transaction NEVER spans on 2 or more binlogs: if we have an active transaction at this point, the master died while writing the transaction to the binary log, i.e. while flushing the binlog cache to the binlog. XA guarantees that master has rolled back. So we roll back. Note: this event could be sent by the master to inform us of the format of its binlog; in other words maybe it is not at its original place when it comes to us; we'll know this by checking log_pos ("artificial" events have log_pos == 0). */ if (!is_artificial_event() && created && thd->transaction.all.ha_list) { /* This is not an error (XA is safe), just an information */ rli->report(INFORMATION_LEVEL, 0, "Rolling back unfinished transaction (no COMMIT " "or ROLLBACK in relay log). A probable cause is that " "the master died while writing the transaction to " "its binary log, thus rolled back too."); const_cast<Relay_log_info*>(rli)->cleanup_context(thd, 1); } Try it, comment out these the lines in slave.cc if (unlikely(mi->gtid_reconnect_event_skip_count && !mi->gtid_event_seen)) gtid_skip_enqueue= true; and run the rpl.rpl_gtid_reconnect test. It will fail because the above code rolls back part of a transaction. So let me first explain the background for the reconnect-handling code, and then why I think the skipping you pointed out is ok. And then if you still think there is an issue, please be persistent - this is a really tricky part of the replication code that is very hard to get right, so there could definitely still be bugs. The issue is when the slave IO thread reconnects to the master. Maybe because of intermittent network issue or master shutdown, or could be explicit STOP SLAVE IO_THREAD. So the SQL thread is running while the reconnect happens. If the reconnect happens in the middle of an event group, we have half a group written to the relay log and potentially already partially executed by the SQL thread. So we need to write the second half of the event group to the relaylog, no more and no less. However, since we connect at a GTID position, we will receive from the master all the events from the start of the event group. So what we do is that we remember how many events from the event group we already wrote to the relay log before the disconnect. And after reconnect we then skip that number of events before writing the rest of the event group to the relay log. [Note that for reconnect after full STOP SLAVE, there is no issue, as in that case we clear all relay logs and start over from a well-defined GTID position. BTW, this is Bug#69758 in the MySQL 5.6 GTID implementation. Basically, the developers totally ignored the slave reconnect issue ... :-/] So suppose now we have a two-statement transaction: BEGIN; UPDATE t1 SET a=a+1; UPDATE t2 SET b=b-1; COMMIT and that slave IO thread reconnects in the middle (between the two updates). First we receive the BEGIN and the first UPDATE and write it to relay log 1: Format_Description (slave) Format_Description (master) Gtid_list Binlog_checkpoint GTID BEGIN 0-1-1 Query UPDATE t1 SET a=a+1 Then we disconnect and reconnect and write the remaining events to relay log 2: Format_Description (slave) # master format description received but skipped # GTID 0-1-1 skipped # Query UPDATE t1 skipped UPDATE t2 SET b=b-1 XID COMMIT and the SQL thread sees an intact transaction. (I am not sure I got 100% right exactly which non-transaction events are written, but this is the overall idea). And so, the reason it is ok to skip the format description event from the master in relay log 2 is that this _only_ happens when we know for sure that the same master format description event was already written to an earlier relay log and seen by the slave. Or put another way, if we had not reconnected to the master at this point, we would not have received the master's format description event at this point in the first place, but things would still have worked without it obviously. Makes sense? Or did I miss some case that could make this cause problems?
When IO thread is reconnecting it rotates relay log and as I said it writes format description event at the beginning of the new file. But it writes an event that it created itself, i.e. not the one that master have sent. And as format description event from master is not written into relay log SQL thread from this point on starts to use format description generated by slave which may be different from the one generated by master. It may lead to a broken replication and SQL
But this must be the same problem with normal replication? Whenever the slave decides to rotate the relay log, it will write a format description event created by itself with no following format description created on the master. So it seems this must work somehow, though I'll frankly admit I do not understand the details of how this works (do you know?)
Another somewhat related question: Gtid_log_event::peek() (as well as Gtid_log_event constructor from const char* buf) is implemented with assumption that Format_description_log_event::common_header_len is always equal to LOG_EVENT_HEADER_LEN. While currently it's true I
Agree, it looks like a bug. Do you have the possibility to help with this? It is a bit hard for me to test such a fix as I do not have an easy way to generate binlogs with different header lengths, but I think perhaps that your team has such capability? - Kristian.
So suppose now we have a two-statement transaction:
BEGIN; UPDATE t1 SET a=a+1; UPDATE t2 SET b=b-1; COMMIT
and that slave IO thread reconnects in the middle (between the two updates).
First we receive the BEGIN and the first UPDATE and write it to relay log 1:
Format_Description (slave) Format_Description (master) Gtid_list Binlog_checkpoint GTID BEGIN 0-1-1 Query UPDATE t1 SET a=a+1
Then we disconnect and reconnect and write the remaining events to relay log 2:
Format_Description (slave) # master format description received but skipped # GTID 0-1-1 skipped # Query UPDATE t1 skipped UPDATE t2 SET b=b-1 XID COMMIT
and the SQL thread sees an intact transaction. (I am not sure I got 100% right exactly which non-transaction events are written, but this is the overall idea).
And so, the reason it is ok to skip the format description event from the master in relay log 2 is that this _only_ happens when we know for sure that the same master format description event was already written to an earlier relay log and seen by the slave. Or put another way, if we had not reconnected to the master at this point, we would not have received the master's format description event at this point in the first place, but things would still have worked without it obviously.
Makes sense? Or did I miss some case that could make this cause problems?
Yes, this causes problems because SQL thread sees Format_Description (slave) from relay log 2 and replaces its description_event_for_exec with it. So from this moment it continues to parse relay log events using slave's format description when all events actually created (and understood by IO thread) using master's format description.
When IO thread is reconnecting it rotates relay log and as I said it writes format description event at the beginning of the new file. But it writes an event that it created itself, i.e. not the one that master have sent. And as format description event from master is not written into relay log SQL thread from this point on starts to use format description generated by slave which may be different from the one generated by master. It may lead to a broken replication and SQL
But this must be the same problem with normal replication? Whenever the slave decides to rotate the relay log, it will write a format description event created by itself with no following format description created on the master. So it seems this must work somehow, though I'll frankly admit I do not understand the details of how this works (do you know?)
= 4 then it writes description_event_for_queue into relay log too. Also it ensures that the event has created = 0 and artificial_event set to 1. So SQL thread still gets the master's format description and doesn't rollback the transaction. When IO thread reconnects to master the first event it receives is Rotate. For Rotate event queue_event() executes process_io_rotate(). Inside there if mi->rli.relay_log.description_event_for_queue->binlog_version >= 4 it forcefully replaces description_event_for_queue with new event with binlog_version = 3. Then it does the actual relay log rotation during which description_event_for_queue is not written into the new log file (and it shouldn't as it's not master's at this point anyway). The next event IO thread receives is Format_Description. If it's written to relay log at this point then SQL thread will proceed using master's
Yes, I investigated this. During normal replication (when relay log is rotated automatically due to max_size) slave's format description is written at the beginning of the new realy log file, but right after that there's code that if description_event_for_queue->binlog_version format description. And I believe I tested the situation when IO thread reconnects in the middle of transaction and writes format description from master (I've commented those lines you pointed to) and I didn't see that transaction rolled back, everything was fine. Maybe I messed up with testing, I'll try to retest again...
Another somewhat related question: Gtid_log_event::peek() (as well as Gtid_log_event constructor from const char* buf) is implemented with assumption that Format_description_log_event::common_header_len is always equal to LOG_EVENT_HEADER_LEN. While currently it's true I
Agree, it looks like a bug.
Do you have the possibility to help with this? It is a bit hard for me to test such a fix as I do not have an easy way to generate binlogs with different header lengths, but I think perhaps that your team has such capability?
I can try, although first quick look showed that it can be tricky: peek() is used not only in IO thread, but during scanning binlogs too. I didn't look yet if code has correct description_event available at this point. Pavel
Pavel Ivanov <pivanof@google.com> writes:
But this must be the same problem with normal replication? Whenever the slave decides to rotate the relay log, it will write a format description event created by itself with no following format description created on the master. So it seems this must work somehow, though I'll frankly admit I do not understand the details of how this works (do you know?)
Yes, I investigated this. During normal replication (when relay log is rotated automatically due to max_size) slave's format description is written at the beginning of the new realy log file, but right after that there's code that if description_event_for_queue->binlog_version
= 4 then it writes description_event_for_queue into relay log too. Also it ensures that the event has created = 0 and artificial_event set to 1. So SQL thread still gets the master's format description and doesn't rollback the transaction.
When IO thread reconnects to master the first event it receives is Rotate. For Rotate event queue_event() executes process_io_rotate(). Inside there if mi->rli.relay_log.description_event_for_queue->binlog_version >= 4 it forcefully replaces description_event_for_queue with new event with binlog_version = 3. Then it does the actual relay log rotation during which description_event_for_queue is not written into the new log file (and it shouldn't as it's not master's at this point anyway). The next
I see. So one possible solution is to do the same at the reconnect case as what we do in relay-log rotate initiated by slave due to size: Write out the description_event_for_queue to the relay log with created=0 and artificial=1. I have attached a patch for this, what do you think? Do you have the possibility to test if this works (eg. when we get a reconnect when the master's description event is incompatible with the slave's)? [This code really is criminally ugly, even for replication standard. But I do not really know how to fix it in any reasonable way :-(]
Maybe I messed up with testing, I'll try to retest again...
FYI: I took unmodified 10.0-base, commented out those two lines, ran rpl.rpl_gtid_reconnect, and it failed. Thanks, - Kristian.
Krisitan, The attached patch doesn't work properly because the format description event it saves is different from what master has sent. It has different number_of_event_types and potentially it has garbage (or incomplete data) in post_header_len. And regarding the fix to Gtid_log_event that we were talking about: I've tried to do that but I felt like I don't understand the code quite well at the moment to do that. So apparently all peek*() functions should have Format_description added as parameter and it's easy to pass one from queue_event(), but then e.g. mysql_binlog_send() will need description event too to pass it to send_event_to_slave() and is_until_reached() so that those could pass it to peek*(). For correctness this format description should be read from binlog, right? That would be pretty non-trivial to do. Also gtid_state_from_pos() needs Format_descirption, but it will apparently meet that at the beginning of the binlog file, so it can be created before it will be actually needed. Although constructor of Format_description_log_event needs another instance of Format_description. Should it be just created like Format_description_log_event(BINLOG_VERSION)? Pavel On Thu, Aug 29, 2013 at 6:24 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Pavel Ivanov <pivanof@google.com> writes:
But this must be the same problem with normal replication? Whenever the slave decides to rotate the relay log, it will write a format description event created by itself with no following format description created on the master. So it seems this must work somehow, though I'll frankly admit I do not understand the details of how this works (do you know?)
Yes, I investigated this. During normal replication (when relay log is rotated automatically due to max_size) slave's format description is written at the beginning of the new realy log file, but right after that there's code that if description_event_for_queue->binlog_version
= 4 then it writes description_event_for_queue into relay log too. Also it ensures that the event has created = 0 and artificial_event set to 1. So SQL thread still gets the master's format description and doesn't rollback the transaction.
When IO thread reconnects to master the first event it receives is Rotate. For Rotate event queue_event() executes process_io_rotate(). Inside there if mi->rli.relay_log.description_event_for_queue->binlog_version >= 4 it forcefully replaces description_event_for_queue with new event with binlog_version = 3. Then it does the actual relay log rotation during which description_event_for_queue is not written into the new log file (and it shouldn't as it's not master's at this point anyway). The next
I see.
So one possible solution is to do the same at the reconnect case as what we do in relay-log rotate initiated by slave due to size: Write out the description_event_for_queue to the relay log with created=0 and artificial=1.
I have attached a patch for this, what do you think?
Do you have the possibility to test if this works (eg. when we get a reconnect when the master's description event is incompatible with the slave's)?
[This code really is criminally ugly, even for replication standard. But I do not really know how to fix it in any reasonable way :-(]
Maybe I messed up with testing, I'll try to retest again...
FYI: I took unmodified 10.0-base, commented out those two lines, ran rpl.rpl_gtid_reconnect, and it failed.
Thanks,
- Kristian.
Pavel Ivanov <pivanof@google.com> writes:
The attached patch doesn't work properly because the format description event it saves is different from what master has sent. It has different number_of_event_types and potentially it has garbage (or incomplete data) in post_header_len.
Aha, so it's because of this, in Format_description_log_event::write() ? uchar buff[FORMAT_DESCRIPTION_HEADER_LEN + BINLOG_CHECKSUM_ALG_DESC_LEN]; size_t rec_size= sizeof(buff); memcpy((char*) buff+ST_COMMON_HEADER_LEN_OFFSET + 1, (uchar*) post_header_len, LOG_EVENT_TYPES); ret= (write_header(file, rec_size) || wrapper_my_b_safe_write(file, buff, rec_size) || So it uses the static size of own server version, not the one of the event that was read. And we get the effect you describe. So then it seems the fix is to modify the created and/or artificial (log_pos) fields manually in the buffer containing the master's description event, fix the checksum, and then write it out normally (not skip)? Or do you have a better suggestion? What I do not understand is why we do not have the exact same bug when the slave rotates the relay log due to max_relay_log_size? It seems to me it does the exact same thing? It writes out description_event_for_queue which can have different number_of_event_types. Maybe the point is that yes, we may have different number of event types and garbage in post_header_len, but: - If the slave knows fewer types than master, the missing types cannot be read by slave anyway. - If the slave knows more types, then those types cannot be written by the master anyway, so the garbage will not be accessed. ... which is horrible, but not much can surprise me anymore about replication code. So what do you think?
quite well at the moment to do that. So apparently all peek*() functions should have Format_description added as parameter and it's easy to pass one from queue_event(), but then e.g. mysql_binlog_send() will need description event too to pass it to send_event_to_slave() and is_until_reached() so that those could pass it to peek*(). For correctness this format description should be read from binlog, right? That would be pretty non-trivial to do.
Isn't it just a matter of reading in each format_description_log_event as it is seen and holding on to it in some description_event_for_binlog_send? Should be easy, right?
Also gtid_state_from_pos() needs Format_descirption, but it will apparently meet that at the beginning of the binlog file, so it can be created before it will be actually needed. Although constructor of
Yeah, again it should be just a matter of reading the format_description_event found at the start of the binlog file.
Format_description_log_event needs another instance of Format_description. Should it be just created like Format_description_log_event(BINLOG_VERSION)?
Yes, that should be ok, I think. It seems to be only used for common_header_len, and the value of that is frozen for Format_description_log_event. If you are interested in helping with testing this, I can write a patch later, but I need a way to test it without having to spend unreasonable amounts of time on it. - Kristian.
On Fri, Aug 30, 2013 at 1:44 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Pavel Ivanov <pivanof@google.com> writes:
The attached patch doesn't work properly because the format description event it saves is different from what master has sent. It has different number_of_event_types and potentially it has garbage (or incomplete data) in post_header_len.
Aha, so it's because of this, in Format_description_log_event::write() ?
uchar buff[FORMAT_DESCRIPTION_HEADER_LEN + BINLOG_CHECKSUM_ALG_DESC_LEN]; size_t rec_size= sizeof(buff); memcpy((char*) buff+ST_COMMON_HEADER_LEN_OFFSET + 1, (uchar*) post_header_len, LOG_EVENT_TYPES); ret= (write_header(file, rec_size) || wrapper_my_b_safe_write(file, buff, rec_size) ||
So it uses the static size of own server version, not the one of the event that was read. And we get the effect you describe.
So then it seems the fix is to modify the created and/or artificial (log_pos) fields manually in the buffer containing the master's description event, fix the checksum, and then write it out normally (not skip)? Or do you have a better suggestion?
Why not change Format_description_log_event::write()? It seems actually that it will be more correct if we change it. I made the following change (line numbers may be off) and it worked: @@ -4922,16 +4947,15 @@ bool Format_description_log_event::write(IO_CACHE* file) We don't call Start_log_event_v3::write() because this would make 2 my_b_safe_write(). */ - uchar buff[FORMAT_DESCRIPTION_HEADER_LEN + BINLOG_CHECKSUM_ALG_DESC_LEN]; - size_t rec_size= sizeof(buff); + uchar buff[START_V3_HEADER_LEN+1]; + size_t rec_size= sizeof(buff) + BINLOG_CHECKSUM_ALG_DESC_LEN + + number_of_event_types; int2store(buff + ST_BINLOG_VER_OFFSET,binlog_version); memcpy((char*) buff + ST_SERVER_VER_OFFSET,server_version,ST_SERVER_VER_LEN); if (!dont_set_created) created= get_time(); int4store(buff + ST_CREATED_OFFSET,created); buff[ST_COMMON_HEADER_LEN_OFFSET]= common_header_len; - memcpy((char*) buff+ST_COMMON_HEADER_LEN_OFFSET + 1, (uchar*) post_header_len, - LOG_EVENT_TYPES); /* if checksum is requested record the checksum-algorithm descriptor next to @@ -4944,7 +4968,7 @@ bool Format_description_log_event::write(IO_CACHE* file) #ifndef DBUG_OFF data_written= 0; // to prepare for need_checksum assert #endif - buff[FORMAT_DESCRIPTION_HEADER_LEN]= need_checksum() ? + uchar checksum_byte= need_checksum() ? checksum_alg : (uint8) BINLOG_CHECKSUM_ALG_OFF; /* FD of checksum-aware server is always checksum-equipped, (V) is in, @@ -4964,7 +4988,9 @@ bool Format_description_log_event::write(IO_CACHE* file) checksum_alg= BINLOG_CHECKSUM_ALG_CRC32; // Forcing (V) room to fill anyway } ret= (write_header(file, rec_size) || - wrapper_my_b_safe_write(file, buff, rec_size) || + wrapper_my_b_safe_write(file, buff, sizeof(buff)) || + wrapper_my_b_safe_write(file, (uchar*) post_header_len, number_of_event_types) || + wrapper_my_b_safe_write(file, &checksum_byte, sizeof(checksum_byte)) || write_footer(file)); if (no_checksum) checksum_alg= BINLOG_CHECKSUM_ALG_OFF;
What I do not understand is why we do not have the exact same bug when the slave rotates the relay log due to max_relay_log_size? It seems to me it does the exact same thing? It writes out description_event_for_queue which can have different number_of_event_types.
Probably I just didn't test this properly yet. Looking at the code, yes it shouldn't work, but the write() fix above should fix both cases.
quite well at the moment to do that. So apparently all peek*() functions should have Format_description added as parameter and it's easy to pass one from queue_event(), but then e.g. mysql_binlog_send() will need description event too to pass it to send_event_to_slave() and is_until_reached() so that those could pass it to peek*(). For correctness this format description should be read from binlog, right? That would be pretty non-trivial to do.
Isn't it just a matter of reading in each format_description_log_event as it is seen and holding on to it in some description_event_for_binlog_send? Should be easy, right?
Well, mysql_binlog_send() doesn't start to read the binlog file from the beginning. And if slave connects not using GTID but by binlog position no one reads the beginning of the binlog file before mysql_binlog_send() too. So that means that we have to add scanning of binlogs from the beginning inside mysql_binlog_send(). Of course, it will have an additional benefit of issuing sane error message if the binlog position slave requests is in the middle of event. But it's still a non-trivial change...
If you are interested in helping with testing this, I can write a patch later, but I need a way to test it without having to spend unreasonable amounts of time on it.
Sure, I'll be happy to test. Pavel
Pavel Ivanov <pivanof@google.com> writes:
Why not change Format_description_log_event::write()? It seems actually that it will be more correct if we change it. I made the following change (line numbers may be off) and it worked:
Agree, that sounds like a good idea, thanks!
If you are interested in helping with testing this, I can write a patch later, but I need a way to test it without having to spend unreasonable amounts of time on it.
Sure, I'll be happy to test.
Ok, great. Attached is a new patch for this. It makes mysql_binlog_send() and gtid_state_from_pos() construct the Format_description_log_event and pass it to Gtid_log_event::peek() and Gtid_list_log_event() to get the correct (I hope) common_header_length. The patch also includes my original patch for outputting the master's Format_description event during slave reconnect, fixed with out suggested changes. - Kristian.
On Tue, Sep 3, 2013 at 3:55 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
If you are interested in helping with testing this, I can write a patch later, but I need a way to test it without having to spend unreasonable amounts of time on it.
Sure, I'll be happy to test.
Ok, great.
Attached is a new patch for this. It makes mysql_binlog_send() and gtid_state_from_pos() construct the Format_description_log_event and pass it to Gtid_log_event::peek() and Gtid_list_log_event() to get the correct (I hope) common_header_length.
The patch also includes my original patch for outputting the master's Format_description event during slave reconnect, fixed with out suggested changes.
Thanks. This patch works for me. Note that there's also Query_log_event::peek() that should have the same dependency on common_header_len. It's hard (if at all possible) for me to experience that code path in my tests, but for the sake of general correctness I guess it should be fixed in a similar way too. Another thing that I noticed looking through the code: in the queue_event() function in case of GTID_EVENT inc_pos doesn't seem to be assigned to anything. Is it a bug or am I missing something? Thank you, Pavel
Pavel Ivanov <pivanof@google.com> writes:
Thanks. This patch works for me.
Ok, I'll push it then.
Note that there's also Query_log_event::peek() that should have the same dependency on common_header_len. It's hard (if at all possible) for me to experience that code path in my tests, but for the sake of general correctness I guess it should be fixed in a similar way too.
If you look closely, you will see that Query_log_event::peek_is_commit_rollback() does not depend on the format of the event header - it just looks at the query from the end of the event. This is faster than first parsing the header and then skipping any extra information before the query. We need to call this for _every_ event sent to the slave to check if it is the end of an event group, so it should be fast. So it should be ok without dependency on common_header_len.
Another thing that I noticed looking through the code: in the queue_event() function in case of GTID_EVENT inc_pos doesn't seem to be assigned to anything. Is it a bug or am I missing something?
Yes, looks like a bug, thanks for spotting! - Kristian.
On Wed, Sep 4, 2013 at 3:23 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Thanks. This patch works for me.
Ok, I'll push it then.
I've just noticed that my claim that patch works is based on my tree which also includes fixes from https://mariadb.atlassian.net/browse/MDEV-4645, in particular *_3 patch is of notice. So probably you want to review that and push too.
Note that there's also Query_log_event::peek() that should have the same dependency on common_header_len. It's hard (if at all possible) for me to experience that code path in my tests, but for the sake of general correctness I guess it should be fixed in a similar way too.
If you look closely, you will see that Query_log_event::peek_is_commit_rollback() does not depend on the format of the event header - it just looks at the query from the end of the event. This is faster than first parsing the header and then skipping any extra information before the query. We need to call this for _every_ event sent to the slave to check if it is the end of an event group, so it should be fast.
So it should be ok without dependency on common_header_len.
Right, but it checks if event_len < LOG_EVENT_HEADER_LEN + QUERY_HEADER_LEN. I'm not sure if this will be always false when common_header_len > LOG_EVENT_HEADER_LEN. Probably just this condition should be removed as there's check for event_len < 9 anyway. Pavel
participants (2)
-
Kristian Nielsen
-
Pavel Ivanov