Hi!
"Pavel" == Pavel Ivanov <pivanof@google.com> writes:
Pavel> sql/transaction.cc:139(trans_begin(THD*, unsigned int))[0x788e20] Pavel> sql/log_event.cc:6478(Gtid_log_event::do_apply_event(rpl_group_info*))[0x93a685] Pavel> sql/log_event.h:1341(Log_event::apply_event(rpl_group_info*))[0x5ca108] Pavel> sql/slave.cc:3191(apply_event_and_update_pos(Log_event*, THD*, Pavel> rpl_group_info*, rpl_parallel_thread*))[0x5c0da8] Pavel> sql/slave.cc:3464(exec_relay_log_event)[0x5c1498] Pavel> sql/slave.cc:4516(handle_slave_sql)[0x5c44e9]
Pavel> Which means that slave tries to execute BEGIN event while OPTION_BEGIN Pavel> is set which shouldn't ever happen.
The assert you put in the code doesn't show that anything is wrong.
The reason is the following code in log_event.cc:
Pavel> On Wed, Feb 26, 2014 at 8:12 PM, Michael Widenius <monty@askmonty.org> wrote: Pavel> And then it said that slave died with the stack trace thd-> variables.option_bits|= OPTION_BEGIN | OPTION_GTID_BEGIN;
DBUG_PRINT("info", ("Set OPTION_GTID_BEGIN")); trans_begin(thd, 0);
In other words, we do set OPTION_BEGIN just before calling trans_begin(), so the assert is wrong.
Pavel> To me this code looks clearly wrong. You set OPTION_BEGIN just before Pavel> calling trans_begin() which forces trans_begin() to kick off the Pavel> commit machinery. And even though it ends up doing nothing, I don't Pavel> know how trivial is the number of CPU cycles it spends on that. But Pavel> why set OPTION_BEGIN if it will be set in the trans_begin() anyway? The reason was simply that I wanted almost all lines that was using OPTION_GTID_BEGIN to set and reset OPTION_BEGIN. This was mainly to make it trivial to ensure with 'grep' that I did not miss any cases. What I had missed was that trans_begin() did check for OPTION_BEGIN and when this was set did a few extra unnecessary tests to commit non-existing things. I agree that it should not be set here. Pavel> So I fixed that and made the line to look like thd-> variables.option_bits|= OPTION_GTID_BEGIN; Corect. Pavel> But testing that more and looking at the code I realized there's Pavel> another problem in these 3 lines: why did you add the call to Pavel> trans_begin() at all? Right after it mysql_parse() is called to Pavel> execute "BEGIN" statement, which again kicks off the commit machinery Pavel> without any necessity to do that. The original reason was to not have to call mysql_parse() at all for this case (no reason to parse something that we know what it is). Apparently I missed to remove the extra calls. I will fix that now. Thanks a lot for noticing this. Pavel> So FYI: I removed setting OPTION_BEGIN here and removed the call to Pavel> trans_begin() and all tests passed (including my additional assert). Thanks for testing this! Regards, Monty