On Wed, Feb 26, 2014 at 8:12 PM, Michael Widenius <monty@askmonty.org> wrote:
Pavel> And then it said that slave died with the stack trace
Pavel> sql/transaction.cc:139(trans_begin(THD*, unsigned int))[0x788e20]
Pavel> sql/log_event.cc:6478(Gtid_log_event::do_apply_event(rpl_group_info*))[0x93a685]
Pavel> sql/log_event.h:1341(Log_event::apply_event(rpl_group_info*))[0x5ca108]
Pavel> sql/slave.cc:3191(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*, rpl_parallel_thread*))[0x5c0da8]
Pavel> sql/slave.cc:3464(exec_relay_log_event)[0x5c1498]
Pavel> sql/slave.cc:4516(handle_slave_sql)[0x5c44e9]
Pavel> Which means that slave tries to execute BEGIN event while OPTION_BEGIN
Pavel> is set which shouldn't ever happen.
The assert you put in the code doesn't show that anything is wrong.
The reason is the following code in log_event.cc:
  thd->variables.option_bits|= OPTION_BEGIN | OPTION_GTID_BEGIN;
  DBUG_PRINT("info", ("Set OPTION_GTID_BEGIN"));
  trans_begin(thd, 0);
In other words, we do set OPTION_BEGIN just before calling trans_begin(), so the assert is wrong.
To me this code looks clearly wrong. You set OPTION_BEGIN just before calling trans_begin(), which forces trans_begin() to kick off the commit machinery. Even though that ends up doing nothing, I'm not sure the number of CPU cycles it spends on that is trivial. And why set OPTION_BEGIN at all if trans_begin() will set it anyway? So I fixed that and changed the line to:

  thd->variables.option_bits|= OPTION_GTID_BEGIN;

But after testing that more and looking at the code, I realized there's another problem in these 3 lines: why did you add the call to trans_begin() at all? Right after it, mysql_parse() is called to execute the "BEGIN" statement, which again kicks off the commit machinery unnecessarily.

So FYI: I removed setting OPTION_BEGIN here and removed the call to trans_begin(), and all tests passed (including my additional assert).

Pavel
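P.S. To make the point about the redundant commit path concrete, here is a toy model of the behavior described above. It is only a sketch and not the server code: ToyThd, toy_trans_begin(), the two helper functions and the flag values are all made up for illustration. The one assumption carried over from the discussion is that a begin call commits implicitly whenever OPTION_BEGIN is already set, and sets OPTION_BEGIN itself afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration; the real ones live in the server source.
constexpr std::uint64_t OPTION_BEGIN      = 1ULL << 0;
constexpr std::uint64_t OPTION_GTID_BEGIN = 1ULL << 1;

// Toy stand-in for THD: just the option bits plus a counter for the commit path.
struct ToyThd {
    std::uint64_t option_bits = 0;
    int commit_calls = 0;  // how many times the (toy) commit path was entered
};

// Toy stand-in for trans_begin(): if a transaction already looks open,
// go through an implicit commit first, then mark the new transaction as begun.
void toy_trans_begin(ToyThd &thd) {
    if (thd.option_bits & OPTION_BEGIN) {
        ++thd.commit_calls;                 // implicit commit, even if there is nothing to commit
        thd.option_bits &= ~OPTION_BEGIN;
    }
    thd.option_bits |= OPTION_BEGIN;        // the begin call sets the flag on its own
    assert(thd.option_bits & OPTION_BEGIN); // postcondition: flag is always set on return
}

// Pattern from the quoted code: set OPTION_BEGIN, then call the begin function.
int commits_with_preset_flag() {
    ToyThd thd;
    thd.option_bits |= OPTION_BEGIN | OPTION_GTID_BEGIN;
    toy_trans_begin(thd);
    return thd.commit_calls;
}

// Pattern after the first fix: only OPTION_GTID_BEGIN is set beforehand.
int commits_without_preset_flag() {
    ToyThd thd;
    thd.option_bits |= OPTION_GTID_BEGIN;
    toy_trans_begin(thd);
    return thd.commit_calls;
}
```

In this model, commits_with_preset_flag() enters the commit path once and commits_without_preset_flag() never does, while OPTION_BEGIN ends up set in both cases. How expensive the real commit path is when there is nothing to commit is exactly the part this sketch cannot answer.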