[Maria-developers] MariaDB / Galera BUG
Environment: MariaDB-10.0.16 Galera-3-25.3.9

This statement does nasty things to a MariaDB+Galera cluster:

CREATE TABLE u91rw_zoo_version (`version` varchar(255) NOT NULL) ENGINE=innoDB DEFAULT CHARSET=utf8 SELECT '3.3.3' as version;

Statement fails to replicate to the other nodes because the target table doesn't exist. Cluster then disassembles itself.

This patch makes the problem disappear. Please evaluate?
(note: patch based against github.com/MariaDB/server.git)

=============
diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
index be57be9b223a..d0521ed896b8 100644
--- a/sql/sql_parse.cc
+++ b/sql/sql_parse.cc
@@ -3325,6 +3325,11 @@ mysql_execute_command(THD *thd)
       /* Store reference to table in case of LOCK TABLES */
       create_info.table= create_table->table;
+      if (WSREP(thd) && (!thd->is_current_stmt_binlog_format_row() ||
+                         !create_info.tmp_table()))
+      {
+        WSREP_TO_ISOLATION_BEGIN(create_table->db, create_table->table_name, NULL)
+      }
       /*
         select_create is currently not re-execution friendly and
         needs to be created for every execution of a PS/SP.
=============

We have a full environment setup in which to test. debug/trace available on request. Will open bug if required for your workflow.

Thanks,

Andy

-- 
Andrew W. Elble
aweits@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
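[Editor's sketch] To make the condition in the patch easier to read, here is a minimal standalone model of when it would enter total order isolation. Only the boolean expression is taken from the diff. The struct, its fields, and the function names are hypothetical stand-ins for WSREP(thd), thd->is_current_stmt_binlog_format_row() and create_info.tmp_table(); none of this is server code.

```cpp
#include <cassert>

// Hypothetical stand-in for the three facts the patched condition reads.
// None of these are real MariaDB types.
struct StmtState {
  bool wsrep_on;           // models WSREP(thd)
  bool binlog_row_format;  // models thd->is_current_stmt_binlog_format_row()
  bool tmp_table;          // models create_info.tmp_table()
};

// Same boolean expression as the patch: isolation is skipped only when wsrep
// is off, or when the statement is row-format AND creates a temporary table.
inline bool enters_toi(const StmtState &s) {
  return s.wsrep_on && (!s.binlog_row_format || !s.tmp_table);
}

// Small self-check of the four interesting cases.
inline void check_enters_toi() {
  assert(enters_toi({true, true, false}));     // CTAS into a permanent table
  assert(!enters_toi({true, true, true}));     // row format + temporary table
  assert(enters_toi({true, false, true}));     // non-row format, temporary table
  assert(!enters_toi({false, false, false}));  // wsrep disabled
}
```

Read this way, the reported statement (a permanent table, row format) would take the isolation path under the patch, while a row-format CREATE TEMPORARY TABLE ... SELECT would not.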
Hi Andy!

On Thu, Mar 5, 2015 at 2:27 PM, Andrew W Elble <aweits@rit.edu> wrote:
> Environment: MariaDB-10.0.16 Galera-3-25.3.9
>
> This statement does nasty things to a MariaDB+Galera cluster:
>
> CREATE TABLE u91rw_zoo_version (`version` varchar(255) NOT NULL) ENGINE=innoDB DEFAULT CHARSET=utf8 SELECT '3.3.3' as version;
>
> Statement fails to replicate to the other nodes because the target table doesn't exist. Cluster then disassembles itself.
Right.
> This patch makes the problem disappear. Please evaluate?
> (note: patch based against github.com/MariaDB/server.git)
>
> =============
> diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
> index be57be9b223a..d0521ed896b8 100644
> --- a/sql/sql_parse.cc
> +++ b/sql/sql_parse.cc
> @@ -3325,6 +3325,11 @@ mysql_execute_command(THD *thd)
>        /* Store reference to table in case of LOCK TABLES */
>        create_info.table= create_table->table;
> +      if (WSREP(thd) && (!thd->is_current_stmt_binlog_format_row() ||
> +                         !create_info.tmp_table()))
> +      {
> +        WSREP_TO_ISOLATION_BEGIN(create_table->db, create_table->table_name, NULL)
> +      }
>        /*
>          select_create is currently not re-execution friendly and
>          needs to be created for every execution of a PS/SP.
> =============
The patch looks good to me.
> We have a full environment setup in which to test. debug/trace available on request. Will open bug if required for your workflow.
I have filed a bug reporting this issue:
https://mariadb.atlassian.net/browse/MDEV-7673

Thanks for the patch.

Cheers!
-- Nirbhay
Nirbhay,

We've been poking at this some more. Seems like we're retracing some steps as documented here:

https://bugs.launchpad.net/codership-mysql/+bug/1052002

My patch will need to be reverted and replaced with something else.

MDEV-7673 is a regression caused by a portion of the fix for MDEV-6924.

Specifically this:

in sql/sql_class.cc -> THD::binlog_query()

#ifdef WITH_WSREP
    /*
      Even though wsrep only supports ROW binary log format, a user can set
      binlog format to STATEMENT (wsrep_forced_binlog_format). In which case
      the control might reach here even when binary logging (--log-bin) is
      not enabled. This is possible because wsrep patch partially enables
      binary logging by setting wsrep_emulate_binlog.
    */
    if (mysql_bin_log.is_open())
#endif /* WITH_WSREP */

see sql/sql_insert.cc -> select_create::binlog_show_create_table()

There are other issues here which we're still trying to track down (simultaneously running 'CREATE TABLE t1 AS SELECT SLEEP(30)' on multiple nodes is bad, for instance). I wanted to at least start the discussion here...

Thanks,

Andy
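[Editor's sketch] One way to read the regression described above: select_create::binlog_show_create_table() logs the CREATE TABLE as a statement event through THD::binlog_query(), and the added guard suppresses that event whenever --log-bin is off, even though wsrep is emulating the binlog. The model below only illustrates that reading. The struct and function names are hypothetical, and the "without guard" behaviour is an assumption rather than something confirmed in this thread.

```cpp
#include <cassert>

// Hypothetical model of the two switches discussed above; not server types.
struct BinlogSetup {
  bool log_bin_open;          // models mysql_bin_log.is_open() (--log-bin)
  bool wsrep_emulate_binlog;  // models the wsrep "emulated" binlog mode
};

// With the guard, the CREATE TABLE statement event is only logged when the
// real binary log is open.
inline bool create_stmt_logged_with_guard(const BinlogSetup &b) {
  return b.log_bin_open;
}

// Assumed behaviour without the guard: emulated binlogging is sufficient.
inline bool create_stmt_logged_without_guard(const BinlogSetup &b) {
  return b.log_bin_open || b.wsrep_emulate_binlog;
}

// A Galera node running without --log-bin is the case that differs.
inline void check_guard_model() {
  const BinlogSetup galera_no_log_bin{false, true};
  assert(!create_stmt_logged_with_guard(galera_no_log_bin));
  assert(create_stmt_logged_without_guard(galera_no_log_bin));
}
```

If the CREATE TABLE event never reaches the replicated write set, the other nodes would receive only the row events for the inserted data, which would match the reported symptom that the target table does not exist there.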
Nirbhay,

I'm going to have a number of patches/suggestions from chasing this. Hopefully I'll have them in a consumable fashion soon. Is it preferred to send them all to the list?

This is a rough summary of what we've found so far:

1.) MDEV-6924:
    either: fix because CTAS uses THD::STMT_QUERY_TYPE
    alternatively: Query_log_event::Query_log_event() flips the setting of "direct" when binlog is not row / picks inappropriate setting of use_cache

1a.) Revert patch for MDEV-7673, as it apparently can cause a crash with
     WSREP: FSM: no such a transition REPLICATING -> REPLICATING

2.) select_insert::send_eof() will call my_ok() when called from select_create::send_eof() even if abort_result_set() is going to be called. Rectify for CTAS case.

3.) wsrep_applier thread tends to spin and try to apply the same transaction multiple times, up to cluster failure, even though the selected victim thread is slowly trying to abort.
    a.) increase timeout if a victim has been selected
    b.) don't downcall from wsrep_abort_thd if victim is already aborting

4.) select_create::send_eof() sets exit_done before seeing if galera is going to call abort_result_set(), which can lead to unexpected tables being present + cluster failure as result.

5.) handle_select() resets thd->killed() even when thread was a victim thread, causing crash.

6.) cherry-picking upstream commit cc3d09bc8d5a78abc064d289045b20363aab9d28 (I believe you're already aware of this one seeing as how your name is on it)

Thanks,

Andy
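[Editor's sketch] Item 3b in the summary above (skip the abort downcall when the victim is already aborting) can be pictured as a one-shot guard. This is only an illustration of the idea, not the proposed patch: the Victim struct, its fields, and abort_victim_once() are hypothetical names, and the counter stands in for whatever the real downcall does.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical victim record; how the server actually tracks this state is
// not shown in the thread.
struct Victim {
  std::atomic<bool> aborting{false};
  int downcalls = 0;  // counts how often the pretend abort downcall ran
};

// Returns true if this call performed the downcall, false if it was skipped
// because an abort had already been requested for this victim.
inline bool abort_victim_once(Victim &v) {
  bool expected = false;
  if (!v.aborting.compare_exchange_strong(expected, true))
    return false;  // victim is already aborting: do nothing
  ++v.downcalls;   // placeholder for the actual abort downcall
  return true;
}

// Repeated requests for the same victim result in a single downcall.
inline void check_abort_victim_once() {
  Victim v;
  assert(abort_victim_once(v));
  assert(!abort_victim_once(v));
  assert(v.downcalls == 1);
}
```

The compare-and-swap makes the check and the state change a single step, so two threads asking for the same victim cannot both get past the guard.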
Hi Andy!

On Fri, Apr 24, 2015 at 3:27 PM, Andrew W Elble <aweits@rit.edu> wrote:
> Nirbhay,
>
> I'm going to have a number of patches/suggestions from chasing this. Hopefully I'll have them in a consumable fashion soon. Is it preferred to send them all to the list?
Great! Will it be possible for you to create a pull request?
> This is a rough summary of what we've found so far:
>
> 1.) MDEV-6924:
>     either: fix because CTAS uses THD::STMT_QUERY_TYPE
>     alternatively: Query_log_event::Query_log_event() flips the setting of "direct" when binlog is not row / picks inappropriate setting of use_cache
I have pushed a related fix recently in 5.5-galera (to be upmerged to higher versions).

https://github.com/MariaDB/server/commit/581b49dd3d3e2e253812bb24fa881148675...

Perhaps you can take a look to see if it does not conflict with your additions.
> 1a.) Revert patch for MDEV-7673, as it apparently can cause a crash with
>      WSREP: FSM: no such a transition REPLICATING -> REPLICATING
>
> 2.) select_insert::send_eof() will call my_ok() when called from select_create::send_eof() even if abort_result_set() is going to be called. Rectify for CTAS case.
>
> 3.) wsrep_applier thread tends to spin and try to apply the same transaction multiple times to cluster failure even though the selected victim thread is slowly trying to abort.
>     a.) increase timeout if a victim has been selected
>     b.) don't downcall from wsrep_abort_thd if victim is already aborting
>
> 4.) select_create::send_eof() sets exit_done before seeing if galera is going to call abort_result_set(), which can lead to unexpected tables being present + cluster failure as result.
>
> 5.) handle_select() resets thd->killed() even when thread was a victim thread, causing crash.
>
> 6.) cherry-picking upstream commit cc3d09bc8d5a78abc064d289045b20363aab9d28 (I believe you're already aware of this one seeing as how your name is on it)
Is this correct? Which repo/branch are you referring to?

$ git branch --contains cc3d09bc8d5a78abc064d289045b20363aab9d28
error: no such commit cc3d09bc8d5a78abc064d289045b20363aab9d28

Best.
-- Nirbhay
> Great! Will it be possible for you to create a pull request?
I'll work on that.
> https://github.com/MariaDB/server/commit/581b49dd3d3e2e253812bb24fa881148675...
Pulling it in for testing.
>> 6.) cherry-picking upstream commit cc3d09bc8d5a78abc064d289045b20363aab9d28 (I believe you're already aware of this one seeing as how your name is on it)
>
> Is this correct? Which repo/branch are you referring to?
>
> $ git branch --contains cc3d09bc8d5a78abc064d289045b20363aab9d28
> error: no such commit cc3d09bc8d5a78abc064d289045b20363aab9d28
https://github.com/codership/mysql-wsrep/issues/13
https://github.com/codership/mysql-wsrep/commit/cc3d09bc8d5a78abc064d289045b...

Thanks,

Andy
Hi Andrew!

On Wed, Apr 15, 2015 at 11:45 AM, Andrew W Elble <aweits@rit.edu> wrote:
> Nirbhay,
>
> We've been poking at this some more. Seems like we're retracing some steps as documented here:
>
> https://bugs.launchpad.net/codership-mysql/+bug/1052002
>
> My patch will need to be reverted and replaced with something else.
Yes, after fixing MDEV-7995, I realized that your patch can be reverted.
> MDEV-7673 is a regression caused by a portion of the fix for MDEV-6924.
>
> Specifically this:
>
> in sql/sql_class.cc -> THD::binlog_query()
>
> #ifdef WITH_WSREP
>     /*
>       Even though wsrep only supports ROW binary log format, a user can set
>       binlog format to STATEMENT (wsrep_forced_binlog_format). In which case
>       the control might reach here even when binary logging (--log-bin) is
>       not enabled. This is possible because wsrep patch partially enables
>       binary logging by setting wsrep_emulate_binlog.
>     */
>     if (mysql_bin_log.is_open())
> #endif /* WITH_WSREP */
That's right, removed as part of fix for MDEV-7995.
> see sql/sql_insert.cc -> select_create::binlog_show_create_table()
>
> There are other issues here which we're still trying to track down (simultaneously running 'CREATE TABLE t1 AS SELECT SLEEP(30)' on multiple nodes is bad, for instance). I wanted to at least start the discussion here...
Thanks!
-- Nirbhay