Re: [Maria-developers] 6d0c1f3ae12: MDEV-23328 Server hang due to Galera lock conflict resolution
Hi, Jan! This was good! Just a few questions, see below On Oct 24, Jan Lindström wrote:
revision-id: 6d0c1f3ae12 (mariadb-10.2.40-114-g6d0c1f3ae12) parent(s): 3153cc799e2 author: Jan Lindström committer: Jan Lindström timestamp: 2021-10-22 07:02:39 +0300 message:
MDEV-23328 Server hang due to Galera lock conflict resolution ... diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 3e1f248b082..5d40106d0ad 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -2159,6 +2159,7 @@ bool dispatch_command(enum enum_server_command command, THD *thd, } DBUG_PRINT("quit",("Got shutdown command for level %u", level)); general_log_print(thd, command, NullS); + DBUG_EXECUTE_IF("shutdown_unireg_abort", { unireg_abort(1); });
what's the point of this test?
my_eof(thd); kill_mysql(thd); error=TRUE; diff --git a/sql/wsrep_mysqld.cc b/sql/wsrep_mysqld.cc index e60100e2e90..7466654eead 100644 --- a/sql/wsrep_mysqld.cc +++ b/sql/wsrep_mysqld.cc @@ -835,13 +835,25 @@ void wsrep_thr_init() DBUG_VOID_RETURN; }
+/* This is wrapper for wsrep_break_lock in thr_lock.c */ +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void *victim_thd_ptr, my_bool signal)
When is it called? Under what conditions?
+{ + THD* victim_thd= (THD *) victim_thd_ptr; + /* We need to lock THD::LOCK_thd_data to protect victim + from concurrent usage or disconnect or delete. */ + mysql_mutex_lock(&victim_thd->LOCK_thd_data); + int res= wsrep_abort_thd(bf_thd_ptr, victim_thd_ptr, signal); + return res; +} + diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc index 73b3625b6fd..56400264245 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -19720,46 +19716,113 @@ wsrep_innobase_kill_one_trx( +/******************************************************************* +This function is used to kill one transaction in BF. */ +void +wsrep_innobase_kill_one_trx( + MYSQL_THD const bf_thd, + const trx_t * const bf_trx, + trx_t *victim_trx, + my_bool signal) +{ + ut_ad(bf_thd); + ut_ad(victim_trx); + ut_ad(lock_mutex_own()); + ut_ad(trx_mutex_own(victim_trx)); + + DBUG_ENTER("wsrep_innobase_kill_one_trx"); + THD *thd= (THD *) victim_trx->mysql_thd; + + /* Here we need to lock THD::LOCK_thd_data to protect from + concurrent usage or disconnect or delete. */ + DEBUG_SYNC(bf_thd, "wsrep_before_BF_victim_lock"); + my_sleep(100000);
Eh? Forgot to remove after debugging?
+ wsrep_thd_LOCK(thd); + my_sleep(100000);
and here
+ DEBUG_SYNC(bf_thd, "wsrep_after_BF_victim_lock"); +
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei, Jan, On Sun, Oct 24, 2021 at 9:11 PM Sergei Golubchik <serg@mariadb.org> wrote:
+/* This is wrapper for wsrep_break_lock in thr_lock.c */ +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void *victim_thd_ptr, my_bool signal)
When is it called? Under what conditions?
Right, comments that state the obvious ("what") are rather useless. They should always explain "why".
+ DBUG_ENTER("wsrep_innobase_kill_one_trx"); + THD *thd= (THD *) victim_trx->mysql_thd; + + /* Here we need to lock THD::LOCK_thd_data to protect from + concurrent usage or disconnect or delete. */ + DEBUG_SYNC(bf_thd, "wsrep_before_BF_victim_lock"); + my_sleep(100000);
Eh? Forgot to remove after debugging?
+ wsrep_thd_LOCK(thd); + my_sleep(100000);
and here
Several months ago when I reviewed part of this fix, I suggested that this code be tested with some delays added. It could be useful to preserve those delays in some form (embedded inside DBUG_EXECUTE_IF, or commented out) for future use. I am sad to see that my comment regarding wsrep_is_BF_lock_timeout() that I made in https://github.com/MariaDB/server/commit/b74b53f0515b360bb5cddec1a506a2f4d4d... (June 17) has not been addressed. Do we really need that output? Do we see that output in our internal testing? If not, then we have not tested that that code is free from race conditions or hangs. (It should be a lot safer to avoid such unnecessary unlock/lock exercises involving multiple mutexes.) If yes, then why have we not added source code comments to document when such scenarios would occur? I believe that it is better to rely on some operating system features (such as stack traces from a debugger) rather than to try to implement partial logging. Best regards, Marko -- Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation
Hi Marko,
I am sad to see that my comment regarding wsrep_is_BF_lock_timeout() that I made in https://github.com/MariaDB/server/commit/b74b53f0515b360bb5cddec1a506a2f4d4d... (June 17) has not been addressed. Do we really need that output? Do we see that output in our internal testing? If not, then we have not tested that that code is free from race conditions or hangs. (It should be a lot safer to avoid such unnecessary unlock/lock exercises involving multiple mutexes.) If yes, then why have we not added source code comments to document when such scenarios would occur? I believe that it is better to rely on some operating system features (such as stack traces from a debugger) rather than to try to implement partial logging.
I know, this is not related to MDEV-23328 or MDEV-25114 but ok, I can remove most of this stupid code. R: Jan
Hi Jan, Ramesh said that he has not observed any "WSREP: BF lock wait long for trx" messages while running tests. This would suggest that the diagnostic output code is essentially untested. I would say that the diagnostic output is related to MDEV-23328 or MDEV-25114, because it involves the same mutexes as the MDEV-23328 hang. Furthermore, the MDEV-23328 scenario is forced abort of lock-holding lower-priority transactions due to applying certified transactions (called "BF" or "brute force" in Galera). I am glad that you will reconsider my request to remove wsrep_trx_print_locking() and the mutex operations around the call. Marko On Mon, Oct 25, 2021 at 9:04 AM Jan Lindström <jan.lindstrom@mariadb.com> wrote:
Hi Marko,
I am sad to see that my comment regarding wsrep_is_BF_lock_timeout() that I made in https://github.com/MariaDB/server/commit/b74b53f0515b360bb5cddec1a506a2f4d4d... (June 17) has not been addressed. Do we really need that output? Do we see that output in our internal testing? If not, then we have not tested that that code is free from race conditions or hangs. (It should be a lot safer to avoid such unnecessary unlock/lock exercises involving multiple mutexes.) If yes, then why have we not added source code comments to document when such scenarios would occur? I believe that it is better to rely on some operating system features (such as stack traces from a debugger) rather than to try to implement partial logging.
I know, this is not related to MDEV-23328 or MDEV-25114 but ok, I can remove most of this stupid code.
R: Jan
-- Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation
Hi Sergei,
--- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -2159,6 +2159,7 @@ bool dispatch_command(enum enum_server_command command, THD *thd, } DBUG_PRINT("quit",("Got shutdown command for level %u", level)); general_log_print(thd, command, NullS); + DBUG_EXECUTE_IF("shutdown_unireg_abort", { unireg_abort(1); });
what's the point of this test?
I have a debug test for wsrep_close_connections and this will cause it to be run
+/* This is wrapper for wsrep_break_lock in thr_lock.c */ +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void *victim_thd_ptr, my_bool signal)
When is it called? Under what conditions?
BF here is TOI and victim is not. However, I could not find a single test case that would cause this code to be executed. Not sure if it even needed. I asked Seppo
+ /* Here we need to lock THD::LOCK_thd_data to protect from + concurrent usage or disconnect or delete. */ + DEBUG_SYNC(bf_thd, "wsrep_before_BF_victim_lock"); + my_sleep(100000);
Eh? Forgot to remove after debugging?
No, I did not, This was added for Rames to test with pquery R: Jan
Hi, Jan! On Oct 25, Jan Lindström wrote:
Hi Sergei,
--- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -2159,6 +2159,7 @@ bool dispatch_command( } DBUG_PRINT("quit",("Got shutdown command for level %u", level)); general_log_print(thd, command, NullS); + DBUG_EXECUTE_IF("shutdown_unireg_abort", { unireg_abort(1); });
what's the point of this test?
I have a debug test for wsrep_close_connections and this will cause it to be run
Yes, I saw a test. What I mean is, DBUG_EXECUTE_IF() is used to have some rare condition happen deterministically. E.g. to simulate a network error or OOM. Or to make a server crash at a certain point (like if it got a kill -9) or to add a delay. But you're adding a proper server shutdown at a point where none can ever happen. I don't see how such a test can be of any value.
+ /* Here we need to lock THD::LOCK_thd_data to protect from + concurrent usage or disconnect or delete. */ + DEBUG_SYNC(bf_thd, "wsrep_before_BF_victim_lock"); + my_sleep(100000);
Eh? Forgot to remove after debugging?
No, I did not, This was added for Rames to test with pquery
Just make sure they're removed before the push :) Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
participants (3)
-
Jan Lindström
-
Marko Mäkelä
-
Sergei Golubchik