[Maria-developers] Deadlock with STOP SLAVE and LOCK_active_mi
So while looking at MDEV-9573, I found this code in mysql_execute_command() for STOP SLAVE:

  mysql_mutex_lock(&LOCK_active_mi);
  if ((mi= (master_info_index->
            get_master_info(&lex_mi->connection_name,
                            Sql_condition::WARN_LEVEL_ERROR))))
    if (!stop_slave(thd, mi, 1 /* net report */))
      my_ok(thd);
  mysql_mutex_unlock(&LOCK_active_mi);

So basically, this code is holding LOCK_active_mi for the entire duration of the STOP SLAVE operation.

This seems completely broken. Anything in a slave thread that tries to take LOCK_active_mi will then deadlock with the STOP SLAVE operation. It is simple enough to trigger; a testcase (with a sleep-injecting patch) is at the end of the email. It uses INFORMATION_SCHEMA.SESSION_STATUS; I could imagine that accessing a number of system variables could trigger it as well.
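Reduced to a standalone model, the deadlock cycle looks like this (editor's sketch: std::thread/std::mutex stand in for the server's threads and mysql_mutex_t, and the names are only mnemonic; the toy program hangs by design, just as mysqld does):

-----------------------------------------------------------------------
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

std::mutex LOCK_active_mi;                 // stands in for the server's lock
std::atomic<bool> stop_in_progress{false};

// Models the slave SQL thread executing a replicated statement that
// reads SESSION_STATUS (e.g. via show_heartbeat_period()).
void slave_thread()
{
  // Force the bad interleaving: wait until "STOP SLAVE" holds the lock.
  while (!stop_in_progress)
    std::this_thread::sleep_for(std::chrono::milliseconds(1));

  std::lock_guard<std::mutex> g(LOCK_active_mi);  // blocks forever
}

int main()
{
  std::thread slave(slave_thread);

  // Models the STOP SLAVE path in mysql_execute_command():
  LOCK_active_mi.lock();       // held for the whole stop_slave() call
  stop_in_progress= true;
  slave.join();                // stop_slave() waits for the slave thread,
                               // which waits for LOCK_active_mi: deadlock
  LOCK_active_mi.unlock();     // never reached
}
-----------------------------------------------------------------------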
From a quick check, it looks like it has been this way forever; e.g. 5.1 has similar code.
Any suggestions for how this is supposed to work? Or is it just broken by design, but saved because normally slave threads do not need to access SHOW STATUS or system variables?

 - Kristian.

-----------------------------------------------------------------------
diff --git a/sql/mysqld.cc b/sql/mysqld.cc
index ced9225..2358d6b 100644
--- a/sql/mysqld.cc
+++ b/sql/mysqld.cc
@@ -7765,6 +7765,7 @@ static int show_heartbeat_period(THD *thd, SHOW_VAR *var, char *buff,
   var->type= SHOW_CHAR;
   var->value= buff;
+  if (thd->slave_thread) my_sleep(5000000);
   mysql_mutex_lock(&LOCK_active_mi);
   if (master_info_index)
   {
-----------------------------------------------------------------------
--source include/have_innodb.inc
--source include/master-slave.inc

--connection master
CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
--save_master_pos

--connection slave
--sync_with_master
--source include/stop_slave.inc

--connection master
INSERT INTO t1 SELECT VARIABLE_VALUE FROM INFORMATION_SCHEMA.SESSION_STATUS WHERE VARIABLE_NAME='COM_COMMIT';
--save_master_pos

--connection slave
--source include/start_slave.inc
# Wait a bit to hit a sleep() in code after taking LOCK_active_mi.
SELECT SLEEP(2);
STOP SLAVE;
SELECT * FROM t1 ORDER BY a;
START SLAVE;
--sync_with_master
SELECT * FROM t1 ORDER BY a;

# Clean up.
--connection master
DROP TABLE t1;
--source include/rpl_end.inc
-----------------------------------------------------------------------
Hi, Kristian!

On Apr 15, Kristian Nielsen wrote:
> So while looking at MDEV-9573, I found this code in
> mysql_execute_command() for STOP SLAVE:
>
>   mysql_mutex_lock(&LOCK_active_mi);
>   if ((mi= (master_info_index->
>             get_master_info(&lex_mi->connection_name,
>                             Sql_condition::WARN_LEVEL_ERROR))))
>     if (!stop_slave(thd, mi, 1 /* net report */))
>       my_ok(thd);
>   mysql_mutex_unlock(&LOCK_active_mi);
>
> So basically, this code is holding LOCK_active_mi for the entire
> duration of the STOP SLAVE operation.
>
> This seems completely broken. Anything in a slave thread that tries to
> take LOCK_active_mi will then deadlock with the STOP SLAVE operation.
> It is simple enough to trigger; a testcase (with a sleep-injecting
> patch) is at the end of the email. It uses
> INFORMATION_SCHEMA.SESSION_STATUS; I could imagine that accessing a
> number of system variables could trigger it as well.
>
> From a quick check, it looks like it has been this way forever; e.g.
> 5.1 has similar code.
>
> Any suggestions for how this is supposed to work? Or is it just broken
> by design, but saved because normally slave threads do not need to
> access SHOW STATUS or system variables?
These locks were added in this commit:

  commit f808030
  Author: unknown <guilhem@mysql.com>
  Date:   Thu Mar 11 16:23:35 2004 +0100

      Fix for BUG#2921 "Replication problem on mutex lock in
      mySQL-4.0.18": re-using unused LOCK_active_mi to serialize all
      administrative commands related to replication: START SLAVE, STOP
      SLAVE, RESET SLAVE, CHANGE MASTER, init_slave() (replication
      autostart at server startup), end_slave() (replication autostop
      at server shutdown), LOAD DATA FROM MASTER. This protects us
      against a handful of deadlocks (like BUG#2921 when two START
      SLAVE, but when two STOP SLAVE too). Removing unused variables.

So, apparently LOCK_active_mi was unused at that time and Guilhem took it to serialize replication administrative commands. Now this lock is used. The fix, I suppose, should be to create a new lock and use it here. Something like LOCK_serialize_replication_admin_commands.

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org
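A hypothetical shape of that fix, as an editor's sketch (handle_stop_slave() and LOCK_replication_admin are made-up names; it assumes stop_slave() does not itself need LOCK_active_mi, and that the new lock alone keeps the Master_info from being deleted while we wait):

-----------------------------------------------------------------------
/* Editor's sketch only, not actual server code.  The new lock
   serializes admin commands for the whole (possibly long) operation,
   while LOCK_active_mi is held just long enough to look up the
   Master_info -- so slave threads can still take LOCK_active_mi for
   SHOW STATUS etc. while STOP SLAVE waits for them to exit. */
static mysql_mutex_t LOCK_replication_admin;   /* initialized at startup */

static bool handle_stop_slave(THD *thd, LEX_MASTER_INFO *lex_mi)
{
  Master_info *mi;
  bool err;

  mysql_mutex_lock(&LOCK_replication_admin);

  mysql_mutex_lock(&LOCK_active_mi);           /* only for the lookup */
  mi= master_info_index->get_master_info(&lex_mi->connection_name,
                                         Sql_condition::WARN_LEVEL_ERROR);
  mysql_mutex_unlock(&LOCK_active_mi);

  /* Assumption: holding LOCK_replication_admin is enough to keep mi
     alive here, i.e. RESET SLAVE ALL / remove_master_info() also take
     it.  Without that, mi could be freed under our feet. */
  err= !mi || stop_slave(thd, mi, 1 /* net report */);
  if (!err)
    my_ok(thd);

  mysql_mutex_unlock(&LOCK_replication_admin);
  return err;
}
-----------------------------------------------------------------------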
Sergei Golubchik <serg@mariadb.org> writes:
> On Apr 15, Kristian Nielsen wrote:
>> Any suggestions for how this is supposed to work? Or is it just
>> broken by design, but saved because normally slave threads do not
>> need to access SHOW STATUS or system variables?
>
> commit f808030
> Author: unknown <guilhem@mysql.com>
> Date:   Thu Mar 11 16:23:35 2004 +0100
>
>     Fix for BUG#2921 "Replication problem on mutex lock in
>     mySQL-4.0.18": re-using unused LOCK_active_mi to serialize all
>     administrative commands related to replication: START SLAVE, STOP
>     SLAVE, RESET SLAVE, CHANGE MASTER, init_slave() (replication
>     autostart at server startup), end_slave() (replication autostop
>     at server shutdown), LOAD DATA FROM MASTER. This protects us
>     against a handful of deadlocks (like BUG#2921 when two START
>     SLAVE, but when two STOP SLAVE too). Removing unused variables.
>
> So, apparently LOCK_active_mi was unused at that time and Guilhem took
> it to serialize replication administrative commands. Now this lock is
> used. The fix, I suppose, should be to create a new lock and use it
> here. Something like LOCK_serialize_replication_admin_commands.
Hm. So I went through all of the code that references LOCK_active_mi to try and understand this. It seems there are two main uses:

1. As you say, to serialise admin commands. I think the list is: end_slave() (server shutdown); CHANGE MASTER; START/STOP SLAVE; START/STOP ALL SLAVES; RESET SLAVE; FLUSH SLAVE.

2. To protect access to master_info_index, e.g. to prevent a Master_info from disappearing while it is accessed.

There is a comment in rpl_rli.h that LOCK_active_mi protects Relay_log_info::inited, but I did not understand it - maybe it is wrong?

So the idea would be to make a new lock for (1), and keep LOCK_active_mi for (2).

Actually, using a lock for (1) seems a bad idea - STOP SLAVE can take a long time to run, and a mutex makes it unkillable. A condition variable might be better, to make STOP SLAVE killable (a rough sketch follows at the end of this mail). But I guess that is a separate issue; it would not affect the possibility of deadlocks.

Both LOCK_active_mi and the new LOCK_serialize_replication_admin_commands must by themselves prevent master_info_index from having elements added or removed. This is needed for start_all_slaves() and stop_all_slaves() to work correctly, I think.

Something will need to be done for remove_master_info(). Currently, it also tries to stop slave threads (from within free_key_master_info()) while holding LOCK_active_mi. It seems that threads should be stopped before attempting to remove their mi from master_info_index, probably?

Actually, there would still be the deadlock problem after introducing the new lock, because it is possible for a slave thread to run CHANGE MASTER! (And I think START SLAVE / RESET SLAVE as well.) But that is probably a bug; it does not seem good that this is possible. I guess these need a check to throw an ER_LOCK_OR_ACTIVE_TRANSACTION error, like STOP SLAVE has.

I think I also found another lock problem while reading the code, for MASTER_POS_WAIT() and SHOW BINLOG EVENTS. They both grab a Master_info under LOCK_active_mi, but then proceed to access it after releasing the lock. I do not see anything that prevents the Master_info from disappearing under their feet if RESET SLAVE ALL is run in another thread? But again, that's a separate issue, I suppose.

Probably there are some other tricky deadlock issues lurking here; I don't know. It seems splitting off a new LOCK_serialize_replication_admin_commands could solve the deadlock issue, maybe, if the other problems mentioned above are addressed. It seems very hard to convince oneself that this would not introduce new problems somewhere; the locking in replication is really complex and feels very fragile :-/

It doesn't really feel like something one would want to do in 10.1 (certainly not 10.0), but maybe 10.2? It's kind of a very edge-case problem (and seems to have been there for a _long_ time). Still, hanging the mysqld process hard, requiring kill -9, is also not nice...

Thanks,

 - Kristian.
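To illustrate the condition-variable idea above, an editor's sketch (hypothetical: LOCK_replication_admin, COND_replication_admin, and acquire/release_admin_slot() are made-up names; the admin "slot" replaces holding a mutex for the whole operation):

-----------------------------------------------------------------------
/* Editor's sketch only.  Admin commands are serialized by a flag
   guarded by a mutex + condition variable, so the mutex itself is
   only held briefly and a waiting STOP SLAVE can notice being
   killed. */
static mysql_mutex_t LOCK_replication_admin;   /* initialized at startup */
static mysql_cond_t  COND_replication_admin;
static bool admin_command_in_progress= false;

static bool acquire_admin_slot(THD *thd)
{
  mysql_mutex_lock(&LOCK_replication_admin);
  while (admin_command_in_progress)
  {
    if (thd->killed)                        /* KILL interrupts the wait */
    {
      mysql_mutex_unlock(&LOCK_replication_admin);
      return true;                          /* error: we were killed */
    }
    /* In the real server this wait would be wrapped in
       THD::enter_cond()/exit_cond() so that KILL signals the condition;
       as written, a killed waiter only notices at the next wakeup. */
    mysql_cond_wait(&COND_replication_admin, &LOCK_replication_admin);
  }
  admin_command_in_progress= true;          /* we now own the admin slot */
  mysql_mutex_unlock(&LOCK_replication_admin);
  return false;
}

static void release_admin_slot(void)
{
  mysql_mutex_lock(&LOCK_replication_admin);
  admin_command_in_progress= false;
  mysql_cond_signal(&COND_replication_admin);
  mysql_mutex_unlock(&LOCK_replication_admin);
}
-----------------------------------------------------------------------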