[Maria-developers] Problem with debug_sync, THD::enter_cond(), and kill ...
I got a crash in my parallel replication test case, and this one turned out to be caused by a fundamental problem in debug_sync. I am not sure how to solve it, so I wanted to explain the issue in case others have some ideas. The crash happens in THD::awake() when a thread is killed: mysql_mutex_lock(&mysys_var->mutex); ... int ret= mysql_mutex_trylock(mysys_var->current_mutex); mysql_cond_broadcast(mysys_var->current_cond); if (!ret) mysql_mutex_unlock(mysys_var->current_mutex); The problem is that mysys_var->current_mutex changed between the trylock() and the unlock(), so we unlock a different mutex than the one we locked - ouch! The mutex changed because of this code in debug_sync.cc: if (thd->mysys_var) { old_mutex= thd->mysys_var->current_mutex; old_cond= thd->mysys_var->current_cond; thd->mysys_var->current_mutex= &debug_sync_global.ds_mutex; thd->mysys_var->current_cond= &debug_sync_global.ds_cond; } There is no mutex protection here. So this means that it is not safe to use debug_sync inside enter_cond()/exit_cond() if kill will be used on the thread at that point. Since I have a lot of tests for parallel replication where I test exactly for correct error handling when killing threads at various points in the parallel processing, it is not too surprising that I would eventually be hit by this :) I am going to avoid using debug_sync inside enter_cond()/exit_cond() for now. But any ideas for how to solve this properly? This is particularly nasty to be hit by, as it is not at all obvious that this use of debug_sync should be a problem. And it is not exactly easy to guess from the failure what the real problem is - assuming one can even reproduce the failure, the window of opportunity in the race is after all rather quite small. - Kristian.
Looking at the code it looks like most places (including THD::awake) assume that thd->mysys_var->current_mutex and thd->mysys_var->current_cond can be changed only when thd->mysys_var->mutex is locked. debug_sync.cc is breaking this assumption. I've found that Item_func_sleep::val_int(), THD::enter_cond() and THD::exit_cond() break this assumption too. I believe all of those places should be fixed to lock thd->mysys_var->mutex before any attempts to read or write to thd->mysys_var->current_mutex and thd->mysys_var->current_cond. Pavel On Wed, Feb 26, 2014 at 8:29 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
I got a crash in my parallel replication test case, and this one turned out to be caused by a fundamental problem in debug_sync. I am not sure how to solve it, so I wanted to explain the issue in case others have some ideas.
The crash happens in THD::awake() when a thread is killed:
mysql_mutex_lock(&mysys_var->mutex); ... int ret= mysql_mutex_trylock(mysys_var->current_mutex); mysql_cond_broadcast(mysys_var->current_cond); if (!ret) mysql_mutex_unlock(mysys_var->current_mutex);
The problem is that mysys_var->current_mutex changed between the trylock() and the unlock(), so we unlock a different mutex than the one we locked - ouch!
The mutex changed because of this code in debug_sync.cc:
if (thd->mysys_var) { old_mutex= thd->mysys_var->current_mutex; old_cond= thd->mysys_var->current_cond; thd->mysys_var->current_mutex= &debug_sync_global.ds_mutex; thd->mysys_var->current_cond= &debug_sync_global.ds_cond; }
There is no mutex protection here.
So this means that it is not safe to use debug_sync inside enter_cond()/exit_cond() if kill will be used on the thread at that point. Since I have a lot of tests for parallel replication where I test exactly for correct error handling when killing threads at various points in the parallel processing, it is not too surprising that I would eventually be hit by this :)
I am going to avoid using debug_sync inside enter_cond()/exit_cond() for now. But any ideas for how to solve this properly? This is particularly nasty to be hit by, as it is not at all obvious that this use of debug_sync should be a problem. And it is not exactly easy to guess from the failure what the real problem is - assuming one can even reproduce the failure, the window of opportunity in the race is after all rather quite small.
- Kristian.
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Pavel Ivanov <pivanof@google.com> writes:
Looking at the code it looks like most places (including THD::awake) assume that thd->mysys_var->current_mutex and thd->mysys_var->current_cond can be changed only when thd->mysys_var->mutex is locked. debug_sync.cc is breaking this assumption. I've found that Item_func_sleep::val_int(), THD::enter_cond() and THD::exit_cond() break this assumption too. I believe all of those places should be fixed to lock thd->mysys_var->mutex before any attempts to read or write to thd->mysys_var->current_mutex and thd->mysys_var->current_cond.
Hm... well, as I understand the code in THD::awake(), the idea is that it is ok for enter_cond() to _set_ the new values without locks. THD::awake() will use them if they are set, and ignore them if not. However, to _clear_ them, the mysys_var->mutex needs to be held - and awake() holds this mutex while using the values. It would seem a bit sad to introduce an extra mutex for every call to enter_cond(), only to satisfy some debug code which is never used in production. But maybe it is not so important, since that is only done when the thread is going to sleep on a condition anyway? I am also not sure if it could introduce risk of deadlocks or other issues if taking the mysys_var->mutex while also holding the mysys_var->current_mutex. There is a comment in exit_cond() that says that current_mutex must be unlocked before mysys_var->mutex is locked. But I do not know enough of this code to be sure if that is a problem also in enter_cond(). - Kristian.
On Wed, Feb 26, 2014 at 11:51 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Looking at the code it looks like most places (including THD::awake) assume that thd->mysys_var->current_mutex and thd->mysys_var->current_cond can be changed only when thd->mysys_var->mutex is locked. debug_sync.cc is breaking this assumption. I've found that Item_func_sleep::val_int(), THD::enter_cond() and THD::exit_cond() break this assumption too. I believe all of those places should be fixed to lock thd->mysys_var->mutex before any attempts to read or write to thd->mysys_var->current_mutex and thd->mysys_var->current_cond.
Hm... well, as I understand the code in THD::awake(), the idea is that it is ok for enter_cond() to _set_ the new values without locks. THD::awake() will use them if they are set, and ignore them if not. However, to _clear_ them, the mysys_var->mutex needs to be held - and awake() holds this mutex while using the values.
It would seem a bit sad to introduce an extra mutex for every call to enter_cond(), only to satisfy some debug code which is never used in production. But maybe it is not so important, since that is only done when the thread is going to sleep on a condition anyway?
I am also not sure if it could introduce risk of deadlocks or other issues if taking the mysys_var->mutex while also holding the mysys_var->current_mutex. There is a comment in exit_cond() that says that current_mutex must be unlocked before mysys_var->mutex is locked. But I do not know enough of this code to be sure if that is a problem also in enter_cond().
I didn't look at it this way (that current_cond and current_mutex can go NULL -> non-NULL without mutex and then changed under mutex). Probably that could usually work. And that would mean that only debug_sync.cc must be changed to lock thd->mysys_var->mutex before changing current_cond and current_mutex. But I'd say trying to build system like that is very fragile and unreliable. You can't trust the fact that someone won't later look at THD::enter_cond(), decide that current_cond and current_mutex can be changed anywhere without mutex, and add some code doing that. And you can't even rely on the fact that compiler won't optimize something and won't move assignment of current_mutex and current_cond into some inappropriate place or won't even write some temporary values in there -- compiler is totally allowed to do that. Besides the code like this makes it completely impossible to analyze MySQL/MariaDB execution using dynamic tools like ThreadSanitizer (https://code.google.com/p/data-race-test/wiki/ThreadSanitizer) which is very helpful in finding bugs. So I'd suggest to add an extra mutex or come up with some other synchronization solution that will guarantee cleanliness without any assumptions. Pavel
Hi, Kristian! On Feb 26, Kristian Nielsen wrote:
I got a crash in my parallel replication test case, and this one turned out to be caused by a fundamental problem in debug_sync. I am not sure how to solve it, so I wanted to explain the issue in case others have some ideas.
The crash happens in THD::awake() when a thread is killed:
mysql_mutex_lock(&mysys_var->mutex); ... int ret= mysql_mutex_trylock(mysys_var->current_mutex); mysql_cond_broadcast(mysys_var->current_cond); if (!ret) mysql_mutex_unlock(mysys_var->current_mutex);
The problem is that mysys_var->current_mutex changed between the trylock() and the unlock(), so we unlock a different mutex than the one we locked - ouch!
What about saving mysys_var->current_mutex locally before trylock-ing it? Then you can be sure that you unlock what you've locked. Races are still possible, you can signal a condition that doesn't correspond to the mutex - but it doesn't really matter here. Regards, Sergei
participants (3)
-
Kristian Nielsen
-
Pavel Ivanov
-
Sergei Golubchik