Re: [Maria-developers] [Commits] 49b2502: Fix assertion/hang in read_init_file()
Monty, My latest push uncovered a problem in the 10.2 tree; however the problem was there before, just not triggered by the testsuite. I pushed the below patch to fix the test failures in Buildbot, but I am not sure it is the correct solution, so please check it. It appears the problem was introduced with this patch: commit 3d4a7390c1a94ef6e07b04b52ea94a95878cda1b Author: Monty <monty@mariadb.org> Date: Mon Feb 1 12:45:39 2016 +0200 MDEV-6150 Speed up connection speed by moving creation of THD to new thread diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 84f0c63..355f62d 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -844,12 +844,13 @@ void do_handle_bootstrap(THD *thd) delete thd; #ifndef EMBEDDED_LIBRARY - thread_safe_decrement32(&thread_count); + DBUG_ASSERT(thread_count == 1); in_bootstrap= FALSE; - - mysql_mutex_lock(&LOCK_thread_count); - mysql_cond_broadcast(&COND_thread_count); - mysql_mutex_unlock(&LOCK_thread_count); + /* + dec_thread_count will signal bootstrap() function that we have ended as + thread_count will become 0. + */ + dec_thread_count(); my_thread_end(); pthread_exit(0); #endif The problem is that do_handle_bootstrap is not only used for bootstrap - it is also used for read_init_file(). In the latter case, it is _not_ guaranteed that thread_count will drop to zero. For example, the binlog background thread might be running. You can see this in 10.2 (before my just-pushed fix) by running ./mtr main.init_file --mysqld=--log-bin=mysql-bin It will assert, or hang if assert is disabled. To get rid of the failures in Buildbot, I pushed the below patch. I think the patch is perfectly safe. However, maybe there is a better way - it seems a bit messy, mixing bootstrap and read_init_file() and having these if (opt_bootstrap) conditions. And maybe there are other problems caused by this as well that I did not notice? As I am not much familiar with how this thread count checking / bootstrap / init file handling works. Thanks, - Kristian. Kristian Nielsen <knielsen@knielsen-hq.org> writes:
revision-id: 49b25020ef512866751f192b91d8439670d0430b (mariadb-10.1.8-257-g49b2502) parent(s): be2b833c426b420073c50564125049e2b4a95e8b committer: Kristian Nielsen timestamp: 2016-09-09 18:09:59 +0200 message:
Fix assertion/hang in read_init_file()
If there are other threads running (for example binlog background thread), then the thread count may not drop to zero at the end of do_handle_bootstrap(). This caused an assertion and missing wakeup of the main thread. The missing wakeup is because THD::~THD() only signals the COND_thread_count mutex when the number of threads drops to zero.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
--- sql/sql_parse.cc | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 7263082..6baff31 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -1082,9 +1082,20 @@ void do_handle_bootstrap(THD *thd) end: in_bootstrap= FALSE; delete thd; + if (!opt_bootstrap) + { + /* + We need to wake up main thread in case of read_init_file(). + This is not done by THD::~THD() when there are other threads running + (binlog background thread, for example). So do it here again. + */ + mysql_mutex_lock(&LOCK_thread_count); + mysql_cond_broadcast(&COND_thread_count); + mysql_mutex_unlock(&LOCK_thread_count); + }
#ifndef EMBEDDED_LIBRARY - DBUG_ASSERT(thread_count == 0); + DBUG_ASSERT(!opt_bootstrap || thread_count == 0); my_thread_end(); pthread_exit(0); #endif _______________________________________________ commits mailing list commits@mariadb.org https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
Hi! On Fri, Sep 9, 2016 at 7:31 PM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Monty,
My latest push uncovered a problem in the 10.2 tree; however the problem was there before, just not triggered by the testsuite. I pushed the below patch to fix the test failures in Buildbot, but I am not sure it is the correct solution, so please check it.
It appears the problem was introduced with this patch:
commit 3d4a7390c1a94ef6e07b04b52ea94a95878cda1b Author: Monty <monty@mariadb.org> Date: Mon Feb 1 12:45:39 2016 +0200
MDEV-6150 Speed up connection speed by moving creation of THD to new thread
diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 84f0c63..355f62d 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -844,12 +844,13 @@ void do_handle_bootstrap(THD *thd) delete thd;
#ifndef EMBEDDED_LIBRARY - thread_safe_decrement32(&thread_count); + DBUG_ASSERT(thread_count == 1); in_bootstrap= FALSE; - - mysql_mutex_lock(&LOCK_thread_count); - mysql_cond_broadcast(&COND_thread_count); - mysql_mutex_unlock(&LOCK_thread_count); + /* + dec_thread_count will signal bootstrap() function that we have ended as + thread_count will become 0. + */ + dec_thread_count(); my_thread_end(); pthread_exit(0); #endif
The problem is that do_handle_bootstrap is not only used for bootstrap - it is also used for read_init_file(). In the latter case, it is _not_ guaranteed that thread_count will drop to zero. For example, the binlog background thread might be running. You can see this in 10.2 (before my just-pushed fix) by running
./mtr main.init_file --mysqld=--log-bin=mysql-bin
It will assert, or hang if assert is disabled.
To get rid of the failures in Buildbot, I pushed the below patch. I think the patch is perfectly safe. However, maybe there is a better way - it seems a bit messy, mixing bootstrap and read_init_file() and having these if (opt_bootstrap) conditions. And maybe there are other problems caused by this as well that I did not notice? As I am not much familiar with how this thread count checking / bootstrap / init file handling works.
Thanks for the patch and for finding the bug! However, Serg already removed the thread count handling from do_handle_bootstrap. I will today check if the problem still exists and if yes, fix it by always sending the broadcast. (A quick look suggest it still exist). Regards, Monty
Hi! <cut>
Thanks for the patch and for finding the bug!
However, Serg already removed the thread count handling from do_handle_bootstrap. I will today check if the problem still exists and if yes, fix it by always sending the broadcast. (A quick look suggest it still exist).
Just noticed that you had pushed your patch into the main 10.2 tree. I checked it and think that's the right way to fix this. Thanks also for the great comment in the patch that explains it! I also agree that's a bit strange that read_init_file is using the bootstrap() to read the init file. This is probably perfectly safe, but it would have been good if the comment for the bootstrap() function would have said so! Will add this next time I touch the code in this area. Regards, Monty
Michael Widenius <michael.widenius@gmail.com> writes:
Just noticed that you had pushed your patch into the main 10.2 tree. I checked it and think that's the right way to fix this. Thanks also
Ok, great, thanks for checking. - Kristian.
participants (2)
-
Kristian Nielsen
-
Michael Widenius