Hi! On Fri, Sep 9, 2016 at 7:31 PM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Monty,
My latest push uncovered a problem in the 10.2 tree; however the problem was there before, just not triggered by the testsuite. I pushed the below patch to fix the test failures in Buildbot, but I am not sure it is the correct solution, so please check it.
It appears the problem was introduced with this patch:
commit 3d4a7390c1a94ef6e07b04b52ea94a95878cda1b Author: Monty <monty@mariadb.org> Date: Mon Feb 1 12:45:39 2016 +0200
MDEV-6150 Speed up connection speed by moving creation of THD to new thread
diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 84f0c63..355f62d 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -844,12 +844,13 @@ void do_handle_bootstrap(THD *thd) delete thd;
#ifndef EMBEDDED_LIBRARY - thread_safe_decrement32(&thread_count); + DBUG_ASSERT(thread_count == 1); in_bootstrap= FALSE; - - mysql_mutex_lock(&LOCK_thread_count); - mysql_cond_broadcast(&COND_thread_count); - mysql_mutex_unlock(&LOCK_thread_count); + /* + dec_thread_count will signal bootstrap() function that we have ended as + thread_count will become 0. + */ + dec_thread_count(); my_thread_end(); pthread_exit(0); #endif
The problem is that do_handle_bootstrap is not only used for bootstrap - it is also used for read_init_file(). In the latter case, it is _not_ guaranteed that thread_count will drop to zero. For example, the binlog background thread might be running. You can see this in 10.2 (before my just-pushed fix) by running
./mtr main.init_file --mysqld=--log-bin=mysql-bin
It will assert, or hang if assert is disabled.
To get rid of the failures in Buildbot, I pushed the below patch. I think the patch is perfectly safe. However, maybe there is a better way - it seems a bit messy, mixing bootstrap and read_init_file() and having these if (opt_bootstrap) conditions. And maybe there are other problems caused by this as well that I did not notice? As I am not much familiar with how this thread count checking / bootstrap / init file handling works.
Thanks for the patch and for finding the bug! However, Serg already removed the thread count handling from do_handle_bootstrap. I will today check if the problem still exists and if yes, fix it by always sending the broadcast. (A quick look suggest it still exist). Regards, Monty