Hi, Aleksey! On Jun 03, Aleksey Midenkov wrote:
On Wed, Jun 2, 2021 at 11:37 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Jun 02, Aleksey Midenkov wrote:
On Wed, Jun 2, 2021 at 9:06 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Jun 02, Aleksey Midenkov wrote:
> > > > > > > > > > - if (!(sql_lock= (MYSQL_LOCK*) > > > > > - my_malloc(key_memory_MYSQL_LOCK, sizeof(*sql_lock) + > > > > > - sizeof(THR_LOCK_DATA*)*((a->lock_count+b->lock_count)*2) + > > > > > - sizeof(TABLE*)*(a->table_count+b->table_count),MYF(MY_WME)))) > > > > > - DBUG_RETURN(0); // Fatal error > > > > > + const size_t lock_size= sizeof(*sql_lock) + > > > > > + sizeof(THR_LOCK_DATA *) * ((a->lock_count + b->lock_count) * 2) + > > > > > + sizeof(TABLE *) * (a->table_count + b->table_count); > > > > > + if (thd) > > > > > + { > > > > > + sql_lock= (MYSQL_LOCK *) thd->alloc(lock_size); > > > > > + if (!sql_lock) > > > > > + DBUG_RETURN(0); > > > > > + sql_lock->flags= GET_LOCK_ON_THD; > > > > > Yes, that is the part of the bug fix.
This was not a bug, this was a new change in recover_from_failed_open().
Hmm, so was it part of the bug fix or not?
Also freeing was impossible for locks on thd.
Yes, this change I understand, no questions about freeing.
> > > > > + /* > > > > > + NOTE: The semantics of vers_set_hist_part() is double: > > > > > > > > twofold
Please, fix the language to be proper English.
Don't you want to ask Ian now? Please look for "double semantics" collocation in Google query (quotes are important). There are quite a number of examples including scientific books and IETF drafts.
No, I don't. "double semantics" looks good, if you want to change the comment to Note the double semantics of vers_set_hist_part() ... this is fine.
> > > > > + table_list->vers_skip_create= false; > > > > > + ot_ctx->vers_create_count= 0; > > > > > + action= Open_table_context::OT_REOPEN_TABLES; > > > > > + table_arg= NULL; > > > > > + } > > > > > > > > I'm afraid I don't understand. All this business with > > > > vers_skip_create and vers_skip_auto_create, it wasn't in > > > > the previous version of the patch. So, I believe it was > > > > a fix for one of the MDEV bugs reported and fixed > > > > meanwhile. > > > > > > No, that was the multi-threaded case which worked good for > > > me, but suddenly I discovered it fails on some buildbot. > > > > Could you elaborate on what the problem was? Two threads > > trying to add the partition in parallel? I'd expect > > MDL_EXCLUSIVE to prevent that. > > MDL_EXCLUSIVE prevents parallel execution of > repair_from_failed_open(), but not sequential. So it can add > several partitions instead of 1, one after another.
What's the sequence of events? One thread decides to add a partition, takes an MDL_EXCLUSIVE, the other thread decides to add a partition, waits for MDL_EXCLUSIVE, the first one finishes adding a partition, releases the lock, the second grabs it and adds a second partition?
Right.
Okay. Then, why a thread didn't check the number of partitions after acquiring MDL_EXCLUSIVE? Like with mutexes, if there's a mutex that protects a shared variable, you first acquire the mutex, then read the variable's value. Not the other way around.
Number of partitions is not a shared variable. part_info is kept in TABLE instance. To get new value it must reacquire share, reparse part_sql string. Then comparing with what? After releasing MDL_SHARED_WRITE everything is gone: TABLE, TABLE_SHARE. You must store somewhere old value, presumably in Open_table_context.
I thought that after acquiring MDL_EXCLUSIVE, just as the thread is trying to add a new partition, it could check the conditions if the new partition, indeed, needs to be added.
If it were so easy as it sounds I'd make it.
Then, please, help me to understand why it's not easy. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org