Sergei, On Wed, Jun 2, 2021 at 11:37 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Jun 02, Aleksey Midenkov wrote:
On Wed, Jun 2, 2021 at 9:06 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Jun 02, Aleksey Midenkov wrote:
> > > > > > > > - if (!(sql_lock= (MYSQL_LOCK*) > > > > - my_malloc(key_memory_MYSQL_LOCK, sizeof(*sql_lock) + > > > > - sizeof(THR_LOCK_DATA*)*((a->lock_count+b->lock_count)*2) + > > > > - sizeof(TABLE*)*(a->table_count+b->table_count),MYF(MY_WME)))) > > > > - DBUG_RETURN(0); // Fatal error > > > > + const size_t lock_size= sizeof(*sql_lock) + > > > > + sizeof(THR_LOCK_DATA *) * ((a->lock_count + b->lock_count) * 2) + > > > > + sizeof(TABLE *) * (a->table_count + b->table_count); > > > > + if (thd) > > > > + { > > > > + sql_lock= (MYSQL_LOCK *) thd->alloc(lock_size); > > > > + if (!sql_lock) > > > > + DBUG_RETURN(0); > > > > + sql_lock->flags= GET_LOCK_ON_THD; > > > Sorry, I meant the moment when MDEV-23639 calls mysql_lock_merge(). Yes, that is the part of the bug fix.
What did it fix and how?
That's not about changing the method, that's about merging locks. When I merged my locks with thd there were already thd-allocated locks.
And? How did allocating locks on THD fix anything?
LTM_PRELOCKED had locks on thd, I keep merged locks on thd. Isn't that enough for you?
I'm just trying to understand what the bug was. And I still cannot.
This was not a bug, this was a new change in recover_from_failed_open().
Also freeing was impossible for locks on thd.
Yes, this change I understand, no questions about freeing.
> > > > + /* > > > > + NOTE: The semantics of vers_set_hist_part() is double: > > > > > > twofold
Please, fix the language to be proper English.
Don't you want to ask Ian now? Please look for "double semantics" collocation in Google query (quotes are important). There are quite a number of examples including scientific books and IETF drafts.
> > > > + table_list->vers_skip_create= false; > > > > + ot_ctx->vers_create_count= 0; > > > > + action= Open_table_context::OT_REOPEN_TABLES; > > > > + table_arg= NULL; > > > > + } > > > > > > I'm afraid I don't understand. All this business with > > > vers_skip_create and vers_skip_auto_create, it wasn't in > > > the previous version of the patch. So, I believe it was > > > a fix for one of the MDEV bugs reported and fixed > > > meanwhile. > > > > No, that was the multi-threaded case which worked good for > > me, but suddenly I discovered it fails on some buildbot. > > Could you elaborate on what the problem was? Two threads > trying to add the partition in parallel? I'd expect > MDL_EXCLUSIVE to prevent that.
MDL_EXCLUSIVE prevents parallel execution of repair_from_failed_open(), but not sequential. So it can add several partitions instead of 1, one after another.
What's the sequence of events? One thread decides to add a partition, takes an MDL_EXCLUSIVE, the other thread decides to add a partition, waits for MDL_EXCLUSIVE, the first one finishes adding a partition, releases the lock, the second grabs it and adds a second partition?
Right.
Okay. Then, why a thread didn't check the number of partitions after acquiring MDL_EXCLUSIVE? Like with mutexes, if there's a mutex that protects a shared variable, you first acquire the mutex, then read the variable's value. Not the other way around.
Number of partitions is not a shared variable. part_info is kept in TABLE instance. To get new value it must reacquire share, reparse part_sql string. Then comparing with what? After releasing MDL_SHARED_WRITE everything is gone: TABLE, TABLE_SHARE. You must store somewhere old value, presumably in Open_table_context.
I thought that after acquiring MDL_EXCLUSIVE, just as the thread is trying to add a new partition, it could check the conditions if the new partition, indeed, needs to be added.
If it were so easy as it sounds I'd make it.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok