Hi, Aleksey! On Jun 02, Aleksey Midenkov wrote:
On Wed, Jun 2, 2021 at 9:06 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Jun 02, Aleksey Midenkov wrote:
> > > > > > - if (!(sql_lock= (MYSQL_LOCK*) > > > - my_malloc(key_memory_MYSQL_LOCK, sizeof(*sql_lock) + > > > - sizeof(THR_LOCK_DATA*)*((a->lock_count+b->lock_count)*2) + > > > - sizeof(TABLE*)*(a->table_count+b->table_count),MYF(MY_WME)))) > > > - DBUG_RETURN(0); // Fatal error > > > + const size_t lock_size= sizeof(*sql_lock) + > > > + sizeof(THR_LOCK_DATA *) * ((a->lock_count + b->lock_count) * 2) + > > > + sizeof(TABLE *) * (a->table_count + b->table_count); > > > + if (thd) > > > + { > > > + sql_lock= (MYSQL_LOCK *) thd->alloc(lock_size); > > > + if (!sql_lock) > > > + DBUG_RETURN(0); > > > + sql_lock->flags= GET_LOCK_ON_THD; > > Sorry, I meant the moment when MDEV-23639 calls mysql_lock_merge(). Yes, that is the part of the bug fix.
What did it fix and how?
That's not about changing the method, that's about merging locks. When I merged my locks with thd there were already thd-allocated locks.
And? How did allocating locks on THD fix anything?
LTM_PRELOCKED had locks on thd, I keep merged locks on thd. Isn't that enough for you?
I'm just trying to understand what the bug was. And I still cannot.
Also freeing was impossible for locks on thd.
Yes, this change I understand, no questions about freeing.
> > > + /* > > > + NOTE: The semantics of vers_set_hist_part() is double: > > > > twofold
Please, fix the language to be proper English.
> > > + table_list->vers_skip_create= false; > > > + ot_ctx->vers_create_count= 0; > > > + action= Open_table_context::OT_REOPEN_TABLES; > > > + table_arg= NULL; > > > + } > > > > I'm afraid I don't understand. All this business with > > vers_skip_create and vers_skip_auto_create, it wasn't in > > the previous version of the patch. So, I believe it was > > a fix for one of the MDEV bugs reported and fixed > > meanwhile. > > No, that was the multi-threaded case which worked good for > me, but suddenly I discovered it fails on some buildbot.
Could you elaborate on what the problem was? Two threads trying to add the partition in parallel? I'd expect MDL_EXCLUSIVE to prevent that.
MDL_EXCLUSIVE prevents parallel execution of repair_from_failed_open(), but not sequential. So it can add several partitions instead of 1, one after another.
What's the sequence of events? One thread decides to add a partition, takes an MDL_EXCLUSIVE, the other thread decides to add a partition, waits for MDL_EXCLUSIVE, the first one finishes adding a partition, releases the lock, the second grabs it and adds a second partition?
Right.
Okay. Then, why a thread didn't check the number of partitions after acquiring MDL_EXCLUSIVE? Like with mutexes, if there's a mutex that protects a shared variable, you first acquire the mutex, then read the variable's value. Not the other way around.
Number of partitions is not a shared variable. part_info is kept in TABLE instance. To get new value it must reacquire share, reparse part_sql string. Then comparing with what? After releasing MDL_SHARED_WRITE everything is gone: TABLE, TABLE_SHARE. You must store somewhere old value, presumably in Open_table_context.
I thought that after acquiring MDL_EXCLUSIVE, just as the thread is trying to add a new partition, it could check the conditions if the new partition, indeed, needs to be added. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org