Re: [Maria-developers] Rev 3967: MDEV-5597 - Reduce usage of LOCK_open: LOCK_flush
Hi, Sergey! On Jan 31, Sergey Vojtovich wrote:
At lp:maria/10.0
------------------------------------------------------------ revno: 3967 revision-id: svoj@mariadb.org-20140131141943-fjyyih3h92lek62n parent: svoj@mariadb.org-20140130124905-vy28zbs9swlpg8uq committer: Sergey Vojtovich <svoj@mariadb.org> branch nick: 10.0 timestamp: Fri 2014-01-31 18:19:43 +0400 message: MDEV-5597 - Reduce usage of LOCK_open: LOCK_flush
Replaced LOCK_flush with per-share conditional variable. === modified file 'sql/table_cache.cc' --- a/sql/table_cache.cc 2014-01-30 12:49:05 +0000 +++ b/sql/table_cache.cc 2014-01-31 14:19:43 +0000 @@ -945,6 +949,7 @@ bool tdc_remove_table(THD *thd, enum_tdc if ((share= tdc_delete_share(db, table_name))) { I_P_List <TABLE, TABLE_share> purge_tables; + uint my_refs= 1;
mysql_mutex_lock(&LOCK_open); /* @@ -969,28 +974,48 @@ bool tdc_remove_table(THD *thd, enum_tdc if (kill_delayed_threads) kill_delayed_threads_for_table(share);
-#ifndef DBUG_OFF - if (remove_type == TDC_RT_REMOVE_NOT_OWN) + if (remove_type == TDC_RT_REMOVE_NOT_OWN || + remove_type == TDC_RT_REMOVE_NOT_OWN_KEEP_SHARE) { TABLE_SHARE::All_share_tables_list::Iterator it(share->tdc.all_tables); while ((table= it++)) + { + my_refs++; DBUG_ASSERT(table->in_use == thd); + } } -#endif - - mysql_rwlock_rdlock(&LOCK_flush); + DBUG_ASSERT(share->tdc.all_tables.is_empty() || remove_type != TDC_RT_REMOVE_ALL); mysql_mutex_unlock(&LOCK_open);
while ((table= purge_tables.pop_front())) intern_close_table(table); - mysql_rwlock_unlock(&LOCK_flush);
- DBUG_ASSERT(share->tdc.all_tables.is_empty() || remove_type != TDC_RT_REMOVE_ALL); - tdc_release_share(share); + if (remove_type != TDC_RT_REMOVE_UNUSED) + { + /* + Even though current thread holds exclusive metadata lock on this share + (asserted above), concurrent FLUSH TABLES threads may be in process of + closing unused table instances belonging to this share. E.g.: + thr1 (FLUSH TABLES): table= share->tdc.free_tables.pop_front(); + thr1 (FLUSH TABLES): share->tdc.all_tables.remove(table); + thr2 (ALTER TABLE): tdc_remove_table(); + thr1 (FLUSH TABLES): intern_close_table(table); + + Current remove type assumes that all table instances (except for those + that are owned by current thread) must be closed before + thd_remove_table() returns. Wait for such tables here. + + Unfortunately TABLE_SHARE::wait_for_old_version() cannot be used here + because it waits for all table instances, whereas we have to wait only + for those that are not owned by current thread. + */ + mysql_mutex_lock(&share->tdc.LOCK_table_share); + while (share->tdc.ref_count > my_refs) + mysql_cond_wait(&share->tdc.COND_release, &share->tdc.LOCK_table_share); + mysql_mutex_unlock(&share->tdc.LOCK_table_share);
I'm sorry, I don't understand that at all. Why my_refs is only incremented in debug builds? In non-debug builds you wait until all tables but one are closed, in debug builds you may keep more than one table opened. You wait here on a condition, how do you know there's FLUSH TABLES (or anything at all) running that will signal this condition?
+ }
- /* Wait for concurrent threads to free unused objects. */ - mysql_rwlock_wrlock(&LOCK_flush); - mysql_rwlock_unlock(&LOCK_flush); + tdc_release_share(share);
found= true; }
Regards, Sergei
Hi Sergei, On Fri, Jan 31, 2014 at 04:59:41PM +0100, Sergei Golubchik wrote: > Hi, Sergey! > > On Jan 31, Sergey Vojtovich wrote: > > At lp:maria/10.0 > > > > ------------------------------------------------------------ > > revno: 3967 > > revision-id: svoj@mariadb.org-20140131141943-fjyyih3h92lek62n > > parent: svoj@mariadb.org-20140130124905-vy28zbs9swlpg8uq > > committer: Sergey Vojtovich <svoj@mariadb.org> > > branch nick: 10.0 > > timestamp: Fri 2014-01-31 18:19:43 +0400 > > message: > > MDEV-5597 - Reduce usage of LOCK_open: LOCK_flush > > > > Replaced LOCK_flush with per-share conditional variable. > > === modified file 'sql/table_cache.cc' > > --- a/sql/table_cache.cc 2014-01-30 12:49:05 +0000 > > +++ b/sql/table_cache.cc 2014-01-31 14:19:43 +0000 > > @@ -945,6 +949,7 @@ bool tdc_remove_table(THD *thd, enum_tdc > > if ((share= tdc_delete_share(db, table_name))) > > { > > I_P_List <TABLE, TABLE_share> purge_tables; > > + uint my_refs= 1; > > > > mysql_mutex_lock(&LOCK_open); > > /* > > @@ -969,28 +974,48 @@ bool tdc_remove_table(THD *thd, enum_tdc > > if (kill_delayed_threads) > > kill_delayed_threads_for_table(share); > > > > -#ifndef DBUG_OFF > > - if (remove_type == TDC_RT_REMOVE_NOT_OWN) > > + if (remove_type == TDC_RT_REMOVE_NOT_OWN || > > + remove_type == TDC_RT_REMOVE_NOT_OWN_KEEP_SHARE) > > { > > TABLE_SHARE::All_share_tables_list::Iterator it(share->tdc.all_tables); > > while ((table= it++)) > > + { > > + my_refs++; > > DBUG_ASSERT(table->in_use == thd); > > + } > > } > > -#endif > > - > > - mysql_rwlock_rdlock(&LOCK_flush); > > + DBUG_ASSERT(share->tdc.all_tables.is_empty() || remove_type != TDC_RT_REMOVE_ALL); > > mysql_mutex_unlock(&LOCK_open); > > > > while ((table= purge_tables.pop_front())) > > intern_close_table(table); > > - mysql_rwlock_unlock(&LOCK_flush); > > > > - DBUG_ASSERT(share->tdc.all_tables.is_empty() || remove_type != TDC_RT_REMOVE_ALL); > > - tdc_release_share(share); > > + if (remove_type != TDC_RT_REMOVE_UNUSED) > > + { > > + /* > > + Even though current thread holds exclusive metadata lock on this share > > + (asserted above), concurrent FLUSH TABLES threads may be in process of > > + closing unused table instances belonging to this share. E.g.: > > + thr1 (FLUSH TABLES): table= share->tdc.free_tables.pop_front(); > > + thr1 (FLUSH TABLES): share->tdc.all_tables.remove(table); > > + thr2 (ALTER TABLE): tdc_remove_table(); > > + thr1 (FLUSH TABLES): intern_close_table(table); > > + > > + Current remove type assumes that all table instances (except for those > > + that are owned by current thread) must be closed before > > + thd_remove_table() returns. Wait for such tables here. > > + > > + Unfortunately TABLE_SHARE::wait_for_old_version() cannot be used here > > + because it waits for all table instances, whereas we have to wait only > > + for those that are not owned by current thread. > > + */ > > + mysql_mutex_lock(&share->tdc.LOCK_table_share); > > + while (share->tdc.ref_count > my_refs) > > + mysql_cond_wait(&share->tdc.COND_release, &share->tdc.LOCK_table_share); > > + mysql_mutex_unlock(&share->tdc.LOCK_table_share); > > I'm sorry, I don't understand that at all. > > Why my_refs is only incremented in debug builds? In non-debug builds you > wait until all tables but one are closed, in debug builds you may keep > more than one table opened. See above, this patch removes #ifndef DBUG_OFF, so my_refs are counted properly in both builds. > You wait here on a condition, how do you know there's FLUSH TABLES (or > anything at all) running that will signal this condition? 1. we're protected by exclusive metadata lock, so the only possible statement that may cause concurrency issues should be FLUSH TABLES (in fact there are more cases, like ha_table_exists()). 2. share->tdc.all_tables holds only tables owned by current thread (or empty), there are assertions for this above. 3. if concurrent FLUSH TABLES removed some TABLE instances from tdc.all_tables, but didn't close them yet, share ref_count reflects that. 4. this FLUSH TABLES must call intern_close_table(), which will eventually call tdc_release_share()/mysql_cond_broadcast(). That's it. > > > + } > > > > - /* Wait for concurrent threads to free unused objects. */ > > - mysql_rwlock_wrlock(&LOCK_flush); > > - mysql_rwlock_unlock(&LOCK_flush); > > + tdc_release_share(share); > > > > found= true; > > } > > Regards, > Sergei Thanks, Sergey
Hi, Sergey! On Jan 31, Sergey Vojtovich wrote: > > > > I'm sorry, I don't understand that at all. > > > > Why my_refs is only incremented in debug builds? In non-debug builds you > > wait until all tables but one are closed, in debug builds you may keep > > more than one table opened. > See above, this patch removes #ifndef DBUG_OFF, so my_refs are counted properly > in both builds. Right. Sorry. My mistake. > > You wait here on a condition, how do you know there's FLUSH TABLES (or > > anything at all) running that will signal this condition? > 1. we're protected by exclusive metadata lock, so the only possible statement > that may cause concurrency issues should be FLUSH TABLES (in fact there are > more cases, like ha_table_exists()). > 2. share->tdc.all_tables holds only tables owned by current thread (or empty), > there are assertions for this above. > 3. if concurrent FLUSH TABLES removed some TABLE instances from tdc.all_tables, > but didn't close them yet, share ref_count reflects that. > 4. this FLUSH TABLES must call intern_close_table(), which will eventually call > tdc_release_share()/mysql_cond_broadcast(). > > That's it. Okay, thanks. Got it. What did you mean "like ha_table_exists()" ? Regards, Sergei
Hi Sergei, On Mon, Feb 03, 2014 at 05:22:50PM +0100, Sergei Golubchik wrote: > Hi, Sergey! > > On Jan 31, Sergey Vojtovich wrote: > > > > > > I'm sorry, I don't understand that at all. > > > > > > Why my_refs is only incremented in debug builds? In non-debug builds you > > > wait until all tables but one are closed, in debug builds you may keep > > > more than one table opened. > > See above, this patch removes #ifndef DBUG_OFF, so my_refs are counted properly > > in both builds. > > Right. Sorry. My mistake. > > > > You wait here on a condition, how do you know there's FLUSH TABLES (or > > > anything at all) running that will signal this condition? > > 1. we're protected by exclusive metadata lock, so the only possible statement > > that may cause concurrency issues should be FLUSH TABLES (in fact there are > > more cases, like ha_table_exists()). > > 2. share->tdc.all_tables holds only tables owned by current thread (or empty), > > there are assertions for this above. > > 3. if concurrent FLUSH TABLES removed some TABLE instances from tdc.all_tables, > > but didn't close them yet, share ref_count reflects that. > > 4. this FLUSH TABLES must call intern_close_table(), which will eventually call > > tdc_release_share()/mysql_cond_broadcast(). > > > > That's it. > > Okay, thanks. Got it. > What did you mean "like ha_table_exists()" ? FWICS ha_table_exists() may be called without prior acquisition of MDL, e.g. by fill_schema_table_names() (SHOW TABLES). It does the following: <code> TABLE_SHARE *share= tdc_lock_share(db, table_name); // <LOCK_table_share>ref_count++</LOCK_table_share> if (share) { if (hton) *hton= share->db_type(); tdc_unlock_share(share); // <LOCK_table_share>ref_count--</LOCK_table_share> DBUG_RETURN(TRUE); } </code> Note that it locks and then immediately unlocks table share, in other words condition signal is guaranteed. In fact we don't need to touch ref_count here - all we need is hash lookup (there is no API for that yet). I'll add this idea to MDEV-5471. Regards, Sergey
Hi, Sergey! On Jan 31, Sergey Vojtovich wrote: > > > You wait here on a condition, how do you know there's FLUSH TABLES (or > > anything at all) running that will signal this condition? > 1. we're protected by exclusive metadata lock, so the only possible statement > that may cause concurrency issues should be FLUSH TABLES (in fact there are > more cases, like ha_table_exists()). > 2. share->tdc.all_tables holds only tables owned by current thread (or empty), > there are assertions for this above. > 3. if concurrent FLUSH TABLES removed some TABLE instances from tdc.all_tables, > but didn't close them yet, share ref_count reflects that. > 4. this FLUSH TABLES must call intern_close_table(), which will eventually call > tdc_release_share()/mysql_cond_broadcast(). > > That's it. I see. Then, I suppose, it's ok to push. But please add this your explanation in a comment. Regards, Sergei
participants (2)
-
Sergei Golubchik
-
Sergey Vojtovich