Re: [Maria-developers] [Commits] f2aea43: MDEV-10649: Optimizer sometimes use "index" instead of "range" access for UPDATE
On Tue, Sep 6, 2016 at 8:37 PM, Sergei Petrunia <psergey@askmonty.org> wrote:
revision-id: f2aea435df7e92fcf8f09f8f6c160161168c5bed parent(s): a14f61ef749ad9f9ab2b0f5badf6754ba7443c9e committer: Sergei Petrunia branch nick: 10.0 timestamp: 2016-09-06 20:37:21 +0300 message:
MDEV-10649: Optimizer sometimes use "index" instead of "range" access for UPDATE
(XtraDB variant only, for now)
Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache eviction) causes ha_innobase to call dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
Inside this call, the following is done: dict_stats_empty_table(table); dict_stats_copy(table, t);
On the other hand, commands like UPDATE make this call to get the "rows in table" statistics in table->stats.records:
ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
note the HA_STATUS_NO_LOCK parameter. It means, no locks are taken by ::info() If the ::info() call happens between dict_stats_empty_table and dict_stats_copy calls, the UPDATE's optimizer will get an estimate of table->stats.records=1, which causes it to pick a full table scan, which in turn will take a lot of row locks and cause other bad consequences.
--- storage/xtradb/dict/dict0stats.cc | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/storage/xtradb/dict/dict0stats.cc b/storage/xtradb/dict/ dict0stats.cc index b073398..a4aa436 100644 --- a/storage/xtradb/dict/dict0stats.cc +++ b/storage/xtradb/dict/dict0stats.cc @@ -673,7 +673,10 @@ Write all zeros (or 1 where it makes sense) into a table and its indexes' dict_stats_copy( /*============*/ dict_table_t* dst, /*!< in/out: destination table */ - const dict_table_t* src) /*!< in: source table */ + const dict_table_t* src, /*!< in: source table */ + bool reset_ignored_indexes) /*!< in: if true, set ignored indexes + to have the same statistics as if + the table was empty */ { dst->stats_last_recalc = src->stats_last_recalc; dst->stat_n_rows = src->stat_n_rows; @@ -692,7 +695,16 @@ Write all zeros (or 1 where it makes sense) into a table and its indexes' && (src_idx = dict_table_get_next_index(src_idx)))) {
if (dict_stats_should_ignore_index(dst_idx)) { - continue; + if (reset_ignored_indexes) { + /* Reset index statistics for all ignored indexes, + unless they are FT indexes (these have no statistics)*/ + if (dst_idx->type & DICT_FTS) { + continue; + } + dict_stats_empty_index(dst_idx);
Does this really help? Yes, we hold dict_sys mutex here so dict_stats_empty_index here is safe for all readers using same mutex. However, as you pointed up above, info() uses no locking method. Thus, we really do not know what it will return, all values before dict_stats_copy(), some of the indexes with new stats, all indexes with new stats.
@@ -3240,13 +3252,10 @@ N*AVG(Ui). In each call it searches for the currently fetched index into
dict_table_stats_lock(table, RW_X_LATCH);
- /* Initialize all stats to dummy values before - copying because dict_stats_table_clone_create() does - skip corrupted indexes so our dummy object 't' may - have less indexes than the real object 'table'. */ - dict_stats_empty_table(table); - - dict_stats_copy(table, t); + /* Pass reset_ignored_indexes=true as parameter + to dict_stats_copy. This will cause statictics + for corrupted indexes to be set to empty values */ + dict_stats_copy(table, t, true);
This is better solution than the original but as noted above, is this enough ? R: Jan
On Wed, Sep 07, 2016 at 10:36:32AM +0300, Jan Lindström wrote:
On Tue, Sep 6, 2016 at 8:37 PM, Sergei Petrunia <psergey@askmonty.org> wrote:
revision-id: f2aea435df7e92fcf8f09f8f6c160161168c5bed parent(s): a14f61ef749ad9f9ab2b0f5badf6754ba7443c9e committer: Sergei Petrunia branch nick: 10.0 timestamp: 2016-09-06 20:37:21 +0300 message:
MDEV-10649: Optimizer sometimes use "index" instead of "range" access for UPDATE
(XtraDB variant only, for now)
Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache eviction) causes ha_innobase to call dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
Inside this call, the following is done: dict_stats_empty_table(table); dict_stats_copy(table, t);
On the other hand, commands like UPDATE make this call to get the "rows in table" statistics in table->stats.records:
ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
note the HA_STATUS_NO_LOCK parameter. It means, no locks are taken by ::info() If the ::info() call happens between dict_stats_empty_table and dict_stats_copy calls, the UPDATE's optimizer will get an estimate of table->stats.records=1, which causes it to pick a full table scan, which in turn will take a lot of row locks and cause other bad consequences.
--- storage/xtradb/dict/dict0stats.cc | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/storage/xtradb/dict/dict0stats.cc b/storage/xtradb/dict/ dict0stats.cc index b073398..a4aa436 100644 --- a/storage/xtradb/dict/dict0stats.cc +++ b/storage/xtradb/dict/dict0stats.cc @@ -673,7 +673,10 @@ Write all zeros (or 1 where it makes sense) into a table and its indexes' dict_stats_copy( /*============*/ dict_table_t* dst, /*!< in/out: destination table */ - const dict_table_t* src) /*!< in: source table */ + const dict_table_t* src, /*!< in: source table */ + bool reset_ignored_indexes) /*!< in: if true, set ignored indexes + to have the same statistics as if + the table was empty */ { dst->stats_last_recalc = src->stats_last_recalc; dst->stat_n_rows = src->stat_n_rows; @@ -692,7 +695,16 @@ Write all zeros (or 1 where it makes sense) into a table and its indexes' && (src_idx = dict_table_get_next_index(src_idx)))) {
if (dict_stats_should_ignore_index(dst_idx)) { - continue; + if (reset_ignored_indexes) { + /* Reset index statistics for all ignored indexes, + unless they are FT indexes (these have no statistics)*/ + if (dst_idx->type & DICT_FTS) { + continue; + } + dict_stats_empty_index(dst_idx);
Does this really help? Yes, we hold dict_sys mutex here so dict_stats_empty_index here is safe for all readers using same mutex. However, as you pointed up above, info() uses no locking method. Thus, we really do not know what it will return, all values before dict_stats_copy(), some of the indexes with new stats, all indexes with new stats.
I think there two issues here: issue#1 is the above chunk of code. It is executed only for indexes that have dict_stats_should_ignore_index()= true. These are fulltext indexes, indexes that are corrupted or are being dropped. As far I understand, the optimizer will not (or should not) attempt to use index statistics on such indexes, anyway. issue#2 is with updating individual index statistics members. It does exist, there are race conditions also near ANALYZE code: https://jira.mariadb.org/browse/MDEV-10790 I would like to limit the scope of this fix to address the race condition with reading/updating dict_table_t::stat_n_rows. (I believe this is done here, to a full extent). As for race conditions with dict_index_t members, I want to address them in MDEV-10790.
@@ -3240,13 +3252,10 @@ N*AVG(Ui). In each call it searches for the currently fetched index into
dict_table_stats_lock(table, RW_X_LATCH);
- /* Initialize all stats to dummy values before - copying because dict_stats_table_clone_create() does - skip corrupted indexes so our dummy object 't' may - have less indexes than the real object 'table'. */ - dict_stats_empty_table(table); - - dict_stats_copy(table, t); + /* Pass reset_ignored_indexes=true as parameter + to dict_stats_copy. This will cause statictics + for corrupted indexes to be set to empty values */ + dict_stats_copy(table, t, true);
This is better solution than the original but as noted above, is this enough ?
R: Jan
-- BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
I'm fine with this. Ok to push #1. 28.9.2016 15.32 "Sergey Petrunia" <sergey@mariadb.com> kirjoitti:
On Wed, Sep 07, 2016 at 10:36:32AM +0300, Jan Lindström wrote:
On Tue, Sep 6, 2016 at 8:37 PM, Sergei Petrunia <psergey@askmonty.org> wrote:
revision-id: f2aea435df7e92fcf8f09f8f6c160161168c5bed parent(s): a14f61ef749ad9f9ab2b0f5badf6754ba7443c9e committer: Sergei Petrunia branch nick: 10.0 timestamp: 2016-09-06 20:37:21 +0300 message:
MDEV-10649: Optimizer sometimes use "index" instead of "range" access for UPDATE
(XtraDB variant only, for now)
Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache eviction) causes ha_innobase to call dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
Inside this call, the following is done: dict_stats_empty_table(table); dict_stats_copy(table, t);
On the other hand, commands like UPDATE make this call to get the "rows in table" statistics in table->stats.records:
ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
note the HA_STATUS_NO_LOCK parameter. It means, no locks are taken by ::info() If the ::info() call happens between dict_stats_empty_table and dict_stats_copy calls, the UPDATE's optimizer will get an estimate of table->stats.records=1, which causes it to pick a full table scan, which in turn will take a lot of row locks and cause other bad consequences.
--- storage/xtradb/dict/dict0stats.cc | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/storage/xtradb/dict/dict0stats.cc b/storage/xtradb/dict/ dict0stats.cc index b073398..a4aa436 100644 --- a/storage/xtradb/dict/dict0stats.cc +++ b/storage/xtradb/dict/dict0stats.cc @@ -673,7 +673,10 @@ Write all zeros (or 1 where it makes sense) into a table and its indexes' dict_stats_copy( /*============*/ dict_table_t* dst, /*!< in/out: destination table */ - const dict_table_t* src) /*!< in: source table */ + const dict_table_t* src, /*!< in: source table */ + bool reset_ignored_indexes) /*!< in: if true, set ignored indexes + to have the same statistics as if + the table was empty */ { dst->stats_last_recalc = src->stats_last_recalc; dst->stat_n_rows = src->stat_n_rows; @@ -692,7 +695,16 @@ Write all zeros (or 1 where it makes sense) into a table and its indexes' && (src_idx = dict_table_get_next_index(src_idx)))) {
if (dict_stats_should_ignore_index(dst_idx)) { - continue; + if (reset_ignored_indexes) { + /* Reset index statistics for all ignored indexes, + unless they are FT indexes (these have no statistics)*/ + if (dst_idx->type & DICT_FTS) { + continue; + } + dict_stats_empty_index(dst_idx);
Does this really help? Yes, we hold dict_sys mutex here so dict_stats_empty_index here is safe for all readers using same mutex. However, as you pointed up above, info() uses no locking method. Thus, we really do not know what it will return, all values before dict_stats_copy(), some of the indexes with new stats, all indexes with new stats.
I think there two issues here:
issue#1 is the above chunk of code. It is executed only for indexes that have dict_stats_should_ignore_index()= true. These are fulltext indexes, indexes that are corrupted or are being dropped. As far I understand, the optimizer will not (or should not) attempt to use index statistics on such indexes, anyway.
issue#2 is with updating individual index statistics members. It does exist, there are race conditions also near ANALYZE code: https://jira.mariadb.org/browse/MDEV-10790
I would like to limit the scope of this fix to address the race condition with reading/updating dict_table_t::stat_n_rows. (I believe this is done here, to a full extent).
As for race conditions with dict_index_t members, I want to address them in MDEV-10790.
@@ -3240,13 +3252,10 @@ N*AVG(Ui). In each call it searches for the currently fetched index into
dict_table_stats_lock(table, RW_X_LATCH);
- /* Initialize all stats to dummy values before - copying because dict_stats_table_clone_create(
)
does - skip corrupted indexes so our dummy object 't' may - have less indexes than the real object 'table'. */ - dict_stats_empty_table(table); - - dict_stats_copy(table, t); + /* Pass reset_ignored_indexes=true as parameter + to dict_stats_copy. This will cause statictics + for corrupted indexes to be set to empty values */ + dict_stats_copy(table, t, true);
This is better solution than the original but as noted above, is this enough ?
R: Jan
-- BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
participants (2)
-
Jan Lindström
-
Sergey Petrunia