I'm fine with this. Ok to push #1.
On Wed, Sep 07, 2016 at 10:36:32AM +0300, Jan Lindström wrote:
> On Tue, Sep 6, 2016 at 8:37 PM, Sergei Petrunia <psergey@askmonty.org>
> wrote:
>
> > revision-id: f2aea435df7e92fcf8f09f8f6c160161168c5bed
> > parent(s): a14f61ef749ad9f9ab2b0f5badf6754ba7443c9e
> > committer: Sergei Petrunia
> > branch nick: 10.0
> > timestamp: 2016-09-06 20:37:21 +0300
> > message:
> >
> > MDEV-10649: Optimizer sometimes use "index" instead of "range" access for
> > UPDATE
> >
> > (XtraDB variant only, for now)
> >
> > Re-opening a TABLE object (after e.g. FLUSH TABLES or open table cache
> > eviction) causes ha_innobase to call
> > dict_stats_update(DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY).
> >
> > Inside this call, the following is done:
> > dict_stats_empty_table(table);
> > dict_stats_copy(table, t);
> >
> > On the other hand, commands like UPDATE make this call to get the "rows in
> > table" statistics in table->stats.records:
> >
> > ha_innobase->info(HA_STATUS_VARIABLE|HA_STATUS_NO_LOCK)
> >
> > note the HA_STATUS_NO_LOCK parameter. It means, no locks are taken by
> > ::info() If the ::info() call happens between dict_stats_empty_table
> > and dict_stats_copy calls, the UPDATE's optimizer will get an estimate
> > of table->stats.records=1, which causes it to pick a full table scan,
> > which in turn will take a lot of row locks and cause other bad
> > consequences.
> >
> > ---
> > storage/xtradb/dict/dict0stats.cc | 29 +++++++++++++++++++----------
> > 1 file changed, 19 insertions(+), 10 deletions(-)
> >
> > diff --git a/storage/xtradb/dict/dict0stats.cc b/storage/xtradb/dict/
> > dict0stats.cc
> > index b073398..a4aa436 100644
> > --- a/storage/xtradb/dict/dict0stats.cc
> > +++ b/storage/xtradb/dict/dict0stats.cc
> > @@ -673,7 +673,10 @@ Write all zeros (or 1 where it makes sense) into a
> > table and its indexes'
> > dict_stats_copy(
> > /*============*/
> > dict_table_t* dst, /*!< in/out: destination table */
> > - const dict_table_t* src) /*!< in: source table */
> > + const dict_table_t* src, /*!< in: source table */
> > + bool reset_ignored_indexes) /*!< in: if true, set ignored
> > indexes
> > + to have the same statistics
> > as if
> > + the table was empty */
> > {
> > dst->stats_last_recalc = src->stats_last_recalc;
> > dst->stat_n_rows = src->stat_n_rows;
> > @@ -692,7 +695,16 @@ Write all zeros (or 1 where it makes sense) into a
> > table and its indexes'
> > && (src_idx = dict_table_get_next_index(src_idx)))) {
> >
> > if (dict_stats_should_ignore_index(dst_idx)) {
> > - continue;
> > + if (reset_ignored_indexes) {
> > + /* Reset index statistics for all ignored
> > indexes,
> > + unless they are FT indexes (these have no
> > statistics)*/
> > + if (dst_idx->type & DICT_FTS) {
> > + continue;
> > + }
> > + dict_stats_empty_index(dst_idx);
> >
>
> Does this really help? Yes, we hold dict_sys mutex here so
> dict_stats_empty_index here is safe for all readers using same mutex.
> However, as you pointed up above, info() uses no locking method.
> Thus, we really do not know what it will return, all values before
> dict_stats_copy(), some of the indexes with new stats, all indexes with new
> stats.
I think there two issues here:
issue#1 is the above chunk of code.
It is executed only for indexes that have dict_stats_should_ignore_index()= true.
These are fulltext indexes, indexes that are corrupted or are being dropped.
As far I understand, the optimizer will not (or should not) attempt to use
index statistics on such indexes, anyway.
issue#2 is with updating individual index statistics members. It does exist,
there are race conditions also near ANALYZE code:
https://jira.mariadb.org/browse/MDEV-10790
I would like to limit the scope of this fix to address the race condition with
reading/updating dict_table_t::stat_n_rows. (I believe this is done here, to a
full extent).
As for race conditions with dict_index_t members, I want to address them in
MDEV-10790.
>
> >
> > @@ -3240,13 +3252,10 @@ N*AVG(Ui). In each call it searches for the
> > currently fetched index into
> >
> > dict_table_stats_lock(table, RW_X_LATCH);
> >
> > - /* Initialize all stats to dummy values before
> > - copying because dict_stats_table_clone_create()
> > does
> > - skip corrupted indexes so our dummy object 't' may
> > - have less indexes than the real object 'table'. */
> > - dict_stats_empty_table(table);
> > -
> > - dict_stats_copy(table, t);
> > + /* Pass reset_ignored_indexes=true as parameter
> > + to dict_stats_copy. This will cause statictics
> > + for corrupted indexes to be set to empty values */
> > + dict_stats_copy(table, t, true);
> >
>
> This is better solution than the original but as noted above, is this
> enough ?
>
> R: Jan
--
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog