On Fri, Mar 30, 2018 at 04:51:18AM +0530, Varun wrote:
revision-id: e13108c7fcf472faa7eaba1a7083d80c9bad4f55 (mariadb-10.3.0-696-ge13108c7fcf) parent(s): 87ee85634cae8538b922b6eb3d275a5f4343dbb1 author: Varun Gupta committer: Varun Gupta timestamp: 2018-03-30 04:50:47 +0530 message:
MDEV-15306: Wrong/Unexpected result with the value optimizer_use_condition_selectivity set to 4
Currently we perform range analysis for a column even when we don't have any statistics(EITS). This makes less sense but is used to catch contradiction for WHERE condition.
So the solution is to not perform range analysis for columns that do not have statistics that can be used by the optimizer.
--- mysql-test/main/selectivity.result | 48 ++++++++++++++++++++++++++++--- mysql-test/main/selectivity.test | 37 ++++++++++++++++++++++++ mysql-test/main/selectivity_innodb.result | 48 ++++++++++++++++++++++++++++--- sql/opt_range.cc | 10 +++++-- sql/sql_statistics.h | 17 ++++++++++- 5 files changed, 149 insertions(+), 11 deletions(-)
...
index 38dbed92a22..269d4af667d 100644 --- a/sql/opt_range.cc +++ b/sql/opt_range.cc @@ -2733,13 +2733,18 @@ bool create_key_parts_for_pseudo_indexes(RANGE_OPT_PARAM *param,
for (field_ptr= table->field; *field_ptr; field_ptr++) { - if (bitmap_is_set(used_fields, (*field_ptr)->field_index)) + Column_statistics* col_stats= (*field_ptr)->read_stats; + if (col_stats && !col_stats->check_all_values_are_default() + && bitmap_is_set(used_fields, (*field_ptr)->field_index)) parts++; }
KEY_PART *key_part; uint keys= 0;
+ if(!parts) + return TRUE; coding style: please add a space after 'if('
+ if (!(key_part= (KEY_PART *) alloc_root(param->mem_root, sizeof(KEY_PART) * parts))) return TRUE; @@ -3025,7 +3030,8 @@ bool calculate_cond_selectivity_for_table(THD *thd, TABLE *table, Item **cond) */
if (thd->variables.optimizer_use_condition_selectivity > 2 && - !bitmap_is_clear_all(used_fields)) + !bitmap_is_clear_all(used_fields) && + thd->variables.use_stat_tables >0) { PARAM param; MEM_ROOT alloc; diff --git a/sql/sql_statistics.h b/sql/sql_statistics.h index 7ec742b70db..4a7b0df0caf 100644 --- a/sql/sql_statistics.h +++ b/sql/sql_statistics.h @@ -383,7 +383,22 @@ class Column_statistics bool min_max_values_are_provided() { return !is_null(COLUMN_STAT_MIN_VALUE) && - !is_null(COLUMN_STAT_MIN_VALUE); + !is_null(COLUMN_STAT_MAX_VALUE); + } + + /* + This function checks whether the values for the fields of the statistical + tables that were NULL by DEFAULT for a column have changed or not. + + @retval + TRUE: Statistics are not present for a column + FALSE: Statisitics are present for a column + */ + bool check_all_values_are_default()
I don't like the name. - look at other "check if ..." functions above. They don't start with check_. - Typically, "check_" functions in MySQL codebase make some expensive computations (e.g. check_quick_select()). - "default values" also is not meanigful. They are not "default", they are "not present". There is no meaninful default for e.g. MIN_VALUE. I think something like "bool no_stat_values_provided()" would be better. I don't like "provided" but there's min_max_values_are_provided() so it's good to follow it.
+ { + return !(column_stat_nulls ^ + ((1 << (COLUMN_STAT_HISTOGRAM-COLUMN_STAT_COLUMN_NAME))-1) << + (COLUMN_STAT_COLUMN_NAME+1));
The above is very difficult to read. I undestand that the condition is "inspired" the the code in set_all_nulls() in the same class, but let's try making it more readable. As far as I understand, the bitmap has bits covering the available fields: enum enum_column_stat_col { COLUMN_STAT_DB_NAME, // least significant bit (1) COLUMN_STAT_TABLE_NAME, // second bit, 1 << 1 COLUMN_STAT_COLUMN_NAME, // 1 << 2 COLUMN_STAT_MIN_VALUE, // and so forth. COLUMN_STAT_MAX_VALUE, COLUMN_STAT_NULLS_RATIO, COLUMN_STAT_AVG_LENGTH, COLUMN_STAT_AVG_FREQUENCY, COLUMN_STAT_HIST_SIZE, COLUMN_STAT_HIST_TYPE, COLUMN_STAT_HISTOGRAM, COLUMN_STAT_N_FIELDS }; DB_NAME, TABLE_NAME, and COLUMN_NAME are expected to be set. We want to check if there is at least one other bit present. const int name_bits= (1 << COLUMN_STAT_DB_NAME) | (1 << COLUMN_STAT_TABLE_NAME) | (1 << COLUMN_STAT_COLUMN_NAME); // then: if (column_stat_nulls & ~name_bits) { // there is at least statistical data bit set. return false; } // no other values. return true;
} };
Does this patch make the patch for MDEV-16995 unnecessary? BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog