Re: [Maria-developers] d67c3f88883: MDEV-27744 InnoDB: Failing assertion: !cursor->index->is_committed() in row0ins.cc (from row_ins_sec_index_entry_by_modify) | Assertion `0' failed in row_upd_sec_index_entry (debug) | Corruption
Hi, Alexander, A question: if the frm will *always* be created in the canonical, non-ORACLE mode and always parsed in the same mode, will we even need these schema-qualified native functions? okay, the schema-qualified trick will help to avoid creating REGEX_REPLACE_ORACLE and so on. But is it needed to fix MDEV-27744? More comments below On Apr 27, Alexander Barkov wrote:
revision-id: d67c3f88883 (mariadb-10.3.33-61-gd67c3f88883) parent(s): 7355f7b1f5c author: Alexander Barkov committer: Alexander Barkov timestamp: 2022-04-22 15:35:16 +0400 message:
MDEV-27744 InnoDB: Failing assertion: !cursor->index->is_committed() in row0ins.cc (from row_ins_sec_index_entry_by_modify) | Assertion `0' failed in row_upd_sec_index_entry (debug) | Corruption
The crash happened with an indexed virtual column whose value is evaluated using a function that has a different meaning in sql_mode='' vs sql_mode=ORACLE: ... good comment
diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc index 59d2ea60715..8e83ee02782 100644 --- a/sql/sql_partition.cc +++ b/sql/sql_partition.cc @@ -2591,11 +2591,9 @@ char *generate_partition_syntax_for_frm(THD *thd, partition_info *part_info, HA_CREATE_INFO *create_info, Alter_info *alter_info) { - sql_mode_t old_mode= thd->variables.sql_mode; - thd->variables.sql_mode &= ~MODE_ANSI_QUOTES; + Sql_mode_save_for_frm_handling sql_mode_save(thd);
right. thanks
char *res= generate_partition_syntax(thd, part_info, buf_length, true, create_info, alter_info); - thd->variables.sql_mode= old_mode; return res; }
diff --git a/sql/unireg.cc b/sql/unireg.cc index efbb5e773b4..3b37944afa5 100644 --- a/sql/unireg.cc +++ b/sql/unireg.cc @@ -652,6 +649,7 @@ static bool pack_expression(String *buf, Virtual_column_info *vcol, static bool pack_vcols(String *buf, List<Create_field> &create_fields, List<Virtual_column_info> *check_constraint_list) { + Sql_mode_save_for_frm_handling sql_mode_save(current_thd);
well, it was done above pack_vcols(), because the caller had access to the thd. Why not to keep it that way?
List_iterator<Create_field> it(create_fields); Create_field *field;
diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc index f41414f8ae9..0f5c7519d07 100644 --- a/sql/item_cmpfunc.cc +++ b/sql/item_cmpfunc.cc @@ -3292,9 +3292,24 @@ void Item_func_case_simple::print(String *str, enum_query_type query_type) }
+Item_func_decode_oracle * +Item_func_decode_oracle::create(THD *thd, const LEX_CSTRING &name, + List<Item> *item_list) +{ + if (unlikely(!item_list || item_list->elements < 3)) + { + wrong_param_count_error(oracle_schema_ref.name(), name); + return NULL; + } + return new (thd->mem_root) Item_func_decode_oracle(thd, *item_list); +}
why? items are normally created in the Create_xxx::create* methods, defined in item_create.cc. Why did you delegate that to static methods in Item_func_xxx ?
+ + void Item_func_decode_oracle::print(String *str, enum_query_type query_type) { - str->append(func_name()); + print_sql_mode_dependent_name(str, query_type, + oracle_schema_ref, + Item_func_decode_oracle::func_name()); str->append('('); args[0]->print(str, query_type); for (uint i= 1, count= when_count() ; i <= count; i++) diff --git a/sql/mysqld.h b/sql/mysqld.h index d40e1d170d0..54069c3cd5a 100644 --- a/sql/mysqld.h +++ b/sql/mysqld.h @@ -741,12 +741,25 @@ enum enum_query_type QT_ITEM_SUBSELECT_ID_ONLY,
QT_SHOW_SELECT_NUMBER= (1<<10), + + //QT_ITEM_IDENT_DISABLE_DB_TABLE_NAMES= (1 <<11), -- this is taken in 10.5 + + /// Print sql_mode-dependent functions with the schema qualifier + /// even if the currently implied (by sql_mode) schema is equal to + /// to the function schema, e.g. mariadb_schema.concat('a'). + QT_ITEM_FUNC_FORCE_SCHEMA_NAME= (1 << 12), + + /// Print for FRM file. Focus on parse-back. + /// e.g. VIEW expressions and virtual column expressions + QT_FOR_FRM= (1 << 13), + /// This is used for EXPLAIN EXTENDED extra warnings / Be more detailed /// Be more detailed than QT_EXPLAIN. /// Perhaps we should eventually include QT_ITEM_IDENT_SKIP_CURRENT_DATABASE /// here, as it would give better readable results QT_EXPLAIN_EXTENDED= QT_TO_SYSTEM_CHARSET| - QT_SHOW_SELECT_NUMBER, + QT_SHOW_SELECT_NUMBER| + QT_ITEM_FUNC_FORCE_SCHEMA_NAME,
don't do this. let's try to keep the behavior *exactly as before* for people who don't switch between sql_modes all the time on the second thought. why do you need QT_ITEM_FUNC_FORCE_SCHEMA_NAME at all? Besides EXPLAIN EXTENDED (where we'd better not use it), it's only used in generating view's sql for I_S. It's not supposed to be used for generating views, so why would it need QT_ITEM_FUNC_FORCE_SCHEMA_NAME? May be it can be deleted altogether?
// If an expression is constant, print the expression, not the value // it evaluates to. Should be used for error messages, so that they diff --git a/sql/sql_class.h b/sql/sql_class.h index 8268ebe0bcd..6b988938f0b 100644 --- a/sql/sql_class.h +++ b/sql/sql_class.h @@ -6589,6 +6589,50 @@ class Sql_mode_save sql_mode_t old_mode; // SQL mode saved at construction time. };
+ +/* + Save the current sql_mode. Switch off sql_mode flags which can prevent + normal parsing of VIEWs, expressions in generated columns. + Restore the old sql_mode on destructor. +*/ +class Sql_mode_save_for_frm_handling: public Sql_mode_save +{ +public: + Sql_mode_save_for_frm_handling(THD *thd) + :Sql_mode_save(thd) + { + /* + - MODE_REAL_AS_FLOAT affect only CREATE TABLE parsing + + MODE_PIPES_AS_CONCAT affect expression parsing + + MODE_ANSI_QUOTES affect expression parsing + + MODE_IGNORE_SPACE affect expression parsing + - MODE_IGNORE_BAD_TABLE_OPTIONS affect only CREATE/ALTER TABLE parsing + * MODE_ONLY_FULL_GROUP_BY affect execution + * MODE_NO_UNSIGNED_SUBTRACTION affect execution + - MODE_NO_DIR_IN_CREATE affect table creation only + - MODE_POSTGRESQL compounded from other modes + - MODE_ORACLE affects Item creation (e.g for CONCAT)
should be '+' for MODE_ORACLE, not '-', right?
+ - MODE_MSSQL compounded from other modes + - MODE_DB2 compounded from other modes + - MODE_MAXDB affect only CREATE TABLE parsing + - MODE_NO_KEY_OPTIONS affect only SHOW + - MODE_NO_TABLE_OPTIONS affect only SHOW + - MODE_NO_FIELD_OPTIONS affect only SHOW + - MODE_MYSQL323 affect only SHOW + - MODE_MYSQL40 affect only SHOW + - MODE_ANSI compounded from other modes + (+ transaction mode) + ? MODE_NO_AUTO_VALUE_ON_ZERO affect UPDATEs + + MODE_NO_BACKSLASH_ESCAPES affect expression parsing + + MODE_EMPTY_STRING_IS_NULL affect expression parsing + */ + thd->variables.sql_mode&= ~(MODE_PIPES_AS_CONCAT | MODE_ANSI_QUOTES | + MODE_IGNORE_SPACE | MODE_NO_BACKSLASH_ESCAPES | + MODE_ORACLE | MODE_EMPTY_STRING_IS_NULL); + }; +}; + + class Abort_on_warning_instant_set { THD *m_thd; diff --git a/sql/sql_schema.h b/sql/sql_schema.h index 7c8f284d526..36914e36990 100644 --- a/sql/sql_schema.h +++ b/sql/sql_schema.h @@ -33,6 +36,34 @@ class Schema { return src; } + + + /** + Find the native function builder associated with a given function name. + @param thd The current thread + @param name The native function name + @return The native function builder associated with the name, or NULL + */ + virtual Create_func *find_native_function_builder(THD *thd, + const LEX_CSTRING &name) + const; + /** + Find a native function builder, return an error if not found, + build an Item otherwise. + */ + Item *make_item_func_call_native(THD *thd, + const Lex_ident_sys &name, + List<Item> *args) const; + + // Builders for native SQL function with a special syntax in sql_yacc.yy + virtual Item *make_item_func_substr(THD *thd, + const Lex_substring_spec_st &spec) const; + + virtual Item *make_item_func_trim(THD *thd, const Lex_trim_st &spec) const; + virtual Item *make_item_func_replace(THD *thd, + Item *subj, + Item *find, + Item *replace) const;
Why no Lex_replace_spec_st ?
/* For now we have *hard-coded* compatibility schemas: schema_mariadb, schema_oracle, schema_maxdb. @@ -66,5 +97,6 @@ class Schema
extern Schema mariadb_schema; +extern const Schema &oracle_schema_ref;
??? why did you declare mariadb_schema extern, but oracle_schema_ref only via a const ref?
#endif // SQL_SCHEMA_H_INCLUDED diff --git a/sql/item_create.h b/sql/item_create.h index 894e9777b8d..3fadaecb090 100644 --- a/sql/item_create.h +++ b/sql/item_create.h @@ -215,9 +205,48 @@ struct Native_func_registry Create_func *builder; };
+ +class Native_functions_hash: public HASH +{ +public: + Native_functions_hash() + { + bzero(this, sizeof(*this)); + } + ~Native_functions_hash() + { + /* + No automatic free. + The the upper level code should call cleanup() explicitly.
why is that?
+ */ + DBUG_ASSERT(!records); + } + bool init(size_t count); + bool init(const Native_func_registry array[], size_t count) + { + return init(count) || append(array); + } + bool append(const Native_func_registry array[]); + bool remove(const Native_func_registry array[]); + void cleanup(); + /** + Find the native function builder associated with a given function name. + @param thd The current thread + @param name The native function name + @return The native function builder associated with the name, or NULL + */ + Create_func *find(THD *thd, const LEX_CSTRING &name) const; +}; + +extern Native_functions_hash native_functions_hash; +extern Native_functions_hash native_functions_hash_oracle_overrides; + +extern const Native_func_registry func_array[]; +extern const size_t func_array_length; + int item_create_init(); -int item_create_append(Native_func_registry array[]); -int item_create_remove(Native_func_registry array[]); +int item_create_append(const Native_func_registry array[]); +int item_create_remove(const Native_func_registry array[]); void item_create_cleanup();
Item *create_func_dyncol_create(THD *thd, List<DYNCALL_CREATE_DEF> &list); diff --git a/sql/sql_schema.cc b/sql/sql_schema.cc index 0bf4a63c2f8..e2993f51c6d 100644 --- a/sql/sql_schema.cc +++ b/sql/sql_schema.cc @@ -78,3 +89,86 @@ Schema *Schema::find_implied(THD *thd) return &maxdb_schema; return &mariadb_schema; } + + +Create_func * +Schema::find_native_function_builder(THD *thd, const LEX_CSTRING &name) const +{ + return native_functions_hash.find(thd, name); +} + + +Create_func * +Schema_oracle::find_native_function_builder(THD *thd, + const LEX_CSTRING &name) const +{ + Create_func *create= native_functions_hash_oracle_overrides.find(thd, name); + if (create) + return create; + return native_functions_hash.find(thd, name);
may be populate native_functions_hash_oracle will all names to do only one lookup here? this will be used often in oracle mode
+} + + +Item *Schema::make_item_func_call_native(THD *thd, + const Lex_ident_sys &name, + List<Item> *args) const +{ + Create_func *builder= find_native_function_builder(thd, name); + if (builder) + return builder->create_func(thd, &name, args); + my_error(ER_FUNCTION_NOT_DEFINED, MYF(0), name.str); + return NULL; +} + + +Item *Schema::make_item_func_trim(THD *thd, const Lex_trim_st &spec) const +{ + return spec.make_item_func_trim_std(thd);
why does trim is constructed via helper in spec, but substr and replace are not?
+} + + +Item *Schema::make_item_func_substr(THD *thd, + const Lex_substring_spec_st &spec) const +{ + return spec.m_for ? + new (thd->mem_root) Item_func_substr(thd, spec.m_subject, spec.m_from, + spec.m_for) : + new (thd->mem_root) Item_func_substr(thd, spec.m_subject, spec.m_from); +} + + +Item *Schema_oracle::make_item_func_substr(THD *thd, + const Lex_substring_spec_st &spec) const +{ + return spec.m_for ? + new (thd->mem_root) Item_func_substr_oracle(thd, spec.m_subject, + spec.m_from, + spec.m_for) : + new (thd->mem_root) Item_func_substr_oracle(thd, spec.m_subject, + spec.m_from); +} + + +Item *Schema_oracle::make_item_func_trim(THD *thd, + const Lex_trim_st &spec) const +{ + return spec.make_item_func_trim_oracle(thd); +} + + +Item *Schema::make_item_func_replace(THD *thd, + Item *subj, + Item *find, + Item *replace) const +{ + return new (thd->mem_root) Item_func_replace(thd, subj, find, replace); +} + + +Item *Schema_oracle::make_item_func_replace(THD *thd, + Item *subj, + Item *find, + Item *replace) const +{ + return new (thd->mem_root) Item_func_replace_oracle(thd, subj, find, replace); +} diff --git a/sql/item_func.h b/sql/item_func.h index 754b1cd1eb2..52bf7f661d1 100644 --- a/sql/item_func.h +++ b/sql/item_func.h @@ -56,8 +56,117 @@ class Item_func :public Item_func_or_sum bool check_argument_types_can_return_text(uint start, uint end) const; bool check_argument_types_can_return_date(uint start, uint end) const; bool check_argument_types_can_return_time(uint start, uint end) const; + + void print_schema_qualified_name(String *to, + const LEX_CSTRING &schema_name, + const char *function_name) const + { + // e.g. oracle_schema.func() + to->append(schema_name); + to->append('.'); + to->append(function_name); + } + + void print_name_with_optional_suffix(String *to, + const char *function_name, + const LEX_CSTRING &suffix) const + { + // e.g. func_oracle() + to->append(function_name); + if (suffix.length) + { + to->append("_"); + to->append(suffix); + } + } + + void print_sql_mode_dependent_name(String *to, enum_query_type query_type, + const Schema &schema, + const char *function_name, + const LEX_CSTRING &suffix) const + { + if (query_type & QT_ITEM_FUNC_FORCE_SCHEMA_NAME) + { + DBUG_ASSERT((query_type & QT_FOR_FRM) == 0); + print_schema_qualified_name(to, schema.name(), function_name); + } + else if (query_type & QT_FOR_FRM) + { + // e.g. substr_oracle() + DBUG_ASSERT((query_type & QT_ITEM_FUNC_FORCE_SCHEMA_NAME) == 0); + print_name_with_optional_suffix(to, function_name, suffix); + } + else if (&schema != Schema::find_implied(current_thd)) + print_schema_qualified_name(to, schema.name(), function_name); + else + to->append(function_name); + } + + void print_sql_mode_dependent_name(String *to, enum_query_type query_type, + const Schema &schema, + const char *function_name) const + { + static const LEX_CSTRING oracle= {STRING_WITH_LEN("oracle")};
can it be oracle_schema.name() instead?
+ static const LEX_CSTRING empty= {NULL, 0}; + const LEX_CSTRING suffix= &schema == &oracle_schema_ref ? oracle : empty; + print_sql_mode_dependent_name(to, query_type, + schema, function_name, suffix); + } + void print_sql_mode_dependent_name(String *to, enum_query_type query_type, + const char *function_name) const + { + print_sql_mode_dependent_name(to, query_type, *schema(), function_name); + } + void print_sql_mode_dependent(String *to, enum_query_type query_type) + { + print_sql_mode_dependent_name(to, query_type, func_name()); + print_args_parenthesized(to, query_type); + } public:
+ // Print an error message for a builtin-schema qualified function call + static void wrong_param_count_error(const LEX_CSTRING &schema_name, + const LEX_CSTRING &func_name); + + // Check that the number of arguments is greater or equal to "expected" + static bool create_check_args_ge(const LEX_CSTRING &name, + const List<Item> *item_list, + uint expected) + { + DBUG_ASSERT(expected > 0); + if (!item_list || item_list->elements < expected) + { + my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), name.str); + return true; + } + return false; + } + + // Check that the number of arguments is less or equal to "expected" + static bool create_check_args_le(const LEX_CSTRING &name, + const List<Item> *item_list, + uint expected) + { + DBUG_ASSERT(expected > 0); + if (!item_list || item_list->elements > expected) + { + my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), name.str); + return true; + } + return false; + } + + // Check that the number of arguments is in the allowed range + static bool create_check_args_between(const LEX_CSTRING &name, + const List<Item> *item_list, + uint min_arg_count, + uint max_arg_count) + { + DBUG_ASSERT(min_arg_count < max_arg_count); + return create_check_args_ge(name, item_list, min_arg_count) || + create_check_args_le(name, item_list, max_arg_count); + } + table_map not_null_tables_cache;
enum Functype { UNKNOWN_FUNC,EQ_FUNC,EQUAL_FUNC,NE_FUNC,LT_FUNC,LE_FUNC, diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy index 3770488f3f4..e0bafc75a36 100644 --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -11235,6 +11236,34 @@ function_call_generic: if (unlikely(!($$= Lex->make_item_func_call_generic(thd, &$1, &$3, &$5, $7)))) MYSQL_YYABORT; } + | ident_cli '.' REPLACE '(' expr ',' expr ',' expr ')' + { + if (unlikely(!($$= Lex->make_item_func_replace(thd, $1, $3, + $5, $7, $9)))) + MYSQL_YYABORT; + } + | ident_cli '.' SUBSTRING '(' substring_operands ')' + { + if (unlikely(!($$= Lex->make_item_func_substr(thd, $1, $3, $5)))) + MYSQL_YYABORT; + } + | ident_cli '.' TRIM '(' trim_operands ')' + { + if (unlikely(!($$= $5.make_item_func_trim(thd, $1, $3)))) + MYSQL_YYABORT; + } + /* + We don't add a qualified syntax for TRIM_ORACLE here, + as this syntax is not absolutely required: + SELECT mariadb_schema.TRIM_ORACLE(..); + What absolutely required is only: + SELECT mariadb_schema.TRIM(..); + Adding a qualified syntax for TRIM_ORACLE would be tricky because + it is a non-reserved keyword. To avoid new shift/reduce conflicts + it would require grammar changes, like introducing a new rule + ident_step2_cli (which would include everything that ident_cli + includes but TRIM_ORACLE).
Did you add a qualified syntax for any other xxx_ORACLE functions?
+ */ ;
fulltext_options: diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc index 7f1b362d91d..6b0a351f76e 100644 --- a/sql/sql_lex.cc +++ b/sql/sql_lex.cc @@ -2008,7 +2023,59 @@ bool Lex_input_stream::get_7bit_or_8bit_ident(THD *thd, uchar *last_char) }
+/* + Resolve special SQL functions that have a qualified syntax in sql_yacc.yy. + These functions are not listed in the native function registry + because of a special syntax, or a reserved keyword: + + mariadb_schema.SUBSTRING('a' FROM 1 FOR 2) -- Special syntax + mariadb_schema.TRIM(BOTH ' ' FROM 'a') -- Special syntax + mariadb_schema.REPLACE('a','b','c') -- Verb keyword +*/ + +int Lex_input_stream::find_keyword_qualified_special_func(Lex_ident_cli_st *str, + uint length) const +{ + /* + There are many other special functions, see the following grammar rules: + function_call_keyword + function_call_nonkeyword + Here we resolve only those that have a qualified syntax to handle + different behavior in different @@sql_mode settings. + + Other special functions do not work in qualified context: + SELECT mariadb_schema.year(now()); -- Function year is not defined + SELECT mariadb_schema.now(); -- Function now is not defined + + We don't resolve TRIM_ORACLE here, because it does not have + a qualified syntax yet. Search for "trim_operands" in sql_yacc.yy + to find more comments. + */ + static LEX_CSTRING funcs[]= + { + {STRING_WITH_LEN("SUBSTRING")}, + {STRING_WITH_LEN("SUBSTR")}, + {STRING_WITH_LEN("TRIM")}, + {STRING_WITH_LEN("REPLACE")} + }; + + int tokval= find_keyword(str, length, true); + if (!tokval) + return 0; + for (size_t i= 0; i < array_elements(funcs); i++) + { + CHARSET_INFO *cs= system_charset_info; + if (!cs->coll->strnncollsp(cs,
may be strcasecmp? or with cs=latin1? you know the repertoire is ascii, so you can cut corners here.
+ (const uchar *) m_tok_start, length, + (const uchar *) funcs[i].str, funcs[i].length)) + return tokval; + } + return 0; +} + + -int Lex_input_stream::scan_ident_sysvar(THD *thd, Lex_ident_cli_st *str) +int Lex_input_stream::scan_ident_common(THD *thd, Lex_ident_cli_st *str, + Ident_mode mode) { uchar last_char; uint length; @@ -8035,30 +8133,49 @@ bool LEX::add_grant_command(THD *thd, enum_sql_command sql_command_arg, }
+const Schema * +LEX::find_func_schema_by_name_or_error(const Lex_ident_sys &schema, + const Lex_ident_sys &func) +{ + Schema *res= Schema::find_by_name(schema); + if (res) + return res; + Database_qualified_name qname(schema, func); + my_error(ER_FUNCTION_NOT_DEFINED, MYF(0), ErrConvDQName(&qname).ptr());
eh, not quite. I can define a function with the same name as a native function in any db. E.g. create test.replace(a int) returns int and it'll be callable as test.replace(1) So ER_FUNCTION_NOT_DEFINED is wrong here. But note that even though test.substr() is possible, it should not allow substr() special syntax.
+ return NULL; +} -Item *LEX::make_item_func_substr(THD *thd, Item *a, Item *b, Item *c) -{ - return (thd->variables.sql_mode & MODE_ORACLE) ? - new (thd->mem_root) Item_func_substr_oracle(thd, a, b, c) : - new (thd->mem_root) Item_func_substr(thd, a, b, c); -}
-Item *LEX::make_item_func_substr(THD *thd, Item *a, Item *b) -{ - return (thd->variables.sql_mode & MODE_ORACLE) ? - new (thd->mem_root) Item_func_substr_oracle(thd, a, b) : - new (thd->mem_root) Item_func_substr(thd, a, b); -} +Item *LEX::make_item_func_substr(THD *thd, + const Lex_ident_cli_st &schema_name_cli, + const Lex_ident_cli_st &func_name_cli, + const Lex_substring_spec_st &spec) +{ + Lex_ident_sys schema_name(thd, &schema_name_cli); + Lex_ident_sys func_name(thd, &func_name_cli); + if (schema_name.is_null() || func_name.is_null()) + return NULL; // EOM + const Schema *schema= find_func_schema_by_name_or_error(schema_name, + func_name); + return schema ? schema->make_item_func_substr(thd, spec) : NULL; +}
Item *LEX::make_item_func_replace(THD *thd, + const Lex_ident_cli_st &schema_name_cli, + const Lex_ident_cli_st &func_name_cli, Item *org, Item *find, Item *replace) { - return (thd->variables.sql_mode & MODE_ORACLE) ? - new (thd->mem_root) Item_func_replace_oracle(thd, org, find, replace) : - new (thd->mem_root) Item_func_replace(thd, org, find, replace); + Lex_ident_sys schema_name(thd, &schema_name_cli); + Lex_ident_sys func_name(thd, &func_name_cli); + if (schema_name.is_null() || func_name.is_null()) + return NULL; // EOM + const Schema *schema= find_func_schema_by_name_or_error(schema_name, + func_name); + return schema ? schema->make_item_func_replace(thd, org, find, replace) : + NULL; }
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hello Sergei, Thanks for your review. Please find a new version here: https://github.com/MariaDB/server/commits/bb-10.4-bar-MDEV-27744-v2 I moved pure refacoting into separate patches, they are in the same branch: MDEV-31184 Remove parser tokens DECODE_MARIADB_SYM and DECODE_ORACLE_SYM MDEV-31187 Add class Sql_mode_save_for_frm_handling So now the main patch is much smaller. On 4/27/23 2:44 AM, Sergei Golubchik wrote:
Hi, Alexander,
A question: if the frm will *always* be created in the canonical, non-ORACLE mode and always parsed in the same mode, will we even need these schema-qualified native functions?
okay, the schema-qualified trick will help to avoid creating REGEX_REPLACE_ORACLE and so on. But is it needed to fix MDEV-27744?
Unfortunately I did not check how much of the patch can be removed to fix only MDEV-27744. My intent was to do everything at the same time: 1. Fix MDEV-27744 2. Prepare infrustructure to avoid adding mew XXX_ORACLE functions 3. Make the code symmetric for oracle_schema and mariadb_schema for simplicity of the patch. 4. Make it possible to call MariaDB functions from inside Oracle stored procedures 5. Make SHOW CREATE TABLE and SHOW CREATE VIEW display MariaDB versions of sql_mode-dependent functions with the mariadb_schema prefix if the current sql_mode is ORACLE. I think this is needed to avoid confusion for the user. But after that if we display MariaDB functions using mariadb_schema, we also need to parse mariadb_schema. Now that we have a customer bug "MDEV-29095 REGEXP_REPLACE ...", I think it's better not to minimize the patch to MDEV-27744. This symmetric solution is more reliable for my opinion.
More comments below
On Apr 27, Alexander Barkov wrote:
revision-id: d67c3f88883 (mariadb-10.3.33-61-gd67c3f88883) parent(s): 7355f7b1f5c author: Alexander Barkov committer: Alexander Barkov timestamp: 2022-04-22 15:35:16 +0400 message:
<cut>
diff --git a/sql/unireg.cc b/sql/unireg.cc index efbb5e773b4..3b37944afa5 100644 --- a/sql/unireg.cc +++ b/sql/unireg.cc @@ -652,6 +649,7 @@ static bool pack_expression(String *buf, Virtual_column_info *vcol, static bool pack_vcols(String *buf, List<Create_field> &create_fields, List<Virtual_column_info> *check_constraint_list) { + Sql_mode_save_for_frm_handling sql_mode_save(current_thd);
well, it was done above pack_vcols(), because the caller had access to the thd. Why not to keep it that way?
I added a THD* parameter to pack_vcols. I think when we have a code line this: change_sql_mode(); call_just_one_function(); restore_sql_mode(); then change/restore should go inside call_just_one_function(). This looks more readable for me. But I won't insist. Please suggest.
List_iterator<Create_field> it(create_fields); Create_field *field;
diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc index f41414f8ae9..0f5c7519d07 100644 --- a/sql/item_cmpfunc.cc +++ b/sql/item_cmpfunc.cc @@ -3292,9 +3292,24 @@ void Item_func_case_simple::print(String *str, enum_query_type query_type) }
+Item_func_decode_oracle * +Item_func_decode_oracle::create(THD *thd, const LEX_CSTRING &name, + List<Item> *item_list) +{ + if (unlikely(!item_list || item_list->elements < 3)) + { + wrong_param_count_error(oracle_schema_ref.name(), name); + return NULL; + } + return new (thd->mem_root) Item_func_decode_oracle(thd, *item_list); +}
why? items are normally created in the Create_xxx::create* methods, defined in item_create.cc. Why did you delegate that to static methods in Item_func_xxx ?
I thought it would be more future proof. It would allow to use very short templates for Create_func_xxx by passing Item_func_xxx as a template parameter. But you're right, this is a separate issue. I removed these Item_func_xxx::create() helper methods for now.
diff --git a/sql/mysqld.h b/sql/mysqld.h index d40e1d170d0..54069c3cd5a 100644 --- a/sql/mysqld.h +++ b/sql/mysqld.h @@ -741,12 +741,25 @@ enum enum_query_type QT_ITEM_SUBSELECT_ID_ONLY,
QT_SHOW_SELECT_NUMBER= (1<<10), + + //QT_ITEM_IDENT_DISABLE_DB_TABLE_NAMES= (1 <<11), -- this is taken in 10.5 + + /// Print sql_mode-dependent functions with the schema qualifier + /// even if the currently implied (by sql_mode) schema is equal to + /// to the function schema, e.g. mariadb_schema.concat('a'). + QT_ITEM_FUNC_FORCE_SCHEMA_NAME= (1 << 12), + + /// Print for FRM file. Focus on parse-back. + /// e.g. VIEW expressions and virtual column expressions + QT_FOR_FRM= (1 << 13), + /// This is used for EXPLAIN EXTENDED extra warnings / Be more detailed /// Be more detailed than QT_EXPLAIN. /// Perhaps we should eventually include QT_ITEM_IDENT_SKIP_CURRENT_DATABASE /// here, as it would give better readable results QT_EXPLAIN_EXTENDED= QT_TO_SYSTEM_CHARSET| - QT_SHOW_SELECT_NUMBER, + QT_SHOW_SELECT_NUMBER| + QT_ITEM_FUNC_FORCE_SCHEMA_NAME,
don't do this. let's try to keep the behavior *exactly as before* for people who don't switch between sql_modes all the time
on the second thought. why do you need QT_ITEM_FUNC_FORCE_SCHEMA_NAME at all? Besides EXPLAIN EXTENDED (where we'd better not use it), it's only used in generating view's sql for I_S. It's not supposed to be used for generating views, so why would it need QT_ITEM_FUNC_FORCE_SCHEMA_NAME? May be it can be deleted altogether?
Fixed. I removed QT_ITEM_FUNC_FORCE_SCHEMA_NAME. Now the patch is only adding QT_FOR_FRM.
// If an expression is constant, print the expression, not the value // it evaluates to. Should be used for error messages, so that they diff --git a/sql/sql_class.h b/sql/sql_class.h index 8268ebe0bcd..6b988938f0b 100644 --- a/sql/sql_class.h +++ b/sql/sql_class.h @@ -6589,6 +6589,50 @@ class Sql_mode_save sql_mode_t old_mode; // SQL mode saved at construction time. };
+ +/* + Save the current sql_mode. Switch off sql_mode flags which can prevent + normal parsing of VIEWs, expressions in generated columns. + Restore the old sql_mode on destructor. +*/ +class Sql_mode_save_for_frm_handling: public Sql_mode_save +{ +public: + Sql_mode_save_for_frm_handling(THD *thd) + :Sql_mode_save(thd) + { + /* + - MODE_REAL_AS_FLOAT affect only CREATE TABLE parsing + + MODE_PIPES_AS_CONCAT affect expression parsing + + MODE_ANSI_QUOTES affect expression parsing + + MODE_IGNORE_SPACE affect expression parsing + - MODE_IGNORE_BAD_TABLE_OPTIONS affect only CREATE/ALTER TABLE parsing + * MODE_ONLY_FULL_GROUP_BY affect execution + * MODE_NO_UNSIGNED_SUBTRACTION affect execution + - MODE_NO_DIR_IN_CREATE affect table creation only + - MODE_POSTGRESQL compounded from other modes + - MODE_ORACLE affects Item creation (e.g for CONCAT)
should be '+' for MODE_ORACLE, not '-', right?
Fixed.
diff --git a/sql/sql_schema.h b/sql/sql_schema.h index 7c8f284d526..36914e36990 100644 --- a/sql/sql_schema.h +++ b/sql/sql_schema.h @@ -33,6 +36,34 @@ class Schema { return src; } + + + /** + Find the native function builder associated with a given function name. + @param thd The current thread + @param name The native function name + @return The native function builder associated with the name, or NULL + */ + virtual Create_func *find_native_function_builder(THD *thd, + const LEX_CSTRING &name) + const; + /** + Find a native function builder, return an error if not found, + build an Item otherwise. + */ + Item *make_item_func_call_native(THD *thd, + const Lex_ident_sys &name, + List<Item> *args) const; + + // Builders for native SQL function with a special syntax in sql_yacc.yy + virtual Item *make_item_func_substr(THD *thd, + const Lex_substring_spec_st &spec) const; + + virtual Item *make_item_func_trim(THD *thd, const Lex_trim_st &spec) const; + virtual Item *make_item_func_replace(THD *thd, + Item *subj, + Item *find, + Item *replace) const;
Why no Lex_replace_spec_st ?
The intent for adding Lex_substring_spec_st and Lex_trim_spec_st was to move the grammar into separate rules: substring_operands: trim_operands: REPLACE operands on the other hand have a very simple notation: | REPLACE '(' expr ',' expr ',' expr ')' I didn't find useful to add Lex_replace_spec_st.
/* For now we have *hard-coded* compatibility schemas: schema_mariadb, schema_oracle, schema_maxdb. @@ -66,5 +97,6 @@ class Schema
extern Schema mariadb_schema; +extern const Schema &oracle_schema_ref;
??? why did you declare mariadb_schema extern, but oracle_schema_ref only via a const ref?
The internal structure of class Schema_oracle is visible only inside sql_schema.cc. I did not want to expose it to sql_schema.h. It looks like compatibility modes like ORACLE can in the future very easily go into a new type of plugins - compatibility plugins, and oracle_schema_ref plays a role of the entry point of such a plugin. So I made it this way intentionally, to avoid Schema_oracle specific methods or members to be visible outside.
#endif // SQL_SCHEMA_H_INCLUDED diff --git a/sql/item_create.h b/sql/item_create.h index 894e9777b8d..3fadaecb090 100644 --- a/sql/item_create.h +++ b/sql/item_create.h @@ -215,9 +205,48 @@ struct Native_func_registry Create_func *builder; };
+ +class Native_functions_hash: public HASH +{ +public: + Native_functions_hash() + { + bzero(this, sizeof(*this)); + } + ~Native_functions_hash() + { + /* + No automatic free. + The the upper level code should call cleanup() explicitly.
why is that?
As discussed on slack - instances of this class are global variables and they destruct at the server shutdown time only. At this point mysys can already be deinitialized.
Item *create_func_dyncol_create(THD *thd, List<DYNCALL_CREATE_DEF> &list); diff --git a/sql/sql_schema.cc b/sql/sql_schema.cc index 0bf4a63c2f8..e2993f51c6d 100644 --- a/sql/sql_schema.cc +++ b/sql/sql_schema.cc @@ -78,3 +89,86 @@ Schema *Schema::find_implied(THD *thd) return &maxdb_schema; return &mariadb_schema; } + + +Create_func * +Schema::find_native_function_builder(THD *thd, const LEX_CSTRING &name) const +{ + return native_functions_hash.find(thd, name); +} + + +Create_func * +Schema_oracle::find_native_function_builder(THD *thd, + const LEX_CSTRING &name) const +{ + Create_func *create= native_functions_hash_oracle_overrides.find(thd, name); + if (create) + return create; + return native_functions_hash.find(thd, name);
may be populate native_functions_hash_oracle will all names to do only one lookup here? this will be used often in oracle mode
Done.
+} + + +Item *Schema::make_item_func_call_native(THD *thd, + const Lex_ident_sys &name, + List<Item> *args) const +{ + Create_func *builder= find_native_function_builder(thd, name); + if (builder) + return builder->create_func(thd, &name, args); + my_error(ER_FUNCTION_NOT_DEFINED, MYF(0), name.str); + return NULL; +} + + +Item *Schema::make_item_func_trim(THD *thd, const Lex_trim_st &spec) const +{ + return spec.make_item_func_trim_std(thd);
why does trim is constructed via helper in spec, but substr and replace are not?
SUBSTR and TRIM are constructed using helpers: virtual Item *make_item_func_substr(THD *thd, const Lex_substring_spec_st &spec) const; virtual Item *make_item_func_trim(THD *thd, const Lex_trim_st &spec) const; REPLACE is not - because it does not have a helper (explained above). <cut>
diff --git a/sql/item_func.h b/sql/item_func.h index 754b1cd1eb2..52bf7f661d1 100644 --- a/sql/item_func.h +++ b/sql/item_func.h
+ void print_sql_mode_dependent_name(String *to, enum_query_type query_type, + const Schema &schema, + const char *function_name) const + { + static const LEX_CSTRING oracle= {STRING_WITH_LEN("oracle")};
can it be oracle_schema.name() instead?
Unfortunately, it can not. oracle_schema.name() is "oracle_schema", it's not "oracle". Btw, this code does not exists any more. I changed a few Item_func_xxx_oracle classes to do it like this: void print(String *str, enum_query_type query_type) { if (query_type & QT_FOR_FRM) str->append(STRING_WITH_LEN("lpad_oracle")); else print_sql_mode_qualified_name(str, query_type); print_args_parenthesized(str, query_type); } Oh, by the way, it's better to add a comment here if (query_type & QT_FOR_FRM) { // 10.3 FRM downgrade compatibility str->append(STRING_WITH_LEN("lpad_oracle")); } I find this way more clear to follow. This way we can see which functions support backward FRM compatibility, and which functions do not. Item_func_regexp_replace_oracle will not support. But there is now a similar code in here: void Item_func_trim::print(String *str, enum_query_type query_type) { LEX_CSTRING suffix= {STRING_WITH_LEN("_oracle")}; Now it handles both sql_mode=DEFAULT and sql_mode=ORACLE. Should I split out Oracle-specific code into a new overriding methods Item_func_trim_oracle::print()? So it will pass "trim_oracle" as a parameter to a shared function, while Item_func_trim::print() won't have any references to "oracle" in the body.
diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy index 3770488f3f4..e0bafc75a36 100644 --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -11235,6 +11236,34 @@ function_call_generic: if (unlikely(!($$= Lex->make_item_func_call_generic(thd, &$1, &$3, &$5, $7)))) MYSQL_YYABORT; } + | ident_cli '.' REPLACE '(' expr ',' expr ',' expr ')' + { + if (unlikely(!($$= Lex->make_item_func_replace(thd, $1, $3, + $5, $7, $9)))) + MYSQL_YYABORT; + } + | ident_cli '.' SUBSTRING '(' substring_operands ')' + { + if (unlikely(!($$= Lex->make_item_func_substr(thd, $1, $3, $5)))) + MYSQL_YYABORT; + } + | ident_cli '.' TRIM '(' trim_operands ')' + { + if (unlikely(!($$= $5.make_item_func_trim(thd, $1, $3)))) + MYSQL_YYABORT; + } + /* + We don't add a qualified syntax for TRIM_ORACLE here, + as this syntax is not absolutely required: + SELECT mariadb_schema.TRIM_ORACLE(..); + What absolutely required is only: + SELECT mariadb_schema.TRIM(..); + Adding a qualified syntax for TRIM_ORACLE would be tricky because + it is a non-reserved keyword. To avoid new shift/reduce conflicts + it would require grammar changes, like introducing a new rule + ident_step2_cli (which would include everything that ident_cli + includes but TRIM_ORACLE).
Did you add a qualified syntax for any other xxx_ORACLE functions?
Functions defined through "const Native_func_registry func_array[]" support qualifiers automatically. All this work: select mariadb_schema.decode_oracle(1,1,'11'); select mariadb_schema.substr_oracle('abcde',2,3); select mariadb_schema.replace_oracle('abc','a','b'); select oracle_schema.decode_oracle(1,1,'11'); select oracle_schema.substr_oracle('abcde',2,3); select oracle_schema.replace_oracle('abc','a','b'); TRIM_ORACLE is different - it's a keyword defined in lex.h. It does not support qualifiers automatically. And I don't think it needs to.
+ */ ;
fulltext_options: diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc index 7f1b362d91d..6b0a351f76e 100644 --- a/sql/sql_lex.cc +++ b/sql/sql_lex.cc @@ -2008,7 +2023,59 @@ bool Lex_input_stream::get_7bit_or_8bit_ident(THD *thd, uchar *last_char) }
+/* + Resolve special SQL functions that have a qualified syntax in sql_yacc.yy. + These functions are not listed in the native function registry + because of a special syntax, or a reserved keyword: + + mariadb_schema.SUBSTRING('a' FROM 1 FOR 2) -- Special syntax + mariadb_schema.TRIM(BOTH ' ' FROM 'a') -- Special syntax + mariadb_schema.REPLACE('a','b','c') -- Verb keyword +*/ + +int Lex_input_stream::find_keyword_qualified_special_func(Lex_ident_cli_st *str, + uint length) const +{ + /* + There are many other special functions, see the following grammar rules: + function_call_keyword + function_call_nonkeyword + Here we resolve only those that have a qualified syntax to handle + different behavior in different @@sql_mode settings. + + Other special functions do not work in qualified context: + SELECT mariadb_schema.year(now()); -- Function year is not defined + SELECT mariadb_schema.now(); -- Function now is not defined + + We don't resolve TRIM_ORACLE here, because it does not have + a qualified syntax yet. Search for "trim_operands" in sql_yacc.yy + to find more comments. + */ + static LEX_CSTRING funcs[]= + { + {STRING_WITH_LEN("SUBSTRING")}, + {STRING_WITH_LEN("SUBSTR")}, + {STRING_WITH_LEN("TRIM")}, + {STRING_WITH_LEN("REPLACE")} + }; + + int tokval= find_keyword(str, length, true); + if (!tokval) + return 0; + for (size_t i= 0; i < array_elements(funcs); i++) + { + CHARSET_INFO *cs= system_charset_info; + if (!cs->coll->strnncollsp(cs,
may be strcasecmp? or with cs=latin1? you know the repertoire is ascii, so you can cut corners here.
+ (const uchar *) m_tok_start, length, + (const uchar *) funcs[i].str, funcs[i].length)) + return tokval; + }
Fixed to additionally check length equality to avoid non-ASCII letters matching. Note, strcasecmp() cannot be used here because one side it not a 0-terminated string.
+ return 0; +} + + -int Lex_input_stream::scan_ident_sysvar(THD *thd, Lex_ident_cli_st *str) +int Lex_input_stream::scan_ident_common(THD *thd, Lex_ident_cli_st *str, + Ident_mode mode) { uchar last_char; uint length; @@ -8035,30 +8133,49 @@ bool LEX::add_grant_command(THD *thd, enum_sql_command sql_command_arg, }
+const Schema * +LEX::find_func_schema_by_name_or_error(const Lex_ident_sys &schema, + const Lex_ident_sys &func) +{ + Schema *res= Schema::find_by_name(schema); + if (res) + return res; + Database_qualified_name qname(schema, func); + my_error(ER_FUNCTION_NOT_DEFINED, MYF(0), ErrConvDQName(&qname).ptr());
eh, not quite. I can define a function with the same name as a native function in any db. E.g.
create test.replace(a int) returns int
and it'll be callable as test.replace(1) So ER_FUNCTION_NOT_DEFINED is wrong here.
But note that even though test.substr() is possible, it should not allow substr() special syntax.
Good catch, I overlooked this. From a glance, a workaround could be to use test.`substr`(). But it's better to provide backward compatibility. I don't have a quick idea how to fix this properly. If you have, please let me know. <cut>
Hello Sergei, On 5/4/23 4:20 PM, Alexander Barkov wrote: <cut>
eh, not quite. I can define a function with the same name as a native function in any db. E.g.
create test.replace(a int) returns int
and it'll be callable as test.replace(1) So ER_FUNCTION_NOT_DEFINED is wrong here.
But note that even though test.substr() is possible, it should not allow substr() special syntax.
Good catch, I overlooked this.
From a glance, a workaround could be to use test.`substr`(). But it's better to provide backward compatibility.
I don't have a quick idea how to fix this properly. If you have, please let me know.
I fixed this issue. A new patch version is here: https://github.com/MariaDB/server/commits/bb-10.4-bar-MDEV-27744-v2 Thanks.
<cut>
participants (2)
-
Alexander Barkov
-
Sergei Golubchik