Re: 02b85f1cf9b: MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
Hi, Alexander, Phew. It was a big patch. Mostly mechanical changes, admittedly, but still a lot to read. Generally I quite liked it, great work! Shorter way to write an identifier comparison, type safety that enforces namespaces and correct collations. Nice. But there are, of course, comments and questions, please, see below. On Dec 10, Alexander Barkov wrote:
commit 02b85f1cf9b Author: Alexander Barkov <bar@mariadb.com> Date: Wed Apr 26 15:27:01 2023 +0400
MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER
- Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function.
why would you want it as a virtual function? What is collation-dependent there?
- Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implements accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collation use Unicode-14.0.0 casefolding data.
The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp()
There is only a difference in a couple dozen rare characters, because: - the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolarable. See the list of affected characters in the MDEV description.
Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters! Unlike utf8mb4_general_ci, it does not treat all BMP characters as equal.
- Adding classes representing names of the file based database objects:
Lex_ident_db Lex_ident_table Lex_ident_trigger
Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.
- Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci:
Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var
I don't think Lex_ident_sp_var is a separate domain, it's the same as Lex_ident_column. Both logically: select a from t1 where b > c (here you don't know what's a column and what's an sp var) and also in the code, in fill_spvar_definition() you yourself cast from Lex_ident_sp_var to Lex_ident_column, which kind of defeats the purpose of having two different types and emphasizes that it's actually just one and the same.
Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_table_option_name Lex_ident_table_option_value
option value is a string, not an ident
Lex_ident_charset
- All the mentioned Lex_ident_xxx classes implement a method streq():
if (ident1.streq(ident2)) do_equal();
This method works as a wrapper for CHARSET_INFO::streq().
- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters.
- Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) to ident1.streq(ident2)
- Taking advantage of the c++11 user defined literal operator for Lex_ident_xxx data type. See lex_ident.h. Use example:
const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;
is now a shorter version of:
const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
Of all the things in your patch this will be the first thing that Monty will remove when he'll stumble upon it. Could you instead implement a constructor that understands const Lex_ident_column primary_key_name= {STRING_WITH_LEN("PRIMARY")};
diff --git a/include/m_ctype.h b/include/m_ctype.h index 9d13989d1fe..e5c5145b256 100644 --- a/include/m_ctype.h +++ b/include/m_ctype.h @@ -997,6 +1006,18 @@ struct charset_info_st return state & MY_CS_COMPILED; }
+ /* + There will be a separate virtual function streq() in + MY_COLLATION_HANDLER eventually. + */ + my_bool streq(const LEX_CSTRING a, const LEX_CSTRING b) const + { + DBUG_ASSERT(is_valid_comparison_operand(a)); + DBUG_ASSERT(is_valid_comparison_operand(b)); + return 0 == (coll->strnncoll)(this, (const uchar *) a.str, a.length, + (const uchar *) b.str, b.length, FALSE);
as you have also added strnncoll helper, you can write the above as return 0 == strnncoll(a, b); also, perhaps you should move these asserts into strnncoll helper?
+ } + int strnncoll(const uchar *a, size_t alen, const uchar *b, size_t blen, my_bool b_is_prefix= FALSE) const { diff --git a/mysql-test/main/ctype_utf8mb4_general1400_as_ci_casefold.result b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_casefold.result new file mode 100644 index 00000000000..c6ccf662d16 --- /dev/null +++ b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_casefold.result @@ -0,0 +1,1446 @@ +# +# Start of 11.4 tests +# +# +# MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() +# +SET NAMES utf8mb4 COLLATE utf8mb4_general1400_as_ci; +EXECUTE IMMEDIATE SFORMAT(' +CREATE VIEW v_bmp AS +SELECT + seq AS codepoint, + LPAD(HEX(seq),4,''0'') AS codepoint_hex4, + CONVERT(CHAR(seq USING utf32) USING {}) COLLATE {} AS c +FROM + seq_0_to_65535', @@character_set_connection, @@collation_connection); +SELECT COLLATION(c) FROM v_bmp LIMIT 1; +COLLATION(c) +utf8mb4_general1400_as_ci +SELECT +codepoint_hex4, +HEX(CAST(LOWER(c) AS CHAR CHARACTER SET ucs2)), +HEX(CAST(UPPER(c) AS CHAR CHARACTER SET ucs2)) +FROM v_bmp +WHERE BINARY(c)<>BINARY(LOWER(c)) OR BINARY(c)<>BINARY(UPPER(c)); +codepoint_hex4 HEX(CAST(LOWER(c) AS CHAR CHARACTER SET ucs2)) HEX(CAST(UPPER(c) AS CHAR CHARACTER SET ucs2))
what is the point of this table? detect changes when going to the next unicode version?
+0041 0061 0041 ... +# +# End of 11.4 tests +# diff --git a/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.result b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.result new file mode 100644 index 00000000000..49f43777feb --- /dev/null +++ b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.result diff --git a/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.test b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.test new file mode 100644 index 00000000000..b5eb369ae9a --- /dev/null +++ b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.test @@ -0,0 +1,39 @@ +--source include/have_utf32.inc +--source include/have_sequence.inc + +--echo # +--echo # Start of 11.1 tests +--echo # + +--echo # +--echo # Collation utfmb4_general1400_as_ci +--echo # + +SET NAMES utf8mb4 COLLATE utf8mb4_general1400_as_ci; + +EXECUTE IMMEDIATE SFORMAT(' +CREATE VIEW v_bmp AS +SELECT + seq AS codepoint, + LPAD(HEX(seq),6,''0'') AS codepoint_hex6, + CONVERT(CHAR(seq USING utf32) USING {}) COLLATE {} AS c +FROM + seq_0_to_1114111', @@character_set_connection, @@collation_connection); + +SELECT COLLATION(c) FROM v_bmp LIMIT 1; + +SELECT + SUM(codepoint_hex6=HEX(LOWER(c))) AS count_bmp_weight_is_lower, + SUM(codepoint_hex6<>HEX(LOWER(c))) AS count_bmp_weight_is_not_lower +FROM v_bmp; + +SELECT codepoint_hex6,HEX(WEIGHT_STRING(c)) +FROM v_bmp +WHERE codepoint_hex6<>HEX(WEIGHT_STRING(c)); + +DROP VIEW v_bmp; + + +--echo # +--echo # End of 11.1 tests
11.4 ?
+--echo # diff --git a/mysql-test/suite/compat/oracle/r/lower_case_table_names.result b/mysql-test/suite/compat/oracle/r/lower_case_table_names.result new file mode 100644 index 00000000000..2518be31402 --- /dev/null +++ b/mysql-test/suite/compat/oracle/r/lower_case_table_names.result @@ -0,0 +1,9 @@ +SELECT @@lower_case_table_names; +@@lower_case_table_names +1 +# +# TODO: Report this problem as MDEV
what is the problem? isn't the comparison accent-sensitive?
+# +SET NAMES utf8; +CREATE TABLE t1 (a öracle_schema.date); +ERROR HY000: Unknown data type: 'öracle_schema.date' diff --git a/mysql-test/suite/unit/disabled.def b/mysql-test/suite/unit/disabled.def new file mode 100644 index 00000000000..3a68bea32d0 --- /dev/null +++ b/mysql-test/suite/unit/disabled.def @@ -0,0 +1,12 @@ +############################################################################## +# +# List the test cases that are to be disabled temporarily. +# +# Separate the test case name and the comment with ':'. +# +# <testcasename> : BUG#<xxxx> <date disabled> <disabler> <comment> +# +# Do not use any TAB characters for whitespace. +# +############################################################################## +conc_charset : TODO: client side ID for utf8mb4_tolower_ci
1. There is no utf8mb4_tolower_ci 2. what do you mean "client side ID"?
diff --git a/sql/ddl_log.cc b/sql/ddl_log.cc index 61c8f50d1e3..fdfb605a57f 100644 --- a/sql/ddl_log.cc +++ b/sql/ddl_log.cc @@ -1146,22 +1146,23 @@ static void execute_rename_table(DDL_LOG_ENTRY *ddl_log_entry, handler *file, static void rename_triggers(THD *thd, DDL_LOG_ENTRY *ddl_log_entry, bool swap_tables) { - LEX_CSTRING to_table, from_table, to_db, from_db, from_converted_name; + Lex_ident_table to_table, from_table, from_converted_name; + Lex_ident_db to_db, from_db; char to_path[FN_REFLEN+1], from_path[FN_REFLEN+1], conv_path[FN_REFLEN+1];
if (!swap_tables) { - from_db= ddl_log_entry->db; - from_table= ddl_log_entry->name; - to_db= ddl_log_entry->from_db; - to_table= ddl_log_entry->from_name; + from_db= Lex_ident_db(ddl_log_entry->db);
this would look better with implicit casting, that is, if you'd have an assignment operator from LEX_CSTRING
+ from_table= Lex_ident_table(ddl_log_entry->name); + to_db= Lex_ident_db(ddl_log_entry->from_db); + to_table= Lex_ident_table(ddl_log_entry->from_name); } else { - from_db= ddl_log_entry->from_db; - from_table= ddl_log_entry->from_name; - to_db= ddl_log_entry->db; - to_table= ddl_log_entry->extra_name; + from_db= Lex_ident_db(ddl_log_entry->from_db); + from_table= Lex_ident_table(ddl_log_entry->from_name); + to_db= Lex_ident_db(ddl_log_entry->db); + to_table= Lex_ident_table(ddl_log_entry->extra_name); }
build_filename_and_delete_tmp_file(from_path, sizeof(from_path), @@ -1211,11 +1212,11 @@ static void rename_triggers(THD *thd, DDL_LOG_ENTRY *ddl_log_entry,
(void) Table_triggers_list::prepare_for_rename(thd, &trigger_param, - &from_db, - &from_table, - &from_converted_name, - &to_db, - &to_table); + from_db, + from_table, + from_converted_name, + to_db, + to_table);
You rather inconsistently replace pointers with const references in some methods, like above, but not in others, like below or in the next hunk.
(void) Table_triggers_list::change_table_name(thd, &trigger_param, &from_db, @@ -1707,9 +1708,8 @@ static int ddl_log_execute_action(THD *thd, MEM_ROOT *mem_root,
switch (ddl_log_entry->phase) { case DDL_DROP_DB_PHASE_INIT: - drop_database_objects(thd, &path, &db, - !my_strcasecmp(system_charset_info, - MYSQL_SCHEMA_NAME.str, db.str)); + // QQ: was a bug here?
This comment makes little sense in the code, as one cannot see what *was* here before. Just drop it.
+ drop_database_objects(thd, &path, &db, MYSQL_SCHEMA_NAME.streq(db));
strxnmov(to_path, sizeof(to_path)-1, path.str, MY_DB_OPT_FILE, NullS); mysql_file_delete_with_symlink(key_file_misc, to_path, "", MYF(0)); diff --git a/sql/debug_sync.cc b/sql/debug_sync.cc index 362463a7f22..4704691615a 100644 --- a/sql/debug_sync.cc +++ b/sql/debug_sync.cc @@ -858,6 +858,13 @@ static st_debug_sync_action *debug_sync_get_action(THD *thd, }
+class Debug_token: public Lex_ident_ci +{ +public: + using Lex_ident_ci::Lex_ident_ci; +};
I wonder if you can create those types with less boilerplate will typedef work? will using work? if not, may be, like template<> Lex_ident_templ: Lex_ident_ci { ... } Lex_ident_templ<> Debug_token;
+ + /** Set a debug sync action.
@@ -909,7 +916,8 @@ static bool debug_sync_set_action(THD *thd, st_debug_sync_action *action) } else { - const char *dsp_name= action->sync_point.c_ptr(); + const Debug_token dsp_name(action->sync_point.c_ptr(), + action->sync_point.length());
or dsp_name(action->sync_point->to_lex_cstring()) ?
#ifdef DBUG_TRACE DBUG_EXECUTE("debug_sync", { /* Functions as DBUG_PRINT args can change keyword and line nr. */ diff --git a/sql/grant.cc b/sql/grant.cc index 1ba197bc1d9..0ebf47431bc 100644 --- a/sql/grant.cc +++ b/sql/grant.cc @@ -31,8 +31,8 @@ bool Grant_privilege::add_column_privilege(THD *thd, class LEX_COLUMN *point; while ((point=iter++)) { - if (!my_strcasecmp(system_charset_info, - point->column.c_ptr(), new_str->c_ptr())) + const Lex_ident_column tmp(point->column.c_ptr(), point->column.length());
That feels like an overkill, creating Lex_ident_column just to compare two strings. What is it compiled to at -O3?
+ if (tmp.streq(new_str->to_lex_cstring())) break; } m_column_privilege_total|= which_grant; diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc index a615d26b9a9..fc840c21191 100644 --- a/sql/ha_partition.cc +++ b/sql/ha_partition.cc @@ -118,7 +118,7 @@ int ha_partition::notify_tabledef_changed(LEX_CSTRING *db, LEX_CSTRING table_name; const char *table_name_ptr; if (create_partition_name(from_buff, sizeof(from_buff), - from_path, name_buffer_ptr, + from_path, Lex_cstring_strlen(name_buffer_ptr),
Don't, please. You _always_ use Lex_cstring_strlen, in _all_ invocations of create_partition_name(). Just pass a char* down as before and do strlen inside, if you need it.
NORMAL_PART_NAME, FALSE)) res=1; table_name_ptr= from_buff + dirname_length(from_buff); diff --git a/sql/handler.cc b/sql/handler.cc index a977054d0da..7b9d11d00a2 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -7531,7 +7528,8 @@ int handler::check_duplicate_long_entry_key(const uchar *new_rec, uint key_no) else { Item_func_left *fnc= static_cast<Item_func_left *>(arguments[j]); - DBUG_ASSERT(!my_strcasecmp(system_charset_info, "left", fnc->func_name())); + DBUG_ASSERT(Lex_ident_func(fnc->func_name_cstring()).
eh, why func_name_cstring() doesn't return Lex_ident_func?
+ streq("left"_Lex_cstring)); DBUG_ASSERT(fnc->arguments()[0]->type() == Item::FIELD_ITEM); t_field= static_cast<Item_field *>(fnc->arguments()[0])->field; uint length= (uint)fnc->arguments()[1]->val_int(); diff --git a/sql/item.cc b/sql/item.cc index 5c4f1704ebe..5689756eca9 100644 --- a/sql/item.cc +++ b/sql/item.cc @@ -3463,13 +3463,11 @@ bool Item_field::eq(const Item *item, bool binary_cmp) const (In cases where we would choose wrong we would have to generate a ER_NON_UNIQ_ERROR). */ - return (!lex_string_cmp(system_charset_info, &item_field->name, - &field_name) && - (!item_field->table_name.str || !table_name.str || - (!my_strcasecmp(table_alias_charset, item_field->table_name.str, - table_name.str) && - (!item_field->db_name.str || !db_name.str || - (item_field->db_name.str && !strcmp(item_field->db_name.str, + return (item_field->name.streq(field_name) && + (!item_field->table_name.str || !table_name.str || + (item_field->table_name.streq(table_name) && + (!item_field->db_name.str || !db_name.str || + (item_field->db_name.str && !strcmp(item_field->db_name.str, db_name.str))))));
I presume, # db_name should also be compared with streq
}
diff --git a/sql/item_create.cc b/sql/item_create.cc index 4f0df8f732d..501db19c1b8 100644 --- a/sql/item_create.cc +++ b/sql/item_create.cc @@ -2638,7 +2639,8 @@ Create_qfunc::create_func(THD *thd, const LEX_CSTRING *name, if (thd->lex->copy_db_to(&db)) return NULL;
- return create_with_db(thd, &db, name, false, item_list);
Again, can LEX_CSTRING be implicitly cast to the correct Lex_ident variant?
+ return create_with_db(thd, Lex_ident_db(db), Lex_ident_func(*name), + false, item_list); }
diff --git a/sql/item_windowfunc.cc b/sql/item_windowfunc.cc index 5438cade27a..f0a5737bdf7 100644 --- a/sql/item_windowfunc.cc +++ b/sql/item_windowfunc.cc @@ -28,7 +28,7 @@ Item_window_func::resolve_window_name(THD *thd) return false; } DBUG_ASSERT(window_name != NULL && window_spec == NULL); - const char *ref_name= window_name->str; + const LEX_CSTRING &ref_name= *window_name;
Why LEX_CSTRING ?
/* !TODO: Add the code to resolve ref_name in outer queries */ /* diff --git a/sql/opt_trace.cc b/sql/opt_trace.cc index 1008690ccaa..dbb2708f487 100644 --- a/sql/opt_trace.cc +++ b/sql/opt_trace.cc @@ -43,7 +43,7 @@ bool list_has_optimizer_trace_table(const TABLE_LIST *tbl) for (; tbl; tbl= tbl->next_global) { if (tbl->schema_table && - 0 == strcmp(tbl->schema_table->table_name, I_S_table_name)) + 0 == strcmp(tbl->schema_table->table_name.str, I_S_table_name))
Why not streq?
return true; } return false; @@ -190,9 +190,8 @@ void opt_trace_disable_if_no_security_context_access(THD *thd) if (!(thd->main_security_ctx.check_access(GLOBAL_ACLS & ~GRANT_ACL)) && (0 != strcmp(thd->main_security_ctx.priv_user, thd->security_context()->priv_user) || - 0 != my_strcasecmp(system_charset_info, - thd->main_security_ctx.priv_host, - thd->security_context()->priv_host))) + !Lex_ident_host(Lex_cstring_strlen(thd->main_security_ctx.priv_host)). + streq(Lex_cstring_strlen(thd->security_context()->priv_host))))
Doesn't really look like an improvement :) May be you can get rid of intermediate Lex_cstring_strlen()'s ?
trace->missing_privilege(); }
diff --git a/sql/partition_info.cc b/sql/partition_info.cc index 1b65de6e3c0..7b76c6d65b1 100644 --- a/sql/partition_info.cc +++ b/sql/partition_info.cc @@ -1403,8 +1397,8 @@ bool partition_info::check_partition_info(THD *thd, handlerton **eng_type, num_parts_not_set++; part_elem->engine_type= default_engine_type; } - if (check_table_name(part_elem->partition_name, - strlen(part_elem->partition_name), FALSE)) + if (check_table_name(part_elem->partition_name.str, + part_elem->partition_name.length, FALSE))
better change check_table_name to accept a LEX_CSTRING or Lex_ident, it'll simplify almost all (except one) invocations of it
{ my_error(ER_WRONG_PARTITION_NAME, MYF(0)); goto end; @@ -2604,8 +2596,8 @@ bool partition_info::has_same_partitioning(partition_info *new_part_info) part_state must be PART_NORMAL! */ if (!part_elem || !new_part_elem || - strcmp(part_elem->partition_name, - new_part_elem->partition_name) || + strcmp(part_elem->partition_name.str, + new_part_elem->partition_name.str) ||
why strcmp?
part_elem->part_state != PART_NORMAL || new_part_elem->part_state != PART_NORMAL || part_elem->max_value != new_part_elem->max_value || @@ -2666,8 +2658,8 @@ bool partition_info::has_same_partitioning(partition_info *new_part_info) sub_part_elem->engine_type != new_sub_part_elem->engine_type) DBUG_RETURN(false);
- if (strcmp(sub_part_elem->partition_name, - new_sub_part_elem->partition_name) || + if (strcmp(sub_part_elem->partition_name.str, + new_sub_part_elem->partition_name.str) ||
same here
sub_part_elem->part_state != PART_NORMAL || new_sub_part_elem->part_state != PART_NORMAL || sub_part_elem->part_min_rows != diff --git a/sql/partition_info.h b/sql/partition_info.h index 3a8c3a37524..92e8906274f 100644 --- a/sql/partition_info.h +++ b/sql/partition_info.h @@ -526,11 +527,13 @@ void partition_info::vers_update_el_ids() }
-inline -bool make_partition_name(char *move_ptr, uint i) +static inline +Lex_ident_partition make_partition_name_ls(char *move_ptr, uint i)
it's not _ls really, is it? why not to keep the old name?
{ int res= snprintf(move_ptr, MAX_PART_NAME_SIZE + 1, "p%u", i); - return res < 0 || res > MAX_PART_NAME_SIZE; + return res < 0 || res > MAX_PART_NAME_SIZE ? + Lex_ident_partition() : + Lex_ident_partition(move_ptr, (size_t) res); }
@@ -549,15 +552,16 @@ uint partition_info::next_part_no(uint new_parts) const for (uint cur_part= 0; cur_part < new_parts; ++cur_part, ++suffix) { uint32 cur_suffix= suffix; - if (make_partition_name(part_name, suffix)) + Lex_ident_partition part_name_ls(make_partition_name_ls(part_name, suffix)); + if (!part_name_ls.str) return 0; partition_element *el; it.rewind(); while ((el= it++)) { - if (0 == my_strcasecmp(&my_charset_latin1, el->partition_name, part_name)) + if (el->partition_name.streq(part_name_ls)) { - if (make_partition_name(part_name, ++suffix)) + if (!(part_name_ls= make_partition_name_ls(part_name, ++suffix)).str)
you don't seem to use part_name_ls below
return 0; it.rewind(); } diff --git a/sql/set_var.h b/sql/set_var.h index aed4955ef62..8f7e3609f5f 100644 --- a/sql/set_var.h +++ b/sql/set_var.h @@ -383,7 +383,7 @@ class set_var_default_role: public set_var_base { LEX_USER *user, *real_user; LEX_CSTRING role; - const char *real_role; + LEX_CSTRING real_role;
Not Lex_ident_user ?
public: set_var_default_role(LEX_USER *user_arg, LEX_CSTRING role_arg) : user(user_arg), role(role_arg) {} diff --git a/sql/slave.cc b/sql/slave.cc index 0a15cc55146..daeb05f423e 100644 --- a/sql/slave.cc +++ b/sql/slave.cc @@ -3403,8 +3403,9 @@ static bool send_show_master_info_data(THD *thd, Master_info *mi, bool full, static int cmp_mi_by_name(const Master_info **arg1, const Master_info **arg2) { - return my_strcasecmp(system_charset_info, (*arg1)->connection_name.str, - (*arg2)->connection_name.str); + return Lex_ident_master_info::charset_info()->strnncoll( + (*arg1)->connection_name, + (*arg2)->connection_name);
Why not (*arg1)->connection_name.streq() ?
}
diff --git a/sql/sp_pcontext.cc b/sql/sp_pcontext.cc index 848d1f0c655..761f535832d 100644 --- a/sql/sp_pcontext.cc +++ b/sql/sp_pcontext.cc @@ -139,8 +139,7 @@ sp_pcontext *sp_pcontext::push_context(THD *thd, sp_pcontext::enum_scope scope)
bool cmp_labels(sp_label *a, sp_label *b) { - return (lex_string_cmp(system_charset_info, &a->name, &b->name) == 0 && - a->type == b->type); + return a->name.streq(b->name) && a->type == b->type;
put type comparison first, please
}
sp_pcontext *sp_pcontext::pop_context() diff --git a/sql/sql_acl.cc b/sql/sql_acl.cc index 09af5523c76..dd36504ed56 100644 --- a/sql/sql_acl.cc +++ b/sql/sql_acl.cc @@ -567,14 +560,17 @@ static uchar* acl_role_get_key(ACL_ROLE *entry, size_t *length,
struct ROLE_GRANT_PAIR : public Sql_alloc { - char *u_uname; - char *u_hname; - char *r_uname; + LEX_CSTRING u_uname; + LEX_CSTRING u_hname; + LEX_CSTRING r_uname;
Lex_ident_user ? (and in many places below)
LEX_STRING hashkey; bool with_admin;
- bool init(MEM_ROOT *mem, const char *username, const char *hostname, - const char *rolename, bool with_admin_option); + bool init(MEM_ROOT *mem, + const LEX_CSTRING &username, + const LEX_CSTRING &hostname, + const LEX_CSTRING &rolename, + bool with_admin_option); };
static uchar* acl_role_map_get_key(ROLE_GRANT_PAIR *entry, size_t *length, @@ -2731,7 +2733,7 @@ static bool acl_load(THD *thd, const Grant_tables& tables) char *db_name; db.user=safe_str(get_field(&acl_memroot, db_table.user())); const char *hostname= get_field(&acl_memroot, db_table.host()); - if (!hostname && find_acl_role(db.user, true)) + if (!hostname && find_acl_role(Lex_cstring_strlen(db.user), true))
Perhaps it'd be better to change user and db to be Lex_ident_* in ACL_DB?
hostname= ""; update_hostname(&db.host, hostname); db.db= db_name= get_field(&acl_memroot, db_table.db()); @@ -2822,14 +2824,29 @@ static bool acl_load(THD *thd, const Grant_tables& tables) init_alloc_root(key_memory_acl_mem, &temp_root, ACL_ALLOC_BLOCK_SIZE, 0, MYF(0)); while (!(read_record_info.read_record())) { - char *hostname= safe_str(get_field(&temp_root, roles_mapping_table.host())); - char *username= safe_str(get_field(&temp_root, roles_mapping_table.user())); - char *rolename= safe_str(get_field(&temp_root, roles_mapping_table.role())); + StringBuffer<MAX_FIELD_WIDTH> host_buff, user_buff, role_buff; + /* + find_user_by_name() called recursively later in this code block
There's no find_user_by_name() and acl_find_user_by_name does not seem to be called recursively. What did you mean here? also, why you didn't change acl_find_user_by_name to take a Lex_ident?
+ needs a 0-terminated string. So does sql_print_error(). + As val_lex_cstring() does not guarantee a 0-terminated string, + let's alloc 0-terminated copies of val_lex_cstring() results on + the memory root.
you've already added Field::val_lex_string_strmake() in August, why not to use it here?
+ */ + const Lex_cstring hostname= safe_lexcstrdup_root(&temp_root, + roles_mapping_table.host()-> + val_lex_cstring(&host_buff)); + const Lex_cstring username= safe_lexcstrdup_root(&temp_root, + roles_mapping_table.user()-> + val_lex_cstring(&user_buff)); + const Lex_cstring rolename= safe_lexcstrdup_root(&temp_root, + roles_mapping_table.role()-> + val_lex_cstring(&role_buff)); + bool with_grant_option= get_YN_as_bool(roles_mapping_table.admin_option());
if (add_role_user_mapping(username, hostname, rolename)) { sql_print_error("Invalid roles_mapping table entry user:'%s@%s', rolename:'%s'", - username, hostname, rolename); + username.str, hostname.str, rolename.str); continue; }
@@ -4443,28 +4484,31 @@ static ACL_USER * find_user_wild(const char *host, const char *user, const char /* Find a role with the specified name */ -static ACL_ROLE *find_acl_role(const char *role, bool allow_public) +static ACL_ROLE *find_acl_role(const LEX_CSTRING &role, bool allow_public) { - size_t length= strlen(role); DBUG_ENTER("find_acl_role"); - DBUG_PRINT("enter",("role: '%s'", role)); + DBUG_PRINT("enter",("role: '%s'", role.str)); DBUG_PRINT("info", ("Hash elements: %ld", acl_roles.records));
mysql_mutex_assert_owner(&acl_cache->lock);
- if (!length || (!allow_public && strcasecmp(role, public_name.str) == 0)) + if (!role.length || + (!allow_public && + my_charset_utf8mb3_general1400_as_ci.streq(role, public_name)))
Lex_ident_user would've been nice here too
DBUG_RETURN(NULL);
- ACL_ROLE *r= (ACL_ROLE *)my_hash_search(&acl_roles, (uchar *)role, length); + ACL_ROLE *r= (ACL_ROLE *)my_hash_search(&acl_roles, (uchar *)role.str, + role.length); DBUG_RETURN(r); }
/* Finds a grantee - something that privileges or roles can be granted to. */ -static ACL_USER_BASE *find_acl_user_base(const char *user, const char *host) +static ACL_USER_BASE *find_acl_user_base(const LEX_CSTRING &user, + const LEX_CSTRING &host) { - if (*host) + if (*host.str) return find_user_exact(host, user);
return find_acl_role(user, true); @@ -5657,8 +5696,8 @@ static GRANT_NAME *name_hash_search(HASH *name_hash, { if (!grant_name->host.hostname || (host && - !my_strcasecmp(system_charset_info, host, - grant_name->host.hostname)) || + Lex_ident_host(Lex_cstring_strlen(host)).
Still don't like these Lex_ident_something(Lex_cstring_strlen(xxx)) Let's get rid of them
+ streq(Lex_cstring_strlen(grant_name->host.hostname))) || (ip && !strcmp(ip, grant_name->host.hostname))) return grant_name; } @@ -7373,8 +7414,9 @@ int mysql_table_grant(THD *thd, TABLE_LIST *table_list, } } if (Str->is_role()) - propagate_role_grants(find_acl_role(Str->user.str, true), - PRIVS_TO_MERGE::TABLE_COLUMN, db_name, table_name); + propagate_role_grants(find_acl_role(Str->user, true), + PRIVS_TO_MERGE::TABLE_COLUMN, + db_name.str, table_name.str);
may be pass Lex_ident_* down into propagate_role_grants too?
}
thd->mem_root= old_root; diff --git a/sql/sql_base.cc b/sql/sql_base.cc index 9d2415d3a4d..d7d7ebee024 100644 --- a/sql/sql_base.cc +++ b/sql/sql_base.cc @@ -233,33 +233,41 @@ uint get_table_def_key(const TABLE_LIST *table_list, const char **key) # Pointer to list of names of open tables. */
-struct list_open_tables_arg +class list_open_tables_arg { +public: THD *thd; - const char *db; + const Lex_ident_db db; const char *wild; TABLE_LIST table_list; OPEN_TABLE_LIST **start_list, *open_list; + + list_open_tables_arg(THD *thd_arg, const LEX_CSTRING &db_arg, + const char *wild_arg) + :thd(thd_arg), db(db_arg), wild(wild_arg), + start_list(&open_list), open_list(0) + { + bzero((char*) &table_list, sizeof(table_list)); + } };
static my_bool list_open_tables_callback(TDC_element *element, list_open_tables_arg *arg) { - const char *db= (char*) element->m_key; - size_t db_length= strlen(db); - const char *table_name= db + db_length + 1; + const Lex_ident_db + db= Lex_ident_db(Lex_cstring_strlen((const char*) element->m_key)); + const char *table_name= db.str + db.length + 1;
- if (arg->db && my_strcasecmp(system_charset_info, arg->db, db)) + // QQ: a bug was here?
same as above, "was"
+ if (arg->db.str && !arg->db.streq(db)) return FALSE; if (arg->wild && wild_compare(table_name, arg->wild, 0)) return FALSE;
/* Check if user has SELECT privilege for any column in the table */ - arg->table_list.db.str= db; - arg->table_list.db.length= db_length; - arg->table_list.table_name.str= table_name; - arg->table_list.table_name.length= strlen(table_name); + arg->table_list.db= db; + arg->table_list.table_name= Lex_cstring_strlen(table_name); arg->table_list.grant.privilege= NO_ACL;
if (check_table_access(arg->thd, SELECT_ACL, &arg->table_list, TRUE, 1, TRUE)) @@ -1906,7 +1904,7 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) DBUG_ASSERT(!table_list->table);
/* an open table operation needs a lot of the stack space */ - if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias)) + if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias.str))
better pass `key` above. doesn't matter much but looks a bit less weird
DBUG_RETURN(TRUE);
if (!(flags & MYSQL_OPEN_IGNORE_KILLED) && thd->killed) @@ -1954,7 +1952,10 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) if (table->s->table_cache_key.length == key_length && !memcmp(table->s->table_cache_key.str, key, key_length)) { - if (!my_strcasecmp(system_charset_info, table->alias.c_ptr(), alias) && + // QQ: Why table_alias_charset is not used for comparison?
Did you try to correct this?
+ if (my_charset_utf8mb3_general1400_as_ci.streq( + table->alias.to_lex_cstring(), + alias) && table->query_id != thd->query_id && /* skip tables already used */ (thd->locked_tables_mode == LTM_LOCK_TABLES || table->query_id == 0)) @@ -6353,26 +6355,25 @@ find_field_in_natural_join(THD *thd, TABLE_LIST *table_ref, const char *name, si */
Field * -find_field_in_table(THD *thd, TABLE *table, const char *name, size_t length, +find_field_in_table(THD *thd, TABLE *table, const Lex_ident_column &name, bool allow_rowid, field_index_t *cached_field_index_ptr) { Field *field; field_index_t cached_field_index= *cached_field_index_ptr; DBUG_ENTER("find_field_in_table"); DBUG_PRINT("enter", ("table: '%s', field name: '%s'", table->alias.c_ptr(), - name)); + name.str));
/* We assume here that table->field < NO_CACHED_FIELD_INDEX = UINT_MAX */ if (cached_field_index < table->s->fields && - !my_strcasecmp(system_charset_info, - table->field[cached_field_index]->field_name.str, name)) + table->field[cached_field_index]->field_name.streq(name)) { field= table->field[cached_field_index]; DEBUG_SYNC(thd, "table_field_cached"); } else { - LEX_CSTRING fname= {name, length}; + LEX_CSTRING fname= name;
Do you need that? You can use `name` directly below
field= table->find_field_by_name(&fname); }
@@ -6999,9 +7000,9 @@ find_item_in_list(Item *find, List<Item> &items, uint *counter, List_iterator<Item> li(items); uint n_items= limit == 0 ? items.elements : limit; Item **found=0, **found_unaliased= 0, *item; - const char *db_name=0; + const LEX_CSTRING *db_name=0; const LEX_CSTRING *field_name= 0; - const char *table_name=0; + const LEX_CSTRING *table_name=0;
Why not Lex_ident's ?
bool found_unaliased_non_uniq= 0; /* true if the item that we search for is a valid name reference @@ -7059,12 +7060,11 @@ find_item_in_list(Item *find, List<Item> &items, uint *counter, item is not fix_field()'ed yet. */ if (item_field->field_name.str && item_field->table_name.str && - !lex_string_cmp(system_charset_info, &item_field->field_name, - field_name) && - !my_strcasecmp(table_alias_charset, item_field->table_name.str, - table_name) && - (!db_name || (item_field->db_name.str && - !strcmp(item_field->db_name.str, db_name)))) + item_field->field_name.streq(*field_name) && + item_field->table_name.streq(*table_name) && + (!db_name || !db_name->str || + (item_field->db_name.str && + !strcmp(item_field->db_name.str, db_name->str))))
why strcmp?
{ if (found_unaliased) { @@ -7135,11 +7132,10 @@ find_item_in_list(Item *find, List<Item> &items, uint *counter, } } } - else if (!table_name) + else if (!table_name || !table_name->str) { if (is_ref_by_name && find->name.str && item->name.str && - find->name.length == item->name.length && - !lex_string_cmp(system_charset_info, &item->name, &find->name)) + item->name.streq(find->name))
Do you have tests that will fail if streq will compare lengths?
{ found= li.ref(); *counter= i; diff --git a/sql/sql_class.cc b/sql/sql_class.cc index a2414de645b..1b36cd9416f 100644 --- a/sql/sql_class.cc +++ b/sql/sql_class.cc @@ -855,10 +853,13 @@ THD::THD(my_thread_id id, bool is_wsrep_applier) profiling.set_thd(this); #endif user_connect=(USER_CONN *)0; - my_hash_init(key_memory_user_var_entry, &user_vars, system_charset_info, + my_hash_init(key_memory_user_var_entry, &user_vars, + Lex_ident_user_var::charset_info(), USER_VARS_HASH_SIZE, 0, 0, (my_hash_get_key) get_var_key, (my_hash_free_key) free_user_var, HASH_THREAD_SPECIFIC); - my_hash_init(PSI_INSTRUMENT_ME, &sequences, system_charset_info, + my_hash_init(PSI_INSTRUMENT_ME, &sequences, + // QQ: why case insensitive?
looks like a bug, can you do a test case for it, please?
+ &my_charset_utf8mb3_general1400_as_ci, SEQUENCES_HASH_SIZE, 0, 0, (my_hash_get_key) get_sequence_last_key, (my_hash_free_key) free_sequence_last, HASH_THREAD_SPECIFIC); @@ -4011,7 +4015,7 @@ Table_ident::to_ident_db_internal_with_error(Query_arena *arena) const if (is_derived_table()) { DBUG_ASSERT(db.str == empty_c_string && db.length == 0); - return Lex_ident_db(empty_c_string, 0); + return Lex_ident_db(empty_c_string, (size_t) 0);
why?
} // Normal table or JSON table return arena->to_ident_db_internal_with_error(db); diff --git a/sql/sql_class.h b/sql/sql_class.h index 214118d8431..5685240ceaf 100644 --- a/sql/sql_class.h +++ b/sql/sql_class.h @@ -433,13 +435,14 @@ class Alter_rename_key : public Sql_alloc class Alter_index_ignorability: public Sql_alloc { public: - Alter_index_ignorability(const char *name, bool is_ignored, bool if_exists) : + Alter_index_ignorability(const LEX_CSTRING &name, + bool is_ignored, bool if_exists) : m_name(name), m_is_ignored(is_ignored), m_if_exists(if_exists) { - assert(name != NULL); + assert(name.str != NULL);
DBUG_ASSERT
}
- const char *name() const { return m_name; } + const Lex_ident_column &name() const { return m_name; } bool if_exists() const { return m_if_exists; }
/* The ignorability after the operation is performed. */ @@ -7989,42 +7995,24 @@ class Identifier_chain2 class Database_qualified_name { public: - LEX_CSTRING m_db; - LEX_CSTRING m_name; - Database_qualified_name(const LEX_CSTRING *db, const LEX_CSTRING *name) - :m_db(*db), m_name(*name) + Lex_ident_db m_db; + Lex_cstring m_name; // no comparison semantics + Database_qualified_name() { } - Database_qualified_name(const LEX_CSTRING &db, const LEX_CSTRING &name) + Database_qualified_name(const Lex_ident_db &db, const LEX_CSTRING &name) :m_db(db), m_name(name) { } - Database_qualified_name(const char *db, size_t db_length, - const char *name, size_t name_length) - { - m_db.str= db; - m_db.length= db_length; - m_name.str= name; - m_name.length= name_length; - } - - bool eq(const Database_qualified_name *other) const + bool eq_routine_name(const Database_qualified_name *other) const
eq_func_name, because the comparison semantics comes from Lex_ident_func
{ - CHARSET_INFO *cs= lower_case_table_names ? - &my_charset_utf8mb3_general_ci : - &my_charset_utf8mb3_bin; - return - m_db.length == other->m_db.length && - m_name.length == other->m_name.length && - !cs->strnncoll(m_db.str, m_db.length, - other->m_db.str, other->m_db.length) && - !cs->strnncoll(m_name.str, m_name.length, - other->m_name.str, other->m_name.length); + return m_db.streq(other->m_db) &&
no need to take lower_case_table_names into account when comparing db names here?
+ Lex_ident_func(m_name).streq(other->m_name); } /* Make copies of "db" and "name" on the memory root in internal format: - Lower-case "db" if lower-case-table-names==1. - Preserve "name" as is. */ - bool copy_sp_name_internal(MEM_ROOT *mem_root, const LEX_CSTRING &db, + bool copy_sp_name_internal(MEM_ROOT *mem_root, const Lex_ident_db &db, const LEX_CSTRING &name);
// Export db and name as a qualified name string: 'db.name' diff --git a/sql/sql_connect.cc b/sql/sql_connect.cc index c5619b28f69..4e10f498e3c 100644 --- a/sql/sql_connect.cc +++ b/sql/sql_connect.cc @@ -509,7 +514,9 @@ extern "C" void free_table_stats(TABLE_STATS* table_stats)
void init_global_table_stats(void) { - my_hash_init(PSI_INSTRUMENT_ME, &global_table_stats, system_charset_info, + my_hash_init(PSI_INSTRUMENT_ME, &global_table_stats, + // QQ: why not Lex_ident_table::charset_info() ?
just change it
+ &my_charset_utf8mb3_general1400_as_ci, max_connections, 0, 0, (my_hash_get_key) get_key_table_stats, (my_hash_free_key) free_table_stats, 0); } diff --git a/sql/sql_db.cc b/sql/sql_db.cc index 08441d9f067..51635bdf0ce 100644 --- a/sql/sql_db.cc +++ b/sql/sql_db.cc @@ -98,12 +98,10 @@ typedef struct my_dbopt_st */
static inline bool -cmp_db_names(LEX_CSTRING *db1_name, const LEX_CSTRING *db2_name) +cmp_db_names(const Lex_ident_db &db1_name, const Lex_ident_db &db2_name) { - return (db1_name->length == db2_name->length && - (db1_name->length == 0 ||
can it be used with empty names, really?
- my_strcasecmp(table_alias_charset, - db1_name->str, db2_name->str) == 0)); + return (db1_name.length == 0 && db2_name.length == 0) || + db1_name.streq(db2_name); }
#ifdef HAVE_PSI_INTERFACE @@ -1087,8 +1085,7 @@ mysql_rm_db_internal(THD *thd, const Lex_ident_db &db, bool if_exists, Disable drop of enabled log tables, must be done before name locking. This check is only needed if we are dropping the "mysql" database. */ - if ((rm_mysql_schema= - (my_strcasecmp(system_charset_info, MYSQL_SCHEMA_NAME.str, db.str) == 0))) + if ((rm_mysql_schema= MYSQL_SCHEMA_NAME.streq(db))) // QQ: was a bug here?
again, "was"
{ for (table= tables; table; table= table->next_local) if (check_if_log_table(table, TRUE, "DROP")) diff --git a/sql/sql_handler.cc b/sql/sql_handler.cc index 406f67ff955..def1fc44392 100644 --- a/sql/sql_handler.cc +++ b/sql/sql_handler.cc @@ -1080,11 +1079,14 @@ static SQL_HANDLER *mysql_ha_find_match(THD *thd, TABLE_LIST *tables) { if (tables->is_anonymous_derived_table()) continue; + /* + TODO: for some reasons the code user case-insensitive comparison. + Shouldn't it use Lex_ident_db/Lex_ident_table instead, + to take into account the filesystem case sensitivity? + */
must be a bug. try to create a test case for it. I think it could be something like (untested) create table t1 ...; create table T1; handler t1 open; handler t1 read first; handler t1 read next; drop table T1; handler t1 read next;
if ((! tables->db.str[0] || - ! my_strcasecmp(&my_charset_latin1, hash_tables->db.str, - tables->get_db_name())) && - ! my_strcasecmp(&my_charset_latin1, hash_tables->table_name.str, - tables->get_table_name())) + Lex_ident_ci(hash_tables->db).streq(tables->get_db_name())) && + Lex_ident_ci(hash_tables->table_name).streq(tables->get_table_name())) { /* Link into hash_tables list */ hash_tables->next= head; diff --git a/sql/sql_i_s.h b/sql/sql_i_s.h index 263031ae2c9..127b35f6ab2 100644 --- a/sql/sql_i_s.h +++ b/sql/sql_i_s.h @@ -329,7 +325,7 @@ typedef class Item COND;
typedef struct st_schema_table { - const char *table_name; + LEX_CSTRING table_name;
why LEX_CSTRING?
ST_FIELD_INFO *fields_info; /* for FLUSH table_name */ int (*reset_table) (); diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc index a84fd29affc..b74ecb51a69 100644 --- a/sql/sql_lex.cc +++ b/sql/sql_lex.cc @@ -9370,8 +9359,8 @@ bool LEX::create_package_finalize(THD *thd, { if (name2 && (name2->m_explicit_name != name->m_explicit_name || - strcmp(name2->m_db.str, name->m_db.str) || - !Sp_handler::eq_routine_name(name2->m_name, name->m_name))) + // QQ: was a bug here?
"was" may be it was intentional, looking for an exact match only?
+ !name2->eq_routine_name(name))) { bool exp= name2->m_explicit_name || name->m_explicit_name; my_error(ER_END_IDENTIFIER_DOES_NOT_MATCH, MYF(0), diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 665d2143872..1430aa473af 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -1131,8 +1131,8 @@ static bool wsrep_tables_accessible_when_detached(const TABLE_LIST *tables) { for (const TABLE_LIST *table= tables; table; table= table->next_global)
I'd personally use const TABLE_LIST *t= tables; ... it's a 3-line loop
{ - LEX_CSTRING db= table->db, tn= table->table_name; - if (get_table_category(&db, &tn) < TABLE_CATEGORY_INFORMATION) + if (get_table_category(table->db, + table->table_name) < TABLE_CATEGORY_INFORMATION) return false; } return tables != NULL; diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc index 958c52fd95c..c8d64ea37ff 100644 --- a/sql/sql_partition.cc +++ b/sql/sql_partition.cc @@ -8055,13 +8050,9 @@ void make_used_partitions_str(MEM_ROOT *alloc, if (parts_str->length()) parts_str->append(','); uint index= parts_str->length(); - parts_str->append(head_pe->partition_name, - strlen(head_pe->partition_name), - system_charset_info); + parts_str->append(head_pe->partition_name, system_charset_info); parts_str->append('_'); - parts_str->append(pe->partition_name, - strlen(pe->partition_name), - system_charset_info); + parts_str->append(pe->partition_name, system_charset_info);
better pe->partition_name->charset_info() (here and above and below)
used_partitions_list.append_str(alloc, parts_str->ptr() + index); } partition_id++; diff --git a/sql/sql_plugin.cc b/sql/sql_plugin.cc index 8bab244c709..f335226b9a2 100644 --- a/sql/sql_plugin.cc +++ b/sql/sql_plugin.cc @@ -374,16 +374,16 @@ bool check_valid_path(const char *path, size_t len)
static void fix_dl_name(MEM_ROOT *root, LEX_CSTRING *dl) { - const size_t so_ext_len= sizeof(SO_EXT) - 1; - if (dl->length < so_ext_len || - my_strcasecmp(&my_charset_latin1, dl->str + dl->length - so_ext_len, - SO_EXT)) + const Lex_ident_plugin so_ext(SO_EXT, sizeof(SO_EXT) - 1);
STRING_WITH_LEN
+ if (dl->length < so_ext.length || + !so_ext.streq(Lex_cstring(dl->str + dl->length - so_ext.length, + so_ext.length))) { - char *s= (char*)alloc_root(root, dl->length + so_ext_len + 1); + char *s= (char*)alloc_root(root, dl->length + so_ext.length + 1); memcpy(s, dl->str, dl->length); strcpy(s + dl->length, SO_EXT); dl->str= s; - dl->length+= so_ext_len; + dl->length+= so_ext.length; } }
diff --git a/sql/sql_show.cc b/sql/sql_show.cc index 1fcafd6ec08..7548ff1ec57 100644 --- a/sql/sql_show.cc +++ b/sql/sql_show.cc @@ -7182,10 +7170,12 @@ static int get_schema_views_record(THD *thd, TABLE_LIST *tables, Security_context *sctx= thd->security_ctx; if (!tables->allowed_show) { - if (!my_strcasecmp(system_charset_info, tables->definer.user.str, - sctx->priv_user) && - !my_strcasecmp(system_charset_info, tables->definer.host.str, - sctx->priv_host)) + // QQ: why case insensitive comparision?
Same as above? Try a test case and fix if a bug? it's likely create user foo; create user FOO; create view ... definer=foo v1 ... connect(as FOO); select * from information schema.views;
+ if (my_charset_utf8mb3_general1400_as_ci.streq( + tables->definer.user, + Lex_cstring_strlen(sctx->priv_user)) && + Lex_ident_host(tables->definer.host). + streq(Lex_cstring_strlen(sctx->priv_host))) tables->allowed_show= TRUE; #ifndef NO_EMBEDDED_ACCESS_CHECKS else @@ -7893,8 +7883,7 @@ static int get_schema_partitions_record(THD *thd, TABLE_LIST *tables,
while ((part_elem= part_it++)) { - table->field[3]->store(part_elem->partition_name, - strlen(part_elem->partition_name), cs); + table->field[3]->store(part_elem->partition_name, cs);
may be part_elem->partition_name->charset_info() ?
table->field[3]->set_notnull(); /* PARTITION_ORDINAL_POSITION */ table->field[5]->store((longlong) ++part_pos, TRUE); diff --git a/sql/sql_trigger.cc b/sql/sql_trigger.cc index 5304a535ad0..dcba31d06c9 100644 --- a/sql/sql_trigger.cc +++ b/sql/sql_trigger.cc @@ -470,7 +470,7 @@ bool mysql_create_or_drop_trigger(THD *thd, TABLE_LIST *tables, bool create) /* We don't allow creating triggers on tables in the 'mysql' schema */ - if (create && lex_string_eq(&tables->db, STRING_WITH_LEN("mysql"))) + if (create && tables->db.streq(MYSQL_SCHEMA_NAME)) // QQ: was a bug here?
"was"
{ my_error(ER_NO_TRIGGERS_ON_SYSTEM_SCHEMA, MYF(0)); DBUG_RETURN(TRUE); diff --git a/sql/sql_trigger.h b/sql/sql_trigger.h index 493349afb9d..5648f8f15ff 100644 --- a/sql/sql_trigger.h +++ b/sql/sql_trigger.h @@ -74,8 +74,10 @@ struct st_trg_execution_order /** Trigger name referenced in the FOLLOWS/PRECEDES clause of the CREATE TRIGGER statement. + Cannot be Lex_ident_trigger, + as this structure is used in %union in sql_yacc.yy
Thanks. I already wrote above "why it's not simply a Lex_ident_trigger. That explains it, deleted my previous comment :)
*/ - LEX_CSTRING anchor_trigger_name; + LEX_CSTRING anchor_trigger_name; // Used in sql_yacc %union };
diff --git a/sql/sql_truncate.cc b/sql/sql_truncate.cc index beeee9da72f..e3efc3122df 100644 --- a/sql/sql_truncate.cc +++ b/sql/sql_truncate.cc @@ -147,14 +147,12 @@ fk_truncate_illegal_if_parent(THD *thd, TABLE *table) /* Loop over the set of foreign keys for which this table is a parent. */ while ((fk_info= it++)) { - if (lex_string_cmp(system_charset_info, fk_info->referenced_db, - &table->s->db) || - lex_string_cmp(system_charset_info, fk_info->referenced_table, - &table->s->table_name) || - lex_string_cmp(system_charset_info, fk_info->foreign_db, - &table->s->db) || - lex_string_cmp(system_charset_info, fk_info->foreign_table, - &table->s->table_name)) + // QQ: why db and table don't use table_alias_charset for comparison?
Same as above? Try a test case and fix if a bug? test looks fairly straightforward here too
+ if (!Lex_ident_ci(*fk_info->referenced_db).streq(table->s->db) || + !Lex_ident_ci(*fk_info->referenced_table). + streq(table->s->table_name) || + !Lex_ident_ci(*fk_info->foreign_db).streq(table->s->db) || + !Lex_ident_ci(*fk_info->foreign_table).streq(table->s->table_name)) break; }
diff --git a/sql/sql_view.cc b/sql/sql_view.cc index 10588b54fa5..751373b4571 100644 --- a/sql/sql_view.cc +++ b/sql/sql_view.cc @@ -1347,12 +1347,9 @@ bool mysql_make_view(THD *thd, TABLE_SHARE *share, TABLE_LIST *table, precedent; precedent= precedent->referencing_view) { - if (precedent->view_name.length == table->table_name.length && - precedent->view_db.length == table->db.length && - my_strcasecmp(system_charset_info, - precedent->view_name.str, table->table_name.str) == 0 && - my_strcasecmp(system_charset_info, - precedent->view_db.str, table->db.str) == 0) + // QQ: was a bug here?
"was"
+ if (precedent->view_name.streq(table->table_name) && + precedent->view_db.streq(table->db)) { my_error(ER_VIEW_RECURSIVE, MYF(0), top_view->view_db.str, top_view->view_name.str); diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy index fcfe490c035..d01c195cb8a 100644 --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -15676,7 +15680,8 @@ user_maybe_role: It's OK to use in-place lowercase as long as the character set is utf8. */ - my_casedn_str(system_charset_info, (char*) $$->host.str); + $$->host.length= my_casedn_str(system_charset_info, + (char*) $$->host.str);
I thought my_casedn_str() changes were in the different commit?
} else { diff --git a/sql/structs.h b/sql/structs.h index 75bdc9e276b..d349ab72970 100644 --- a/sql/structs.h +++ b/sql/structs.h @@ -321,11 +322,25 @@ typedef struct user_conn { uint conn_per_hour, updates, questions; /* Maximum amount of resources which account is allowed to consume. */ USER_RESOURCES user_resources; + + /* + The CHARSET_INFO used for hashes to compare the entire 'user\0hash' key. + QQ: perhaps the user should be compared case sensitively, + while host host should be comparad case insensitively.
correct
+ */ + static CHARSET_INFO *user_host_key_charset_info_for_hash() + { + return &my_charset_utf8mb3_general1400_as_ci; + } } USER_CONN;
typedef struct st_user_stats { char user[MY_MAX(USERNAME_LENGTH, LIST_PROCESS_HOST_LEN) + 1]; + static CHARSET_INFO *user_key_charset_info_for_hash() + { + return &my_charset_utf8mb3_general1400_as_ci; + } // Account name the user is mapped to when this is a user from mapped_user. // Otherwise, the same value as user. char priv_user[MY_MAX(USERNAME_LENGTH, LIST_PROCESS_HOST_LEN) + 1]; diff --git a/sql/table.cc b/sql/table.cc index 8bc16753e68..d9bc5fabb4f 100644 --- a/sql/table.cc +++ b/sql/table.cc @@ -88,23 +88,34 @@ struct extra2_fields static Virtual_column_info * unpack_vcol_info_from_frm(THD *, TABLE *, String *, Virtual_column_info **, bool *);
+/* + Lex_ident_db does not have operator""_Lex_ident_db, + because its constructor contains DBUG_SLOW_ASSERT + and therefore it's not a constexpr constructor. + So let's initialize a number of Lex_ident_db constants + using operator""_Lex_cstring. +*/ + /* INFORMATION_SCHEMA name */ -LEX_CSTRING INFORMATION_SCHEMA_NAME= {STRING_WITH_LEN("information_schema")}; +Lex_ident_db + INFORMATION_SCHEMA_NAME= Lex_ident_db("information_schema"_Lex_cstring);
don't think so. information_schema is case insensitive, even if fs is case sensitivity
/* PERFORMANCE_SCHEMA name */ -LEX_CSTRING PERFORMANCE_SCHEMA_DB_NAME= {STRING_WITH_LEN("performance_schema")}; +Lex_ident_db + PERFORMANCE_SCHEMA_DB_NAME= Lex_ident_db("performance_schema"_Lex_cstring);
same
/* MYSQL_SCHEMA name */ -LEX_CSTRING MYSQL_SCHEMA_NAME= {STRING_WITH_LEN("mysql")}; +Lex_ident_db + MYSQL_SCHEMA_NAME= Lex_ident_db("mysql"_Lex_cstring);
this uses fs case sensitivity, correct
/* GENERAL_LOG name */ -LEX_CSTRING GENERAL_LOG_NAME= {STRING_WITH_LEN("general_log")}; +Lex_ident_table GENERAL_LOG_NAME= "general_log"_Lex_ident_table;
/* SLOW_LOG name */ -LEX_CSTRING SLOW_LOG_NAME= {STRING_WITH_LEN("slow_log")}; +Lex_ident_table SLOW_LOG_NAME= "slow_log"_Lex_ident_table;
-LEX_CSTRING TRANSACTION_REG_NAME= {STRING_WITH_LEN("transaction_registry")}; -LEX_CSTRING MYSQL_PROC_NAME= {STRING_WITH_LEN("proc")}; +Lex_ident_table TRANSACTION_REG_NAME= "transaction_registry"_Lex_ident_table; +Lex_ident_table MYSQL_PROC_NAME= "proc"_Lex_ident_table;
/* Keyword added as a prefix when parsing the defining expression for a @@ -280,42 +291,38 @@ const char *fn_frm_ext(const char *name) }
-TABLE_CATEGORY get_table_category(const LEX_CSTRING *db, - const LEX_CSTRING *name) +TABLE_CATEGORY get_table_category(const Lex_ident_db &db, + const Lex_ident_table &name) { - DBUG_ASSERT(db != NULL); - DBUG_ASSERT(name != NULL); - #ifdef WITH_WSREP - if (db->str && - my_strcasecmp(system_charset_info, db->str, WSREP_SCHEMA) == 0) + if (db.str && db.streq(MYSQL_SCHEMA_NAME)) //QQ: was a bug here?
"was"
{ - if ((my_strcasecmp(system_charset_info, name->str, WSREP_STREAMING_TABLE) == 0 || - my_strcasecmp(system_charset_info, name->str, WSREP_CLUSTER_TABLE) == 0 || - my_strcasecmp(system_charset_info, name->str, WSREP_MEMBERS_TABLE) == 0)) + if (name.streq(Lex_ident_table{STRING_WITH_LEN(WSREP_STREAMING_TABLE)}) || + name.streq(Lex_ident_table{STRING_WITH_LEN(WSREP_CLUSTER_TABLE)}) || + name.streq(Lex_ident_table{STRING_WITH_LEN(WSREP_MEMBERS_TABLE)})) { return TABLE_CATEGORY_INFORMATION; } } #endif /* WITH_WSREP */ - if (is_infoschema_db(db)) + if (is_infoschema_db(&db)) return TABLE_CATEGORY_INFORMATION;
- if (is_perfschema_db(db)) + if (is_perfschema_db(&db)) return TABLE_CATEGORY_PERFORMANCE;
- if (lex_string_eq(&MYSQL_SCHEMA_NAME, db)) + if (db.streq(MYSQL_SCHEMA_NAME)) // QQ: was a bug here?
"was"
{ - if (is_system_table_name(name->str, name->length)) + if (is_system_table_name(name.str, name.length)) return TABLE_CATEGORY_SYSTEM;
- if (lex_string_eq(&GENERAL_LOG_NAME, name)) + if (name.streq(GENERAL_LOG_NAME)) // QQ: was bug here?
"was"
return TABLE_CATEGORY_LOG;
- if (lex_string_eq(&SLOW_LOG_NAME, name)) + if (name.streq(SLOW_LOG_NAME)) // QQ: was a bug here?
"was"
return TABLE_CATEGORY_LOG;
- if (lex_string_eq(&TRANSACTION_REG_NAME, name)) + if (name.streq(TRANSACTION_REG_NAME)) // QQ: was a bug here?
"was"
return TABLE_CATEGORY_LOG; }
diff --git a/sql/table.h b/sql/table.h index 35be3bd4b2d..fe65025478d 100644 --- a/sql/table.h +++ b/sql/table.h @@ -3402,27 +3409,27 @@ static inline int set_zone(int nr,int min_zone,int max_zone) }
/* performance schema */ -extern LEX_CSTRING PERFORMANCE_SCHEMA_DB_NAME; +extern Lex_ident_db PERFORMANCE_SCHEMA_DB_NAME;
-extern LEX_CSTRING GENERAL_LOG_NAME; -extern LEX_CSTRING SLOW_LOG_NAME; -extern LEX_CSTRING TRANSACTION_REG_NAME; +extern Lex_ident_table GENERAL_LOG_NAME; +extern Lex_ident_table SLOW_LOG_NAME; +extern Lex_ident_table TRANSACTION_REG_NAME;
/* information schema */ -extern LEX_CSTRING INFORMATION_SCHEMA_NAME; -extern LEX_CSTRING MYSQL_SCHEMA_NAME; +extern Lex_ident_db INFORMATION_SCHEMA_NAME; +extern Lex_ident_db MYSQL_SCHEMA_NAME;
/* table names */ -extern LEX_CSTRING MYSQL_PROC_NAME; +extern Lex_ident_table MYSQL_PROC_NAME;
inline bool is_infoschema_db(const LEX_CSTRING *name) { - return lex_string_eq(&INFORMATION_SCHEMA_NAME, name); + return lex_string_eq(&INFORMATION_SCHEMA_NAME, name); // QQ: use Lex_ident_ci ?
yes
}
inline bool is_perfschema_db(const LEX_CSTRING *name) { - return lex_string_eq(&PERFORMANCE_SCHEMA_DB_NAME, name); + return lex_string_eq(&PERFORMANCE_SCHEMA_DB_NAME, name); // QQ: use Lex_ident_ci?
yes
}
inline void mark_as_null_row(TABLE *table) diff --git a/storage/blackhole/ha_blackhole.cc b/storage/blackhole/ha_blackhole.cc index 343f3c70286..ce48c95f837 100644 --- a/storage/blackhole/ha_blackhole.cc +++ b/storage/blackhole/ha_blackhole.cc @@ -416,7 +416,9 @@ static int blackhole_init(void *p) mysql_mutex_init(bh_key_mutex_blackhole, &blackhole_mutex, MY_MUTEX_INIT_FAST); (void) my_hash_init(PSI_INSTRUMENT_ME, &blackhole_open_tables, - system_charset_info, 32, 0, 0, + // QQ: was a bug here?
"was"
+ Lex_ident_table::charset_info(), + 32, 0, 0, (my_hash_get_key) blackhole_get_key, (my_hash_free_key) blackhole_free_key, 0);
diff --git a/storage/csv/ha_tina.cc b/storage/csv/ha_tina.cc index e705ff0e7c0..9fc0bcf604b 100644 --- a/storage/csv/ha_tina.cc +++ b/storage/csv/ha_tina.cc @@ -184,7 +184,9 @@ static int tina_init_func(void *p) tina_hton= (handlerton *)p; mysql_mutex_init(csv_key_mutex_tina, &tina_mutex, MY_MUTEX_INIT_FAST); (void) my_hash_init(csv_key_memory_tina_share, &tina_open_tables, - system_charset_info, 32, 0, 0, (my_hash_get_key) + // QQ: was a bug here?
"was"
+ table_alias_charset, + 32, 0, 0, (my_hash_get_key) tina_get_key, 0, 0); tina_hton->db_type= DB_TYPE_CSV_DB; tina_hton->create= tina_create_handler; diff --git a/storage/innobase/dict/dict0dict.cc b/storage/innobase/dict/dict0dict.cc index 5545727b015..231bcc1a132 100644 --- a/storage/innobase/dict/dict0dict.cc +++ b/storage/innobase/dict/dict0dict.cc @@ -284,27 +284,26 @@ otherwise table->n_def */ ulint dict_table_has_column( const dict_table_t* table, - const char* col_name, + const LEX_CSTRING &col_name,
looks ok to me, but to avoid surprises Marko should review it too
ulint col_nr) { ulint col_max = table->n_def;
ut_ad(table); - ut_ad(col_name); + ut_ad(col_name.str); ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
if (col_nr < col_max - && innobase_strcasecmp( - col_name, dict_table_get_col_name(table, col_nr)) == 0) { + && Lex_ident_column(col_name). + streq(dict_table_get_col_name_ls(table, col_nr))) { return(col_nr); }
/** The order of column may changed, check it with other columns */ for (ulint i = 0; i < col_max; i++) { if (i != col_nr - && innobase_strcasecmp( - col_name, dict_table_get_col_name(table, i)) == 0) { - + && Lex_ident_column(col_name). + streq(dict_table_get_col_name_ls(table, i))) { return(i); } } diff --git a/storage/mroonga/ha_mroonga.cpp b/storage/mroonga/ha_mroonga.cpp index 44fd09e74ed..71784dad475 100644 --- a/storage/mroonga/ha_mroonga.cpp +++ b/storage/mroonga/ha_mroonga.cpp @@ -1936,7 +1936,10 @@ static int mrn_init(void *p) MY_MUTEX_INIT_FAST) != 0)) { goto err_allocated_thds_mutex_init; } - if (mrn_my_hash_init(&mrn_allocated_thds, system_charset_info, 32, 0, 0, + if (mrn_my_hash_init(&mrn_allocated_thds, + // QQ: was a bug here?
"was"
+ &my_charset_bin, + 32, 0, 0, mrn_allocated_thds_get_key, 0, 0)) { goto error_allocated_thds_hash_init; } @@ -1945,7 +1948,10 @@ static int mrn_init(void *p) MY_MUTEX_INIT_FAST) != 0)) { goto err_allocated_open_tables_mutex_init; } - if (mrn_my_hash_init(&mrn_open_tables, system_charset_info, 32, 0, 0, + if (mrn_my_hash_init(&mrn_open_tables, + // QQ: was a bug here?
"was"
+ Lex_ident_table::charset_info(), + 32, 0, 0, mrn_open_tables_get_key, 0, 0)) { goto error_allocated_open_tables_hash_init; } @@ -1954,7 +1960,10 @@ static int mrn_init(void *p) MY_MUTEX_INIT_FAST) != 0)) { goto error_allocated_long_term_share_mutex_init; } - if (mrn_my_hash_init(&mrn_long_term_share, system_charset_info, 32, 0, 0, + if (mrn_my_hash_init(&mrn_long_term_share, + // QQ: was a bug here?
"was"
+ Lex_ident_table::charset_info(), + 32, 0, 0, mrn_long_term_share_get_key, 0, 0)) { goto error_allocated_long_term_share_hash_init; } diff --git a/storage/perfschema/pfs_engine_table.h b/storage/perfschema/pfs_engine_table.h index 1ef1e0282de..b4f4c9f0acc 100644 --- a/storage/perfschema/pfs_engine_table.h +++ b/storage/perfschema/pfs_engine_table.h @@ -25,6 +25,22 @@
#include "table.h" #include "sql_acl.h" + +/* + The performance schema is implemented as a storage engine, in memory. + The current storage engine interface exposed by the server, + and in particular handlerton::discover, uses 'FRM' files to describe a + table structure, which are later stored on disk, by the server, + in ha_create_table_from_engine(). + Because the table metadata is stored on disk, the table naming rules + used by the performance schema then have to comply with the constraints + imposed by the disk storage, and in particular with lower_case_table_names. +*/ + +using PFS_ident_db= Lex_ident_db; +using PFS_ident_table = Lex_ident_table;
No, table metadata for performance schema are never stored on disk. let's use case insensitive comparison here, if possible.
+ + /** @file storage/perfschema/pfs_engine_table.h Performance schema tables (declarations).
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Hello Sergei,
Hi, Alexander,
Phew. It was a big patch. Mostly mechanical changes, admittedly, but still a lot to read.
Generally I quite liked it, great work! Shorter way to write an identifier comparison, type safety that enforces namespaces and correct collations. Nice.
Thanks for your review. The new patch version is here: https://github.com/MariaDB/server/commit/15c315211befe7da677f9a533f0ceabe21c... Please see my answers below:
But there are, of course, comments and questions, please, see below.
On Dec 10, Alexander Barkov wrote:
commit 02b85f1cf9b Author: Alexander Barkov <bar@mariadb.com> Date: Wed Apr 26 15:27:01 2023 +0400
MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
- Removing the virtual function strnncoll() from
MY_COLLATION_HANDLER
- Adding a wrapper function CHARSET_INFO::streq(), to compare two strings for equality. For now it calls strnncoll() internally. In the future it will turn into a virtual function.
why would you want it as a virtual function? What is collation-dependent there?
Catching the fact of non-equality is a simpler operation that figuring out which side is greater. A virtual implementation of streq() can be faster than just a wrapper for strnncoll().
- Adding new accent sensitive case insensitive collations: - utf8mb4_general1400_as_ci - utf8mb3_general1400_as_ci They implements accent sensitive case insensitive comparison. The weight of a character is equal to the code point of its upper case variant. These collations use Unicode-14.0.0
casefolding data.
The result of my_charset_utf8mb3_general1400_as_ci.strcoll() is very close to the former my_charset_utf8mb3_general_ci.strcasecmp()
There is only a difference in a couple dozen rare characters,
because:
- the switch from "tolower" to "toupper" comparison, to make utf8mb3_general1400_as_ci closer to utf8mb3_general_ci - the switch from Unicode-3.0.0 to Unicode-14.0.0 This difference should be tolarable. See the list of affected characters in the MDEV description.
Note, utf8mb4_general1400_as_ci correctly handles non-BMP
characters!
Unlike utf8mb4_general_ci, it does not treat all BMP characters as equal.
- Adding classes representing names of the file based database
objects:
Lex_ident_db Lex_ident_table Lex_ident_trigger
Their comparison collation depends on the underlying file system case sensitivity and on --lower-case-table-names and can be either my_charset_bin or
my_charset_utf8mb3_general1400_as_ci.
- Adding classes representing names of other database objects, whose names have case insensitive comparison style, using my_charset_utf8mb3_general1400_as_ci:
Lex_ident_column Lex_ident_sys_var Lex_ident_user_var Lex_ident_sp_var
I don't think Lex_ident_sp_var is a separate domain, it's the same as Lex_ident_column. Both logically:
select a from t1 where b > c
(here you don't know what's a column and what's an sp var) and also in the code, in fill_spvar_definition() you yourself cast from Lex_ident_sp_var to Lex_ident_column, which kind of defeats the purpose of having two different types and emphasizes that it's actually just one and the same.
Fixed.
Lex_ident_ps Lex_ident_i_s_table Lex_ident_window Lex_ident_func Lex_ident_partition Lex_ident_with_element Lex_ident_rpl_filter Lex_ident_master_info Lex_ident_host Lex_ident_locale Lex_ident_plugin Lex_ident_engine Lex_ident_server Lex_ident_savepoint Lex_ident_table_option_name Lex_ident_table_option_value
option value is a string, not an ident
Fixed. I moved the two Lex_ident_table_option_xxx classes into: - engine_option_value::Name, deriving from Lex_ident_ci - engine_option_value::Value::Value, deriving from Lex_cstring
Lex_ident_charset
- All the mentioned Lex_ident_xxx classes implement a method
streq():
if (ident1.streq(ident2)) do_equal();
This method works as a wrapper for CHARSET_INFO::streq().
- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name" in class members and in function/method parameters.
- Replacing all calls like system_charset_info->coll->strcasecmp(ident1, ident2) to ident1.streq(ident2)
- Taking advantage of the c++11 user defined literal operator for Lex_ident_xxx data type. See lex_ident.h. Use example:
const Lex_ident_column primary_key_name=
"PRIMARY"_Lex_ident_column;
is now a shorter version of:
const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
Of all the things in your patch this will be the first thing that Monty will remove when he'll stumble upon it.
I discussed it earlier with Monty, and checked again yesterday. He's fine with this literal operator.
Could you instead implement a constructor that understands
const Lex_ident_column primary_key_name= {STRING_WITH_LEN("PRIMARY")};
The are constructors: Lex_ident_xxx(const char *str, size_t length) Lex_ident_xxx(const LEX_CSTRING &str) All these notations work: Lex_ident_db MYSQL_SCHEMA_NAME("mysql"_LEX_CSTRING); Lex_ident_db MYSQL_SCHEMA_NAME2{"mysql"_LEX_CSTRING}; Lex_ident_db MYSQL_SCHEMA_NAME3= {STRING_WITH_LEN("mysql")}; Lex_ident_db MYSQL_SCHEMA_NAME4{STRING_WITH_LEN("mysql")};
diff --git a/include/m_ctype.h b/include/m_ctype.h index 9d13989d1fe..e5c5145b256 100644 --- a/include/m_ctype.h +++ b/include/m_ctype.h @@ -997,6 +1006,18 @@ struct charset_info_st return state & MY_CS_COMPILED; }
+ /* + There will be a separate virtual function streq() in + MY_COLLATION_HANDLER eventually. + */ + my_bool streq(const LEX_CSTRING a, const LEX_CSTRING b) const + { + DBUG_ASSERT(is_valid_comparison_operand(a)); + DBUG_ASSERT(is_valid_comparison_operand(b)); + return 0 == (coll->strnncoll)(this, (const uchar *) a.str,
a.length,
+ (const uchar *) b.str, b.length, FALSE);
as you have also added strnncoll helper, you can write the above as
return 0 == strnncoll(a, b);
Done.
also, perhaps you should move these asserts into strnncoll helper?
Done.
+ } + int strnncoll(const uchar *a, size_t alen, const uchar *b, size_t blen, my_bool b_is_prefix=
FALSE) const
{ diff --git a/mysql-test/main/ctype_utf8mb4_general1400_as_ci_casefold.result b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_casefold.result new file mode 100644 index 00000000000..c6ccf662d16 --- /dev/null +++ b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_casefold.result @@ -0,0 +1,1446 @@ +# +# Start of 11.4 tests +# +# +# MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp() +# +SET NAMES utf8mb4 COLLATE utf8mb4_general1400_as_ci; +EXECUTE IMMEDIATE SFORMAT(' +CREATE VIEW v_bmp AS +SELECT + seq AS codepoint, + LPAD(HEX(seq),4,''0'') AS codepoint_hex4, + CONVERT(CHAR(seq USING utf32) USING {}) COLLATE {} AS c +FROM + seq_0_to_65535', @@character_set_connection, @@collation_connection); +SELECT COLLATION(c) FROM v_bmp LIMIT 1; +COLLATION(c) +utf8mb4_general1400_as_ci +SELECT +codepoint_hex4, +HEX(CAST(LOWER(c) AS CHAR CHARACTER SET ucs2)), +HEX(CAST(UPPER(c) AS CHAR CHARACTER SET ucs2)) +FROM v_bmp +WHERE BINARY(c)<>BINARY(LOWER(c)) OR BINARY(c)<>BINARY(UPPER(c)); +codepoint_hex4 HEX(CAST(LOWER(c) AS CHAR CHARACTER SET ucs2)) HEX(CAST(UPPER(c) AS CHAR CHARACTER SET ucs2))
what is the point of this table? detect changes when going to the next unicode version?
To fix the behavior, proving that casefolding is implemented according to Unicode-14.0. If the Unicode casefolding data ever changes in the collation definition - intetionally to some newer version - or unintentionally e.g. to one of the older existing implemented versions Unicode-3.0 or Unicode-5.2 this test will fail, so we don't forget to address such change properly (revert the unintentional change, or document the intentional change).
+0041 0061 0041 ...
<cut>
index 00000000000..b5eb369ae9a --- /dev/null +++ b/mysql-test/main/ctype_utf8mb4_general1400_as_ci_ws.test @@ -0,0 +1,39 @@ +--source include/have_utf32.inc +--source include/have_sequence.inc + +--echo # +--echo # Start of 11.1 tests +--echo # + +--echo # +--echo # Collation utfmb4_general1400_as_ci +--echo # + +SET NAMES utf8mb4 COLLATE utf8mb4_general1400_as_ci; + +EXECUTE IMMEDIATE SFORMAT(' +CREATE VIEW v_bmp AS +SELECT + seq AS codepoint, + LPAD(HEX(seq),6,''0'') AS codepoint_hex6, + CONVERT(CHAR(seq USING utf32) USING {}) COLLATE {} AS c +FROM + seq_0_to_1114111', @@character_set_connection, @@collation_connection); + +SELECT COLLATION(c) FROM v_bmp LIMIT 1; + +SELECT + SUM(codepoint_hex6=HEX(LOWER(c))) AS count_bmp_weight_is_lower, + SUM(codepoint_hex6<>HEX(LOWER(c))) AS count_bmp_weight_is_not_lower +FROM v_bmp; + +SELECT codepoint_hex6,HEX(WEIGHT_STRING(c)) +FROM v_bmp +WHERE codepoint_hex6<>HEX(WEIGHT_STRING(c)); + +DROP VIEW v_bmp; + + +--echo # +--echo # End of 11.1 tests
11.4 ?
Fixed.
+--echo # diff --git
a/mysql-test/suite/compat/oracle/r/lower_case_table_names.result b/mysql-test/suite/compat/oracle/r/lower_case_table_names.result
new file mode 100644 index 00000000000..2518be31402 --- /dev/null +++ b/mysql-test/suite/compat/oracle/r/lower_case_table_names.result @@ -0,0 +1,9 @@ +SELECT @@lower_case_table_names; +@@lower_case_table_names +1 +# +# TODO: Report this problem as MDEV
what is the problem? isn't the comparison accent-sensitive?
It was accent insensitive before the patch. I now reported it as: MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
+# +SET NAMES utf8; +CREATE TABLE t1 (a öracle_schema.date); +ERROR HY000: Unknown data type: 'öracle_schema.date' diff --git a/mysql-test/suite/unit/disabled.def
b/mysql-test/suite/unit/disabled.def
new file mode 100644 index 00000000000..3a68bea32d0 --- /dev/null +++ b/mysql-test/suite/unit/disabled.def @@ -0,0 +1,12 @@
+##############################################################################
+# +# List the test cases that are to be disabled temporarily. +# +# Separate the test case name and the comment with ':'. +# +# <testcasename> : BUG#<xxxx> <date disabled> <disabler> <comment> +# +# Do not use any TAB characters for whitespace. +#
+##############################################################################
+conc_charset : TODO: client side ID for utf8mb4_tolower_ci
1. There is no utf8mb4_tolower_ci
Fixed.
2. what do you mean "client side ID"?
The client library should be fixed to know new collations utf8mb3_general_as_ci and utf8mb4_general_as_ci and their IDs.
diff --git a/sql/ddl_log.cc b/sql/ddl_log.cc index 61c8f50d1e3..fdfb605a57f 100644 --- a/sql/ddl_log.cc +++ b/sql/ddl_log.cc @@ -1146,22 +1146,23 @@ static void execute_rename_table(DDL_LOG_ENTRY *ddl_log_entry, handler *file, static void rename_triggers(THD *thd, DDL_LOG_ENTRY *ddl_log_entry, bool swap_tables) { - LEX_CSTRING to_table, from_table, to_db, from_db, from_converted_name; + Lex_ident_table to_table, from_table, from_converted_name; + Lex_ident_db to_db, from_db; char to_path[FN_REFLEN+1], from_path[FN_REFLEN+1], conv_path[FN_REFLEN+1];
if (!swap_tables) { - from_db= ddl_log_entry->db; - from_table= ddl_log_entry->name; - to_db= ddl_log_entry->from_db; - to_table= ddl_log_entry->from_name; + from_db= Lex_ident_db(ddl_log_entry->db);
this would look better with implicit casting, that is, if you'd have an assignment operator from LEX_CSTRING
As discussed on slack, let's leave it as is. There are at least two reasons not to allow the implicit cast: 1. Adding the implicit cast from LEX_CSTRING will also allow to assignment of Lex_ident_xxx of different types: Lex_ident_table tbl; Lex_ident_column col= tbl; -- This is not desirable 2. The explicit cast signals that in this code something is possibly wrong with the data types. For example, when "Lex_ident_xxx a" gets assigned from some "LEX_CSTRING b" tells that "b" should likely also be of the data type Lex_ident_xxx. I did not change all appropriate LEX_CSTRINGs yet in structures and in function parameters to Lex_ident_xxx. Only changed those that were blockers for getting rid of strcasecmp(). I find it useful to change the appropriate LEX_CSTRINGs to Lex_ident_xxx asap, than to allow the implicit cast. If we allow the implitic cast, such LEX_CSTRINGs will stay in the code forever. I propose we create a new MDEV to analyze all explicit casts to Lex_ident_xxx and replace all suitable LEX_CSTRINGs to Lex_ident_xxx in structures, function parameters, local variables.
+ from_table= Lex_ident_table(ddl_log_entry->name); + to_db= Lex_ident_db(ddl_log_entry->from_db); + to_table= Lex_ident_table(ddl_log_entry->from_name); } else { - from_db= ddl_log_entry->from_db; - from_table= ddl_log_entry->from_name; - to_db= ddl_log_entry->db; - to_table= ddl_log_entry->extra_name; + from_db= Lex_ident_db(ddl_log_entry->from_db); + from_table= Lex_ident_table(ddl_log_entry->from_name); + to_db= Lex_ident_db(ddl_log_entry->db); + to_table= Lex_ident_table(ddl_log_entry->extra_name); }
build_filename_and_delete_tmp_file(from_path, sizeof(from_path), @@ -1211,11 +1212,11 @@ static void rename_triggers(THD *thd,
DDL_LOG_ENTRY *ddl_log_entry,
(void) Table_triggers_list::prepare_for_rename(thd, &trigger_param, - &from_db, - &from_table, -
&from_converted_name,
- &to_db, - &to_table); + from_db, + from_table, + from_converted_name, + to_db, + to_table);
You rather inconsistently replace pointers with const references in some methods, like above, but not in others, like below or in the next hunk.
I tried to do it this way: 1. Replace old pointers to new references: "const LEX_CSTRING *x" -> "const Lex_ident_xxx &x" 2. Do not replace old pointers to new pointers "const LEX_CSTRING *x" -> "const Lex_ident_xxx *x" (use #1 instead) 3. Do not replace old pointers to old references "const LEX_CSTRING *x" to "const LEX_CSTRING &x" (i.e. don't touch data types in cases not directly related to the MDEV)
(void) Table_triggers_list::change_table_name(thd, &trigger_param, &from_db, @@ -1707,9 +1708,8 @@ static int ddl_log_execute_action(THD *thd,
MEM_ROOT *mem_root,
switch (ddl_log_entry->phase) { case DDL_DROP_DB_PHASE_INIT: - drop_database_objects(thd, &path, &db, - !my_strcasecmp(system_charset_info, - MYSQL_SCHEMA_NAME.str,
db.str));
+ // QQ: was a bug here?
This comment makes little sense in the code, as one cannot see what *was* here before. Just drop it.
Remove the comment.
+ drop_database_objects(thd, &path, &db,
MYSQL_SCHEMA_NAME.streq(db));
strxnmov(to_path, sizeof(to_path)-1, path.str,
MY_DB_OPT_FILE, NullS);
mysql_file_delete_with_symlink(key_file_misc, to_path, "",
MYF(0));
diff --git a/sql/debug_sync.cc b/sql/debug_sync.cc index 362463a7f22..4704691615a 100644 --- a/sql/debug_sync.cc +++ b/sql/debug_sync.cc @@ -858,6 +858,13 @@ static st_debug_sync_action *debug_sync_get_action(THD *thd, }
+class Debug_token: public Lex_ident_ci +{ +public: + using Lex_ident_ci::Lex_ident_ci; +};
I wonder if you can create those types with less boilerplate
will typedef work? will using work? if not, may be, like
template<> Lex_ident_templ: Lex_ident_ci { ... } Lex_ident_templ<> Debug_token;
These classes are small enough. I think there are no needs for a template. We have developers who don't like templates, unless they are absolutely necessary. And this not an absolutly necessary case :)
+ + /** Set a debug sync action.
@@ -909,7 +916,8 @@ static bool debug_sync_set_action(THD *thd,
st_debug_sync_action *action)
} else { - const char *dsp_name= action->sync_point.c_ptr(); + const Debug_token dsp_name(action->sync_point.c_ptr(), + action->sync_point.length());
or dsp_name(action->sync_point->to_lex_cstring()) ?
Fixed. Also changed the DBUG_PRINT format from '%.s' to '%.*s'.
#ifdef DBUG_TRACE DBUG_EXECUTE("debug_sync", { /* Functions as DBUG_PRINT args can change keyword and line
diff --git a/sql/grant.cc b/sql/grant.cc index 1ba197bc1d9..0ebf47431bc 100644 --- a/sql/grant.cc +++ b/sql/grant.cc @@ -31,8 +31,8 @@ bool Grant_privilege::add_column_privilege(THD *thd, class LEX_COLUMN *point; while ((point=iter++)) { - if (!my_strcasecmp(system_charset_info, - point->column.c_ptr(), new_str->c_ptr())) + const Lex_ident_column tmp(point->column.c_ptr(),
nr. */ point->column.length());
That feels like an overkill, creating Lex_ident_column just to compare two strings. What is it compiled to at -O3?
I checked the assembler code using "objdump -DC -l": - Without -O3 the constructor call is seen in the output. - With -O3 the constructor call disappears. There's only the code putting strnncall() arguments to registers followed by the call for strnncall() itself. So there seems to be no overkills in the constructor call itself. However, there is an overill in: calling c_ptr(), which is required to construct a valid Lex_ident_xxx versus calling just ptr(), which is perfectly enough for comparison. So I changed these lines:
+ if (tmp.streq(new_str->to_lex_cstring())) break;
to: if (Lex_ident_column::charset_info()->streq(point->column.to_lex_cstring(), name)) break;
} m_column_privilege_total|= which_grant; diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc index a615d26b9a9..fc840c21191 100644 --- a/sql/ha_partition.cc +++ b/sql/ha_partition.cc @@ -118,7 +118,7 @@ int ha_partition::notify_tabledef_changed(LEX_CSTRING *db, LEX_CSTRING table_name; const char *table_name_ptr; if (create_partition_name(from_buff, sizeof(from_buff), - from_path, name_buffer_ptr, + from_path, Lex_cstring_strlen(name_buffer_ptr),
Don't, please. You _always_ use Lex_cstring_strlen, in _all_ invocations of create_partition_name(). Just pass a char* down as before and do strlen inside, if you need it.
Fixed.
NORMAL_PART_NAME, FALSE)) res=1; table_name_ptr= from_buff + dirname_length(from_buff); diff --git a/sql/handler.cc b/sql/handler.cc index a977054d0da..7b9d11d00a2 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -7531,7 +7528,8 @@ int
handler::check_duplicate_long_entry_key(const uchar *new_rec, uint key_no)
else { Item_func_left *fnc= static_cast<Item_func_left
*>(arguments[j]);
- DBUG_ASSERT(!my_strcasecmp(system_charset_info, "left", fnc->func_name())); + DBUG_ASSERT(Lex_ident_func(fnc->func_name_cstring()).
eh, why func_name_cstring() doesn't return Lex_ident_func?
I agree it should be Lex_ident_func. However, "grep LEX_CSTRING.func_name_cstring" shows 388 instances. It's going to be a huge change. Can we do it in a separate patch?
+ streq("left"_Lex_cstring)); DBUG_ASSERT(fnc->arguments()[0]->type() == Item::FIELD_ITEM); t_field= static_cast<Item_field
*>(fnc->arguments()[0])->field;
uint length= (uint)fnc->arguments()[1]->val_int(); diff --git a/sql/item.cc b/sql/item.cc index 5c4f1704ebe..5689756eca9 100644 --- a/sql/item.cc +++ b/sql/item.cc @@ -3463,13 +3463,11 @@ bool Item_field::eq(const Item *item, bool
binary_cmp) const
(In cases where we would choose wrong we would have to generate a ER_NON_UNIQ_ERROR). */ - return (!lex_string_cmp(system_charset_info, &item_field->name, - &field_name) && - (!item_field->table_name.str || !table_name.str || - (!my_strcasecmp(table_alias_charset, item_field->table_name.str, - table_name.str) && - (!item_field->db_name.str || !db_name.str || - (item_field->db_name.str && !strcmp(item_field->db_name.str, + return (item_field->name.streq(field_name) && + (!item_field->table_name.str || !table_name.str || + (item_field->table_name.streq(table_name) && + (!item_field->db_name.str || !db_name.str || + (item_field->db_name.str &&
!strcmp(item_field->db_name.str,
db_name.str))))));
I presume, # db_name should also be compared with streq
Fixed.
}
diff --git a/sql/item_create.cc b/sql/item_create.cc index 4f0df8f732d..501db19c1b8 100644 --- a/sql/item_create.cc +++ b/sql/item_create.cc @@ -2638,7 +2639,8 @@ Create_qfunc::create_func(THD *thd, const
LEX_CSTRING *name,
if (thd->lex->copy_db_to(&db)) return NULL;
- return create_with_db(thd, &db, name, false, item_list);
Again, can LEX_CSTRING be implicitly cast to the correct Lex_ident variant?
Replied above.
+ return create_with_db(thd, Lex_ident_db(db), Lex_ident_func(*name), + false, item_list); }
diff --git a/sql/item_windowfunc.cc b/sql/item_windowfunc.cc index 5438cade27a..f0a5737bdf7 100644 --- a/sql/item_windowfunc.cc +++ b/sql/item_windowfunc.cc @@ -28,7 +28,7 @@ Item_window_func::resolve_window_name(THD *thd) return false; } DBUG_ASSERT(window_name != NULL && window_spec == NULL); - const char *ref_name= window_name->str; + const LEX_CSTRING &ref_name= *window_name;
Why LEX_CSTRING ?
Item_window_func::window_name haven't been changed to LEX_CSTRING yet. I agree, this should be also fixed eventually to Lex_ident_xxx. ref_name will change its data type at the same time.
/* !TODO: Add the code to resolve ref_name in outer queries */ /* diff --git a/sql/opt_trace.cc b/sql/opt_trace.cc index 1008690ccaa..dbb2708f487 100644 --- a/sql/opt_trace.cc +++ b/sql/opt_trace.cc @@ -43,7 +43,7 @@ bool list_has_optimizer_trace_table(const
TABLE_LIST *tbl)
for (; tbl; tbl= tbl->next_global) { if (tbl->schema_table && - 0 == strcmp(tbl->schema_table->table_name, I_S_table_name)) + 0 == strcmp(tbl->schema_table->table_name.str, I_S_table_name))
Why not streq?
I'm not sure how exactly it should be compared here: 1. Like a normal table, according to Lex_ident_table 2. Like an I_S table, according to Lex_ident_i_s_table 3. Case sensitively, like it was in the old code. I decided to go with #3, to preserve the old behavior. Should we report it as a separate MDEV and ask Sergey Petrunya to check?
return true; } return false; @@ -190,9 +190,8 @@ void
opt_trace_disable_if_no_security_context_access(THD *thd)
if (!(thd->main_security_ctx.check_access(GLOBAL_ACLS & ~GRANT_ACL)) && (0 != strcmp(thd->main_security_ctx.priv_user, thd->security_context()->priv_user) || - 0 != my_strcasecmp(system_charset_info, - thd->main_security_ctx.priv_host, - thd->security_context()->priv_host))) + !Lex_ident_host(Lex_cstring_strlen(thd->main_security_ctx.priv_host)). + streq(Lex_cstring_strlen(thd->security_context()->priv_host))))
Doesn't really look like an improvement 🙂
It does not have to look like an improvement in cases where a C string gets converted to Lex_ident_xxx. It should look heavy, to encorange the switch from C strings to LEX_CSTRING/Lex_ident_xxx based data types.
May be you can get rid of intermediate Lex_cstring_strlen()'s ?
I'd like to avoid adding direct Lex_ident_xxx constructors getting a "const char*". This will hide the problem of a not efficient code. Having Lex_cstring_strlen() in the code is a good signal that something is not perfect with data types.
trace->missing_privilege(); }
diff --git a/sql/partition_info.cc b/sql/partition_info.cc index 1b65de6e3c0..7b76c6d65b1 100644 --- a/sql/partition_info.cc +++ b/sql/partition_info.cc @@ -1403,8 +1397,8 @@ bool partition_info::check_partition_info(THD
*thd, handlerton **eng_type,
num_parts_not_set++; part_elem->engine_type= default_engine_type; } - if (check_table_name(part_elem->partition_name, - strlen(part_elem->partition_name), FALSE)) + if (check_table_name(part_elem->partition_name.str, + part_elem->partition_name.length, FALSE))
better change check_table_name to accept a LEX_CSTRING or Lex_ident, it'll simplify almost all (except one) invocations of it
Done.
{ my_error(ER_WRONG_PARTITION_NAME, MYF(0)); goto end; @@ -2604,8 +2596,8 @@ bool
partition_info::has_same_partitioning(partition_info *new_part_info)
part_state must be PART_NORMAL! */ if (!part_elem || !new_part_elem || - strcmp(part_elem->partition_name, - new_part_elem->partition_name) || + strcmp(part_elem->partition_name.str, + new_part_elem->partition_name.str) ||
why strcmp?
strcmp() can be intentional here. If the table has "PARTITION P00 VALUES LESS THAN (0)" (with upper case P in P00) and an LATER TABLE has "PARTITION p00 VALUES LESS THAN (0)" (with lower case p in p00), the table should probably be rebuild to change the name. Even if it was not intentional, this code works only for pre 5.5.3 versions. It's not important to change it now.
part_elem->part_state != PART_NORMAL || new_part_elem->part_state != PART_NORMAL || part_elem->max_value != new_part_elem->max_value || @@ -2666,8 +2658,8 @@ bool
partition_info::has_same_partitioning(partition_info *new_part_info)
sub_part_elem->engine_type !=
new_sub_part_elem->engine_type)
DBUG_RETURN(false);
- if (strcmp(sub_part_elem->partition_name, - new_sub_part_elem->partition_name) || + if (strcmp(sub_part_elem->partition_name.str, + new_sub_part_elem->partition_name.str) ||
same here
See above.
sub_part_elem->part_state != PART_NORMAL || new_sub_part_elem->part_state != PART_NORMAL || sub_part_elem->part_min_rows != diff --git a/sql/partition_info.h b/sql/partition_info.h index 3a8c3a37524..92e8906274f 100644 --- a/sql/partition_info.h +++ b/sql/partition_info.h @@ -526,11 +527,13 @@ void partition_info::vers_update_el_ids() }
-inline -bool make_partition_name(char *move_ptr, uint i) +static inline +Lex_ident_partition make_partition_name_ls(char *move_ptr, uint i)
it's not _ls really, is it? why not to keep the old name?
Fixed.
{ int res= snprintf(move_ptr, MAX_PART_NAME_SIZE + 1, "p%u", i); - return res < 0 || res > MAX_PART_NAME_SIZE; + return res < 0 || res > MAX_PART_NAME_SIZE ? + Lex_ident_partition() : + Lex_ident_partition(move_ptr, (size_t) res); }
@@ -549,15 +552,16 @@ uint partition_info::next_part_no(uint
for (uint cur_part= 0; cur_part < new_parts; ++cur_part, ++suffix) { uint32 cur_suffix= suffix; - if (make_partition_name(part_name, suffix)) + Lex_ident_partition
new_parts) const part_name_ls(make_partition_name_ls(part_name, suffix));
+ if (!part_name_ls.str) return 0; partition_element *el; it.rewind(); while ((el= it++)) { - if (0 == my_strcasecmp(&my_charset_latin1, el->partition_name, part_name)) + if (el->partition_name.streq(part_name_ls)) { - if (make_partition_name(part_name, ++suffix)) + if (!(part_name_ls= make_partition_name_ls(part_name, ++suffix)).str)
you don't seem to use part_name_ls below
It's used on the next iteration in this line above: if (el->partition_name.streq(part_name_ls))
return 0; it.rewind(); } diff --git a/sql/set_var.h b/sql/set_var.h index aed4955ef62..8f7e3609f5f 100644 --- a/sql/set_var.h +++ b/sql/set_var.h @@ -383,7 +383,7 @@ class set_var_default_role: public set_var_base { LEX_USER *user, *real_user; LEX_CSTRING role; - const char *real_role; + LEX_CSTRING real_role;
Not Lex_ident_user ?
I agree, it'll be nice to have Lex_ident_user. However, I did not add a data type "Lex_ident_user" at this point for some reasons: - The main idea of this patch is to replace strcasecmp() to Lex_ident_xxx::streq(). All identifiers being added in this patch are either always case insensitive, or depend on lower-case-table-names. There are no new data types for pure case sensitive identifiers. User names are somewhat out of scope of this patch, as they are case sensitive. Currently there is no a goal to replace all strcmp() calls to Lex_ident_yyy::streq(). - It'll possibly require changes in sql_acl.cc, which depends too much on the other patch coming (MDEV-31531 Remove my_casedn_str/my_caseup_str). I wanted to have as few changes in sql_acl.cc as possible. I propose to introduce Lex_ident_user separately.
public: set_var_default_role(LEX_USER *user_arg, LEX_CSTRING role_arg) : user(user_arg), role(role_arg) {} diff --git a/sql/slave.cc b/sql/slave.cc index 0a15cc55146..daeb05f423e 100644 --- a/sql/slave.cc +++ b/sql/slave.cc @@ -3403,8 +3403,9 @@ static bool send_show_master_info_data(THD
*thd, Master_info *mi, bool full,
static int cmp_mi_by_name(const Master_info **arg1, const Master_info **arg2) { - return my_strcasecmp(system_charset_info, (*arg1)->connection_name.str, - (*arg2)->connection_name.str); + return Lex_ident_master_info::charset_info()->strnncoll( + (*arg1)->connection_name, + (*arg2)->connection_name);
Why not (*arg1)->connection_name.streq() ?
Here we compare for sorting. It should return negative, positive or zero results. Not only "equal vs not equal".
}
diff --git a/sql/sp_pcontext.cc b/sql/sp_pcontext.cc index 848d1f0c655..761f535832d 100644 --- a/sql/sp_pcontext.cc +++ b/sql/sp_pcontext.cc @@ -139,8 +139,7 @@ sp_pcontext *sp_pcontext::push_context(THD *thd,
sp_pcontext::enum_scope scope)
bool cmp_labels(sp_label *a, sp_label *b) { - return (lex_string_cmp(system_charset_info, &a->name, &b->name)
== 0 &&
- a->type == b->type); + return a->name.streq(b->name) && a->type == b->type;
put type comparison first, please
Done.
}
sp_pcontext *sp_pcontext::pop_context() diff --git a/sql/sql_acl.cc b/sql/sql_acl.cc index 09af5523c76..dd36504ed56 100644 --- a/sql/sql_acl.cc +++ b/sql/sql_acl.cc @@ -567,14 +560,17 @@ static uchar* acl_role_get_key(ACL_ROLE
*entry, size_t *length,
struct ROLE_GRANT_PAIR : public Sql_alloc { - char *u_uname; - char *u_hname; - char *r_uname; + LEX_CSTRING u_uname; + LEX_CSTRING u_hname; + LEX_CSTRING r_uname;
Lex_ident_user ? (and in many places below)
See comments about Lex_ident_user above.
LEX_STRING hashkey; bool with_admin;
- bool init(MEM_ROOT *mem, const char *username, const char *hostname, - const char *rolename, bool with_admin_option); + bool init(MEM_ROOT *mem, + const LEX_CSTRING &username, + const LEX_CSTRING &hostname, + const LEX_CSTRING &rolename, + bool with_admin_option); };
static uchar* acl_role_map_get_key(ROLE_GRANT_PAIR *entry, size_t
*length,
@@ -2731,7 +2733,7 @@ static bool acl_load(THD *thd, const Grant_tables& tables) char *db_name; db.user=safe_str(get_field(&acl_memroot, db_table.user())); const char *hostname= get_field(&acl_memroot, db_table.host()); - if (!hostname && find_acl_role(db.user, true)) + if (!hostname && find_acl_role(Lex_cstring_strlen(db.user), true))
Perhaps it'd be better to change user and db to be Lex_ident_* in ACL_DB?
That's true. Can we do it in a separate task?
hostname= ""; update_hostname(&db.host, hostname); db.db= db_name= get_field(&acl_memroot, db_table.db()); @@ -2822,14 +2824,29 @@ static bool acl_load(THD *thd, const
Grant_tables& tables)
init_alloc_root(key_memory_acl_mem, &temp_root,
ACL_ALLOC_BLOCK_SIZE, 0, MYF(0));
while (!(read_record_info.read_record())) { - char *hostname= safe_str(get_field(&temp_root,
roles_mapping_table.host()));
- char *username= safe_str(get_field(&temp_root, roles_mapping_table.user())); - char *rolename= safe_str(get_field(&temp_root, roles_mapping_table.role())); + StringBuffer<MAX_FIELD_WIDTH> host_buff, user_buff, role_buff; + /* + find_user_by_name() called recursively later in this code block
There's no find_user_by_name() and acl_find_user_by_name does not seem to be called recursively. What did you mean here?
I removed the word "recursive" and rephased the comment.
also, why you didn't change acl_find_user_by_name to take a Lex_ident?
I did not find reasons for that: - it does not need the user length, so we don't safe anything inside. - it's used only in two places, so we don't safe much in the places where it's called
+ needs a 0-terminated string. So does sql_print_error(). + As val_lex_cstring() does not guarantee a 0-terminated string, + let's alloc 0-terminated copies of val_lex_cstring() results on + the memory root.
you've already added Field::val_lex_string_strmake() in August, why not to use it here?
Yes, val_lex_string_strmake() suites perfectly here, fixed. Thanks.
+ */ + const Lex_cstring hostname= safe_lexcstrdup_root(&temp_root, + roles_mapping_table.host()-> + val_lex_cstring(&host_buff)); + const Lex_cstring username= safe_lexcstrdup_root(&temp_root, + roles_mapping_table.user()-> + val_lex_cstring(&user_buff)); + const Lex_cstring rolename= safe_lexcstrdup_root(&temp_root, + roles_mapping_table.role()-> + val_lex_cstring(&role_buff)); + bool with_grant_option=
if (add_role_user_mapping(username, hostname, rolename)) { sql_print_error("Invalid roles_mapping table entry
user:'%s@%s', rolename:'%s'",
- username, hostname, rolename); + username.str, hostname.str, rolename.str); continue; }
@@ -4443,28 +4484,31 @@ static ACL_USER * find_user_wild(const char *host, const char *user, const char /* Find a role with the specified name */ -static ACL_ROLE *find_acl_role(const char *role, bool allow_public) +static ACL_ROLE *find_acl_role(const LEX_CSTRING &role, bool allow_public) { - size_t length= strlen(role); DBUG_ENTER("find_acl_role"); - DBUG_PRINT("enter",("role: '%s'", role)); + DBUG_PRINT("enter",("role: '%s'", role.str)); DBUG_PRINT("info", ("Hash elements: %ld", acl_roles.records));
mysql_mutex_assert_owner(&acl_cache->lock);
- if (!length || (!allow_public && strcasecmp(role,
get_YN_as_bool(roles_mapping_table.admin_option()); public_name.str) == 0))
+ if (!role.length || + (!allow_public && + my_charset_utf8mb3_general1400_as_ci.streq(role, public_name)))
Lex_ident_user would've been nice here too
See comments about Lex_ident_user above.
DBUG_RETURN(NULL);
- ACL_ROLE *r= (ACL_ROLE *)my_hash_search(&acl_roles, (uchar
*)role, length);
+ ACL_ROLE *r= (ACL_ROLE *)my_hash_search(&acl_roles, (uchar *)role.str, + role.length); DBUG_RETURN(r); }
/* Finds a grantee - something that privileges or roles can be granted to. */ -static ACL_USER_BASE *find_acl_user_base(const char *user, const char *host) +static ACL_USER_BASE *find_acl_user_base(const LEX_CSTRING &user, + const LEX_CSTRING &host) { - if (*host) + if (*host.str) return find_user_exact(host, user);
return find_acl_role(user, true); @@ -5657,8 +5696,8 @@ static GRANT_NAME *name_hash_search(HASH *name_hash, { if (!grant_name->host.hostname || (host && - !my_strcasecmp(system_charset_info, host, - grant_name->host.hostname)) || + Lex_ident_host(Lex_cstring_strlen(host)).
Still don't like these Lex_ident_something(Lex_cstring_strlen(xxx)) Let's get rid of them
We'll get rid of them when when change grant_name->host.hostname from "const char *" to Lex_ident_host. Can we please do this in a separate task?
+ streq(Lex_cstring_strlen(grant_name->host.hostname))) || (ip && !strcmp(ip, grant_name->host.hostname))) return grant_name; } @@ -7373,8 +7414,9 @@ int mysql_table_grant(THD *thd, TABLE_LIST
} } if (Str->is_role()) - propagate_role_grants(find_acl_role(Str->user.str, true), - PRIVS_TO_MERGE::TABLE_COLUMN, db_name,
*table_list, table_name);
+ propagate_role_grants(find_acl_role(Str->user, true), + PRIVS_TO_MERGE::TABLE_COLUMN, + db_name.str, table_name.str);
may be pass Lex_ident_* down into propagate_role_grants too?
Why? Db and table lengths does not seem to be used inside propagate_role_grants().
}
thd->mem_root= old_root; diff --git a/sql/sql_base.cc b/sql/sql_base.cc index 9d2415d3a4d..d7d7ebee024 100644 --- a/sql/sql_base.cc +++ b/sql/sql_base.cc @@ -233,33 +233,41 @@ uint get_table_def_key(const TABLE_LIST
*table_list, const char **key)
# Pointer to list of names of open tables. */
-struct list_open_tables_arg +class list_open_tables_arg { +public: THD *thd; - const char *db; + const Lex_ident_db db; const char *wild; TABLE_LIST table_list; OPEN_TABLE_LIST **start_list, *open_list; + + list_open_tables_arg(THD *thd_arg, const LEX_CSTRING &db_arg, + const char *wild_arg) + :thd(thd_arg), db(db_arg), wild(wild_arg), + start_list(&open_list), open_list(0) + { + bzero((char*) &table_list, sizeof(table_list)); + } };
static my_bool list_open_tables_callback(TDC_element *element, list_open_tables_arg *arg) { - const char *db= (char*) element->m_key; - size_t db_length= strlen(db); - const char *table_name= db + db_length + 1; + const Lex_ident_db + db= Lex_ident_db(Lex_cstring_strlen((const char*) element->m_key)); + const char *table_name= db.str + db.length + 1;
- if (arg->db && my_strcasecmp(system_charset_info, arg->db, db)) + // QQ: a bug was here?
same as above, "was"
I reported this problem as: MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0 This patch fixes the problem. But it should be backported to 10.4.
+ if (arg->db.str && !arg->db.streq(db)) return FALSE; if (arg->wild && wild_compare(table_name, arg->wild, 0)) return FALSE;
/* Check if user has SELECT privilege for any column in the table */ - arg->table_list.db.str= db; - arg->table_list.db.length= db_length; - arg->table_list.table_name.str= table_name; - arg->table_list.table_name.length= strlen(table_name); + arg->table_list.db= db; + arg->table_list.table_name= Lex_cstring_strlen(table_name); arg->table_list.grant.privilege= NO_ACL;
if (check_table_access(arg->thd, SELECT_ACL, &arg->table_list,
TRUE, 1, TRUE))
@@ -1906,7 +1904,7 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) DBUG_ASSERT(!table_list->table);
/* an open table operation needs a lot of the stack space */ - if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias)) + if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias.str))
better pass `key` above. doesn't matter much but looks a bit less weird
I'd like to avoid to change the logic that's not directly related to the task.
DBUG_RETURN(TRUE);
if (!(flags & MYSQL_OPEN_IGNORE_KILLED) && thd->killed) @@ -1954,7 +1952,10 @@ bool open_table(THD *thd, TABLE_LIST
if (table->s->table_cache_key.length == key_length && !memcmp(table->s->table_cache_key.str, key, key_length)) { - if (!my_strcasecmp(system_charset_info,
*table_list, Open_table_context *ot_ctx) table->alias.c_ptr(), alias) &&
+ // QQ: Why table_alias_charset is not used for comparison?
Did you try to correct this?
As we agreed on slack, this was a bug. I reported it as: MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0 This patch fixes the problem. But it should also be backported to 10.4.
+ if (my_charset_utf8mb3_general1400_as_ci.streq( +
+ alias) && table->query_id != thd->query_id && /* skip tables already used */ (thd->locked_tables_mode == LTM_LOCK_TABLES || table->query_id == 0)) @@ -6353,26 +6355,25 @@ find_field_in_natural_join(THD *thd, TABLE_LIST *table_ref, const char *name, si */
Field * -find_field_in_table(THD *thd, TABLE *table, const char *name, size_t length, +find_field_in_table(THD *thd, TABLE *table, const Lex_ident_column &name, bool allow_rowid, field_index_t *cached_field_index_ptr) { Field *field; field_index_t cached_field_index= *cached_field_index_ptr; DBUG_ENTER("find_field_in_table"); DBUG_PRINT("enter", ("table: '%s', field name: '%s'",
- name)); + name.str));
/* We assume here that table->field < NO_CACHED_FIELD_INDEX = UINT_MAX */ if (cached_field_index < table->s->fields && - !my_strcasecmp(system_charset_info, -
table->alias.to_lex_cstring(), table->alias.c_ptr(), table->field[cached_field_index]->field_name.str, name))
+ table->field[cached_field_index]->field_name.streq(name)) { field= table->field[cached_field_index]; DEBUG_SYNC(thd, "table_field_cached"); } else { - LEX_CSTRING fname= {name, length}; + LEX_CSTRING fname= name;
Do you need that? You can use `name` directly below
Fixed.
field= table->find_field_by_name(&fname); }
@@ -6999,9 +7000,9 @@ find_item_in_list(Item *find, List<Item>
&items, uint *counter,
List_iterator<Item> li(items); uint n_items= limit == 0 ? items.elements : limit; Item **found=0, **found_unaliased= 0, *item; - const char *db_name=0; + const LEX_CSTRING *db_name=0; const LEX_CSTRING *field_name= 0; - const char *table_name=0; + const LEX_CSTRING *table_name=0;
Why not Lex_ident's ?
Fixed.
bool found_unaliased_non_uniq= 0; /* true if the item that we search for is a valid name reference @@ -7059,12 +7060,11 @@ find_item_in_list(Item *find, List<Item>
&items, uint *counter,
item is not fix_field()'ed yet. */ if (item_field->field_name.str && item_field->table_name.str && - !lex_string_cmp(system_charset_info, &item_field->field_name, - field_name) && - !my_strcasecmp(table_alias_charset,
item_field->table_name.str,
- table_name) && - (!db_name || (item_field->db_name.str && - !strcmp(item_field->db_name.str, db_name)))) + item_field->field_name.streq(*field_name) && + item_field->table_name.streq(*table_name) && + (!db_name || !db_name->str || + (item_field->db_name.str && + !strcmp(item_field->db_name.str, db_name->str))))
why strcmp?
Fixed to: item_field->db_name.streq(*db_name)
{ if (found_unaliased) { @@ -7135,11 +7132,10 @@ find_item_in_list(Item *find, List<Item>
&items, uint *counter,
} } } - else if (!table_name) + else if (!table_name || !table_name->str) { if (is_ref_by_name && find->name.str && item->name.str && - find->name.length == item->name.length && - !lex_string_cmp(system_charset_info, &item->name, &find->name)) + item->name.streq(find->name))
Do you have tests that will fail if streq will compare lengths?
Thanks for noticing this. I have now added a test into identifier.test
{ found= li.ref(); *counter= i; diff --git a/sql/sql_class.cc b/sql/sql_class.cc index a2414de645b..1b36cd9416f 100644 --- a/sql/sql_class.cc +++ b/sql/sql_class.cc @@ -855,10 +853,13 @@ THD::THD(my_thread_id id, bool is_wsrep_applier) profiling.set_thd(this); #endif user_connect=(USER_CONN *)0; - my_hash_init(key_memory_user_var_entry, &user_vars,
system_charset_info,
+ my_hash_init(key_memory_user_var_entry, &user_vars, + Lex_ident_user_var::charset_info(), USER_VARS_HASH_SIZE, 0, 0, (my_hash_get_key) get_var_key, (my_hash_free_key) free_user_var, HASH_THREAD_SPECIFIC); - my_hash_init(PSI_INSTRUMENT_ME, &sequences, system_charset_info, + my_hash_init(PSI_INSTRUMENT_ME, &sequences, + // QQ: why case insensitive?
looks like a bug, can you do a test case for it, please?
There is a bug indeed. I reported it as: MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0 The current patch fixes it by using Lex_ident_fs::charset_info() instead of system_charset_info. The fix should also be backported to 10.4.
+ &my_charset_utf8mb3_general1400_as_ci, SEQUENCES_HASH_SIZE, 0, 0, (my_hash_get_key) get_sequence_last_key, (my_hash_free_key)
free_sequence_last,
HASH_THREAD_SPECIFIC); @@ -4011,7 +4015,7 @@
Table_ident::to_ident_db_internal_with_error(Query_arena *arena) const
if (is_derived_table()) { DBUG_ASSERT(db.str == empty_c_string && db.length == 0); - return Lex_ident_db(empty_c_string, 0); + return Lex_ident_db(empty_c_string, (size_t) 0);
why?
Removed the cast.
} // Normal table or JSON table return arena->to_ident_db_internal_with_error(db); diff --git a/sql/sql_class.h b/sql/sql_class.h index 214118d8431..5685240ceaf 100644 --- a/sql/sql_class.h +++ b/sql/sql_class.h @@ -433,13 +435,14 @@ class Alter_rename_key : public Sql_alloc class Alter_index_ignorability: public Sql_alloc { public: - Alter_index_ignorability(const char *name, bool is_ignored, bool
if_exists) :
+ Alter_index_ignorability(const LEX_CSTRING &name, + bool is_ignored, bool if_exists) : m_name(name), m_is_ignored(is_ignored), m_if_exists(if_exists) { - assert(name != NULL); + assert(name.str != NULL);
DBUG_ASSERT
Fixed.
}
- const char *name() const { return m_name; } + const Lex_ident_column &name() const { return m_name; } bool if_exists() const { return m_if_exists; }
/* The ignorability after the operation is performed. */ @@ -7989,42 +7995,24 @@ class Identifier_chain2 class Database_qualified_name { public: - LEX_CSTRING m_db; - LEX_CSTRING m_name; - Database_qualified_name(const LEX_CSTRING *db, const LEX_CSTRING *name) - :m_db(*db), m_name(*name) + Lex_ident_db m_db; + Lex_cstring m_name; // no comparison semantics + Database_qualified_name() { } - Database_qualified_name(const LEX_CSTRING &db, const LEX_CSTRING &name) + Database_qualified_name(const Lex_ident_db &db, const LEX_CSTRING &name) :m_db(db), m_name(name) { } - Database_qualified_name(const char *db, size_t db_length, - const char *name, size_t name_length) - { - m_db.str= db; - m_db.length= db_length; - m_name.str= name; - m_name.length= name_length; - } - - bool eq(const Database_qualified_name *other) const + bool eq_routine_name(const Database_qualified_name *other) const
eq_func_name, because the comparison semantics comes from Lex_ident_func
As agreed on slack, I renamed Lex_ident_func to Lex_ident_routine, as it's used for built-in and stored procedures (not only for built-in and stored functions).
{ - CHARSET_INFO *cs= lower_case_table_names ? - &my_charset_utf8mb3_general_ci : - &my_charset_utf8mb3_bin; - return - m_db.length == other->m_db.length && - m_name.length == other->m_name.length && - !cs->strnncoll(m_db.str, m_db.length, - other->m_db.str, other->m_db.length) && - !cs->strnncoll(m_name.str, m_name.length, - other->m_name.str, other->m_name.length); + return m_db.streq(other->m_db) &&
no need to take lower_case_table_names into account when comparing db names here?
m_db.streq() takes into account lower_case_table_names.
+ Lex_ident_func(m_name).streq(other->m_name); } /* Make copies of "db" and "name" on the memory root in internal
- Lower-case "db" if lower-case-table-names==1. - Preserve "name" as is. */ - bool copy_sp_name_internal(MEM_ROOT *mem_root, const LEX_CSTRING &db, + bool copy_sp_name_internal(MEM_ROOT *mem_root, const Lex_ident_db
&db,
const LEX_CSTRING &name);
// Export db and name as a qualified name string: 'db.name' diff --git a/sql/sql_connect.cc b/sql/sql_connect.cc index c5619b28f69..4e10f498e3c 100644 --- a/sql/sql_connect.cc +++ b/sql/sql_connect.cc @@ -509,7 +514,9 @@ extern "C" void free_table_stats(TABLE_STATS*
format: table_stats)
void init_global_table_stats(void) { - my_hash_init(PSI_INSTRUMENT_ME, &global_table_stats,
system_charset_info,
+ my_hash_init(PSI_INSTRUMENT_ME, &global_table_stats, + // QQ: why not Lex_ident_table::charset_info() ?
just change it
Done. Reported as: MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0\ Should be backported to 10.4.
+ &my_charset_utf8mb3_general1400_as_ci, max_connections, 0, 0, (my_hash_get_key) get_key_table_stats, (my_hash_free_key) free_table_stats, 0); } diff --git a/sql/sql_db.cc b/sql/sql_db.cc index 08441d9f067..51635bdf0ce 100644 --- a/sql/sql_db.cc +++ b/sql/sql_db.cc @@ -98,12 +98,10 @@ typedef struct my_dbopt_st */
static inline bool -cmp_db_names(LEX_CSTRING *db1_name, const LEX_CSTRING *db2_name) +cmp_db_names(const Lex_ident_db &db1_name, const Lex_ident_db &db2_name) { - return (db1_name->length == db2_name->length && - (db1_name->length == 0 ||
can it be used with empty names, really?
Yes, it can. thd->db is passed to this function. This is the comment before the function: /** ... The function allows to compare database names according to the MariaDB rules. The database names db1 and db2 are equal if: - db1 is NULL and db2 is NULL; or - db1 is not-NULL, db2 is not-NULL, db1 is equal to db2 in table_alias_charset ... */
- my_strcasecmp(table_alias_charset, - db1_name->str, db2_name->str) == 0)); + return (db1_name.length == 0 && db2_name.length == 0) || + db1_name.streq(db2_name); }
#ifdef HAVE_PSI_INTERFACE @@ -1087,8 +1085,7 @@ mysql_rm_db_internal(THD *thd, const
Lex_ident_db &db, bool if_exists,
Disable drop of enabled log tables, must be done before name
locking.
This check is only needed if we are dropping the "mysql" database. */ - if ((rm_mysql_schema= - (my_strcasecmp(system_charset_info, MYSQL_SCHEMA_NAME.str,
db.str) == 0)))
+ if ((rm_mysql_schema= MYSQL_SCHEMA_NAME.streq(db))) // QQ: was a bug here?
again, "was"
Reported as: MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0 Should be backported to 10.4.
{ for (table= tables; table; table= table->next_local) if (check_if_log_table(table, TRUE, "DROP")) diff --git a/sql/sql_handler.cc b/sql/sql_handler.cc index 406f67ff955..def1fc44392 100644 --- a/sql/sql_handler.cc +++ b/sql/sql_handler.cc @@ -1080,11 +1079,14 @@ static SQL_HANDLER *mysql_ha_find_match(THD
*thd, TABLE_LIST *tables)
{ if (tables->is_anonymous_derived_table()) continue; + /* + TODO: for some reasons the code user case-insensitive
comparison.
+ Shouldn't it use Lex_ident_db/Lex_ident_table instead, + to take into account the filesystem case sensitivity? + */
must be a bug. try to create a test case for it. I think it could be something like (untested)
create table t1 ...; create table T1; handler t1 open; handler t1 read first; handler t1 read next; drop table T1; handler t1 read next;
It's a bug. I reported it as: MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0 This patch fixes the problem. It should also be backported to 10.4.
if ((! tables->db.str[0] || - ! my_strcasecmp(&my_charset_latin1, hash_tables->db.str, - tables->get_db_name())) && - ! my_strcasecmp(&my_charset_latin1,
hash_tables->table_name.str,
- tables->get_table_name())) + Lex_ident_ci(hash_tables->db).streq(tables->get_db_name())) && + Lex_ident_ci(hash_tables->table_name).streq(tables->get_table_name())) { /* Link into hash_tables list */ hash_tables->next= head; diff --git a/sql/sql_i_s.h b/sql/sql_i_s.h index 263031ae2c9..127b35f6ab2 100644 --- a/sql/sql_i_s.h +++ b/sql/sql_i_s.h @@ -329,7 +325,7 @@ typedef class Item COND;
typedef struct st_schema_table { - const char *table_name; + LEX_CSTRING table_name;
why LEX_CSTRING?
Changed to Lex_ident_i_s_table.
ST_FIELD_INFO *fields_info; /* for FLUSH table_name */ int (*reset_table) (); diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc index a84fd29affc..b74ecb51a69 100644 --- a/sql/sql_lex.cc +++ b/sql/sql_lex.cc @@ -9370,8 +9359,8 @@ bool LEX::create_package_finalize(THD *thd, { if (name2 && (name2->m_explicit_name != name->m_explicit_name || - strcmp(name2->m_db.str, name->m_db.str) || - !Sp_handler::eq_routine_name(name2->m_name, name->m_name))) + // QQ: was a bug here?
"was" may be it was intentional, looking for an exact match only?
+ !name2->eq_routine_name(name))) { bool exp= name2->m_explicit_name || name->m_explicit_name; my_error(ER_END_IDENTIFIER_DOES_NOT_MATCH, MYF(0), diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc index 665d2143872..1430aa473af 100644 --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -1131,8 +1131,8 @@ static bool wsrep_tables_accessible_when_detached(const TABLE_LIST *tables) { for (const TABLE_LIST *table= tables; table; table=
It was not intentional. But there were no bugs here. The database names come to here in normalized format, i.e. lower-cased when --lower-case-table-names >0. So strcmp() worked fine. I removed the QQ comment, but I won't revert this chunk, it's not harfmul. I added MTR tests into lowercase_table.test and lowercase_table2.test. table->next_global)
I'd personally use const TABLE_LIST *t= tables; ... it's a 3-line loop
Fixed.
{ - LEX_CSTRING db= table->db, tn= table->table_name; - if (get_table_category(&db, &tn) < TABLE_CATEGORY_INFORMATION) + if (get_table_category(table->db, + table->table_name) <
TABLE_CATEGORY_INFORMATION)
return false; } return tables != NULL; diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc index 958c52fd95c..c8d64ea37ff 100644 --- a/sql/sql_partition.cc +++ b/sql/sql_partition.cc @@ -8055,13 +8050,9 @@ void make_used_partitions_str(MEM_ROOT *alloc, if (parts_str->length()) parts_str->append(','); uint index= parts_str->length(); - parts_str->append(head_pe->partition_name, - strlen(head_pe->partition_name), - system_charset_info); + parts_str->append(head_pe->partition_name,
system_charset_info);
parts_str->append('_'); - parts_str->append(pe->partition_name, - strlen(pe->partition_name), - system_charset_info); + parts_str->append(pe->partition_name, system_charset_info);
better pe->partition_name->charset_info() (here and above and below)
Fixed.
used_partitions_list.append_str(alloc, parts_str->ptr() +
index);
} partition_id++; diff --git a/sql/sql_plugin.cc b/sql/sql_plugin.cc index 8bab244c709..f335226b9a2 100644 --- a/sql/sql_plugin.cc +++ b/sql/sql_plugin.cc @@ -374,16 +374,16 @@ bool check_valid_path(const char *path, size_t
len)
static void fix_dl_name(MEM_ROOT *root, LEX_CSTRING *dl) { - const size_t so_ext_len= sizeof(SO_EXT) - 1; - if (dl->length < so_ext_len || - my_strcasecmp(&my_charset_latin1, dl->str + dl->length -
so_ext_len,
- SO_EXT)) + const Lex_ident_plugin so_ext(SO_EXT, sizeof(SO_EXT) - 1);
STRING_WITH_LEN
Fixed. <cut>
diff --git a/sql/sql_show.cc b/sql/sql_show.cc index 1fcafd6ec08..7548ff1ec57 100644 --- a/sql/sql_show.cc +++ b/sql/sql_show.cc @@ -7182,10 +7170,12 @@ static int get_schema_views_record(THD *thd, TABLE_LIST *tables, Security_context *sctx= thd->security_ctx; if (!tables->allowed_show) { - if (!my_strcasecmp(system_charset_info, tables->definer.user.str, - sctx->priv_user) && - !my_strcasecmp(system_charset_info, tables->definer.host.str, - sctx->priv_host)) + // QQ: why case insensitive comparision?
Same as above? Try a test case and fix if a bug? it's likely
create user foo; create user FOO; create view ... definer=foo v1 ... connect(as FOO); select * from information schema.views;
Reported as: MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS This patch fixes the problem. Should be backported to 10.4.
+ if (my_charset_utf8mb3_general1400_as_ci.streq( + tables->definer.user, + Lex_cstring_strlen(sctx->priv_user)) && + Lex_ident_host(tables->definer.host). + streq(Lex_cstring_strlen(sctx->priv_host))) tables->allowed_show= TRUE; #ifndef NO_EMBEDDED_ACCESS_CHECKS else @@ -7893,8 +7883,7 @@ static int get_schema_partitions_record(THD
*thd, TABLE_LIST *tables,
while ((part_elem= part_it++)) { - table->field[3]->store(part_elem->partition_name, - strlen(part_elem->partition_name), cs); + table->field[3]->store(part_elem->partition_name, cs);
may be part_elem->partition_name->charset_info() ?
I added these two helper methods in Field: int store_ident(const Lex_ident_ci &str) { return store(str, str.charset_info()); } int store_ident(const Lex_ident_fs &str) { return store(str, str.charset_info()); } So now the call looks even simpler: table->field[3]->store_ident(part_elem->partition_name);
table->field[3]->set_notnull(); /* PARTITION_ORDINAL_POSITION */ table->field[5]->store((longlong) ++part_pos, TRUE); diff --git a/sql/sql_trigger.cc b/sql/sql_trigger.cc index 5304a535ad0..dcba31d06c9 100644 --- a/sql/sql_trigger.cc +++ b/sql/sql_trigger.cc @@ -470,7 +470,7 @@ bool mysql_create_or_drop_trigger(THD *thd,
TABLE_LIST *tables, bool create)
/* We don't allow creating triggers on tables in the 'mysql' schema */ - if (create && lex_string_eq(&tables->db, STRING_WITH_LEN("mysql"))) + if (create && tables->db.streq(MYSQL_SCHEMA_NAME)) // QQ: was a bug here?
"was"
{ my_error(ER_NO_TRIGGERS_ON_SYSTEM_SCHEMA, MYF(0)); DBUG_RETURN(TRUE); diff --git a/sql/sql_trigger.h b/sql/sql_trigger.h index 493349afb9d..5648f8f15ff 100644 --- a/sql/sql_trigger.h +++ b/sql/sql_trigger.h @@ -74,8 +74,10 @@ struct st_trg_execution_order /** Trigger name referenced in the FOLLOWS/PRECEDES clause of the CREATE TRIGGER statement. + Cannot be Lex_ident_trigger, + as this structure is used in %union in sql_yacc.yy
Thanks. I already wrote above "why it's not simply a Lex_ident_trigger. That explains it, deleted my previous comment 🙂
*/ - LEX_CSTRING anchor_trigger_name; + LEX_CSTRING anchor_trigger_name; // Used in sql_yacc %union };
diff --git a/sql/sql_truncate.cc b/sql/sql_truncate.cc index beeee9da72f..e3efc3122df 100644 --- a/sql/sql_truncate.cc +++ b/sql/sql_truncate.cc @@ -147,14 +147,12 @@ fk_truncate_illegal_if_parent(THD *thd, TABLE *table) /* Loop over the set of foreign keys for which this table is a
I reported this problem as: MDEV-33088 Cannot create triggers in the database `MYSQL` This patch fixes the problem and add an MTR test. But it should be backported to 10.4. parent. */
while ((fk_info= it++)) { - if (lex_string_cmp(system_charset_info, fk_info->referenced_db, - &table->s->db) || - lex_string_cmp(system_charset_info, fk_info->referenced_table, - &table->s->table_name) || - lex_string_cmp(system_charset_info, fk_info->foreign_db, - &table->s->db) || - lex_string_cmp(system_charset_info, fk_info->foreign_table, - &table->s->table_name)) + // QQ: why db and table don't use table_alias_charset for comparison?
Same as above? Try a test case and fix if a bug? test looks fairly straightforward here too
+ if (!Lex_ident_ci(*fk_info->referenced_db).streq(table->s->db) || + !Lex_ident_ci(*fk_info->referenced_table). + streq(table->s->table_name) || + !Lex_ident_ci(*fk_info->foreign_db).streq(table->s->db) || + !Lex_ident_ci(*fk_info->foreign_table).streq(table->s->table_name)) break; }
diff --git a/sql/sql_view.cc b/sql/sql_view.cc index 10588b54fa5..751373b4571 100644 --- a/sql/sql_view.cc +++ b/sql/sql_view.cc @@ -1347,12 +1347,9 @@ bool mysql_make_view(THD *thd, TABLE_SHARE *share, TABLE_LIST *table, precedent; precedent= precedent->referencing_view) { - if (precedent->view_name.length == table->table_name.length && - precedent->view_db.length == table->db.length && - my_strcasecmp(system_charset_info, - precedent->view_name.str,
I removed the "QQ" message and fixed to use Lex_ident_db::streq() and Lex_ident_table::streq(), but I could not make a test. - If the table "T1" is not referenced in any foreign keys (but at the same time if another table "t1" is referenced), then this function returns earlier, before this "while" loop. - If both tables "T1" and "t1" are referenced from foreign keys, then the letter case in the above tests does not matter :) table->table_name.str) == 0 &&
- my_strcasecmp(system_charset_info, - precedent->view_db.str, table->db.str) == 0) + // QQ: was a bug here?
"was"
I spent an hour trying to make an MTR test, but failed. Just removed the "QQ" line, without adding a test.
+ if (precedent->view_name.streq(table->table_name) && + precedent->view_db.streq(table->db)) { my_error(ER_VIEW_RECURSIVE, MYF(0), top_view->view_db.str, top_view->view_name.str); diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy index fcfe490c035..d01c195cb8a 100644 --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -15676,7 +15680,8 @@ user_maybe_role: It's OK to use in-place lowercase as long as the character set is utf8. */ - my_casedn_str(system_charset_info, (char*) $$->host.str); + $$->host.length= my_casedn_str(system_charset_info, + (char*) $$->host.str);
I thought my_casedn_str() changes were in the different commit?
Correct, in the patch for: MDEV-31531 Remove my_casedn_str() and my_caseup_str() which should be pushed first.
} else { diff --git a/sql/structs.h b/sql/structs.h index 75bdc9e276b..d349ab72970 100644 --- a/sql/structs.h +++ b/sql/structs.h @@ -321,11 +322,25 @@ typedef struct user_conn { uint conn_per_hour, updates, questions; /* Maximum amount of resources which account is allowed to
consume. */
USER_RESOURCES user_resources; + + /* + The CHARSET_INFO used for hashes to compare the entire 'user\0hash' key. + QQ: perhaps the user should be compared case sensitively, + while host host should be comparad case insensitively.
correct
Thanks for confirming. I fixed the comment to: /* The CHARSET_INFO used for hashes to compare the entire 'user\0hash' key. Eventually we should fix it as follows: - the user part should be hashed and compared case sensitively, - the host part should be hashed and compared case insensitively. */
+ */ + static CHARSET_INFO *user_host_key_charset_info_for_hash() + { + return &my_charset_utf8mb3_general1400_as_ci; + } } USER_CONN;
typedef struct st_user_stats { char user[MY_MAX(USERNAME_LENGTH, LIST_PROCESS_HOST_LEN) + 1]; + static CHARSET_INFO *user_key_charset_info_for_hash() + { + return &my_charset_utf8mb3_general1400_as_ci; + } // Account name the user is mapped to when this is a user from mapped_user. // Otherwise, the same value as user. char priv_user[MY_MAX(USERNAME_LENGTH, LIST_PROCESS_HOST_LEN) + 1]; diff --git a/sql/table.cc b/sql/table.cc index 8bc16753e68..d9bc5fabb4f 100644 --- a/sql/table.cc +++ b/sql/table.cc @@ -88,23 +88,34 @@ struct extra2_fields static Virtual_column_info * unpack_vcol_info_from_frm(THD *, TABLE *, String *, Virtual_column_info **, bool *);
+/* + Lex_ident_db does not have operator""_Lex_ident_db, + because its constructor contains DBUG_SLOW_ASSERT + and therefore it's not a constexpr constructor. + So let's initialize a number of Lex_ident_db constants + using operator""_Lex_cstring. +*/ + /* INFORMATION_SCHEMA name */ -LEX_CSTRING INFORMATION_SCHEMA_NAME= {STRING_WITH_LEN("information_schema")}; +Lex_ident_db + INFORMATION_SCHEMA_NAME= Lex_ident_db("information_schema"_Lex_cstring);
don't think so. information_schema is case insensitive, even if fs is case sensitivity
Fixed.
/* PERFORMANCE_SCHEMA name */ -LEX_CSTRING PERFORMANCE_SCHEMA_DB_NAME=
{STRING_WITH_LEN("performance_schema")};
+Lex_ident_db + PERFORMANCE_SCHEMA_DB_NAME= Lex_ident_db("performance_schema"_Lex_cstring);
same
Fixed.
/* MYSQL_SCHEMA name */ -LEX_CSTRING MYSQL_SCHEMA_NAME= {STRING_WITH_LEN("mysql")}; +Lex_ident_db + MYSQL_SCHEMA_NAME= Lex_ident_db("mysql"_Lex_cstring);
this uses fs case sensitivity, correct
/* GENERAL_LOG name */ -LEX_CSTRING GENERAL_LOG_NAME= {STRING_WITH_LEN("general_log")}; +Lex_ident_table GENERAL_LOG_NAME= "general_log"_Lex_ident_table;
/* SLOW_LOG name */ -LEX_CSTRING SLOW_LOG_NAME= {STRING_WITH_LEN("slow_log")}; +Lex_ident_table SLOW_LOG_NAME= "slow_log"_Lex_ident_table;
-LEX_CSTRING TRANSACTION_REG_NAME=
{STRING_WITH_LEN("transaction_registry")};
-LEX_CSTRING MYSQL_PROC_NAME= {STRING_WITH_LEN("proc")}; +Lex_ident_table TRANSACTION_REG_NAME= "transaction_registry"_Lex_ident_table; +Lex_ident_table MYSQL_PROC_NAME= "proc"_Lex_ident_table;
/* Keyword added as a prefix when parsing the defining expression for a @@ -280,42 +291,38 @@ const char *fn_frm_ext(const char *name) }
-TABLE_CATEGORY get_table_category(const LEX_CSTRING *db, - const LEX_CSTRING *name) +TABLE_CATEGORY get_table_category(const Lex_ident_db &db, + const Lex_ident_table &name) { - DBUG_ASSERT(db != NULL); - DBUG_ASSERT(name != NULL); - #ifdef WITH_WSREP - if (db->str && - my_strcasecmp(system_charset_info, db->str, WSREP_SCHEMA) == 0) + if (db.str && db.streq(MYSQL_SCHEMA_NAME)) //QQ: was a bug here?
"was"
Removed the "QQ" line and reported this problem as: MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0 This patch fixes the problem and add MTR tests. But it should be backported to 10.4
{ - if ((my_strcasecmp(system_charset_info, name->str,
WSREP_STREAMING_TABLE) == 0 ||
- my_strcasecmp(system_charset_info, name->str, WSREP_CLUSTER_TABLE) == 0 || - my_strcasecmp(system_charset_info, name->str, WSREP_MEMBERS_TABLE) == 0)) + if (name.streq(Lex_ident_table{STRING_WITH_LEN(WSREP_STREAMING_TABLE)}) || + name.streq(Lex_ident_table{STRING_WITH_LEN(WSREP_CLUSTER_TABLE)}) || + name.streq(Lex_ident_table{STRING_WITH_LEN(WSREP_MEMBERS_TABLE)})) { return TABLE_CATEGORY_INFORMATION; } } #endif /* WITH_WSREP */ - if (is_infoschema_db(db)) + if (is_infoschema_db(&db)) return TABLE_CATEGORY_INFORMATION;
- if (is_perfschema_db(db)) + if (is_perfschema_db(&db)) return TABLE_CATEGORY_PERFORMANCE;
- if (lex_string_eq(&MYSQL_SCHEMA_NAME, db)) + if (db.streq(MYSQL_SCHEMA_NAME)) // QQ: was a bug here?
"was"
It's MDEV-33120 again. Removed the "QQ" line and added an MTR test.
{ - if (is_system_table_name(name->str, name->length)) + if (is_system_table_name(name.str, name.length)) return TABLE_CATEGORY_SYSTEM;
- if (lex_string_eq(&GENERAL_LOG_NAME, name)) + if (name.streq(GENERAL_LOG_NAME)) // QQ: was bug here?
"was"
It's MDEV-33120 again. Removed the "QQ" line and added an MTR test.
return TABLE_CATEGORY_LOG;
- if (lex_string_eq(&SLOW_LOG_NAME, name)) + if (name.streq(SLOW_LOG_NAME)) // QQ: was a bug here?
"was"
It's MDEV-33120 again. Removed the "QQ" line and added an MTR test.
return TABLE_CATEGORY_LOG;
- if (lex_string_eq(&TRANSACTION_REG_NAME, name)) + if (name.streq(TRANSACTION_REG_NAME)) // QQ: was a bug here?
"was"
It's MDEV-33120 again. Removed the "QQ" line and added an MTR test.
return TABLE_CATEGORY_LOG; }
diff --git a/sql/table.h b/sql/table.h index 35be3bd4b2d..fe65025478d 100644 --- a/sql/table.h +++ b/sql/table.h @@ -3402,27 +3409,27 @@ static inline int set_zone(int nr,int
min_zone,int max_zone)
}
/* performance schema */ -extern LEX_CSTRING PERFORMANCE_SCHEMA_DB_NAME; +extern Lex_ident_db PERFORMANCE_SCHEMA_DB_NAME;
-extern LEX_CSTRING GENERAL_LOG_NAME; -extern LEX_CSTRING SLOW_LOG_NAME; -extern LEX_CSTRING TRANSACTION_REG_NAME; +extern Lex_ident_table GENERAL_LOG_NAME; +extern Lex_ident_table SLOW_LOG_NAME; +extern Lex_ident_table TRANSACTION_REG_NAME;
/* information schema */ -extern LEX_CSTRING INFORMATION_SCHEMA_NAME; -extern LEX_CSTRING MYSQL_SCHEMA_NAME; +extern Lex_ident_db INFORMATION_SCHEMA_NAME; +extern Lex_ident_db MYSQL_SCHEMA_NAME;
/* table names */ -extern LEX_CSTRING MYSQL_PROC_NAME; +extern Lex_ident_table MYSQL_PROC_NAME;
inline bool is_infoschema_db(const LEX_CSTRING *name) { - return lex_string_eq(&INFORMATION_SCHEMA_NAME, name); + return lex_string_eq(&INFORMATION_SCHEMA_NAME, name); // QQ: use Lex_ident_ci ?
yes
Fixed
}
inline bool is_perfschema_db(const LEX_CSTRING *name) { - return lex_string_eq(&PERFORMANCE_SCHEMA_DB_NAME, name); + return lex_string_eq(&PERFORMANCE_SCHEMA_DB_NAME, name); // QQ:
use Lex_ident_ci?
yes
Fixed
}
inline void mark_as_null_row(TABLE *table) diff --git a/storage/blackhole/ha_blackhole.cc
b/storage/blackhole/ha_blackhole.cc
index 343f3c70286..ce48c95f837 100644 --- a/storage/blackhole/ha_blackhole.cc +++ b/storage/blackhole/ha_blackhole.cc @@ -416,7 +416,9 @@ static int blackhole_init(void *p) mysql_mutex_init(bh_key_mutex_blackhole, &blackhole_mutex, MY_MUTEX_INIT_FAST); (void) my_hash_init(PSI_INSTRUMENT_ME, &blackhole_open_tables, - system_charset_info, 32, 0, 0, + // QQ: was a bug here?
"was"
Just removed the comment. I don't know how to make an MTR test.
+ Lex_ident_table::charset_info(), + 32, 0, 0, (my_hash_get_key) blackhole_get_key, (my_hash_free_key) blackhole_free_key, 0);
diff --git a/storage/csv/ha_tina.cc b/storage/csv/ha_tina.cc index e705ff0e7c0..9fc0bcf604b 100644 --- a/storage/csv/ha_tina.cc +++ b/storage/csv/ha_tina.cc @@ -184,7 +184,9 @@ static int tina_init_func(void *p) tina_hton= (handlerton *)p; mysql_mutex_init(csv_key_mutex_tina, &tina_mutex,
MY_MUTEX_INIT_FAST);
(void) my_hash_init(csv_key_memory_tina_share, &tina_open_tables, - system_charset_info, 32, 0, 0, (my_hash_get_key) + // QQ: was a bug here?
"was"
It was a bug. Reported as: MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0 This patch fixes the problem. It should be also backported to 10.4.
+ table_alias_charset, + 32, 0, 0, (my_hash_get_key) tina_get_key, 0, 0); tina_hton->db_type= DB_TYPE_CSV_DB; tina_hton->create= tina_create_handler; diff --git a/storage/innobase/dict/dict0dict.cc
b/storage/innobase/dict/dict0dict.cc
index 5545727b015..231bcc1a132 100644 --- a/storage/innobase/dict/dict0dict.cc +++ b/storage/innobase/dict/dict0dict.cc @@ -284,27 +284,26 @@ otherwise table->n_def */ ulint dict_table_has_column( const dict_table_t* table, - const char* col_name, + const LEX_CSTRING &col_name,
looks ok to me, but to avoid surprises Marko should review it too
I agree. I will ask Marko to review.
ulint col_nr) { ulint col_max = table->n_def;
ut_ad(table); - ut_ad(col_name); + ut_ad(col_name.str); ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
if (col_nr < col_max - && innobase_strcasecmp( - col_name, dict_table_get_col_name(table, col_nr)) == 0) { + && Lex_ident_column(col_name). + streq(dict_table_get_col_name_ls(table, col_nr))) { return(col_nr); }
/** The order of column may changed, check it with other columns */ for (ulint i = 0; i < col_max; i++) { if (i != col_nr - && innobase_strcasecmp( - col_name, dict_table_get_col_name(table, i)) == 0) { - + && Lex_ident_column(col_name). + streq(dict_table_get_col_name_ls(table, i))) { return(i); } } diff --git a/storage/mroonga/ha_mroonga.cpp
b/storage/mroonga/ha_mroonga.cpp
index 44fd09e74ed..71784dad475 100644 --- a/storage/mroonga/ha_mroonga.cpp +++ b/storage/mroonga/ha_mroonga.cpp @@ -1936,7 +1936,10 @@ static int mrn_init(void *p) MY_MUTEX_INIT_FAST) != 0)) { goto err_allocated_thds_mutex_init; } - if (mrn_my_hash_init(&mrn_allocated_thds, system_charset_info, 32, 0, 0, + if (mrn_my_hash_init(&mrn_allocated_thds, + // QQ: was a bug here?
"was"
Removed the comment.
+ &my_charset_bin, + 32, 0, 0, mrn_allocated_thds_get_key, 0, 0)) { goto error_allocated_thds_hash_init; } @@ -1945,7 +1948,10 @@ static int mrn_init(void *p) MY_MUTEX_INIT_FAST) != 0)) { goto err_allocated_open_tables_mutex_init; } - if (mrn_my_hash_init(&mrn_open_tables, system_charset_info, 32, 0, 0, + if (mrn_my_hash_init(&mrn_open_tables, + // QQ: was a bug here?
"was"
Removed the comment.
+ Lex_ident_table::charset_info(), + 32, 0, 0, mrn_open_tables_get_key, 0, 0)) { goto error_allocated_open_tables_hash_init; } @@ -1954,7 +1960,10 @@ static int mrn_init(void *p) MY_MUTEX_INIT_FAST) != 0)) { goto error_allocated_long_term_share_mutex_init; } - if (mrn_my_hash_init(&mrn_long_term_share, system_charset_info,
32, 0, 0,
+ if (mrn_my_hash_init(&mrn_long_term_share, + // QQ: was a bug here?
"was"
Removed the comment.
+ Lex_ident_table::charset_info(), + 32, 0, 0, mrn_long_term_share_get_key, 0, 0)) { goto error_allocated_long_term_share_hash_init; } diff --git a/storage/perfschema/pfs_engine_table.h
b/storage/perfschema/pfs_engine_table.h
index 1ef1e0282de..b4f4c9f0acc 100644 --- a/storage/perfschema/pfs_engine_table.h +++ b/storage/perfschema/pfs_engine_table.h @@ -25,6 +25,22 @@
#include "table.h" #include "sql_acl.h" + +/* + The performance schema is implemented as a storage engine, in memory. + The current storage engine interface exposed by the server, + and in particular handlerton::discover, uses 'FRM' files to describe a + table structure, which are later stored on disk, by the server, + in ha_create_table_from_engine(). + Because the table metadata is stored on disk, the table naming rules + used by the performance schema then have to comply with the constraints + imposed by the disk storage, and in particular with lower_case_table_names. +*/ + +using PFS_ident_db= Lex_ident_db; +using PFS_ident_table = Lex_ident_table;
No, table metadata for performance schema are never stored on disk. let's use case insensitive comparison here, if possible.
Fixed.
+ + /** @file storage/perfschema/pfs_engine_table.h Performance schema tables (declarations).
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Hi, Alexander, Few comments (I've omitted everything where I'd reply simply "ok"). On Dec 28, Alexander Barkov wrote:
Could you instead implement a constructor that understands
const Lex_ident_column primary_key_name= {STRING_WITH_LEN("PRIMARY")};
The are constructors: Lex_ident_xxx(const char *str, size_t length) Lex_ident_xxx(const LEX_CSTRING &str)
All these notations work:
Lex_ident_db MYSQL_SCHEMA_NAME("mysql"_LEX_CSTRING); Lex_ident_db MYSQL_SCHEMA_NAME2{"mysql"_LEX_CSTRING}; Lex_ident_db MYSQL_SCHEMA_NAME3= {STRING_WITH_LEN("mysql")}; Lex_ident_db MYSQL_SCHEMA_NAME4{STRING_WITH_LEN("mysql")};
Then let's use a shorter syntax, if possible? Not const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")}); but const Lex_ident_column primary_key_name= {STRING_WITH_LEN("PRIMARY")}; or anything else out of these four variants, they're all fine.
The client library should be fixed to know new collations utf8mb3_general_as_ci and utf8mb4_general_as_ci and their IDs.
is there a CONC issue for that?
diff --git a/sql/debug_sync.cc b/sql/debug_sync.cc index 362463a7f22..4704691615a 100644 --- a/sql/debug_sync.cc +++ b/sql/debug_sync.cc @@ -858,6 +858,13 @@ static st_debug_sync_action *debug_sync_get_action(THD *thd, }
+class Debug_token: public Lex_ident_ci +{ +public: + using Lex_ident_ci::Lex_ident_ci; +};
I wonder if you can create those types with less boilerplate
will typedef work? will using work? if not, may be, like
template<> Lex_ident_templ: Lex_ident_ci { ... } Lex_ident_templ<> Debug_token;
These classes are small enough. I think there are no needs for a template. We have developers who don't like templates, unless they are absolutely necessary. And this not an absolutly necessary case :)
We also have developers who don't like multiple nearly-identical copies of the same copy-pasted code. But ok, it's a matter of taste, so up to you here.
diff --git a/sql/handler.cc b/sql/handler.cc index a977054d0da..7b9d11d00a2 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -7531,7 +7528,8 @@ int handler::check_duplicate_long_entry_key(const uchar *new_rec, uint key_no) else { Item_func_left *fnc= static_cast<Item_func_left *>(arguments[j]); - DBUG_ASSERT(!my_strcasecmp(system_charset_info, "left", fnc->func_name())); + DBUG_ASSERT(Lex_ident_func(fnc->func_name_cstring()).
eh, why func_name_cstring() doesn't return Lex_ident_func?
I agree it should be Lex_ident_func. However, "grep LEX_CSTRING.func_name_cstring" shows 388 instances. It's going to be a huge change. Can we do it in a separate patch?
okay. it just looks wrong.
+ streq("left"_Lex_cstring)); DBUG_ASSERT(fnc->arguments()[0]->type() == Item::FIELD_ITEM); t_field= static_cast<Item_field *>(fnc->arguments()[0])->field; uint length= (uint)fnc->arguments()[1]->val_int(); diff --git a/sql/opt_trace.cc b/sql/opt_trace.cc index 1008690ccaa..dbb2708f487 100644 --- a/sql/opt_trace.cc +++ b/sql/opt_trace.cc @@ -43,7 +43,7 @@ bool list_has_optimizer_trace_table(const TABLE_LIST *tbl) for (; tbl; tbl= tbl->next_global) { if (tbl->schema_table && - 0 == strcmp(tbl->schema_table->table_name, I_S_table_name)) + 0 == strcmp(tbl->schema_table->table_name.str, I_S_table_name))
Why not streq?
I'm not sure how exactly it should be compared here: 1. Like a normal table, according to Lex_ident_table 2. Like an I_S table, according to Lex_ident_i_s_table 3. Case sensitively, like it was in the old code.
I decided to go with #3, to preserve the old behavior.
Should we report it as a separate MDEV and ask Sergey Petrunya to check?
It's an I_S table name, right? So, should be #2.
return true; } return false; diff --git a/sql/sql_base.cc b/sql/sql_base.cc index 9d2415d3a4d..d7d7ebee024 100644 --- a/sql/sql_base.cc +++ b/sql/sql_base.cc @@ -1906,7 +1904,7 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) DBUG_ASSERT(!table_list->table);
/* an open table operation needs a lot of the stack space */ - if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias)) + if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias.str))
better pass `key` above. doesn't matter much but looks a bit less weird
I'd like to avoid to change the logic that's not directly related to the task.
that's not a logic change, it's a dummy argument that someone used as a hack to trick a compiler into not optimizing something out. by passing alias.str you give it a deeper meaning, surely it must be important to pass specifically this member of the alias structure, otherwise why would anyone would've written it like that? So, please change it back to something simple. Even reverting to the old (uchar *)&alias is fine. Or use NULL, it doesn't matter.
DBUG_RETURN(TRUE);
if (!(flags & MYSQL_OPEN_IGNORE_KILLED) && thd->killed) diff --git a/sql/sql_show.cc b/sql/sql_show.cc index 1fcafd6ec08..7548ff1ec57 100644 --- a/sql/sql_show.cc +++ b/sql/sql_show.cc @@ -7182,10 +7170,12 @@ static int get_schema_views_record(THD *thd, TABLE_LIST *tables, Security_context *sctx= thd->security_ctx; if (!tables->allowed_show) { - if (!my_strcasecmp(system_charset_info, tables->definer.user.str, - sctx->priv_user) && - !my_strcasecmp(system_charset_info, tables->definer.host.str, - sctx->priv_host)) + // QQ: why case insensitive comparision?
Same as above? Try a test case and fix if a bug?
Reported as:
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
This patch fixes the problem. Should be backported to 10.4.
I love how all these bugs came into light now
+ if (my_charset_utf8mb3_general1400_as_ci.streq( + tables->definer.user, + Lex_cstring_strlen(sctx->priv_user)) &&
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Hello Sergei, This is a new version: https://github.com/MariaDB/server/commit/0f52062f908a3dbe64f7ba6929584a89611... Please see comments below: On 12/28/23 22:44, Sergei Golubchik wrote:
Hi, Alexander,
Few comments (I've omitted everything where I'd reply simply "ok").
On Dec 28, Alexander Barkov wrote:
Could you instead implement a constructor that understands
const Lex_ident_column primary_key_name= {STRING_WITH_LEN("PRIMARY")};
The are constructors: Lex_ident_xxx(const char *str, size_t length) Lex_ident_xxx(const LEX_CSTRING &str)
All these notations work:
Lex_ident_db MYSQL_SCHEMA_NAME("mysql"_LEX_CSTRING); Lex_ident_db MYSQL_SCHEMA_NAME2{"mysql"_LEX_CSTRING}; Lex_ident_db MYSQL_SCHEMA_NAME3= {STRING_WITH_LEN("mysql")}; Lex_ident_db MYSQL_SCHEMA_NAME4{STRING_WITH_LEN("mysql")};
Then let's use a shorter syntax, if possible? Not
const Lex_ident_column primary_key_name= Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
but
const Lex_ident_column primary_key_name= {STRING_WITH_LEN("PRIMARY")};
or anything else out of these four variants, they're all fine.
The client library should be fixed to know new collations utf8mb3_general_as_ci and utf8mb4_general_as_ci and their IDs.
is there a CONC issue for that?
I just created it: CONC-682 Collation IDs for utf8mbX_general_as_ci
diff --git a/sql/debug_sync.cc b/sql/debug_sync.cc index 362463a7f22..4704691615a 100644 --- a/sql/debug_sync.cc +++ b/sql/debug_sync.cc @@ -858,6 +858,13 @@ static st_debug_sync_action *debug_sync_get_action(THD *thd, }
+class Debug_token: public Lex_ident_ci +{ +public: + using Lex_ident_ci::Lex_ident_ci; +};
I wonder if you can create those types with less boilerplate
will typedef work? will using work? if not, may be, like
template<> Lex_ident_templ: Lex_ident_ci { ... } Lex_ident_templ<> Debug_token;
These classes are small enough. I think there are no needs for a template. We have developers who don't like templates, unless they are absolutely necessary. And this not an absolutly necessary case :)
We also have developers who don't like multiple nearly-identical copies of the same copy-pasted code. But ok, it's a matter of taste, so up to you here.
diff --git a/sql/handler.cc b/sql/handler.cc index a977054d0da..7b9d11d00a2 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -7531,7 +7528,8 @@ int handler::check_duplicate_long_entry_key(const uchar *new_rec, uint key_no) else { Item_func_left *fnc= static_cast<Item_func_left *>(arguments[j]); - DBUG_ASSERT(!my_strcasecmp(system_charset_info, "left", fnc->func_name())); + DBUG_ASSERT(Lex_ident_func(fnc->func_name_cstring()).
eh, why func_name_cstring() doesn't return Lex_ident_func?
I agree it should be Lex_ident_func. However, "grep LEX_CSTRING.func_name_cstring" shows 388 instances. It's going to be a huge change. Can we do it in a separate patch?
okay. it just looks wrong.
+ streq("left"_Lex_cstring)); DBUG_ASSERT(fnc->arguments()[0]->type() == Item::FIELD_ITEM); t_field= static_cast<Item_field *>(fnc->arguments()[0])->field; uint length= (uint)fnc->arguments()[1]->val_int(); diff --git a/sql/opt_trace.cc b/sql/opt_trace.cc index 1008690ccaa..dbb2708f487 100644 --- a/sql/opt_trace.cc +++ b/sql/opt_trace.cc @@ -43,7 +43,7 @@ bool list_has_optimizer_trace_table(const TABLE_LIST *tbl) for (; tbl; tbl= tbl->next_global) { if (tbl->schema_table && - 0 == strcmp(tbl->schema_table->table_name, I_S_table_name)) + 0 == strcmp(tbl->schema_table->table_name.str, I_S_table_name))
Why not streq?
I'm not sure how exactly it should be compared here: 1. Like a normal table, according to Lex_ident_table 2. Like an I_S table, according to Lex_ident_i_s_table 3. Case sensitively, like it was in the old code.
I decided to go with #3, to preserve the old behavior.
Should we report it as a separate MDEV and ask Sergey Petrunya to check?
It's an I_S table name, right? So, should be #2.
Changed to #2.
return true; } return false; diff --git a/sql/sql_base.cc b/sql/sql_base.cc index 9d2415d3a4d..d7d7ebee024 100644 --- a/sql/sql_base.cc +++ b/sql/sql_base.cc @@ -1906,7 +1904,7 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) DBUG_ASSERT(!table_list->table);
/* an open table operation needs a lot of the stack space */ - if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias)) + if (check_stack_overrun(thd, STACK_MIN_SIZE_FOR_OPEN, (uchar *)&alias.str))
better pass `key` above. doesn't matter much but looks a bit less weird
I'd like to avoid to change the logic that's not directly related to the task.
that's not a logic change, it's a dummy argument that someone used as a hack to trick a compiler into not optimizing something out.
by passing alias.str you give it a deeper meaning, surely it must be important to pass specifically this member of the alias structure, otherwise why would anyone would've written it like that?
So, please change it back to something simple. Even reverting to the old (uchar *)&alias is fine. Or use NULL, it doesn't matter.
Changed to &alias.
DBUG_RETURN(TRUE);
if (!(flags & MYSQL_OPEN_IGNORE_KILLED) && thd->killed) diff --git a/sql/sql_show.cc b/sql/sql_show.cc index 1fcafd6ec08..7548ff1ec57 100644 --- a/sql/sql_show.cc +++ b/sql/sql_show.cc @@ -7182,10 +7170,12 @@ static int get_schema_views_record(THD *thd, TABLE_LIST *tables, Security_context *sctx= thd->security_ctx; if (!tables->allowed_show) { - if (!my_strcasecmp(system_charset_info, tables->definer.user.str, - sctx->priv_user) && - !my_strcasecmp(system_charset_info, tables->definer.host.str, - sctx->priv_host)) + // QQ: why case insensitive comparision?
Same as above? Try a test case and fix if a bug?
Reported as:
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
This patch fixes the problem. Should be backported to 10.4.
I love how all these bugs came into light now
+ if (my_charset_utf8mb3_general1400_as_ci.streq( + tables->definer.user, + Lex_cstring_strlen(sctx->priv_user)) &&
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Hi, Alexander, On Jan 09, Alexander Barkov wrote:
Hello Sergei,
This is a new version:
I've looked at the diff of this and previous commit excluding InnoDB. Only one comment, about Lex_ident_routine::check_routine_name_with_error() and Lex_ident_fs::check_table_name(). Why do they have different semantics? The former is a static method and takes the name as an argument. The latter checks `this`, needs a Lex_ident_fs to be constructed first. It's confusing, can you modify one of these methods to work like the other? I believe the generated code is the same either way, so it's only a question of consistency and predictability. Also, I'd suggest you rename them to "check_name", I feel that Lex_ident_routine::check_routine_name_with_error() is a bit redundant. Same for check_table_name(). Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Hello Sergei, Please find a new patch here: https://github.com/MariaDB/server/commit/be66e975ecf15ae696d2e645da4aaf1a06c... Thanks. Comments follow below: On 1/9/24 16:32, Sergei Golubchik wrote:
Hi, Alexander,
On Jan 09, Alexander Barkov wrote:
Hello Sergei,
This is a new version:
I've looked at the diff of this and previous commit excluding InnoDB.
Only one comment, about Lex_ident_routine::check_routine_name_with_error() and Lex_ident_fs::check_table_name().
Why do they have different semantics? The former is a static method and takes the name as an argument. The latter checks `this`, needs a Lex_ident_fs to be constructed first.
It's confusing, can you modify one of these methods to work like the other? I believe the generated code is the same either way, so it's only a question of consistency and predictability.
Also, I'd suggest you rename them to "check_name", I feel that Lex_ident_routine::check_routine_name_with_error() is a bit redundant. Same for check_table_name().
I renamed and moved these methods to static methods as follows: class Lex_ident_db: public Lex_ident_fs { ... static bool check_name(const LEX_CSTRING &str); static bool check_name_with_error(const LEX_CSTRING &str); ... }; class Lex_ident_table: public Lex_ident_fs { ... static bool check_name(const LEX_CSTRING &str, bool check_for_path_chars); ... };
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
participants (2)
-
Alexander Barkov
-
Sergei Golubchik