Re: [Maria-developers] e7762b3a725: MDEV-8334: Rename utf8 to utf8mb3
Hi, Rucha! Looks great! Just one question below re. using global vs session old_behavior variable. On May 05, Rucha Deodhar wrote:
revision-id: e7762b3a725 (mariadb-10.5.2-583-ge7762b3a725) parent(s): bfedf1eb4b6 author: Rucha Deodhar <rucha.deodhar@mariadb.com> committer: Rucha Deodhar <rucha.deodhar@mariadb.com> timestamp: 2021-04-20 12:50:32 +0530 message:
MDEV-8334: Rename utf8 to utf8mb3
This patch changes the main name of 3 byte character set from utf8 to utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default, so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
diff --git a/libmariadb b/libmariadb --- a/libmariadb +++ b/libmariadb @@ -1 +1 @@ -Subproject commit b6f8883d9687936a50a7ed79bd9e5af2340efccd +Subproject commit 03d983b287f8a1fe855cb5ed479a3f7ab4f922ab
when rebasing and pushing into 10.6 take care not to rollback C/C changes. That is, only update C/C submodule reference if it's earlier than your 03d983b287f8a1fe855cb5ed479a3f7ab4f922ab
Binary files a/mysql-test/suite/sys_vars/r/character_set_results_basic.result and b/mysql-test/suite/sys_vars/r/character_set_results_basic.result differ diff --git a/sql/item.cc b/sql/item.cc --- a/sql/item.cc +++ b/sql/item.cc @@ -2359,6 +2359,9 @@ left_is_superset(const DTCollation *left, const DTCollation *right)
bool DTCollation::aggregate(const DTCollation &dt, uint flags) { + + myf utf8_flag= global_system_variables.old_behavior & + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
if old_behavior is a session variable, then you should use session value here.
if (!my_charset_same(collation, dt.collation)) { /* diff --git a/sql/mysqld.cc b/sql/mysqld.cc --- a/sql/mysqld.cc +++ b/sql/mysqld.cc @@ -4099,6 +4098,8 @@ static int init_common_variables() test purposes, to be able to start "mysqld" even if the requested character set is not available (see bug#18743). */ + myf utf8_flag= global_system_variables.old_behavior & + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
ok, here there's no "session" yet, so global value is correct
for (;;) { char *next_character_set_name= strchr(default_character_set_name, ','); diff --git a/sql/set_var.cc b/sql/set_var.cc --- a/sql/set_var.cc +++ b/sql/set_var.cc @@ -533,11 +533,12 @@ static my_old_conv old_conv[]= CHARSET_INFO *get_old_charset_by_name(const char *name) { my_old_conv *conv; - + myf utf8_flag= global_system_variables.old_behavior & + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
technically, you should use a session value here too. but see the old_conv array, it doesn't have "utf8" anywhere, so it doesn't matter what the flag is, you can use 0 there.
for (conv= old_conv; conv->old_name; conv++) { if (!my_strcasecmp(&my_charset_latin1, name, conv->old_name)) - return get_charset_by_csname(conv->new_name, MY_CS_PRIMARY, MYF(0)); + return get_charset_by_csname(conv->new_name, MY_CS_PRIMARY, MYF(utf8_flag)); } return NULL; } diff --git a/sql/sp.cc b/sql/sp.cc --- a/sql/sp.cc +++ b/sql/sp.cc @@ -291,7 +291,8 @@ bool load_charset(MEM_ROOT *mem_root, CHARSET_INFO **cs) { LEX_CSTRING cs_name; - + myf utf8_flag= global_system_variables.old_behavior & + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
but this needs a session value, I suppose
if (field->val_str_nopad(mem_root, &cs_name)) { *cs= dflt_cs; @@ -324,9 +325,10 @@ bool load_collation(MEM_ROOT *mem_root, *cl= dflt_cl; return TRUE; } + myf utf8_flag= thd->get_utf8_flag();
Hmm, here you do use a session value. So, I suppose your using global value above was not an oversight, but you intentionally did it. What were your reasons?
DBUG_ASSERT(cl_name.str[cl_name.length] == 0); - *cl= get_charset_by_name(cl_name.str, MYF(0)); + *cl= get_charset_by_name(cl_name.str, MYF(utf8_flag));
if (*cl == NULL) { diff --git a/sql/sql_class.h b/sql/sql_class.h --- a/sql/sql_class.h +++ b/sql/sql_class.h @@ -1051,14 +1052,16 @@ static inline void update_global_memory_status(int64 size) @retval NULL on error @retval Pointter to CHARSET_INFO with the given name on success */ -static inline CHARSET_INFO * -mysqld_collation_get_by_name(const char *name, +inline CHARSET_INFO *
static inline, please
+mysqld_collation_get_by_name(const char *name, bool utf8_is_utf8mb3,
up to you, but wouldn't it be more convenient to pass the flag here? Then you can invoke it with thd->utf8_flag()
CHARSET_INFO *name_cs= system_charset_info) { CHARSET_INFO *cs; MY_CHARSET_LOADER loader; + myf utf8_flag= utf8_is_utf8mb3 ? MY_UTF8_IS_UTF8MB3 : 0; my_charset_loader_init_mysys(&loader); - if (!(cs= my_collation_get_by_name(&loader, name, MYF(0)))) + + if (!(cs= my_collation_get_by_name(&loader, name, MYF(utf8_flag)))) { ErrConvString err(name, name_cs); my_error(ER_UNKNOWN_COLLATION, MYF(0), err.ptr()); diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc --- a/sql/sql_parse.cc +++ b/sql/sql_parse.cc @@ -10449,7 +10449,10 @@ merge_charset_and_collation(CHARSET_INFO *cs, CHARSET_INFO *cl) CHARSET_INFO *find_bin_collation(CHARSET_INFO *cs) { const char *csname= cs->csname; - cs= get_charset_by_csname(csname, MY_CS_BINSORT, MYF(0)); + myf utf8_flag= global_system_variables.old_behavior & + OLD_MODE_UTF8_IS_UTF8MB3 ? + MY_UTF8_IS_UTF8MB3 : 0;
Why not a session value here?
+ cs= get_charset_by_csname(csname, MY_CS_BINSORT, MYF(utf8_flag)); if (!cs) { char tmp[65];
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
participants (1)
-
Sergei Golubchik