Hi Sergei, Thanks for the review. The new patch "Addressing MDEV-19123 review suggestions" is here: https://github.com/MariaDB/server/commit/b3c3adbeeb08625e8816524510d5918cdb1... Please see comments inline: On 6/14/24 22:12, Sergei Golubchik wrote:
Hi, Alexander,
On Jun 14, Alexander Barkov wrote:
Hi Sergei,
Thanks for the review. Please see comments inline.
I commited a patch with a comment "Addressing MDEV-19123 review suggestions" on top of the main patch. Please find it in the same branch:
https://github.com/MariaDB/server/commits/bb-11.6-bar-MDEV-19123/
In the end, we can either squash it, or leave as an incremental patch. I'm fine with both ways. Please suggest.
Commit bff6cb7 is basically ok. See few comments/question below.
Squash it please, but not now. After yor branch will pass testing (you might need to do more fixes before that) and before pushing into the main 11.6 - that will be a good time to clean up the history and squash commits as needed.
Okey.
Thanks.
Hi, Alexander,
No comments about actual code changes, only about tests. Please, see below
On Jun 12, Alexander Barkov wrote:
revision-id: 98ebe0a3afc (mariadb-11.5.1-12-g98ebe0a3afc) parent(s): 186a30de58b author: Alexander Barkov committer: Alexander Barkov timestamp: 2024-06-11 14:17:11 +0400 message:
MDEV-19123 Change default charset from latin1 to utf8mb4
Changing the default server character set from latin1 to utf8mb4.
--- a/mysql-test/main/long_unique.result +++ b/mysql-test/main/long_unique.result @@ -41,7 +41,7 @@ Ignored NO
MyISAM file: DATADIR/test/t1 Record format: Packed -Character set: latin1_swedish_ci (8) +Character set: ? (0)
huh?
MyISAM uses only one byte to store the ID of the table-level default collation:
myisamdef.h: uchar language; /* Language for indexes */
It seems this member is used in myisamchk.c only.
Any ideas what to do with that?
please, create a bug report that myisamchk is broken for collations > 255. It seems that it can corrupt the table that way.
Or at least it can keep repairing it, thinking that collation changes every time.
MDEV-34409 myisamchk is broken for collation IDs >255
diff --git a/mysql-test/main/mysqlbinlog_row_compressed.result b/mysql-test/main/mysqlbinlog_row_compressed.result --- a/mysql-test/main/mysqlbinlog_row_compressed.result +++ b/mysql-test/main/mysqlbinlog_row_compressed.result @@ -40,9 +40,10 @@ SET @@session.sql_mode=#/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C latin1 *//*!*/; SET @@session.character_set_client=latin1,@@session.collation_connection=8,@@session.collation_server=#/*!*/; +SET @@session.character_set_collations='utf8mb3=utf8mb3_uca1400_ai_ci,ucs2=ucs2_uca1400_ai_ci,utf8mb4=utf8mb4_uca1400_ai_ci,utf16=utf16_uca1400_ai_ci,utf32=utf32_uca1400_ai_ci'/*!*/;
why did this appear?
MTR now starts mariadbd with these lines in my.cnf:
character-set-server=utf8mb4 collation-server=utf8mb4_uca1400_ai_ci
When mariadbd processes the first line, it finds the default collation for utf8mb4, which adds the CHARACTER_SET_COLLATIONS_USED flag. This forces the server to write @@character_set_collations to the binary log.
However, when it processes the second line, the flag becomes not needed.
The code could be rewritten to avoid adding CHARACTER_SET_COLLATIONS_USED if both character-set-server and collation-server are specified. The flag is needed when only character-set-server is specified without collation-server.
Do you need to set character-set-server at all? I mean, isn't the second line alone enough?
Or may be both can be removed? utf8mb4 is the default now, isn't it?
Sorry, I misunderstand the reason for the result change when traced it in gdb for the first time. The real reason was that I added the "CHARACTER SET latin1" clause into three CREATE TABLE statements in this test. I just fixed CREATE TABLEs to use: CHARACTER SET latin1 COLLATE latin1_swedish_ci The "SET @@session.character_set_collations=..." command is now gone in the mysqlbinlog output.
diff --git a/mysql-test/suite/innodb/t/alter_primary_key.test b/mysql-test/suite/innodb/t/alter_primary_key.test --- a/mysql-test/suite/innodb/t/alter_primary_key.test +++ b/mysql-test/suite/innodb/t/alter_primary_key.test @@ -1,6 +1,9 @@ --source innodb_default_row_format.inc --source include/have_debug.inc --source include/have_debug_sync.inc +--disable_query_log +--source include/test_db_charset_latin1.inc +--enable_query_log
you don't disable query log in other tests, why here?
Removed the disabling.
in all tests, I presume
There were some more tests. Fixed all of them now.
--echo # --echo # MDEV-23244 ALTER TABLE…ADD PRIMARY KEY fails to flag diff --git a/mysql-test/suite/perfschema/t/short_option_1-master.opt b/mysql-test/suite/perfschema/t/short_option_1-master.opt --- a/mysql-test/suite/perfschema/t/short_option_1-master.opt +++ b/mysql-test/suite/perfschema/t/short_option_1-master.opt @@ -1 +1 @@ --a -Cutf8 -W1 +-a -Cutf8 -W1 --collation-server=utf8_general_ci
was it necessary?
Without it, the server fails to start with this output in mysql-test/var/log/mysqld.1.err:
COLLATION 'utf8mb4_uca1400_ai_ci' is not valid for CHARACTER SET 'utf8mb3'
Can you do -Cutf8mb4 instead?
It's short_option_1.test, better to use short option names there :)
Right. Done.
create trigger t1rbr before insert on t1 for each row set new.a=now(6); diff --git a/storage/connect/mysql-test/connect/r/endian.result b/storage/connect/mysql-test/connect/r/endian.result --- a/storage/connect/mysql-test/connect/r/endian.result +++ b/storage/connect/mysql-test/connect/r/endian.result @@ -10,7 +10,7 @@ birth DATE NOT NULL FIELD_FORMAT='L', id CHAR(5) NOT NULL FIELD_FORMAT='L2', salary DOUBLE(9,2) NOT NULL DEFAULT 0.00 FIELD_FORMAT='LF', dept INT(4) NOT NULL FIELD_FORMAT='L2' -) ENGINE=CONNECT TABLE_TYPE=BIN BLOCK_SIZE=5 FILE_NAME='Testbal.dat'; +) ENGINE=CONNECT CHARSET=latin1 TABLE_TYPE=BIN BLOCK_SIZE=5 FILE_NAME='Testbal.dat';
or any connect utf8mb4 tests whatsoever?
I had to add CHARSET=latin1 into many tests.
I just removed a few CHARSET=latin1 from dbf.test, but not all of them.
Unfortunately, connect did not work with utf8mb4 in many cases. We'll need to ask Oliver to check. But this should not be a showstopper for MDEV-19123, I think.
Of course, not. But please, create a bug report like "CONNECT doesn't work with utf8mb4", explain that commit for MDEV-19123 had to add CHARSET=latin1 to many tests because many table types (see commit) don't work with utf8mb4. And assign to Andrew.
MDEV-34410 CONNECT doesn't work well with utf8mb4
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org