[Maria-developers] is this a bug?
hi all: I have a problem about the type cast when I have read the source of Maria & MySQL I notice that when cast from string into int, Maria & MySQL do it like the C function atoi code: for (ul= 0 ; str < end9 && (ch= (uchar) (*str - '0')) < 10; str++) { ul= ul * 10 + ch; } but cast from hex into int, the code is: code: char *end=(char*) str_value.ptr()+str_value.length(), *ptr=end-min(str_value.length(),sizeof(longlong)); ulonglong value=0; for (; ptr != end ; ptr++) value=(value << 8)+ (ulonglong) (uchar) *ptr; so I do the test blew: create table t(id int auto_increment primary key, a int)engine=innodb; insert into t(a) values('1'); insert into t(a) values(0x31); everything is ok, I get the result: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | | 2 | 49 | +----+------+ the value 0x31 is the ascii of the string '1', but we got the different result after inserting maybe it's not a problem, but in the replication environment, it maybe cause the data of slave not the same as master In the master, when we set the connection characterset into multi-bytes characterset and this characterset may escape with backslash is dangerous (like gbk, cp932) and then we use the prepared statement to do the insert, the string value in the insert must be cast into hex before write into the binlog, so we do the insert(insert into t(a) values('1')) on the master, but in the slave it does the another(insert into t(a) values(0x31)) , the data is not the same. how to reappear: master(binlog format=statement): create table t(id int auto_increment primay key, a int) engine=innodb; set names gbk; prepare stmt from 'insert into t(a) values(?)'; set @a='1'; execute stmt using @a; the result in master is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | +----+------+ and result in slave is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 49 | +----+------+ I test the newest release version mariadb-10.0.2 and MySQL-5.6.10, the problem is still exist, is this a bug? 2013-04-28 jhx1008 @netease
Hello! This does look to be a legitimate bug. This would apply to any character set where charset_info_st field escape_with_backslash_is_dangerous is true, which currently is: big5, cp932, gbk, sjis. The problem here is that string parameters coming from prepared statements are being converted into 0xHHHH form indiscriminately in append_query_string, which is producing the string to be binlogged for statement-based replication. While that works okay for insertion of strings into string fields, it causes the conversion-from-string-to-integer which is happening on the master for insertion of a string into an integer field to not be happening on the slave, since 0xHHHH form is more properly an integer than a string. This can be captured by setting a breakpoint at str_to_hex and running jhx1008's test case (fixed up a bit): DROP TABLE IF EXISTS t; CREATE TABLE t (id INT NOT NULL AUTO_INCREMENT, a INT, PRIMARY KEY(id)); SET NAMES gbk; PREPARE STMT FROM 'INSERT INTO t (a) VALUES (?)'; SET @a = '1'; EXECUTE STMT USING @a; This is the stack backtrace at that point (sorry, this is from MySQL-5.5, not MariaDB, since that's what I had handily set up for debugging, and I suspected this problem to be much broader than MariaDB): #0 str_to_hex (to=0x7fffe0029f60 "", from=0x7fffe002ab90 "1", len=1) at sql/log_event.cc:585 #1 0x00000000006dee14 in append_query_string (thd=0x15b9bb0, csinfo=0xeebd00 <my_charset_gbk_chinese_ci>, from=0x7fffe000d400, to=0x7ffff404e9d0) at sql/log_event.cc:616 #2 0x000000000066da2d in Item_param::query_val_str (this=0x7fffe000d3f0, thd=<optimized out>, str=0x7ffff404e9d0) at sql/item.cc:3333 #3 0x000000000058a138 in insert_params_from_vars_with_log (stmt=<optimized out>, varnames=..., query=0x7ffff404eaf0) at sql/sql_prepare.cc:1216 #4 0x000000000058c314 in Prepared_statement::set_parameters (this=this@entry=0x7fffe0026660, expanded_query=expanded_query@entry=0x7ffff404eaf0, packet=0x0, packet_end=<optimized out>) at sql/sql_prepare.cc:3364 #5 0x000000000058ce50 in Prepared_statement::execute_loop (this=0x7fffe0026660, expanded_query=0x7ffff404eaf0, open_cursor=<optimized out>, packet=<optimized out>, packet_end=<optimized out>) at sql/sql_prepare.cc:3432 #6 0x000000000058d04a in mysql_sql_stmt_execute (thd=<optimized out>) at sql/sql_prepare.cc:2634 #7 0x000000000057d948 in mysql_execute_command (thd=thd@entry=0x15b9bb0) at sql/sql_parse.cc:2161 #8 0x0000000000580480 in mysql_parse (parser_state=0x7ffff40500c0, thd=0x15b9bb0, rawbuf=<optimized out>, length=<optimized out>) at sql/sql_parse.cc:5627 #9 mysql_parse (thd=0x15b9bb0, rawbuf=<optimized out>, length=21, parser_state=0x7ffff40500c0) at sql/sql_parse.cc:5551 #10 0x00000000005816b6 in dispatch_command (command=COM_QUERY, thd=0x15b9bb0, packet=<optimized out>, packet_length=<optimized out>) at sql/sql_parse.cc:1037 #11 0x000000000060a666 in do_handle_one_connection (thd_arg=thd_arg@entry=0x15b9bb0) at sql/sql_connect.cc:853 #12 0x000000000060a6ca in handle_one_connection (arg=arg@entry=0x15b9bb0) at sql/sql_connect.cc:772 #13 0x00000000008b3f35 in pfs_spawn_thread (arg=0x15eacf0) at storage/perfschema/pfs.cc:1015 #14 0x00007ffff77a5e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #15 0x00007ffff6a96cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6 #16 0x0000000000000000 in ?? () Using SHOW BINLOG EVENTS shows that the problem is from the server (binlogging) side: *************************** 6. row *************************** Log_name: 0.000001 Pos: 450 Event_type: Query Server_id: 1 End_log_pos: 544 Info: use `test`; INSERT INTO t (a) VALUES (0x31) Seems like this bug has existed since at least 2006. It would seemingly make replication completely broken when using prepared statements along with any of big5, cp932, gbk, sjis character sets. Regards, Jeremy On Sat, Apr 27, 2013 at 7:31 PM, jhx1008 <jhx1008@gmail.com> wrote:
** ** hi all: I have a problem about the type cast when I have read the source of Maria & MySQL I notice that when cast from string into int, Maria & MySQL do it like the C function atoi code: *for (ul= 0 ; str < end9 && (ch= (uchar) (*str - '0')) < 10; str++) { ul= ul * 10 + ch; }* but cast from hex into int, the code is: code: *char *end=(char*) str_value.ptr()+str_value.length(), *ptr=end-min(str_value.length(),sizeof(longlong)); ulonglong value=0; for (; ptr != end ; ptr++) value=(value << 8)+ (ulonglong) (uchar) *ptr;*
so I do the test blew: *create table t(id int auto_increment primary key, a int)engine=innodb;* *insert into t(a) values('1');* *insert into t(a) values(0x31);* everything is ok, I get the result: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | | 2 | 49 | +----+------+ the value 0x31 is the ascii of the string '1', but we got the different result after inserting maybe it's not a problem, but in the replication environment, it maybe cause the data of slave not the same as master In the master, when we set the connection characterset into multi-bytes characterset and this characterset may escape with backslash is dangerous (like gbk, cp932) and then we use the prepared statement to do the insert, the string value in the insert must be cast into hex before write into the binlog, so we do the insert(insert into t(a) values('1')) on the master, but in the slave it does the another(insert into t(a) values(0x31)) , the data is not the same. how to reappear: master(binlog format=statement): *create table t(id int auto_increment primay key, a int) engine=innodb;* *set names gbk;* *prepare stmt from 'insert into t(a) values(?)';* *set @a='1';* *execute stmt using @a;* the result in master is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | +----+------+ and result in slave is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 49 | +----+------+ I test the newest release version mariadb-10.0.2 and MySQL-5.6.10, the problem is still exist, is this a bug?
2013-04-28 ------------------------------ jhx1008 @netease **
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hello, This is definitely a bug. 0xHHHH is a MySQL extension, and it's a hybrid thing. It can behave as a number and a string depending on context. Binary log could use the X'HHHH' notation instead: INSERT INTO t1 VALUES (a) VALUES (X'31'); which is an SQL standard, and which must always be a string. However, it seems the behaviour of X'HHHH' and of 0xHHHH is exactly the same, and X'HHHH' can also act as a number. This script demonstrates the problem: drop table if exists t1; create table t1 (id int); insert into t1 values (x'31'),(0x31),(concat(0x31)); select * from t1; +------+ | id | +------+ | 49 | <-- wrong | 49 | <-- ok | 1 | <-- ok +------+ I'd propose the following as a quick fix: 1. Fix X'HHHH' to work always as string. 2. Fix binlog to use X'HHHH' This will fix the INSERT statements like the one in the bug report. But the problem is slightly wider. Replacing a string to X'HHHH' does not provide 100% compatibility in all cases. Each string value is described by the following parameters: a. buffer with actual data b. character set and collation c. collation derivation (coercibility) - the collation precedence which is used when the value is compared to another value. X'HHHH' only reproduces "a", but it does not reproduce "b" and "c". Thus simply replacing queries like this: UPDATE t1 SET .... WHERE column='str'; to UPDATE t1 SET .... WHERE column=X'HHHHHH'; can still lead to wrong results, even with the fixed X'HHHH'. We don't have a syntax to reproduce all three parameters. I think such syntax should be added in long terms. On 04/30/2013 03:48 AM, Jeremy Cole wrote:
Hello!
This does look to be a legitimate bug. This would apply to any character set where charset_info_st field escape_with_backslash_is_dangerous is true, which currently is: big5, cp932, gbk, sjis.
The problem here is that string parameters coming from prepared statements are being converted into 0xHHHH form indiscriminately in append_query_string, which is producing the string to be binlogged for statement-based replication. While that works okay for insertion of strings into string fields, it causes the conversion-from-string-to-integer which is happening on the master for insertion of a string into an integer field to not be happening on the slave, since 0xHHHH form is more properly an integer than a string.
This can be captured by setting a breakpoint at str_to_hex and running jhx1008's test case (fixed up a bit):
DROP TABLE IF EXISTS t; CREATE TABLE t (id INT NOT NULL AUTO_INCREMENT, a INT, PRIMARY KEY(id)); SET NAMES gbk; PREPARE STMT FROM 'INSERT INTO t (a) VALUES (?)'; SET @a = '1'; EXECUTE STMT USING @a;
This is the stack backtrace at that point (sorry, this is from MySQL-5.5, not MariaDB, since that's what I had handily set up for debugging, and I suspected this problem to be much broader than MariaDB):
#0 str_to_hex (to=0x7fffe0029f60 "", from=0x7fffe002ab90 "1", len=1) at sql/log_event.cc:585 #1 0x00000000006dee14 in append_query_string (thd=0x15b9bb0, csinfo=0xeebd00 <my_charset_gbk_chinese_ci>, from=0x7fffe000d400, to=0x7ffff404e9d0) at sql/log_event.cc:616 #2 0x000000000066da2d in Item_param::query_val_str (this=0x7fffe000d3f0, thd=<optimized out>, str=0x7ffff404e9d0) at sql/item.cc:3333 #3 0x000000000058a138 in insert_params_from_vars_with_log (stmt=<optimized out>, varnames=..., query=0x7ffff404eaf0) at sql/sql_prepare.cc:1216 #4 0x000000000058c314 in Prepared_statement::set_parameters (this=this@entry=0x7fffe0026660, expanded_query=expanded_query@entry=0x7ffff404eaf0, packet=0x0, packet_end=<optimized out>) at sql/sql_prepare.cc:3364 #5 0x000000000058ce50 in Prepared_statement::execute_loop (this=0x7fffe0026660, expanded_query=0x7ffff404eaf0, open_cursor=<optimized out>, packet=<optimized out>, packet_end=<optimized out>) at sql/sql_prepare.cc:3432 #6 0x000000000058d04a in mysql_sql_stmt_execute (thd=<optimized out>) at sql/sql_prepare.cc:2634 #7 0x000000000057d948 in mysql_execute_command (thd=thd@entry=0x15b9bb0) at sql/sql_parse.cc:2161 #8 0x0000000000580480 in mysql_parse (parser_state=0x7ffff40500c0, thd=0x15b9bb0, rawbuf=<optimized out>, length=<optimized out>) at sql/sql_parse.cc:5627 #9 mysql_parse (thd=0x15b9bb0, rawbuf=<optimized out>, length=21, parser_state=0x7ffff40500c0) at sql/sql_parse.cc:5551 #10 0x00000000005816b6 in dispatch_command (command=COM_QUERY, thd=0x15b9bb0, packet=<optimized out>, packet_length=<optimized out>) at sql/sql_parse.cc:1037 #11 0x000000000060a666 in do_handle_one_connection (thd_arg=thd_arg@entry=0x15b9bb0) at sql/sql_connect.cc:853 #12 0x000000000060a6ca in handle_one_connection (arg=arg@entry=0x15b9bb0) at sql/sql_connect.cc:772 #13 0x00000000008b3f35 in pfs_spawn_thread (arg=0x15eacf0) at storage/perfschema/pfs.cc:1015 #14 0x00007ffff77a5e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #15 0x00007ffff6a96cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6 #16 0x0000000000000000 in ?? ()
Using SHOW BINLOG EVENTS shows that the problem is from the server (binlogging) side:
*************************** 6. row *************************** Log_name: 0.000001 Pos: 450 Event_type: Query Server_id: 1 End_log_pos: 544 Info: use `test`; INSERT INTO t (a) VALUES (0x31)
Seems like this bug has existed since at least 2006. It would seemingly make replication completely broken when using prepared statements along with any of big5, cp932, gbk, sjis character sets.
Regards,
Jeremy
On Sat, Apr 27, 2013 at 7:31 PM, jhx1008 <jhx1008@gmail.com <mailto:jhx1008@gmail.com>> wrote:
__ __ hi all: I have a problem about the type cast when I have read the source of Maria & MySQL I notice that when cast from string into int, Maria & MySQL do it like the C function atoi code: *for (ul= 0 ; str < end9 && (ch= (uchar) (*str - '0')) < 10; str++) { ul= ul * 10 + ch; }* but cast from hex into int, the code is: code: *char *end=(char*) str_value.ptr()+str_value.length(), *ptr=end-min(str_value.length(),sizeof(longlong)); ulonglong value=0; for (; ptr != end ; ptr++) value=(value << 8)+ (ulonglong) (uchar) *ptr;* so I do the test blew: *create table t(id int auto_increment primary key, a int)engine=innodb;* *insert into t(a) values('1');* *insert into t(a) values(0x31);* everything is ok, I get the result: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | | 2 | 49 | +----+------+ the value 0x31 is the ascii of the string '1', but we got the different result after inserting maybe it's not a problem, but in the replication environment, it maybe cause the data of slave not the same as master In the master, when we set the connection characterset into multi-bytes characterset and this characterset may escape with backslash is dangerous (like gbk, cp932) and then we use the prepared statement to do the insert, the string value in the insert must be cast into hex before write into the binlog, so we do the insert(insert into t(a) values('1')) on the master, but in the slave it does the another(insert into t(a) values(0x31)) , the data is not the same. how to reappear: master(binlog format=statement): *create table t(id int auto_increment primay key, a int) engine=innodb;* *set names gbk;* *prepare stmt from 'insert into t(a) values(?)';* *set @a='1';* *execute stmt using @a;* the result in master is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | +----+------+ and result in slave is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 49 | +----+------+ I test the newest release version mariadb-10.0.2 and MySQL-5.6.10, the problem is still exist, is this a bug? 2013-04-28 ------------------------------------------------------------------------ jhx1008 @netease __
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net <mailto:maria-developers@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi, Alexander! Could you report this bug and fix it, please? On Apr 30, Alexander Barkov wrote:
Hello,
This is definitely a bug.
0xHHHH is a MySQL extension, and it's a hybrid thing. It can behave as a number and a string depending on context.
Binary log could use the X'HHHH' notation instead:
INSERT INTO t1 VALUES (a) VALUES (X'31');
which is an SQL standard, and which must always be a string.
However, it seems the behaviour of X'HHHH' and of 0xHHHH is exactly the same, and X'HHHH' can also act as a number. ... I'd propose the following as a quick fix: 1. Fix X'HHHH' to work always as string. 2. Fix binlog to use X'HHHH'
Regards, Sergei
Hi Sergei, On 05/01/2013 01:27 PM, Sergei Golubchik wrote:
Hi, Alexander!
Could you report this bug and fix it, please?
The report is here: https://mariadb.atlassian.net/browse/MDEV-4489 Please review a patch fixing this problem (attached). Note, while working on the patch I noticed one suspicious thing, in the old implementation of Item_hex_string::print(). Always doing this "min" to calculate the string length to print looked strange: min(str_value.length(), sizeof(longlong)) I fixed this behaviour in the class implementing X'HHHH', it now prints the full length. But I preserved the old behaviour in the class implementing 0xHHHH. Perhaps it should also be fixed somehow. What do you think? Thanks.
On Apr 30, Alexander Barkov wrote:
Hello,
This is definitely a bug.
0xHHHH is a MySQL extension, and it's a hybrid thing. It can behave as a number and a string depending on context.
Binary log could use the X'HHHH' notation instead:
INSERT INTO t1 VALUES (a) VALUES (X'31');
which is an SQL standard, and which must always be a string.
However, it seems the behaviour of X'HHHH' and of 0xHHHH is exactly the same, and X'HHHH' can also act as a number. ... I'd propose the following as a quick fix: 1. Fix X'HHHH' to work always as string. 2. Fix binlog to use X'HHHH'
Regards, Sergei
Hi, Alexander! On May 06, Alexander Barkov wrote:
Hi Sergei,
On 05/01/2013 01:27 PM, Sergei Golubchik wrote:
Hi, Alexander!
Could you report this bug and fix it, please?
The report is here: https://mariadb.atlassian.net/browse/MDEV-4489
Please review a patch fixing this problem (attached).
looks ok to me Regards, Sergei
Thanks for all the replies, to fix it maybe need a new syntax, does mariadb has any plan to fix it? 2013-05-02 jhx1008 @netease 发件人:Alexander Barkov 发送时间:2013-04-30 13:37 主题:Re: [Maria-developers] is this a bug? 收件人:"Jeremy Cole"<jeremycole@google.com> 抄送:"jhx1008"<jhx1008@gmail.com>,"maria-developers"<maria-developers@lists.launchpad.net> Hello, This is definitely a bug. 0xHHHH is a MySQL extension, and it's a hybrid thing. It can behave as a number and a string depending on context. Binary log could use the X'HHHH' notation instead: INSERT INTO t1 VALUES (a) VALUES (X'31'); which is an SQL standard, and which must always be a string. However, it seems the behaviour of X'HHHH' and of 0xHHHH is exactly the same, and X'HHHH' can also act as a number. This script demonstrates the problem: drop table if exists t1; create table t1 (id int); insert into t1 values (x'31'),(0x31),(concat(0x31)); select * from t1; +------+ | id | +------+ | 49 | <-- wrong | 49 | <-- ok | 1 | <-- ok +------+ I'd propose the following as a quick fix: 1. Fix X'HHHH' to work always as string. 2. Fix binlog to use X'HHHH' This will fix the INSERT statements like the one in the bug report. But the problem is slightly wider. Replacing a string to X'HHHH' does not provide 100% compatibility in all cases. Each string value is described by the following parameters: a. buffer with actual data b. character set and collation c. collation derivation (coercibility) - the collation precedence which is used when the value is compared to another value. X'HHHH' only reproduces "a", but it does not reproduce "b" and "c". Thus simply replacing queries like this: UPDATE t1 SET .... WHERE column='str'; to UPDATE t1 SET .... WHERE column=X'HHHHHH'; can still lead to wrong results, even with the fixed X'HHHH'. We don't have a syntax to reproduce all three parameters. I think such syntax should be added in long terms. On 04/30/2013 03:48 AM, Jeremy Cole wrote:
Hello!
This does look to be a legitimate bug. This would apply to any character set where charset_info_st field escape_with_backslash_is_dangerous is true, which currently is: big5, cp932, gbk, sjis.
The problem here is that string parameters coming from prepared statements are being converted into 0xHHHH form indiscriminately in append_query_string, which is producing the string to be binlogged for statement-based replication. While that works okay for insertion of strings into string fields, it causes the conversion-from-string-to-integer which is happening on the master for insertion of a string into an integer field to not be happening on the slave, since 0xHHHH form is more properly an integer than a string.
This can be captured by setting a breakpoint at str_to_hex and running jhx1008's test case (fixed up a bit):
DROP TABLE IF EXISTS t; CREATE TABLE t (id INT NOT NULL AUTO_INCREMENT, a INT, PRIMARY KEY(id)); SET NAMES gbk; PREPARE STMT FROM 'INSERT INTO t (a) VALUES (?)'; SET @a = '1'; EXECUTE STMT USING @a;
This is the stack backtrace at that point (sorry, this is from MySQL-5.5, not MariaDB, since that's what I had handily set up for debugging, and I suspected this problem to be much broader than MariaDB):
#0 str_to_hex (to=0x7fffe0029f60 "", from=0x7fffe002ab90 "1", len=1) at sql/log_event.cc:585 #1 0x00000000006dee14 in append_query_string (thd=0x15b9bb0, csinfo=0xeebd00 <my_charset_gbk_chinese_ci>, from=0x7fffe000d400, to=0x7ffff404e9d0) at sql/log_event.cc:616 #2 0x000000000066da2d in Item_param::query_val_str (this=0x7fffe000d3f0, thd=<optimized out>, str=0x7ffff404e9d0) at sql/item.cc:3333 #3 0x000000000058a138 in insert_params_from_vars_with_log (stmt=<optimized out>, varnames=..., query=0x7ffff404eaf0) at sql/sql_prepare.cc:1216 #4 0x000000000058c314 in Prepared_statement::set_parameters (this=this@entry=0x7fffe0026660, expanded_query=expanded_query@entry=0x7ffff404eaf0, packet=0x0, packet_end=<optimized out>) at sql/sql_prepare.cc:3364 #5 0x000000000058ce50 in Prepared_statement::execute_loop (this=0x7fffe0026660, expanded_query=0x7ffff404eaf0, open_cursor=<optimized out>, packet=<optimized out>, packet_end=<optimized out>) at sql/sql_prepare.cc:3432 #6 0x000000000058d04a in mysql_sql_stmt_execute (thd=<optimized out>) at sql/sql_prepare.cc:2634 #7 0x000000000057d948 in mysql_execute_command (thd=thd@entry=0x15b9bb0) at sql/sql_parse.cc:2161 #8 0x0000000000580480 in mysql_parse (parser_state=0x7ffff40500c0, thd=0x15b9bb0, rawbuf=<optimized out>, length=<optimized out>) at sql/sql_parse.cc:5627 #9 mysql_parse (thd=0x15b9bb0, rawbuf=<optimized out>, length=21, parser_state=0x7ffff40500c0) at sql/sql_parse.cc:5551 #10 0x00000000005816b6 in dispatch_command (command=COM_QUERY, thd=0x15b9bb0, packet=<optimized out>, packet_length=<optimized out>) at sql/sql_parse.cc:1037 #11 0x000000000060a666 in do_handle_one_connection (thd_arg=thd_arg@entry=0x15b9bb0) at sql/sql_connect.cc:853 #12 0x000000000060a6ca in handle_one_connection (arg=arg@entry=0x15b9bb0) at sql/sql_connect.cc:772 #13 0x00000000008b3f35 in pfs_spawn_thread (arg=0x15eacf0) at storage/perfschema/pfs.cc:1015 #14 0x00007ffff77a5e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #15 0x00007ffff6a96cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6 #16 0x0000000000000000 in ?? ()
Using SHOW BINLOG EVENTS shows that the problem is from the server (binlogging) side:
*************************** 6. row *************************** Log_name: 0.000001 Pos: 450 Event_type: Query Server_id: 1 End_log_pos: 544 Info: use `test`; INSERT INTO t (a) VALUES (0x31)
Seems like this bug has existed since at least 2006. It would seemingly make replication completely broken when using prepared statements along with any of big5, cp932, gbk, sjis character sets.
Regards,
Jeremy
On Sat, Apr 27, 2013 at 7:31 PM, jhx1008 <jhx1008@gmail.com <mailto:jhx1008@gmail.com>> wrote:
__ __ hi all: I have a problem about the type cast when I have read the source of Maria & MySQL I notice that when cast from string into int, Maria & MySQL do it like the C function atoi code: *for (ul= 0 ; str < end9 && (ch= (uchar) (*str - '0')) < 10; str++) { ul= ul * 10 + ch; }* but cast from hex into int, the code is: code: *char *end=(char*) str_value.ptr()+str_value.length(), *ptr=end-min(str_value.length(),sizeof(longlong)); ulonglong value=0; for (; ptr != end ; ptr++) value=(value << 8)+ (ulonglong) (uchar) *ptr;* so I do the test blew: *create table t(id int auto_increment primary key, a int)engine=innodb;* *insert into t(a) values('1');* *insert into t(a) values(0x31);* everything is ok, I get the result: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | | 2 | 49 | +----+------+ the value 0x31 is the ascii of the string '1', but we got the different result after inserting maybe it's not a problem, but in the replication environment, it maybe cause the data of slave not the same as master In the master, when we set the connection characterset into multi-bytes characterset and this characterset may escape with backslash is dangerous (like gbk, cp932) and then we use the prepared statement to do the insert, the string value in the insert must be cast into hex before write into the binlog, so we do the insert(insert into t(a) values('1')) on the master, but in the slave it does the another(insert into t(a) values(0x31)) , the data is not the same. how to reappear: master(binlog format=statement): *create table t(id int auto_increment primay key, a int) engine=innodb;* *set names gbk;* *prepare stmt from 'insert into t(a) values(?)';* *set @a='1';* *execute stmt using @a;* the result in master is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | +----+------+ and result in slave is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 49 | +----+------+ I test the newest release version mariadb-10.0.2 and MySQL-5.6.10, the problem is still exist, is this a bug? 2013-04-28 ------------------------------------------------------------------------ jhx1008 @netease __
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net <mailto:maria-developers@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi, On 05/02/2013 06:42 AM, jhx1008 wrote:
Thanks for all the replies, to fix it maybe need a new syntax, does mariadb has any plan to fix it? 2013-05-02
We'll fix INSERTs (like the one in your example) to work fine by the next release, and also think about more tricky cases.
------------------------------------------------------------------------ jhx1008 @netease ------------------------------------------------------------------------ *发件人:*Alexander Barkov *发送时间:*2013-04-30 13:37 *主题:*Re: [Maria-developers] is this a bug? *收件人:*"Jeremy Cole"<jeremycole@google.com> *抄送:*"jhx1008"<jhx1008@gmail.com>,"maria-developers"<maria- developers@lists.launchpad.net> Hello, This is definitely a bug. 0xHHHH is a MySQL extension, and it's a hybrid thing. It can behave as a number and a string depending on context. Binary log could use the X'HHHH' notation instead: INSERT INTO t1 VALUES (a) VALUES (X'31'); which is an SQL standard, and which must always be a string. However, it seems the behaviour of X'HHHH' and of 0xHHHH is exactly the same, and X'HHHH' can also act as a number. This script demonstrates the problem: drop table if exists t1; create table t1 (id int); insert into t1 values (x'31'),(0x31),(concat(0x31)); select * from t1; +------+ | id | +------+ | 49 | <-- wrong | 49 | <-- ok | 1 | <-- ok +------+ I'd propose the following as a quick fix: 1. Fix X'HHHH' to work always as string. 2. Fix binlog to use X'HHHH' This will fix the INSERT statements like the one in the bug report. But the problem is slightly wider. Replacing a string to X'HHHH' does not provide 100% compatibility in all cases. Each string value is described by the following parameters: a. buffer with actual data b. character set and collation c. collation derivation (coercibility) - the collation precedence which is used when the value is compared to another value. X'HHHH' only reproduces "a", but it does not reproduce "b" and "c". Thus simply replacing queries like this: UPDATE t1 SET .... WHERE column='str'; to UPDATE t1 SET .... WHERE column=X'HHHHHH'; can still lead to wrong results, even with the fixed X'HHHH'. We don't have a syntax to reproduce all three parameters. I think such syntax should be added in long terms. On 04/30/2013 03:48 AM, Jeremy Cole wrote:
Hello!
This does look to be a legitimate bug. This would apply to any character set where charset_info_st field escape_with_backslash_is_dangerous is true, which currently is: big5, cp932, gbk, sjis.
The problem here is that string parameters coming from prepared statements are being converted into 0xHHHH form indiscriminately in append_query_string, which is producing the string to be binlogged for statement-based replication. While that works okay for insertion of strings into string fields, it causes the conversion-from-string-to-integer which is happening on the master for insertion of a string into an integer field to not be happening on the slave, since 0xHHHH form is more properly an integer than a string.
This can be captured by setting a breakpoint at str_to_hex and running jhx1008's test case (fixed up a bit):
DROP TABLE IF EXISTS t; CREATE TABLE t (id INT NOT NULL AUTO_INCREMENT, a INT, PRIMARY KEY(id)); SET NAMES gbk; PREPARE STMT FROM 'INSERT INTO t (a) VALUES (?)'; SET @a = '1'; EXECUTE STMT USING @a;
This is the stack backtrace at that point (sorry, this is from MySQL-5.5, not MariaDB, since that's what I had handily set up for debugging, and I suspected this problem to be much broader than MariaDB):
#0 str_to_hex (to=0x7fffe0029f60 "", from=0x7fffe002ab90 "1", len=1) at sql/log_event.cc:585 #1 0x00000000006dee14 in append_query_string (thd=0x15b9bb0, csinfo=0xeebd00 <my_charset_gbk_chinese_ci>, from=0x7fffe000d400, to=0x7ffff404e9d0) at sql/log_event.cc:616 #2 0x000000000066da2d in Item_param::query_val_str (this=0x7fffe000d3f0, thd=<optimized out>, str=0x7ffff404e9d0) at sql/item.cc:3333 #3 0x000000000058a138 in insert_params_from_vars_with_log (stmt=<optimized out>, varnames=..., query=0x7ffff404eaf0) at sql/sql_prepare.cc:1216 #4 0x000000000058c314 in Prepared_statement::set_parameters (this=this@entry=0x7fffe0026660, expanded_query=expanded_query@entry=0x7ffff404eaf0, packet=0x0, packet_end=<optimized out>) at sql/sql_prepare.cc:3364 #5 0x000000000058ce50 in Prepared_statement::execute_loop (this=0x7fffe0026660, expanded_query=0x7ffff404eaf0, open_cursor=<optimized out>, packet=<optimized out>, packet_end=<optimized out>) at sql/sql_prepare.cc:3432 #6 0x000000000058d04a in mysql_sql_stmt_execute (thd=<optimized out>) at sql/sql_prepare.cc:2634 #7 0x000000000057d948 in mysql_execute_command (thd=thd@entry=0x15b9bb0) at sql/sql_parse.cc:2161 #8 0x0000000000580480 in mysql_parse (parser_state=0x7ffff40500c0, thd=0x15b9bb0, rawbuf=<optimized out>, length=<optimized out>) at sql/sql_parse.cc:5627 #9 mysql_parse (thd=0x15b9bb0, rawbuf=<optimized out>, length=21, parser_state=0x7ffff40500c0) at sql/sql_parse.cc:5551 #10 0x00000000005816b6 in dispatch_command (command=COM_QUERY, thd=0x15b9bb0, packet=<optimized out>, packet_length=<optimized out>) at sql/sql_parse.cc:1037 #11 0x000000000060a666 in do_handle_one_connection (thd_arg=thd_arg@entry=0x15b9bb0) at sql/sql_connect.cc:853 #12 0x000000000060a6ca in handle_one_connection (arg=arg@entry=0x15b9bb0) at sql/sql_connect.cc:772 #13 0x00000000008b3f35 in pfs_spawn_thread (arg=0x15eacf0) at storage/perfschema/pfs.cc:1015 #14 0x00007ffff77a5e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #15 0x00007ffff6a96cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6 #16 0x0000000000000000 in ?? ()
Using SHOW BINLOG EVENTS shows that the problem is from the server (binlogging) side:
*************************** 6. row *************************** Log_name: 0.000001 Pos: 450 Event_type: Query Server_id: 1 End_log_pos: 544 Info: use `test`; INSERT INTO t (a) VALUES (0x31)
Seems like this bug has existed since at least 2006. It would seemingly make replication completely broken when using prepared statements along with any of big5, cp932, gbk, sjis character sets.
Regards,
Jeremy
On Sat, Apr 27, 2013 at 7:31 PM, jhx1008 <jhx1008@gmail.com <mailto:jhx1008@gmail.com>> wrote:
__ __ hi all: I have a problem about the type cast when I have read the source of Maria & MySQL I notice that when cast from string into int, Maria & MySQL do it like the C function atoi code: *for (ul= 0 ; str < end9 && (ch= (uchar) (*str - '0')) < 10; str++) { ul= ul * 10 + ch; }* but cast from hex into int, the code is: code: *char *end=(char*) str_value.ptr()+str_value.length(), *ptr=end-min(str_value.length(),sizeof(longlong)); ulonglong value=0; for (; ptr != end ; ptr++) value=(value << 8)+ (ulonglong) (uchar) *ptr;* so I do the test blew: *create table t(id int auto_increment primary key, a int)engine=innodb;* *insert into t(a) values('1');* *insert into t(a) values(0x31);* everything is ok, I get the result: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | | 2 | 49 | +----+------+ the value 0x31 is the ascii of the string '1', but we got the different result after inserting maybe it's not a problem, but in the replication environment, it maybe cause the data of slave not the same as master In the master, when we set the connection characterset into multi-bytes characterset and this characterset may escape with backslash is dangerous (like gbk, cp932) and then we use the prepared statement to do the insert, the string value in the insert must be cast into hex before write into the binlog, so we do the insert(insert into t(a) values('1')) on the master, but in the slave it does the another(insert into t(a) values(0x31)) , the data is not the same. how to reappear: master(binlog format=statement): *create table t(id int auto_increment primay key, a int) engine=innodb;* *set names gbk;* *prepare stmt from 'insert into t(a) values(?)';* *set @a='1';* *execute stmt using @a;* the result in master is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 1 | +----+------+ and result in slave is: mysql> select * from t; +----+------+ | id | a | +----+------+ | 1 | 49 | +----+------+ I test the newest release version mariadb-10.0.2 and MySQL-5.6.10, the problem is still exist, is this a bug? 2013-04-28 ------------------------------------------------------------------------ jhx1008 @netease __
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net <mailto:maria-developers@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
participants (4)
-
Alexander Barkov
-
Jeremy Cole
-
jhx1008
-
Sergei Golubchik