[Commits] 1e58c579f5b: MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log makes data on master and slave diverge
revision-id: 1e58c579f5b61644a74045a0470d2c62b343eb75 (mariadb-10.1.41-3-g1e58c579f5b) parent(s): 0bb8f33b55cc016d0ede86b97990a3ee30dcb069 author: Sujatha committer: Sujatha timestamp: 2019-08-01 16:16:05 +0530 message: MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log makes data on master and slave diverge Problem: ======= Failed CREATE OR REPLACE TEMPORARY TABLE statement which dropped the table but failed at a later stage of creation of temporary table is not written to binarylog in row based replication. This causes the slave to diverge. Analysis: ======== CREATE OR REPLACE statements work as shown below. CREATE OR REPLACE TABLE table_name (a int); is basically the same as: DROP TABLE IF EXISTS table_name; CREATE TABLE table_name (a int); Hence every CREATE OR REPLACE TABLE command which dropped the table should be written to binary log, even when following CREATE TABLE part fails. In order to achieve this, during the execution of CREATE OR REPLACE command, when a table is dropped 'thd->log_current_statement' flag is set. When table creation results in an error within 'mysql_create_table' code, the error handling part looks for this flag. If it is set the failed CREATE OR REPLACE statement is written into the binary log inspite of error. This ensure that slave doesn't diverge from the master. In case of row based replication the error handling code returns very early, if the table is of type temporary. This is done based on the assumption that temporary tables are not replicated in row based replication. It fails to handle the cases where a temporary table was created as part of statement based replication at an earlier stage and the binary log format was changed to row because of an unsafe statement. In this case when a CREATE OR REPLACE statement is executed on this temporary table it will dropped but the query will not be written to binary log. Hence slave diverges. Fix: === In error handling code check the return status of create table operation. If it is successful and replication mode is row based and table is of type temporary then return. Other wise proceed further to the code which checks for thd->log_current_statement flag and does appropriate logging. --- .../suite/rpl/r/rpl_create_or_replace_fail.result | 18 +++++++ .../suite/rpl/t/rpl_create_or_replace_fail.test | 56 ++++++++++++++++++++++ sql/sql_table.cc | 2 +- 3 files changed, 75 insertions(+), 1 deletion(-) diff --git a/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result new file mode 100644 index 00000000000..57178f0efbe --- /dev/null +++ b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result @@ -0,0 +1,18 @@ +include/master-slave.inc +[connection master] +CREATE TEMPORARY TABLE t1 (a INT NOT NULL); +LOAD DATA INFILE 'x' INTO TABLE x; +ERROR 42S02: Table 'test.x' doesn't exist +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); +ERROR HY000: Cannot create temporary table with partitions +"************** DROP TEMPORARY TABLE Should be present in Binary log **************" +include/show_binlog_events.inc +Log_name Pos Event_type Server_id End_log_pos Info +master-bin.000001 # Gtid # # GTID #-#-# +master-bin.000001 # Query # # use `test`; CREATE TEMPORARY TABLE t1 (a INT NOT NULL) +master-bin.000001 # Gtid # # GTID #-#-# +master-bin.000001 # Query # # use `test`; CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x) +CREATE TABLE t1 (b INT); +INSERT INTO t1 VALUES (NULL); +DROP TABLE t1; +include/rpl_end.inc diff --git a/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test new file mode 100644 index 00000000000..e75f34b0b56 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test @@ -0,0 +1,56 @@ +# ==== Purpose ==== +# +# Test verifies that failed CREATE OR REPLACE TEMPORARY TABLE statement which +# dropped the table but failed at a later stage of creation of temporary table +# is written to binarylog in row based replication. +# +# ==== Implementation ==== +# +# Steps: +# 0 - Have mixed based replication mode. +# 1 - Create a temporary table. It will be replicated as mixed replication +# mode is in use. +# 2 - Execute an unsafe statement which will switch current statement +# binlog format to 'ROW'. i.e If binlog_format=MIXED, there are open +# temporary tables, and an unsafe statement is executed, then subsequent +# statements are logged in row format. +# 3 - Execute a CREATE OR REPLACE TEMPORARY TABLE statement which tries to +# create partitions on temporary table. Since it is not supported it will +# fail. +# 4 - Check the binary log output to ensure that the failed statement is +# written to the binary log. +# 5 - Slave should be up and running and in sync with master. +# +# ==== References ==== +# +# MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log +# makes data on master and slave diverge +# + +--source include/have_partition.inc +--source include/have_binlog_format_mixed.inc +--source include/master-slave.inc + +CREATE TEMPORARY TABLE t1 (a INT NOT NULL); + +# Execute an unsafe statement which switches replication mode internally from +# "STATEMENT" to "ROW". +--error ER_NO_SUCH_TABLE +LOAD DATA INFILE 'x' INTO TABLE x; + +--error ER_PARTITION_NO_TEMPORARY +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); + +--echo "************** DROP TEMPORARY TABLE Should be present in Binary log **************" +--source include/show_binlog_events.inc + +CREATE TABLE t1 (b INT); +INSERT INTO t1 VALUES (NULL); +--sync_slave_with_master + +# Cleanup +--connection master +DROP TABLE t1; + +--source include/rpl_end.inc + diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 656834c7852..fe923d73200 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -5092,7 +5092,7 @@ bool mysql_create_table(THD *thd, TABLE_LIST *create_table, err: /* In RBR we don't need to log CREATE TEMPORARY TABLE */ - if (thd->is_current_stmt_binlog_format_row() && create_info->tmp_table()) + if (!result && thd->is_current_stmt_binlog_format_row() && create_info->tmp_table()) DBUG_RETURN(result); if (create_info->tmp_table())
Hi Sujatha! On Thu, Aug 1, 2019 at 4:16 PM sujatha <sujatha.sivakumar@mariadb.com> wrote:
revision-id: 1e58c579f5b61644a74045a0470d2c62b343eb75 (mariadb-10.1.41-3-g1e58c579f5b) parent(s): 0bb8f33b55cc016d0ede86b97990a3ee30dcb069 author: Sujatha committer: Sujatha timestamp: 2019-08-01 16:16:05 +0530 message:
MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log makes data on master and slave diverge
Problem: ======= Failed CREATE OR REPLACE TEMPORARY TABLE statement which dropped the table but failed at a later stage of creation of temporary table is not written to binarylog in row based replication. This causes the slave to diverge.
Analysis: ======== CREATE OR REPLACE statements work as shown below.
CREATE OR REPLACE TABLE table_name (a int); is basically the same as:
DROP TABLE IF EXISTS table_name; CREATE TABLE table_name (a int);
Hence every CREATE OR REPLACE TABLE command which dropped the table should be written to binary log, even when following CREATE TABLE part fails. In order to achieve this, during the execution of CREATE OR REPLACE command, when a table is dropped 'thd->log_current_statement' flag is set. When table creation results in an error within 'mysql_create_table' code, the error handling part looks for this flag. If it is set the failed CREATE OR REPLACE statement is written into the binary log inspite of error. This ensure that slave doesn't diverge from the master. In case of row based replication the error handling code returns very early, if the table is of type temporary. This is done based on the assumption that temporary tables are not replicated in row based replication.
It fails to handle the cases where a temporary table was created as part of statement based replication at an earlier stage and the binary log format was changed to row because of an unsafe statement. In this case when a CREATE OR REPLACE statement is executed on this temporary table it will dropped but the query will not be written to binary log. Hence slave diverges.
Fix: === In error handling code check the return status of create table operation. If it is successful and replication mode is row based and table is of type temporary then return. Other wise proceed further to the code which checks for thd->log_current_statement flag and does appropriate logging.
--- .../suite/rpl/r/rpl_create_or_replace_fail.result | 18 +++++++ .../suite/rpl/t/rpl_create_or_replace_fail.test | 56 ++++++++++++++++++++++ sql/sql_table.cc | 2 +- 3 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result new file mode 100644 index 00000000000..57178f0efbe --- /dev/null +++ b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result @@ -0,0 +1,18 @@ +include/master-slave.inc +[connection master] +CREATE TEMPORARY TABLE t1 (a INT NOT NULL); +LOAD DATA INFILE 'x' INTO TABLE x; +ERROR 42S02: Table 'test.x' doesn't exist +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); +ERROR HY000: Cannot create temporary table with partitions +"************** DROP TEMPORARY TABLE Should be present in Binary log **************" +include/show_binlog_events.inc +Log_name Pos Event_type Server_id End_log_pos Info +master-bin.000001 # Gtid # # GTID #-#-# +master-bin.000001 # Query # # use `test`; CREATE TEMPORARY TABLE t1 (a INT NOT NULL) +master-bin.000001 # Gtid # # GTID #-#-# +master-bin.000001 # Query # # use `test`; CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x) +CREATE TABLE t1 (b INT); +INSERT INTO t1 VALUES (NULL); +DROP TABLE t1; +include/rpl_end.inc diff --git a/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test new file mode 100644 index 00000000000..e75f34b0b56 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test @@ -0,0 +1,56 @@ +# ==== Purpose ==== +# +# Test verifies that failed CREATE OR REPLACE TEMPORARY TABLE statement which +# dropped the table but failed at a later stage of creation of temporary table +# is written to binarylog in row based replication. +# +# ==== Implementation ==== +# +# Steps: +# 0 - Have mixed based replication mode. +# 1 - Create a temporary table. It will be replicated as mixed replication +# mode is in use. +# 2 - Execute an unsafe statement which will switch current statement +# binlog format to 'ROW'. i.e If binlog_format=MIXED, there are open +# temporary tables, and an unsafe statement is executed, then subsequent +# statements are logged in row format.
Why not just set binlog format to row ?
+# 3 - Execute a CREATE OR REPLACE TEMPORARY TABLE statement which tries to +# create partitions on temporary table. Since it is not supported it will +# fail. +# 4 - Check the binary log output to ensure that the failed statement is +# written to the binary log. +# 5 - Slave should be up and running and in sync with master. +# +# ==== References ==== +# +# MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log +# makes data on master and slave diverge +# + +--source include/have_partition.inc +--source include/have_binlog_format_mixed.inc +--source include/master-slave.inc + +CREATE TEMPORARY TABLE t1 (a INT NOT NULL); + +# Execute an unsafe statement which switches replication mode internally from +# "STATEMENT" to "ROW". +--error ER_NO_SUCH_TABLE +LOAD DATA INFILE 'x' INTO TABLE x; + +--error ER_PARTITION_NO_TEMPORARY +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); + +--echo "************** DROP TEMPORARY TABLE Should be present in Binary log **************" +--source include/show_binlog_events.inc + +CREATE TABLE t1 (b INT); +INSERT INTO t1 VALUES (NULL); +--sync_slave_with_master + +# Cleanup +--connection master +DROP TABLE t1; + +--source include/rpl_end.inc + diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 656834c7852..fe923d73200 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -5092,7 +5092,7 @@ bool mysql_create_table(THD *thd, TABLE_LIST *create_table,
err: /* In RBR we don't need to log CREATE TEMPORARY TABLE */ - if (thd->is_current_stmt_binlog_format_row() && create_info->tmp_table()) + if (!result && thd->is_current_stmt_binlog_format_row() && create_info->tmp_table()) DBUG_RETURN(result); Can we only binary log only the drop part ? not the create part.
if (create_info->tmp_table()) _______________________________________________ commits mailing list commits@mariadb.org https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
-- Regards Sachin Setiya Software Engineer at MariaDB
Hello Sachin, Thanks for the review comments. Please find my replies inline. On 02/08/19 11:26 AM, Sachin Setiya wrote:
Hi Sujatha!
On Thu, Aug 1, 2019 at 4:16 PM sujatha <sujatha.sivakumar@mariadb.com> wrote:
revision-id: 1e58c579f5b61644a74045a0470d2c62b343eb75 (mariadb-10.1.41-3-g1e58c579f5b) parent(s): 0bb8f33b55cc016d0ede86b97990a3ee30dcb069 author: Sujatha committer: Sujatha timestamp: 2019-08-01 16:16:05 +0530 message:
MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log makes data on master and slave diverge
Problem: ======= Failed CREATE OR REPLACE TEMPORARY TABLE statement which dropped the table but failed at a later stage of creation of temporary table is not written to binarylog in row based replication. This causes the slave to diverge.
Analysis: ======== CREATE OR REPLACE statements work as shown below.
CREATE OR REPLACE TABLE table_name (a int); is basically the same as:
DROP TABLE IF EXISTS table_name; CREATE TABLE table_name (a int);
Hence every CREATE OR REPLACE TABLE command which dropped the table should be written to binary log, even when following CREATE TABLE part fails. In order to achieve this, during the execution of CREATE OR REPLACE command, when a table is dropped 'thd->log_current_statement' flag is set. When table creation results in an error within 'mysql_create_table' code, the error handling part looks for this flag. If it is set the failed CREATE OR REPLACE statement is written into the binary log inspite of error. This ensure that slave doesn't diverge from the master. In case of row based replication the error handling code returns very early, if the table is of type temporary. This is done based on the assumption that temporary tables are not replicated in row based replication.
It fails to handle the cases where a temporary table was created as part of statement based replication at an earlier stage and the binary log format was changed to row because of an unsafe statement. In this case when a CREATE OR REPLACE statement is executed on this temporary table it will dropped but the query will not be written to binary log. Hence slave diverges.
Fix: === In error handling code check the return status of create table operation. If it is successful and replication mode is row based and table is of type temporary then return. Other wise proceed further to the code which checks for thd->log_current_statement flag and does appropriate logging.
--- .../suite/rpl/r/rpl_create_or_replace_fail.result | 18 +++++++ .../suite/rpl/t/rpl_create_or_replace_fail.test | 56 ++++++++++++++++++++++ sql/sql_table.cc | 2 +- 3 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result new file mode 100644 index 00000000000..57178f0efbe --- /dev/null +++ b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result @@ -0,0 +1,18 @@ +include/master-slave.inc +[connection master] +CREATE TEMPORARY TABLE t1 (a INT NOT NULL); +LOAD DATA INFILE 'x' INTO TABLE x; +ERROR 42S02: Table 'test.x' doesn't exist +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); +ERROR HY000: Cannot create temporary table with partitions +"************** DROP TEMPORARY TABLE Should be present in Binary log **************" +include/show_binlog_events.inc +Log_name Pos Event_type Server_id End_log_pos Info +master-bin.000001 # Gtid # # GTID #-#-# +master-bin.000001 # Query # # use `test`; CREATE TEMPORARY TABLE t1 (a INT NOT NULL) +master-bin.000001 # Gtid # # GTID #-#-# +master-bin.000001 # Query # # use `test`; CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x) +CREATE TABLE t1 (b INT); +INSERT INTO t1 VALUES (NULL); +DROP TABLE t1; +include/rpl_end.inc diff --git a/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test new file mode 100644 index 00000000000..e75f34b0b56 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test @@ -0,0 +1,56 @@ +# ==== Purpose ==== +# +# Test verifies that failed CREATE OR REPLACE TEMPORARY TABLE statement which +# dropped the table but failed at a later stage of creation of temporary table +# is written to binarylog in row based replication. +# +# ==== Implementation ==== +# +# Steps: +# 0 - Have mixed based replication mode. +# 1 - Create a temporary table. It will be replicated as mixed replication +# mode is in use. +# 2 - Execute an unsafe statement which will switch current statement +# binlog format to 'ROW'. i.e If binlog_format=MIXED, there are open +# temporary tables, and an unsafe statement is executed, then subsequent +# statements are logged in row format. Why not just set binlog format to row ?
With this change the bug will not be reproducible. Please find the test results and analysis. include/master-slave.inc [connection master] CREATE TEMPORARY TABLE t1 (a INT NOT NULL); set @@session.binlog_format=ROW; CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); ERROR HY000: Cannot create temporary table with partitions "************** DROP TEMPORARY TABLE Should be present in Binary log **************" include/show_binlog_events.inc Log_name Pos Event_type Server_id End_log_pos Info master-bin.000001 # Gtid # # GTID #-#-# master-bin.000001 # Query # # use `test`; CREATE TEMPORARY TABLE t1 (a INT NOT NULL) CREATE TABLE t1 (b INT); INSERT INTO t1 VALUES (NULL); SHOW CREATE TABLE t1; Table Create Table t1 CREATE TABLE `t1` ( `b` int(11) DEFAULT NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1 DROP TABLE t1; include/rpl_end.inc rpl.rpl_create_or_replace_fail 'mix' [ pass ] 334 The reason being, 1. Temp table t1 is replication in stmt mode. 2. Switch to ROW 3. CREATE OR REPLACE execution failed but it dropped the temp table ant not replicated. 4. Regular table 'T1' is created and replicated to slave. 5. Slaves in replication will by default use |CREATE OR REPLACE| when replicating |CREATE| statements that don''t use |IF EXISTS|. Hence the above CREATE TABLE will be transformed into CREATE OR REPLACE TABLE on slave which will drop the 'Temporary table' and create the regular table. 6. Replication will not fail. This masks the bug. Hence we need to follow the actual queries mentioned in bug page.
+# 3 - Execute a CREATE OR REPLACE TEMPORARY TABLE statement which tries to +# create partitions on temporary table. Since it is not supported it will +# fail. +# 4 - Check the binary log output to ensure that the failed statement is +# written to the binary log. +# 5 - Slave should be up and running and in sync with master. +# +# ==== References ==== +# +# MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log +# makes data on master and slave diverge +# + +--source include/have_partition.inc +--source include/have_binlog_format_mixed.inc +--source include/master-slave.inc + +CREATE TEMPORARY TABLE t1 (a INT NOT NULL); + +# Execute an unsafe statement which switches replication mode internally from +# "STATEMENT" to "ROW". +--error ER_NO_SUCH_TABLE +LOAD DATA INFILE 'x' INTO TABLE x; + +--error ER_PARTITION_NO_TEMPORARY +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x); + +--echo "************** DROP TEMPORARY TABLE Should be present in Binary log **************" +--source include/show_binlog_events.inc + +CREATE TABLE t1 (b INT); +INSERT INTO t1 VALUES (NULL); +--sync_slave_with_master + +# Cleanup +--connection master +DROP TABLE t1; + +--source include/rpl_end.inc + diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 656834c7852..fe923d73200 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -5092,7 +5092,7 @@ bool mysql_create_table(THD *thd, TABLE_LIST *create_table,
err: /* In RBR we don't need to log CREATE TEMPORARY TABLE */ - if (thd->is_current_stmt_binlog_format_row() && create_info->tmp_table()) + if (!result && thd->is_current_stmt_binlog_format_row() && create_info->tmp_table()) DBUG_RETURN(result); Can we only binary log only the drop part ? not the create part.
When DDL statements are executed, some times they can fail and the effects of the query cannot be undone. The failed CREATE OR REPLACE command is of this type. The query failed but it dropped the temporary table which cannot be undone. In such scenario we do replicate these failed DDL statements along with the exact error code which happened on master. So that the DDL will get replicated and it will also fail on the slave with the same error code. With this we ensure slave is in sync with master. On slave, for such statements upon query execution, we compare the error_code received from the master and actual_error that happened on slave. If they both are fine slave continues. This is already implemented most of the cases. This particular 'temporary table' scenario is not covered. For example more scenarios are present in : mysql-test/suite/rpl/t/create_or_replace.inc --echo # --echo # Ensure that also failed create_or_replace are logged --echo # --let $binlog_start=query_get_value(SHOW MASTER STATUS, Position, 1) create table t1 (a int); --error ER_TABLE_MUST_HAVE_COLUMNS create or replace table t1; Since the current scenario also comes under the above case I followed the same approach. But we do log only the drop part in few special cases. You can find them in the following MDEV. https://jira.mariadb.org/browse/MDEV-5589 Please let me know your thoughts. Thank you S.Sujatha
if (create_info->tmp_table()) _______________________________________________ commits mailing list commits@mariadb.org https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
participants (2)
-
Sachin Setiya
-
sujatha