Hello Andrei, Good morning. Thank you for the review comments. Please find my replies inline. On 22/05/19 10:22 PM, andrei.elkin@pp.inet.fi wrote:
Sujatha, salute!
Thanks for a detailed description! I would add up to it a minor piece
revision-id: 53d0e9217943719b806ef9fa1cac0a699df4839c (mariadb-10.1.39-31-g53d0e921794) parent(s): 47637a3dd13d19e897a7cbfd1499f1bf3b2fdb2a author: Sujatha committer: Sujatha timestamp: 2019-05-17 13:11:49 +0530 message:
MDEV-11094: Blackhole table updates on slave fail when row annotation is enabled
Problem: ======= rpl_blackhole.test fails when executed with following options mysqld=--binlog_annotate_row_events=1, mysqld=--replicate_annotate_row_events=1
Test output: ------------ worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019 rpl.rpl_blackhole_bug 'mix' [ pass ] 791 rpl.rpl_blackhole_bug 'row' [ fail ] Replicate_Wild_Ignore_Table Last_Errno 1032 Last_Error Could not execute Update_rows_v1 event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1510
Analysis: ========= Enabling "replicate_annotate_row_events" on slave, Tells the slave to write annotate rows events received from the master to its own binary log. The received annotate events are applied after the Gtid event as shown below. thd->query() will be set to the actual query received from the master, through annotate event. Annotate_rows event should not be deleted after the event is applied as the thd->query will be used to generate new Annotate_rows event during applying the subsequent Rows events. [ here ] The Table_map and Rows_log_event that follow do not touch thd->query().
The Table_map and Rows_log_event that follow "Annotate_rows" event do make use of "thd->query()". Please check the following path. #0 write_locked_table_maps (thd=0x7fff98001910) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/handler.cc:5735 #1 0x0000555555c7cdb1 in binlog_log_row (table=0x7fff9801c7e0, before_record=0x0, after_record=0x7fff9801c028 "\376aUnsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT", log_func=0x555555c7f8f7 <Write_rows_log_event::binlog_row_logging_function(THD*, TABLE*, bool, unsigned char const*, unsigned char const*)>) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/handler.cc:5842 #2 0x0000555555c7d53d in handler::ha_write_row (this=0x7fff9801d428, buf=0x7fff9801c028 "\376aUnsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT") at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/handler.cc:5985 #3 0x0000555555d79397 in Rows_log_event::write_row (this=0x7fff98018c40, rgi=0x7fff98000cf0, overwrite=false) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.cc:11854 #4 0x0000555555d79d85 in Write_rows_log_event::do_exec_row (this=0x7fff98018c40, rgi=0x7fff98000cf0) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.cc:12038 #5 0x0000555555d759aa in Rows_log_event::do_apply_event (this=0x7fff98018c40, rgi=0x7fff98000cf0) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.cc:10191 #6 0x0000555555996a89 in Log_event::apply_event (this=0x7fff98018c40, rgi=0x7fff98000cf0) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.h:1343 #7 0x000055555598c21d in apply_event_and_update_pos_apply (ev=0x7fff98018c40, thd=0x7fff98001910, rgi=0x7fff98000cf0, reason=0) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:3479 #8 0x000055555598c6a6 in apply_event_and_update_pos (ev=0x7fff98018c40, thd=0x7fff98001910, rgi=0x7fff98000cf0) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:3600 #9 0x000055555598ceaa in exec_relay_log_event (thd=0x7fff98001910, rli=0x555557c18f20, serial_rgi=0x7fff98000cf0) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:3885 #10 0x00005555559900bd in handle_slave_sql (arg=0x555557c17250) at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:4981 #11 0x0000555555f97ed1 in pfs_spawn_thread (arg=0x7fffac02e680) at /home/sujatha/bug_repo/MDEV-11094-10.1/storage/perfschema/pfs.cc:1861 #12 0x00007ffff75646db in start_thread (arg=0x7fffe8681700) at pthread_create.c:463 #13 0x00007ffff6b6688f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 static int write_locked_table_maps(THD *thd) { ... if (thd->get_binlog_table_maps() == 0) { MYSQL_LOCK *locks[2]; locks[0]= thd->extra_lock; locks[1]= thd->lock; my_bool with_annotate= thd->variables.binlog_annotate_row_events && thd->query() && thd->query_length(); On slave upon receipt of annotate event "thd->variables.binlog_annotate_row_events" is set within "set_annotate_event(Annotate_rows_log_event *event)" call. This flag will be used by the following rows events like 'Write_rows", "Update_rows" and "Delete_rows" events to know that there is a replicated annotate event available. All rows events will have a "Table_map_log_event" and it is written by invoking "write_locked_table_maps()". Within "write_locked_table_maps" first step is to check for replicated annotate event flag, and if is set "thd->query()" is used to generate a new annotate event and it is written to the slave's binary log. IMHO the recommended suggestion of resetting the query to NULL, within "Rows_log_event::do_apply_event" will not work. Since the query is cleared within "Rows_log_event::do_apply_event" prior to the "do_exec_row" call, at the time of writing the "Table_map_log_event" the annotation check will fail. So no annotate events are written to binlog on the slave. my_bool with_annotate= thd->variables.binlog_annotate_row_events && thd->query() && thd->query_length(); The "thd->variables.binlog_annotate_row_events" is cleared within "free_annotate_event" after the last row event has been applied. Hence "thd->variables.binlog_annotate_row_events" flag is a key to identify Row based replication + replicated annotate row events. Please let me know your thoughts.
Notice that the being suggested sentence is sufficient to explain the not-NULL without the sentence below (fill free to remove then):
After the last Rows event has been applied, the saved Annotate_rows event (if any) will be deleted.
master-bin.000001 | 457 | Gtid | BEGIN GTID 0-1-2 master-bin.000001 | 495 | Annotate_rows | insert into t values (10) master-bin.000001 | 539 | Table_map | table_id: 19 (test.t) master-bin.000001 | 579 | Write_rows_v1 | table_id: 19 flags: STMT_END_F master-bin.000001 | 613 | Xid | COMMIT /* xid=7 */
In balckhole engine all the DML operations are noops as they donot store any data. They simply return success without doing any operation. But the existing strictly expects thd->query() to be 'NULL' to identify that row based replication is in use. This assumption will fail when row annotations are enabled as the query is not 'NULL'. Hence various row based operations like 'update', 'delete', 'index lookup' will fail when row annotations are enabled.
Fix: === Extend the row based replication check to include row annotations as well. i.e Either the thd->query() is NULL or thd->query() points to query and row annotations are in use. This works, but the check does not really refers to the Annotate. So in principle thd->query() may turn in not-NULL of different reasons. Besides the check is multiplied over few places (again hopefully covering all).
We can do it simpler actually:
Rows_log_event::do_apply_event() { ... do { /* in_use can have been set to NULL in close_tables_for_reopen */ THD* old_thd= table->in_use; if (!table->in_use) table->in_use= thd;
--> add a check whether Annotate has been done and when so store/restore --> thd->query() like `old_thd` above/below, and pass it as NULL (thd->reset_query()) to the following:
error= do_exec_row(rgi);
if (error) DBUG_PRINT("info", ("error: %s", HA_ERR(error))); DBUG_ASSERT(error != HA_ERR_RECORD_DELETED);
table->in_use = old_thd;
I also suggest to check out mysqlbinlog run. There seems to be no test for that though, so one needs to be written. The goal would be to prove a created binlog output is as expected (e.g to include Annotate).
Sure, I will add a test case. Thank you S.Sujatha
Cheers,
Andrei
--- mysql-test/suite/rpl/t/rpl_blackhole-master.opt | 1 + mysql-test/suite/rpl/t/rpl_blackhole-slave.opt | 1 + storage/blackhole/ha_blackhole.cc | 28 +++++++++++++++++++------ 3 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/mysql-test/suite/rpl/t/rpl_blackhole-master.opt b/mysql-test/suite/rpl/t/rpl_blackhole-master.opt new file mode 100644 index 00000000000..1ad0b884c60 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_blackhole-master.opt @@ -0,0 +1 @@ +--binlog_annotate_row_events diff --git a/mysql-test/suite/rpl/t/rpl_blackhole-slave.opt b/mysql-test/suite/rpl/t/rpl_blackhole-slave.opt new file mode 100644 index 00000000000..7ac6a84faa7 --- /dev/null +++ b/mysql-test/suite/rpl/t/rpl_blackhole-slave.opt @@ -0,0 +1 @@ +--binlog_annotate_row_events --replicate_annotate_row_events diff --git a/storage/blackhole/ha_blackhole.cc b/storage/blackhole/ha_blackhole.cc index 01aaa9ea15f..43bcdc541a1 100644 --- a/storage/blackhole/ha_blackhole.cc +++ b/storage/blackhole/ha_blackhole.cc @@ -25,6 +25,16 @@ #include "ha_blackhole.h" #include "sql_class.h" // THD, SYSTEM_THREAD_SLAVE_SQL
+static bool is_row_based_replication(THD *thd) +{ + /* + A row event which has its thd->query() == NULL or a row event which has + replicate_annotate_row_events enabled. In the later case the thd->query() + will be pointing to the query, received through replicated annotate event + from master. + */ + return ((thd->query() == NULL) || thd->variables.binlog_annotate_row_events); +} /* Static declarations for handlerton */
static handler *blackhole_create_handler(handlerton *hton, @@ -109,7 +119,8 @@ int ha_blackhole::update_row(const uchar *old_data, uchar *new_data) { DBUG_ENTER("ha_blackhole::update_row"); THD *thd= ha_thd(); - if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL) + if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && + is_row_based_replication(thd)) DBUG_RETURN(0); DBUG_RETURN(HA_ERR_WRONG_COMMAND); } @@ -118,7 +129,8 @@ int ha_blackhole::delete_row(const uchar *buf) { DBUG_ENTER("ha_blackhole::delete_row"); THD *thd= ha_thd(); - if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL) + if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && + is_row_based_replication(thd)) DBUG_RETURN(0); DBUG_RETURN(HA_ERR_WRONG_COMMAND); } @@ -135,7 +147,8 @@ int ha_blackhole::rnd_next(uchar *buf) int rc; DBUG_ENTER("ha_blackhole::rnd_next"); THD *thd= ha_thd(); - if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL) + if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && + is_row_based_replication(thd)) rc= 0; else rc= HA_ERR_END_OF_FILE; @@ -220,7 +233,8 @@ int ha_blackhole::index_read_map(uchar * buf, const uchar * key, int rc; DBUG_ENTER("ha_blackhole::index_read"); THD *thd= ha_thd(); - if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL) + if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && + is_row_based_replication(thd)) rc= 0; else rc= HA_ERR_END_OF_FILE; @@ -235,7 +249,8 @@ int ha_blackhole::index_read_idx_map(uchar * buf, uint idx, const uchar * key, int rc; DBUG_ENTER("ha_blackhole::index_read_idx"); THD *thd= ha_thd(); - if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL) + if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && + is_row_based_replication(thd)) rc= 0; else rc= HA_ERR_END_OF_FILE; @@ -249,7 +264,8 @@ int ha_blackhole::index_read_last_map(uchar * buf, const uchar * key, int rc; DBUG_ENTER("ha_blackhole::index_read_last"); THD *thd= ha_thd(); - if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL) + if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && + is_row_based_replication(thd)) rc= 0; else rc= HA_ERR_END_OF_FILE; _______________________________________________ commits mailing list commits@mariadb.org https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits