Hi, Monty! On Oct 16, Michael Widenius wrote:
revision-id: 12d5c4f2813 (mariadb-10.5.4-190-g12d5c4f2813) parent(s): afa5f735f30 author: Michael Widenius <michael.widenius@gmail.com> committer: Michael Widenius <michael.widenius@gmail.com> timestamp: 2020-09-13 17:48:15 +0300 message:
Fixed problems with S3 after "DROP TABLE FORCE" changes
MDEV-23691 S3 storage engine: delayed slave can drop the table
please put the MDEV on the first line of the commit comment
This commit also fixes all failing replication S3 tests.
A slave is delayed if it is trying to execute replicated queries on a table that is already converted to S3 by the master.
What do you mean, a slave is delayed?
Fixes for replication events on S3 tables for delayed slaves: - INSERT and INSERT ... SELECT and CREATE TABLE are ignored but written to the binary log. UPDATE & DELETE will be fixed in a future commit.
Other things: - On slaves with --s3-slave-ignore-updates set, allow S3 tables to be opened in read-write mode. This was done to be able to ignore-but-replicate queries like insert. Without this change any open of an S3 table failed with 'Table is read only' which is too early to be able to replicate the original query. - Errors are now printed if handler::extra() call fails in wait_while_tables_are_used() - Error message for row changes are changed from HA_ERR_WRONG_COMMAND to HA_ERR_TABLE_READONLY - Disable some maria_extra() calls for S3 tables. This could cause S3 tables to fail in some cases. - Added missing thr_lock_delete() to ma_open() in case of failure
diff --git a/sql/log.cc b/sql/log.cc index 2a887a68606..9c656387e90 100644 --- a/sql/log.cc +++ b/sql/log.cc @@ -2151,6 +2151,12 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
DBUG_RETURN(0); } + /* + This is true if we are doing an alter table that is replicated as + CREATE TABLE ... SELECT + */ + if (thd->variables.option_bits & OPTION_BIN_COMMIT_OFF) + DBUG_RETURN(0);
why binlog_commit is even called here?
DBUG_PRINT("debug", ("all: %d, in_transaction: %s, all.modified_non_trans_table: %s, stmt.modified_non_trans_table: %s", diff --git a/sql/sql_class.cc b/sql/sql_class.cc index f4cbda5490c..7c103756566 100644 --- a/sql/sql_class.cc +++ b/sql/sql_class.cc @@ -7476,6 +7476,34 @@ int THD::binlog_query(THD::enum_binlog_query_type qtype, char const *query_arg, DBUG_RETURN(0); }
+ +/** + Binlog current query as a statement if binary log is open + + @retval false ok + @retval true error +*/ + + +bool THD::binlog_current_query() +{ + if (!mysql_bin_log.is_open()) + return 0; + + /* + The following is needed if the query was "ignored" by a shared distributed + engine, in which case decide_logging_format() will set the binlog filter + */ + reset_binlog_local_stmt_filter(); + clear_binlog_local_stmt_filter(); + return binlog_query(THD::STMT_QUERY_TYPE, query(), query_length(), + /* is_trans */ FALSE, + /* direct */ FALSE, + /* suppress_use */ FALSE, + /* Error */ 0) > 0; +}
you created a generally looking function with a rather special behavior. Please rename it to make sure it's not a general purpose helper. For example, call it binlog_current_query_unfiltered() or force_binlog_current_query(). And the function comment should say Force-binlog current query as a statement ignoring the binlog filter setting
+ + void THD::wait_for_wakeup_ready() { diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc index 4ad4c478937..5f59e064a35 100644 --- a/sql/sql_insert.cc +++ b/sql/sql_insert.cc @@ -724,6 +724,7 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list, Item *unused_conds= 0; DBUG_ENTER("mysql_insert");
+ bzero((char*) &info,sizeof(info));
why did you move it up?
create_explain_query(thd->lex, thd->mem_root); /* Upgrade lock type if the requested lock is incompatible with @@ -764,16 +765,27 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list, DBUG_RETURN(TRUE); value_count= values->elements;
- if (mysql_prepare_insert(thd, table_list, table, fields, values, - update_fields, update_values, duplic, - &unused_conds, FALSE)) + if ((res= mysql_prepare_insert(thd, table_list, table, fields, values, + update_fields, update_values, duplic, + &unused_conds, FALSE))) + { + retval= thd->is_error(); + if (res < 0) + { + /* + Insert should be ignored but we have to log the query in statement + format in the binary log + */ + res= thd->binlog_current_query();
abort label doesn't look at `res` it returns `retval`. So I don't think you need `retval= thd->is_error()` at all, and instead you need if (res < 0) retval= thd->binlog_current_query();
+ } goto abort; + } + /* mysql_prepare_insert sets table_list->table if it was not set */ + table= table_list->table;
/* Prepares LEX::returing_list if it is not empty */ if (returning) result->prepare(returning->item_list, NULL); - /* mysql_prepare_insert sets table_list->table if it was not set */ - table= table_list->table;
why did you move this up?
context= &thd->lex->first_select_lex()->context; /* @@ -1575,21 +1587,29 @@ bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list, DBUG_ASSERT (!select_insert || !values);
if (mysql_handle_derived(thd->lex, DT_INIT)) DBUG_RETURN(1); if (table_list->handle_derived(thd->lex, DT_MERGE_FOR_INSERT)) DBUG_RETURN(1); if (thd->lex->handle_list_of_derived(table_list, DT_PREPARE)) DBUG_RETURN(1);
if (duplic == DUP_UPDATE) { /* it should be allocated before Item::fix_fields() */ if (table_list->set_insert_values(thd->mem_root)) DBUG_RETURN(1); + } + + if (!table) + table= table_list->table;
As far as I can see, table in mysql_prepare_insert is *always* either table_list->table or NULL. At least in 10.5. So I'd rather remove the table argument completely and let mysql_prepare_insert always use table_list->table.
+ + if (table->file->check_if_updates_are_ignored("INSERT")) + { + DBUG_RETURN(-1); }
if (mysql_prepare_insert_check_table(thd, table_list, fields, select_insert)) DBUG_RETURN(1);
/* Prepare the fields in the statement. */ if (values) diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 15d190c3139..2843496f783 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -5206,7 +5215,19 @@ int create_table_impl(THD *thd, const LEX_CSTRING &orig_db, goto err; } else if (options.if_not_exists()) + { + /* + We never come here as part of normal create table as table existance + is checked in open_and_lock_tables(). We may come here as part of + ALTER TABLE + */
How so? There is no ALTER IF NOT EXISTS. And this is the old code, so it seems to be for CREATE TABLE IF NOT EXISTS.
+ + /* Log CREATE IF NOT EXISTS on slave for distributed engines */ + if (thd->slave_thread && (exists_hton && exists_hton->flags & + HTON_IGNORE_UPDATES)) + thd->log_current_statement= 1; goto warn;
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org