Re: [Maria-developers] c4de76aeff8: MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
Hi, Aleksey! Note, it's a review of the `git diff 82e44d60d1e 8175ce8a5f1` so it's not only commit c4de76aeff8 On Apr 04, Aleksey Midenkov wrote:
revision-id: c4de76aeff8 (mariadb-10.5.2-540-gc4de76aeff8) parent(s): 82e44d60d1e author: Aleksey Midenkov <midenok@gmail.com> committer: Aleksey Midenkov <midenok@gmail.com> timestamp: 2021-03-31 21:17:55 +0300 message:
MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
Overall it's very good! I have few minor questions/comments, see below. But it's almost done, I think.
diff --git a/mysql-test/suite/versioning/r/delete_history.result b/mysql-test/suite/versioning/r/delete_history.result index cb865a835b3..90c9e4777bb 100644 --- a/mysql-test/suite/versioning/r/delete_history.result +++ b/mysql-test/suite/versioning/r/delete_history.result @@ -154,3 +152,18 @@ select * from t1; a 1 drop table t1; +# +# MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT +# +# Don't auto-create new partition on DELETE HISTORY: +create or replace table t (a int) with system versioning +partition by system_time limit 1000 auto; +delete history from t; +show create table t; +Table Create Table +t CREATE TABLE `t` ( + `a` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO +PARTITIONS 2
Hmm, and if DELETE HISTORY would auto-create new partitions, what output would one expect here? I mean, how one can see whether the test result is correct or wrong?
+drop table t; diff --git a/mysql-test/suite/versioning/t/partition.test b/mysql-test/suite/versioning/t/partition.test index 445f5844630..57b80ca0b47 100644 --- a/mysql-test/suite/versioning/t/partition.test +++ b/mysql-test/suite/versioning/t/partition.test @@ -1068,11 +1078,412 @@ alter table t1 add partition partitions 2; --replace_result $default_engine DEFAULT_ENGINE show create table t1; alter table t1 add partition partitions 8; +drop table t1; + +--echo # +--echo # MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT +--echo # +create or replace table t1 (x int) with system versioning +partition by system_time limit 1 auto; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +--echo # INSERT, INSERT .. SELECT don't increment partitions
it's not really "increment", better say "don't auto-create"
+insert into t1 values (1); + +create or replace table t2 (y int); +insert into t2 values (2); + +insert into t1 select * from t2; +insert into t2 select * from t1;
why do you need a t2 table in this test?
+--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +--echo # Too many partitions error +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto; +set timestamp= unix_timestamp('2001-01-01 00:01:00'); +--error ER_TOO_MANY_PARTITIONS_ERROR +update t1 set x= x + 1; + +--enable_info +--echo # Increment from 3 to 5 +create or replace table t1 (x int) with system versioning +partition by system_time interval 3600 second +starts '2000-01-01 00:00:00' auto partitions 3; + +insert into t1 values (1); +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +update t1 set x= x + 1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1 set x= x + 2; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +--echo # Increment from 3 to 6, manual names, LOCK TABLES +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto ( + partition p1 history, + partition p3 history, + partition pn current); + +insert into t1 values (1); --replace_result $default_engine DEFAULT_ENGINE show create table t1;
+set timestamp= unix_timestamp('2000-01-01 02:00:00'); +update t1 set x= x + 3; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1 set x= x + 4; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 04:00:00'); +lock tables t1 write; +update t1 set x= x + 5; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; +set timestamp= default; + +--echo # Couple of more LOCK TABLES cases +create or replace table t1 (x int) with system versioning +partition by system_time limit 1000 auto; +lock tables t1 write; +insert into t1 values (1); +update t1 set x= x + 1; +unlock tables; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1;
what does this test case demonstrate?
+ +--echo # Overflow prevention under LOCK TABLES +create or replace table t1 (x int) +with system versioning partition by system_time +limit 10 auto; + +insert into t1 values (1), (2), (3), (4), (5), (6), (7), (8), (9); +update t1 set x= x + 10; + +lock tables t1 write; +update t1 set x= 1 where x = 11; +update t1 set x= 2 where x = 12; +update t1 set x= 3 where x = 13; +unlock tables; + +select count(x) from t1 partition (p0); +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +drop tables t1; + +--echo # Test VIEW, LOCK TABLES +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto partitions 3; +create or replace view v1 as select * from t1; + +insert into v1 values (1);
why would a view matter in this test case?
+--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +lock tables t1 write; +update t1 set x= x + 3; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; + +drop view v1; drop tables t1;
+--echo # Multiple increments in single command +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto partitions 3; + +create or replace table t2 (y int) with system versioning +partition by system_time interval 1 hour auto; + +insert into t1 values (1); +insert into t2 values (2); + +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +update t1, t2 set x= x + 1, y= y + 1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t2; + +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +update t1, t2 set x= x + 1, y= y + 1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t2; + +--echo # Here t2 is incremented too, but not updated +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1, t2 set t1.x= 0 where t1.x< t2.y; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +# Multiupdate_prelocking_strategy::handle_end() is processed after table open. +# For PS it is possible to skip unneeded auto-creation because the above happens at +# prepare stage and auto-creation is done at execute stage. +--replace_result $default_engine DEFAULT_ENGINE 'PARTITIONS 4' 'PARTITIONS ok' 'PARTITIONS 5' 'PARTITIONS ok'
eh... I don't think this is really "ok". As far as I remember, Multiupdate_prelocking_strategy knows what tables should be opened for writing and what for reading. Why would a new partition be created for t2?
+show create table t2; + +drop tables t1, t2; + +--echo # PS, SP, LOCK TABLES +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto; + +insert into t1 values (1); + +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +execute immediate 'update t1 set x= x + 5'; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +prepare s from 'update t1 set x= x + 6'; +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +execute s; execute s; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +lock tables t1 write; +execute s; execute s; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; +drop prepare s; + +create procedure sp() update t1 set x= x + 7; +set timestamp= unix_timestamp('2000-01-01 04:00:00'); +call sp; call sp; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +set timestamp= unix_timestamp('2000-01-01 05:00:00'); +lock tables t1 write; +call sp; call sp; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; +drop procedure sp; + +set timestamp= unix_timestamp('2001-01-01 00:00:00'); +create or replace table t1 (i int) with system versioning +partition by system_time interval 1 day starts '2000-01-01 00:00:00'; +insert into t1 values (0); +set timestamp= unix_timestamp('2001-01-01 00:00:01'); +prepare s from 'update t1 set i= i + 1'; +execute s; +set timestamp= unix_timestamp('2001-01-02 00:00:01'); +execute s; +drop prepare s; + +if (!$MTR_COMBINATION_HEAP)
because of blobs?
+{ +--echo # Complex table +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 ( + x int primary key auto_increment, ... +--echo # Transaction +create or replace table t1 (x int) with system versioning engine innodb +partition by system_time interval 1 hour auto; +start transaction; +insert into t1 values (1); +select * from t1; + +--connect con1, localhost, root +set lock_wait_timeout= 1; +set innodb_lock_wait_timeout= 1; +--error ER_LOCK_WAIT_TIMEOUT +update t1 set x= x + 111; +select * from t1;
what do you test here? (there is no show create table in the test)
+ +# cleanup +--disconnect con1 +--connection default +drop table t1; + +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning engine innodb +partition by system_time interval 1 hour auto; + +insert into t1 values (1); +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +start transaction; +update t1 set x= 0; +--connect con1, localhost, root +select * from t1;
if you add a show create table here, what would it show?
+--connection default +commit; +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +start transaction; +update t1 set x= 1; +--connection con1 +select * from t1; +--connection default +rollback; +show create table t1; +--disconnect con1 +--connection default +drop table t1; + +--echo # +--echo # MENT-724 Locking timeout caused by auto-creation affects original DML
I'd better avoid MENT references here. Let's only mention Jira issues that users can actually see
+--echo # +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int primary key) with system versioning engine innodb +partition by system_time interval 1 hour auto; + +insert into t1 values (1); +start transaction; +insert into t1 values (2); + +--connect con1, localhost, root +set lock_wait_timeout= 1; +set innodb_lock_wait_timeout= 1; +set timestamp= unix_timestamp('2000-01-01 01:00:01'); +update t1 set x= x + 122 where x = 1;
isn't it a test that you already have above? with x = x + 111
+--disconnect con1 +--connection default +select * from t1; + +# cleanup +drop table t1; +set timestamp= default; + --echo # --echo # End of 10.5 tests --echo # diff --git a/mysql-test/suite/versioning/r/partition.result b/mysql-test/suite/versioning/r/partition.result index 2a277b1c2ea..8369b40d98c 100644 --- a/mysql-test/suite/versioning/r/partition.result +++ b/mysql-test/suite/versioning/r/partition.result @@ -1236,27 +1270,752 @@ t1 CREATE TABLE `t1` ( ... +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +affected rows: 0 +lock tables t1 write; +affected rows: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 6
why did it add two partitions here?
+affected rows: 1 +unlock tables; +affected rows: 0 +drop prepare s; +affected rows: 0 +create procedure sp() update t1 set x= x + 7; +affected rows: 0 +set timestamp= unix_timestamp('2000-01-01 04:00:00'); +affected rows: 0 +call sp; +affected rows: 1 +call sp; +affected rows: 1 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 6 +affected rows: 1 +set timestamp= unix_timestamp('2000-01-01 05:00:00'); +affected rows: 0 +lock tables t1 write; +affected rows: 0 +call sp; +affected rows: 1 +call sp; +affected rows: 1 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 8
and two here
+affected rows: 1 +unlock tables; +affected rows: 0 +drop procedure sp; +affected rows: 0 +set timestamp= unix_timestamp('2001-01-01 00:00:00'); +affected rows: 0 +create or replace table t1 (i int) with system versioning +partition by system_time interval 1 day starts '2000-01-01 00:00:00'; +affected rows: 0 +insert into t1 values (0); +affected rows: 1 +set timestamp= unix_timestamp('2001-01-01 00:00:01'); +affected rows: 0 +prepare s from 'update t1 set i= i + 1'; +affected rows: 0 +info: Statement prepared +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 1 +Warnings: +Warning 4114 Versioned table `test`.`t1`: last HISTORY partition (`p0`) is out of INTERVAL, need more HISTORY partitions
why?
+set timestamp= unix_timestamp('2001-01-02 00:00:01'); +affected rows: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 1 +Warnings: +Warning 4114 Versioned table `test`.`t1`: last HISTORY partition (`p0`) is out of INTERVAL, need more HISTORY partitions +drop prepare s; +affected rows: 0 +# Complex table ... diff --git a/mysql-test/suite/versioning/t/rpl.test b/mysql-test/suite/versioning/t/rpl.test index b5be68feece..54258a8bdf1 100644 --- a/mysql-test/suite/versioning/t/rpl.test +++ b/mysql-test/suite/versioning/t/rpl.test @@ -133,4 +133,44 @@ sync_slave_with_master; connection master; drop table t1;
+--echo # +--echo # MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT +--echo # +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto; +insert t1 values (); +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +delete from t1; +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; +--sync_slave_with_master +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; +--connection master +drop table t1; +set timestamp= default; + +--echo # +--echo # MENT-685 DML events for auto-partitioned tables are written into binary log twice
same comment about MENT
+--echo # +create table t1 (a int) with system versioning +partition by system_time limit 1 auto; + +insert into t1 values (1); +update t1 set a= a + 1; +update t1 set a= a + 1; +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; +select * from t1; + +--sync_slave_with_master +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; + +select * from t1; +--connection master +# cleanup +drop table t1;
may be show binlog events? you know, to verify that DML events aren't written into binary log twice
+ --source include/rpl_end.inc diff --git a/sql/ha_partition.h b/sql/ha_partition.h index 60a2d7f6762..7ade4419c22 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -1610,7 +1610,7 @@ class ha_partition :public handler for (; part_id < part_id_end; ++part_id) { handler *file= m_file[part_id]; - DBUG_ASSERT(bitmap_is_set(&(m_part_info->read_partitions), part_id)); + bitmap_set_bit(&(m_part_info->read_partitions), part_id);
Where was it set before your patch?
file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK | HA_STATUS_OPEN); part_recs+= file->stats.records; } diff --git a/sql/handler.cc b/sql/handler.cc index b312635c8ee..6bb6c279193 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -1572,7 +1572,7 @@ int ha_commit_trans(THD *thd, bool all) DBUG_ASSERT(thd->transaction->stmt.ha_list == NULL || trans == &thd->transaction->stmt);
- if (thd->in_sub_stmt) + if (thd->in_sub_stmt & ~SUB_STMT_AUTO_HIST) {
where is this ha_commit_trans(thd, false) called from? mysql_alter_table that adds a new partition?
DBUG_ASSERT(0); /* diff --git a/sql/partition_info.cc b/sql/partition_info.cc index 871411cf6c4..cf8dd140553 100644 --- a/sql/partition_info.cc +++ b/sql/partition_info.cc @@ -816,10 +816,17 @@ bool partition_info::has_unique_name(partition_element *element) vers_info->interval Limit by fixed time interval vers_info->hist_part (out) Working history partition */ -void partition_info::vers_set_hist_part(THD *thd) +uint partition_info::vers_set_hist_part(THD *thd, bool auto_hist) { + DBUG_ASSERT(!thd->lex->last_table() || + !thd->lex->last_table()->vers_conditions.delete_history);
is that right? Conceptually you need to test vers_conditions.delete_history for the table that owns this partition_info. Is it always last_table() ? I'd say it'd be more logical to do, like part_field_array[0]->table->pos_in_table_list
+ + uint create_count= 0; + auto_hist= auto_hist && vers_info->auto_hist; + if (vers_info->limit) { + DBUG_ASSERT(!vers_info->interval.is_set()); ha_partition *hp= (ha_partition*)(table->file); partition_element *next= NULL; List_iterator<partition_element> it(partitions); @@ -862,9 +869,183 @@ void partition_info::vers_set_hist_part(THD *thd) { vers_info->hist_part= next; if (next->range_value > thd->query_start()) - return; + { + error= false; + break; + } + } + if (error) + { + if (auto_hist) + { + DBUG_ASSERT(thd->query_start() >= vers_info->hist_part->range_value); + my_time_t diff= thd->query_start() - (my_time_t) vers_info->hist_part->range_value; + if (diff > 0) + { + size_t delta= vers_info->interval.seconds(); + create_count= (uint) (diff / delta + 1); + if (diff % delta) + create_count++; + } + else + create_count= 1; + } + else + { + my_error(WARN_VERS_PART_FULL, MYF(ME_WARNING|ME_ERROR_LOG),
after such an overflow a table becomes essentially corrupted, rows are in a wrong partition. So new history partitions cannot be added anymore without reorganizing the last partition. How is it handled now? (it's just a question, not part of the review, as it's not something you were changing in your patch)
+ table->s->db.str, table->s->table_name.str, + vers_info->hist_part->partition_name, "INTERVAL"); + } } } + + /* + When hist_part is almost full LOCK TABLES my overflow the partition as we
s/my/may/
+ can't add new partitions under LOCK TABLES. Reserve one for this case. + */ + if (auto_hist && create_count == 0 && + thd->lex->sql_command == SQLCOM_LOCK_TABLES && + vers_info->hist_part->id + 1 == vers_info->now_part->id) + create_count= 1; + + return create_count; +} + + +/** + @brief Run fast_alter_partition_table() to add new history partitions + for tables requiring them. +*/ +bool vers_add_auto_hist_parts(THD *thd, TABLE_LIST* tl, uint num_parts) +{ + bool result= true; + HA_CREATE_INFO create_info; + Alter_info alter_info; + partition_info *save_part_info= thd->work_part_info; + Query_tables_list save_query_tables; + Reprepare_observer *save_reprepare_observer= thd->m_reprepare_observer; + bool save_no_write_to_binlog= thd->lex->no_write_to_binlog; + thd->m_reprepare_observer= NULL; + thd->lex->reset_n_backup_query_tables_list(&save_query_tables); + thd->in_sub_stmt|= SUB_STMT_AUTO_HIST; + thd->lex->no_write_to_binlog= true; + TABLE *table= tl->table; + + DBUG_ASSERT(!thd->is_error()); + + { + DBUG_ASSERT(table->s->get_table_ref_type() == TABLE_REF_BASE_TABLE); + DBUG_ASSERT(table->versioned()); + DBUG_ASSERT(table->part_info); + DBUG_ASSERT(table->part_info->vers_info); + alter_info.reset(); + alter_info.partition_flags= ALTER_PARTITION_ADD|ALTER_PARTITION_AUTO_HIST; + create_info.init(); + create_info.alter_info= &alter_info; + Alter_table_ctx alter_ctx(thd, tl, 1, &table->s->db, &table->s->table_name); + + MDL_REQUEST_INIT(&tl->mdl_request, MDL_key::TABLE, tl->db.str, + tl->table_name.str, MDL_SHARED_NO_WRITE, MDL_TRANSACTION); + if (thd->mdl_context.acquire_lock(&tl->mdl_request, + thd->variables.lock_wait_timeout)) + goto exit; + table->mdl_ticket= tl->mdl_request.ticket; + + create_info.db_type= table->s->db_type(); + create_info.options|= HA_VERSIONED_TABLE; + DBUG_ASSERT(create_info.db_type); + + create_info.vers_info.set_start(table->s->vers_start_field()->field_name); + create_info.vers_info.set_end(table->s->vers_end_field()->field_name); + + partition_info *part_info= new partition_info(); + if (unlikely(!part_info)) + { + my_error(ER_OUT_OF_RESOURCES, MYF(0)); + goto exit; + } + part_info->use_default_num_partitions= false; + part_info->use_default_num_subpartitions= false; + part_info->num_parts= num_parts; + part_info->num_subparts= table->part_info->num_subparts; + part_info->subpart_type= table->part_info->subpart_type; + if (unlikely(part_info->vers_init_info(thd))) + { + my_error(ER_OUT_OF_RESOURCES, MYF(0)); + goto exit; + } + + // NB: set_ok_status() requires DA_EMPTY + thd->get_stmt_da()->reset_diagnostics_area();
would it not be DA_EMPTY without a reset? this is at the beginning of a statement, I'd expect it normally be DA_EMPTY here. What other DA states can you get here?
+ + thd->work_part_info= part_info; + if (part_info->set_up_defaults_for_partitioning(thd, table->file, NULL, + table->part_info->next_part_no(num_parts))) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, + ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "setting up defaults failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + bool partition_changed= false; + bool fast_alter_partition= false; + if (prep_alter_part_table(thd, table, &alter_info, &create_info, + &partition_changed, &fast_alter_partition)) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "alter partitition prepare failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + if (!fast_alter_partition) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "fast alter partitition is not possible"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + DBUG_ASSERT(partition_changed); + if (mysql_prepare_alter_table(thd, table, &create_info, &alter_info, + &alter_ctx)) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "alter prepare failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + + if (fast_alter_partition_table(thd, table, &alter_info, &create_info, + tl, &table->s->db, &table->s->table_name)) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "alter partition table failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + } + + result= false; + // NB: we have to return DA_EMPTY for new command
may be DBUG_ASSERT(thd->get_stmt_da()->is_ok());
+ thd->get_stmt_da()->reset_diagnostics_area(); + +exit: + thd->work_part_info= save_part_info; + thd->m_reprepare_observer= save_reprepare_observer; + thd->lex->restore_backup_query_tables_list(&save_query_tables); + thd->in_sub_stmt&= ~SUB_STMT_AUTO_HIST; + thd->lex->no_write_to_binlog= save_no_write_to_binlog; + return result; }
diff --git a/sql/partition_info.h b/sql/partition_info.h index 0656238ec07..94093353d60 100644 --- a/sql/partition_info.h +++ b/sql/partition_info.h @@ -72,9 +73,26 @@ struct Vers_part_info : public Sql_alloc my_time_t start; INTERVAL step; enum interval_type type; - bool is_set() { return type < INTERVAL_LAST; } + bool is_set() const { return type < INTERVAL_LAST; } + size_t seconds() const + { + if (step.second) + return step.second; + if (step.minute) + return step.minute * 60; + if (step.hour) + return step.hour * 3600; + if (step.day) + return step.day * 3600 * 24; + // comparison is used in rough estimates, it doesn't need to be calendar-correct + if (step.month) + return step.month * 3600 * 24 * 30;
shouldn't it be `* 28` ? you need a most pessimistic estimate to make sure you never miss to create a partition when one is needed. You can sometimes create one when it's not needed yet, but it's less of a problem.
+ DBUG_ASSERT(step.year); + return step.year * 86400 * 30 * 365;
that's one `* 30` too many :)
+ } } interval; ulonglong limit; + bool auto_hist; partition_element *now_part; partition_element *hist_part; }; diff --git a/sql/sql_base.cc b/sql/sql_base.cc index 1a1186aca73..d0e255186da 100644 --- a/sql/sql_base.cc +++ b/sql/sql_base.cc @@ -1625,6 +1625,52 @@ bool is_locked_view(THD *thd, TABLE_LIST *t) }
+bool TABLE::vers_need_hist_part(const THD *thd, const TABLE_LIST *table_list) const +{ +#ifdef WITH_PARTITION_STORAGE_ENGINE + if (part_info && part_info->part_type == VERSIONING_PARTITION && + !table_list->vers_conditions.delete_history && + !thd->stmt_arena->is_stmt_prepare() && + table_list->lock_type >= TL_WRITE_ALLOW_WRITE && + table_list->mdl_request.type >= MDL_SHARED_WRITE && + table_list->mdl_request.type < MDL_EXCLUSIVE) + { + switch (thd->lex->sql_command) + {
SQLCOM_LOAD with DUP_REPLACE?
+ case SQLCOM_INSERT: + if (thd->lex->duplicates != DUP_UPDATE) + break; + /* fall-through: */ + case SQLCOM_LOCK_TABLES: + case SQLCOM_DELETE: + case SQLCOM_UPDATE: + case SQLCOM_REPLACE: + case SQLCOM_REPLACE_SELECT: + case SQLCOM_DELETE_MULTI: + case SQLCOM_UPDATE_MULTI: + return true; + default:; + break; + } + if (thd->rgi_slave && thd->rgi_slave->current_event && thd->lex->sql_command == SQLCOM_END)
I wonder, would it be possible, instead of introducing rgi_slave->current_event, just make row events set thd->lex->sql_command appropriately? For example, currently row events increment thd->status_var.com_stat[] each event for its own SQLCOM_xxx, it won't be needed if they'll just set thd->lex->sql_command.
+ { + switch (thd->rgi_slave->current_event->get_type_code()) + { + case UPDATE_ROWS_EVENT: + case UPDATE_ROWS_EVENT_V1: + case DELETE_ROWS_EVENT: + case DELETE_ROWS_EVENT_V1: + return true; + default:; + break; + } + } + } +#endif + return false; +} + + /** Open a base table.
@@ -1777,6 +1823,11 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) DBUG_PRINT("info",("Using locked table")); #ifdef WITH_PARTITION_STORAGE_ENGINE part_names_error= set_partitions_as_used(table_list, table); + if (table->vers_need_hist_part(thd, table_list)) + { + /* Rotation is still needed under LOCK TABLES */
a bit confusing comment, you left out stuff that are obvious to you but not to others. I'd suggest something like /* New partitions are not auto-created under LOCK TABLES (TODO: fix it) but rotation can still happen. */
+ table->part_info->vers_set_hist_part(thd, false); + } #endif goto reset; } @@ -2032,6 +2083,20 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) tc_add_table(thd, table); }
+ if (!ot_ctx->vers_create_count &&
what does this condition mean? When is ot_ctx->vers_create_count > 0 ?
+ table->vers_need_hist_part(thd, table_list)) + { + ot_ctx->vers_create_count= table->part_info->vers_set_hist_part(thd, true); + if (ot_ctx->vers_create_count) + { + MYSQL_UNBIND_TABLE(table->file); + tc_release_table(table); + ot_ctx->request_backoff_action(Open_table_context::OT_ADD_HISTORY_PARTITION, + table_list); + DBUG_RETURN(TRUE); + } + } + if (!(flags & MYSQL_OPEN_HAS_MDL_LOCK) && table->s->table_category < TABLE_CATEGORY_INFORMATION) { @@ -3172,10 +3239,18 @@ Open_table_context::recover_from_failed_open() break; case OT_DISCOVER: case OT_REPAIR: - if ((result= lock_table_names(m_thd, m_thd->lex->create_info, - m_failed_table, NULL, - get_timeout(), 0))) + case OT_ADD_HISTORY_PARTITION: + result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, + NULL, get_timeout(), 0); + if (result) + { + if (m_action == OT_ADD_HISTORY_PARTITION) + { + m_thd->clear_error();
why?
+ result= false; + } break; + }
tdc_remove_table(m_thd, m_failed_table->db.str, m_failed_table->table_name.str); diff --git a/sql/sql_error.h b/sql/sql_error.h index 318d5076534..92076616adb 100644 --- a/sql/sql_error.h +++ b/sql/sql_error.h @@ -1195,7 +1195,6 @@ class Diagnostics_area: public Sql_state_errno,
void copy_non_errors_from_wi(THD *thd, const Warning_info *src_wi);
-private:
It doesn't seem you're using get_warning_info() anywhere
Warning_info *get_warning_info() { return m_wi_stack.front(); }
const Warning_info *get_warning_info() const { return m_wi_stack.front(); } diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc index 86ba393fb7b..f4a960afe47 100644 --- a/sql/sql_partition.cc +++ b/sql/sql_partition.cc @@ -7237,7 +7246,8 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, } else if ((alter_info->partition_flags & ALTER_PARTITION_ADD) && (part_info->part_type == RANGE_PARTITION || - part_info->part_type == LIST_PARTITION)) + part_info->part_type == LIST_PARTITION || + alter_info->partition_flags & ALTER_PARTITION_AUTO_HIST))
it'd be good it adding new empty VERSIONING partitions would always go this way, auto or not. but it's a separate task.
{ /* ADD RANGE/LIST PARTITIONS @@ -7281,7 +7291,7 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, write_log_add_change_partition(lpt) || ERROR_INJECT_CRASH("crash_add_partition_4") || ERROR_INJECT_ERROR("fail_add_partition_4") || - mysql_change_partitions(lpt) || + mysql_change_partitions(lpt, false) ||
this way you skip trans_commit_stmt/trans_commit_implicit for ALTER TABLE ... ADD RANGE/LIST partition. is it always ok? A safer alternative would've been mysql_change_partitions(lpt, !(alter_info->partition_flags & ALTER_PARTITION_AUTO_HIST)) but it's more complex, so I'd prefer your variant, if we can be sure that it doesn't break anything.
ERROR_INJECT_CRASH("crash_add_partition_5") || ERROR_INJECT_ERROR("fail_add_partition_5") || alter_close_table(lpt) ||
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei, On Sun, Apr 4, 2021 at 2:48 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
Note, it's a review of the `git diff 82e44d60d1e 8175ce8a5f1` so it's not only commit c4de76aeff8
On Apr 04, Aleksey Midenkov wrote:
revision-id: c4de76aeff8 (mariadb-10.5.2-540-gc4de76aeff8) parent(s): 82e44d60d1e author: Aleksey Midenkov <midenok@gmail.com> committer: Aleksey Midenkov <midenok@gmail.com> timestamp: 2021-03-31 21:17:55 +0300 message:
MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
Overall it's very good! I have few minor questions/comments, see below. But it's almost done, I think.
diff --git a/mysql-test/suite/versioning/r/delete_history.result b/mysql-test/suite/versioning/r/delete_history.result index cb865a835b3..90c9e4777bb 100644 --- a/mysql-test/suite/versioning/r/delete_history.result +++ b/mysql-test/suite/versioning/r/delete_history.result @@ -154,3 +152,18 @@ select * from t1; a 1 drop table t1; +# +# MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT +# +# Don't auto-create new partition on DELETE HISTORY: +create or replace table t (a int) with system versioning +partition by system_time limit 1000 auto; +delete history from t; +show create table t; +Table Create Table +t CREATE TABLE `t` ( + `a` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO +PARTITIONS 2
Hmm, and if DELETE HISTORY would auto-create new partitions, what output would one expect here?
I mean, how one can see whether the test result is correct or wrong?
By PARTITIONS value. Stale test case, fixed.
+drop table t; diff --git a/mysql-test/suite/versioning/t/partition.test b/mysql-test/suite/versioning/t/partition.test index 445f5844630..57b80ca0b47 100644 --- a/mysql-test/suite/versioning/t/partition.test +++ b/mysql-test/suite/versioning/t/partition.test @@ -1068,11 +1078,412 @@ alter table t1 add partition partitions 2; --replace_result $default_engine DEFAULT_ENGINE show create table t1; alter table t1 add partition partitions 8; +drop table t1; + +--echo # +--echo # MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT +--echo # +create or replace table t1 (x int) with system versioning +partition by system_time limit 1 auto; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +--echo # INSERT, INSERT .. SELECT don't increment partitions
it's not really "increment", better say "don't auto-create"
Actually I like "increment" more. "Auto-create" overcomplicates phrases: --echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command Besides "increment" is correct because PARTITIONS number is incremented.
+insert into t1 values (1); + +create or replace table t2 (y int); +insert into t2 values (2); + +insert into t1 select * from t2; +insert into t2 select * from t1;
why do you need a t2 table in this test?
t1 is not incremented in case of insert into t2 select * from t1;
+--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +--echo # Too many partitions error +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto; +set timestamp= unix_timestamp('2001-01-01 00:01:00'); +--error ER_TOO_MANY_PARTITIONS_ERROR +update t1 set x= x + 1; + +--enable_info +--echo # Increment from 3 to 5 +create or replace table t1 (x int) with system versioning +partition by system_time interval 3600 second +starts '2000-01-01 00:00:00' auto partitions 3; + +insert into t1 values (1); +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +update t1 set x= x + 1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1 set x= x + 2; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +--echo # Increment from 3 to 6, manual names, LOCK TABLES +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto ( + partition p1 history, + partition p3 history, + partition pn current); + +insert into t1 values (1); --replace_result $default_engine DEFAULT_ENGINE show create table t1;
+set timestamp= unix_timestamp('2000-01-01 02:00:00'); +update t1 set x= x + 3; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1 set x= x + 4; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 04:00:00'); +lock tables t1 write; +update t1 set x= x + 5; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; +set timestamp= default; + +--echo # Couple of more LOCK TABLES cases +create or replace table t1 (x int) with system versioning +partition by system_time limit 1000 auto; +lock tables t1 write; +insert into t1 values (1); +update t1 set x= x + 1; +unlock tables; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1;
what does this test case demonstrate?
Fixed stale test case.
+ +--echo # Overflow prevention under LOCK TABLES +create or replace table t1 (x int) +with system versioning partition by system_time +limit 10 auto; + +insert into t1 values (1), (2), (3), (4), (5), (6), (7), (8), (9); +update t1 set x= x + 10; + +lock tables t1 write; +update t1 set x= 1 where x = 11; +update t1 set x= 2 where x = 12; +update t1 set x= 3 where x = 13; +unlock tables; + +select count(x) from t1 partition (p0); +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +drop tables t1; + +--echo # Test VIEW, LOCK TABLES +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto partitions 3; +create or replace view v1 as select * from t1; + +insert into v1 values (1);
why would a view matter in this test case?
Made it matter.
+--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +lock tables t1 write; +update t1 set x= x + 3; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; + +drop view v1; drop tables t1;
+--echo # Multiple increments in single command +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto partitions 3; + +create or replace table t2 (y int) with system versioning +partition by system_time interval 1 hour auto; + +insert into t1 values (1); +insert into t2 values (2); + +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +update t1, t2 set x= x + 1, y= y + 1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t2; + +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +update t1, t2 set x= x + 1, y= y + 1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +--replace_result $default_engine DEFAULT_ENGINE +show create table t2; + +--echo # Here t2 is incremented too, but not updated +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1, t2 set t1.x= 0 where t1.x< t2.y; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +# Multiupdate_prelocking_strategy::handle_end() is processed after table open. +# For PS it is possible to skip unneeded auto-creation because the above happens at +# prepare stage and auto-creation is done at execute stage. +--replace_result $default_engine DEFAULT_ENGINE 'PARTITIONS 4' 'PARTITIONS ok' 'PARTITIONS 5' 'PARTITIONS ok'
eh... I don't think this is really "ok". As far as I remember, Multiupdate_prelocking_strategy knows what tables should be opened for writing and what for reading. Why would a new partition be created for t2?
It knows this after tables are opened. Look at handle_end(), specifically mysql_handle_derived(), handle_derived(), setup_fields_with_no_wrap() and check_fields(). I believe all these calls are required to get proper get_table_map(). To get this working properly there must be 2-staged open tables, something like PS does.
+show create table t2; + +drop tables t1, t2; + +--echo # PS, SP, LOCK TABLES +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto; + +insert into t1 values (1); + +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +execute immediate 'update t1 set x= x + 5'; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +prepare s from 'update t1 set x= x + 6'; +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +execute s; execute s; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +lock tables t1 write; +execute s; execute s; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; +drop prepare s; + +create procedure sp() update t1 set x= x + 7; +set timestamp= unix_timestamp('2000-01-01 04:00:00'); +call sp; call sp; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +set timestamp= unix_timestamp('2000-01-01 05:00:00'); +lock tables t1 write; +call sp; call sp; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +unlock tables; +drop procedure sp; + +set timestamp= unix_timestamp('2001-01-01 00:00:00'); +create or replace table t1 (i int) with system versioning +partition by system_time interval 1 day starts '2000-01-01 00:00:00'; +insert into t1 values (0); +set timestamp= unix_timestamp('2001-01-01 00:00:01'); +prepare s from 'update t1 set i= i + 1'; +execute s; +set timestamp= unix_timestamp('2001-01-02 00:00:01'); +execute s; +drop prepare s; + +if (!$MTR_COMBINATION_HEAP)
because of blobs?
Because of blobs.
+{ +--echo # Complex table +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 ( + x int primary key auto_increment, ... +--echo # Transaction +create or replace table t1 (x int) with system versioning engine innodb +partition by system_time interval 1 hour auto; +start transaction; +insert into t1 values (1); +select * from t1; + +--connect con1, localhost, root +set lock_wait_timeout= 1; +set innodb_lock_wait_timeout= 1; +--error ER_LOCK_WAIT_TIMEOUT +update t1 set x= x + 111; +select * from t1;
what do you test here? (there is no show create table in the test)
This was MENT-675 Assertion `thd->transaction.stmt.is_empty() || thd->in_sub_stmt' failed I moved it to MDEV-25339 and added a comment.
+ +# cleanup +--disconnect con1 +--connection default +drop table t1; + +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning engine innodb +partition by system_time interval 1 hour auto; + +insert into t1 values (1); +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +start transaction; +update t1 set x= 0; +--connect con1, localhost, root +select * from t1;
if you add a show create table here, what would it show?
Added. +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 3
+--connection default +commit; +show create table t1; + +set timestamp= unix_timestamp('2000-01-01 02:00:00'); +start transaction; +update t1 set x= 1; +--connection con1 +select * from t1; +--connection default +rollback; +show create table t1; +--disconnect con1 +--connection default +drop table t1; + +--echo # +--echo # MENT-724 Locking timeout caused by auto-creation affects original DML
I'd better avoid MENT references here. Let's only mention Jira issues that users can actually see
Fixed.
+--echo # +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int primary key) with system versioning engine innodb +partition by system_time interval 1 hour auto; + +insert into t1 values (1); +start transaction; +insert into t1 values (2); + +--connect con1, localhost, root +set lock_wait_timeout= 1; +set innodb_lock_wait_timeout= 1; +set timestamp= unix_timestamp('2000-01-01 01:00:01'); +update t1 set x= x + 122 where x = 1;
isn't it a test that you already have above? with x = x + 111
Not fully. That was the original bug posted by Elena (MDEV-25339). And this modification is posted by me (MDEV-23642).
+--disconnect con1 +--connection default +select * from t1; + +# cleanup +drop table t1; +set timestamp= default; + --echo # --echo # End of 10.5 tests --echo # diff --git a/mysql-test/suite/versioning/r/partition.result b/mysql-test/suite/versioning/r/partition.result index 2a277b1c2ea..8369b40d98c 100644 --- a/mysql-test/suite/versioning/r/partition.result +++ b/mysql-test/suite/versioning/r/partition.result @@ -1236,27 +1270,752 @@ t1 CREATE TABLE `t1` ( ... +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +affected rows: 0 +lock tables t1 write; +affected rows: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 6
why did it add two partitions here?
Because of this condition /* When hist_part is almost full LOCK TABLES my overflow the partition as we can't add new partitions under LOCK TABLES. Reserve one more for this case. */ if (auto_hist && create_count < 2 && thd->lex->sql_command == SQLCOM_LOCK_TABLES && vers_info->hist_part->id + 1 == vers_info->now_part->id) create_count++;
+affected rows: 1 +unlock tables; +affected rows: 0 +drop prepare s; +affected rows: 0 +create procedure sp() update t1 set x= x + 7; +affected rows: 0 +set timestamp= unix_timestamp('2000-01-01 04:00:00'); +affected rows: 0 +call sp; +affected rows: 1 +call sp; +affected rows: 1 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 6 +affected rows: 1 +set timestamp= unix_timestamp('2000-01-01 05:00:00'); +affected rows: 0 +lock tables t1 write; +affected rows: 0 +call sp; +affected rows: 1 +call sp; +affected rows: 1 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 8
and two here
Same here.
+affected rows: 1 +unlock tables; +affected rows: 0 +drop procedure sp; +affected rows: 0 +set timestamp= unix_timestamp('2001-01-01 00:00:00'); +affected rows: 0 +create or replace table t1 (i int) with system versioning +partition by system_time interval 1 day starts '2000-01-01 00:00:00'; +affected rows: 0 +insert into t1 values (0); +affected rows: 1 +set timestamp= unix_timestamp('2001-01-01 00:00:01'); +affected rows: 0 +prepare s from 'update t1 set i= i + 1'; +affected rows: 0 +info: Statement prepared +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 1 +Warnings: +Warning 4114 Versioned table `test`.`t1`: last HISTORY partition (`p0`) is out of INTERVAL, need more HISTORY partitions
why?
Actually this is a typo in the test case: year 2000 vs 2001. Fixed.
+set timestamp= unix_timestamp('2001-01-02 00:00:01'); +affected rows: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 1 +Warnings: +Warning 4114 Versioned table `test`.`t1`: last HISTORY partition (`p0`) is out of INTERVAL, need more HISTORY partitions +drop prepare s; +affected rows: 0 +# Complex table ... diff --git a/mysql-test/suite/versioning/t/rpl.test b/mysql-test/suite/versioning/t/rpl.test index b5be68feece..54258a8bdf1 100644 --- a/mysql-test/suite/versioning/t/rpl.test +++ b/mysql-test/suite/versioning/t/rpl.test @@ -133,4 +133,44 @@ sync_slave_with_master; connection master; drop table t1;
+--echo # +--echo # MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT +--echo # +set timestamp= unix_timestamp('2000-01-01 00:00:00'); +create or replace table t1 (x int) with system versioning +partition by system_time interval 1 hour auto; +insert t1 values (); +set timestamp= unix_timestamp('2000-01-01 01:00:00'); +delete from t1; +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; +--sync_slave_with_master +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; +--connection master +drop table t1; +set timestamp= default; + +--echo # +--echo # MENT-685 DML events for auto-partitioned tables are written into binary log twice
same comment about MENT
Fixed.
+--echo # +create table t1 (a int) with system versioning +partition by system_time limit 1 auto; + +insert into t1 values (1); +update t1 set a= a + 1; +update t1 set a= a + 1; +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; +select * from t1; + +--sync_slave_with_master +--replace_result InnoDB ENGINE MyISAM ENGINE MEMORY ENGINE +show create table t1; + +select * from t1; +--connection master +# cleanup +drop table t1;
may be show binlog events? you know, to verify that DML events aren't written into binary log twice
Added.
+ --source include/rpl_end.inc diff --git a/sql/ha_partition.h b/sql/ha_partition.h index 60a2d7f6762..7ade4419c22 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -1610,7 +1610,7 @@ class ha_partition :public handler for (; part_id < part_id_end; ++part_id) { handler *file= m_file[part_id]; - DBUG_ASSERT(bitmap_is_set(&(m_part_info->read_partitions), part_id)); + bitmap_set_bit(&(m_part_info->read_partitions), part_id);
Where was it set before your patch?
#1 0x0000000000a99d04 in partition_info::set_partition_bitmaps (this=0x7fc214b7af80, partition_names=0x0 ) at ../src/sql/partition_info.cc:274 #2 0x0000000001087da4 in ha_partition::change_partitions_to_open (this=0x7fc214b787a0, partition_names=0 x0) at ../src/sql/ha_partition.cc:8644 #3 0x0000000000822c01 in set_partitions_as_used (tl=0x7fc214015988, t=0x7fc214a61fb8) at ../src/sql/sql_ base.cc:1562 #4 0x00000000008222a7 in open_table (thd=0x7fc214000d48, table_list=0x7fc214015988, ot_ctx=0x7fc23c0a19c 8) at ../src/sql/sql_base.cc:1990 #5 0x0000000000827497 in open_and_process_table (thd=0x7fc214000d48, tables=0x7fc214015988, counter=0x7f c23c0a1acc, flags=0, prelocking_strategy=0x7fc23c0a1b40, has_prelocking_list=false, ot_ctx=0x7fc23c0a19c8 ) at ../src/sql/sql_base.cc:3801 #6 0x0000000000825de0 in open_tables (thd=0x7fc214000d48, options=..., start=0x7fc23c0a1ae0, counter=0x7 fc23c0a1acc, flags=0, prelocking_strategy=0x7fc23c0a1b40) at ../src/sql/sql_base.cc:4275
file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK | HA_STATUS_OPEN); part_recs+= file->stats.records; } diff --git a/sql/handler.cc b/sql/handler.cc index b312635c8ee..6bb6c279193 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -1572,7 +1572,7 @@ int ha_commit_trans(THD *thd, bool all) DBUG_ASSERT(thd->transaction->stmt.ha_list == NULL || trans == &thd->transaction->stmt);
- if (thd->in_sub_stmt) + if (thd->in_sub_stmt & ~SUB_STMT_AUTO_HIST) {
where is this ha_commit_trans(thd, false) called from? mysql_alter_table that adds a new partition?
Looks like garbage. Removed SUB_STMT_AUTO_HIST.
DBUG_ASSERT(0); /* diff --git a/sql/partition_info.cc b/sql/partition_info.cc index 871411cf6c4..cf8dd140553 100644 --- a/sql/partition_info.cc +++ b/sql/partition_info.cc @@ -816,10 +816,17 @@ bool partition_info::has_unique_name(partition_element *element) vers_info->interval Limit by fixed time interval vers_info->hist_part (out) Working history partition */ -void partition_info::vers_set_hist_part(THD *thd) +uint partition_info::vers_set_hist_part(THD *thd, bool auto_hist) { + DBUG_ASSERT(!thd->lex->last_table() || + !thd->lex->last_table()->vers_conditions.delete_history);
is that right? Conceptually you need to test vers_conditions.delete_history for the table that owns this partition_info. Is it always last_table() ?
It is always last_table() because DELETE HISTORY works with one table.
I'd say it'd be more logical to do, like
part_field_array[0]->table->pos_in_table_list
Fixed.
+ + uint create_count= 0; + auto_hist= auto_hist && vers_info->auto_hist; + if (vers_info->limit) { + DBUG_ASSERT(!vers_info->interval.is_set()); ha_partition *hp= (ha_partition*)(table->file); partition_element *next= NULL; List_iterator<partition_element> it(partitions); @@ -862,9 +869,183 @@ void partition_info::vers_set_hist_part(THD *thd) { vers_info->hist_part= next; if (next->range_value > thd->query_start()) - return; + { + error= false; + break; + } + } + if (error) + { + if (auto_hist) + { + DBUG_ASSERT(thd->query_start() >= vers_info->hist_part->range_value); + my_time_t diff= thd->query_start() - (my_time_t) vers_info->hist_part->range_value; + if (diff > 0) + { + size_t delta= vers_info->interval.seconds(); + create_count= (uint) (diff / delta + 1); + if (diff % delta) + create_count++; + } + else + create_count= 1; + } + else + { + my_error(WARN_VERS_PART_FULL, MYF(ME_WARNING|ME_ERROR_LOG),
after such an overflow a table becomes essentially corrupted, rows are in a wrong partition. So new history partitions cannot be added anymore without reorganizing the last partition.
How is it handled now? (it's just a question, not part of the review, as it's not something you were changing in your patch)
+ table->s->db.str, table->s->table_name.str, + vers_info->hist_part->partition_name, "INTERVAL"); + } } } + + /* + When hist_part is almost full LOCK TABLES my overflow the partition as we
s/my/may/
Fixed.
+ can't add new partitions under LOCK TABLES. Reserve one for this case. + */ + if (auto_hist && create_count == 0 && + thd->lex->sql_command == SQLCOM_LOCK_TABLES && + vers_info->hist_part->id + 1 == vers_info->now_part->id) + create_count= 1; + + return create_count; +} + + +/** + @brief Run fast_alter_partition_table() to add new history partitions + for tables requiring them. +*/ +bool vers_add_auto_hist_parts(THD *thd, TABLE_LIST* tl, uint num_parts) +{ + bool result= true; + HA_CREATE_INFO create_info; + Alter_info alter_info; + partition_info *save_part_info= thd->work_part_info; + Query_tables_list save_query_tables; + Reprepare_observer *save_reprepare_observer= thd->m_reprepare_observer; + bool save_no_write_to_binlog= thd->lex->no_write_to_binlog; + thd->m_reprepare_observer= NULL; + thd->lex->reset_n_backup_query_tables_list(&save_query_tables); + thd->in_sub_stmt|= SUB_STMT_AUTO_HIST; + thd->lex->no_write_to_binlog= true; + TABLE *table= tl->table; + + DBUG_ASSERT(!thd->is_error()); + + { + DBUG_ASSERT(table->s->get_table_ref_type() == TABLE_REF_BASE_TABLE); + DBUG_ASSERT(table->versioned()); + DBUG_ASSERT(table->part_info); + DBUG_ASSERT(table->part_info->vers_info); + alter_info.reset(); + alter_info.partition_flags= ALTER_PARTITION_ADD|ALTER_PARTITION_AUTO_HIST; + create_info.init(); + create_info.alter_info= &alter_info; + Alter_table_ctx alter_ctx(thd, tl, 1, &table->s->db, &table->s->table_name); + + MDL_REQUEST_INIT(&tl->mdl_request, MDL_key::TABLE, tl->db.str, + tl->table_name.str, MDL_SHARED_NO_WRITE, MDL_TRANSACTION); + if (thd->mdl_context.acquire_lock(&tl->mdl_request, + thd->variables.lock_wait_timeout)) + goto exit; + table->mdl_ticket= tl->mdl_request.ticket; + + create_info.db_type= table->s->db_type(); + create_info.options|= HA_VERSIONED_TABLE; + DBUG_ASSERT(create_info.db_type); + + create_info.vers_info.set_start(table->s->vers_start_field()->field_name); + create_info.vers_info.set_end(table->s->vers_end_field()->field_name); + + partition_info *part_info= new partition_info(); + if (unlikely(!part_info)) + { + my_error(ER_OUT_OF_RESOURCES, MYF(0)); + goto exit; + } + part_info->use_default_num_partitions= false; + part_info->use_default_num_subpartitions= false; + part_info->num_parts= num_parts; + part_info->num_subparts= table->part_info->num_subparts; + part_info->subpart_type= table->part_info->subpart_type; + if (unlikely(part_info->vers_init_info(thd))) + { + my_error(ER_OUT_OF_RESOURCES, MYF(0)); + goto exit; + } + + // NB: set_ok_status() requires DA_EMPTY + thd->get_stmt_da()->reset_diagnostics_area();
would it not be DA_EMPTY without a reset? this is at the beginning of a statement, I'd expect it normally be DA_EMPTY here. What other DA states can you get here?
You are right. Removed.
+ + thd->work_part_info= part_info; + if (part_info->set_up_defaults_for_partitioning(thd, table->file, NULL, + table->part_info->next_part_no(num_parts))) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, + ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "setting up defaults failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + bool partition_changed= false; + bool fast_alter_partition= false; + if (prep_alter_part_table(thd, table, &alter_info, &create_info, + &partition_changed, &fast_alter_partition)) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "alter partitition prepare failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + if (!fast_alter_partition) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "fast alter partitition is not possible"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + DBUG_ASSERT(partition_changed); + if (mysql_prepare_alter_table(thd, table, &create_info, &alter_info, + &alter_ctx)) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "alter prepare failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + + if (fast_alter_partition_table(thd, table, &alter_info, &create_info, + tl, &table->s->db, &table->s->table_name)) + { + push_warning(thd, Sql_condition::WARN_LEVEL_WARN, ER_VERS_HIST_PART_FAILED, + "Auto-increment history partition: " + "alter partition table failed"); + my_error(ER_VERS_HIST_PART_FAILED, MYF(ME_WARNING), + tl->db.str, tl->table_name.str); + goto exit; + } + } + + result= false; + // NB: we have to return DA_EMPTY for new command
may be DBUG_ASSERT(thd->get_stmt_da()->is_ok());
Added.
+ thd->get_stmt_da()->reset_diagnostics_area(); + +exit: + thd->work_part_info= save_part_info; + thd->m_reprepare_observer= save_reprepare_observer; + thd->lex->restore_backup_query_tables_list(&save_query_tables); + thd->in_sub_stmt&= ~SUB_STMT_AUTO_HIST; + thd->lex->no_write_to_binlog= save_no_write_to_binlog; + return result; }
diff --git a/sql/partition_info.h b/sql/partition_info.h index 0656238ec07..94093353d60 100644 --- a/sql/partition_info.h +++ b/sql/partition_info.h @@ -72,9 +73,26 @@ struct Vers_part_info : public Sql_alloc my_time_t start; INTERVAL step; enum interval_type type; - bool is_set() { return type < INTERVAL_LAST; } + bool is_set() const { return type < INTERVAL_LAST; } + size_t seconds() const + { + if (step.second) + return step.second; + if (step.minute) + return step.minute * 60; + if (step.hour) + return step.hour * 3600; + if (step.day) + return step.day * 3600 * 24; + // comparison is used in rough estimates, it doesn't need to be calendar-correct + if (step.month) + return step.month * 3600 * 24 * 30;
shouldn't it be `* 28` ? you need a most pessimistic estimate to make sure you never miss to create a partition when one is needed. You can sometimes create one when it's not needed yet, but it's less of a problem.
Good catch. Made it calendar-correct.
+ DBUG_ASSERT(step.year); + return step.year * 86400 * 30 * 365;
that's one `* 30` too many :)
+ } } interval; ulonglong limit; + bool auto_hist; partition_element *now_part; partition_element *hist_part; }; diff --git a/sql/sql_base.cc b/sql/sql_base.cc index 1a1186aca73..d0e255186da 100644 --- a/sql/sql_base.cc +++ b/sql/sql_base.cc @@ -1625,6 +1625,52 @@ bool is_locked_view(THD *thd, TABLE_LIST *t) }
+bool TABLE::vers_need_hist_part(const THD *thd, const TABLE_LIST *table_list) const +{ +#ifdef WITH_PARTITION_STORAGE_ENGINE + if (part_info && part_info->part_type == VERSIONING_PARTITION && + !table_list->vers_conditions.delete_history && + !thd->stmt_arena->is_stmt_prepare() && + table_list->lock_type >= TL_WRITE_ALLOW_WRITE && + table_list->mdl_request.type >= MDL_SHARED_WRITE && + table_list->mdl_request.type < MDL_EXCLUSIVE) + { + switch (thd->lex->sql_command) + {
SQLCOM_LOAD with DUP_REPLACE?
Added.
+ case SQLCOM_INSERT: + if (thd->lex->duplicates != DUP_UPDATE) + break; + /* fall-through: */ + case SQLCOM_LOCK_TABLES: + case SQLCOM_DELETE: + case SQLCOM_UPDATE: + case SQLCOM_REPLACE: + case SQLCOM_REPLACE_SELECT: + case SQLCOM_DELETE_MULTI: + case SQLCOM_UPDATE_MULTI: + return true; + default:; + break; + } + if (thd->rgi_slave && thd->rgi_slave->current_event && thd->lex->sql_command == SQLCOM_END)
I wonder, would it be possible, instead of introducing rgi_slave->current_event, just make row events set thd->lex->sql_command appropriately? For example, currently row events increment thd->status_var.com_stat[] each event for its own SQLCOM_xxx, it won't be needed if they'll just set thd->lex->sql_command.
Do you think this is a quick refactoring? I had the same idea when I did this. And probably I tried to do that quickly already. Added TODO comment.
+ { + switch (thd->rgi_slave->current_event->get_type_code()) + { + case UPDATE_ROWS_EVENT: + case UPDATE_ROWS_EVENT_V1: + case DELETE_ROWS_EVENT: + case DELETE_ROWS_EVENT_V1: + return true; + default:; + break; + } + } + } +#endif + return false; +} + + /** Open a base table.
@@ -1777,6 +1823,11 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) DBUG_PRINT("info",("Using locked table")); #ifdef WITH_PARTITION_STORAGE_ENGINE part_names_error= set_partitions_as_used(table_list, table); + if (table->vers_need_hist_part(thd, table_list)) + { + /* Rotation is still needed under LOCK TABLES */
a bit confusing comment, you left out stuff that are obvious to you but not to others. I'd suggest something like
/* New partitions are not auto-created under LOCK TABLES (TODO: fix it) but rotation can still happen. */
Fixed.
+ table->part_info->vers_set_hist_part(thd, false); + } #endif goto reset; } @@ -2032,6 +2083,20 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) tc_add_table(thd, table); }
+ if (!ot_ctx->vers_create_count &&
what does this condition mean? When is ot_ctx->vers_create_count > 0 ?
When we already requested backoff action.
+ table->vers_need_hist_part(thd, table_list)) + { + ot_ctx->vers_create_count= table->part_info->vers_set_hist_part(thd, true); + if (ot_ctx->vers_create_count) + { + MYSQL_UNBIND_TABLE(table->file); + tc_release_table(table); + ot_ctx->request_backoff_action(Open_table_context::OT_ADD_HISTORY_PARTITION, + table_list); + DBUG_RETURN(TRUE); + } + } + if (!(flags & MYSQL_OPEN_HAS_MDL_LOCK) && table->s->table_category < TABLE_CATEGORY_INFORMATION) { @@ -3172,10 +3239,18 @@ Open_table_context::recover_from_failed_open() break; case OT_DISCOVER: case OT_REPAIR: - if ((result= lock_table_names(m_thd, m_thd->lex->create_info, - m_failed_table, NULL, - get_timeout(), 0))) + case OT_ADD_HISTORY_PARTITION: + result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, + NULL, get_timeout(), 0); + if (result) + { + if (m_action == OT_ADD_HISTORY_PARTITION) + { + m_thd->clear_error();
why?
Because of MDEV-23642.
+ result= false; + } break; + }
tdc_remove_table(m_thd, m_failed_table->db.str, m_failed_table->table_name.str); diff --git a/sql/sql_error.h b/sql/sql_error.h index 318d5076534..92076616adb 100644 --- a/sql/sql_error.h +++ b/sql/sql_error.h @@ -1195,7 +1195,6 @@ class Diagnostics_area: public Sql_state_errno,
void copy_non_errors_from_wi(THD *thd, const Warning_info *src_wi);
-private:
It doesn't seem you're using get_warning_info() anywhere
Reverted.
Warning_info *get_warning_info() { return m_wi_stack.front(); }
const Warning_info *get_warning_info() const { return m_wi_stack.front(); } diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc index 86ba393fb7b..f4a960afe47 100644 --- a/sql/sql_partition.cc +++ b/sql/sql_partition.cc @@ -7237,7 +7246,8 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, } else if ((alter_info->partition_flags & ALTER_PARTITION_ADD) && (part_info->part_type == RANGE_PARTITION || - part_info->part_type == LIST_PARTITION)) + part_info->part_type == LIST_PARTITION || + alter_info->partition_flags & ALTER_PARTITION_AUTO_HIST))
it'd be good it adding new empty VERSIONING partitions would always go this way, auto or not. but it's a separate task.
Added comment.
{ /* ADD RANGE/LIST PARTITIONS @@ -7281,7 +7291,7 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, write_log_add_change_partition(lpt) || ERROR_INJECT_CRASH("crash_add_partition_4") || ERROR_INJECT_ERROR("fail_add_partition_4") || - mysql_change_partitions(lpt) || + mysql_change_partitions(lpt, false) ||
this way you skip trans_commit_stmt/trans_commit_implicit for ALTER TABLE ... ADD RANGE/LIST partition. is it always ok?
A safer alternative would've been
mysql_change_partitions(lpt, !(alter_info->partition_flags & ALTER_PARTITION_AUTO_HIST))
but it's more complex, so I'd prefer your variant, if we can be sure that it doesn't break anything.
The condition is if (copy_data && mysql_trans_commit_alter_copy_data(thd)) Data is not needed to be copied in ADD RANGE/LIST partition, is it? We can be sure only from testing period until GA.
ERROR_INJECT_CRASH("crash_add_partition_5") || ERROR_INJECT_ERROR("fail_add_partition_5") || alter_close_table(lpt) ||
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok
Hi, Aleksey! On Apr 07, Aleksey Midenkov wrote:
+ +--echo # INSERT, INSERT .. SELECT don't increment partitions
it's not really "increment", better say "don't auto-create"
Actually I like "increment" more. "Auto-create" overcomplicates phrases:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command
Besides "increment" is correct because PARTITIONS number is incremented.
Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless.
+insert into t1 values (1); + +create or replace table t2 (y int); +insert into t2 values (2); + +insert into t1 select * from t2; +insert into t2 select * from t1;
why do you need a t2 table in this test?
t1 is not incremented in case of
insert into t2 select * from t1;
add a comment please
+--echo # Here t2 is incremented too, but not updated +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1, t2 set t1.x= 0 where t1.x< t2.y; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +# Multiupdate_prelocking_strategy::handle_end() is processed after table open. +# For PS it is possible to skip unneeded auto-creation because the above happens at +# prepare stage and auto-creation is done at execute stage. +--replace_result $default_engine DEFAULT_ENGINE 'PARTITIONS 4' 'PARTITIONS ok' 'PARTITIONS 5' 'PARTITIONS ok'
eh... I don't think this is really "ok". As far as I remember, Multiupdate_prelocking_strategy knows what tables should be opened for writing and what for reading. Why would a new partition be created for t2?
It knows this after tables are opened. Look at handle_end(), specifically mysql_handle_derived(), handle_derived(), setup_fields_with_no_wrap() and check_fields(). I believe all these calls are required to get proper get_table_map().
To get this working properly there must be 2-staged open tables, something like PS does.
You mean, a new partition is created for t2 in normal mode and not created in --ps? What if you'll also add `&& table_list->updating` to your condition in TABLE::vers_need_hist_part() ?
diff --git a/mysql-test/suite/versioning/r/partition.result b/mysql-test/suite/versioning/r/partition.result index 2a277b1c2ea..8369b40d98c 100644 --- a/mysql-test/suite/versioning/r/partition.result +++ b/mysql-test/suite/versioning/r/partition.result @@ -1236,27 +1270,752 @@ t1 CREATE TABLE `t1` ( ... +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +affected rows: 0 +lock tables t1 write; +affected rows: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 6
why did it add two partitions here?
Because of this condition
/* When hist_part is almost full LOCK TABLES my overflow the partition as we can't add new partitions under LOCK TABLES. Reserve one more for this case. */ if (auto_hist && create_count < 2 && thd->lex->sql_command == SQLCOM_LOCK_TABLES && vers_info->hist_part->id + 1 == vers_info->now_part->id) create_count++;
Hmm. I thought the logic is - normally for UPDATE/DELETE you create a new history partition when you need to write into it. And under LOCK TABLES you create the next partition, just in case, even if the current partition is still fine. Which kind of makes sense, at least until we'll fix that MDEV about creating partitions under LOCK TABLES. But this doesn't mean LOCK TABLES should create two partitions, I think. If UPDATE/DELETE needs to write into a new partition - you create it, LOCK TABLES or not. But I'd rather say that if it's just created, then it's not quite "almost full" (it's more like "almost empty", even "completely empty"). So there's no need to create a second partition at the same time.
+ --source include/rpl_end.inc diff --git a/sql/ha_partition.h b/sql/ha_partition.h index 60a2d7f6762..7ade4419c22 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -1610,7 +1610,7 @@ class ha_partition :public handler for (; part_id < part_id_end; ++part_id) { handler *file= m_file[part_id]; - DBUG_ASSERT(bitmap_is_set(&(m_part_info->read_partitions), part_id)); + bitmap_set_bit(&(m_part_info->read_partitions), part_id);
Where was it set before your patch?
#1 in partition_info::set_partition_bitmaps () at ../src/sql/partition_info.cc:274 #2 in ha_partition::change_partitions_to_open () at ../src/sql/ha_partition.cc:8644 #3 in set_partitions_as_used () at ../src/sql/sql_base.cc:1562 #4 in open_table () at ../src/sql/sql_base.cc:1990 #5 in open_and_process_table () at ../src/sql/sql_base.cc:3801 #6 in open_tables () at ../src/sql/sql_base.cc:4275
okay. Why is it not happening now?
after such an overflow a table becomes essentially corrupted, rows are in a wrong partition. So new history partitions cannot be added anymore without reorganizing the last partition.
How is it handled now? (it's just a question, not part of the review, as it's not something you were changing in your patch)
You've seemed to miss this question.
+bool TABLE::vers_need_hist_part(const THD *thd, const TABLE_LIST *table_list) const +{ +#ifdef WITH_PARTITION_STORAGE_ENGINE + if (part_info && part_info->part_type == VERSIONING_PARTITION && + !table_list->vers_conditions.delete_history && + !thd->stmt_arena->is_stmt_prepare() && + table_list->lock_type >= TL_WRITE_ALLOW_WRITE && + table_list->mdl_request.type >= MDL_SHARED_WRITE && + table_list->mdl_request.type < MDL_EXCLUSIVE) + { + switch (thd->lex->sql_command) + {
SQLCOM_LOAD with DUP_REPLACE?
Added.
With a test, I hope :)
+ if (thd->rgi_slave && thd->rgi_slave->current_event && thd->lex->sql_command == SQLCOM_END)
I wonder, would it be possible, instead of introducing rgi_slave->current_event, just make row events set thd->lex->sql_command appropriately? For example, currently row events increment thd->status_var.com_stat[] each event for its own SQLCOM_xxx, it won't be needed if they'll just set thd->lex->sql_command.
Do you think this is a quick refactoring? I had the same idea when I did this. And probably I tried to do that quickly already. Added TODO comment.
Right, may be not quick enough for now :(
@@ -2032,6 +2083,20 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) tc_add_table(thd, table); }
+ if (!ot_ctx->vers_create_count &&
what does this condition mean? When is ot_ctx->vers_create_count > 0 ?
When we already requested backoff action.
Could you describe an example? I thought if there are two tables in a list and both need to add a new partition (like in a multi-update), then backoff will happen on the first one, so open_table() won't reach the second with ot_ctx->vers_create_count > 0.
+ table->vers_need_hist_part(thd, table_list)) + { + ot_ctx->vers_create_count= table->part_info->vers_set_hist_part(thd, true); + if (ot_ctx->vers_create_count) + { + MYSQL_UNBIND_TABLE(table->file); + tc_release_table(table); + ot_ctx->request_backoff_action(Open_table_context::OT_ADD_HISTORY_PARTITION, + table_list); + DBUG_RETURN(TRUE); + } + } + if (!(flags & MYSQL_OPEN_HAS_MDL_LOCK) && table->s->table_category < TABLE_CATEGORY_INFORMATION) { @@ -3172,10 +3239,18 @@ Open_table_context::recover_from_failed_open() break; case OT_DISCOVER: case OT_REPAIR: - if ((result= lock_table_names(m_thd, m_thd->lex->create_info, - m_failed_table, NULL, - get_timeout(), 0))) + case OT_ADD_HISTORY_PARTITION: + result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, + NULL, get_timeout(), 0); + if (result) + { + if (m_action == OT_ADD_HISTORY_PARTITION) + { + m_thd->clear_error();
why?
Because of MDEV-23642.
I see. I generally prefer to avoid thd->clear_error() because of its non discriminative nature. It might be ok here, but then, please add an assert before lock_table_names: DBUG_ASSERT(!m_thd->get_stmt_da()->is_set());
+ result= false; + } @@ -7281,7 +7291,7 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, write_log_add_change_partition(lpt) || ERROR_INJECT_CRASH("crash_add_partition_4") || ERROR_INJECT_ERROR("fail_add_partition_4") || - mysql_change_partitions(lpt) || + mysql_change_partitions(lpt, false) ||
this way you skip trans_commit_stmt/trans_commit_implicit for ALTER TABLE ... ADD RANGE/LIST partition. is it always ok?
A safer alternative would've been
mysql_change_partitions(lpt, !(alter_info->partition_flags & ALTER_PARTITION_AUTO_HIST))
but it's more complex, so I'd prefer your variant, if we can be sure that it doesn't break anything.
The condition is
if (copy_data && mysql_trans_commit_alter_copy_data(thd))
Data is not needed to be copied in ADD RANGE/LIST partition, is it? We can be sure only from testing period until GA.
well, we can never be sure after any amount of testing. You're right, the data, indeed, won't be copied in ADD RANGE/LIST partition. But perhaps InnoDB can still open a transaction there? As far as I understand a transaction is started on ha_innobase::external_lock(). Will it happen here? Or start_stmt() under LOCK TABLES. Is that possible? Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei, On Thu, Apr 8, 2021 at 3:30 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Apr 07, Aleksey Midenkov wrote:
+ +--echo # INSERT, INSERT .. SELECT don't increment partitions
it's not really "increment", better say "don't auto-create"
Actually I like "increment" more. "Auto-create" overcomplicates phrases:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command
Besides "increment" is correct because PARTITIONS number is incremented.
Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless.
It is obvious from the context that we are talking about the number, not partitions themselves. Treat "partitions" as PARTITIONS keyword and the increment is attached to a number right next to it. That's quite a sense, isn't it?
+insert into t1 values (1); + +create or replace table t2 (y int); +insert into t2 values (2); + +insert into t1 select * from t2; +insert into t2 select * from t1;
why do you need a t2 table in this test?
t1 is not incremented in case of
insert into t2 select * from t1;
add a comment please
Added.
+--echo # Here t2 is incremented too, but not updated +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1, t2 set t1.x= 0 where t1.x< t2.y; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +# Multiupdate_prelocking_strategy::handle_end() is processed after table open. +# For PS it is possible to skip unneeded auto-creation because the above happens at +# prepare stage and auto-creation is done at execute stage. +--replace_result $default_engine DEFAULT_ENGINE 'PARTITIONS 4' 'PARTITIONS ok' 'PARTITIONS 5' 'PARTITIONS ok'
eh... I don't think this is really "ok". As far as I remember, Multiupdate_prelocking_strategy knows what tables should be opened for writing and what for reading. Why would a new partition be created for t2?
It knows this after tables are opened. Look at handle_end(), specifically mysql_handle_derived(), handle_derived(), setup_fields_with_no_wrap() and check_fields(). I believe all these calls are required to get proper get_table_map().
To get this working properly there must be 2-staged open tables, something like PS does.
You mean, a new partition is created for t2 in normal mode and not created in --ps?
Yes.
What if you'll also add `&& table_list->updating` to your condition in TABLE::vers_need_hist_part() ?
If `Multiupdate_prelocking_strategy::handle_end()` was not yet executed how `updating` can help? No, it doesn't help.
diff --git a/mysql-test/suite/versioning/r/partition.result b/mysql-test/suite/versioning/r/partition.result index 2a277b1c2ea..8369b40d98c 100644 --- a/mysql-test/suite/versioning/r/partition.result +++ b/mysql-test/suite/versioning/r/partition.result @@ -1236,27 +1270,752 @@ t1 CREATE TABLE `t1` ( ... +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +affected rows: 0 +lock tables t1 write; +affected rows: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +execute s; +affected rows: 1 +info: Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0 +show create table t1; +Table Create Table +t1 CREATE TABLE `t1` ( + `x` int(11) DEFAULT NULL +) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING + PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR STARTS TIMESTAMP'2000-01-01 00:00:00' AUTO +PARTITIONS 6
why did it add two partitions here?
Because of this condition
/* When hist_part is almost full LOCK TABLES my overflow the partition as we can't add new partitions under LOCK TABLES. Reserve one more for this case. */ if (auto_hist && create_count < 2 && thd->lex->sql_command == SQLCOM_LOCK_TABLES && vers_info->hist_part->id + 1 == vers_info->now_part->id) create_count++;
Hmm. I thought the logic is - normally for UPDATE/DELETE you create a new history partition when you need to write into it. And under LOCK TABLES you create the next partition, just in case, even if the current partition is still fine. Which kind of makes sense, at least until we'll fix that MDEV about creating partitions under LOCK TABLES.
But this doesn't mean LOCK TABLES should create two partitions, I think. If UPDATE/DELETE needs to write into a new partition - you create it, LOCK TABLES or not. But I'd rather say that if it's just created, then it's not quite "almost full" (it's more like "almost empty", even "completely empty"). So there's no need to create a second partition at the same time.
Right. Fixed. Actually it was bigger bug: it created 1 partition and when reopened created 1 more partition.
+ --source include/rpl_end.inc diff --git a/sql/ha_partition.h b/sql/ha_partition.h index 60a2d7f6762..7ade4419c22 100644 --- a/sql/ha_partition.h +++ b/sql/ha_partition.h @@ -1610,7 +1610,7 @@ class ha_partition :public handler for (; part_id < part_id_end; ++part_id) { handler *file= m_file[part_id]; - DBUG_ASSERT(bitmap_is_set(&(m_part_info->read_partitions), part_id)); + bitmap_set_bit(&(m_part_info->read_partitions), part_id);
Where was it set before your patch?
#1 in partition_info::set_partition_bitmaps () at ../src/sql/partition_info.cc:274 #2 in ha_partition::change_partitions_to_open () at ../src/sql/ha_partition.cc:8644 #3 in set_partitions_as_used () at ../src/sql/sql_base.cc:1562 #4 in open_table () at ../src/sql/sql_base.cc:1990 #5 in open_and_process_table () at ../src/sql/sql_base.cc:3801 #6 in open_tables () at ../src/sql/sql_base.cc:4275
okay. Why is it not happening now?
Because it is happening at line 1990 in open_table() and vers_set_hist_part() is happening earlier.
after such an overflow a table becomes essentially corrupted, rows are in a wrong partition. So new history partitions cannot be added anymore without reorganizing the last partition.
How is it handled now? (it's just a question, not part of the review, as it's not something you were changing in your patch)
You've seemed to miss this question.
Sorry. They are added via reorganize. I.e. manual ADD PARTITIONS does data copy. Added test "Partition overflow error and manual fix" and found pruning boundary bug in 10.3 (see FIXME).
+bool TABLE::vers_need_hist_part(const THD *thd, const TABLE_LIST *table_list) const +{ +#ifdef WITH_PARTITION_STORAGE_ENGINE + if (part_info && part_info->part_type == VERSIONING_PARTITION && + !table_list->vers_conditions.delete_history && + !thd->stmt_arena->is_stmt_prepare() && + table_list->lock_type >= TL_WRITE_ALLOW_WRITE && + table_list->mdl_request.type >= MDL_SHARED_WRITE && + table_list->mdl_request.type < MDL_EXCLUSIVE) + { + switch (thd->lex->sql_command) + {
SQLCOM_LOAD with DUP_REPLACE?
Added.
With a test, I hope :)
Yes.
+ if (thd->rgi_slave && thd->rgi_slave->current_event && thd->lex->sql_command == SQLCOM_END)
I wonder, would it be possible, instead of introducing rgi_slave->current_event, just make row events set thd->lex->sql_command appropriately? For example, currently row events increment thd->status_var.com_stat[] each event for its own SQLCOM_xxx, it won't be needed if they'll just set thd->lex->sql_command.
Do you think this is a quick refactoring? I had the same idea when I did this. And probably I tried to do that quickly already. Added TODO comment.
Right, may be not quick enough for now :(
@@ -2032,6 +2083,20 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) tc_add_table(thd, table); }
+ if (!ot_ctx->vers_create_count &&
what does this condition mean? When is ot_ctx->vers_create_count > 0 ?
When we already requested backoff action.
Could you describe an example?
I thought if there are two tables in a list and both need to add a new partition (like in a multi-update), then backoff will happen on the first one, so open_table() won't reach the second with ot_ctx->vers_create_count > 0.
open_table() is called again from recover_from_failed_open(). vers_create_count is reset in recover_from_failed_open(): 3306 case OT_ADD_HISTORY_PARTITION: 3307 { 3308 result= false; 3309 TABLE *table= open_ltable(m_thd, m_failed_table, TL_WRITE, 3310 MYSQL_OPEN_HAS_MDL_LOCK | MYSQL_OPEN_IGNORE_LOGGING_FORMAT); .... 3323 vers_create_count= 0; 3324 break; 3325 } 3326 case OT_BACKOFF_AND_RETRY: 3327 case OT_REOPEN_TABLES: 3328 case OT_NO_ACTION: 3329 In our scheme open_table() is called 3 times: 1. before partitions increment; 2. inside OT_ADD_HISTORY_PARTITION; 3. after partitions increment.
+ table->vers_need_hist_part(thd, table_list)) + { + ot_ctx->vers_create_count= table->part_info->vers_set_hist_part(thd, true); + if (ot_ctx->vers_create_count) + { + MYSQL_UNBIND_TABLE(table->file); + tc_release_table(table); + ot_ctx->request_backoff_action(Open_table_context::OT_ADD_HISTORY_PARTITION, + table_list); + DBUG_RETURN(TRUE); + } + } + if (!(flags & MYSQL_OPEN_HAS_MDL_LOCK) && table->s->table_category < TABLE_CATEGORY_INFORMATION) { @@ -3172,10 +3239,18 @@ Open_table_context::recover_from_failed_open() break; case OT_DISCOVER: case OT_REPAIR: - if ((result= lock_table_names(m_thd, m_thd->lex->create_info, - m_failed_table, NULL, - get_timeout(), 0))) + case OT_ADD_HISTORY_PARTITION: + result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, + NULL, get_timeout(), 0); + if (result) + { + if (m_action == OT_ADD_HISTORY_PARTITION) + { + m_thd->clear_error();
why?
Because of MDEV-23642.
I see. I generally prefer to avoid thd->clear_error() because of its non discriminative nature.
It might be ok here, but then, please add an assert before lock_table_names:
DBUG_ASSERT(!m_thd->get_stmt_da()->is_set());
Added.
+ result= false; + } @@ -7281,7 +7291,7 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, write_log_add_change_partition(lpt) || ERROR_INJECT_CRASH("crash_add_partition_4") || ERROR_INJECT_ERROR("fail_add_partition_4") || - mysql_change_partitions(lpt) || + mysql_change_partitions(lpt, false) ||
this way you skip trans_commit_stmt/trans_commit_implicit for ALTER TABLE ... ADD RANGE/LIST partition. is it always ok?
A safer alternative would've been
mysql_change_partitions(lpt, !(alter_info->partition_flags & ALTER_PARTITION_AUTO_HIST))
but it's more complex, so I'd prefer your variant, if we can be sure that it doesn't break anything.
The condition is
if (copy_data && mysql_trans_commit_alter_copy_data(thd))
Data is not needed to be copied in ADD RANGE/LIST partition, is it? We can be sure only from testing period until GA.
well, we can never be sure after any amount of testing.
You're right, the data, indeed, won't be copied in ADD RANGE/LIST partition. But perhaps InnoDB can still open a transaction there? As far as I understand a transaction is started on ha_innobase::external_lock(). Will it happen here? Or start_stmt() under LOCK TABLES. Is that possible?
mysql_execute_command() does trans_commit_stmt() and trans_commit_implicit().
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok
Hi, Aleksey! On Apr 09, Aleksey Midenkov wrote:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command
Besides "increment" is correct because PARTITIONS number is incremented.
Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless.
It is obvious from the context that we are talking about the number, not partitions themselves. Treat "partitions" as PARTITIONS keyword and the increment is attached to a number right next to it. That's quite a sense, isn't it?
No, I think it's a meaningless combination of words. But let's ask native speakers, shall we?
+--echo # Here t2 is incremented too, but not updated +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1, t2 set t1.x= 0 where t1.x< t2.y; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +# Multiupdate_prelocking_strategy::handle_end() is processed after table open. +# For PS it is possible to skip unneeded auto-creation because the above happens at +# prepare stage and auto-creation is done at execute stage. +--replace_result $default_engine DEFAULT_ENGINE 'PARTITIONS 4' 'PARTITIONS ok' 'PARTITIONS 5' 'PARTITIONS ok'
eh... I don't think this is really "ok". As far as I remember, Multiupdate_prelocking_strategy knows what tables should be opened for writing and what for reading. Why would a new partition be created for t2?
It knows this after tables are opened. Look at handle_end(), specifically mysql_handle_derived(), handle_derived(), setup_fields_with_no_wrap() and check_fields(). I believe all these calls are required to get proper get_table_map().
To get this working properly there must be 2-staged open tables, something like PS does.
You mean, a new partition is created for t2 in normal mode and not created in --ps?
Yes.
What if you'll also add `&& table_list->updating` to your condition in TABLE::vers_need_hist_part() ?
If `Multiupdate_prelocking_strategy::handle_end()` was not yet executed how `updating` can help? No, it doesn't help.
Okay. I think I know how to fix it, but let's do it after the main thing is pushed. Users are rather sensitive to cases when a non-updatable table in multi-update gets treated like updatable, we've had a bunch of bug reports about it over the years. Wrong privileges (UPDATE privilege should not be checked), wrong locks (the table should never be write-locked, a concurrent write lock should not block multi-update), triggers (should not be opened, so any problems with triggers, like privileges or missing tables, should not affect multi-update), and so on. So I suppose we'll have to fix this one too. This is how it could be done - vers_need_hist_part should check for table_list->updating. It's always FALSE in multi-update, so it won't be auto-creating new partitions at all for multi-update. But! after handle_end() (or, better, in handle_end()) we'll have another pass over tables and will implement a fallback-and-retry from handle_end(). Let's do it as a separate commit after the feature is pushed.
@@ -2032,6 +2083,20 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) tc_add_table(thd, table); }
+ if (!ot_ctx->vers_create_count &&
what does this condition mean? When is ot_ctx->vers_create_count > 0 ?
When we already requested backoff action.
I'd suggest to rewrite as if (ot_ctx->vers_create_count) /* already tried to add a partition to this table and failed (because of e.g. lock conflict). Don't try again */ else if (table->vers_need_hist_part(thd, table_list)) ... and here your code ... Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei, On Sat, Apr 10, 2021 at 11:12 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Apr 09, Aleksey Midenkov wrote:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command
Besides "increment" is correct because PARTITIONS number is incremented.
Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless.
It is obvious from the context that we are talking about the number, not partitions themselves. Treat "partitions" as PARTITIONS keyword and the increment is attached to a number right next to it. That's quite a sense, isn't it?
No, I think it's a meaningless combination of words. But let's ask native speakers, shall we?
The ability to imply things and make special terms is the freedom of any language I suppose. Slang (or "special language") helps us to describe technical entities more concisely.
+--echo # Here t2 is incremented too, but not updated +set timestamp= unix_timestamp('2000-01-01 03:00:00'); +update t1, t2 set t1.x= 0 where t1.x< t2.y; +--replace_result $default_engine DEFAULT_ENGINE +show create table t1; +# Multiupdate_prelocking_strategy::handle_end() is processed after table open. +# For PS it is possible to skip unneeded auto-creation because the above happens at +# prepare stage and auto-creation is done at execute stage. +--replace_result $default_engine DEFAULT_ENGINE 'PARTITIONS 4' 'PARTITIONS ok' 'PARTITIONS 5' 'PARTITIONS ok'
eh... I don't think this is really "ok". As far as I remember, Multiupdate_prelocking_strategy knows what tables should be opened for writing and what for reading. Why would a new partition be created for t2?
It knows this after tables are opened. Look at handle_end(), specifically mysql_handle_derived(), handle_derived(), setup_fields_with_no_wrap() and check_fields(). I believe all these calls are required to get proper get_table_map().
To get this working properly there must be 2-staged open tables, something like PS does.
You mean, a new partition is created for t2 in normal mode and not created in --ps?
Yes.
What if you'll also add `&& table_list->updating` to your condition in TABLE::vers_need_hist_part() ?
If `Multiupdate_prelocking_strategy::handle_end()` was not yet executed how `updating` can help? No, it doesn't help.
Okay. I think I know how to fix it, but let's do it after the main thing is pushed.
Users are rather sensitive to cases when a non-updatable table in multi-update gets treated like updatable, we've had a bunch of bug reports about it over the years. Wrong privileges (UPDATE privilege should not be checked), wrong locks (the table should never be write-locked, a concurrent write lock should not block multi-update), triggers (should not be opened, so any problems with triggers, like privileges or missing tables, should not affect multi-update), and so on.
So I suppose we'll have to fix this one too. This is how it could be done - vers_need_hist_part should check for table_list->updating. It's always FALSE in multi-update, so it won't be auto-creating new partitions at all for multi-update. But! after handle_end() (or, better, in handle_end()) we'll have another pass over tables and will implement a fallback-and-retry from handle_end().
Let's do it as a separate commit after the feature is pushed.
@@ -2032,6 +2083,20 @@ bool open_table(THD *thd, TABLE_LIST *table_list, Open_table_context *ot_ctx) tc_add_table(thd, table); }
+ if (!ot_ctx->vers_create_count &&
what does this condition mean? When is ot_ctx->vers_create_count > 0 ?
When we already requested backoff action.
I'd suggest to rewrite as
if (ot_ctx->vers_create_count) /* already tried to add a partition to this table and failed (because of e.g. lock conflict). Don't try again */ else if (table->vers_need_hist_part(thd, table_list)) ... and here your code ...
The comment is wrong. Added correct comment.
-void partition_info::vers_set_hist_part(THD *thd) +uint partition_info::vers_set_hist_part(THD *thd, bool auto_hist) { + DBUG_ASSERT(!thd->lex->last_table() || + !thd->lex->last_table()->vers_conditions.delete_history);
is that right? Conceptually you need to test vers_conditions.delete_history for the table that owns this partition_info. Is it always last_table() ?
I'd say it'd be more logical to do, like
part_field_array[0]->table->pos_in_table_list
Sorry, I had to revert that back. The original form was selected not by coincidence. This data in TABLE is not yet initialized. I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead: @@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok
Hi, Aleksey! On Apr 11, Aleksey Midenkov wrote:
On Sat, Apr 10, 2021 at 11:12 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Apr 09, Aleksey Midenkov wrote:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command
Besides "increment" is correct because PARTITIONS number is incremented.
Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless.
It is obvious from the context that we are talking about the number, not partitions themselves. Treat "partitions" as PARTITIONS keyword and the increment is attached to a number right next to it. That's quite a sense, isn't it?
No, I think it's a meaningless combination of words. But let's ask native speakers, shall we?
The ability to imply things and make special terms is the freedom of any language I suppose. Slang (or "special language") helps us to describe technical entities more concisely.
:) I feel it's just incorrect use of words and I shudder every time I see it. True, I know that my feeling of the language is not perfect, I'm not a native speaker. But it works both ways, you're not a native speaker either. So, if you don't want to change to correct English, there's no point in arguing, let's just ask a native speaker.
I'd suggest to rewrite as
if (ot_ctx->vers_create_count) /* already tried to add a partition to this table and failed (because of e.g. lock conflict). Don't try again */ else if (table->vers_need_hist_part(thd, table_list)) ... and here your code ...
The comment is wrong. Added correct comment.
The comment describes what I've seen in gdb, looking how ot_ctx->vers_create_count could be non-zero. It was your update t1 set x= x + 122 where x = 1 test case, "Locking timeout caused by auto-creation affects original DML" You try to auto-create, it fails because of the lock timeout, you clear_error(), the table is reopened. But because vers_create_count is not reset, it does not try to auto-create again, thus avoiding an infinite loop. What do you thing was wrong there?
I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead:
@@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
What happens on winx64-debug? Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei, On Sun, Apr 11, 2021 at 7:21 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Apr 11, Aleksey Midenkov wrote:
On Sat, Apr 10, 2021 at 11:12 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Apr 09, Aleksey Midenkov wrote:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command
Besides "increment" is correct because PARTITIONS number is incremented.
Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless.
It is obvious from the context that we are talking about the number, not partitions themselves. Treat "partitions" as PARTITIONS keyword and the increment is attached to a number right next to it. That's quite a sense, isn't it?
No, I think it's a meaningless combination of words. But let's ask native speakers, shall we?
The ability to imply things and make special terms is the freedom of any language I suppose. Slang (or "special language") helps us to describe technical entities more concisely.
:) I feel it's just incorrect use of words and I shudder every time I see it.
It was a conscious choice. Quantitative characteristic is implied.
True, I know that my feeling of the language is not perfect, I'm not a native speaker. But it works both ways, you're not a native speaker either.
So, if you don't want to change to correct English, there's no point in arguing, let's just ask a native speaker.
I'd suggest to rewrite as
if (ot_ctx->vers_create_count) /* already tried to add a partition to this table and failed (because of e.g. lock conflict). Don't try again */ else if (table->vers_need_hist_part(thd, table_list)) ... and here your code ...
The comment is wrong. Added correct comment.
The comment describes what I've seen in gdb, looking how ot_ctx->vers_create_count could be non-zero. It was your
update t1 set x= x + 122 where x = 1
test case, "Locking timeout caused by auto-creation affects original DML" You try to auto-create, it fails because of the lock timeout, you clear_error(), the table is reopened. But because vers_create_count is not reset, it does not try to auto-create again, thus avoiding an infinite loop.
What do you thing was wrong there?
Okay, it is partly true. Failing case is the secondary one.
I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead:
@@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
What happens on winx64-debug?
Please look yourself at http://buildbot.askmonty.org/buildbot/builders/winx64-debug/builds/24345/ste...
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok
Hi, Aleksey! On Apr 11, Aleksey Midenkov wrote:
It was a conscious choice. Quantitative characteristic is implied.
This isn't going anywhere. I've asked Ian (KB technical writer) for his opinion and we'll do what he'll say.
True, I know that my feeling of the language is not perfect, I'm not a native speaker. But it works both ways, you're not a native speaker either.
So, if you don't want to change to correct English, there's no point in arguing, let's just ask a native speaker.
I'd suggest to rewrite as
if (ot_ctx->vers_create_count) /* already tried to add a partition to this table and failed (because of e.g. lock conflict). Don't try again */ else if (table->vers_need_hist_part(thd, table_list)) ... and here your code ...
The comment is wrong. Added correct comment.
The comment describes what I've seen in gdb, looking how ot_ctx->vers_create_count could be non-zero. It was your
update t1 set x= x + 122 where x = 1
test case, "Locking timeout caused by auto-creation affects original DML" You try to auto-create, it fails because of the lock timeout, you clear_error(), the table is reopened. But because vers_create_count is not reset, it does not try to auto-create again, thus avoiding an infinite loop.
What do you thing was wrong there?
Okay, it is partly true. Failing case is the secondary one.
May be, but it's the only case where ot_ctx->vers_create_count>0 there. So if there's a "primary" case too, your tests never trigger it. Add a test for it, please.
I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead:
@@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
What happens on winx64-debug?
Please look yourself at
http://buildbot.askmonty.org/buildbot/builders/winx64-debug/builds/24345/ste...
Thanks. But either I don't know how to read it correctly or it doesn't have enough info. Could you explain what happens on this builder? What error do you see in stmt_da in recover_from_failed_open() that triggers the assert? Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei, On Mon, Apr 12, 2021 at 3:14 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Apr 11, Aleksey Midenkov wrote:
It was a conscious choice. Quantitative characteristic is implied.
This isn't going anywhere.
Of course. And because of that I prefer to write shorter comments.
I've asked Ian (KB technical writer) for his opinion and we'll do what he'll say.
So he says it is not meaningless. I am and was upset by your disagreement with my taste. If you are totally against the form I use I'm going to remove the comments. But when we are talking about public documents of course I'm going to make it better understandable, and he is right regarding that particular place.
True, I know that my feeling of the language is not perfect, I'm not a native speaker. But it works both ways, you're not a native speaker either.
So, if you don't want to change to correct English, there's no point in arguing, let's just ask a native speaker.
I'd suggest to rewrite as
if (ot_ctx->vers_create_count) /* already tried to add a partition to this table and failed (because of e.g. lock conflict). Don't try again */ else if (table->vers_need_hist_part(thd, table_list)) ... and here your code ...
The comment is wrong. Added correct comment.
The comment describes what I've seen in gdb, looking how ot_ctx->vers_create_count could be non-zero. It was your
update t1 set x= x + 122 where x = 1
test case, "Locking timeout caused by auto-creation affects original DML" You try to auto-create, it fails because of the lock timeout, you clear_error(), the table is reopened. But because vers_create_count is not reset, it does not try to auto-create again, thus avoiding an infinite loop.
What do you thing was wrong there?
Okay, it is partly true. Failing case is the secondary one.
May be, but it's the only case where ot_ctx->vers_create_count>0 there. So if there's a "primary" case too, your tests never trigger it. Add a test for it, please.
Sorry, my bad, vers_create_count was not passed to open_ltable(), there is a different ot_ctx. Added your comment as is.
I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead:
@@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
What happens on winx64-debug?
Please look yourself at
http://buildbot.askmonty.org/buildbot/builders/winx64-debug/builds/24345/ste...
Thanks. But either I don't know how to read it correctly or it doesn't have enough info.
Could you explain what happens on this builder? What error do you see in stmt_da in recover_from_failed_open() that triggers the assert?
Why are you asking me to do investigations on your hypotheses out of the task scope? It was just a hypothesis after all! Though you can figure out error code (ER_NO_SUCH_TABLE) from the test case: create table t2(a int not null) engine=archive; flush tables; --error 1 --remove_file $DATADIR/test/t2.frm select * from t2; flush tables; --remove_file $DATADIR/test/t2.ARZ --error ER_NO_SUCH_TABLE select * from t2; There are several tests failing though...
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok
Hi, Aleksey! On Apr 12, Aleksey Midenkov wrote:
On Mon, Apr 12, 2021 at 3:14 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Apr 11, Aleksey Midenkov wrote:
It was a conscious choice. Quantitative characteristic is implied.
This isn't going anywhere.
Of course. And because of that I prefer to write shorter comments.
I've asked Ian (KB technical writer) for his opinion and we'll do what he'll say.
So he says it is not meaningless. I am and was upset by your disagreement with my taste. If you are totally against the form I use I'm going to remove the comments. But when we are talking about public documents of course I'm going to make it better understandable, and he is right regarding that particular place.
I perceived it simply as incorrect, broken English. So I asked Ian's opinion. Also I believe we should avoid using different terminology internally and externally unless there's a good reason to do it. That's all, not a question of a taste.
I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead:
@@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
What happens on winx64-debug?
Please look yourself at
http://buildbot.askmonty.org/buildbot/builders/winx64-debug/builds/24345/ste...
Thanks. But either I don't know how to read it correctly or it doesn't have enough info.
Could you explain what happens on this builder? What error do you see in stmt_da in recover_from_failed_open() that triggers the assert?
Why are you asking me to do investigations on your hypotheses out of the task scope? It was just a hypothesis after all!
I meant that you do thd->clear_error(). It's non-discriminative, it clears all errors, no matter what was the error and where it was created. This is not always correct - you, probably, remember a bug with virtual columns when a vcol expression was parsed in the error handling branch and clear_error() removed the original error. Generally, one should avoid thd->clear_error() and push an error handler into the thd. But my hypothesis was that what you're doing here is safe, because there cannot be any error at this point. If my hypothesis is wrong, then you cannot use thd->clear_error(). Or, perhaps, my hypothesis is correct and there should be no error here, meaning some other code is wrong. I asked to be able to differentiate between these two possibilities. If my hypothesis is wrong, you cannot use thd->clear_error(). If my hypothesis is correct, some other code needs fixing.
Though you can figure out error code (ER_NO_SUCH_TABLE) from the test case:
create table t2(a int not null) engine=archive; flush tables; --error 1 --remove_file $DATADIR/test/t2.frm select * from t2; flush tables; --remove_file $DATADIR/test/t2.ARZ --error ER_NO_SUCH_TABLE select * from t2;
This is, I suppose, archive discovery. And it's not Win64-specific. Ok, I see now that my hypothesis was indeed wrong, and this method indeed expects an error and has a thd->clear_error() on pretty much every code path. It'd be easier to read if it'd had one single thd->clear_error() at the beginning, but it's beyond the scope of MDEV-17554. Sorry for confusion, this change of yours is fine. I believe that was my last comment on these commits, please tell me when the new branch is ready. This converges rather quickly now, good! Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi, Sergei! On Mon, Apr 12, 2021 at 8:46 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Apr 12, Aleksey Midenkov wrote:
On Mon, Apr 12, 2021 at 3:14 PM Sergei Golubchik <serg@mariadb.org> wrote:
On Apr 11, Aleksey Midenkov wrote:
It was a conscious choice. Quantitative characteristic is implied.
This isn't going anywhere.
Of course. And because of that I prefer to write shorter comments.
I've asked Ian (KB technical writer) for his opinion and we'll do what he'll say.
So he says it is not meaningless. I am and was upset by your disagreement with my taste. If you are totally against the form I use I'm going to remove the comments. But when we are talking about public documents of course I'm going to make it better understandable, and he is right regarding that particular place.
I perceived it simply as incorrect, broken English. So I asked Ian's opinion. Also I believe we should avoid using different terminology internally and externally unless there's a good reason to do it. That's all, not a question of a taste.
I believe you are right in your motivation. But the emphasis in the test is on increment: we are checking the number in PARTITIONS clause. I removed all the comments regarding "increment" from the test.
I also have to revert DBUG_ASSERT() you suggested which fails drop_table_force on winx64-debug and add error code condition instead:
@@ -3268,12 +3271,12 @@ Open_table_context::recover_from_failed_open() case OT_DISCOVER: case OT_REPAIR: case OT_ADD_HISTORY_PARTITION: - DBUG_ASSERT(!m_thd->get_stmt_da()->is_set()); result= lock_table_names(m_thd, m_thd->lex->create_info, m_failed_table, NULL, get_timeout(), 0); if (result) { - if (m_action == OT_ADD_HISTORY_PARTITION) + if (m_action == OT_ADD_HISTORY_PARTITION && + m_thd->get_stmt_da()->sql_errno() == ER_LOCK_WAIT_TIMEOUT) { // MDEV-23642 Locking timeout caused by auto-creation affects original DML m_thd->clear_error();
What happens on winx64-debug?
Please look yourself at
http://buildbot.askmonty.org/buildbot/builders/winx64-debug/builds/24345/ste...
Thanks. But either I don't know how to read it correctly or it doesn't have enough info.
Could you explain what happens on this builder? What error do you see in stmt_da in recover_from_failed_open() that triggers the assert?
Why are you asking me to do investigations on your hypotheses out of the task scope? It was just a hypothesis after all!
I meant that you do thd->clear_error(). It's non-discriminative, it clears all errors, no matter what was the error and where it was created.
This is not always correct - you, probably, remember a bug with virtual columns when a vcol expression was parsed in the error handling branch and clear_error() removed the original error.
Generally, one should avoid thd->clear_error() and push an error handler into the thd.
But my hypothesis was that what you're doing here is safe, because there cannot be any error at this point. If my hypothesis is wrong, then you cannot use thd->clear_error(). Or, perhaps, my hypothesis is correct and there should be no error here, meaning some other code is wrong.
I asked to be able to differentiate between these two possibilities. If my hypothesis is wrong, you cannot use thd->clear_error(). If my hypothesis is correct, some other code needs fixing.
Though you can figure out error code (ER_NO_SUCH_TABLE) from the test case:
create table t2(a int not null) engine=archive; flush tables; --error 1 --remove_file $DATADIR/test/t2.frm select * from t2; flush tables; --remove_file $DATADIR/test/t2.ARZ --error ER_NO_SUCH_TABLE select * from t2;
This is, I suppose, archive discovery. And it's not Win64-specific.
Yes, I should remember there are no other debug buildbot nodes for a developer. Though there should be.
Ok, I see now that my hypothesis was indeed wrong, and this method indeed expects an error and has a thd->clear_error() on pretty much every code path. It'd be easier to read if it'd had one single thd->clear_error() at the beginning, but it's beyond the scope of MDEV-17554.
Sorry for confusion, this change of yours is fine.
I believe that was my last comment on these commits, please tell me when the new branch is ready. This converges rather quickly now, good!
The new branch is ready.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok
Responding to the portion of the thread extracted below, "INSERT, INSERT .. SELECT don't increment partitions" is not meaningless, but both "INSERT, INSERT .. SELECT don't increment the number of partitions" and "INSERT, INSERT .. SELECT don't auto-create partitions" are better. It comes down to emphasis. Is the emphasis on the automatic creation, or on the increase in the number? From what I can understand of the topic, the emphasis most rests on automatic creation, so I'd go with "INSERT, INSERT .. SELECT don't auto-create partitions"
+--echo # INSERT, INSERT .. SELECT don't increment partitions
it's not really "increment", better say "don't auto-create"
Actually I like "increment" more. "Auto-create" overcomplicates phrases:
--echo # Increment from 3 to 5 --echo # Increment from 3 to 6, manual names, LOCK TABLES --echo # Multiple increments in single command Besides "increment" is correct because PARTITIONS number is incremented. Sure. "Increment the number of partitions", this is fine. "Auto create partitions" is also fine. "Increment partitions" is meaningless. It is obvious from the context that we are talking about the number, not partitions themselves. Treat "partitions" as PARTITIONS keyword and the increment is attached to a number right next to it. That's quite a sense, isn't it? No, I think it's a meaningless combination of words. But let's ask native speakers, shall we?
participants (3)
-
Aleksey Midenkov
-
Ian Gilfillan
-
Sergei Golubchik