May 2020 - developers - lists.mariadb.org

Re: [Maria-developers] ef2519fee4e: MDEV-16546 System versioning setting to allow history modification
by Aleksey Midenkov 08 Jun '21

Hello, Sergei!

On Fri, May 3, 2019 at 8:43 PM Sergei Golubchik <serg(a)mariadb.com> wrote:
>
> Hi, Aleksey!
>
> On May 03, Aleksey Midenkov wrote:
> > revision-id: ef2519fee4e (versioning-1.0.5-17-gef2519fee4e)
> > parent(s): 56145be2951
> > author: Aleksey Midenkov <midenok(a)gmail.com>
> > committer: Aleksey Midenkov <midenok(a)gmail.com>
> > timestamp: 2018-06-28 13:42:09 +0300
> > message:
> >
> > MDEV-16546 System versioning setting to allow history modification
> >
> > 1. Add server variable system_versioning_modify_history which will
> > allow to set values for row_start, row_end in DML operations.
> >
> > 2. If secure_timestamp is YES or REPLICATION,
> > system_versioning_modify_history does not have effect. If
> > secure_timestamp is SUPER, system_versioning_modify_history requires
> > special privilege (same as for setting current timestamp).
>
> I thought more about this idea. We don't really want to have the history
> editable, do we?

Well, I'm thinking about rollback table data to specific point in
time. That could be a useful feature.

> But it's needed for replication, to keep the master and
> slave identical. That's what secure_timestamp is for.
>
> The idea was that this new variable, system_versioning_modify_history,
> will be just a convenience feature, it will not allow history editing
> any more than one can do without it.
>
> But now I suspect that even with secure_timestamp=NO one cannot truly
> edit history. One can only insert new rows with arbitrary timestamps.
> For example, to insert a row with row_start=1000 and row_end=2000, one
> needs to do (if secure_timestamp=NO):
>
>   set timestamp=1000;
>   insert row;
>   set timestamp=2000;
>   delete row;
>
> But I don't see how one can update or delete a history row with
> secure_timestamp=NO.
>
> Now, with a SUPER privilege and secure_timestamp=NO or SUPER, one can
> use the BINLOG command and truly edit the history arbitrarily, by faking
> row events.

I don't really get it why this is so important: since there is some
limitation by configuration and privilege, we are just fine.
Everything can be changed at filesystem level after all.

>
> The conclusion, I believe, is that system_versioning_modify_history
> should allow INSERTs when secure_timestamp=NO, and it should allow
> UPDATE/DELETE only for a SUPER user when secure_timestamp=NO or SUPER.

I don't see a reason to argue on that. The only thing that is not
clear, why we don't allow INSERTs when secure_timestamp=SUPER?

>
> The second thing I don't like at all, is when a table is created like
>
>   CREATE TABLE t1 (a int) WITH SYSTEM VERSIONING
>
> with row_start/row_end implicit. You don't have it in the test, but
> anyway one should be able to load history into such a table, while the
> table does not have row_start and row_end columns. From the user point
> of view these columns don't exist, they're pseudo-columns, like ROWID.
> They just cannot be insertable-into, conceptually. But a user will want
> to restore the history, right? I don't have a solution for this yet :(
> Any ideas?

We don't have to follow the conception if it doesn't help us. Since we
have physical row_start/row_end, we don't have to pretend they don't
exist. Who will win from that?

>
> See below a couple of minor comments about the patch itself.
>
> ...

These are going to be fixed.

>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security(a)mariadb.org

--
All the best,
Aleksey Midenkov
@midenok
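
Editor's sketch: the rule proposed in the message above (INSERTs of history allowed when secure_timestamp=NO, UPDATE/DELETE only for a SUPER user when secure_timestamp=NO or SUPER, no effect at all under YES or REPLICATION) can be written down as a small predicate. All names here (SecureTimestamp, HistoryOp, history_modification_allowed) are hypothetical and are not taken from the server source; the sketch encodes the proposal exactly as worded, so the INSERT-under-SUPER case that Aleksey questions is rejected.

```cpp
#include <cassert>

// Hypothetical names for illustration only; this is not MariaDB server code.
enum class SecureTimestamp { No, Super, Replication, Yes };
enum class HistoryOp { Insert, Update, Delete };

// Returns true if a statement may set row_start/row_end directly,
// following the rule proposed in the thread, read literally.
bool history_modification_allowed(SecureTimestamp mode, HistoryOp op,
                                  bool is_super)
{
  switch (mode) {
  case SecureTimestamp::No:
    // Anyone may insert history rows; changing existing history needs SUPER.
    return op == HistoryOp::Insert || is_super;
  case SecureTimestamp::Super:
    // As worded in the proposal: only UPDATE/DELETE by a SUPER user.
    // Whether INSERT should be allowed here is left open in the thread.
    return op != HistoryOp::Insert && is_super;
  case SecureTimestamp::Replication:
  case SecureTimestamp::Yes:
    // Per the commit message, the variable has no effect in these modes.
    return false;
  }
  return false;
}
```

A caller would evaluate the predicate once per DML statement before accepting explicit row_start/row_end values. If the thread settles on allowing INSERT under secure_timestamp=SUPER, only the second case changes.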

Re: [Maria-developers] 5150dfd6ab3: MDEV-17891 Assertion failures in select_insert::abort_result_set and mysql_load upon attempt to replace into a full table
by Sergei Golubchik 25 Oct '20

Hi, Nikita!

On Apr 27, Nikita Malyavin wrote:
> revision-id: 5150dfd6ab3 (mariadb-10.3.12-230-g5150dfd6ab3)
> parent(s): cf78b8c699d
> author: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> committer: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> timestamp: 2019-06-12 22:11:45 +1000
> message:
>
> MDEV-17891 Assertion failures in select_insert::abort_result_set and mysql_load upon attempt to replace into a full table
>
> * set modified_non_trans_table in one missed place

This commit needs way more text than that. This is just too mysterious
now. Which asserts fail? What has the test case to do with the fix? Why
modified_non_trans_table even matters without replication? Why it's
specific to versioning and partitioning? Where trans_safe becomes false?

> diff --git a/mysql-test/suite/versioning/r/partition.result b/mysql-test/suite/versioning/r/partition.result
> index 3fcb59bdb40..99a297417d5 100644
> --- a/mysql-test/suite/versioning/r/partition.result
> +++ b/mysql-test/suite/versioning/r/partition.result
> @@ -541,6 +541,29 @@ t1 CREATE TABLE `t1` (
>  PARTITION BY SYSTEM_TIME INTERVAL 7 SECOND
>  (PARTITION `ver_p1` HISTORY ENGINE = DEFAULT_ENGINE,
>  PARTITION `ver_pn` CURRENT ENGINE = DEFAULT_ENGINE)
> +set timestamp= default;
> +# MDEV-17891 Assertion failures in select_insert::abort_result_set and
> +# mysql_load upon attempt to replace into a full table
> +set @@max_heap_table_size= 1024*1024;
> +create or replace table t1 (
> +pk integer auto_increment,
> +primary key (pk),
> +f varchar(45000)
> +) with system versioning engine=memory
> +partition by system_time interval 1 year (partition p1 history,
> +partition pn current);
> +# fill the table until full
> +insert into t1 () values (),(),(),(),(),(),(),(),(),(),(),(),(),(),(),();
> +insert into t1 (f) select f from t1;
> +ERROR HY000: The table 't1' is full
> +# leave space for exactly one record in current partition
> +delete from t1 where pk = 1;
> +# copy all data into history partition
> +replace into t1 select * from t1;
> +replace into t1 select * from t1;
> +ERROR HY000: The table 't1' is full
> +drop table t1;
> +set @@max_heap_table_size= 1048576;
>  # Test cleanup
>  drop database test;
>  create database test;
> diff --git a/mysql-test/suite/versioning/t/partition.test b/mysql-test/suite/versioning/t/partition.test
> index d5c83b4d3bb..23768836efc 100644
> --- a/mysql-test/suite/versioning/t/partition.test
> +++ b/mysql-test/suite/versioning/t/partition.test
> @@ -474,6 +474,7 @@ set timestamp=1523466002.799571;
>  insert into t1 values (11),(12);
>  set timestamp=1523466004.169435;
>  delete from t1 where pk in (11, 12);
> +set timestamp= default;
>
>  --echo #
>  --echo # MDEV-18136 Server crashes in Item_func_dyncol_create::prepare_arguments
> @@ -489,6 +490,34 @@ partition by system_time interval column_get(column_create(7,7), 7 as int) secon
>  --replace_result $default_engine DEFAULT_ENGINE
>  show create table t1;
>
> +--echo # MDEV-17891 Assertion failures in select_insert::abort_result_set and
> +--echo # mysql_load upon attempt to replace into a full table
> +
> +--let $max_heap_table_size_orig= `select @@max_heap_table_size;`
> +set @@max_heap_table_size= 1024*1024;
> +create or replace table t1 (
> +  pk integer auto_increment,
> +  primary key (pk),
> +  f varchar(45000)
> +) with system versioning engine=memory
> +  partition by system_time interval 1 year (partition p1 history,
> +  partition pn current);
> +
> +--echo # fill the table until full
> +insert into t1 () values (),(),(),(),(),(),(),(),(),(),(),(),(),(),(),();
> +--error ER_RECORD_FILE_FULL
> +insert into t1 (f) select f from t1;
> +--echo # leave space for exactly one record in current partition
> +delete from t1 where pk = 1;
> +--echo # copy all data into history partition
> +replace into t1 select * from t1;
> +--error ER_RECORD_FILE_FULL
> +replace into t1 select * from t1;
> +
> +# cleanup
> +drop table t1;
> +eval set @@max_heap_table_size= $max_heap_table_size_orig;
> +
>  --echo # Test cleanup
>  drop database test;
>  create database test;
> diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> index 1f3a70721fc..44502ea6704 100644
> --- a/sql/sql_insert.cc
> +++ b/sql/sql_insert.cc
> @@ -1953,6 +1953,8 @@ int write_record(THD *thd, TABLE *table,COPY_INFO *info)
>          if (likely(!error))
>          {
>            info->deleted++;
> +          if (!table->file->has_transactions())
> +            thd->transaction.stmt.modified_non_trans_table= TRUE;
>            if (table->versioned(VERS_TIMESTAMP))
>            {
>              store_record(table, record[2]);

Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

[Maria-developers] Review for: MDEV-17399 Add support for JSON_TABLE, part #5
by Sergey Petrunia 26 May '20

Hi Alexey, On Fri, May 15, 2020 at 04:26:02PM +0400, Alexey Botchkov wrote: > See the next iteration here > https://github.com/MariaDB/server/commit/692cb566096d61b240ec26e84fc7d3c7d1… > So the ha_json_table;:position() was implemented. > The size of the reference depends on the nested path depth - each level > adds some 9 bytes to the initial 5. > That can be decreased but i decided to keep it simple initially and i doubt > we're going to have really deep > nesting in realistic scenarios. This is ok. I'm not sure if the server has a limit on the rowid size. Perhaps there is, and in that case we just need to limit the depth we allow. > And i'd like to ask you for some testing > queries here. Or just the model how > to produce queries that will be using the ::position() extensively. I've provided a testcase for the position() call in my previous email. https://gist.github.com/spetrunia/a905d51731c58f5439bd9f70c64cdc43 As for rnd_pos(), one case that I'm aware of is when the query does a filesort, and the row being sorted either includes a blob, or has the total length exceeding certain limit. I've tried constructing an example with blobs, and it fails with an assertion before the execution reaches rnd_pos() calls: select * from json_table('[{"color": "blue", "price": 50}, {"color": "red", "price": 100}, {"color": "rojo", "price": 10.0}, {"color": "blanco", "price": 11.0}]', '$[*]' columns( color varchar(100) path '$.color', price text path '$.price', seq for ordinality ) ) as T order by color desc; fails an assertion: mysqld: /home/psergey/dev-git/10.5-json/sql/field.cc:8309: virtual int Field_blob::store(const char*, size_t, CHARSET_INFO*): Assertion `marked_for_write_or_computed()' failed. Once that is fixed, this should use position() and rnd_pos() calls. This will likely expose more issues with rnd_pos(), see my comments to the Json_table_nested_path::set_position below. 
== A problem with VIEWs == mysql> select * -> from -> json_table( -> '[ {"color": "red", "price": 1}, {"color": "black", "price": 2}]', -> '$[*]' columns( color varchar(100) path '$.color')) as `T_A` where T_A.color<>'azul' ; +-------+ | color | +-------+ | red | | black | +-------+ 2 rows in set (6.34 sec) Good so far. Now, let's try creating a VIEW from this: create view v1 as select * from json_table( '[ {"color": "red", "price": 1}, {"color": "black", "price": 2}]', '$[*]' columns( color varchar(100) path '$.color')) as `T_A` where T_A.color<>'azul' ; select * from v1; ERROR 4041 (HY000): Unexpected end of JSON path in argument 2 to function 'JSON_TABLE' Examining the .frm file, I see something that looks like garbage data: query=select `T_A`.`color` AS `color` from JSON_TABLE(\'[ {"color": "red", "price": 1}, {"color": "black", "price": 2}]\', \'\'\0<D8>7^A<9C><FF>^?\0\0\0\0\0\0\0\0\0\0\0\0^A<A5><A5><A5><A5><A5>^H<D0>kWUU\0\0[ {"color": "red", "price": 1}, {"color": "black", "price": 2}]\0$[*]\'\' COLUMNS (`color` PATH \'^A<9C><FF>^?\0\0\0\0\0\0^A\0\0\0\0\0\0\0<A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5>\0\0\0\0<A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5><A5> <A5><A5><A5><A5><A5><A5> D^A<9C><FF>^?\0\0H:^A<9C><FF>^?\0\0<A5><A5><A5><A5><A5><A5><A5><A5>100\0<8F><8F><8F><8F>$.color\')) T_A where `T_A`.`color` <> \'azul\' EXPLAIN EXTENDED ...; SHOW WARNINGS; - also print something odd. == EXPLAIN [FORMAT=JSON] == I think EXPLAIN output should provide an indication that a table function is used. 
MySQL does it like so: <TODO> == Unneeded recursive rules in grammar == I've already complained about this: the grammar has recursive ON EMPTY, ON ERROR rules, which cause the following to be accepted (note the two ON EMPTY clauses) : select * from json_table('[{"color": "blue", "price": 50},{"color": "red"}]', '$[*]' columns( price varchar(255) path '$.price' default 'xyz' on empty default 'abc' on empty ) ) as T; > commit 692cb566096d61b240ec26e84fc7d3c7d13f024c > Author: Alexey Botchkov <holyfoot(a)askmonty.org> > Date: Fri May 15 15:25:28 2020 +0400 > > MDEV-17399 Add support for JSON_TABLE. > > Syntax for JSON_TABLE added. > The ha_json_table handler added. Executing the JSON_TABLE we > create the temporary table of the ha_json_table, add dependencies > of other tables and sometimes conditions to WHERE. > and sometimes conditions to WHERE. I think this is not true anymore? ... > diff --git a/include/my_base.h b/include/my_base.h > index 7efa5eb9673..89ef3e8e7c1 100644 > --- a/include/my_base.h > +++ b/include/my_base.h > @@ -523,6 +523,13 @@ enum ha_base_keytype { > #define HA_ERR_TABLESPACE_MISSING 194 /* Missing Tablespace */ > #define HA_ERR_SEQUENCE_INVALID_DATA 195 > #define HA_ERR_SEQUENCE_RUN_OUT 196 > + > +/* > + Share the error code to not increment the HA_ERR_LAST for now, > + as it disturbs some storage engine's tests. > + Probably should be fixed later. > +*/ > +#define HA_ERR_INVALID_JSON HA_ERR_TABLE_IN_FK_CHECK We definitely can't have this in the final patch. We need to either: A. Use an engine-specific error code. Check out MyRocks and ha_rocksdb::get_error_message() for an example of how this is done B. Indeed introduce a generic "Invalid JSON" error code. I'm hesitant about B, let's discuss this with other developers. ... > diff --git a/sql/table_function.cc b/sql/table_function.cc > new file mode 100644 > index 00000000000..71f2378ce7d > --- /dev/null > +++ b/sql/table_function.cc ... 
> + > +class ha_json_table: public handler > +{ Please add comments about these > +protected: > + Table_function_json_table *m_jt; > + String m_tmps; > + String *m_js; > + uchar *m_cur_pos; > +public: > + ha_json_table(TABLE_SHARE *share_arg, Table_function_json_table *jt): > + handler(&table_function_hton.m_hton, share_arg), m_jt(jt) > + { > + /* > + set the mark_trx_read_write_done to avoid the > + handler::mark_trx_read_write_internal() call. > + It relies on &ha_thd()->ha_data[ht->slot].ha_info[0] to be set. > + But we don't set the ha_data for the ha_json_table, and > + that call makes no sence for ha_json_table. > + */ > + mark_trx_read_write_done= 1; > + ref_length= (jt->m_depth+1) * (4 + 1) + > + jt->m_depth * (sizeof(Json_table_nested_path *)); > + } > + ~ha_json_table() {} > + handler *clone(const char *name, MEM_ROOT *mem_root) { return NULL; } > + const char *index_type(uint inx) { return "NONE"; } > + /* Rows also use a fixed-size format */ > + enum row_type get_row_type() const { return ROW_TYPE_FIXED; } > + ulonglong table_flags() const > + { > + return (HA_FAST_KEY_READ | /*HA_NO_BLOBS |*/ HA_NULL_IN_KEY | > + HA_CAN_SQL_HANDLER | > + HA_REC_NOT_IN_SEQ | HA_NO_TRANSACTIONS | > + HA_HAS_RECORDS | HA_CAN_HASH_KEYS); > + } > + ulong index_flags(uint inx, uint part, bool all_parts) const > + { > + return HA_ONLY_WHOLE_INDEX | HA_KEY_SCAN_NOT_ROR; > + } > + uint max_supported_keys() const { return 1; } > + uint max_supported_key_part_length() const { return MAX_KEY_LENGTH; } > + double scan_time() { return 1000000.0; } > + double read_time(uint index, uint ranges, ha_rows rows) { return 0.0; } The above two functions are never called. Please * add a comment saying that. 
* add a DBUG_ASSERT() into them to back the point made by the comment :-) > + int open(const char *name, int mode, uint test_if_locked); > + int close(void) { return 0; } > + int rnd_init(bool scan); > + int rnd_next(uchar *buf); > + int rnd_pos(uchar * buf, uchar *pos); > + void position(const uchar *record); > + int can_continue_handler_scan() { return 1; } > + int info(uint); > + int extra(enum ha_extra_function operation); > + THR_LOCK_DATA **store_lock(THD *thd, THR_LOCK_DATA **to, > + enum thr_lock_type lock_type) > + { return NULL; } > + int create(const char *name, TABLE *form, HA_CREATE_INFO *create_info) > + { return 1; } > +private: > + void update_key_stats(); > +}; > + ... > +void Json_table_nested_path::set_position(const char *j_start, const uchar *pos) > +{ * This needs a comment. * I don't see where the value of m_ordinality_counter is restored? * The same about m_null - its value is not restored either? > + if (m_nested) > + { > + memcpy(&m_cur_nested, pos, sizeof(m_cur_nested)); > + pos+= sizeof(m_cur_nested); > + } > + > + m_engine.s.c_str= (uchar *) j_start + sint4korr(pos); > + m_engine.state= (int) pos[4]; > + if (m_cur_nested) > + m_cur_nested->set_position(j_start, pos); > +} > + ... > +int ha_json_table::info(uint) > +{ > + /* We don't want 0 or 1 in stats.records. */ > + stats.records= 4; > + return 0; Does this value matter? As far as I understand it doesn't, as the optimizer will use the estimates obtained from Table_function_json_table::get_estimates. Please add a comment about this. > +} > + ... Please document this function. What's last_column? Why does print() method get it and return it? > +int Json_table_nested_path::print(THD *thd, TABLE_LIST *sql_table, String *str, > + List_iterator_fast<Json_table_column> &it, > + Json_table_column **last_column) > +{ > + Json_table_nested_path *c_path= this; ... 
> diff --git a/sql/table_function.h b/sql/table_function.h > new file mode 100644 > index 00000000000..49d650cb7b2 > --- /dev/null > +++ b/sql/table_function.h > @@ -0,0 +1,168 @@ ... > +#include <json_lib.h> > + > +/* > + The Json_table_nested_path represents the 'current nesting' level > + for a set of JSON_TABLE columns. > + Each column (Json_table_column instance) is linked with corresponding > + 'nested path' object and gets it's piece of JSON to parse during the computation > + phase. > + The root 'nested_path' is always present as a part of Table_function_json_table, > + then other 'nested_paths' can be created and linked into a tree structure when new > + 'NESTED PATH' is met. The nested 'nested_paths' are linked with 'm_nested', the same-level > + 'nested_paths' are linked with 'm_next_nested'. > + So for instance > + JSON_TABLE( '...', '$[*]' > + COLUMNS( a INT PATH '$.a' , > + NESTED PATH '$.b[*]' COLUMNS (b INT PATH '$', > + NESTED PATH '$.c[*]' COLUMNS(x INT PATH '$')), > + NESTED PATH '$.n[*]' COLUMNS (z INT PAHT '$')) > + results in 4 'nested_path' created: > + root nested_b nested_c nested_n > + m_path '$[*]' '$.b[*]' '$.c[*]' '$.n[*] > + m_nested &nested_b &nested_c NULL NULL > + n_next_nested NULL &nested_n NULL NULL > + > +and 4 columns created: > + a b x z > + m_nest &root &nested_b &nested_c &nested_n > +*/ > + > + > +class Json_table_column; > + > +class Json_table_nested_path : public Sql_alloc > +{ > +public: > + bool m_null; > + json_path_t m_path; > + json_engine_t m_engine; > + json_path_t m_cur_path; > + > + /* Counts the rows produced. Value is set to the FOR ORDINALITY coluns */ > + longlong m_ordinality_counter; > + > + Json_table_nested_path *m_parent; > + Json_table_nested_path *m_nested, *m_next_nested; > + Json_table_nested_path **m_nested_hook; > + Json_table_nested_path *m_cur_nested; Please add documentation about the above members. I see the diagram above, but I think text descriptions are also needed for each member. 
What's m_nested_hook? > + Json_table_nested_path(Json_table_nested_path *parent_nest): > + m_parent(parent_nest), m_nested(0), m_next_nested(0), > + m_nested_hook(&m_nested) {} > + int set_path(THD *thd, const LEX_CSTRING &path); > + void scan_start(CHARSET_INFO *i_cs, const uchar *str, const uchar *end); > + int scan_next(); > + int print(THD *thd, TABLE_LIST *sql_table, String *str, > + List_iterator_fast<Json_table_column> &it, > + Json_table_column **last_column); > + void get_current_position(const char *j_start, uchar *pos) const; > + void set_position(const char *j_start, const uchar *pos); > +}; ... > +class Table_function_json_table : public Sql_alloc Please document the class and the data members. > +{ > +public: > + Item *m_json; > + Json_table_nested_path m_nested_path; > + List<Json_table_column> m_columns; > + table_map m_dep_tables; > + uint m_depth, m_cur_depth; > + > + Table_function_json_table(Item *json): m_json(json), m_nested_path(0), > + m_depth(0), m_cur_depth(0) {} > + > + /* > + Used in sql_yacc.yy. > + Represents the current NESTED PATH level being parsed. > + */ > + Json_table_nested_path *m_sql_nest; > + void add_nested(Json_table_nested_path *np); > + void leave_nested(); BR Sergei -- Sergei Petrunia, Software Developer MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
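
Editor's sketch: the reference-size expression in the quoted ha_json_table constructor, (m_depth+1) * (4 + 1) + m_depth * sizeof(Json_table_nested_path *), can be checked in isolation. Going by the quoted set_position(), each level stores a 4-byte offset into the JSON text plus 1 byte of parser state, and each level that has a nested path also stores a pointer to the currently active nested path. The function name and the pointer_size parameter below are hypothetical, added only so that 32-bit and 64-bit builds can be compared.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to save a scan position for a JSON_TABLE with `depth`
// NESTED PATH levels, mirroring the expression in the quoted constructor.
// `pointer_size` is an illustrative parameter, not part of the patch.
constexpr std::size_t json_table_ref_length(
    std::size_t depth, std::size_t pointer_size = sizeof(void *))
{
  // (offset + state) for the root and every nested level,
  // plus one saved pointer per nested level.
  return (depth + 1) * (4 + 1) + depth * pointer_size;
}
```

With 4-byte pointers every additional level adds 9 bytes to the initial 5, which matches the figure Alexey mentions; with 8-byte pointers the same expression adds 13 bytes per level.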

Re: [Maria-developers] 070df171c1e: MDEV-16937 Strict SQL with system versioned tables causes issues
by Sergei Golubchik 26 May '20

Hi, Aleksey!

ok to push

On May 26, Aleksey Midenkov wrote:
> revision-id: 070df171c1e (mariadb-10.3.21-91-g070df171c1e)
> parent(s): ad41da5c934
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2020-05-25 14:47:26 +0300
> message:
>
> MDEV-16937 Strict SQL with system versioned tables causes issues
>
> Respect system fields in NO_ZERO_DATE mode.
>
> This is the subject for refactoring in MDEV-19597
>

Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] cd9cab54aac: MDEV-20015 Assertion `!in_use->is_error()' failed in TABLE::update_virtual_field
by Sergei Golubchik 26 May '20

Hi, Aleksey!

Ok to push, thanks!

On May 26, Aleksey Midenkov wrote:
> revision-id: cd9cab54aac (mariadb-10.2.31-123-gcd9cab54aac)
> parent(s): d275ecbd208
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2020-05-25 20:56:31 +0300
> message:
>
> MDEV-20015 Assertion `!in_use->is_error()' failed in TABLE::update_virtual_field
>
> update_virtual_field() is called as part of index rebuild in
> ha_myisam::repair() (MDEV-5800) which is done on bulk INSERT finish.
>
> Assertion in update_virtual_field() was put as part of MDEV-16222
> because update_virtual_field() returns in_use->is_error(). The idea:
> wrongly mixed semantics of error status before update_virtual_field()
> and the status returned by update_virtual_field(). The former can
> falsely influence the latter.
>
> diff --git a/sql/table.cc b/sql/table.cc
> index d6d86d96016..2429bb12abe 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -7707,15 +7707,17 @@ int TABLE::update_virtual_fields(handler *h, enum_vcol_update_mode update_mode)
>
>  int TABLE::update_virtual_field(Field *vf)
>  {
> -  DBUG_ASSERT(!in_use->is_error());
> -  Query_arena backup_arena;
>    DBUG_ENTER("TABLE::update_virtual_field");
> +  Query_arena backup_arena;
> +  Counting_error_handler count_errors;
> +  in_use->push_internal_handler(&count_errors);
>    in_use->set_n_backup_active_arena(expr_arena, &backup_arena);
>    bitmap_clear_all(&tmp_set);
>    vf->vcol_info->expr->walk(&Item::update_vcol_processor, 0, &tmp_set);
>    vf->vcol_info->expr->save_in_field(vf, 0);
>    in_use->restore_active_arena(expr_arena, &backup_arena);
> -  DBUG_RETURN(in_use->is_error());
> +  in_use->pop_internal_handler();
> +  DBUG_RETURN(count_errors.errors);
>  }

Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] 7d593466a22: MDEV-20015 Assertion `!in_use->is_error()' failed in TABLE::update_virtual_field
by Sergei Golubchik 25 May '20

Hi, Aleksey!

I don't see what you've changed. We've discussed that fix and that one
isn't supposed to swap Diagnostics_area's like that. And in your new
patch you do exactly the same.

Possible correct approaches:

* don't return in_use->is_error(), return the return value of
  vf->vcol_info->expr->walk() || vf->vcol_info->expr->save_in_field()
  This means that Item_field::update_vcol_processor() should also do the
  same, I suspect

* Use thd->push_internal_handler() and Counting_error_handler.
  Or, better, Turn_errors_to_warnings_handler with counting.
  This is the simplest one.

there's a third option:

* always return 0, because, looking how it's used, I don't really see
  how update_virtual_field() can ever get an error. But it's not a
  particularly future-proof approach. And I just might be wrong about
  errors.

On Apr 23, Aleksey Midenkov wrote:
> revision-id: 7d593466a22 (mariadb-10.2.28-4-g7d593466a22)
> parent(s): 7bc26de591c
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-11-07 10:45:21 +0300
> message:
>
> MDEV-20015 Assertion `!in_use->is_error()' failed in TABLE::update_virtual_field
>
> Preserve and restore statement DA.
>
> update_virtual_field() is called as part of index rebuild in
> ha_myisam::repair() (MDEV-5800) which is done on bulk INSERT finish.
>
> Assertion in update_virtual_field() was put as part of MDEV-16222
> because update_virtual_field() returns in_use->is_error(). The idea:
> wrongly mixed semantics of error status before update_virtual_field()
> and the status returned by update_virtual_field(). The former can
> falsely influence the latter.
>
> Preserve global error status and run update_virtual_field() with clear
> DA since no matter how SQL command is finished it must update the
> index after bulk INSERT.
>

Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
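
Editor's sketch: the second approach listed in the message above (push an internal handler that counts errors, run the computation, pop the handler, return the count) can be illustrated without any server code. Session, ErrorHandler, CountingHandler and update_field below are hypothetical stand-ins for THD, the internal error handler interface, Counting_error_handler and TABLE::update_virtual_field(); they are assumptions made for illustration, not the server's real classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the server's THD or handler classes.
struct ErrorHandler {
  virtual ~ErrorHandler() = default;
  // Return true to swallow the error, false to let it propagate.
  virtual bool handle(const std::string &message) = 0;
};

// Counts errors but does not swallow them.
struct CountingHandler : ErrorHandler {
  int errors = 0;
  bool handle(const std::string &) override { ++errors; return false; }
};

struct Session {
  bool error_flag = false;               // plays the role of is_error()
  std::vector<ErrorHandler *> handlers;  // innermost handler is last

  void raise_error(const std::string &message) {
    for (auto it = handlers.rbegin(); it != handlers.rend(); ++it)
      if ((*it)->handle(message))
        return;
    error_flag = true;
  }
};

// Reports only the errors raised while `compute` runs, so an error that
// was already pending in the session cannot influence the result.
template <typename Compute>
int update_field(Session &session, Compute compute) {
  CountingHandler counter;
  session.handlers.push_back(&counter);
  compute(session);
  session.handlers.pop_back();
  return counter.errors;
}
```

The point of the design is that the return value depends only on what happened between the push and the pop, which is the separation of "status before" and "status returned" that the commit message asks for.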

[Maria-developers] Progress Report - Week 3
by Mohammed Hammaad Mateen 25 May '20

Greetings,

Hope you are safe and doing great.

This post describes the things I've done in my third week [ 18-24 May ] of
the Community Bonding Period under the mentorship of Sergei Golubchik and
Oleksandr Byelkin for GSoC-20.

The tasks taken up for this week were to study and analyze:

*INSERT.... RETURNING:*

• *TableList* consists of 1 table instance to use, where each instance is a
  structure consisting of db argument, table name argument, alias argument
  and lock type argument. *Referenced this from the table.h file.*
• In *sql_parse.cc* under *case SQLCOM_INSERT*: we do a DBUG_ASSERT where
  first_table == all_table && first_table!=0 // we check if only 1 table is
  used and the table used is already created.
• if lex->has_returning() we increment the system status var by 1 and
  perform analyze.. insert.. returning.
• we compute result by mysql_insert function by passing the following
  arguments.
  - Thread Handler thd
  - all table instances
  - list of all the fields used
  - the values we want to insert
  - the list of update fields
  - and update value list
  - duplicate flag
  - ignore
  - result of select
• if inserting fails due to some reason we equate result to
  send_explain(thd)
• we also update the MYSQL_INSERT_DONE with result of insert and the row
  count.

*INSERT.. SELECT.. RETURNING:*

• We fix the lock on first table.
• lock other tables until command is written to the binary log.
• the procedure is same as discussed earlier with respect to insert....
  returning.
• To switch to the second table we traverse from first_table to (->)
  next_local and we compute select result with the help of select insert
  function.
• we can now unlock the tables and we also need to check if something
  changed after unlocking; if that happens we should invalidate the table
  from the query cache using query_cache_invalidate3 function.
• Manual cleaning of select result obtained from select insert must be done.

Regards,
Mohammed Hammaad Mateen

Re: [Maria-developers] MDEV-22461: JOIN::make_aggr_tables_info(): Assertion `select_options & (1ULL << 17)' failed.
by Sergey Petrunia 24 May '20

Hi Varun,

- Please add a testcase to the testsuite
- Please address a few cosmetic comments below
- Ok to push after addressed.

> commit 5e448b77a3812e65623c0a1214049322ace3aacf
> Author: Varun Gupta <varun.gupta(a)mariadb.com>
> Date:   Tue May 5 20:44:43 2020 +0530
>
>     MDEV-22461: JOIN::make_aggr_tables_info(): Assertion `select_options & (1ULL << 17)' failed.
>
>     A temporary table is needed for window function computation but if only a NAMED WINDOW SPEC
>     is used and there is no window function, then there is no need to create a temporary
>     table as there is no stage to compute WINDOW FUNCTION
>
> diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> index c601946cfa0..0a1d0c2dbcc 100644
> --- a/sql/sql_select.cc
> +++ b/sql/sql_select.cc
> @@ -2014,11 +2014,16 @@ JOIN::optimize_inner()
>    }
>
>    need_tmp= test_if_need_tmp_table();
> -  //TODO this could probably go in test_if_need_tmp_table.
> -  if (this->select_lex->window_specs.elements > 0) {
> -    need_tmp= TRUE;
> +
> +  /*
> +    If window functions are present then we can't have simple_order set to
> +    TRUE as the window function needs a temp table for compuatation.

typo: compuatation.

> +    ORDER BY is computed after the window function computation is done, so
> +    the sort would be done on the temp table.
> +  */
> +  if (this->select_lex->have_window_funcs())

Is there a need to have "this->select_lex"? I think not, please remove, as it
only confuses the reader.

>      simple_order= FALSE;
> -  }
> +
>
>    /*
>      If the hint FORCE INDEX FOR ORDER BY/GROUP BY is used for the table
> diff --git a/sql/sql_select.h b/sql/sql_select.h
> index 0e011c9267a..7a892c1af89 100644
> --- a/sql/sql_select.h
> +++ b/sql/sql_select.h
> @@ -1645,7 +1645,8 @@ class JOIN :public Sql_alloc
>               ((select_distinct || !simple_order || !simple_group) ||
>                (group_list && order) ||
>                MY_TEST(select_options & OPTION_BUFFER_RESULT))) ||
> -            (rollup.state != ROLLUP::STATE_NONE && select_distinct));
> +            (rollup.state != ROLLUP::STATE_NONE && select_distinct) ||
> +            select_lex->have_window_funcs());

Please amend the function comment above this code to cover the new condition as
well.

>    }
>    bool choose_subquery_plan(table_map join_tables);
>    void get_partial_cost_and_fanout(int end_tab_idx,
>

BR
 Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog

Re: [Maria-developers] b22a28c2295: fixup! 3fe5cd5e1785e3e8de7add9977a1c2ddd403538b
by Michael Widenius 22 May '20

Hi!

On Fri, May 22, 2020 at 3:27 PM Andrei Elkin <andrei.elkin(a)mariadb.com> wrote:

<cut>

CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc

>
> Aria just makes yet another previously unknown use case of an engine
> that produces THD::ha_info but does not support 2pc, which the assert
> implied.
>
> To explain more, the original block
>
> #ifndef DBUG_OFF
>   for (ha_info= thd->transaction.all.ha_list; rw_count > 1 && ha_info;
>        ha_info= ha_info->next())
>     DBUG_ASSERT(ha_info->ht() != binlog_hton);
> #endif
>
> claims there must be no binlog hton in a transaction consisting of more
> than 1 hton:s, *when* (at this point) this transaction has not been
> binlogged yet.
> So combination of binlog + Innodb hton would raise the assert, to
> question "why the heck binlogging has not been done yet?!".
>
> Aria is different from Innodb in this context in that binlogging
> was done at the end of the statement, so to miss
> `cache_mngr->need_unlog' flagging (which is at xa prepare time logging).

Thanks for the explanation, we should have had that in the code.

> I think we should fix the assert rather than to remove. This way:
>
> cat > assert.diff <<.
> diff --git a/sql/log.cc b/sql/log.cc
> index 792c6bb1a99..aaf1fae1cd6 100644
> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -10124,6 +10124,16 @@ int TC_LOG_BINLOG::unlog_xa_prepare(THD *thd, bool all)
>    Ha_trx_info *ha_info;
>    uint rw_count= ha_count_rw_all(thd, &ha_info);
>    bool rc= false;
> +#ifndef DBUG_OFF
> +  bool no_binlog= true, exist_no_2pc= false;
> +  for (ha_info= thd->transaction->all.ha_list; rw_count > 1 && ha_info;
> +       ha_info= ha_info->next())
> +  {
> +    no_binlog= no_binlog && ha_info->ht() != binlog_hton;
> +    exist_no_2pc= exist_no_2pc || !ha_info->ht()->prepare;
> +  }
> +  DBUG_ASSERT(no_binlog || exist_no_2pc);
> +#endif
>
>    if (rw_count > 0)
>    {
> .
>
> I tested it briefly with running XA:s on combination of engines
> including.

On which tree did you test? bb-10.5-aria?

In the end, after discussions on slack, we ended with:

#ifndef DBUG_OFF
  if (rw_count > 1)
  {
    /*
      There must be no binlog_hton used in a transaction consisting of more
      than 1 engine, *when* (at this point) this transaction has not been
      binlogged. The one exception is if there is an engine without a
      prepare method, as in this case the engine doesn't support XA and
      we have to ignore this check.
    */
    bool binlog= false, exist_hton_without_prepare= false;
    for (ha_info= thd->transaction->all.ha_list; ha_info;
         ha_info= ha_info->next())
    {
      if (ha_info->ht() == binlog_hton)
        binlog= true;
      if (!ha_info->ht()->prepare)
        exist_hton_without_prepare= true;
    }
    DBUG_ASSERT(!binlog || exist_hton_without_prepare);
  }
#endif

> > The reason it was not hit
> > is that before Aria was not treated as transactional we could only
> > come here in case of errors
> > and that code was apparently not tested, at least with binary logging on.
> >
>
> > With Aria we could come here in case of rollback and we got assert for
> > cases that was perfectly ok.
>
> Well, what 'rollback' do you mean? The function is invoked only
> for ha_prepare().

As far as I remember, I came into this code in some edge case when
something failed that normally never fails. I think the issue was that
Aria doesn't have a prepare handler, which caused issues in ha_prepare(),
but not sure.

Anyway, we now have a solution for this.

Regards,
Monty
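The final debug check above can be tried outside the server with a small stand-alone model. This is only a sketch: `EngineInfo` and `xa_prepare_state_ok` are hypothetical names, not MariaDB types or functions, and the model assumes every listed participant is read-write, so the list size stands in for `rw_count`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a handlerton: only the two properties the
// debug check looks at. Not MariaDB's real type.
struct EngineInfo {
  bool is_binlog;    // true for the binlog pseudo-engine
  bool has_prepare;  // true if the engine provides a prepare method (2PC)
};

// Returns true when the participant list satisfies the condition of the
// assert: with more than one participant, the binlog may be present only
// if at least one engine has no prepare method.
// Assumption: all participants are read-write, so size() models rw_count.
bool xa_prepare_state_ok(const std::vector<EngineInfo> &participants) {
  if (participants.size() <= 1)
    return true;  // the check only applies when rw_count > 1
  bool binlog = false, exist_engine_without_prepare = false;
  for (const EngineInfo &e : participants) {
    if (e.is_binlog)
      binlog = true;
    if (!e.has_prepare)
      exist_engine_without_prepare = true;
  }
  return !binlog || exist_engine_without_prepare;
}
```

In this model, binlog plus an engine with a prepare method fails the check (the "why has binlogging not been done yet" case), while binlog plus an engine without a prepare method passes, which is the Aria exception described in the comment.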

Re: [Maria-developers] acbe14b122c: Aria will now register it's transactions
by Sergei Golubchik 22 May '20

Hi, Michael!

See below

On May 20, Michael Widenius wrote:
> revision-id: acbe14b122c (mariadb-10.5.2-255-gacbe14b122c)
> parent(s): 48a758696bf
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-05-19 17:52:17 +0300
> message:
>
> Aria will now register it's transactions
>
> MDEV-22531 Remove maria::implicit_commit()
> diff --git a/sql/sql_class.h b/sql/sql_class.h
> index d4a95fa3fd8..a7a071f7cdb 100644
> --- a/sql/sql_class.h
> +++ b/sql/sql_class.h
> @@ -5149,6 +5151,40 @@ class THD: public THD_count, /* this must be first */
>
>  };
>
> +
> +/*
> +  Start a new independent transaction for the THD.
> +  The old one is stored in this object and restored when calling
> +  restore_old_transaction() or when the object is freed
> +*/
> +
> +class start_new_trans
> +{
> +  /* container for handler's private per-connection data */
> +  Ha_data old_ha_data[MAX_HA];
> +  struct THD::st_transactions *old_transaction, new_transaction;
> +  Open_tables_backup open_tables_state_backup;
> +  MDL_savepoint mdl_savepoint;
> +  PSI_transaction_locker *m_transaction_psi;
> +  THD *org_thd;
> +  uint in_sub_stmt;
> +  uint server_status;
> +
> +public:
> +  start_new_trans(THD *thd);
> +  ~start_new_trans()
> +  {
> +    destroy();
> +  }
> +  void destroy()
> +  {
> +    if (org_thd) // Safety
> +      restore_old_transaction();
> +    new_transaction.free();
> +  }
> +  void restore_old_transaction();

interesting. You made restore_old_transaction() public and you use it in
many places. Why not to use destroy() instead? When one would want to
use restore_old_transaction() instead of destroy()?

> +};
> +
>  /** A short cut for thd->get_stmt_da()->set_ok_status(). */
>
>  inline void
> diff --git a/sql/sql_class.cc b/sql/sql_class.cc
> index dda8e00f6bf..51d7380f622 100644
> --- a/sql/sql_class.cc
> +++ b/sql/sql_class.cc
> @@ -5742,6 +5744,90 @@ void THD::mark_transaction_to_rollback(bool all)
>  }
>
>
> +/**
> +  Commit the whole transaction (both statment and all)
> +
> +  This is used mainly to commit an independent transaction,
> +  like reading system tables.
> +
> +  @return 0    0k
> +  @return <>0  error code. my_error() has been called()
> +*/
> +
> +int THD::commit_whole_transaction_and_close_tables()
> +{
> +  int error, error2;
> +  DBUG_ENTER("THD::commit_whole_transaction_and_close_tables");
> +
> +  /*
> +    This can only happened if we failed to open any table in the
> +    new transaction
> +  */
> +  if (!open_tables)
> +    DBUG_RETURN(0);

Generally, I think, it should still end the transaction here.
Or add an assert that there is no active transaction at the moment.

> +
> +  /*
> +    Ensure table was locked (opened with open_and_lock_tables()). If not
> +    the THD can't be part of any transactions and doesn't have to call
> +    this function.
> +  */
> +  DBUG_ASSERT(lock);
> +
> +  error= ha_commit_trans(this, FALSE);
> +  /* This will call external_lock to unlock all tables */
> +  if ((error2= mysql_unlock_tables(this, lock)))
> +  {
> +    my_error(ER_ERROR_DURING_COMMIT, MYF(0), error2);
> +    error= error2;
> +  }
> +  lock= 0;
> +  if ((error2= ha_commit_trans(this, TRUE)))
> +    error= error2;
> +  close_thread_tables(this);

I wonder why you're doing it in that specific order.
commit(stmt)-unlock-commit(all)-close

> +  DBUG_RETURN(error);
> +}
> +
> +/**
> +  Start a new independent transaction
> +*/
> +
> +start_new_trans::start_new_trans(THD *thd)
> +{
> +  org_thd= thd;
> +  mdl_savepoint= thd->mdl_context.mdl_savepoint();
> +  memcpy(old_ha_data, thd->ha_data, sizeof(old_ha_data));
> +  thd->reset_n_backup_open_tables_state(&open_tables_state_backup);
> +  bzero(thd->ha_data, sizeof(thd->ha_data));
> +  old_transaction= thd->transaction;
> +  thd->transaction= &new_transaction;
> +  new_transaction.on= 1;
> +  in_sub_stmt= thd->in_sub_stmt;
> +  thd->in_sub_stmt= 0;
> +  server_status= thd->server_status;
> +  m_transaction_psi= thd->m_transaction_psi;
> +  thd->m_transaction_psi= 0;
> +  thd->server_status&= ~(SERVER_STATUS_IN_TRANS |
> +                         SERVER_STATUS_IN_TRANS_READONLY);
> +  thd->server_status|= SERVER_STATUS_AUTOCOMMIT;
> +}

Few thoughts:

1. If you need to save and restore _all that_ then, perhaps, all that
   should be inside st_transactions ?

2. strictly speaking, ha_data is _per connection_. If you just bzero it,
   the engine will think it's a new connection, and you cannot just
   overwrite it on restore without hton->close_connection.

   A strictly "proper" solution would be to introduce ha_data per
   transaction in addition to what we have now. But it looks like an
   overkill. So I'd just add ha_close_connection() now into
   restore_old_transaction() and that's all.

   I see that you've added free_transaction() call, but isn't it
   redundant? ha_close_connection() does that now.

Otherwise very good, thanks, I cannot wait to start using it more for
other features.

> +
> +
> +void start_new_trans::restore_old_transaction()
> +{
> +  org_thd->transaction= old_transaction;
> +  org_thd->restore_backup_open_tables_state(&open_tables_state_backup);
> +  ha_free_transactions(org_thd);
> +  memcpy(org_thd->ha_data, old_ha_data, sizeof(old_ha_data));
> +  org_thd->mdl_context.rollback_to_savepoint(mdl_savepoint);
> +  org_thd->in_sub_stmt= in_sub_stmt;
> +  org_thd->server_status= server_status;
> +  if (org_thd->m_transaction_psi)
> +    MYSQL_COMMIT_TRANSACTION(org_thd->m_transaction_psi);
> +  org_thd->m_transaction_psi= m_transaction_psi;
> +  org_thd= 0;
> +}
> +
> +
>  /**
>    Decide on logging format to use for the statement and issue errors
>    or warnings as needed. The decision depends on the following
> diff --git a/sql/event_db_repository.cc b/sql/event_db_repository.cc
> index af43d92dea7..82b3968de85 100644
> --- a/sql/event_db_repository.cc
> +++ b/sql/event_db_repository.cc
> @@ -742,7 +748,8 @@ Event_db_repository::create_event(THD *thd, Event_parse_data *parse_data,
>    ret= 0;
>
>  end:
> -  close_thread_tables(thd);
> +  if (table)
> +    thd->commit_whole_transaction_and_close_tables();

You're checking twice. Here in the caller, and the first thing inside
commit_whole_transaction_and_close_tables(). Here and in many places.
If you want to check in the caller, there's no need to check inside?
You can add an assert instead, can't you?

>    thd->mdl_context.rollback_to_savepoint(mdl_savepoint);
>
>    thd->variables.sql_mode= saved_mode;
> @@ -1117,22 +1124,20 @@ update_timing_fields_for_event(THD *thd,
>    TABLE *table= NULL;
>    Field **fields;
>    int ret= 1;
> -  enum_binlog_format save_binlog_format;
>    MYSQL_TIME time;
>    DBUG_ENTER("Event_db_repository::update_timing_fields_for_event");
>
> -  /*
> -    Turn off row binlogging of event timing updates. These are not used
> -    for RBR of events replicated to the slave.
> -  */
> -  save_binlog_format= thd->set_current_stmt_binlog_format_stmt();
> -
>    DBUG_ASSERT(thd->security_ctx->master_access & PRIV_IGNORE_READ_ONLY);
>
>    if (open_event_table(thd, TL_WRITE, &table))
>      goto end;

Why do you reuse a current transaction, instead of creating a new one
as everywhere above?

>
>    fields= table->field;
> +  /*
> +    Turn off row binlogging of event timing updates. These are not used
> +    for RBR of events replicated to the slave.
> +  */
> +  table->file->row_logging= 0;
>
>    if (find_named_event(event_db_name, event_name, table))
>      goto end;
> diff --git a/sql/handler.h b/sql/handler.h
> index e4903172c33..ebe23b97062 100644
> --- a/sql/handler.h
> +++ b/sql/handler.h
> @@ -1765,6 +1766,12 @@ handlerton *ha_default_tmp_handlerton(THD *thd);
>  */
>  #define HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE (1 << 15)
>
> +/*
> +  True if handler cannot roolback transactions. If not true, the transaction
> +  will be put in the transactional binlog cache.
> +*/
> +#define HTON_NO_ROLLBACK (1 << 16)

How is it different from HA_PERSISTENT_TABLE?

> +
>  class Ha_trx_info;
>
>  struct THD_TRANS
> diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> index 582c9bb110b..0c53bdc5bbe 100644
> --- a/sql/ha_partition.cc
> +++ b/sql/ha_partition.cc
> @@ -11016,7 +11016,7 @@ int ha_partition::check_misplaced_rows(uint read_part_id, bool do_repair)
>        If the engine supports transactions, the failure will be
>        rollbacked.

if you're changing it anyway, can you also change
rollbacked -> rolled back ? thanks.

>      */
> -    if (!m_file[correct_part_id]->has_transactions())
> +    if (!m_file[correct_part_id]->has_transactions_and_rollback())
>      {
>        /* Log this error, so the DBA can notice it and fix it! */
>        sql_print_error("Table '%-192s' failed to move/insert a row"
> diff --git a/sql/handler.cc b/sql/handler.cc
> index 39841cc28d7..e3f5773717d 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -4514,7 +4515,6 @@ void handler::mark_trx_read_write_internal()
>    */
>    if (ha_info->is_started())
>    {
> -    DBUG_ASSERT(has_transaction_manager());

there should be *some* assert here, I think

>      /*
>        table_share can be NULL in ha_delete_table(). See implementation
>        of standalone function ha_delete_table() in sql_base.cc.
> diff --git a/sql/share/errmsg-utf8.txt b/sql/share/errmsg-utf8.txt
> index 250da9948a0..1f3bf212d3f 100644
> --- a/sql/share/errmsg-utf8.txt
> +++ b/sql/share/errmsg-utf8.txt
> @@ -7961,3 +7961,5 @@ ER_KEY_CONTAINS_PERIOD_FIELDS
>          eng "Key %`s cannot explicitly include column %`s"
>  ER_KEY_CANT_HAVE_WITHOUT_OVERLAPS
>          eng "Key %`s cannot have WITHOUT OVERLAPS"
> +ER_DATA_WAS_COMMITED_UNDER_ROLLBACK
> +        eng "Engine %s does not support rollback. Changes where commited during rollback call"

A confusing error message. I'd just say "table is not transactional".
This is the user point of view difference between a "transactional
engine" like InnoDB and "crash safe engine" like Aria.
It's very confusing to invent a new category of "transactional engines
that cannot roll back" even if internally Aria is treated like one.

> diff --git a/sql/sp.cc b/sql/sp.cc
> index 51bbeeef368..1629290eb73 100644
> --- a/sql/sp.cc
> +++ b/sql/sp.cc
> @@ -470,27 +471,29 @@ static Proc_table_intact proc_table_intact;
>      currently open tables will be saved, and from which will be
>      restored when we will end work with mysql.proc.
>
> +  NOTES
> +    On must have a start_new_trans object active when calling this function

I think this:

  DBUG_ASSERT(thd->transaction != &thd->default_transaction);

could be a bit safer than a NOTE :)

> +
>    @retval
>      0 Error
>    @retval
>      \# Pointer to TABLE object of mysql.proc
>  */
>
> -TABLE *open_proc_table_for_read(THD *thd, Open_tables_backup *backup)
> +TABLE *open_proc_table_for_read(THD *thd)
>  {
>    TABLE_LIST table;
> -
>    DBUG_ENTER("open_proc_table_for_read");
>
>    table.init_one_table(&MYSQL_SCHEMA_NAME, &MYSQL_PROC_NAME, NULL, TL_READ);
>
> -  if (open_system_tables_for_read(thd, &table, backup))
> +  if (open_system_tables_for_read(thd, &table))
>      DBUG_RETURN(NULL);
>
>    if (!proc_table_intact.check(table.table, &proc_table_def))
>      DBUG_RETURN(table.table);
>
> -  close_system_tables(thd, backup);
> +  thd->commit_whole_transaction_and_close_tables();
>
>    DBUG_RETURN(NULL);
>  }
> @@ -517,7 +520,8 @@ static TABLE *open_proc_table_for_update(THD *thd)
>    MDL_savepoint mdl_savepoint= thd->mdl_context.mdl_savepoint();
>    DBUG_ENTER("open_proc_table_for_update");
>

same? also need start_new_trans?

>
> -  table_list.init_one_table(&MYSQL_SCHEMA_NAME, &MYSQL_PROC_NAME, NULL, TL_WRITE);
> +  table_list.init_one_table(&MYSQL_SCHEMA_NAME, &MYSQL_PROC_NAME, NULL,
> +                            TL_WRITE);
>
>    if (!(table= open_system_table_for_update(thd, &table_list)))
>      DBUG_RETURN(NULL);
> diff --git a/sql/sql_base.cc b/sql/sql_base.cc
> index 6c938670fdc..325a0f1d41d 100644
> --- a/sql/sql_base.cc
> +++ b/sql/sql_base.cc
> @@ -4259,7 +4259,7 @@ bool open_tables(THD *thd, const DDL_options_st &options,
>        list, we still need to call open_and_process_routine() to take
>        MDL locks on the routines.
>      */
> -    if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
> +    if (thd->locked_tables_mode <= LTM_LOCK_TABLES && *sroutine_to_open)

why?

>      {
>        /*
>          Process elements of the prelocking set which are present there
> @@ -8887,17 +8887,16 @@ bool is_equal(const LEX_CSTRING *a, const LEX_CSTRING *b)
>      open_system_tables_for_read()
>        thd         Thread context.
>        table_list  List of tables to open.
> -      backup      Pointer to Open_tables_state instance where
> -                  information about currently open tables will be
> -                  saved, and from which will be restored when we will
> -                  end work with system tables.
>
>    NOTES
> +    Caller should have used start_new_trans object to start a new
> +    transcation when reading system tables.

assert, may be?

> +
>      Thanks to restrictions which we put on opening and locking of
>      system tables for writing, we can open and lock them for reading
> -    even when we already have some other tables open and locked. One
> -    must call close_system_tables() to close systems tables opened
> -    with this call.
> +    even when we already have some other tables open and locked.
> +    One should call thd->commit_whole_transaction_and_close_tables()
> +    to close systems tables opened with this call.
>
>    NOTES
>      In some situations we use this function to open system tables for
> diff --git a/sql/sql_help.cc b/sql/sql_help.cc
> index c9307b578fc..3ccab553bfe 100644
> --- a/sql/sql_help.cc
> +++ b/sql/sql_help.cc
> @@ -709,8 +709,9 @@ static bool mysqld_help_internal(THD *thd, const char *mask)
>      Reset and backup the current open tables state to
>      make it possible.
>    */
> -  Open_tables_backup open_tables_state_backup;
> -  if (open_system_tables_for_read(thd, tables, &open_tables_state_backup))
> +  start_new_trans new_trans(thd);

I'm starting to think that this (and in some other places) is rather
redundant. Transactions can be expensive, the overhead being quite big.
It seems like an overkill to wrap read-only system table accesses in a
separate transaction, they could as well be performed as a part of the
parent transaction. Write accesses have to be wrapped in a dedicated
transaction, of course.

> +
> +  if (open_system_tables_for_read(thd, tables))
>      goto error2;
>
>    /*
> diff --git a/sql/sql_show.cc b/sql/sql_show.cc
> index 2528134f4ee..db5b4d1c5fd 100644
> --- a/sql/sql_show.cc
> +++ b/sql/sql_show.cc
> @@ -6105,7 +6105,7 @@ static my_bool iter_schema_engines(THD *thd, plugin_ref plugin,
>        table->field[1]->store(option_name, strlen(option_name), scs);
>        table->field[2]->store(plugin_decl(plugin)->descr,
>                               strlen(plugin_decl(plugin)->descr), scs);
> -      tmp= &yesno[MY_TEST(hton->commit)];
> +      tmp= &yesno[MY_TEST(hton->commit && !(hton->flags & HTON_NO_ROLLBACK))];

Why not to say that hton->rollback==NULL means no rollback?
Then you won't need a new flag for that.
And you'll remove redundancy, what would it mean if HTON_NO_ROLLBACK is
present, but hton->rollback!=NULL? Or the other way around,
HTON_NO_ROLLBACK is not present, but hton->rollback==NULL?

>        table->field[3]->store(tmp->str, tmp->length, scs);
>        table->field[3]->set_notnull();
>        tmp= &yesno[MY_TEST(hton->prepare)];
> diff --git a/sql/sql_statistics.cc b/sql/sql_statistics.cc
> index 55e8e52c052..7a817aea97e 100644
> --- a/sql/sql_statistics.cc
> +++ b/sql/sql_statistics.cc
> @@ -230,17 +230,17 @@ index_stat_def= {INDEX_STAT_N_FIELDS, index_stat_fields, 4, index_stat_pk_col};
>    Open all statistical tables and lock them
>  */
>
> -static int open_stat_tables(THD *thd, TABLE_LIST *tables,
> -                            Open_tables_backup *backup, bool for_write)
> +static int open_stat_tables(THD *thd, TABLE_LIST *tables, bool for_write)
>  {
>    int rc;
> -
>    Dummy_error_handler deh; // suppress errors
> +  DBUG_ASSERT(thd->transaction != &thd->default_transaction);

I see, you already use such an assert :)

> +
>    thd->push_internal_handler(&deh);
>    init_table_list_for_stat_tables(tables, for_write);
>    init_mdl_requests(tables);
>    thd->in_sub_stmt|= SUB_STMT_STAT_TABLES;
> -  rc= open_system_tables_for_read(thd, tables, backup);
> +  rc= open_system_tables_for_read(thd, tables);
>    thd->in_sub_stmt&= ~SUB_STMT_STAT_TABLES;
>
>    thd->pop_internal_handler();
>

Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
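The review raises two points about start_new_trans: the difference between an explicit restore and the destructor, and guarding callers with an assert on thd->transaction. Both can be illustrated with a much reduced model. This is a sketch under assumptions: `Session`, `Transaction`, `ScopedNewTransaction` and `in_independent_transaction` are hypothetical names, and only the transaction pointer is saved and restored, not ha_data, MDL savepoints or server status.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-ins for THD and its transaction
// state. Only the pointer swap is modelled.
struct Transaction {
  bool on = false;
};

struct Session {
  Transaction default_transaction;
  Transaction *transaction = &default_transaction;
};

// RAII guard: switches the session to a fresh transaction object and
// restores the previous one either explicitly or on destruction.
class ScopedNewTransaction {
  Session *org_session;          // nulled after restore, like org_thd
  Transaction *old_transaction;  // what to put back
  Transaction new_transaction;   // the independent transaction

public:
  explicit ScopedNewTransaction(Session *s)
      : org_session(s), old_transaction(s->transaction) {
    new_transaction.on = true;
    s->transaction = &new_transaction;
  }
  ScopedNewTransaction(const ScopedNewTransaction &) = delete;
  ScopedNewTransaction &operator=(const ScopedNewTransaction &) = delete;

  ~ScopedNewTransaction() { restore(); }

  // Safe to call more than once: the second call is a no-op.
  void restore() {
    if (!org_session)
      return;
    org_session->transaction = old_transaction;
    org_session = nullptr;
  }
};

// The condition behind the suggested assert: true while the session is
// running on something other than its default transaction.
bool in_independent_transaction(const Session &s) {
  return s.transaction != &s.default_transaction;
}
```

With an idempotent restore, an early explicit call and the destructor can coexist, and a function that requires an independent transaction can check `in_independent_transaction` on entry instead of relying on a comment.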
