
12 Mar '20
Hi, Sergei!
What is this "support" about? Could you explain it in one sentence?
Thanks!
Andrei
psergey <sergey(a)mariadb.com> writes:
> revision-id: 5d0e4ce291ae940feaa8398435158dc56a3db3c4 ()
> parent(s): d504887718112e211544beca0e6651d5477466e1
> author: Sergei Petrunia
> committer: Sergei Petrunia
> timestamp: 2020-03-12 10:26:06 +0300
> message:
>
> MySQL support added
>
> ---
> .gitignore | 1 +
> filesort-bench1/06-make-varchar-bench.sh | 30 +++++++++++++++--
> prepare-server.sh | 11 ++++--
> setup-server/setup-mariadb-current.sh | 6 ++--
> setup-server/setup-mysql-8.0.sh | 58 ++++++++++++++++++++++++++++++++
> 5 files changed, 98 insertions(+), 8 deletions(-)
>
> diff --git a/.gitignore b/.gitignore
> index fcd5175..3bccc18 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -23,3 +23,4 @@ my-10.5-old.cnf
> mysql-boost
> mysql-8.0
> mysql8-data
> +mysql-8.0-data.clean/
> diff --git a/filesort-bench1/06-make-varchar-bench.sh b/filesort-bench1/06-make-varchar-bench.sh
> index 85afcfd..21da087 100644
> --- a/filesort-bench1/06-make-varchar-bench.sh
> +++ b/filesort-bench1/06-make-varchar-bench.sh
> @@ -29,6 +29,17 @@ create table test_run_queries (
> test_time_ms bigint,
> sort_merge_passes int
> );
> +
> +drop view if exists session_status;
> +
> +set @var= IF(version() like '%8.0%',
> + 'create view session_status as select * from performance_schema.session_status',
> + 'create view session_status as select * from information_schema.session_status');
> +
> +prepare s from @var;
> +execute s;
> +
> +
> END
>
> ###
> @@ -55,7 +66,18 @@ create table $test_table_name (
> char_field varchar($varchar_size) character set utf8, b int
> ) engine=myisam;
>
> -insert into $rand_table_name select 1+floor(rand() * @n_countries) from seq_1_to_$table_size;
> +drop table if exists ten, one_k;
> +create table ten(a int);
> +insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
> +
> +create table one_k(a int);
> +insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
> +
> +set @a=0;
> +insert into $rand_table_name
> +select 1+floor(rand() * @n_countries)
> +from
> + (select @a:=@a+1 from one_k A, one_k B, one_k C limit $table_size) T;
> insert into $test_table_name
> select
> (select Name from Country where id=T.a), 1234
> @@ -63,13 +85,15 @@ from $rand_table_name T ;
>
> drop table $rand_table_name;
> analyze table $test_table_name;
> +select count(*) from $test_table_name;
> +show create table $test_table_name;
> END
>
> for i in 1 2 3 4 5 6 7 8 9 10 ; do
>
> ### query_start.sql here:
> cat <<END
> -select variable_value into @query_start_smp from information_schema.session_status where variable_name like 'sort_merge_passes';
> +select variable_value into @query_start_smp from session_status where variable_name like 'sort_merge_passes';
> select current_timestamp(6) into @query_start_time;
> END
> ###
> @@ -87,7 +111,7 @@ echo $QUERY
> cat << END
> set @test_name='$TEST_NAME';
> set @query_time_ms= timestampdiff(microsecond, @query_start_time, current_timestamp(6))/1000;
> -select variable_value into @query_end_smp from information_schema.session_status where variable_name like 'sort_merge_passes';
> +select variable_value into @query_end_smp from session_status where variable_name like 'sort_merge_passes';
> set @query_merge_passes = @query_end_smp - @query_start_smp;
> insert into test_run_queries
> (table_size, varchar_size, test_ts, test_time_ms, sort_merge_passes)
> diff --git a/prepare-server.sh b/prepare-server.sh
> index 01dbeba..d4da8db 100755
> --- a/prepare-server.sh
> +++ b/prepare-server.sh
> @@ -38,6 +38,11 @@ if [ ! -d $SERVERNAME ]; then
> exit 1
> fi
>
> +if [ ! -f $SERVERNAME-vars.sh ]; then
> + echo "Can't find settings file $SERVERNAME-vars.sh."
> + exit 1
> +fi
> +
> if [[ $USE_RAMDISK ]] ; then
> echo " Using /dev/shm for data dir"
> fi
> @@ -49,6 +54,8 @@ sleep 5
>
> DATA_DIR=$SERVERNAME-data
>
> +source ${SERVERNAME}-vars.sh
> +
> if [[ $RECOVER ]] ; then
> echo "Recovering the existing datadir"
> else
> @@ -64,7 +71,7 @@ else
> fi
>
> #exit 0;
> -./$SERVERNAME/sql/mysqld --defaults-file=./my-${SERVERNAME}.cnf &
> +$MYSQLD --defaults-file=./my-${SERVERNAME}.cnf &
>
>
> server_attempts=0
> @@ -72,7 +79,7 @@ server_attempts=0
> while true ; do
> client_attempts=0
> while true ; do
> - ./$SERVERNAME/client/mysql --defaults-file=./my-${SERVERNAME}.cnf -uroot -e "create database sbtest"
> + $MYSQL $MYSQL_ARGS -e "select 1"
>
> if [ $? -eq 0 ]; then
> break
> diff --git a/setup-server/setup-mariadb-current.sh b/setup-server/setup-mariadb-current.sh
> index 1b49ef0..b6bc417 100755
> --- a/setup-server/setup-mariadb-current.sh
> +++ b/setup-server/setup-mariadb-current.sh
> @@ -85,16 +85,16 @@ innodb_buffer_pool_size=8G
>
> EOF
>
> -cat > mysql-vars.sh <<EOF
> +cat > $DIRNAME-vars.sh <<EOF
> MYSQL="`pwd`/$DIRNAME/client/mysql"
> +MYSQLD="`pwd`/$DIRNAME/sql/mysqld"
> MYSQLSLAP="`pwd`/$DIRNAME/client/mysqlslap"
> MYSQL_SOCKET="--socket=$SOCKETNAME"
> MYSQL_USER="-uroot"
> MYSQL_ARGS="\$MYSQL_USER \$MYSQL_SOCKET"
> EOF
>
> -source mysql-vars.sh
> -cp mysql-vars.sh $DIRNAME-vars.sh
> +source $DIRNAME-vars.sh
>
> (
> cd $HOMEDIR/$DIRNAME/sql
> diff --git a/setup-server/setup-mysql-8.0.sh b/setup-server/setup-mysql-8.0.sh
> new file mode 100644
> index 0000000..ffd9987
> --- /dev/null
> +++ b/setup-server/setup-mysql-8.0.sh
> @@ -0,0 +1,58 @@
> +#!/bin/bash
> +
> +HOMEDIR=`pwd`
> +
> +BRANCH=8.0
> +SERVER_VERSION=8.0
> +DIRNAME="mysql-$SERVER_VERSION"
> +
> +git clone --branch $BRANCH --depth 1 https://github.com/mysql/mysql-server.git $DIRNAME
> +
> +cd mysql-$SERVER_VERSION
> +cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDOWNLOAD_BOOST=1 -DWITH_BOOST=$HOMEDIR/mysql-boost \
> + -DENABLE_DOWNLOADS=1 -DFORCE_INSOURCE_BUILD=1 -DWITH_UNIT_TESTS=0
> +
> +make -j8
> +
> +cd mysql-test
> +perl ./mysql-test-run alias
> +cp -r var/data $HOMEDIR/$DIRNAME-data
> +cp -r var/data $HOMEDIR/$DIRNAME-data.clean
> +cd ..
> +
> +
> +source_dir=`pwd`
> +socket_name="`basename $source_dir`.sock"
> +SOCKETNAME="/tmp/$socket_name"
> +
> +cat > $HOMEDIR/my-$DIRNAME.cnf <<EOF
> +
> +[mysqld]
> +datadir=$HOMEDIR/$DIRNAME-data
> +
> +tmpdir=/tmp
> +port=3320
> +socket=$SOCKETNAME
> +#binlog-format=row
> +gdb
> +lc_messages_dir=../share
> +server-id=12
> +bind-address=0.0.0.0
> +log-error
> +secure_file_priv=
> +innodb_buffer_pool_size=4G
> +EOF
> +
> +cat > $DIRNAME-vars.sh <<EOF
> +MYSQL="`pwd`/$DIRNAME/bin/mysql"
> +MYSQLD="`pwd`/$DIRNAME/bin/mysqld"
> +MYSQLSLAP="`pwd`/$DIRNAME/bin/mysqlslap"
> +MYSQL_SOCKET="--socket=$SOCKETNAME"
> +MYSQL_USER="-uroot"
> +MYSQL_ARGS="\$MYSQL_USER \$MYSQL_SOCKET"
> +EOF
> +
> +source $DIRNAME-vars.sh
> +
> +$MYSQLD --defaults-file=$HOMEDIR/my-$DIRNAME.cnf &
> +

11 Mar '20
Hi, Nikita!
On Mar 10, Nikita Malyavin wrote:
> revision-id: bbe056ac3fa (mariadb-10.5.0-273-gbbe056ac3fa)
> parent(s): 7a5d3316805
> author: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> committer: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> timestamp: 2020-03-11 00:46:24 +1000
> message:
>
> support NULL fields in key
>
> ---
> mysql-test/suite/period/r/overlaps.result | 18 ++++++++++++++++--
> mysql-test/suite/period/t/overlaps.test | 16 +++++++++++++++-
> sql/handler.cc | 18 ++++++++++++++----
> sql/sql_table.cc | 29 +++++++----------------------
> 4 files changed, 52 insertions(+), 29 deletions(-)
>
> diff --git a/mysql-test/suite/period/r/overlaps.result b/mysql-test/suite/period/r/overlaps.result
> index cf980afd7f0..e52b21496b5 100644
> --- a/mysql-test/suite/period/r/overlaps.result
> +++ b/mysql-test/suite/period/r/overlaps.result
> @@ -104,16 +104,30 @@ show create table t;
> Table Create Table
> t CREATE TABLE `t` (
> `id` int(11) NOT NULL,
> - `u` int(11) NOT NULL,
> + `u` int(11) DEFAULT NULL,
> `s` date NOT NULL,
> `e` date NOT NULL,
> PERIOD FOR `p` (`s`, `e`),
> PRIMARY KEY (`id`,`p` WITHOUT OVERLAPS),
> UNIQUE KEY `u` (`u`,`p` WITHOUT OVERLAPS)
> ) ENGINE=DEFAULT_ENGINE DEFAULT CHARSET=latin1
> +insert into t values (2, NULL, '2003-03-01', '2003-05-01');
> +insert into t values (2, NULL, '2003-03-01', '2003-05-01');
> +ERROR 23000: Duplicate entry '2-2003-05-01-2003-03-01' for key 'PRIMARY'
This is wrong. Should be no duplicate key error above.
> +insert into t values (3, NULL, '2003-03-01', '2003-05-01');
> insert into t values (1, 1, '2003-03-01', '2003-05-01');
> insert into t values (1, 2, '2003-05-01', '2003-07-01');
> -insert into t values (2, 1, '2003-05-01', '2003-07-01');
> +insert into t values (4, NULL, '2003-03-01', '2003-05-01');
> +create sequence seq start=5;
> +update t set id= nextval(seq), u= nextval(seq), s='2003-05-01', e='2003-07-01'
> + where u is NULL;
> +select * from t;
> +id u s e
> +1 1 2003-03-01 2003-05-01
> +1 2 2003-05-01 2003-07-01
> +5 6 2003-05-01 2003-07-01
> +7 8 2003-05-01 2003-07-01
> +9 10 2003-05-01 2003-07-01
> create or replace table t(id int, s date, e date,
> period for p(s,e));
> insert into t values (1, '2003-01-01', '2003-03-01'),
> diff --git a/sql/handler.cc b/sql/handler.cc
> index 917386f4392..16f57533f27 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -7015,7 +7017,8 @@ int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data)
> bool key_used= false;
> for (uint k= 0; k < key_parts && !key_used; k++)
> key_used= bitmap_is_set(table->write_set,
> - key_info.key_part[k].fieldnr - 1);
> + key_info.key_part[k].fieldnr - 1)
> + && !key_info.key_part[k].field->is_null_in_record(new_data);
Why is that?
> if (!key_used)
> continue;
> }
> @@ -7064,8 +7067,15 @@ int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data)
> error= handler->ha_index_next(record_buffer);
> }
>
> - if (!error && table->check_period_overlaps(key_info, key_info,
> - new_data, record_buffer) == 0)
> + bool null_in_key= false;
> + for (uint k= 0; k < key_parts && !null_in_key; k++)
> + {
> + null_in_key= key_info.key_part[k].field->is_null_in_record(record_buffer);
> + }
> +
> + if (!null_in_key && !error
> + && table->check_period_overlaps(key_info, key_info,
> + new_data, record_buffer) == 0)
That's strange. You compare keys in key_period_compare_bases(), so why do
you handle NULL values here?
> error= HA_ERR_FOUND_DUPP_KEY;
>
> if (error == HA_ERR_KEY_NOT_FOUND || error == HA_ERR_END_OF_FILE)
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] eea71e8b05a: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
by Sergei Golubchik 10 Mar '20
Hi, Nikita!
Despite what the subject says, it's the review for eea71e8b05a..7a5d3316805
That is everything in bb-10.5-MDEV-16978-without-overlaps minus the last
commit (that came too late and I'll review it separately)
In general it looks pretty good, just a few minor comments below:
On Mar 10, Nikita Malyavin wrote:
> revision-id: eea71e8b05a (mariadb-10.5.0-246-geea71e8b05a)
> parent(s): 6618fc29749
> author: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> committer: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> timestamp: 2020-02-21 02:38:57 +1000
> message:
>
> MDEV-16978 Application-time periods: WITHOUT OVERLAPS
>
> * The overlaps check is implemented on a handler level per row command.
> It creates a separate cursor (actually, another handler instance) and
> caches it inside the original handler, when ha_update_row or
> ha_insert_row is issued. Cursor closes on unlocking the handler.
>
> * Containing the same key in index means unique constraint violation
> even in usual terms. So we fetch left and right neighbours and check
> that they have same key prefix, excluding from the key only the period part.
> If it doesn't match, then there's no such neighbour, and the check passes.
> Otherwise, we check if this neighbour intersects with the considered key.
>
> * The check does introduce new error and fails with ER_DUPP_KEY error.
"does not introduce new error" you mean?
> This might break REPLACE workflow and should be fixed separately
>
> diff --git a/sql/field.h b/sql/field.h
> index 4a8eec35b05..e187ffeb331 100644
> --- a/sql/field.h
> +++ b/sql/field.h
> @@ -1444,8 +1444,9 @@ class Field: public Value_source
> if (null_ptr)
> null_ptr=ADD_TO_PTR(null_ptr,ptr_diff,uchar*);
> }
> - virtual void get_image(uchar *buff, uint length, CHARSET_INFO *cs)
> - { memcpy(buff,ptr,length); }
> + virtual void get_image(uchar *buff, uint length,
> + const uchar *ptr_arg, CHARSET_INFO *cs) const
> + { memcpy(buff,ptr_arg,length); }
please, add a convenience method.
void get_image(uchar *buff, uint length, CHARSET_INFO *cs)
{ get_image(buff, length, ptr, cs); }
and the same below, where you add ptr_arg
> virtual void set_image(const uchar *buff,uint length, CHARSET_INFO *cs)
> { memcpy(ptr,buff,length); }
>
> @@ -4056,7 +4066,8 @@ class Field_varstring :public Field_longstr {
> using Field_str::store;
> double val_real() override;
> longlong val_int() override;
> - String *val_str(String *, String *) override;
> + String *val_str(String *, String *) final;
> + virtual String *val_str(String*,String *, const uchar*) const;
This means that for the sake of indexes WITHOUT OVERLAPS (that very few
people will use) and compressed blobs (that are used even less)
you've added a new virtual call to Field_varstring::val_str
(that is used awfully a lot)
Try to avoid it, please
> my_decimal *val_decimal(my_decimal *) override;
> int cmp_max(const uchar *, const uchar *, uint max_length) const override;
> int cmp(const uchar *a,const uchar *b) const override
> diff --git a/sql/field.cc b/sql/field.cc
> index 1ce49b0bdfa..82df2784057 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -9638,11 +9645,12 @@ int Field_bit::cmp_offset(my_ptrdiff_t row_offset)
> }
>
>
> -uint Field_bit::get_key_image(uchar *buff, uint length, imagetype type_arg)
> +uint Field_bit::get_key_image(uchar *buff, uint length, const uchar *ptr_arg, imagetype type_arg) const
> {
> if (bit_len)
> {
> - uchar bits= get_rec_bits(bit_ptr, bit_ofs, bit_len);
> + auto *bit_ptr_for_arg= ptr_arg + (bit_ptr - ptr);
Please don't use auto in trivial cases like this one.
It might be easier to type, but then the reviewer, and whoever
edits this code in the future, will have to do the type derivation
in their head.
> + uchar bits= get_rec_bits(bit_ptr_for_arg, bit_ofs, bit_len);
> *buff++= bits;
> length--;
> }
> diff --git a/sql/item_buff.cc b/sql/item_buff.cc
> index 81949bcdae0..514ac740697 100644
> --- a/sql/item_buff.cc
> +++ b/sql/item_buff.cc
> @@ -195,7 +195,7 @@ bool Cached_item_field::cmp(void)
> becasue of value change), then copy the new value to buffer.
> */
> if (! null_value && (tmp || (tmp= (field->cmp(buff) != 0))))
> - field->get_image(buff,length,field->charset());
> + field->get_image(buff,length,field->ptr,field->charset());
not needed if you add convenience methods
> return tmp;
> }
>
> diff --git a/sql/sql_class.h b/sql/sql_class.h
> index 13b2659789d..2f1b1431dc0 100644
> --- a/sql/sql_class.h
> +++ b/sql/sql_class.h
> @@ -357,13 +357,15 @@ class Key :public Sql_alloc, public DDL_options {
> engine_option_value *option_list;
> bool generated;
> bool invisible;
> + bool without_overlaps;
> + Lex_ident period;
strictly speaking, you don't need without_overlaps property,
if the period name is set it *has to be* without overlaps.
but ok, whatever you prefer
>
> Key(enum Keytype type_par, const LEX_CSTRING *name_arg,
> ha_key_alg algorithm_arg, bool generated_arg, DDL_options_st ddl_options)
> :DDL_options(ddl_options),
> type(type_par), key_create_info(default_key_create_info),
> name(*name_arg), option_list(NULL), generated(generated_arg),
> - invisible(false)
> + invisible(false), without_overlaps(false)
> {
> key_create_info.algorithm= algorithm_arg;
> }
> diff --git a/sql/table.h b/sql/table.h
> index 6ce92ee048e..09f03690a8c 100644
> --- a/sql/table.h
> +++ b/sql/table.h
> @@ -1635,13 +1635,12 @@ struct TABLE
> int insert_portion_of_time(THD *thd, const vers_select_conds_t &period_conds,
> ha_rows *rows_inserted);
> bool vers_check_update(List<Item> &items);
> -
> + static int check_period_overlaps(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs);
why did you make it a static method of TABLE?
> int delete_row();
> void vers_update_fields();
> void vers_update_end();
> void find_constraint_correlated_indexes();
> - void clone_handler_for_update();
> - void delete_update_handler();
>
> /** Number of additional fields used in versioned tables */
> #define VERSIONING_FIELDS 2
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 240f001f7de..74d28ede25f 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -3959,6 +3959,28 @@ mysql_prepare_create_table(THD *thd, HA_CREATE_INFO *create_info,
> DBUG_RETURN(TRUE);
> }
>
> + switch (key->type) {
> + case Key::UNIQUE:
> + if (!key->period)
> + break;
> + /* Fall through:
> + WITHOUT OVERLAPS forces fields to be NOT NULL
> + */
Why is that?
> + case Key::PRIMARY:
> + /* Implicitly set primary key fields to NOT NULL for ISO conf. */
> + if (!(sql_field->flags & NOT_NULL_FLAG))
> + {
> + /* Implicitly set primary key fields to NOT NULL for ISO conf. */
duplicated comment
> + sql_field->flags|= NOT_NULL_FLAG;
> + sql_field->pack_flag&= ~FIELDFLAG_MAYBE_NULL;
> + null_fields--;
> + }
> + break;
> + default:
> + // Fall through
> + break;
> + }
> +
> cols2.rewind();
> switch(key->type) {
>
> @@ -4536,15 +4556,13 @@ bool Column_definition::sp_prepare_create_field(THD *thd, MEM_ROOT *mem_root)
> }
>
>
> -static bool vers_prepare_keys(THD *thd, HA_CREATE_INFO *create_info,
> - Alter_info *alter_info, KEY **key_info, uint key_count)
> +static bool append_system_key_parts(THD *thd, HA_CREATE_INFO *create_info,
> + Alter_info *alter_info, KEY **key_info,
> + uint key_count)
> {
> - DBUG_ASSERT(create_info->versioned());
> -
> - const char *row_start_field= create_info->vers_info.as_row.start;
> - DBUG_ASSERT(row_start_field);
> - const char *row_end_field= create_info->vers_info.as_row.end;
> - DBUG_ASSERT(row_end_field);
> + const auto &row_start_field= create_info->vers_info.as_row.start;
> + const auto &row_end_field= create_info->vers_info.as_row.end;
Please don't use auto in trivial cases like this one.
It might be easier to type, but then the reviewer, and whoever
edits this code in the future, will have to do the type derivation
in their head.
> + DBUG_ASSERT(!create_info->versioned() || (row_start_field && row_end_field));
>
> List_iterator<Key> key_it(alter_info->key_list);
> Key *key= NULL;
> @@ -4553,25 +4571,61 @@ static bool vers_prepare_keys(THD *thd, HA_CREATE_INFO *create_info,
> if (key->type != Key::PRIMARY && key->type != Key::UNIQUE)
> continue;
>
> + if (create_info->versioned())
> + {
> Key_part_spec *key_part=NULL;
> List_iterator<Key_part_spec> part_it(key->columns);
> while ((key_part=part_it++))
> {
> - if (!my_strcasecmp(system_charset_info,
> - row_start_field,
> - key_part->field_name.str) ||
> -
> - !my_strcasecmp(system_charset_info,
> - row_end_field,
> - key_part->field_name.str))
> + if (row_start_field.streq(key_part->field_name) ||
> + row_end_field.streq(key_part->field_name))
> break;
> }
> - if (key_part)
> - continue; // Key already contains Sys_start or Sys_end
> + if (!key_part)
> + key->columns.push_back(new Key_part_spec(&row_end_field, 0));
> + }
> + }
>
> - Key_part_spec *key_part_sys_end_col=
> - new (thd->mem_root) Key_part_spec(&create_info->vers_info.as_row.end, 0);
> - key->columns.push_back(key_part_sys_end_col);
> + key_it.rewind();
> + while ((key=key_it++))
please skip this loop if there's no PERIOD and skip the loop above
if there's no system versioning.
> + {
> + if (key->without_overlaps)
> + {
> + if (key->type != Key::PRIMARY && key->type != Key::UNIQUE)
> + {
> + my_error(ER_PERIOD_WITHOUT_OVERLAPS_NON_UNIQUE, MYF(0), key->period.str);
> + return true;
I think this can even be a syntax error in the parser.
there's no need to postpone it till here, is there?
> + }
> + if (!create_info->period_info.is_set()
> + || !key->period.streq(create_info->period_info.name))
> + {
> + my_error(ER_PERIOD_NOT_FOUND, MYF(0), key->period.str);
> + return true;
> + }
> + if (thd->work_part_info)
> + {
> + // Unfortunately partitions do not support searching upper/lower bounds
> + // (i.e. ha_index_read_map with KEY_OR_PREV, KEY_OR_NEXT)
> + my_error(ER_FEATURE_NOT_SUPPORTED_WITH_PARTITIONING, MYF(0),
> + "WITHOUT OVERLAPS");
> + return true;
> + }
> + const auto &period_start= create_info->period_info.period.start;
> + const auto &period_end= create_info->period_info.period.end;
> + List_iterator<Key_part_spec> part_it(key->columns);
> + while (Key_part_spec *key_part= part_it++)
> + {
> + if (period_start.streq(key_part->field_name)
> + || period_end.streq(key_part->field_name))
> + {
> + my_error(ER_KEY_CONTAINS_PERIOD_FIELDS, MYF(0), key->name.str,
> + key_part->field_name);
> + return true;
> + }
> + }
> + key->columns.push_back(new Key_part_spec(&period_end, 0));
> + key->columns.push_back(new Key_part_spec(&period_start, 0));
> + }
> }
>
> return false;
> diff --git a/sql/handler.cc b/sql/handler.cc
> index 7d61252eea6..917386f4392 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -4173,6 +4173,11 @@ uint handler::get_dup_key(int error)
> if (table->s->long_unique_table && table->file->errkey < table->s->keys)
> DBUG_RETURN(table->file->errkey);
> table->file->errkey = (uint) -1;
> + if (overlaps_error_key != -1)
> + {
> + table->file->errkey= (uint)overlaps_error_key;
> + DBUG_RETURN(table->file->errkey);
> + }
Why do you need overlaps_error_key? it looks like you
can store the conflicting key number directly in errkey
and it's somewhat confusing that this method uses table->file->errkey
instead of just errkey. It's totally not clear why it does that.
Looks like some historical thing from 2000.
> if (error == HA_ERR_FOUND_DUPP_KEY ||
> error == HA_ERR_FOREIGN_DUPLICATE_KEY ||
> error == HA_ERR_FOUND_DUPP_UNIQUE || error == HA_ERR_NULL_IN_SPATIAL ||
> @@ -6563,10 +6576,12 @@ static int check_duplicate_long_entry_key(TABLE *table, handler *h,
> unique constraint on long columns.
> @returns 0 if no duplicate else returns error
> */
> -static int check_duplicate_long_entries(TABLE *table, handler *h,
> - const uchar *new_rec)
> +int handler::check_duplicate_long_entries(const uchar *new_rec)
> {
> - table->file->errkey= -1;
> + if (this->inited == RND)
> + create_lookup_handler();
1. and if inited==INDEX ?
2. why 'this->' ?
3. generally it's not a good idea to check inited==RND and lookup_handler==NULL
inside a loop per row, because these values cannot change in the middle
of a scan. Better to check them once, when inited is being set to RND.
> + handler *h= lookup_handler ? lookup_handler : table->file;
why table->file and not this?
> + errkey= -1;
> int result;
> for (uint i= 0; i < table->s->keys; i++)
> {
> @@ -6935,6 +6956,142 @@ void handler::set_lock_type(enum thr_lock_type lock)
> table->reginfo.lock_type= lock;
> }
>
> +/**
> + @brief clone of current handler.
> + Creates a clone of handler used for unique hash key and WITHOUT OVERLAPS.
> + @return error code
> +*/
> +int handler::create_lookup_handler()
> +{
> + if (lookup_handler)
> + return 0;
> + lookup_handler= clone(table_share->normalized_path.str,
> + table->in_use->mem_root);
> + int error= lookup_handler->ha_external_lock(table->in_use, F_RDLCK);
> + return error;
> +}
> +
> +int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data)
> +{
> + DBUG_ASSERT(new_data);
> + if (!table_share->period.unique_keys)
> + return 0;
> + if (table->versioned() && !table->vers_end_field()->is_max())
> + return 0;
> +
> + bool is_update= old_data != NULL;
> + if (!check_overlaps_buffer)
> + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root,
> + table_share->max_unique_length
> + + table_share->reclength);
check_overlaps_buffer is per handler. It should be allocated
in the TABLE::mem_root, not TABLE_SHARE::mem_root
Also, it should probably be called lookup_buffer and
check_duplicate_long_entry_key() should use it too.
> + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length;
> + auto *handler= this;
> + // handler->inited can be NONE on INSERT
> + if (handler->inited != NONE)
> + {
> + create_lookup_handler();
> + handler= lookup_handler;
> +
> + // Needs to compare record refs later is old_row_found()
> + if (is_update)
> + position(old_data);
> + }
> +
> + // Save and later restore this handler's keyread
> + int old_this_keyread= this->keyread;
Again, why do you save/restore this->keyread?
If you're using lookup_handler below, then this->keyread doesn't matter.
And if handler==this below, then this->inited==NONE, which implies no keyread.
Either way, handler->keyread_enabled() should always be false here.
What about DBUG_ASSERT(!handler->keyread_enabled()) ?
> + DBUG_ASSERT(this->ha_end_keyread() == 0);
please fix all cases where you used a function with side effects
inside an assert. not just the one I've commented about
> +
> + int error= 0;
> +
> + for (uint key_nr= 0; key_nr < table_share->keys && !error; key_nr++)
> + {
> + const KEY &key_info= table->key_info[key_nr];
> + const uint key_parts= key_info.user_defined_key_parts;
> + if (!key_info.without_overlaps)
> + continue;
> +
> + if (is_update)
> + {
> + bool key_used= false;
> + for (uint k= 0; k < key_parts && !key_used; k++)
> + key_used= bitmap_is_set(table->write_set,
> + key_info.key_part[k].fieldnr - 1);
> + if (!key_used)
> + continue;
> + }
> +
> + error= handler->ha_index_init(key_nr, 0);
> + if (error)
> + return error;
we should try to minimize number of index_init/index_end,
it doesn't look like a cheap operation, at least in InnoDB.
but in a separate commit, because it should cover long uniques too.
> +
> + error= handler->ha_start_keyread(key_nr);
> + DBUG_ASSERT(!error);
> +
> + const uint period_field_length= key_info.key_part[key_parts - 1].length;
> + const uint key_base_length= key_info.key_length - 2 * period_field_length;
> +
> + key_copy(check_overlaps_buffer, new_data, &key_info, 0);
> +
> + /* Copy period_start to period_end.
> + the value in period_start field is not significant, but anyway let's leave
> + it defined to avoid uninitialized memory access
> + */
> + memcpy(check_overlaps_buffer + key_base_length,
> + check_overlaps_buffer + key_base_length + period_field_length,
> + period_field_length);
> +
> + /* Find row with period_end > (period_start of new_data) */
> + error = handler->ha_index_read_map(record_buffer,
> + check_overlaps_buffer,
> + key_part_map((1 << (key_parts - 1)) - 1),
> + HA_READ_AFTER_KEY);
> +
> + if (!error && is_update)
> + {
> + /* In case of update it could happen that the nearest neighbour is
> + a record we are updating. It means, that there are no overlaps
> + from this side.
> +
> + An assumption is made that during update we always have the last
> + fetched row in old_data. Therefore, comparing ref's is enough
> + */
> + DBUG_ASSERT(handler != this);
> + DBUG_ASSERT(inited != NONE);
> + DBUG_ASSERT(ref_length == handler->ref_length);
> +
> + handler->position(record_buffer);
> + if (memcmp(ref, handler->ref, ref_length) == 0)
> + error= handler->ha_index_next(record_buffer);
> + }
> +
> + if (!error && table->check_period_overlaps(key_info, key_info,
> + new_data, record_buffer) == 0)
> + error= HA_ERR_FOUND_DUPP_KEY;
> +
> + if (error == HA_ERR_KEY_NOT_FOUND || error == HA_ERR_END_OF_FILE)
> + error= 0;
> +
> + if (error == HA_ERR_FOUND_DUPP_KEY)
> + overlaps_error_key= key_nr;
> +
> + int end_error= handler->ha_end_keyread();
> + DBUG_ASSERT(!end_error);
> +
> + end_error= handler->ha_index_end();
> + if (!error && end_error)
> + error= end_error;
> + }
> +
> + // Restore keyread of this handler, if it was enabled
> + if (old_this_keyread < MAX_KEY)
> + {
> + error= this->ha_start_keyread(old_this_keyread);
> + DBUG_ASSERT(error == 0);
> + }
> +
> + return error;
> +}
> +
> #ifdef WITH_WSREP
> /**
> @details
> diff --git a/sql/table.cc b/sql/table.cc
> index 718efa5767c..65fc44458f4 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -1499,6 +1500,14 @@ static size_t extra2_read_len(const uchar **extra2, const uchar *extra2_end)
> return length;
> }
>
> +static
> +bool fill_unique_extra2(const uchar *extra2, size_t len, LEX_CUSTRING *section)
thanks, good point.
a bit confusing name, I thought it's something about UNIQUE
particularly as this is what the whole patch is about :)
What about read_extra2_section_once() or something like that?
Or get_ or consume_ or store_ ?
> +{
> + if (section->str)
> + return true;
> + *section= {extra2, len};
> + return false;
> +}
>
> static
> bool read_extra2(const uchar *frm_image, size_t len, extra2_fields *fields)
> @@ -1725,11 +1725,26 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
> keyinfo= &first_keyinfo;
> thd->mem_root= &share->mem_root;
>
> + auto err= [thd, share, &handler_file, &se_plugin, old_root](){
> + share->db_plugin= NULL;
> + share->error= OPEN_FRM_CORRUPTED;
> + share->open_errno= my_errno;
> + delete handler_file;
> + plugin_unlock(0, se_plugin);
> + my_hash_free(&share->name_hash);
> +
> + if (!thd->is_error())
> + open_table_error(share, OPEN_FRM_CORRUPTED, share->open_errno);
> +
> + thd->mem_root= old_root;
> + return HA_ERR_NOT_A_TABLE;
> + };
> +
Okay. I kind of see your point, but let's postpone this refactoring
until after 10.5.2. I still need some time to think about it.
You'll still be able to push it later.
> if (write && write_frm_image(frm_image, frm_length))
> - goto err;
> + DBUG_RETURN(err());
>
> if (frm_length < FRM_HEADER_SIZE + FRM_FORMINFO_SIZE)
> - goto err;
> + DBUG_RETURN(err());
>
> share->frm_version= frm_image[2];
> /*
> @@ -8603,6 +8626,21 @@ void TABLE::evaluate_update_default_function()
> DBUG_VOID_RETURN;
> }
>
> +/**
> + Compare two keys with periods
> + @return -1, lhs precedes rhs
> + 0, lhs overlaps rhs
> + 1, lhs succeeds rhs
> + */
> +int TABLE::check_period_overlaps(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs)
> +{
> + int cmp_res= key_period_compare_bases(lhs_key, rhs_key, lhs, rhs);
You only use key_period_compare_bases here. I just wouldn't create
a separate small function for that, but rather inline it here.
> + if (cmp_res)
> + return cmp_res;
> +
> + return key_period_compare_periods(lhs_key, rhs_key, lhs, rhs);
same for key_period_compare_periods
> +}
>
> void TABLE::vers_update_fields()
> {
> diff --git a/sql/key.cc b/sql/key.cc
> index 9dbb7a15726..49e97faea22 100644
> --- a/sql/key.cc
> +++ b/sql/key.cc
> @@ -896,3 +897,51 @@ bool key_buf_cmp(KEY *key_info, uint used_key_parts,
> }
> return FALSE;
> }
> +
> +
> +/**
> + Compare base parts (not including the period) of keys with period
> + @return -1, lhs less than rhs
> + 0, lhs equals rhs
> + 1, lhs more than rhs
> + */
> +int key_period_compare_bases(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs)
> +{
> + uint base_part_nr= lhs_key.user_defined_key_parts - 2;
Maybe give this '2' a name, like 'period_key_parts'?
And use it everywhere, of course, not just here.
> + int cmp_res= 0;
> + for (uint part_nr= 0; !cmp_res && part_nr < base_part_nr; part_nr++)
> + {
> + Field *f= lhs_key.key_part[part_nr].field;
> + cmp_res= f->cmp(f->ptr_in_record(lhs),
> + rhs_key.key_part[part_nr].field->ptr_in_record(rhs));
This doesn't seem to be handling NULLs
(see the next review)
> + }
> +
> + return cmp_res;
> +}
> +
> +/**
> + Compare periods of two keys
> + @return -1, lhs preceeds rhs
> + 0, lhs overlaps rhs
> + 1, lhs succeeds rhs
> + */
> +int key_period_compare_periods(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs)
> +{
> + uint period_start= lhs_key.user_defined_key_parts - 1;
> + uint period_end= lhs_key.user_defined_key_parts - 2;
> +
> + const auto *f= lhs_key.key_part[period_start].field;
> + const uchar *l[]= {lhs_key.key_part[period_start].field->ptr_in_record(lhs),
> + rhs_key.key_part[period_start].field->ptr_in_record(rhs)};
> +
> + const uchar *r[]= {lhs_key.key_part[period_end].field->ptr_in_record(lhs),
> + rhs_key.key_part[period_end].field->ptr_in_record(rhs)};
I'd still prefer names like 'ls', 'le', 'rs', 're'. Now I need to look up
and keep in mind that 'r[0] is the left key's period end', etc.
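Just to illustrate how the comparison reads with such names, a tiny standalone
sketch (plain ints stand in for the Field values here, and cmp() below is only
a stand-in for Field::cmp(); this is not the actual code):

  #include <cassert>

  static int cmp(int a, int b) { return a < b ? -1 : (a > b ? 1 : 0); }

  /* ls/le: left key's period start/end; rs/re: right key's period start/end */
  static int compare_periods(int ls, int le, int rs, int re)
  {
    if (cmp(le, rs) <= 0) return -1;  /* lhs ends before rhs starts: precedes */
    if (cmp(ls, re) >= 0) return 1;   /* lhs starts after rhs ends: succeeds  */
    return 0;                         /* otherwise the periods overlap        */
  }

  int main()
  {
    assert(compare_periods(1, 3, 3, 5) == -1);  /* [1,3) precedes [3,5) */
    assert(compare_periods(1, 4, 3, 5) == 0);   /* [1,4) overlaps [3,5) */
    assert(compare_periods(5, 7, 3, 5) == 1);   /* [5,7) succeeds [3,5) */
    return 0;
  }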
> +
> + if (f->cmp(r[0], l[1]) <= 0)
> + return -1;
> + if (f->cmp(l[0], r[1]) >= 0)
> + return 1;
> + return 0;
> +}
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

[Maria-developers] 9ae015878f1: MDEV-10047: table-based master info repository
by sujatha 10 Mar '20
revision-id: 9ae015878f11be3e3033fd1b35357ea5927c6c51 (mariadb-10.5.0-329-g9ae015878f1)
parent(s): b753ac066bc26acda9deb707a31c112f1bbf9ec2
author: Sujatha
committer: Sujatha
timestamp: 2020-03-10 15:55:50 +0530
message:
MDEV-10047: table-based master info repository
Problem:
=======
When upgrading from "mysql" to "mariadb", if the slave is using table-based
repositories, their data is completely ignored and no warning is issued in the
error log.
Fix:
===
"mysql_upgrade" should check for the presence of data in the
"mysql.slave_master_info" and "mysql.slave_relay_log_info" tables. When the
tables contain data, the upgrade script should report a warning which hints to
users that the data in the repository tables will be ignored.
---
client/mysql_upgrade.c | 61 +++++++++-
.../main/rpl_mysql_upgrade_slave_repo_check.result | 33 ++++++
.../main/rpl_mysql_upgrade_slave_repo_check.test | 127 +++++++++++++++++++++
3 files changed, 220 insertions(+), 1 deletion(-)
diff --git a/client/mysql_upgrade.c b/client/mysql_upgrade.c
index 4e17089593f..bea82c2a112 100644
--- a/client/mysql_upgrade.c
+++ b/client/mysql_upgrade.c
@@ -1014,6 +1014,64 @@ static int install_used_engines(void)
return 0;
}
+static int check_slave_repositories(void)
+{
+ DYNAMIC_STRING ds_result;
+ int row_count= 0;
+ int error= 0;
+ const char *query = "SELECT COUNT(*) AS c1 FROM mysql.slave_master_info";
+
+ if (init_dynamic_string(&ds_result, "", 512, 512))
+ die("Out of memory");
+
+ run_query(query, &ds_result, TRUE);
+
+ if (ds_result.length)
+ {
+ row_count= atoi((char *)ds_result.str);
+ if (row_count)
+ {
+ fprintf(stderr,"Slave info repository compatibility check:"
+ " Found data in `mysql`.`slave_master_info` table.\n");
+ fprintf(stderr,"Warning: Content of `mysql`.`slave_master_info` table"
+ " will be ignored as MariaDB supports file based info "
+ "repository.\n");
+ error= 1;
+ }
+ }
+ dynstr_free(&ds_result);
+
+ query = "SELECT COUNT(*) AS c1 FROM mysql.slave_relay_log_info";
+
+ if (init_dynamic_string(&ds_result, "", 512, 512))
+ die("Out of memory");
+
+ run_query(query, &ds_result, TRUE);
+
+ if (ds_result.length)
+ {
+ row_count= atoi((char *)ds_result.str);
+ if (row_count)
+ {
+ fprintf(stderr, "Slave info repository compatibility check:"
+ " Found data in `mysql`.`slave_relay_log_info` table.\n");
+ fprintf(stderr, "Warning: Content of `mysql`.`slave_relay_log_info` "
+ "table will be ignored as MariaDB supports file based "
+ "repository.\n");
+ error= 1;
+ }
+ }
+ dynstr_free(&ds_result);
+ if (error)
+ {
+ fprintf(stderr,"Slave server may not possess the correct replication "
+ "metadata.\n");
+ fprintf(stderr, "Execution of CHANGE MASTER as per "
+ "`mysql`.`slave_master_info` and `mysql`.`slave_relay_log_info` "
+ "table content is recommended.\n");
+ }
+ return 0;
+}
/*
Update all system tables in MySQL Server to current
@@ -1225,7 +1283,8 @@ int main(int argc, char **argv)
run_mysqlcheck_views() ||
run_sql_fix_privilege_tables() ||
run_mysqlcheck_fixnames() ||
- run_mysqlcheck_upgrade(FALSE))
+ run_mysqlcheck_upgrade(FALSE) ||
+ check_slave_repositories())
die("Upgrade failed" );
verbose("Phase %d/%d: Running 'FLUSH PRIVILEGES'", ++phase, phases_total);
diff --git a/mysql-test/main/rpl_mysql_upgrade_slave_repo_check.result b/mysql-test/main/rpl_mysql_upgrade_slave_repo_check.result
new file mode 100644
index 00000000000..87cc9ab5a24
--- /dev/null
+++ b/mysql-test/main/rpl_mysql_upgrade_slave_repo_check.result
@@ -0,0 +1,33 @@
+include/master-slave.inc
+[connection master]
+********************************************************************
+* Test case1: Upgrade when repository tables have data. *
+* mysql_upgrade script should report warnings. *
+********************************************************************
+connection master;
+Slave info repository compatibility check: Found data in `mysql`.`slave_master_info` table.
+Warning: Content of `mysql`.`slave_master_info` table will be ignored as MariaDB supports file based info repository.
+Slave info repository compatibility check: Found data in `mysql`.`slave_relay_log_info` table.
+Warning: Content of `mysql`.`slave_relay_log_info` table will be ignored as MariaDB supports file based repository.
+Slave server may not possess the correct replication metadata.
+Execution of CHANGE MASTER as per `mysql`.`slave_master_info` and `mysql`.`slave_relay_log_info` table content is recommended.
+connection slave;
+Slave info repository compatibility check: Found data in `mysql`.`slave_master_info` table.
+Warning: Content of `mysql`.`slave_master_info` table will be ignored as MariaDB supports file based info repository.
+Slave info repository compatibility check: Found data in `mysql`.`slave_relay_log_info` table.
+Warning: Content of `mysql`.`slave_relay_log_info` table will be ignored as MariaDB supports file based repository.
+Slave server may not possess the correct replication metadata.
+Execution of CHANGE MASTER as per `mysql`.`slave_master_info` and `mysql`.`slave_relay_log_info` table content is recommended.
+connection master;
+TRUNCATE TABLE `mysql`.`slave_master_info`;
+TRUNCATE TABLE `mysql`.`slave_relay_log_info`;
+********************************************************************
+* Test case2: Upgrade when repository tables are empty. *
+* mysql_upgrade script should not report any warning. *
+********************************************************************
+connection master;
+connection slave;
+"====== Clean up ======"
+connection master;
+DROP TABLE `mysql`.`slave_master_info`, `mysql`.`slave_relay_log_info`;
+include/rpl_end.inc
diff --git a/mysql-test/main/rpl_mysql_upgrade_slave_repo_check.test b/mysql-test/main/rpl_mysql_upgrade_slave_repo_check.test
new file mode 100644
index 00000000000..24b5f029e8d
--- /dev/null
+++ b/mysql-test/main/rpl_mysql_upgrade_slave_repo_check.test
@@ -0,0 +1,127 @@
+# ==== Purpose ====
+#
+# While upgrading from "mysql" to "mariadb" if slave info repositories are
+# configured to be tables then appropriate warnings should be reported.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 1 - On MariaDB server create `mysql`.`slave_master_info` and
+# `mysql.slave_relay_log_info` tables to simulate upgrade from "mysql"
+# to "mariadb" server. Insert data into these tables.
+# 2 - Execute "mysql_upgrade" script and verify that appropriate warning
+# is reported. i.e Warning is to alert user that the data present in
+# repository tables will be ignored.
+# 3 - Truncate these tables. This simulates repositories being file and
+# the tables are empty.
+# 4 - Execute "mysql_upgrade" script and verify that no warnings are
+# reported.
+#
+# ==== References ====
+#
+# MDEV-10047: table-based master info repository
+#
+
+--source include/have_innodb.inc
+--source include/mysql_upgrade_preparation.inc
+--source include/have_binlog_format_mixed.inc
+--source include/master-slave.inc
+
+--write_file $MYSQLTEST_VARDIR/tmp/slave_table_repo_init.sql
+--disable_query_log
+--disable_result_log
+SET SQL_LOG_BIN=0;
+# Table structure extracted from MySQL-5.6.47
+CREATE TABLE `mysql`.`slave_master_info` (
+ `Number_of_lines` int(10) unsigned NOT NULL COMMENT 'Number of lines in the file.',
+ `Master_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'The name of the master binary log currently being read from the master.',
+ `Master_log_pos` bigint(20) unsigned NOT NULL COMMENT 'The master log position of the last read event.',
+ `Host` char(64) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '' COMMENT 'The host name of the master.',
+ `User_name` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The user name used to connect to the master.',
+ `User_password` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The password used to connect to the master.',
+ `Port` int(10) unsigned NOT NULL COMMENT 'The network port used to connect to the master.',
+ `Connect_retry` int(10) unsigned NOT NULL COMMENT 'The period (in seconds) that the slave will wait before trying to reconnect to the master.',
+ `Enabled_ssl` tinyint(1) NOT NULL COMMENT 'Indicates whether the server supports SSL connections.',
+ `Ssl_ca` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The file used for the Certificate Authority (CA) certificate.',
+ `Ssl_capath` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The path to the Certificate Authority (CA) certificates.',
+ `Ssl_cert` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The name of the SSL certificate file.',
+ `Ssl_cipher` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The name of the cipher in use for the SSL connection.',
+ `Ssl_key` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The name of the SSL key file.',
+ `Ssl_verify_server_cert` tinyint(1) NOT NULL COMMENT 'Whether to verify the server certificate.',
+ `Heartbeat` float NOT NULL,
+ `Bind` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'Displays which interface is employed when connecting to the MySQL server',
+ `Ignored_server_ids` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The number of server IDs to be ignored, followed by the actual server IDs',
+ `Uuid` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The master server uuid.',
+ `Retry_count` bigint(20) unsigned NOT NULL COMMENT 'Number of reconnect attempts, to the master, before giving up.',
+ `Ssl_crl` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The file used for the Certificate Revocation List (CRL)',
+ `Ssl_crlpath` text CHARACTER SET utf8 COLLATE utf8_bin COMMENT 'The path used for Certificate Revocation List (CRL) files',
+ `Enabled_auto_position` tinyint(1) NOT NULL COMMENT 'Indicates whether GTIDs will be used to retrieve events from the master.',
+ PRIMARY KEY (`Host`,`Port`)
+) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='Master Information';
+
+INSERT INTO `mysql`.`slave_master_info` VALUES (23,'master-bin.000001', 120, 'localhost', 'root'," ", 13000, 60, 0," "," "," "," "," ",0 , 60," ", " ", '28e10fdd-6289-11ea-aab9-207918567a34',10," "," ", 0 );
+
+# Table structure extracted from MySQL-5.6.47
+CREATE TABLE `mysql`.`slave_relay_log_info` (
+ `Number_of_lines` int(10) unsigned NOT NULL COMMENT 'Number of lines in the file or rows in the table. Used to version table definitions.',
+ `Relay_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'The name of the current relay log file.',
+ `Relay_log_pos` bigint(20) unsigned NOT NULL COMMENT 'The relay log position of the last executed event.',
+ `Master_log_name` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL COMMENT 'The name of the master binary log file from which the events in the relay log file were read.',
+ `Master_log_pos` bigint(20) unsigned NOT NULL COMMENT 'The master log position of the last executed event.',
+ `Sql_delay` int(11) NOT NULL COMMENT 'The number of seconds that the slave must lag behind the master.',
+ `Number_of_workers` int(10) unsigned NOT NULL,
+ `Id` int(10) unsigned NOT NULL COMMENT 'Internal Id that uniquely identifies this record.',
+ PRIMARY KEY (`Id`)
+) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='Relay Log Information';
+
+INSERT INTO `mysql`.`slave_relay_log_info` VALUES (7,'./slave-relay-bin.000001',4 ," ",0, 0 ,0 , 1);
+SET SQL_LOG_BIN=1;
+--enable_query_log
+--enable_result_log
+EOF
+
+--echo ********************************************************************
+--echo * Test case1: Upgrade when repository tables have data. *
+--echo * mysql_upgrade script should report warnings. *
+--echo ********************************************************************
+--connection master
+--source $MYSQLTEST_VARDIR/tmp/slave_table_repo_init.sql
+--exec $MYSQL_UPGRADE --skip-verbose --force --user=root > $MYSQLTEST_VARDIR/log/mysql_upgrade_master.log 2>&1
+--cat_file $MYSQLTEST_VARDIR/log/mysql_upgrade_master.log
+
+--connection slave
+--source $MYSQLTEST_VARDIR/tmp/slave_table_repo_init.sql
+--exec $MYSQL_UPGRADE --skip-verbose --force --user=root > $MYSQLTEST_VARDIR/log/mysql_upgrade_slave.log 2>&1
+--cat_file $MYSQLTEST_VARDIR/log/mysql_upgrade_slave.log
+
+--connection master
+let $datadir= `select @@datadir`;
+remove_file $datadir/mysql_upgrade_info;
+TRUNCATE TABLE `mysql`.`slave_master_info`;
+TRUNCATE TABLE `mysql`.`slave_relay_log_info`;
+--remove_file $MYSQLTEST_VARDIR/log/mysql_upgrade_master.log
+--remove_file $MYSQLTEST_VARDIR/log/mysql_upgrade_slave.log
+
+--echo ********************************************************************
+--echo * Test case2: Upgrade when repository tables are empty. *
+--echo * mysql_upgrade script should not report any warning. *
+--echo ********************************************************************
+--connection master
+--exec $MYSQL_UPGRADE --skip-verbose --force --user=root > $MYSQLTEST_VARDIR/log/mysql_upgrade_master.log 2>&1
+--cat_file $MYSQLTEST_VARDIR/log/mysql_upgrade_master.log
+
+--connection slave
+--exec $MYSQL_UPGRADE --skip-verbose --force --user=root > $MYSQLTEST_VARDIR/log/mysql_upgrade_slave.log 2>&1
+--cat_file $MYSQLTEST_VARDIR/log/mysql_upgrade_slave.log
+
+--echo "====== Clean up ======"
+--connection master
+let $datadir= `select @@datadir`;
+remove_file $datadir/mysql_upgrade_info;
+DROP TABLE `mysql`.`slave_master_info`, `mysql`.`slave_relay_log_info`;
+
+--remove_file $MYSQLTEST_VARDIR/tmp/slave_table_repo_init.sql
+--remove_file $MYSQLTEST_VARDIR/log/mysql_upgrade_master.log
+--remove_file $MYSQLTEST_VARDIR/log/mysql_upgrade_slave.log
+
+--source include/rpl_end.inc

[Maria-developers] Please review MDEV-17832 Protocol: extensions for Pluggable types and JSON, GEOMETRY
by Alexander Barkov 06 Mar '20
Hi, Sergei, Georg,
Please review a fixed version of the patch for MDEV-17832.
There are two files attached:
- mdev-17821.v18.diff (server changes)
- mdev-17821-cli.v06.diff (libmariadb changes)
Compared to the previous version, this version:
1. Adds a new structure MA_FIELD_EXTENSION
2. Moves extended data type information from MYSQL_FIELD
to MYSQL_FIELD::extension in the client-server implementation.
Note, in case of embedded server, the extended metadata
is stored directly to MYSQL_FIELD.
3. Adds a new API function mariadb_field_metadata_attr(),
to extract metadata from MYSQL_FIELD.
4. Changes the way the metadata is packed on the wire
from "easily human readable" to "easily parseable", which:
- makes things faster
- allows transferring arbitrary binary data in the future, if needed.
Every metadata chunk is now encoded as (see the small decoding sketch below):
a. chunk type (1 byte)
b. chunk data length (1 byte)
c. chunk data (according to #b)
For now, two chunk types are implemented:
- data type name (used for GEOMETRY sub-types, and for INET6)
- format name (for JSON)
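To illustrate the framing, a minimal standalone sketch of walking such a chunk
sequence (the chunk type values and the sample buffer are made up for this
example; this is not the actual connector code):

  #include <stdint.h>
  #include <stdio.h>

  int main()
  {
    /* hypothetical sequence: one "data type name" chunk carrying "inet6" */
    const uint8_t buf[]= { 0x00, 5, 'i', 'n', 'e', 't', '6' };
    const uint8_t *p= buf, *end= buf + sizeof(buf);

    while (p + 2 <= end)                 /* 1 byte type + 1 byte data length */
    {
      uint8_t type= p[0];
      uint8_t data_len= p[1];
      if (p + 2 + data_len > end)
        break;                           /* malformed: data runs past the end */
      printf("chunk type %u, data '%.*s'\n",
             (unsigned) type, (int) data_len, (const char *) (p + 2));
      p+= 2 + data_len;
    }
    return 0;
  }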
Thanks!

[Maria-developers] Fwd: [andrei.elkin@mariadb.com] 30a3ee1b8ce: MDEV-21469: Implement crash-safe logging of the user XA
by andrei.elkin@pp.inet.fi 04 Mar '20
Kristian,
Fyi, here is the XA replication event recovery part.
It's largely coded by Sujatha Sivakumar. The latest patch resides in bb-10.5-MDEV_21469.
Cheers,
Andrei
revision-id: 30a3ee1b8ce98ea33dfc0595207bf467cdb91def (mariadb-10.5.0-285-g30a3ee1b8ce)
parent(s): 77f4a1938f2a069174e07b3453fa0f99ba171a2e
author: Sujatha
committer: Andrei Elkin
timestamp: 2020-03-04 14:45:01 +0200
message:
MDEV-21469: Implement crash-safe logging of the user XA
Description: Make XA PREPARE, XA COMMIT and XA ROLLBACK statements crash-safe.
Implementation:
In order to ensure consistent replication, XA statements like XA PREPARE, XA
COMMIT and XA ROLLBACK are first written into the binary log and then to the
storage engine. If the server crashes after writing to the binary log but
before the storage engine, this leads to an inconsistent state.
In order to make both the binary log and the engine consistent, crash recovery
needs to be initiated. During crash recovery the binary log is parsed to
identify the transactions which are present only in the binary log and not in
the engine. These transactions are resubmitted to the engine to make it
consistent with the binary log.
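A rough standalone illustration of that recovery idea (the names and the use
of std::set are made up for this sketch; it is not the server's actual code):

  #include <iostream>
  #include <set>
  #include <string>

  /* hypothetical helper: re-submit the XA PREPARE to the storage engine */
  static void redo_prepare_in_engine(const std::string &xid)
  {
    std::cout << "re-preparing " << xid << " in the engine\n";
  }

  static void xa_recover_sketch(const std::set<std::string> &binlog_prepared,
                                const std::set<std::string> &engine_prepared)
  {
    /* whatever the binlog says was prepared, but the engine does not know
       about, has to be replayed so that both sides end up consistent */
    for (const std::string &xid : binlog_prepared)
      if (engine_prepared.find(xid) == engine_prepared.end())
        redo_prepare_in_engine(xid);
  }

  int main()
  {
    xa_recover_sketch({"xa1", "xa2"}, {"xa1"});  /* only "xa2" gets replayed */
    return 0;
  }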
---
.../r/binlog_xa_multi_binlog_crash_recovery.result | 40 +++
.../t/binlog_xa_multi_binlog_crash_recovery.test | 85 +++++
.../suite/rpl/r/rpl_xa_commit_crash_safe.result | 50 +++
.../suite/rpl/r/rpl_xa_event_apply_failure.result | 60 ++++
.../rpl/r/rpl_xa_prepare_commit_prepare.result | 48 +++
.../suite/rpl/r/rpl_xa_prepare_crash_safe.result | 62 ++++
.../rpl/r/rpl_xa_rollback_commit_crash_safe.result | 47 +++
.../suite/rpl/t/rpl_xa_commit_crash_safe.test | 98 ++++++
.../suite/rpl/t/rpl_xa_event_apply_failure.test | 119 +++++++
.../suite/rpl/t/rpl_xa_prepare_commit_prepare.test | 95 ++++++
.../suite/rpl/t/rpl_xa_prepare_crash_safe.test | 117 +++++++
.../rpl/t/rpl_xa_rollback_commit_crash_safe.test | 97 ++++++
sql/handler.cc | 29 +-
sql/handler.h | 12 +-
sql/log.cc | 379 +++++++++++++++++++--
sql/log.h | 8 +-
sql/log_event.cc | 18 +-
sql/log_event.h | 15 +-
sql/mysqld.cc | 4 +-
sql/xa.cc | 4 +
20 files changed, 1353 insertions(+), 34 deletions(-)
diff --git a/mysql-test/suite/binlog/r/binlog_xa_multi_binlog_crash_recovery.result b/mysql-test/suite/binlog/r/binlog_xa_multi_binlog_crash_recovery.result
new file mode 100644
index 00000000000..a09472aa94c
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_xa_multi_binlog_crash_recovery.result
@@ -0,0 +1,40 @@
+RESET MASTER;
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+connect con1,localhost,root,,;
+SET DEBUG_SYNC= "simulate_hang_after_binlog_prepare SIGNAL con1_ready WAIT_FOR con1_go";
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_prepare";
+XA START 'xa1';
+INSERT INTO t1 SET a=1;
+XA END 'xa1';
+XA PREPARE 'xa1';;
+connection default;
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+FLUSH LOGS;
+FLUSH LOGS;
+FLUSH LOGS;
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+master-bin.000002 #
+master-bin.000003 #
+master-bin.000004 #
+include/show_binlog_events.inc
+Log_name Pos Event_type Server_id End_log_pos Info
+master-bin.000004 # Format_desc # # SERVER_VERSION, BINLOG_VERSION
+master-bin.000004 # Gtid_list # # [#-#-#]
+master-bin.000004 # Binlog_checkpoint # # master-bin.000001
+SET DEBUG_SYNC= "now SIGNAL con1_go";
+connection con1;
+ERROR HY000: Lost connection to MySQL server during query
+connection default;
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 3 0 xa1
+XA COMMIT 'xa1';
+SELECT * FROM t1;
+a b
+1 NULL
+connection default;
+DROP TABLE t1;
+SET debug_sync = 'reset';
diff --git a/mysql-test/suite/binlog/t/binlog_xa_multi_binlog_crash_recovery.test b/mysql-test/suite/binlog/t/binlog_xa_multi_binlog_crash_recovery.test
new file mode 100644
index 00000000000..aa8d3d04fc7
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_xa_multi_binlog_crash_recovery.test
@@ -0,0 +1,85 @@
+# ==== Purpose ====
+#
+# Test verifies that XA crash recovery works fine across multiple binary logs.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Generate an explicit XA transaction. Using debug simulation hold the
+# execution of XA PREPARE statement after the XA PREPARE is written to
+# the binary log. With this the prepare will not be done in engine.
+# 1 - By executing FLUSH LOGS generate multiple binary logs.
+# 2 - Now make the server to disappear at this point.
+# 3 - Restart the server. During recovery the XA PREPARE from the binary
+# log will be read. It is cross checked with engine. Since it is not
+# present in engine it will be executed once again.
+# 4 - When server is up execute XA RECOVER to check that the XA is
+# prepared in engine as well.
+# 5 - XA COMMIT the transaction and check the validity of the data.
+#
+# ==== References ====
+#
+# MDEV-21469: Implement crash-safe logging of the user XA
+#
+
+--source include/have_innodb.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+--source include/have_log_bin.inc
+
+RESET MASTER;
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+
+connect(con1,localhost,root,,);
+SET DEBUG_SYNC= "simulate_hang_after_binlog_prepare SIGNAL con1_ready WAIT_FOR con1_go";
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_prepare";
+XA START 'xa1';
+INSERT INTO t1 SET a=1;
+XA END 'xa1';
+--send XA PREPARE 'xa1';
+
+connection default;
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+FLUSH LOGS;
+FLUSH LOGS;
+FLUSH LOGS;
+
+--source include/show_binary_logs.inc
+--let $binlog_file= master-bin.000004
+--let $binlog_start= 4
+--source include/show_binlog_events.inc
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+SET DEBUG_SYNC= "now SIGNAL con1_go";
+--source include/wait_until_disconnected.inc
+
+--connection con1
+--error 2013
+--reap
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+XA RECOVER;
+XA COMMIT 'xa1';
+
+SELECT * FROM t1;
+
+# Clean up.
+connection default;
+DROP TABLE t1;
+SET debug_sync = 'reset';
diff --git a/mysql-test/suite/rpl/r/rpl_xa_commit_crash_safe.result b/mysql-test/suite/rpl/r/rpl_xa_commit_crash_safe.result
new file mode 100644
index 00000000000..27d043270ea
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_commit_crash_safe.result
@@ -0,0 +1,50 @@
+include/master-slave.inc
+[connection master]
+connect master2,localhost,root,,;
+connection master;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+connection slave;
+include/stop_slave.inc
+connection master1;
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+XA PREPARE 'xa2';
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_commit";
+XA COMMIT 'xa2';
+ERROR HY000: Lost connection to MySQL server during query
+connection master1;
+connection master;
+connection default;
+connection server_1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+SELECT * FROM t;
+f
+20
+40
+XA RECOVER;
+formatID gtrid_length bqual_length data
+XA COMMIT 'xa2';
+ERROR XAE04: XAER_NOTA: Unknown XID
+SELECT * FROM t;
+f
+20
+40
+connection slave;
+SELECT * FROM t;
+f
+20
+40
+connection master;
+DROP TABLE t;
+connection slave;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_event_apply_failure.result b/mysql-test/suite/rpl/r/rpl_xa_event_apply_failure.result
new file mode 100644
index 00000000000..547a0aae9a6
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_event_apply_failure.result
@@ -0,0 +1,60 @@
+include/master-slave.inc
+[connection master]
+connect master2,localhost,root,,;
+connection master;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CALL mtr.add_suppression("Failed to execute binlog query event");
+CALL mtr.add_suppression("Recovery: Error .Out of memory..");
+CALL mtr.add_suppression("Crash recovery failed.");
+CALL mtr.add_suppression("Can.t init tc log");
+CALL mtr.add_suppression("Aborting");
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+connection slave;
+include/stop_slave.inc
+connection master1;
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+XA PREPARE 'xa2';
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_commit";
+XA COMMIT 'xa2';
+ERROR HY000: Lost connection to MySQL server during query
+connection master1;
+connection master;
+connection default;
+connection default;
+connection master;
+*** there must be no 'xa2' commit seen, as it is still prepared:
+SELECT * FROM t;
+f
+20
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 3 0 xa2
+SET GLOBAL DEBUG_DBUG="";
+SET SQL_LOG_BIN=0;
+XA COMMIT 'xa2';
+SET SQL_LOG_BIN=1;
+connection server_1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+SELECT * FROM t;
+f
+20
+40
+connection slave;
+SELECT * FROM t;
+f
+20
+40
+connection master;
+DROP TABLE t;
+connection slave;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_prepare_commit_prepare.result b/mysql-test/suite/rpl/r/rpl_xa_prepare_commit_prepare.result
new file mode 100644
index 00000000000..9ba24716639
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_prepare_commit_prepare.result
@@ -0,0 +1,48 @@
+include/master-slave.inc
+[connection master]
+connect master2,localhost,root,,;
+connection master;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+connection slave;
+include/stop_slave.inc
+connection master1;
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_prepare";
+XA PREPARE 'xa2';
+ERROR HY000: Lost connection to MySQL server during query
+connection master1;
+connection master;
+connection default;
+connection server_1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+SELECT * FROM t;
+f
+20
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 3 0 xa2
+XA COMMIT 'xa2';
+SELECT * FROM t;
+f
+20
+40
+connection slave;
+SELECT * FROM t;
+f
+20
+40
+connection master;
+DROP TABLE t;
+connection slave;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_prepare_crash_safe.result b/mysql-test/suite/rpl/r/rpl_xa_prepare_crash_safe.result
new file mode 100644
index 00000000000..99baf59a3c1
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_prepare_crash_safe.result
@@ -0,0 +1,62 @@
+include/master-slave.inc
+[connection master]
+connect master2,localhost,root,,;
+connection master;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CALL mtr.add_suppression("Found 2 prepared XA transactions");
+CALL mtr.add_suppression("Found 3 prepared XA transactions");
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+connection slave;
+include/stop_slave.inc
+connection master1;
+use test;
+xa start 'xa2';
+insert into t values (30);
+xa end 'xa2';
+SET DEBUG_SYNC="simulate_hang_after_binlog_prepare SIGNAL reached WAIT_FOR go";
+xa prepare 'xa2';
+connection master2;
+XA START 'xa3';
+INSERT INTO t VALUES (40);
+XA END 'xa3';
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_prepare";
+XA PREPARE 'xa3';
+ERROR HY000: Lost connection to MySQL server during query
+connection master1;
+ERROR HY000: Lost connection to MySQL server during query
+connection master;
+connection default;
+connection server_1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+SELECT * FROM t;
+f
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 3 0 xa3
+1 3 0 xa1
+1 3 0 xa2
+XA COMMIT 'xa1';
+XA COMMIT 'xa2';
+XA COMMIT 'xa3';
+SELECT * FROM t;
+f
+20
+30
+40
+connection slave;
+SELECT * FROM t;
+f
+20
+30
+40
+connection master;
+DROP TABLE t;
+connection slave;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_rollback_commit_crash_safe.result b/mysql-test/suite/rpl/r/rpl_xa_rollback_commit_crash_safe.result
new file mode 100644
index 00000000000..bc48c84e1c7
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_rollback_commit_crash_safe.result
@@ -0,0 +1,47 @@
+include/master-slave.inc
+[connection master]
+connect master2,localhost,root,,;
+connection master;
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+connection slave;
+include/stop_slave.inc
+connection master1;
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+XA PREPARE 'xa2';
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_rollback";
+XA ROLLBACK 'xa2';
+ERROR HY000: Lost connection to MySQL server during query
+connection master1;
+connection master;
+connection default;
+connection server_1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+SELECT * FROM t;
+f
+20
+XA RECOVER;
+formatID gtrid_length bqual_length data
+XA ROLLBACK 'xa2';
+ERROR XAE04: XAER_NOTA: Unknown XID
+SELECT * FROM t;
+f
+20
+connection slave;
+SELECT * FROM t;
+f
+20
+connection master;
+DROP TABLE t;
+connection slave;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_commit_crash_safe.test b/mysql-test/suite/rpl/t/rpl_xa_commit_crash_safe.test
new file mode 100644
index 00000000000..b9e3b0d3d0d
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_commit_crash_safe.test
@@ -0,0 +1,98 @@
+# ==== Purpose ====
+#
+# Test verifies that XA COMMIT statements are crash safe.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Generate 2 explicit XA transactions. 'xa1' and 'xa2'.
+# 'xa1' will be prepared and committed.
+# 1 - For 'xa2', let the XA COMMIT be written to the binary log and crash
+#     the server so that it is not committed in the engine.
+# 2 - Restart the server. The recovery code should successfully recover
+#     'xa2'. The COMMIT should be executed during recovery.
+# 3 - Check the data in the table. Both rows should be present.
+# 4 - Trying to commit 'xa2' should report an unknown XID error as the
+#     COMMIT was already completed during recovery.
+#
+# ==== References ====
+#
+# MDEV-21469: Implement crash-safe logging of the user XA
+
+
+--source include/have_innodb.inc
+--source include/master-slave.inc
+--source include/have_debug.inc
+
+connect (master2,localhost,root,,);
+--connection master
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+--sync_slave_with_master
+--source include/stop_slave.inc
+
+--connection master1
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+XA PREPARE 'xa2';
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_commit";
+--error 2013 # CR_SERVER_LOST
+XA COMMIT 'xa2';
+--source include/wait_until_disconnected.inc
+
+--connection master1
+--source include/wait_until_disconnected.inc
+
+--connection master
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+# rpl_end.inc needs to use the connection server_1
+connection server_1;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection slave
+--source include/start_slave.inc
+--sync_with_master
+
+--connection master
+SELECT * FROM t;
+XA RECOVER;
+--error 1397 # ER_XAER_NOTA
+XA COMMIT 'xa2';
+SELECT * FROM t;
+--sync_slave_with_master
+
+SELECT * FROM t;
+
+--connection master
+DROP TABLE t;
+--sync_slave_with_master
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_event_apply_failure.test b/mysql-test/suite/rpl/t/rpl_xa_event_apply_failure.test
new file mode 100644
index 00000000000..71d0de0fc56
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_event_apply_failure.test
@@ -0,0 +1,119 @@
+# ==== Purpose ====
+#
+# Test verifies that if for some reason an event cannot be applied during
+# recovery, an appropriate error is reported.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Generate 2 explicit XA transactions. 'xa1' and 'xa2'.
+# 'xa1' will be prepared and committed.
+# 1 - For 'xa2', let the XA COMMIT be written to the binary log and crash
+#     the server so that it is not committed in the engine.
+# 2 - Restart the server. Using a debug simulation point, make the replay
+#     of XA COMMIT 'xa2' fail. The server comes up anyway, leaving the
+#     error in the error log (see "Recovery: Error ...").
+# 3 - Work around the simulated failure by committing once again from a
+#     connection that turns binary logging OFF.
+#     The slave must catch up with the master.
+#
+# ==== References ====
+#
+# MDEV-21469: Implement crash-safe logging of the user XA
+
+
+--source include/have_innodb.inc
+--source include/master-slave.inc
+--source include/have_debug.inc
+
+connect (master2,localhost,root,,);
+--connection master
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CALL mtr.add_suppression("Failed to execute binlog query event");
+CALL mtr.add_suppression("Recovery: Error .Out of memory..");
+CALL mtr.add_suppression("Crash recovery failed.");
+CALL mtr.add_suppression("Can.t init tc log");
+CALL mtr.add_suppression("Aborting");
+
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+--sync_slave_with_master
+--source include/stop_slave.inc
+
+--connection master1
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+XA PREPARE 'xa2';
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_commit";
+--error 2013 # CR_SERVER_LOST
+XA COMMIT 'xa2';
+--source include/wait_until_disconnected.inc
+
+--connection master1
+--source include/wait_until_disconnected.inc
+
+--connection master
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --debug-dbug=d,trans_xa_commit_fail
+EOF
+
+connection default;
+--source include/wait_until_disconnected.inc
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--enable_reconnect
+--echo *** there must be no 'xa2' commit seen, as it is still prepared:
+SELECT * FROM t;
+XA RECOVER;
+
+# Commit it manually now, avoiding an extra binlog record by turning
+# binary logging OFF in this connection.
+
+SET GLOBAL DEBUG_DBUG="";
+SET SQL_LOG_BIN=0;
+--error 0
+XA COMMIT 'xa2';
+SET SQL_LOG_BIN=1;
+
+
+# rpl_end.inc needs to use the connection server_1
+connection server_1;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--source include/wait_until_connected_again.inc
+
+--connection slave
+--source include/start_slave.inc
+--sync_with_master
+
+--connection master
+SELECT * FROM t;
+
+--sync_slave_with_master
+SELECT * FROM t;
+
+--connection master
+DROP TABLE t;
+--sync_slave_with_master
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_prepare_commit_prepare.test b/mysql-test/suite/rpl/t/rpl_xa_prepare_commit_prepare.test
new file mode 100644
index 00000000000..7b987c7f29b
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_prepare_commit_prepare.test
@@ -0,0 +1,95 @@
+# ==== Purpose ====
+#
+# Test verifies that XA PREPARE transactions are crash safe.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Generate 2 explicit XA transactions. 'xa1' and 'xa2'.
+# 'xa1' will be prepared and committed.
+# 1 - For 'xa2', let the XA PREPARE be written to the binary log and crash
+#     the server so that it is not prepared in the engine.
+# 2 - Restart the server. The recovery code should successfully recover
+#     'xa2'.
+# 3 - When the server is up, execute XA RECOVER and verify that 'xa2' is
+#     present.
+# 4 - Commit the XA transaction and verify its correctness.
+#
+# ==== References ====
+#
+# MDEV-21469: Implement crash-safe logging of the user XA
+
+--source include/have_innodb.inc
+--source include/master-slave.inc
+--source include/have_debug.inc
+
+connect (master2,localhost,root,,);
+--connection master
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+--sync_slave_with_master
+--source include/stop_slave.inc
+
+--connection master1
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_prepare";
+--error 2013 # CR_SERVER_LOST
+XA PREPARE 'xa2';
+--source include/wait_until_disconnected.inc
+
+--connection master1
+--source include/wait_until_disconnected.inc
+
+--connection master
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+# rpl_end.inc needs to use the connection server_1
+connection server_1;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection slave
+--source include/start_slave.inc
+--sync_with_master
+
+--connection master
+SELECT * FROM t;
+XA RECOVER;
+XA COMMIT 'xa2';
+SELECT * FROM t;
+--sync_slave_with_master
+
+SELECT * FROM t;
+
+--connection master
+DROP TABLE t;
+--sync_slave_with_master
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_prepare_crash_safe.test b/mysql-test/suite/rpl/t/rpl_xa_prepare_crash_safe.test
new file mode 100644
index 00000000000..9d2c5cce528
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_prepare_crash_safe.test
@@ -0,0 +1,117 @@
+# ==== Purpose ====
+#
+# Test verifies that XA PREPARE transactions are crash safe.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Generate 3 explicit XA transactions. 'xa1', 'xa2' and 'xa3'.
+#     Using a debug sync point, hold the execution of the second XA PREPARE
+#     statement after it has been written to the binary log, so that the
+#     prepare is not done in the engine.
+# 1 - For 'xa3', allow the PREPARE statement to be written to the binary
+#     log and simulate a server crash.
+# 2 - Restart the server. The recovery code should successfully recover
+#     'xa2' and 'xa3'.
+# 3 - When the server is up, execute XA RECOVER and verify that 'xa2' and
+#     'xa3' are present along with 'xa1'.
+# 4 - Commit all the XA transactions and verify their correctness.
+#
+# ==== References ====
+#
+# MDEV-21469: Implement crash-safe logging of the user XA
+
+
+--source include/have_innodb.inc
+--source include/master-slave.inc
+--source include/have_debug.inc
+
+connect (master2,localhost,root,,);
+--connection master
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+CALL mtr.add_suppression("Found 2 prepared XA transactions");
+CALL mtr.add_suppression("Found 3 prepared XA transactions");
+
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+--sync_slave_with_master
+--source include/stop_slave.inc
+
+--connection master1
+use test;
+xa start 'xa2';
+insert into t values (30);
+xa end 'xa2';
+SET DEBUG_SYNC="simulate_hang_after_binlog_prepare SIGNAL reached WAIT_FOR go";
+send xa prepare 'xa2';
+
+--connection master2
+let $wait_condition=
+ SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.PROCESSLIST
+ WHERE STATE like "debug sync point: simulate_hang_after_binlog_prepare%";
+--source include/wait_condition.inc
+
+XA START 'xa3';
+INSERT INTO t VALUES (40);
+XA END 'xa3';
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_prepare";
+--error 2013 # CR_SERVER_LOST
+XA PREPARE 'xa3';
+--source include/wait_until_disconnected.inc
+
+--connection master1
+--error 2013
+--reap
+--source include/wait_until_disconnected.inc
+
+--connection master
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+# rpl_end.inc needs to use the connection server_1
+connection server_1;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+
+--connection slave
+--source include/start_slave.inc
+--sync_with_master
+
+--connection master
+SELECT * FROM t;
+XA RECOVER;
+XA COMMIT 'xa1';
+XA COMMIT 'xa2';
+XA COMMIT 'xa3';
+SELECT * FROM t;
+--sync_slave_with_master
+
+SELECT * FROM t;
+
+--connection master
+DROP TABLE t;
+--sync_slave_with_master
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_rollback_commit_crash_safe.test b/mysql-test/suite/rpl/t/rpl_xa_rollback_commit_crash_safe.test
new file mode 100644
index 00000000000..6416602da5e
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_rollback_commit_crash_safe.test
@@ -0,0 +1,97 @@
+# ==== Purpose ====
+#
+# Test verifies that XA ROLLBACK statements are crash safe.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Generate 2 explicit XA transactions. 'xa1' and 'xa2'.
+# 'xa1' will be prepared and committed.
+# 1 - For 'xa2', let the XA ROLLBACK be written to the binary log and crash
+#     the server so that it is not rolled back in the engine.
+# 2 - Restart the server. The recovery code should successfully recover
+#     'xa2'. The ROLLBACK should be executed during recovery.
+# 3 - Check the data in the table. Only one row should be present.
+# 4 - Trying to roll back 'xa2' should report an unknown XID error as the
+#     rollback was already completed during recovery.
+#
+# ==== References ====
+#
+# MDEV-21469: Implement crash-safe logging of the user XA
+
+--source include/have_innodb.inc
+--source include/master-slave.inc
+--source include/have_debug.inc
+
+connect (master2,localhost,root,,);
+--connection master
+CALL mtr.add_suppression("Found 1 prepared XA transactions");
+
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+XA START 'xa1';
+INSERT INTO t VALUES (20);
+XA END 'xa1';
+XA PREPARE 'xa1';
+XA COMMIT 'xa1';
+--sync_slave_with_master
+--source include/stop_slave.inc
+
+--connection master1
+XA START 'xa2';
+INSERT INTO t VALUES (40);
+XA END 'xa2';
+XA PREPARE 'xa2';
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+SET GLOBAL DEBUG_DBUG="d,simulate_crash_after_binlog_rollback";
+--error 2013 # CR_SERVER_LOST
+XA ROLLBACK 'xa2';
+--source include/wait_until_disconnected.inc
+
+--connection master1
+--source include/wait_until_disconnected.inc
+
+--connection master
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+# rpl_end.inc needs to use the connection server_1
+connection server_1;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection slave
+--source include/start_slave.inc
+--sync_with_master
+
+--connection master
+SELECT * FROM t;
+XA RECOVER;
+--error 1397 # ER_XAER_NOTA
+XA ROLLBACK 'xa2';
+SELECT * FROM t;
+--sync_slave_with_master
+
+SELECT * FROM t;
+
+--connection master
+DROP TABLE t;
+--sync_slave_with_master
+--source include/rpl_end.inc
diff --git a/sql/handler.cc b/sql/handler.cc
index a1719f9b922..c997c52e602 100644
--- a/sql/handler.cc
+++ b/sql/handler.cc
@@ -1290,6 +1290,9 @@ int ha_prepare(THD *thd)
error=1;
break;
}
+ DEBUG_SYNC(thd, "simulate_hang_after_binlog_prepare");
+ DBUG_EXECUTE_IF("simulate_crash_after_binlog_prepare",
+ DBUG_SUICIDE(););
}
else
{
@@ -1795,6 +1798,8 @@ commit_one_phase_2(THD *thd, bool all, THD_TRANS *trans, bool is_real_trans)
++count;
ha_info_next= ha_info->next();
ha_info->reset(); /* keep it conveniently zero-filled */
+ DBUG_EXECUTE_IF("simulate_crash_after_binlog_commit",
+ DBUG_SUICIDE(););
}
trans->ha_list= 0;
trans->no_2pc=0;
@@ -1908,6 +1913,8 @@ int ha_rollback_trans(THD *thd, bool all)
status_var_increment(thd->status_var.ha_rollback_count);
ha_info_next= ha_info->next();
ha_info->reset(); /* keep it conveniently zero-filled */
+ DBUG_EXECUTE_IF("simulate_crash_after_binlog_rollback",
+ DBUG_SUICIDE(););
}
trans->ha_list= 0;
trans->no_2pc=0;
@@ -2107,6 +2114,7 @@ struct xarecover_st
int len, found_foreign_xids, found_my_xids;
XID *list;
HASH *commit_list;
+ HASH *xa_prepared_list;
bool dry_run;
};
@@ -2155,7 +2163,23 @@ static my_bool xarecover_handlerton(THD *unused, plugin_ref plugin,
_db_doprnt_("ignore xid %s", xid_to_str(buf, info->list+i));
});
xid_cache_insert(info->list + i, true);
+ XID *foreign_xid= info->list + i;
info->found_foreign_xids++;
+
+ /*
+        For each foreign xid prepared in the engine, check whether it is
+        present in the xa_prepared_list supplied by the binlog.
+ */
+ if (info->xa_prepared_list)
+ {
+ struct xa_recovery_member *member= NULL;
+ if ((member= (xa_recovery_member *)
+ my_hash_search(info->xa_prepared_list, foreign_xid->key(),
+ foreign_xid->key_length())))
+ {
+ member->in_engine_prepare= true;
+ }
+ }
continue;
}
if (IF_WSREP(!(wsrep_emulate_bin_log &&
@@ -2202,12 +2226,13 @@ static my_bool xarecover_handlerton(THD *unused, plugin_ref plugin,
return FALSE;
}
-int ha_recover(HASH *commit_list)
+int ha_recover(HASH *commit_list, HASH *xa_prepared_list)
{
struct xarecover_st info;
DBUG_ENTER("ha_recover");
info.found_foreign_xids= info.found_my_xids= 0;
info.commit_list= commit_list;
+ info.xa_prepared_list= xa_prepared_list;
info.dry_run= (info.commit_list==0 && tc_heuristic_recover==0);
info.list= NULL;
@@ -2254,7 +2279,7 @@ int ha_recover(HASH *commit_list)
info.found_my_xids, opt_tc_log_file);
DBUG_RETURN(1);
}
- if (info.commit_list)
+ if (info.commit_list && !info.found_foreign_xids)
sql_print_information("Crash recovery finished.");
DBUG_RETURN(0);
}
diff --git a/sql/handler.h b/sql/handler.h
index 92c2a61ed0e..4ff08d7bd08 100644
--- a/sql/handler.h
+++ b/sql/handler.h
@@ -521,6 +521,8 @@ enum legacy_db_type
DB_TYPE_FIRST_DYNAMIC=45,
DB_TYPE_DEFAULT=127 // Must be last
};
+
+enum xa_binlog_state {XA_PREPARE=0, XA_COMPLETE};
/*
Better name for DB_TYPE_UNKNOWN. Should be used for engines that do not have
a hard-coded type value here.
@@ -806,7 +808,6 @@ struct st_system_tablename
const char *tablename;
};
-
typedef ulonglong my_xid; // this line is the same as in log_event.h
#define MYSQL_XID_PREFIX "MySQLXid"
#define MYSQL_XID_PREFIX_LEN 8 // must be a multiple of 8
@@ -898,6 +899,13 @@ struct xid_t {
};
typedef struct xid_t XID;
+struct xa_recovery_member
+{
+ XID xid;
+ enum xa_binlog_state state;
+ bool in_engine_prepare;
+};
+
/* for recover() handlerton call */
#define MIN_XID_LIST_SIZE 128
#define MAX_XID_LIST_SIZE (1024*128)
@@ -4996,7 +5004,7 @@ int ha_commit_one_phase(THD *thd, bool all);
int ha_commit_trans(THD *thd, bool all);
int ha_rollback_trans(THD *thd, bool all);
int ha_prepare(THD *thd);
-int ha_recover(HASH *commit_list);
+int ha_recover(HASH *commit_list, HASH *xa_recover_list);
/* transactions: these functions never call handlerton functions directly */
int ha_enable_transaction(THD *thd, bool on);
diff --git a/sql/log.cc b/sql/log.cc
index e13f8fbc88f..ed6dc87b262 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -38,6 +38,7 @@
#include "log_event.h" // Query_log_event
#include "rpl_filter.h"
#include "rpl_rli.h"
+#include "rpl_mi.h"
#include "sql_audit.h"
#include "mysqld.h"
@@ -3406,6 +3407,8 @@ MYSQL_BIN_LOG::MYSQL_BIN_LOG(uint *sync_period)
index_file_name[0] = 0;
bzero((char*) &index_file, sizeof(index_file));
bzero((char*) &purge_index_file, sizeof(purge_index_file));
+  /* A non-zero value is a marker to conduct XA recovery and related cleanup */
+ xa_recover_list.records= 0;
}
void MYSQL_BIN_LOG::stop_background_thread()
@@ -3467,6 +3470,11 @@ void MYSQL_BIN_LOG::cleanup()
mysql_cond_destroy(&COND_xid_list);
mysql_cond_destroy(&COND_binlog_background_thread);
mysql_cond_destroy(&COND_binlog_background_thread_end);
+ if (!is_relay_log && xa_recover_list.records)
+ {
+ free_root(&mem_root, MYF(0));
+ my_hash_free(&xa_recover_list);
+ }
}
/*
@@ -8028,7 +8036,7 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
/* Now we have in queue the list of transactions to be committed in order. */
}
-
+
DBUG_ASSERT(is_open());
if (likely(is_open())) // Should always be true
{
@@ -9717,7 +9725,7 @@ int TC_LOG_MMAP::recover()
goto err2; // OOM
}
- if (ha_recover(&xids))
+ if (ha_recover(&xids, 0))
goto err2;
my_hash_free(&xids);
@@ -9758,7 +9766,7 @@ int TC_LOG::using_heuristic_recover()
return 0;
sql_print_information("Heuristic crash recovery mode");
- if (ha_recover(0))
+ if (ha_recover(0, 0))
sql_print_error("Heuristic crash recovery failed");
sql_print_information("Please restart mysqld without --tc-heuristic-recover");
return 1;
@@ -10217,14 +10225,108 @@ start_binlog_background_thread()
return 0;
}
+/**
+ Auxiliary function for ::recover().
+  @return  a successfully created @c xa_recovery_member that has been
+           inserted into hash @c hash_arg,
+           or NULL.
+*/
+static xa_recovery_member*
+xa_member_insert(HASH *hash_arg, xid_t *xid_arg, xa_binlog_state state_arg,
+ MEM_ROOT *ptr_mem_root)
+{
+ xa_recovery_member *member= (xa_recovery_member*)
+ alloc_root(ptr_mem_root, sizeof(xa_recovery_member));
+ if (!member)
+ return NULL;
+
+ member->xid.set(xid_arg);
+ member->state= state_arg;
+ member->in_engine_prepare= false;
+ return my_hash_insert(hash_arg, (uchar*) member) ? NULL : member;
+}
+/*
+  Inserts a new hash member, or updates an existing one with the proper state.
+*/
+static bool xa_member_replace(HASH *hash_arg, xid_t *xid_arg, bool is_prepare,
+ MEM_ROOT *ptr_mem_root)
+{
+  if (is_prepare)
+ {
+ if (!(xa_member_insert(hash_arg, xid_arg, XA_PREPARE, ptr_mem_root)))
+ return true;
+ }
+ else
+ {
+ /*
+      Search whether the XID is already present in the recovery list. If it
+      is found and its state is XA_PREPARE, mark it as XA_COMPLETE.
+      Effectively, the XA-prepare event group will not be replayed.
+ */
+ xa_recovery_member* member;
+ if ((member= (xa_recovery_member *)
+ my_hash_search(hash_arg, xid_arg->key(), xid_arg->key_length())))
+ {
+ if (member->state == XA_PREPARE)
+ member->state= XA_COMPLETE;
+ }
+    else // Only XA completion found during recovery; insert it into the list
+ {
+ if (!(member= xa_member_insert(hash_arg,
+ xid_arg, XA_COMPLETE, ptr_mem_root)))
+ return true;
+ }
+ }
+ return false;
+}
+
+extern "C" uchar *xid_get_var_key(xid_t *entry, size_t *length,
+ my_bool not_used __attribute__((unused)))
+{
+ *length= entry->key_length();
+ return (uchar*) entry->key();
+}
+
+/**
+  Performs recovery based on the transaction coordinator log for 2pc. At the
+  time of the crash, if the binary log was in an active state, recovery for
+  'xid's and explicit 'XA' transactions is initiated, otherwise the gtid
+  binlog state is updated. For 'xid' and 'XA' based recovery the following
+  steps are performed.
+
+  Look for the latest binlog checkpoint file. The active binary log and the
+  latest binlog checkpoint file may be the same.
+
+ Scan the binary log from the beginning.
+ From GTID_LIST and GTID_EVENTs reconstruct the gtid binlog state.
+ Prepare a list of 'xid's for recovery.
+ Prepare a list of explicit 'XA' transactions for recovery.
+ Recover the 'xid' transactions.
+ The explicit 'XA' transaction recovery is initiated once all the server
+ components are initialized. Please check 'execute_xa_for_recovery()'.
+
+ Called from @c MYSQL_BIN_LOG::do_binlog_recovery()
+
+ @param linfo Store here the found log file name and position to
+ the NEXT log file name in the index file.
+
+ @param last_log_name Name of the last active binary log at the time of
+ crash.
+
+ @param first_log Pointer to IO_CACHE of active binary log
+ @param fdle Format_description_log_event of active binary log
+ @param do_xa Is 2pc recovery needed for 'xid's and explicit XA
+ transactions.
+ @return indicates success or failure of recovery.
+ @retval 0 success
+ @retval 1 failure
+
+*/
int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
IO_CACHE *first_log,
Format_description_log_event *fdle, bool do_xa)
{
Log_event *ev= NULL;
HASH xids;
- MEM_ROOT mem_root;
char binlog_checkpoint_name[FN_REFLEN];
bool binlog_checkpoint_found;
bool first_round;
@@ -10237,9 +10339,17 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
bool last_gtid_valid= false;
#endif
- if (! fdle->is_valid() ||
- (do_xa && my_hash_init(&xids, &my_charset_bin, TC_LOG_PAGE_SIZE/3, 0,
- sizeof(my_xid), 0, 0, MYF(0))))
+ binlog_checkpoint_name[0]= 0;
+ if (!fdle->is_valid() ||
+ (do_xa &&
+ (my_hash_init(&xids, &my_charset_bin, TC_LOG_PAGE_SIZE/3,
+ 0,
+ sizeof(my_xid), 0, 0, MYF(0)) ||
+ my_hash_init(&xa_recover_list,
+ &my_charset_bin,
+ TC_LOG_PAGE_SIZE/3,
+ 0, 0,
+ (my_hash_get_key) xid_get_var_key, 0, MYF(0)))))
goto err1;
if (do_xa)
@@ -10313,21 +10423,29 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
#ifdef HAVE_REPLICATION
case GTID_EVENT:
- if (first_round)
{
Gtid_log_event *gev= (Gtid_log_event *)ev;
-
- /* Update the binlog state with any GTID logged after Gtid_list. */
- last_gtid.domain_id= gev->domain_id;
- last_gtid.server_id= gev->server_id;
- last_gtid.seq_no= gev->seq_no;
- last_gtid_standalone=
- ((gev->flags2 & Gtid_log_event::FL_STANDALONE) ? true : false);
- last_gtid_valid= true;
+ if (first_round)
+ {
+ /* Update the binlog state with any GTID logged after Gtid_list. */
+ last_gtid.domain_id= gev->domain_id;
+ last_gtid.server_id= gev->server_id;
+ last_gtid.seq_no= gev->seq_no;
+ last_gtid_standalone=
+ ((gev->flags2 & Gtid_log_event::FL_STANDALONE) ? true : false);
+ last_gtid_valid= true;
+ }
+ if (do_xa &&
+ (gev->flags2 &
+ (Gtid_log_event::FL_PREPARED_XA |
+ Gtid_log_event::FL_COMPLETED_XA)) &&
+ xa_member_replace(&xa_recover_list, &gev->xid,
+ gev->flags2 & Gtid_log_event::FL_PREPARED_XA,
+ &mem_root))
+ goto err2;
+ break;
}
- break;
#endif
-
case START_ENCRYPTION_EVENT:
{
if (fdle->start_decryption((Start_encryption_log_event*) ev))
@@ -10417,10 +10535,22 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
if (do_xa)
{
- if (ha_recover(&xids))
+ if (ha_recover(&xids, &xa_recover_list))
goto err2;
- free_root(&mem_root, MYF(0));
+ DBUG_ASSERT(!xa_recover_list.records ||
+ (binlog_checkpoint_found && binlog_checkpoint_name[0] != 0));
+
+ if (!xa_recover_list.records)
+ {
+ free_root(&mem_root, MYF(0));
+ my_hash_free(&xa_recover_list);
+ }
+ else
+ {
+ xa_binlog_checkpoint_name= strmake_root(&mem_root, binlog_checkpoint_name,
+ strlen(binlog_checkpoint_name));
+ }
my_hash_free(&xids);
}
return 0;
@@ -10436,6 +10566,7 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
{
free_root(&mem_root, MYF(0));
my_hash_free(&xids);
+ my_hash_free(&xa_recover_list);
}
err1:
sql_print_error("Crash recovery failed. Either correct the problem "
@@ -10445,6 +10576,214 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
return 1;
}
+void MYSQL_BIN_LOG::execute_xa_for_recovery()
+{
+ if (xa_recover_list.records)
+ (void) recover_explicit_xa_prepare();
+ free_root(&mem_root, MYF(0));
+ my_hash_free(&xa_recover_list);
+};
+
+/**
+  Performs recovery of explicit XA transactions.
+ 'xa_recover_list' contains the list of XA transactions to be recovered.
+ These events are replayed from the binary log to complete the recovery.
+
+ @return indicates success or failure of recovery.
+ @retval false success
+ @retval true failure
+
+*/
+bool MYSQL_BIN_LOG::recover_explicit_xa_prepare()
+{
+#ifndef HAVE_REPLICATION
+ /* Can't be supported without replication applier built in. */
+ return false;
+#else
+ bool err= true;
+ int error=0;
+ Relay_log_info *rli= NULL;
+ rpl_group_info *rgi;
+ THD *thd= new THD(0); /* Needed by start_slave_threads */
+ thd->thread_stack= (char*) &thd;
+ thd->store_globals();
+ thd->security_ctx->skip_grants();
+ IO_CACHE log;
+ const char *errmsg;
+ File file;
+ bool enable_apply_event= false;
+ Log_event *ev = 0;
+ LOG_INFO linfo;
+ int recover_xa_count= xa_recover_list.records;
+ xa_recovery_member *member= NULL;
+
+ //DBUG_ASSERT(!thd->rli_fake);
+
+ if (!(rli= thd->rli_fake= new Relay_log_info(FALSE, "Recovery")))
+ {
+ my_error(ER_OUTOFMEMORY, MYF(ME_FATAL), 1);
+ goto err2;
+ }
+ rli->sql_driver_thd= thd;
+ static LEX_CSTRING connection_name= { STRING_WITH_LEN("Recovery") };
+ rli->mi= new Master_info(&connection_name, false);
+ if (!(rgi= thd->rgi_fake))
+ rgi= thd->rgi_fake= new rpl_group_info(rli);
+ rgi->thd= thd;
+ thd->system_thread_info.rpl_sql_info=
+ new rpl_sql_thread_info(rli->mi->rpl_filter);
+
+ if (rli && !rli->relay_log.description_event_for_exec)
+ {
+ rli->relay_log.description_event_for_exec=
+ new Format_description_log_event(4);
+ }
+ if (find_log_pos(&linfo, xa_binlog_checkpoint_name, 1))
+ {
+ sql_print_error("Binlog file '%s' not found in binlog index, needed "
+ "for recovery. Aborting.", xa_binlog_checkpoint_name);
+ goto err2;
+ }
+
+ tmp_disable_binlog(thd);
+ thd->variables.pseudo_slave_mode= TRUE;
+ for (;;)
+ {
+ if ((file= open_binlog(&log, linfo.log_file_name, &errmsg)) < 0)
+ {
+ sql_print_error("%s", errmsg);
+ goto err1;
+ }
+ while (recover_xa_count > 0 &&
+ (ev= Log_event::read_log_event(&log,
+ rli->relay_log.description_event_for_exec,
+ opt_master_verify_checksum)))
+ {
+ if (!ev->is_valid())
+ {
+        sql_print_error("Found invalid binlog query event %s"
+                        " at %s:%lu", ev->get_type_str(),
+                        linfo.log_file_name,
+                        (ulong) (ev->log_pos - ev->data_written));
+ goto err1;
+ }
+ enum Log_event_type typ= ev->get_type_code();
+ ev->thd= thd;
+
+ if (typ == FORMAT_DESCRIPTION_EVENT)
+ enable_apply_event= true;
+
+ if (typ == GTID_EVENT)
+ {
+ Gtid_log_event *gev= (Gtid_log_event *)ev;
+ if (gev->flags2 &
+ (Gtid_log_event::FL_PREPARED_XA | Gtid_log_event::FL_COMPLETED_XA))
+ {
+ if ((member=
+ (xa_recovery_member*) my_hash_search(&xa_recover_list,
+ gev->xid.key(),
+ gev->xid.key_length())))
+ {
+          /* Got an XA PREPARE in the binlog; check member->state. If it is
+             still marked as XA_PREPARE then this PREPARE has not seen its
+             terminating COMMIT/ROLLBACK. If it is not already prepared in
+             the engine, replay it.
+          */
+ if (gev->flags2 & Gtid_log_event::FL_PREPARED_XA)
+ {
+ if (member->state == XA_PREPARE)
+ {
+              // Prepared in the binlog but not in the engine, so replay it
+ if (member->in_engine_prepare == false)
+ enable_apply_event= true;
+ else
+ --recover_xa_count;
+ }
+ }
+ else if (gev->flags2 & Gtid_log_event::FL_COMPLETED_XA)
+ {
+ if (member->state == XA_COMPLETE &&
+ member->in_engine_prepare == true)
+ enable_apply_event= true;
+ else
+ --recover_xa_count;
+ }
+ }
+ }
+ }
+
+ if (enable_apply_event)
+ {
+ if (typ == XA_PREPARE_LOG_EVENT)
+ thd->transaction.xid_state.set_binlogged();
+ if ((err= ev->apply_event(rgi)))
+ {
+ sql_print_error("Failed to execute binlog query event of type: %s,"
+ " at %s:%lu; error %d %s", ev->get_type_str(),
+ linfo.log_file_name,
+ (ev->log_pos - ev->data_written),
+ thd->get_stmt_da()->sql_errno(),
+ thd->get_stmt_da()->message());
+ delete ev;
+ goto err1;
+ }
+ else if (typ == FORMAT_DESCRIPTION_EVENT)
+ enable_apply_event=false;
+ else if (thd->lex->sql_command == SQLCOM_XA_PREPARE ||
+ thd->lex->sql_command == SQLCOM_XA_COMMIT ||
+ thd->lex->sql_command == SQLCOM_XA_ROLLBACK)
+ {
+ --recover_xa_count;
+ enable_apply_event=false;
+
+ sql_print_information("Binlog event %s at %s:%lu"
+ " successfully applied",
+ typ == XA_PREPARE_LOG_EVENT ?
+ static_cast<XA_prepare_log_event *>(ev)->get_query() :
+ static_cast<Query_log_event *>(ev)->query,
+ linfo.log_file_name, (ev->log_pos - ev->data_written));
+ }
+ }
+ if (typ != FORMAT_DESCRIPTION_EVENT)
+ delete ev;
+ }
+ end_io_cache(&log);
+ mysql_file_close(file, MYF(MY_WME));
+ file= -1;
+ if (unlikely((error= find_next_log(&linfo, 1))))
+ {
+ if (error != LOG_INFO_EOF)
+        sql_print_error("find_next_log() failed (error: %d)", error);
+ else
+ break;
+ }
+ }
+err1:
+ reenable_binlog(thd);
+ /*
+ There should be no more XA transactions to recover upon successful
+ completion.
+ */
+ if (recover_xa_count > 0)
+ goto err2;
+ sql_print_information("Crash recovery finished.");
+ err= false;
+err2:
+ if (file >= 0)
+ {
+ end_io_cache(&log);
+ mysql_file_close(file, MYF(MY_WME));
+ }
+ thd->variables.pseudo_slave_mode= FALSE;
+ delete rli->mi;
+ delete thd->system_thread_info.rpl_sql_info;
+ rgi->slave_close_thread_tables(thd);
+ thd->reset_globals();
+ delete thd;
+
+ return err;
+#endif /* !HAVE_REPLICATION */
+}
int
MYSQL_BIN_LOG::do_binlog_recovery(const char *opt_name, bool do_xa_recovery)
diff --git a/sql/log.h b/sql/log.h
index 8e70d3c8f4c..9bf3248d4c9 100644
--- a/sql/log.h
+++ b/sql/log.h
@@ -63,6 +63,7 @@ class TC_LOG
virtual int unlog(ulong cookie, my_xid xid)=0;
virtual int unlog_xa_prepare(THD *thd, bool all)= 0;
virtual void commit_checkpoint_notify(void *cookie)= 0;
+ virtual void execute_xa_for_recovery() {};
protected:
/*
@@ -708,6 +709,8 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
void commit_checkpoint_notify(void *cookie);
int recover(LOG_INFO *linfo, const char *last_log_name, IO_CACHE *first_log,
Format_description_log_event *fdle, bool do_xa);
+ bool recover_explicit_xa_prepare();
+
int do_binlog_recovery(const char *opt_name, bool do_xa_recovery);
#if !defined(MYSQL_CLIENT)
@@ -932,7 +935,7 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
mysql_mutex_t* get_binlog_end_pos_lock() { return &LOCK_binlog_end_pos; }
int wait_for_update_binlog_end_pos(THD* thd, struct timespec * timeout);
-
+ void execute_xa_for_recovery();
/*
Binlog position of end of the binlog.
Access to this is protected by LOCK_binlog_end_pos
@@ -945,6 +948,9 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
*/
my_off_t binlog_end_pos;
char binlog_end_pos_file[FN_REFLEN];
+ MEM_ROOT mem_root;
+ char *xa_binlog_checkpoint_name;
+ HASH xa_recover_list;
};
class Log_event_handler
diff --git a/sql/log_event.cc b/sql/log_event.cc
index ee44f7f1da4..9adfefceb97 100644
--- a/sql/log_event.cc
+++ b/sql/log_event.cc
@@ -18,7 +18,7 @@
#include "mariadb.h"
#include "sql_priv.h"
-
+#include "handler.h"
#ifndef MYSQL_CLIENT
#include "unireg.h"
#include "log_event.h"
@@ -2812,9 +2812,25 @@ XA_prepare_log_event(const char* buf,
buf += sizeof(temp);
memcpy(&temp, buf, sizeof(temp));
m_xid.gtrid_length= uint4korr(&temp);
+  // Todo: these validity checks, here and elsewhere, are to be replaced
+  // by the MDEV-21839 fixes
+ if (m_xid.gtrid_length < 0 || m_xid.gtrid_length > MAXGTRIDSIZE)
+ {
+ m_xid.formatID= -1;
+ return;
+ }
buf += sizeof(temp);
memcpy(&temp, buf, sizeof(temp));
m_xid.bqual_length= uint4korr(&temp);
+ if (m_xid.bqual_length < 0 || m_xid.bqual_length > MAXBQUALSIZE)
+ {
+ m_xid.formatID= -1;
+ return;
+ }
+ if (m_xid.gtrid_length + m_xid.bqual_length > XIDDATASIZE)
+ {
+ m_xid.formatID= -1;
+ return;
+ }
buf += sizeof(temp);
memcpy(m_xid.data, buf, m_xid.gtrid_length + m_xid.bqual_length);
diff --git a/sql/log_event.h b/sql/log_event.h
index a6543b70eb5..595dc9f6c3c 100644
--- a/sql/log_event.h
+++ b/sql/log_event.h
@@ -3234,6 +3234,7 @@ class XA_prepare_log_event: public Xid_apply_log_event
const Format_description_log_event *description_event);
~XA_prepare_log_event() {}
Log_event_type get_type_code() { return XA_PREPARE_LOG_EVENT; }
+ bool is_valid() const { return m_xid.formatID != -1; }
int get_data_size()
{
return xid_subheader_no_data + m_xid.gtrid_length + m_xid.bqual_length;
@@ -3241,12 +3242,7 @@ class XA_prepare_log_event: public Xid_apply_log_event
#ifdef MYSQL_SERVER
bool write();
-#endif
-
-private:
-#if defined(MYSQL_SERVER) && defined(HAVE_REPLICATION)
- char query[sizeof("XA COMMIT ONE PHASE") + 1 + ser_buf_size];
- int do_commit();
+#ifdef HAVE_REPLICATION
const char* get_query()
{
sprintf(query,
@@ -3254,6 +3250,13 @@ class XA_prepare_log_event: public Xid_apply_log_event
m_xid.serialize());
return query;
}
+#endif /* HAVE_REPLICATION */
+#endif /* MYSQL_SERVER */
+
+private:
+#if defined(MYSQL_SERVER) && defined(HAVE_REPLICATION)
+ char query[sizeof("XA COMMIT ONE PHASE") + 1 + ser_buf_size];
+ int do_commit();
#endif
};
diff --git a/sql/mysqld.cc b/sql/mysqld.cc
index b2f8afca7a6..f669a4ca5d8 100644
--- a/sql/mysqld.cc
+++ b/sql/mysqld.cc
@@ -5189,7 +5189,7 @@ static int init_server_components()
unireg_abort(1);
}
- if (ha_recover(0))
+ if (ha_recover(0, 0))
{
unireg_abort(1);
}
@@ -5606,7 +5606,7 @@ int mysqld_main(int argc, char **argv)
initialize_information_schema_acl();
execute_ddl_log_recovery();
-
+ tc_log->execute_xa_for_recovery();
/*
Change EVENTS_ORIGINAL to EVENTS_OFF (the default value) as there is no
point in using ORIGINAL during startup
diff --git a/sql/xa.cc b/sql/xa.cc
index 786d09c2b39..df7d0229157 100644
--- a/sql/xa.cc
+++ b/sql/xa.cc
@@ -581,6 +581,10 @@ bool trans_xa_commit(THD *thd)
XID_STATE &xid_state= thd->transaction.xid_state;
DBUG_ENTER("trans_xa_commit");
+ DBUG_EXECUTE_IF("trans_xa_commit_fail",
+ {my_error(ER_OUT_OF_RESOURCES, MYF(0));
+ DBUG_RETURN(TRUE);});
+
if (!xid_state.is_explicit_XA() ||
!xid_state.xid_cache_element->xid.eq(thd->lex->xid))

[Maria-developers] [andrei.elkin@mariadb.com] 25fe744e5c3: MDEV-742 XA PREPAREd transaction survive disconnect/server restart
by andrei.elkin@pp.inet.fi 04 Mar '20
Howdy, Kristian.
I am forwarding for your attention a patch that is at the HEAD of the
bb-10.5-mdev_742 branch. While it is of some size, maybe you will find some
time to review the changes/extensions, especially those done to the parallel
slave.
Looking forward to hearing from you!
Thanks.
Andrei
revision-id: 25fe744e5c3d2e8f7d3db7eed37d8e412837457b (mariadb-10.5.0-283-g25fe744e5c3)
parent(s): 36cebe53a3645bf1e665ffdf5b552cabcc1e8e56
author: Andrei Elkin
committer: Andrei Elkin
timestamp: 2020-03-04 15:48:43 +0200
message:
MDEV-742 XA PREPAREd transaction survive disconnect/server restart
Lifted the long-standing limitation of rolling back an XA transaction at
its connection's close even if the XA is prepared.
A prepared XA transaction is now made to survive connection close or server
restart.
The patch consists of
- a binary logging extension to write the prepared XA part of the
  transaction, signified with its XID, in a new XA_prepare_log_event.
  The conclusion part - with the Commit or Rollback decision - is logged
  separately as a Query_log_event.
  That is, in the binlog the XA consists of two separate groups of
  events.
  This means a whole XA may interleave in the binlog with other XA:s or
  regular transactions, but with no harm to replication and data
  consistency (see the sketch after this message).
  Gtid_log_event receives two more flags to identify which of the
  two XA phases of the transaction it represents. With either flag
  set, the XID info is also added to the event.
- engines are made aware of the server policy to keep user-prepared
  XA:s, so they (InnoDB, RocksDB) no longer roll them back in their
  disconnect methods.
- the slave applier is refined to cope with two-phase-logged XA:s,
  including parallel modes of execution.
This patch does not address crash-safe logging of the new events; that
is being addressed by MDEV-21469, whose commit is to be published shortly.
There is a list of fixes to 10.5 that were required by this MDEV, including
MDEV-21856: XID::formatID is constrained to 4 bytes by requirements of
cross-platform replication and Innodb legacy.
MDEV-21659 XA rollback 'foreign_xid' is allowed inside active XA
MDEV-21766 Forbid XID with empty 'gtrid'
MDEV-21854 xa commit 'xid' one phase for already prepared transaction must always error out
Many thanks to Alexey Botchkov for driving this work initially!
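
To illustrate the two-group binlog layout described above, here is a minimal
sketch. The SQL is standard; the event-group layout noted in the comments is
my reading of the description above rather than a captured mysqlbinlog dump,
and the table and XID names are made up for the example:

  CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;

  # Group 1: Gtid_log_event carrying FL_PREPARED_XA plus the XID, followed
  # by the transaction body and the new XA_prepare_log_event.
  XA START 'xa_demo';
  INSERT INTO t1 VALUES (1);
  XA END 'xa_demo';
  XA PREPARE 'xa_demo';

  # The client may now disconnect, or the server may restart; the prepared
  # transaction survives and stays visible:
  XA RECOVER;

  # Group 2 (possibly much later, from another connection, with other
  # transactions binlogged in between): Gtid_log_event carrying
  # FL_COMPLETED_XA plus the XID, followed by a Query_log_event with the
  # decision.
  XA COMMIT 'xa_demo';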
---
mysql-test/include/kill_and_restart_mysqld.inc | 15 +
mysql-test/main/flush_read_lock.result | 76 +-
mysql-test/main/flush_read_lock.test | 112 +-
mysql-test/main/xa.result | 61 +
mysql-test/main/xa.test | 49 +
mysql-test/main/xa_binlog.result | 11 +-
mysql-test/main/xa_binlog.test | 2 +-
mysql-test/main/xa_prepared_binlog_off-master.opt | 1 +
mysql-test/main/xa_prepared_binlog_off.result | 1044 +++++++++++++++++
mysql-test/main/xa_prepared_binlog_off.test | 11 +
mysql-test/main/xa_sync.result | 10 +
mysql-test/main/xa_sync.test | 5 +
.../include/binlog_xa_prepare_connection.inc | 31 +
.../include/binlog_xa_prepare_disconnect.inc | 35 +
.../include/binlog_xa_prepared_do_and_restart.inc | 323 ++++++
.../suite/binlog/r/binlog_xa_checkpoint.result | 33 +
.../suite/binlog/r/binlog_xa_prepared.result | 1176 ++++++++++++++++++++
.../binlog/r/binlog_xa_prepared_disconnect.result | 1176 ++++++++++++++++++++
.../suite/binlog/t/binlog_xa_checkpoint.test | 57 +
mysql-test/suite/binlog/t/binlog_xa_prepared.inc | 102 ++
.../binlog/t/binlog_xa_prepared_disconnect.test | 11 +
.../suite/rpl/include/rpl_xa_mixed_engines.inc | 183 +++
.../suite/rpl/r/rpl_parallel_optimistic_xa.result | 51 +
.../r/rpl_parallel_optimistic_xa_lsu_off.result | 51 +
.../suite/rpl/r/rpl_parallel_xa_same_xid.result | 23 +
mysql-test/suite/rpl/r/rpl_temporary_errors.result | 47 +-
mysql-test/suite/rpl/r/rpl_xa.result | 48 +
mysql-test/suite/rpl/r/rpl_xa_gap_lock.result | 44 +
.../suite/rpl/r/rpl_xa_gtid_pos_auto_engine.result | 64 ++
.../suite/rpl/r/rpl_xa_survive_disconnect.result | 319 ++++++
.../rpl/r/rpl_xa_survive_disconnect_lsu_off.result | 319 ++++++
.../rpl_xa_survive_disconnect_mixed_engines.result | 373 +++++++
.../suite/rpl/t/rpl_parallel_optimistic_xa.test | 235 ++++
.../t/rpl_parallel_optimistic_xa_lsu_off-slave.opt | 1 +
.../rpl/t/rpl_parallel_optimistic_xa_lsu_off.test | 2 +
.../suite/rpl/t/rpl_parallel_xa_same_xid.test | 138 +++
mysql-test/suite/rpl/t/rpl_temporary_errors.test | 82 +-
mysql-test/suite/rpl/t/rpl_xa.inc | 73 ++
mysql-test/suite/rpl/t/rpl_xa.test | 5 +
mysql-test/suite/rpl/t/rpl_xa_gap_lock-slave.opt | 1 +
mysql-test/suite/rpl/t/rpl_xa_gap_lock.test | 137 +++
.../suite/rpl/t/rpl_xa_gtid_pos_auto_engine.test | 29 +
.../suite/rpl/t/rpl_xa_survive_disconnect.test | 294 +++++
.../t/rpl_xa_survive_disconnect_lsu_off-slave.opt | 2 +
.../rpl/t/rpl_xa_survive_disconnect_lsu_off.test | 8 +
.../t/rpl_xa_survive_disconnect_mixed_engines.test | 68 ++
sql/handler.cc | 11 +-
sql/log.cc | 248 ++++-
sql/log.h | 10 +
sql/log_event.cc | 59 +-
sql/log_event.h | 213 +++-
sql/log_event_client.cc | 37 +-
sql/log_event_server.cc | 298 +++--
sql/rpl_parallel.cc | 220 +++-
sql/rpl_parallel.h | 28 +-
sql/rpl_rli.cc | 14 +-
sql/rpl_rli.h | 4 +
sql/slave.cc | 10 +-
sql/sql_repl.cc | 4 +-
sql/xa.cc | 254 ++++-
sql/xa.h | 22 +-
storage/innobase/handler/ha_innodb.cc | 1 +
storage/innobase/trx/trx0trx.cc | 19 +-
storage/rocksdb/ha_rocksdb.cc | 18 +-
storage/rocksdb/mysql-test/rocksdb/r/xa.result | 41 +-
storage/rocksdb/mysql-test/rocksdb/t/xa.test | 43 +-
.../rocksdb/mysql-test/rocksdb_rpl/r/rpl_xa.result | 50 +
.../rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.inc | 70 ++
.../rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.test | 6 +
.../tokudb/mysql-test/tokudb_mariadb/r/xa.result | 1 +
storage/tokudb/mysql-test/tokudb_mariadb/t/xa.test | 3 +
71 files changed, 8410 insertions(+), 212 deletions(-)
diff --git a/mysql-test/include/kill_and_restart_mysqld.inc b/mysql-test/include/kill_and_restart_mysqld.inc
new file mode 100644
index 00000000000..b67fb7350b4
--- /dev/null
+++ b/mysql-test/include/kill_and_restart_mysqld.inc
@@ -0,0 +1,15 @@
+if (!$restart_parameters)
+{
+ let $restart_parameters = restart;
+}
+
+--let $_server_id= `SELECT @@server_id`
+--let $_expect_file_name= $MYSQLTEST_VARDIR/tmp/mysqld.$_server_id.expect
+
+--echo # Kill and $restart_parameters
+--exec echo "$restart_parameters" > $_expect_file_name
+--shutdown_server 0
+--source include/wait_until_disconnected.inc
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+--disable_reconnect
diff --git a/mysql-test/main/flush_read_lock.result b/mysql-test/main/flush_read_lock.result
index 0f8c2ce9fb9..be710050139 100644
--- a/mysql-test/main/flush_read_lock.result
+++ b/mysql-test/main/flush_read_lock.result
@@ -1310,6 +1310,8 @@ unlock tables;
# Check that XA non-COMMIT statements are not and COMMIT is
# blocked by active FTWRL in another connection
#
+# XA COMMIT, XA ROLLBACK and XA PREPARE do take the COMMIT lock to ensure
+# that nothing is written to the binary log and redo log under FTWRL mode.
connection con1;
flush tables with read lock;
connection default;
@@ -1322,11 +1324,25 @@ connection con1;
flush tables with read lock;
connection default;
xa end 'test1';
-xa prepare 'test1';
+xa prepare 'test1';;
+connection con1;
+unlock tables;
+# Switching to connection 'default'.
+connection default;
+# Reap XA PREPARE.
+# Switching to connection 'con1'.
+connection con1;
+flush tables with read lock;
+# Switching to connection 'default'.
+connection default;
+# Send XA ROLLBACK 'test1'
xa rollback 'test1';
+# Switching to connection 'con1'.
connection con1;
+# Wait until XA ROLLBACK is blocked.
unlock tables;
connection default;
+# Reap XA ROLLBACK
xa start 'test1';
insert into t3_trans values (1);
connection con1;
@@ -1334,7 +1350,20 @@ flush tables with read lock;
connection default;
connection default;
xa end 'test1';
+# Send XA PREPARE 'test1'
xa prepare 'test1';
+# Switching to connection 'con1'.
+connection con1;
+# Wait until XA PREPARE is blocked.
+unlock tables;
+# Switching to connection 'default'.
+connection default;
+# Reap XA PREPARE.
+# Switching to connection 'con1'.
+connection con1;
+flush tables with read lock;
+# Switching to connection 'default'.
+connection default;
# Send:
xa commit 'test1';;
connection con1;
@@ -1344,6 +1373,51 @@ connection default;
# Reap XA COMMIT.
delete from t3_trans;
#
+# Check that XA COMMIT / ROLLBACK for prepared transaction from a
+# disconnected session is blocked by active FTWRL in another connection.
+#
+# Create temporary connection for XA transaction.
+connect con_tmp,localhost,root,,;
+xa start 'test1';
+insert into t3_trans values (1);
+xa end 'test1';
+xa prepare 'test1';
+# Disconnect temporary connection
+disconnect con_tmp;
+# Create temporary connection for XA transaction.
+connect con_tmp,localhost,root,,;
+xa start 'test2';
+insert into t3_trans values (2);
+xa end 'test2';
+xa prepare 'test2';
+# Disconnect temporary connection
+disconnect con_tmp;
+# Switching to connection 'con1'.
+connection con1;
+flush tables with read lock;
+# Switching to connection 'default'.
+connection default;
+# Send XA ROLLBACK 'test1'
+xa rollback 'test1';
+# Switching to connection 'con1'.
+connection con1;
+# Wait until XA ROLLBACK is blocked.
+unlock tables;
+flush tables with read lock;
+# Switching to connection 'default'.
+connection default;
+# Reap XA ROLLBACK
+# Send XA COMMIT
+xa commit 'test2';;
+# Switching to connection 'con1'.
+connection con1;
+# Wait until XA COMMIT is blocked.
+unlock tables;
+# Switching to connection 'default'.
+connection default;
+# Reap XA COMMIT.
+delete from t3_trans;
+#
# Check that XA COMMIT blocks FTWRL in another connection.
xa start 'test1';
insert into t3_trans values (1);
diff --git a/mysql-test/main/flush_read_lock.test b/mysql-test/main/flush_read_lock.test
index 80512deac4e..4283358770c 100644
--- a/mysql-test/main/flush_read_lock.test
+++ b/mysql-test/main/flush_read_lock.test
@@ -1592,6 +1592,8 @@ unlock tables;
--echo # Check that XA non-COMMIT statements are not and COMMIT is
--echo # blocked by active FTWRL in another connection
--echo #
+--echo # XA COMMIT, XA ROLLBACK and XA PREPARE do take the COMMIT lock to ensure
+--echo # that nothing is written to the binary log and redo log under FTWRL mode.
connection $con_aux1;
flush tables with read lock;
connection default;
@@ -1604,11 +1606,37 @@ connection $con_aux1;
flush tables with read lock;
connection default;
xa end 'test1';
-xa prepare 'test1';
-xa rollback 'test1';
+--send xa prepare 'test1';
connection $con_aux1;
+let $wait_condition=
+ select count(*) = 1 from information_schema.processlist
+ where state = "Waiting for backup lock" and
+ info = "xa prepare 'test1'";
+--source include/wait_condition.inc
unlock tables;
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Reap XA PREPARE.
+--reap
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+flush tables with read lock;
+--echo # Switching to connection 'default'.
connection default;
+--echo # Send XA ROLLBACK 'test1'
+--send xa rollback 'test1'
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+--echo # Wait until XA ROLLBACK is blocked.
+let $wait_condition=
+ select count(*) = 1 from information_schema.processlist
+ where state = "Waiting for backup lock" and
+ info = "xa rollback 'test1'";
+--source include/wait_condition.inc
+unlock tables;
+connection default;
+--echo # Reap XA ROLLBACK
+--reap
xa start 'test1';
insert into t3_trans values (1);
connection $con_aux1;
@@ -1616,7 +1644,27 @@ flush tables with read lock;
connection default;
connection default;
xa end 'test1';
-xa prepare 'test1';
+--echo # Send XA PREPARE 'test1'
+--send xa prepare 'test1'
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+--echo # Wait until XA PREPARE is blocked.
+let $wait_condition=
+ select count(*) = 1 from information_schema.processlist
+ where state = "Waiting for backup lock" and
+ info = "xa prepare 'test1'";
+--source include/wait_condition.inc
+unlock tables;
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Reap XA PREPARE.
+--reap
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+flush tables with read lock;
+--echo # Switching to connection 'default'.
+connection default;
+
--echo # Send:
--send xa commit 'test1';
connection $con_aux1;
@@ -1631,6 +1679,64 @@ connection default;
--echo # Reap XA COMMIT.
--reap
delete from t3_trans;
+--echo #
+--echo # Check that XA COMMIT / ROLLBACK for prepared transaction from a
+--echo # disconnected session is blocked by active FTWRL in another connection.
+--echo #
+--echo # Create temporary connection for XA transaction.
+connect (con_tmp,localhost,root,,);
+xa start 'test1';
+insert into t3_trans values (1);
+xa end 'test1';
+xa prepare 'test1';
+--echo # Disconnect temporary connection
+disconnect con_tmp;
+--echo # Create temporary connection for XA transaction.
+connect (con_tmp,localhost,root,,);
+xa start 'test2';
+insert into t3_trans values (2);
+xa end 'test2';
+xa prepare 'test2';
+--echo # Disconnect temporary connection
+disconnect con_tmp;
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+flush tables with read lock;
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Send XA ROLLBACK 'test1'
+--send xa rollback 'test1'
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+--echo # Wait until XA ROLLBACK is blocked.
+let $wait_condition=
+ select count(*) = 1 from information_schema.processlist
+ where state = "Waiting for backup lock" and
+ info = "xa rollback 'test1'";
+--source include/wait_condition.inc
+unlock tables;
+flush tables with read lock;
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Reap XA ROLLBACK
+--reap
+--echo # Send XA COMMIT
+--send xa commit 'test2';
+--echo # Switching to connection '$con_aux1'.
+connection $con_aux1;
+--echo # Wait until XA COMMIT is blocked.
+let $wait_condition=
+ select count(*) = 1 from information_schema.processlist
+ where state = "Waiting for backup lock" and
+ info = "xa commit 'test2'";
+--source include/wait_condition.inc
+unlock tables;
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Reap XA COMMIT.
+--reap
+delete from t3_trans;
+
--echo #
--echo # Check that XA COMMIT blocks FTWRL in another connection.
xa start 'test1';
diff --git a/mysql-test/main/xa.result b/mysql-test/main/xa.result
index bd2946247d8..aa48f6d26c7 100644
--- a/mysql-test/main/xa.result
+++ b/mysql-test/main/xa.result
@@ -66,6 +66,9 @@ select * from t1;
a
20
disconnect con1;
+xa rollback 'testb',0x2030405060,11;
+xa recover;
+formatID gtrid_length bqual_length data
connection default;
xa start 'tr1';
insert t1 values (40);
@@ -376,3 +379,61 @@ XA PREPARE 'Я_упaлa_c_сеновала_тормозила_головой';
XA ROLLBACK 'Я_упaлa_c_сеновала_тормозила_головой';
SET NAMES default;
DROP TABLE t1;
+# MDEV-7974 related
+# Check XA state when lock_wait_timeout happens
+# More tests added to flush_read_lock.test
+connect con_tmp,localhost,root,,;
+set session lock_wait_timeout=1;
+create table asd (a int) engine=innodb;
+xa start 'test1';
+insert into asd values(1);
+xa end 'test1';
+connection default;
+flush table with read lock;
+connection con_tmp;
+# PREPARE error will do auto rollback.
+xa prepare 'test1';
+ERROR HY000: Lock wait timeout exceeded; try restarting transaction
+show errors;
+Level Code Message
+Error 1205 Lock wait timeout exceeded; try restarting transaction
+Error 1402 XA_RBROLLBACK: Transaction branch was rolled back
+connection default;
+unlock tables;
+connection con_tmp;
+xa start 'test1';
+insert into asd values(1);
+xa end 'test1';
+xa prepare 'test1';
+connection default;
+flush tables with read lock;
+connection con_tmp;
+# LOCK error during ROLLBACK will not alter transaction state.
+xa rollback 'test1';
+ERROR HY000: Lock wait timeout exceeded; try restarting transaction
+show errors;
+Level Code Message
+Error 1205 Lock wait timeout exceeded; try restarting transaction
+Error 1401 XAER_RMERR: Fatal error occurred in the transaction branch - check your data for consistency
+xa recover;
+formatID gtrid_length bqual_length data
+1 5 0 test1
+# LOCK error during COMMIT will not alter transaction state.
+xa commit 'test1';
+ERROR HY000: Lock wait timeout exceeded; try restarting transaction
+show errors;
+Level Code Message
+Error 1205 Lock wait timeout exceeded; try restarting transaction
+Error 1401 XAER_RMERR: Fatal error occurred in the transaction branch - check your data for consistency
+xa recover;
+formatID gtrid_length bqual_length data
+1 5 0 test1
+connection default;
+unlock tables;
+connection con_tmp;
+xa rollback 'test1';
+xa recover;
+formatID gtrid_length bqual_length data
+drop table asd;
+disconnect con_tmp;
+connection default;
diff --git a/mysql-test/main/xa.test b/mysql-test/main/xa.test
index 55c41452635..ca09989dfc8 100644
--- a/mysql-test/main/xa.test
+++ b/mysql-test/main/xa.test
@@ -98,6 +98,8 @@ xa start 'zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz';
select * from t1;
disconnect con1;
--source include/wait_until_count_sessions.inc
+xa rollback 'testb',0x2030405060,11;
+xa recover;
connection default;
@@ -523,3 +525,50 @@ SET NAMES default;
DROP TABLE t1;
--source include/wait_until_count_sessions.inc
+
+--echo # MDEV-7974 related
+--echo # Check XA state when lock_wait_timeout happens
+--echo # More tests added to flush_read_lock.test
+connect (con_tmp,localhost,root,,);
+set session lock_wait_timeout=1;
+create table asd (a int) engine=innodb;
+xa start 'test1';
+insert into asd values(1);
+xa end 'test1';
+connection default;
+flush table with read lock;
+connection con_tmp;
+--echo # PREPARE error will do auto rollback.
+--ERROR ER_LOCK_WAIT_TIMEOUT
+xa prepare 'test1';
+show errors;
+connection default;
+unlock tables;
+
+connection con_tmp;
+xa start 'test1';
+insert into asd values(1);
+xa end 'test1';
+xa prepare 'test1';
+connection default;
+flush tables with read lock;
+connection con_tmp;
+--echo # LOCK error during ROLLBACK will not alter transaction state.
+--ERROR ER_LOCK_WAIT_TIMEOUT
+xa rollback 'test1';
+show errors;
+xa recover;
+--echo # LOCK error during COMMIT will not alter transaction state.
+--ERROR ER_LOCK_WAIT_TIMEOUT
+xa commit 'test1';
+show errors;
+xa recover;
+connection default;
+unlock tables;
+connection con_tmp;
+xa rollback 'test1';
+xa recover;
+drop table asd;
+disconnect con_tmp;
+--source include/wait_until_disconnected.inc
+connection default;
diff --git a/mysql-test/main/xa_binlog.result b/mysql-test/main/xa_binlog.result
index 619a6e08b20..c45749d500f 100644
--- a/mysql-test/main/xa_binlog.result
+++ b/mysql-test/main/xa_binlog.result
@@ -18,14 +18,17 @@ a
1
2
3
-SHOW BINLOG EVENTS LIMIT 3,9;
+SHOW BINLOG EVENTS LIMIT 3,12;
Log_name Pos Event_type Server_id End_log_pos Info
-master-bin.000001 # Gtid 1 # BEGIN GTID #-#-#
+master-bin.000001 # Gtid 1 # XA START X'786174657374',X'',1 GTID #-#-#
master-bin.000001 # Query 1 # use `test`; INSERT INTO t1 VALUES (1)
-master-bin.000001 # Query 1 # COMMIT
+master-bin.000001 # Query 1 # XA END X'786174657374',X'',1
+master-bin.000001 # XA_prepare 1 # XA PREPARE X'786174657374',X'',1
+master-bin.000001 # Gtid 1 # GTID #-#-#
+master-bin.000001 # Query 1 # XA COMMIT X'786174657374',X'',1
master-bin.000001 # Gtid 1 # BEGIN GTID #-#-#
master-bin.000001 # Query 1 # use `test`; INSERT INTO t1 VALUES (2)
-master-bin.000001 # Query 1 # COMMIT
+master-bin.000001 # Xid 1 # COMMIT /* xid=XX */
master-bin.000001 # Gtid 1 # BEGIN GTID #-#-#
master-bin.000001 # Query 1 # use `test`; INSERT INTO t1 VALUES (3)
master-bin.000001 # Xid 1 # COMMIT /* xid=XX */
diff --git a/mysql-test/main/xa_binlog.test b/mysql-test/main/xa_binlog.test
index ecbf1f4f066..91bca2ac8cb 100644
--- a/mysql-test/main/xa_binlog.test
+++ b/mysql-test/main/xa_binlog.test
@@ -27,6 +27,6 @@ SELECT * FROM t1 ORDER BY a;
--replace_column 2 # 5 #
--replace_regex /xid=[0-9]+/xid=XX/ /GTID [0-9]+-[0-9]+-[0-9]+/GTID #-#-#/
-SHOW BINLOG EVENTS LIMIT 3,9;
+SHOW BINLOG EVENTS LIMIT 3,12;
DROP TABLE t1;
diff --git a/mysql-test/main/xa_prepared_binlog_off-master.opt b/mysql-test/main/xa_prepared_binlog_off-master.opt
new file mode 100644
index 00000000000..789275fa25e
--- /dev/null
+++ b/mysql-test/main/xa_prepared_binlog_off-master.opt
@@ -0,0 +1 @@
+--skip-log-bin
diff --git a/mysql-test/main/xa_prepared_binlog_off.result b/mysql-test/main/xa_prepared_binlog_off.result
new file mode 100644
index 00000000000..ca19f6cdfaf
--- /dev/null
+++ b/mysql-test/main/xa_prepared_binlog_off.result
@@ -0,0 +1,1044 @@
+call mtr.add_suppression("You need to use --log-bin to make --log-slave-updates work.");
+connection default;
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+call mtr.add_suppression("Found 10 prepared XA transactions");
+CREATE TABLE t (a INT) ENGINE=innodb;
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx1tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx1tmp';
+XA PREPARE 'trx1tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx2tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx2tmp';
+XA PREPARE 'trx2tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx3tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx3tmp';
+XA PREPARE 'trx3tmp';
+connection default;
+XA COMMIT 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA ROLLBACK 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA START 'trx1tmp';
+ERROR XAE08: XAER_DUPID: The XID already exists
+connection default;
+*** 3 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1tmp;
+disconnect conn1tmp;
+connection default;
+XA COMMIT 'trx1tmp';
+KILL connection CONN_ID;
+XA COMMIT 'trx3tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx1ro';
+XA PREPARE 'trx1ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx2ro';
+XA PREPARE 'trx2ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx3ro';
+XA PREPARE 'trx3ro';
+connection default;
+*** 4 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1ro;
+disconnect conn1ro;
+connection default;
+XA ROLLBACK 'trx1ro';
+KILL connection CONN_ID;
+XA ROLLBACK 'trx3ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1empty';
+XA END 'trx1empty';
+XA PREPARE 'trx1empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2empty';
+XA END 'trx2empty';
+XA PREPARE 'trx2empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3empty';
+XA END 'trx3empty';
+XA PREPARE 'trx3empty';
+connection default;
+*** 5 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1empty;
+disconnect conn1empty;
+connection default;
+XA COMMIT 'trx1empty';
+KILL connection CONN_ID;
+XA COMMIT 'trx3empty';
+connect conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1unprepared';
+INSERT INTO t set a=0;
+XA END 'trx1unprepared';
+INSERT INTO t set a=0;
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+disconnect conn1unprepared;
+connection default;
+XA COMMIT 'trx1unprepared';
+ERROR XAE04: XAER_NOTA: Unknown XID
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_0';
+INSERT INTO t SET a=0;
+XA END 'trx_0';
+XA PREPARE 'trx_0';
+disconnect conn0;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_1';
+INSERT INTO t SET a=1;
+XA END 'trx_1';
+XA PREPARE 'trx_1';
+disconnect conn1;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_2';
+INSERT INTO t SET a=2;
+XA END 'trx_2';
+XA PREPARE 'trx_2';
+disconnect conn2;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_3';
+INSERT INTO t SET a=3;
+XA END 'trx_3';
+XA PREPARE 'trx_3';
+disconnect conn3;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_4';
+INSERT INTO t SET a=4;
+XA END 'trx_4';
+XA PREPARE 'trx_4';
+disconnect conn4;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_5';
+INSERT INTO t SET a=5;
+XA END 'trx_5';
+XA PREPARE 'trx_5';
+disconnect conn5;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_6';
+INSERT INTO t SET a=6;
+XA END 'trx_6';
+XA PREPARE 'trx_6';
+disconnect conn6;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_7';
+INSERT INTO t SET a=7;
+XA END 'trx_7';
+XA PREPARE 'trx_7';
+disconnect conn7;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_8';
+INSERT INTO t SET a=8;
+XA END 'trx_8';
+XA PREPARE 'trx_8';
+disconnect conn8;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_9';
+INSERT INTO t SET a=9;
+XA END 'trx_9';
+XA PREPARE 'trx_9';
+disconnect conn9;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_10';
+INSERT INTO t SET a=10;
+XA END 'trx_10';
+XA PREPARE 'trx_10';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_11';
+INSERT INTO t SET a=11;
+XA END 'trx_11';
+XA PREPARE 'trx_11';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_12';
+INSERT INTO t SET a=12;
+XA END 'trx_12';
+XA PREPARE 'trx_12';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_13';
+INSERT INTO t SET a=13;
+XA END 'trx_13';
+XA PREPARE 'trx_13';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_14';
+INSERT INTO t SET a=14;
+XA END 'trx_14';
+XA PREPARE 'trx_14';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_15';
+INSERT INTO t SET a=15;
+XA END 'trx_15';
+XA PREPARE 'trx_15';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_16';
+INSERT INTO t SET a=16;
+XA END 'trx_16';
+XA PREPARE 'trx_16';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_17';
+INSERT INTO t SET a=17;
+XA END 'trx_17';
+XA PREPARE 'trx_17';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_18';
+INSERT INTO t SET a=18;
+XA END 'trx_18';
+XA PREPARE 'trx_18';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_19';
+INSERT INTO t SET a=19;
+XA END 'trx_19';
+XA PREPARE 'trx_19';
+connection default;
+KILL CONNECTION CONN_ID;
+connection default;
+XA ROLLBACK 'trx_0';
+XA ROLLBACK 'trx_1';
+XA ROLLBACK 'trx_2';
+XA ROLLBACK 'trx_3';
+XA ROLLBACK 'trx_4';
+XA COMMIT 'trx_5';
+XA COMMIT 'trx_6';
+XA COMMIT 'trx_7';
+XA COMMIT 'trx_8';
+XA COMMIT 'trx_9';
+# restart
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_0';
+INSERT INTO t SET a=0;
+XA END 'new_trx_0';
+XA PREPARE 'new_trx_0';
+disconnect conn_restart_0;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_1';
+INSERT INTO t SET a=1;
+XA END 'new_trx_1';
+XA PREPARE 'new_trx_1';
+disconnect conn_restart_1;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_2';
+INSERT INTO t SET a=2;
+XA END 'new_trx_2';
+XA PREPARE 'new_trx_2';
+disconnect conn_restart_2;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_3';
+INSERT INTO t SET a=3;
+XA END 'new_trx_3';
+XA PREPARE 'new_trx_3';
+disconnect conn_restart_3;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_4';
+INSERT INTO t SET a=4;
+XA END 'new_trx_4';
+XA PREPARE 'new_trx_4';
+disconnect conn_restart_4;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_5';
+INSERT INTO t SET a=5;
+XA END 'new_trx_5';
+XA PREPARE 'new_trx_5';
+disconnect conn_restart_5;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_6';
+INSERT INTO t SET a=6;
+XA END 'new_trx_6';
+XA PREPARE 'new_trx_6';
+disconnect conn_restart_6;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_7';
+INSERT INTO t SET a=7;
+XA END 'new_trx_7';
+XA PREPARE 'new_trx_7';
+disconnect conn_restart_7;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_8';
+INSERT INTO t SET a=8;
+XA END 'new_trx_8';
+XA PREPARE 'new_trx_8';
+disconnect conn_restart_8;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_9';
+INSERT INTO t SET a=9;
+XA END 'new_trx_9';
+XA PREPARE 'new_trx_9';
+disconnect conn_restart_9;
+connection default;
+connection default;
+XA COMMIT 'new_trx_0';
+XA COMMIT 'new_trx_1';
+XA COMMIT 'new_trx_2';
+XA COMMIT 'new_trx_3';
+XA COMMIT 'new_trx_4';
+XA COMMIT 'new_trx_5';
+XA COMMIT 'new_trx_6';
+XA COMMIT 'new_trx_7';
+XA COMMIT 'new_trx_8';
+XA COMMIT 'new_trx_9';
+XA START 'trx_10';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_10';
+XA START 'trx_11';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_11';
+XA START 'trx_12';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_12';
+XA START 'trx_13';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_13';
+XA START 'trx_14';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_14';
+XA START 'trx_15';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_15';
+XA START 'trx_16';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_16';
+XA START 'trx_17';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_17';
+XA START 'trx_18';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_18';
+XA START 'trx_19';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_19';
+SELECT * FROM t;
+a
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+disconnect conn2tmp;
+disconnect conn3tmp;
+disconnect conn2ro;
+disconnect conn3ro;
+disconnect conn2empty;
+disconnect conn3empty;
+connection default;
+XA ROLLBACK 'trx_20';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn19;
+connection default;
+XA ROLLBACK 'trx_19';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn18;
+connection default;
+XA ROLLBACK 'trx_18';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn17;
+connection default;
+XA ROLLBACK 'trx_17';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn16;
+connection default;
+XA ROLLBACK 'trx_16';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn15;
+connection default;
+XA ROLLBACK 'trx_15';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn14;
+connection default;
+XA ROLLBACK 'trx_14';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn13;
+connection default;
+XA ROLLBACK 'trx_13';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn12;
+connection default;
+XA ROLLBACK 'trx_12';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn11;
+connection default;
+XA ROLLBACK 'trx_11';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn10;
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx1tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx1tmp';
+XA PREPARE 'trx1tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx2tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx2tmp';
+XA PREPARE 'trx2tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx3tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx3tmp';
+XA PREPARE 'trx3tmp';
+connection default;
+XA COMMIT 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA ROLLBACK 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA START 'trx1tmp';
+ERROR XAE08: XAER_DUPID: The XID already exists
+connection default;
+*** 3 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1tmp;
+disconnect conn1tmp;
+connection default;
+XA COMMIT 'trx1tmp';
+KILL connection CONN_ID;
+XA COMMIT 'trx3tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx1ro';
+XA PREPARE 'trx1ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx2ro';
+XA PREPARE 'trx2ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx3ro';
+XA PREPARE 'trx3ro';
+connection default;
+*** 4 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1ro;
+disconnect conn1ro;
+connection default;
+XA ROLLBACK 'trx1ro';
+KILL connection CONN_ID;
+XA ROLLBACK 'trx3ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1empty';
+XA END 'trx1empty';
+XA PREPARE 'trx1empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2empty';
+XA END 'trx2empty';
+XA PREPARE 'trx2empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3empty';
+XA END 'trx3empty';
+XA PREPARE 'trx3empty';
+connection default;
+*** 5 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1empty;
+disconnect conn1empty;
+connection default;
+XA COMMIT 'trx1empty';
+KILL connection CONN_ID;
+XA COMMIT 'trx3empty';
+connect conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1unprepared';
+INSERT INTO t set a=0;
+XA END 'trx1unprepared';
+INSERT INTO t set a=0;
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+disconnect conn1unprepared;
+connection default;
+XA COMMIT 'trx1unprepared';
+ERROR XAE04: XAER_NOTA: Unknown XID
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_0';
+INSERT INTO t SET a=0;
+XA END 'trx_0';
+XA PREPARE 'trx_0';
+disconnect conn0;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_1';
+INSERT INTO t SET a=1;
+XA END 'trx_1';
+XA PREPARE 'trx_1';
+disconnect conn1;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_2';
+INSERT INTO t SET a=2;
+XA END 'trx_2';
+XA PREPARE 'trx_2';
+disconnect conn2;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_3';
+INSERT INTO t SET a=3;
+XA END 'trx_3';
+XA PREPARE 'trx_3';
+disconnect conn3;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_4';
+INSERT INTO t SET a=4;
+XA END 'trx_4';
+XA PREPARE 'trx_4';
+disconnect conn4;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_5';
+INSERT INTO t SET a=5;
+XA END 'trx_5';
+XA PREPARE 'trx_5';
+disconnect conn5;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_6';
+INSERT INTO t SET a=6;
+XA END 'trx_6';
+XA PREPARE 'trx_6';
+disconnect conn6;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_7';
+INSERT INTO t SET a=7;
+XA END 'trx_7';
+XA PREPARE 'trx_7';
+disconnect conn7;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_8';
+INSERT INTO t SET a=8;
+XA END 'trx_8';
+XA PREPARE 'trx_8';
+disconnect conn8;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_9';
+INSERT INTO t SET a=9;
+XA END 'trx_9';
+XA PREPARE 'trx_9';
+disconnect conn9;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_10';
+INSERT INTO t SET a=10;
+XA END 'trx_10';
+XA PREPARE 'trx_10';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_11';
+INSERT INTO t SET a=11;
+XA END 'trx_11';
+XA PREPARE 'trx_11';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_12';
+INSERT INTO t SET a=12;
+XA END 'trx_12';
+XA PREPARE 'trx_12';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_13';
+INSERT INTO t SET a=13;
+XA END 'trx_13';
+XA PREPARE 'trx_13';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_14';
+INSERT INTO t SET a=14;
+XA END 'trx_14';
+XA PREPARE 'trx_14';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_15';
+INSERT INTO t SET a=15;
+XA END 'trx_15';
+XA PREPARE 'trx_15';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_16';
+INSERT INTO t SET a=16;
+XA END 'trx_16';
+XA PREPARE 'trx_16';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_17';
+INSERT INTO t SET a=17;
+XA END 'trx_17';
+XA PREPARE 'trx_17';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_18';
+INSERT INTO t SET a=18;
+XA END 'trx_18';
+XA PREPARE 'trx_18';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_19';
+INSERT INTO t SET a=19;
+XA END 'trx_19';
+XA PREPARE 'trx_19';
+connection default;
+KILL CONNECTION CONN_ID;
+connection default;
+XA ROLLBACK 'trx_0';
+XA ROLLBACK 'trx_1';
+XA ROLLBACK 'trx_2';
+XA ROLLBACK 'trx_3';
+XA ROLLBACK 'trx_4';
+XA COMMIT 'trx_5';
+XA COMMIT 'trx_6';
+XA COMMIT 'trx_7';
+XA COMMIT 'trx_8';
+XA COMMIT 'trx_9';
+# Kill and restart
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_0';
+INSERT INTO t SET a=0;
+XA END 'new_trx_0';
+XA PREPARE 'new_trx_0';
+disconnect conn_restart_0;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_1';
+INSERT INTO t SET a=1;
+XA END 'new_trx_1';
+XA PREPARE 'new_trx_1';
+disconnect conn_restart_1;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_2';
+INSERT INTO t SET a=2;
+XA END 'new_trx_2';
+XA PREPARE 'new_trx_2';
+disconnect conn_restart_2;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_3';
+INSERT INTO t SET a=3;
+XA END 'new_trx_3';
+XA PREPARE 'new_trx_3';
+disconnect conn_restart_3;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_4';
+INSERT INTO t SET a=4;
+XA END 'new_trx_4';
+XA PREPARE 'new_trx_4';
+disconnect conn_restart_4;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_5';
+INSERT INTO t SET a=5;
+XA END 'new_trx_5';
+XA PREPARE 'new_trx_5';
+disconnect conn_restart_5;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_6';
+INSERT INTO t SET a=6;
+XA END 'new_trx_6';
+XA PREPARE 'new_trx_6';
+disconnect conn_restart_6;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_7';
+INSERT INTO t SET a=7;
+XA END 'new_trx_7';
+XA PREPARE 'new_trx_7';
+disconnect conn_restart_7;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_8';
+INSERT INTO t SET a=8;
+XA END 'new_trx_8';
+XA PREPARE 'new_trx_8';
+disconnect conn_restart_8;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_9';
+INSERT INTO t SET a=9;
+XA END 'new_trx_9';
+XA PREPARE 'new_trx_9';
+disconnect conn_restart_9;
+connection default;
+connection default;
+XA COMMIT 'new_trx_0';
+XA COMMIT 'new_trx_1';
+XA COMMIT 'new_trx_2';
+XA COMMIT 'new_trx_3';
+XA COMMIT 'new_trx_4';
+XA COMMIT 'new_trx_5';
+XA COMMIT 'new_trx_6';
+XA COMMIT 'new_trx_7';
+XA COMMIT 'new_trx_8';
+XA COMMIT 'new_trx_9';
+XA START 'trx_10';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_10';
+XA START 'trx_11';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_11';
+XA START 'trx_12';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_12';
+XA START 'trx_13';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_13';
+XA START 'trx_14';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_14';
+XA START 'trx_15';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_15';
+XA START 'trx_16';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_16';
+XA START 'trx_17';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_17';
+XA START 'trx_18';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_18';
+XA START 'trx_19';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_19';
+SELECT * FROM t;
+a
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+disconnect conn2tmp;
+disconnect conn3tmp;
+disconnect conn2ro;
+disconnect conn3ro;
+disconnect conn2empty;
+disconnect conn3empty;
+connection default;
+XA ROLLBACK 'trx_20';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn19;
+connection default;
+XA ROLLBACK 'trx_19';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn18;
+connection default;
+XA ROLLBACK 'trx_18';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn17;
+connection default;
+XA ROLLBACK 'trx_17';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn16;
+connection default;
+XA ROLLBACK 'trx_16';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn15;
+connection default;
+XA ROLLBACK 'trx_15';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn14;
+connection default;
+XA ROLLBACK 'trx_14';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn13;
+connection default;
+XA ROLLBACK 'trx_13';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn12;
+connection default;
+XA ROLLBACK 'trx_12';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn11;
+connection default;
+XA ROLLBACK 'trx_11';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn10;
+connection default;
+XA START 'one_phase_trx_0';
+INSERT INTO t SET a=0;
+XA END 'one_phase_trx_0';
+XA COMMIT 'one_phase_trx_0' ONE PHASE;
+XA START 'one_phase_trx_1';
+INSERT INTO t SET a=1;
+XA END 'one_phase_trx_1';
+XA COMMIT 'one_phase_trx_1' ONE PHASE;
+XA START 'one_phase_trx_2';
+INSERT INTO t SET a=2;
+XA END 'one_phase_trx_2';
+XA COMMIT 'one_phase_trx_2' ONE PHASE;
+XA START 'one_phase_trx_3';
+INSERT INTO t SET a=3;
+XA END 'one_phase_trx_3';
+XA COMMIT 'one_phase_trx_3' ONE PHASE;
+XA START 'one_phase_trx_4';
+INSERT INTO t SET a=4;
+XA END 'one_phase_trx_4';
+XA COMMIT 'one_phase_trx_4' ONE PHASE;
+SELECT SUM(a) FROM t;
+SUM(a)
+290
+DROP TABLE t;
+DROP VIEW v_processlist;
+All transactions must be completed, to empty-list the following:
+XA RECOVER;
+formatID gtrid_length bqual_length data
diff --git a/mysql-test/main/xa_prepared_binlog_off.test b/mysql-test/main/xa_prepared_binlog_off.test
new file mode 100644
index 00000000000..edbfa7c2825
--- /dev/null
+++ b/mysql-test/main/xa_prepared_binlog_off.test
@@ -0,0 +1,11 @@
+###############################################################################
+# MDEV-7974 (bug#12161 Xa recovery and client disconnection)
+# Testing XA behaviour with binlog turned off.
+###############################################################################
+
+--source include/not_valgrind.inc
+--source include/not_embedded.inc
+
+# Common part with XA binlogging testing
+call mtr.add_suppression("You need to use --log-bin to make --log-slave-updates work.");
+--source suite/binlog/t/binlog_xa_prepared.inc
diff --git a/mysql-test/main/xa_sync.result b/mysql-test/main/xa_sync.result
index 1482ff5cacf..e7dd9b02847 100644
--- a/mysql-test/main/xa_sync.result
+++ b/mysql-test/main/xa_sync.result
@@ -18,6 +18,11 @@ disconnect con1;
SET debug_sync='now SIGNAL go';
connection con2;
ERROR XAE04: XAER_NOTA: Unknown XID
+*** Must have 'xatest' in the list
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 6 0 xatest
+XA COMMIT 'xatest';
disconnect con2;
connection default;
SET debug_sync='RESET';
@@ -37,6 +42,11 @@ disconnect con1;
SET debug_sync='now SIGNAL go';
connection con2;
ERROR XAE04: XAER_NOTA: Unknown XID
+*** Must have 'xatest' in the list
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 6 0 xatest
+XA ROLLBACK 'xatest';
disconnect con2;
connection default;
SET debug_sync='RESET';
diff --git a/mysql-test/main/xa_sync.test b/mysql-test/main/xa_sync.test
index bb95af7c0ba..2fe7337501e 100644
--- a/mysql-test/main/xa_sync.test
+++ b/mysql-test/main/xa_sync.test
@@ -35,6 +35,11 @@ while ($i)
connection con2;
--error ER_XAER_NOTA
reap;
+ --echo *** Must have 'xatest' in the list
+ XA RECOVER;
+ # second time yields no error
+ --error 0
+ --eval $op
disconnect con2;
connection default;
diff --git a/mysql-test/suite/binlog/include/binlog_xa_prepare_connection.inc b/mysql-test/suite/binlog/include/binlog_xa_prepare_connection.inc
new file mode 100644
index 00000000000..c0041af1e7f
--- /dev/null
+++ b/mysql-test/suite/binlog/include/binlog_xa_prepare_connection.inc
@@ -0,0 +1,31 @@
+#
+# This file initiates a connection to run an XA transaction up to
+# its prepare.
+# The connection name, transaction name and its content depend on
+# the supplied parameters.
+#
+# param $type      type of transaction
+# param $index     index identifying the connection among those of type $type
+# param $sql_init1 a query to execute once the connection is established
+# param $sql_init2 a query to execute once the connection is established
+# param $sql_doit  a query to execute inside the transaction
+# Note: the query may depend on tables created by the caller
+#
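+# A minimal usage sketch (the parameter values mirror one of the callers
+# below and are only illustrative):
+#
+#   --let $type = tmp
+#   --let $index = 1
+#   --let $sql_init1 = SET @@sql_log_bin = OFF
+#   --let $sql_init2 = CREATE TEMPORARY TABLE tmp$index (a int) ENGINE=innodb
+#   --let $sql_doit = INSERT INTO tmp$index SET a=$index
+#   --source suite/binlog/include/binlog_xa_prepare_connection.inc
+#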
+
+--connect (conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+if ($sql_init1)
+{
+ --eval $sql_init1
+}
+if ($sql_init2)
+{
+ --eval $sql_init2
+}
+
+--eval XA START 'trx$index$type'
+if ($sql_doit)
+{
+ --eval $sql_doit
+}
+--eval XA END 'trx$index$type'
+--eval XA PREPARE 'trx$index$type'
diff --git a/mysql-test/suite/binlog/include/binlog_xa_prepare_disconnect.inc b/mysql-test/suite/binlog/include/binlog_xa_prepare_disconnect.inc
new file mode 100644
index 00000000000..1f6ce713cc9
--- /dev/null
+++ b/mysql-test/suite/binlog/include/binlog_xa_prepare_disconnect.inc
@@ -0,0 +1,35 @@
+#
+# This file disconnects two connections: one actively and one through
+# KILL. It is included by binlog_xa_prepared_do_and_restart.inc.
+#
+# param $type             type of transaction
+# param $terminate_with   how to conclude the actively disconnected one:
+#                         XA COMMIT or XA ROLLBACK
+# param $conn3_id         connection id of the connection to be killed
+# param $num_trx_prepared number of transactions prepared so far
+#
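+# A minimal usage sketch, as the callers below do it (the numeric value is
+# illustrative and must match how many transactions are currently prepared):
+#
+#   --let $terminate_with = XA COMMIT
+#   --let $num_trx_prepared = 3
+#   --let $conn3_id = `SELECT connection_id()`  # taken on the to-be-killed connection
+#   --source suite/binlog/include/binlog_xa_prepare_disconnect.inc
+#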
+--connection default
+
+--echo *** $num_trx_prepared prepared transactions must be in the list ***
+--replace_column 2 LEN1 3 LEN2 4 TRX_N
+XA RECOVER;
+
+--connection conn1$type
+--let $conn1_id=`SELECT connection_id()`
+--disconnect conn1$type
+
+--connection default
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn1_id
+--source include/wait_condition.inc
+
+# It will conclude now
+--eval $terminate_with 'trx1$type'
+
+--replace_result $conn3_id CONN_ID
+--eval KILL connection $conn3_id
+
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn3_id
+--source include/wait_condition.inc
+
+# It will conclude now
+--eval $terminate_with 'trx3$type'
diff --git a/mysql-test/suite/binlog/include/binlog_xa_prepared_do_and_restart.inc b/mysql-test/suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
new file mode 100644
index 00000000000..cbd740fdae4
--- /dev/null
+++ b/mysql-test/suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
@@ -0,0 +1,323 @@
+#
+# This file creates various kinds of prepared XA transactions,
+# manipulates their connection state and examines how their prepared
+# status behaves while the transaction is disconnected, killed or
+# the server is shut down.
+# The file can be sourced multiple times;
+# param $restart_number (the number of inclusions so far) adjusts
+# the verification logic.
+#
+# param [in]  $conn_number            Total number of connections, each
+#                                     performing one insert into the table.
+# param [in]  $commit_number          Number of commits on either
+#                                     side of the server restart.
+# param [in]  $rollback_number        The same as the above, just for rollbacks.
+# param [in]  $term_number            Number of transactions that are terminated
+#                                     before the server restarts.
+# param [in]  $killed_number          Instead of disconnecting, kill that many
+#                                     connections once their transactions
+#                                     are prepared.
+# param [in]  $server_disconn_number  Make that many connections disconnected
+#                                     by shutdown rather than actively.
+# param [out] $restart_number         Counter to be incremented at the end of the test.
+# param [in]  $post_restart_conn_number  Number of "warmup" connections after
+#                                     the server restart; they all commit.
+#
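+# Illustrative caller setup (the values are inferred from the expected
+# results in this patch; the restart include name is an assumption):
+#
+#   --let $conn_number = 20
+#   --let $rollback_number = 5
+#   --let $commit_number = 5
+#   --let $term_number = 10
+#   --let $killed_number = 5
+#   --let $server_disconn_number = 5
+#   --let $post_restart_conn_number = 10
+#   --let $how_to_restart = restart_mysqld.inc
+#   --let $restart_number = 0
+#   --source suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
+#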
+
+# The test consists of three sections:
+# I. Corner cases check
+# II. Regular case check
+# III. Post server-restart verification
+
+
+#
+# I. Corner cases:
+#
+# A. XA with an update to a temp table
+# B. XA with SELECT
+# C. XA empty
+# Demonstrate their XA status upon prepare and how they react to disconnect and
+# shutdown.
+# In each of A, B, C three prepared transactions are set up.
+# trx1 is for disconnection, trx2 for shutdown, trx3 for being killed.
+# Case A additionally checks some prohibited XA state transitions.
+#
+# D. Prove that a not-yet-prepared XA transaction is cleared out by disconnection.
+#
+
+#
+# A. A prepared XA transaction that touched only a temporary table recovers
+# only formally, so a post-recovery XA COMMIT or XA ROLLBACK has no effect.
+
+--let $type = tmp
+--let $index = 1
+--let $sql_init1 = SET @@sql_log_bin = OFF
+--let $sql_init2 = CREATE TEMPORARY TABLE tmp$index (a int) ENGINE=innodb
+--let $sql_doit = INSERT INTO tmp$index SET a=$index
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+
+--let $index = 2
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+
+--let $index = 3
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+--let $conn3_id=`SELECT connection_id()`
+
+#
+# Various prohibited XA state changes to test here:
+#
+
+--connection default
+# Stealing is not allowed
+--error ER_XAER_NOTA
+--eval XA COMMIT 'trx1$type'
+--error ER_XAER_NOTA
+--eval XA ROLLBACK 'trx1$type'
+
+# Before disconnect: creating a duplicate is not allowed
+--error ER_XAER_DUPID
+--eval XA START 'trx1$type'
+
+# Manipulate now the prepared transactions.
+# Two to terminate, one to leave out.
+--let $terminate_with = XA COMMIT
+--let $num_trx_prepared = $index
+--source suite/binlog/include/binlog_xa_prepare_disconnect.inc
+
+#
+# B. "Read-only" (select) prepared XA recovers only formally to
+# let post recovery XA COMMIT or XA ROLLBACK with no effect.
+#
+--let $type=ro
+--let $index = 1
+--let $sql_init1 =
+--let $sql_init2 =
+--let $sql_doit = SELECT * from t ORDER BY a
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+
+--let $index = 2
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+
+--let $index = 3
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+--let $conn3_id=`SELECT connection_id()`
+
+--let $terminate_with = XA ROLLBACK
+# Two of the three transactions prepared in the section above were terminated.
+--inc $num_trx_prepared
+--source suite/binlog/include/binlog_xa_prepare_disconnect.inc
+
+#
+# C. An empty prepared XA transaction recovers only formally,
+# so a post-recovery XA COMMIT or XA ROLLBACK has no effect.
+#
+--let $type=empty
+--let $index = 1
+--let $sql_init1 =
+--let $sql_init2 =
+--let $sql_doit =
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+
+--let $index = 2
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+
+--let $index = 3
+--source suite/binlog/include/binlog_xa_prepare_connection.inc
+--let $conn3_id=`SELECT connection_id()`
+
+--let $terminate_with = XA COMMIT
+--inc $num_trx_prepared
+--source suite/binlog/include/binlog_xa_prepare_disconnect.inc
+
+#
+# D. A not-yet-prepared XA transaction is cleared out on disconnect,
+# leaving no effect on the data either.
+# A few more prohibited XA state transitions are checked as well.
+#
+--let $type=unprepared
+--let $prev_count=`SELECT count(*) from t`
+
+--connect(conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+--eval XA START 'trx1$type'
+INSERT INTO t set a=0;
+--eval XA END 'trx1$type'
+
+--error ER_XAER_RMFAIL
+INSERT INTO t set a=0;
+--error ER_XAER_RMFAIL
+--eval XA START 'trx1$type'
+--error ER_XAER_RMFAIL
+--eval XA START 'trx1$type'
+
+--disconnect conn1$type
+
+--connection default
+# No such transactions
+--error ER_XAER_NOTA
+--eval XA COMMIT 'trx1$type'
+if (`SELECT count(*) > $prev_count from t`)
+{
+ --echo *** Unexpected commit to the table. ***
+ --die
+}
+
+#
+# II. Regular case.
+#
+# Prepared transactions get disconnected in three ways:
+# actively, by being killed, and by server shutdown.
+#
+--let $i=0
+while ($i < $conn_number)
+{
+ --connect (conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+ --let $conn_id=`SELECT connection_id()`
+ --disable_reconnect
+ SET @@binlog_format = STATEMENT;
+ if (`SELECT $i % 2`)
+ {
+ SET @@binlog_format = ROW;
+ }
+ --eval XA START 'trx_$i'
+ --eval INSERT INTO t SET a=$i
+ --eval XA END 'trx_$i'
+ --eval XA PREPARE 'trx_$i'
+
+ --let $disc_via_kill=`SELECT $conn_number - $i <= $killed_number`
+ if (!$disc_via_kill)
+ {
+ --let $disc_via_shutdown=`SELECT $conn_number - $i <= $killed_number + $server_disconn_number`
+ if (!$disc_via_shutdown)
+ {
+ --disconnect conn$i
+ }
+ }
+ if ($disc_via_kill)
+ {
+ --connection default
+ --replace_result $conn_id CONN_ID
+ --eval KILL CONNECTION $conn_id
+ }
+
+ if (!$disc_via_shutdown)
+ {
+ --connection default
+ --let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+ --source include/wait_condition.inc
+ }
+ --inc $i
+}
+
+# [0, $rollback_number - 1] are rolled back now
+--connection default
+
+--let $i=0
+while ($i < $rollback_number)
+{
+ --eval XA ROLLBACK 'trx_$i'
+
+ --inc $i
+}
+
+# [$rollback_number, $rollback_number + $commit_number - 1] get committed
+while ($i < $term_number)
+{
+ --eval XA COMMIT 'trx_$i'
+
+ --inc $i
+}
+
+--source include/$how_to_restart
+
+#
+# III. Post server-restart verification.
+# It concludes the surviving XA transactions with commits and rollbacks,
+# as configured in the 1st part, and checks the expected results at the end.
+# The cleanup section consists of explicit disconnects (for connections that
+# were killed, or not disconnected before shutdown).
+#
+
+# New XA can be prepared and committed
+--let $k = 0
+while ($k < $post_restart_conn_number)
+{
+ --connect (conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+ --let $conn_id=`SELECT connection_id()`
+ --eval XA START 'new_trx_$k'
+ --eval INSERT INTO t SET a=$k
+ --eval XA END 'new_trx_$k'
+ --eval XA PREPARE 'new_trx_$k'
+
+ --disconnect conn_restart_$k
+
+ --connection default
+ --let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+ --source include/wait_condition.inc
+
+ --inc $k
+}
+
+--connection default
+--let $k = 0
+while ($k < $post_restart_conn_number)
+{
+ --eval XA COMMIT 'new_trx_$k'
+ --inc $k
+}
+
+#
+# Symmetrically to the pre-restart part, the resurrected transactions
+# [$term_number, $term_number + $commit_number - 1] are committed
+# and the rest are rolled back.
+#
+--let $i = $term_number
+
+while ($i < `SELECT $term_number + $commit_number`)
+{
+ # Expected to fail
+ --error ER_XAER_DUPID
+ --eval XA START 'trx_$i'
+ --eval XA COMMIT 'trx_$i'
+ --inc $i
+}
+
+while ($i < $conn_number)
+{
+ # Expected to fail
+ --error ER_XAER_DUPID
+ --eval XA START 'trx_$i'
+ --eval XA ROLLBACK 'trx_$i'
+ --inc $i
+}
+
+#
+# Verification of correct results of recovered XA transaction handling:
+#
+SELECT * FROM t;
+
+--let $type=tmp
+--disconnect conn2$type
+--disconnect conn3$type
+--let $type=ro
+--disconnect conn2$type
+--disconnect conn3$type
+--let $type=empty
+--disconnect conn2$type
+--disconnect conn3$type
+
+--let $i= $conn_number
+--let $k= 0
+--let $expl_disconn_number = `SELECT $killed_number + $server_disconn_number`
+while ($k < $expl_disconn_number)
+{
+ --connection default
+ --error ER_XAER_NOTA
+ --eval XA ROLLBACK 'trx_$i'
+
+ --dec $i
+ --disconnect conn$i
+
+ --inc $k
+}
+
+--inc $restart_number
diff --git a/mysql-test/suite/binlog/r/binlog_xa_checkpoint.result b/mysql-test/suite/binlog/r/binlog_xa_checkpoint.result
new file mode 100644
index 00000000000..d8a5818674f
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_xa_checkpoint.result
@@ -0,0 +1,33 @@
+RESET MASTER;
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+connect con1,localhost,root,,;
+SET DEBUG_SYNC= "at_unlog_xa_prepare SIGNAL con1_ready WAIT_FOR con1_go";
+XA START '1';
+INSERT INTO t1 SET a=1;
+XA END '1';
+XA PREPARE '1';;
+connection default;
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+FLUSH LOGS;
+FLUSH LOGS;
+FLUSH LOGS;
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+master-bin.000002 #
+master-bin.000003 #
+master-bin.000004 #
+include/show_binlog_events.inc
+Log_name Pos Event_type Server_id End_log_pos Info
+master-bin.000004 # Format_desc # # SERVER_VERSION, BINLOG_VERSION
+master-bin.000004 # Gtid_list # # [#-#-#]
+master-bin.000004 # Binlog_checkpoint # # master-bin.000001
+SET DEBUG_SYNC= "now SIGNAL con1_go";
+connection con1;
+*** master-bin.000004 checkpoint must show up now ***
+connection con1;
+XA ROLLBACK '1';
+SET debug_sync = 'reset';
+connection default;
+DROP TABLE t1;
+SET debug_sync = 'reset';
diff --git a/mysql-test/suite/binlog/r/binlog_xa_prepared.result b/mysql-test/suite/binlog/r/binlog_xa_prepared.result
new file mode 100644
index 00000000000..9fda8ab3143
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_xa_prepared.result
@@ -0,0 +1,1176 @@
+connection default;
+RESET MASTER;
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+call mtr.add_suppression("Found 10 prepared XA transactions");
+CREATE TABLE t (a INT) ENGINE=innodb;
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx1tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx1tmp';
+XA PREPARE 'trx1tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx2tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx2tmp';
+XA PREPARE 'trx2tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx3tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx3tmp';
+XA PREPARE 'trx3tmp';
+connection default;
+XA COMMIT 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA ROLLBACK 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA START 'trx1tmp';
+ERROR XAE08: XAER_DUPID: The XID already exists
+connection default;
+*** 3 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1tmp;
+disconnect conn1tmp;
+connection default;
+XA COMMIT 'trx1tmp';
+KILL connection CONN_ID;
+XA COMMIT 'trx3tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx1ro';
+XA PREPARE 'trx1ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx2ro';
+XA PREPARE 'trx2ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx3ro';
+XA PREPARE 'trx3ro';
+connection default;
+*** 4 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1ro;
+disconnect conn1ro;
+connection default;
+XA ROLLBACK 'trx1ro';
+KILL connection CONN_ID;
+XA ROLLBACK 'trx3ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1empty';
+XA END 'trx1empty';
+XA PREPARE 'trx1empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2empty';
+XA END 'trx2empty';
+XA PREPARE 'trx2empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3empty';
+XA END 'trx3empty';
+XA PREPARE 'trx3empty';
+connection default;
+*** 5 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1empty;
+disconnect conn1empty;
+connection default;
+XA COMMIT 'trx1empty';
+KILL connection CONN_ID;
+XA COMMIT 'trx3empty';
+connect conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1unprepared';
+INSERT INTO t set a=0;
+XA END 'trx1unprepared';
+INSERT INTO t set a=0;
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+disconnect conn1unprepared;
+connection default;
+XA COMMIT 'trx1unprepared';
+ERROR XAE04: XAER_NOTA: Unknown XID
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_0';
+INSERT INTO t SET a=0;
+XA END 'trx_0';
+XA PREPARE 'trx_0';
+disconnect conn0;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_1';
+INSERT INTO t SET a=1;
+XA END 'trx_1';
+XA PREPARE 'trx_1';
+disconnect conn1;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_2';
+INSERT INTO t SET a=2;
+XA END 'trx_2';
+XA PREPARE 'trx_2';
+disconnect conn2;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_3';
+INSERT INTO t SET a=3;
+XA END 'trx_3';
+XA PREPARE 'trx_3';
+disconnect conn3;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_4';
+INSERT INTO t SET a=4;
+XA END 'trx_4';
+XA PREPARE 'trx_4';
+disconnect conn4;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_5';
+INSERT INTO t SET a=5;
+XA END 'trx_5';
+XA PREPARE 'trx_5';
+disconnect conn5;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_6';
+INSERT INTO t SET a=6;
+XA END 'trx_6';
+XA PREPARE 'trx_6';
+disconnect conn6;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_7';
+INSERT INTO t SET a=7;
+XA END 'trx_7';
+XA PREPARE 'trx_7';
+disconnect conn7;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_8';
+INSERT INTO t SET a=8;
+XA END 'trx_8';
+XA PREPARE 'trx_8';
+disconnect conn8;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_9';
+INSERT INTO t SET a=9;
+XA END 'trx_9';
+XA PREPARE 'trx_9';
+disconnect conn9;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_10';
+INSERT INTO t SET a=10;
+XA END 'trx_10';
+XA PREPARE 'trx_10';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_11';
+INSERT INTO t SET a=11;
+XA END 'trx_11';
+XA PREPARE 'trx_11';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_12';
+INSERT INTO t SET a=12;
+XA END 'trx_12';
+XA PREPARE 'trx_12';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_13';
+INSERT INTO t SET a=13;
+XA END 'trx_13';
+XA PREPARE 'trx_13';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_14';
+INSERT INTO t SET a=14;
+XA END 'trx_14';
+XA PREPARE 'trx_14';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_15';
+INSERT INTO t SET a=15;
+XA END 'trx_15';
+XA PREPARE 'trx_15';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_16';
+INSERT INTO t SET a=16;
+XA END 'trx_16';
+XA PREPARE 'trx_16';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_17';
+INSERT INTO t SET a=17;
+XA END 'trx_17';
+XA PREPARE 'trx_17';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_18';
+INSERT INTO t SET a=18;
+XA END 'trx_18';
+XA PREPARE 'trx_18';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_19';
+INSERT INTO t SET a=19;
+XA END 'trx_19';
+XA PREPARE 'trx_19';
+connection default;
+KILL CONNECTION CONN_ID;
+connection default;
+XA ROLLBACK 'trx_0';
+XA ROLLBACK 'trx_1';
+XA ROLLBACK 'trx_2';
+XA ROLLBACK 'trx_3';
+XA ROLLBACK 'trx_4';
+XA COMMIT 'trx_5';
+XA COMMIT 'trx_6';
+XA COMMIT 'trx_7';
+XA COMMIT 'trx_8';
+XA COMMIT 'trx_9';
+# restart
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_0';
+INSERT INTO t SET a=0;
+XA END 'new_trx_0';
+XA PREPARE 'new_trx_0';
+disconnect conn_restart_0;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_1';
+INSERT INTO t SET a=1;
+XA END 'new_trx_1';
+XA PREPARE 'new_trx_1';
+disconnect conn_restart_1;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_2';
+INSERT INTO t SET a=2;
+XA END 'new_trx_2';
+XA PREPARE 'new_trx_2';
+disconnect conn_restart_2;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_3';
+INSERT INTO t SET a=3;
+XA END 'new_trx_3';
+XA PREPARE 'new_trx_3';
+disconnect conn_restart_3;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_4';
+INSERT INTO t SET a=4;
+XA END 'new_trx_4';
+XA PREPARE 'new_trx_4';
+disconnect conn_restart_4;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_5';
+INSERT INTO t SET a=5;
+XA END 'new_trx_5';
+XA PREPARE 'new_trx_5';
+disconnect conn_restart_5;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_6';
+INSERT INTO t SET a=6;
+XA END 'new_trx_6';
+XA PREPARE 'new_trx_6';
+disconnect conn_restart_6;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_7';
+INSERT INTO t SET a=7;
+XA END 'new_trx_7';
+XA PREPARE 'new_trx_7';
+disconnect conn_restart_7;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_8';
+INSERT INTO t SET a=8;
+XA END 'new_trx_8';
+XA PREPARE 'new_trx_8';
+disconnect conn_restart_8;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_9';
+INSERT INTO t SET a=9;
+XA END 'new_trx_9';
+XA PREPARE 'new_trx_9';
+disconnect conn_restart_9;
+connection default;
+connection default;
+XA COMMIT 'new_trx_0';
+XA COMMIT 'new_trx_1';
+XA COMMIT 'new_trx_2';
+XA COMMIT 'new_trx_3';
+XA COMMIT 'new_trx_4';
+XA COMMIT 'new_trx_5';
+XA COMMIT 'new_trx_6';
+XA COMMIT 'new_trx_7';
+XA COMMIT 'new_trx_8';
+XA COMMIT 'new_trx_9';
+XA START 'trx_10';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_10';
+XA START 'trx_11';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_11';
+XA START 'trx_12';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_12';
+XA START 'trx_13';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_13';
+XA START 'trx_14';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_14';
+XA START 'trx_15';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_15';
+XA START 'trx_16';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_16';
+XA START 'trx_17';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_17';
+XA START 'trx_18';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_18';
+XA START 'trx_19';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_19';
+SELECT * FROM t;
+a
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+disconnect conn2tmp;
+disconnect conn3tmp;
+disconnect conn2ro;
+disconnect conn3ro;
+disconnect conn2empty;
+disconnect conn3empty;
+connection default;
+XA ROLLBACK 'trx_20';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn19;
+connection default;
+XA ROLLBACK 'trx_19';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn18;
+connection default;
+XA ROLLBACK 'trx_18';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn17;
+connection default;
+XA ROLLBACK 'trx_17';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn16;
+connection default;
+XA ROLLBACK 'trx_16';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn15;
+connection default;
+XA ROLLBACK 'trx_15';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn14;
+connection default;
+XA ROLLBACK 'trx_14';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn13;
+connection default;
+XA ROLLBACK 'trx_13';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn12;
+connection default;
+XA ROLLBACK 'trx_12';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn11;
+connection default;
+XA ROLLBACK 'trx_11';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn10;
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx1tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx1tmp';
+XA PREPARE 'trx1tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx2tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx2tmp';
+XA PREPARE 'trx2tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx3tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx3tmp';
+XA PREPARE 'trx3tmp';
+connection default;
+XA COMMIT 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA ROLLBACK 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA START 'trx1tmp';
+ERROR XAE08: XAER_DUPID: The XID already exists
+connection default;
+*** 3 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1tmp;
+disconnect conn1tmp;
+connection default;
+XA COMMIT 'trx1tmp';
+KILL connection CONN_ID;
+XA COMMIT 'trx3tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx1ro';
+XA PREPARE 'trx1ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx2ro';
+XA PREPARE 'trx2ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx3ro';
+XA PREPARE 'trx3ro';
+connection default;
+*** 4 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1ro;
+disconnect conn1ro;
+connection default;
+XA ROLLBACK 'trx1ro';
+KILL connection CONN_ID;
+XA ROLLBACK 'trx3ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1empty';
+XA END 'trx1empty';
+XA PREPARE 'trx1empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2empty';
+XA END 'trx2empty';
+XA PREPARE 'trx2empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3empty';
+XA END 'trx3empty';
+XA PREPARE 'trx3empty';
+connection default;
+*** 5 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1empty;
+disconnect conn1empty;
+connection default;
+XA COMMIT 'trx1empty';
+KILL connection CONN_ID;
+XA COMMIT 'trx3empty';
+connect conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1unprepared';
+INSERT INTO t set a=0;
+XA END 'trx1unprepared';
+INSERT INTO t set a=0;
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+disconnect conn1unprepared;
+connection default;
+XA COMMIT 'trx1unprepared';
+ERROR XAE04: XAER_NOTA: Unknown XID
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_0';
+INSERT INTO t SET a=0;
+XA END 'trx_0';
+XA PREPARE 'trx_0';
+disconnect conn0;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_1';
+INSERT INTO t SET a=1;
+XA END 'trx_1';
+XA PREPARE 'trx_1';
+disconnect conn1;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_2';
+INSERT INTO t SET a=2;
+XA END 'trx_2';
+XA PREPARE 'trx_2';
+disconnect conn2;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_3';
+INSERT INTO t SET a=3;
+XA END 'trx_3';
+XA PREPARE 'trx_3';
+disconnect conn3;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_4';
+INSERT INTO t SET a=4;
+XA END 'trx_4';
+XA PREPARE 'trx_4';
+disconnect conn4;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_5';
+INSERT INTO t SET a=5;
+XA END 'trx_5';
+XA PREPARE 'trx_5';
+disconnect conn5;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_6';
+INSERT INTO t SET a=6;
+XA END 'trx_6';
+XA PREPARE 'trx_6';
+disconnect conn6;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_7';
+INSERT INTO t SET a=7;
+XA END 'trx_7';
+XA PREPARE 'trx_7';
+disconnect conn7;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_8';
+INSERT INTO t SET a=8;
+XA END 'trx_8';
+XA PREPARE 'trx_8';
+disconnect conn8;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_9';
+INSERT INTO t SET a=9;
+XA END 'trx_9';
+XA PREPARE 'trx_9';
+disconnect conn9;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_10';
+INSERT INTO t SET a=10;
+XA END 'trx_10';
+XA PREPARE 'trx_10';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_11';
+INSERT INTO t SET a=11;
+XA END 'trx_11';
+XA PREPARE 'trx_11';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_12';
+INSERT INTO t SET a=12;
+XA END 'trx_12';
+XA PREPARE 'trx_12';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_13';
+INSERT INTO t SET a=13;
+XA END 'trx_13';
+XA PREPARE 'trx_13';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_14';
+INSERT INTO t SET a=14;
+XA END 'trx_14';
+XA PREPARE 'trx_14';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_15';
+INSERT INTO t SET a=15;
+XA END 'trx_15';
+XA PREPARE 'trx_15';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_16';
+INSERT INTO t SET a=16;
+XA END 'trx_16';
+XA PREPARE 'trx_16';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_17';
+INSERT INTO t SET a=17;
+XA END 'trx_17';
+XA PREPARE 'trx_17';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_18';
+INSERT INTO t SET a=18;
+XA END 'trx_18';
+XA PREPARE 'trx_18';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_19';
+INSERT INTO t SET a=19;
+XA END 'trx_19';
+XA PREPARE 'trx_19';
+connection default;
+KILL CONNECTION CONN_ID;
+connection default;
+XA ROLLBACK 'trx_0';
+XA ROLLBACK 'trx_1';
+XA ROLLBACK 'trx_2';
+XA ROLLBACK 'trx_3';
+XA ROLLBACK 'trx_4';
+XA COMMIT 'trx_5';
+XA COMMIT 'trx_6';
+XA COMMIT 'trx_7';
+XA COMMIT 'trx_8';
+XA COMMIT 'trx_9';
+# Kill and restart
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_0';
+INSERT INTO t SET a=0;
+XA END 'new_trx_0';
+XA PREPARE 'new_trx_0';
+disconnect conn_restart_0;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_1';
+INSERT INTO t SET a=1;
+XA END 'new_trx_1';
+XA PREPARE 'new_trx_1';
+disconnect conn_restart_1;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_2';
+INSERT INTO t SET a=2;
+XA END 'new_trx_2';
+XA PREPARE 'new_trx_2';
+disconnect conn_restart_2;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_3';
+INSERT INTO t SET a=3;
+XA END 'new_trx_3';
+XA PREPARE 'new_trx_3';
+disconnect conn_restart_3;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_4';
+INSERT INTO t SET a=4;
+XA END 'new_trx_4';
+XA PREPARE 'new_trx_4';
+disconnect conn_restart_4;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_5';
+INSERT INTO t SET a=5;
+XA END 'new_trx_5';
+XA PREPARE 'new_trx_5';
+disconnect conn_restart_5;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_6';
+INSERT INTO t SET a=6;
+XA END 'new_trx_6';
+XA PREPARE 'new_trx_6';
+disconnect conn_restart_6;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_7';
+INSERT INTO t SET a=7;
+XA END 'new_trx_7';
+XA PREPARE 'new_trx_7';
+disconnect conn_restart_7;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_8';
+INSERT INTO t SET a=8;
+XA END 'new_trx_8';
+XA PREPARE 'new_trx_8';
+disconnect conn_restart_8;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_9';
+INSERT INTO t SET a=9;
+XA END 'new_trx_9';
+XA PREPARE 'new_trx_9';
+disconnect conn_restart_9;
+connection default;
+connection default;
+XA COMMIT 'new_trx_0';
+XA COMMIT 'new_trx_1';
+XA COMMIT 'new_trx_2';
+XA COMMIT 'new_trx_3';
+XA COMMIT 'new_trx_4';
+XA COMMIT 'new_trx_5';
+XA COMMIT 'new_trx_6';
+XA COMMIT 'new_trx_7';
+XA COMMIT 'new_trx_8';
+XA COMMIT 'new_trx_9';
+XA START 'trx_10';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_10';
+XA START 'trx_11';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_11';
+XA START 'trx_12';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_12';
+XA START 'trx_13';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_13';
+XA START 'trx_14';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_14';
+XA START 'trx_15';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_15';
+XA START 'trx_16';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_16';
+XA START 'trx_17';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_17';
+XA START 'trx_18';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_18';
+XA START 'trx_19';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_19';
+SELECT * FROM t;
+a
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+disconnect conn2tmp;
+disconnect conn3tmp;
+disconnect conn2ro;
+disconnect conn3ro;
+disconnect conn2empty;
+disconnect conn3empty;
+connection default;
+XA ROLLBACK 'trx_20';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn19;
+connection default;
+XA ROLLBACK 'trx_19';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn18;
+connection default;
+XA ROLLBACK 'trx_18';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn17;
+connection default;
+XA ROLLBACK 'trx_17';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn16;
+connection default;
+XA ROLLBACK 'trx_16';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn15;
+connection default;
+XA ROLLBACK 'trx_15';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn14;
+connection default;
+XA ROLLBACK 'trx_14';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn13;
+connection default;
+XA ROLLBACK 'trx_13';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn12;
+connection default;
+XA ROLLBACK 'trx_12';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn11;
+connection default;
+XA ROLLBACK 'trx_11';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn10;
+connection default;
+XA START 'one_phase_trx_0';
+INSERT INTO t SET a=0;
+XA END 'one_phase_trx_0';
+XA COMMIT 'one_phase_trx_0' ONE PHASE;
+XA START 'one_phase_trx_1';
+INSERT INTO t SET a=1;
+XA END 'one_phase_trx_1';
+XA COMMIT 'one_phase_trx_1' ONE PHASE;
+XA START 'one_phase_trx_2';
+INSERT INTO t SET a=2;
+XA END 'one_phase_trx_2';
+XA COMMIT 'one_phase_trx_2' ONE PHASE;
+XA START 'one_phase_trx_3';
+INSERT INTO t SET a=3;
+XA END 'one_phase_trx_3';
+XA COMMIT 'one_phase_trx_3' ONE PHASE;
+XA START 'one_phase_trx_4';
+INSERT INTO t SET a=4;
+XA END 'one_phase_trx_4';
+XA COMMIT 'one_phase_trx_4' ONE PHASE;
+SELECT SUM(a) FROM t;
+SUM(a)
+290
+DROP TABLE t;
+DROP VIEW v_processlist;
+include/show_binlog_events.inc
+Log_name Pos Event_type Server_id End_log_pos Info
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_processlist` AS SELECT * FROM performance_schema.threads where type = 'FOREGROUND'
+master-bin.000001 # Gtid # # BEGIN GTID #-#-#
+master-bin.000001 # Query # # use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Found 10 prepared XA transactions' COLLATE 'latin1_swedish_ci'))
+master-bin.000001 # Query # # COMMIT
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE TABLE t (a INT) ENGINE=innodb
+master-bin.000001 # Gtid # # XA START X'7472785f30',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=0
+master-bin.000001 # Query # # XA END X'7472785f30',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f30',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f31',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=1
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f31',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f31',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f32',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=2
+master-bin.000001 # Query # # XA END X'7472785f32',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f32',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f33',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=3
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f33',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f33',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f34',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=4
+master-bin.000001 # Query # # XA END X'7472785f34',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f34',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f35',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=5
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f35',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f35',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f36',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=6
+master-bin.000001 # Query # # XA END X'7472785f36',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f36',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f37',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=7
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f37',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f37',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f38',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=8
+master-bin.000001 # Query # # XA END X'7472785f38',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f38',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f39',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=9
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f39',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f39',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3130',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=10
+master-bin.000001 # Query # # XA END X'7472785f3130',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3130',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3131',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=11
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3131',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3131',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3132',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=12
+master-bin.000001 # Query # # XA END X'7472785f3132',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3132',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3133',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=13
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3133',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3133',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3134',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=14
+master-bin.000001 # Query # # XA END X'7472785f3134',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3134',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3135',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=15
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3135',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3135',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3136',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=16
+master-bin.000001 # Query # # XA END X'7472785f3136',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3136',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3137',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=17
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3137',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3137',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3138',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=18
+master-bin.000001 # Query # # XA END X'7472785f3138',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3138',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3139',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=19
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3139',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3139',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f30',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f31',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f32',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f33',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f34',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f35',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f36',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f37',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f38',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f39',X'',1
+master-bin.000001 # Stop # #
+All transactions must be completed, so the following lists are empty:
+XA RECOVER;
+formatID gtrid_length bqual_length data
+XA RECOVER;
+formatID gtrid_length bqual_length data
diff --git a/mysql-test/suite/binlog/r/binlog_xa_prepared_disconnect.result b/mysql-test/suite/binlog/r/binlog_xa_prepared_disconnect.result
new file mode 100644
index 00000000000..9fda8ab3143
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_xa_prepared_disconnect.result
@@ -0,0 +1,1176 @@
+connection default;
+RESET MASTER;
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+call mtr.add_suppression("Found 10 prepared XA transactions");
+CREATE TABLE t (a INT) ENGINE=innodb;
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx1tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx1tmp';
+XA PREPARE 'trx1tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx2tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx2tmp';
+XA PREPARE 'trx2tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx3tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx3tmp';
+XA PREPARE 'trx3tmp';
+connection default;
+XA COMMIT 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA ROLLBACK 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA START 'trx1tmp';
+ERROR XAE08: XAER_DUPID: The XID already exists
+connection default;
+*** 3 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1tmp;
+disconnect conn1tmp;
+connection default;
+XA COMMIT 'trx1tmp';
+KILL connection CONN_ID;
+XA COMMIT 'trx3tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx1ro';
+XA PREPARE 'trx1ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx2ro';
+XA PREPARE 'trx2ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3ro';
+SELECT * from t ORDER BY a;
+a
+XA END 'trx3ro';
+XA PREPARE 'trx3ro';
+connection default;
+*** 4 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1ro;
+disconnect conn1ro;
+connection default;
+XA ROLLBACK 'trx1ro';
+KILL connection CONN_ID;
+XA ROLLBACK 'trx3ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1empty';
+XA END 'trx1empty';
+XA PREPARE 'trx1empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2empty';
+XA END 'trx2empty';
+XA PREPARE 'trx2empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3empty';
+XA END 'trx3empty';
+XA PREPARE 'trx3empty';
+connection default;
+*** 5 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1empty;
+disconnect conn1empty;
+connection default;
+XA COMMIT 'trx1empty';
+KILL connection CONN_ID;
+XA COMMIT 'trx3empty';
+connect conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1unprepared';
+INSERT INTO t set a=0;
+XA END 'trx1unprepared';
+INSERT INTO t set a=0;
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+disconnect conn1unprepared;
+connection default;
+XA COMMIT 'trx1unprepared';
+ERROR XAE04: XAER_NOTA: Unknown XID
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_0';
+INSERT INTO t SET a=0;
+XA END 'trx_0';
+XA PREPARE 'trx_0';
+disconnect conn0;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_1';
+INSERT INTO t SET a=1;
+XA END 'trx_1';
+XA PREPARE 'trx_1';
+disconnect conn1;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_2';
+INSERT INTO t SET a=2;
+XA END 'trx_2';
+XA PREPARE 'trx_2';
+disconnect conn2;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_3';
+INSERT INTO t SET a=3;
+XA END 'trx_3';
+XA PREPARE 'trx_3';
+disconnect conn3;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_4';
+INSERT INTO t SET a=4;
+XA END 'trx_4';
+XA PREPARE 'trx_4';
+disconnect conn4;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_5';
+INSERT INTO t SET a=5;
+XA END 'trx_5';
+XA PREPARE 'trx_5';
+disconnect conn5;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_6';
+INSERT INTO t SET a=6;
+XA END 'trx_6';
+XA PREPARE 'trx_6';
+disconnect conn6;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_7';
+INSERT INTO t SET a=7;
+XA END 'trx_7';
+XA PREPARE 'trx_7';
+disconnect conn7;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_8';
+INSERT INTO t SET a=8;
+XA END 'trx_8';
+XA PREPARE 'trx_8';
+disconnect conn8;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_9';
+INSERT INTO t SET a=9;
+XA END 'trx_9';
+XA PREPARE 'trx_9';
+disconnect conn9;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_10';
+INSERT INTO t SET a=10;
+XA END 'trx_10';
+XA PREPARE 'trx_10';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_11';
+INSERT INTO t SET a=11;
+XA END 'trx_11';
+XA PREPARE 'trx_11';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_12';
+INSERT INTO t SET a=12;
+XA END 'trx_12';
+XA PREPARE 'trx_12';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_13';
+INSERT INTO t SET a=13;
+XA END 'trx_13';
+XA PREPARE 'trx_13';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_14';
+INSERT INTO t SET a=14;
+XA END 'trx_14';
+XA PREPARE 'trx_14';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_15';
+INSERT INTO t SET a=15;
+XA END 'trx_15';
+XA PREPARE 'trx_15';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_16';
+INSERT INTO t SET a=16;
+XA END 'trx_16';
+XA PREPARE 'trx_16';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_17';
+INSERT INTO t SET a=17;
+XA END 'trx_17';
+XA PREPARE 'trx_17';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_18';
+INSERT INTO t SET a=18;
+XA END 'trx_18';
+XA PREPARE 'trx_18';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_19';
+INSERT INTO t SET a=19;
+XA END 'trx_19';
+XA PREPARE 'trx_19';
+connection default;
+KILL CONNECTION CONN_ID;
+connection default;
+XA ROLLBACK 'trx_0';
+XA ROLLBACK 'trx_1';
+XA ROLLBACK 'trx_2';
+XA ROLLBACK 'trx_3';
+XA ROLLBACK 'trx_4';
+XA COMMIT 'trx_5';
+XA COMMIT 'trx_6';
+XA COMMIT 'trx_7';
+XA COMMIT 'trx_8';
+XA COMMIT 'trx_9';
+# restart
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_0';
+INSERT INTO t SET a=0;
+XA END 'new_trx_0';
+XA PREPARE 'new_trx_0';
+disconnect conn_restart_0;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_1';
+INSERT INTO t SET a=1;
+XA END 'new_trx_1';
+XA PREPARE 'new_trx_1';
+disconnect conn_restart_1;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_2';
+INSERT INTO t SET a=2;
+XA END 'new_trx_2';
+XA PREPARE 'new_trx_2';
+disconnect conn_restart_2;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_3';
+INSERT INTO t SET a=3;
+XA END 'new_trx_3';
+XA PREPARE 'new_trx_3';
+disconnect conn_restart_3;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_4';
+INSERT INTO t SET a=4;
+XA END 'new_trx_4';
+XA PREPARE 'new_trx_4';
+disconnect conn_restart_4;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_5';
+INSERT INTO t SET a=5;
+XA END 'new_trx_5';
+XA PREPARE 'new_trx_5';
+disconnect conn_restart_5;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_6';
+INSERT INTO t SET a=6;
+XA END 'new_trx_6';
+XA PREPARE 'new_trx_6';
+disconnect conn_restart_6;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_7';
+INSERT INTO t SET a=7;
+XA END 'new_trx_7';
+XA PREPARE 'new_trx_7';
+disconnect conn_restart_7;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_8';
+INSERT INTO t SET a=8;
+XA END 'new_trx_8';
+XA PREPARE 'new_trx_8';
+disconnect conn_restart_8;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_9';
+INSERT INTO t SET a=9;
+XA END 'new_trx_9';
+XA PREPARE 'new_trx_9';
+disconnect conn_restart_9;
+connection default;
+connection default;
+XA COMMIT 'new_trx_0';
+XA COMMIT 'new_trx_1';
+XA COMMIT 'new_trx_2';
+XA COMMIT 'new_trx_3';
+XA COMMIT 'new_trx_4';
+XA COMMIT 'new_trx_5';
+XA COMMIT 'new_trx_6';
+XA COMMIT 'new_trx_7';
+XA COMMIT 'new_trx_8';
+XA COMMIT 'new_trx_9';
+XA START 'trx_10';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_10';
+XA START 'trx_11';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_11';
+XA START 'trx_12';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_12';
+XA START 'trx_13';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_13';
+XA START 'trx_14';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_14';
+XA START 'trx_15';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_15';
+XA START 'trx_16';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_16';
+XA START 'trx_17';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_17';
+XA START 'trx_18';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_18';
+XA START 'trx_19';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_19';
+SELECT * FROM t;
+a
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+disconnect conn2tmp;
+disconnect conn3tmp;
+disconnect conn2ro;
+disconnect conn3ro;
+disconnect conn2empty;
+disconnect conn3empty;
+connection default;
+XA ROLLBACK 'trx_20';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn19;
+connection default;
+XA ROLLBACK 'trx_19';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn18;
+connection default;
+XA ROLLBACK 'trx_18';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn17;
+connection default;
+XA ROLLBACK 'trx_17';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn16;
+connection default;
+XA ROLLBACK 'trx_16';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn15;
+connection default;
+XA ROLLBACK 'trx_15';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn14;
+connection default;
+XA ROLLBACK 'trx_14';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn13;
+connection default;
+XA ROLLBACK 'trx_13';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn12;
+connection default;
+XA ROLLBACK 'trx_12';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn11;
+connection default;
+XA ROLLBACK 'trx_11';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn10;
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx1tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx1tmp';
+XA PREPARE 'trx1tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx2tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx2tmp';
+XA PREPARE 'trx2tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@sql_log_bin = OFF;
+CREATE TEMPORARY TABLE tmp1 (a int) ENGINE=innodb;
+XA START 'trx3tmp';
+INSERT INTO tmp1 SET a=1;
+XA END 'trx3tmp';
+XA PREPARE 'trx3tmp';
+connection default;
+XA COMMIT 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA ROLLBACK 'trx1tmp';
+ERROR XAE04: XAER_NOTA: Unknown XID
+XA START 'trx1tmp';
+ERROR XAE08: XAER_DUPID: The XID already exists
+connection default;
+*** 3 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1tmp;
+disconnect conn1tmp;
+connection default;
+XA COMMIT 'trx1tmp';
+KILL connection CONN_ID;
+XA COMMIT 'trx3tmp';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx1ro';
+XA PREPARE 'trx1ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx2ro';
+XA PREPARE 'trx2ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3ro';
+SELECT * from t ORDER BY a;
+a
+0
+1
+2
+3
+4
+5
+5
+6
+6
+7
+7
+8
+8
+9
+9
+10
+11
+12
+13
+14
+XA END 'trx3ro';
+XA PREPARE 'trx3ro';
+connection default;
+*** 4 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1ro;
+disconnect conn1ro;
+connection default;
+XA ROLLBACK 'trx1ro';
+KILL connection CONN_ID;
+XA ROLLBACK 'trx3ro';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1empty';
+XA END 'trx1empty';
+XA PREPARE 'trx1empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx2empty';
+XA END 'trx2empty';
+XA PREPARE 'trx2empty';
+connect conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx3empty';
+XA END 'trx3empty';
+XA PREPARE 'trx3empty';
+connection default;
+*** 5 prepared transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+1 LEN1 LEN2 TRX_N
+connection conn1empty;
+disconnect conn1empty;
+connection default;
+XA COMMIT 'trx1empty';
+KILL connection CONN_ID;
+XA COMMIT 'trx3empty';
+connect conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'trx1unprepared';
+INSERT INTO t set a=0;
+XA END 'trx1unprepared';
+INSERT INTO t set a=0;
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+XA START 'trx1unprepared';
+ERROR XAE07: XAER_RMFAIL: The command cannot be executed when global transaction is in the IDLE state
+disconnect conn1unprepared;
+connection default;
+XA COMMIT 'trx1unprepared';
+ERROR XAE04: XAER_NOTA: Unknown XID
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_0';
+INSERT INTO t SET a=0;
+XA END 'trx_0';
+XA PREPARE 'trx_0';
+disconnect conn0;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_1';
+INSERT INTO t SET a=1;
+XA END 'trx_1';
+XA PREPARE 'trx_1';
+disconnect conn1;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_2';
+INSERT INTO t SET a=2;
+XA END 'trx_2';
+XA PREPARE 'trx_2';
+disconnect conn2;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_3';
+INSERT INTO t SET a=3;
+XA END 'trx_3';
+XA PREPARE 'trx_3';
+disconnect conn3;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_4';
+INSERT INTO t SET a=4;
+XA END 'trx_4';
+XA PREPARE 'trx_4';
+disconnect conn4;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_5';
+INSERT INTO t SET a=5;
+XA END 'trx_5';
+XA PREPARE 'trx_5';
+disconnect conn5;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_6';
+INSERT INTO t SET a=6;
+XA END 'trx_6';
+XA PREPARE 'trx_6';
+disconnect conn6;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_7';
+INSERT INTO t SET a=7;
+XA END 'trx_7';
+XA PREPARE 'trx_7';
+disconnect conn7;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_8';
+INSERT INTO t SET a=8;
+XA END 'trx_8';
+XA PREPARE 'trx_8';
+disconnect conn8;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_9';
+INSERT INTO t SET a=9;
+XA END 'trx_9';
+XA PREPARE 'trx_9';
+disconnect conn9;
+connection default;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_10';
+INSERT INTO t SET a=10;
+XA END 'trx_10';
+XA PREPARE 'trx_10';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_11';
+INSERT INTO t SET a=11;
+XA END 'trx_11';
+XA PREPARE 'trx_11';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_12';
+INSERT INTO t SET a=12;
+XA END 'trx_12';
+XA PREPARE 'trx_12';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_13';
+INSERT INTO t SET a=13;
+XA END 'trx_13';
+XA PREPARE 'trx_13';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_14';
+INSERT INTO t SET a=14;
+XA END 'trx_14';
+XA PREPARE 'trx_14';
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_15';
+INSERT INTO t SET a=15;
+XA END 'trx_15';
+XA PREPARE 'trx_15';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_16';
+INSERT INTO t SET a=16;
+XA END 'trx_16';
+XA PREPARE 'trx_16';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_17';
+INSERT INTO t SET a=17;
+XA END 'trx_17';
+XA PREPARE 'trx_17';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+XA START 'trx_18';
+INSERT INTO t SET a=18;
+XA END 'trx_18';
+XA PREPARE 'trx_18';
+connection default;
+KILL CONNECTION CONN_ID;
+connect conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@binlog_format = STATEMENT;
+SET @@binlog_format = ROW;
+XA START 'trx_19';
+INSERT INTO t SET a=19;
+XA END 'trx_19';
+XA PREPARE 'trx_19';
+connection default;
+KILL CONNECTION CONN_ID;
+connection default;
+XA ROLLBACK 'trx_0';
+XA ROLLBACK 'trx_1';
+XA ROLLBACK 'trx_2';
+XA ROLLBACK 'trx_3';
+XA ROLLBACK 'trx_4';
+XA COMMIT 'trx_5';
+XA COMMIT 'trx_6';
+XA COMMIT 'trx_7';
+XA COMMIT 'trx_8';
+XA COMMIT 'trx_9';
+# Kill and restart
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_0';
+INSERT INTO t SET a=0;
+XA END 'new_trx_0';
+XA PREPARE 'new_trx_0';
+disconnect conn_restart_0;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_1';
+INSERT INTO t SET a=1;
+XA END 'new_trx_1';
+XA PREPARE 'new_trx_1';
+disconnect conn_restart_1;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_2';
+INSERT INTO t SET a=2;
+XA END 'new_trx_2';
+XA PREPARE 'new_trx_2';
+disconnect conn_restart_2;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_3';
+INSERT INTO t SET a=3;
+XA END 'new_trx_3';
+XA PREPARE 'new_trx_3';
+disconnect conn_restart_3;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_4';
+INSERT INTO t SET a=4;
+XA END 'new_trx_4';
+XA PREPARE 'new_trx_4';
+disconnect conn_restart_4;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_5';
+INSERT INTO t SET a=5;
+XA END 'new_trx_5';
+XA PREPARE 'new_trx_5';
+disconnect conn_restart_5;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_6';
+INSERT INTO t SET a=6;
+XA END 'new_trx_6';
+XA PREPARE 'new_trx_6';
+disconnect conn_restart_6;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_7';
+INSERT INTO t SET a=7;
+XA END 'new_trx_7';
+XA PREPARE 'new_trx_7';
+disconnect conn_restart_7;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_8';
+INSERT INTO t SET a=8;
+XA END 'new_trx_8';
+XA PREPARE 'new_trx_8';
+disconnect conn_restart_8;
+connection default;
+connect conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'new_trx_9';
+INSERT INTO t SET a=9;
+XA END 'new_trx_9';
+XA PREPARE 'new_trx_9';
+disconnect conn_restart_9;
+connection default;
+connection default;
+XA COMMIT 'new_trx_0';
+XA COMMIT 'new_trx_1';
+XA COMMIT 'new_trx_2';
+XA COMMIT 'new_trx_3';
+XA COMMIT 'new_trx_4';
+XA COMMIT 'new_trx_5';
+XA COMMIT 'new_trx_6';
+XA COMMIT 'new_trx_7';
+XA COMMIT 'new_trx_8';
+XA COMMIT 'new_trx_9';
+XA START 'trx_10';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_10';
+XA START 'trx_11';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_11';
+XA START 'trx_12';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_12';
+XA START 'trx_13';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_13';
+XA START 'trx_14';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA COMMIT 'trx_14';
+XA START 'trx_15';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_15';
+XA START 'trx_16';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_16';
+XA START 'trx_17';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_17';
+XA START 'trx_18';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_18';
+XA START 'trx_19';
+ERROR XAE08: XAER_DUPID: The XID already exists
+XA ROLLBACK 'trx_19';
+SELECT * FROM t;
+a
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+disconnect conn2tmp;
+disconnect conn3tmp;
+disconnect conn2ro;
+disconnect conn3ro;
+disconnect conn2empty;
+disconnect conn3empty;
+connection default;
+XA ROLLBACK 'trx_20';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn19;
+connection default;
+XA ROLLBACK 'trx_19';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn18;
+connection default;
+XA ROLLBACK 'trx_18';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn17;
+connection default;
+XA ROLLBACK 'trx_17';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn16;
+connection default;
+XA ROLLBACK 'trx_16';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn15;
+connection default;
+XA ROLLBACK 'trx_15';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn14;
+connection default;
+XA ROLLBACK 'trx_14';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn13;
+connection default;
+XA ROLLBACK 'trx_13';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn12;
+connection default;
+XA ROLLBACK 'trx_12';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn11;
+connection default;
+XA ROLLBACK 'trx_11';
+ERROR XAE04: XAER_NOTA: Unknown XID
+disconnect conn10;
+connection default;
+XA START 'one_phase_trx_0';
+INSERT INTO t SET a=0;
+XA END 'one_phase_trx_0';
+XA COMMIT 'one_phase_trx_0' ONE PHASE;
+XA START 'one_phase_trx_1';
+INSERT INTO t SET a=1;
+XA END 'one_phase_trx_1';
+XA COMMIT 'one_phase_trx_1' ONE PHASE;
+XA START 'one_phase_trx_2';
+INSERT INTO t SET a=2;
+XA END 'one_phase_trx_2';
+XA COMMIT 'one_phase_trx_2' ONE PHASE;
+XA START 'one_phase_trx_3';
+INSERT INTO t SET a=3;
+XA END 'one_phase_trx_3';
+XA COMMIT 'one_phase_trx_3' ONE PHASE;
+XA START 'one_phase_trx_4';
+INSERT INTO t SET a=4;
+XA END 'one_phase_trx_4';
+XA COMMIT 'one_phase_trx_4' ONE PHASE;
+SELECT SUM(a) FROM t;
+SUM(a)
+290
+DROP TABLE t;
+DROP VIEW v_processlist;
+include/show_binlog_events.inc
+Log_name Pos Event_type Server_id End_log_pos Info
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_processlist` AS SELECT * FROM performance_schema.threads where type = 'FOREGROUND'
+master-bin.000001 # Gtid # # BEGIN GTID #-#-#
+master-bin.000001 # Query # # use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Found 10 prepared XA transactions' COLLATE 'latin1_swedish_ci'))
+master-bin.000001 # Query # # COMMIT
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE TABLE t (a INT) ENGINE=innodb
+master-bin.000001 # Gtid # # XA START X'7472785f30',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=0
+master-bin.000001 # Query # # XA END X'7472785f30',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f30',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f31',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=1
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f31',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f31',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f32',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=2
+master-bin.000001 # Query # # XA END X'7472785f32',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f32',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f33',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=3
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f33',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f33',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f34',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=4
+master-bin.000001 # Query # # XA END X'7472785f34',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f34',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f35',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=5
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f35',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f35',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f36',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=6
+master-bin.000001 # Query # # XA END X'7472785f36',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f36',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f37',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=7
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f37',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f37',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f38',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=8
+master-bin.000001 # Query # # XA END X'7472785f38',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f38',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f39',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=9
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f39',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f39',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3130',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=10
+master-bin.000001 # Query # # XA END X'7472785f3130',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3130',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3131',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=11
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3131',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3131',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3132',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=12
+master-bin.000001 # Query # # XA END X'7472785f3132',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3132',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3133',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=13
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3133',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3133',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3134',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=14
+master-bin.000001 # Query # # XA END X'7472785f3134',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3134',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3135',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=15
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3135',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3135',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3136',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=16
+master-bin.000001 # Query # # XA END X'7472785f3136',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3136',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3137',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=17
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3137',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3137',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3138',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO t SET a=18
+master-bin.000001 # Query # # XA END X'7472785f3138',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3138',X'',1
+master-bin.000001 # Gtid # # XA START X'7472785f3139',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO t SET a=19
+master-bin.000001 # Table_map # # table_id: # (test.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'7472785f3139',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'7472785f3139',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f30',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f31',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f32',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f33',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA ROLLBACK X'7472785f34',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f35',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f36',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f37',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f38',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'7472785f39',X'',1
+master-bin.000001 # Stop # #
+All transactions must have been completed, so the following should list nothing:
+XA RECOVER;
+formatID gtrid_length bqual_length data
+XA RECOVER;
+formatID gtrid_length bqual_length data
diff --git a/mysql-test/suite/binlog/t/binlog_xa_checkpoint.test b/mysql-test/suite/binlog/t/binlog_xa_checkpoint.test
new file mode 100644
index 00000000000..b208d02cf2a
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_xa_checkpoint.test
@@ -0,0 +1,57 @@
+--source include/have_innodb.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+--source include/have_binlog_format_row.inc
+
+RESET MASTER;
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+
+# Test that
+# 1. XA PREPARE is binlogged before the XA has been prepared in the engine.
+# 2. While XA PREPARE is already binlogged in an old binlog file that has been rotated,
+#    the binlog checkpoint is not generated for the latest log until
+#    XA PREPARE returns, e.g. OK to the client.
+
+
+# con1 will hang before doing commit checkpoint, blocking RESET MASTER.
+connect(con1,localhost,root,,);
+SET DEBUG_SYNC= "at_unlog_xa_prepare SIGNAL con1_ready WAIT_FOR con1_go";
+XA START '1';
+INSERT INTO t1 SET a=1;
+XA END '1';
+--send XA PREPARE '1';
+
+
+connection default;
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+FLUSH LOGS;
+FLUSH LOGS;
+FLUSH LOGS;
+
+--source include/show_binary_logs.inc
+--let $binlog_file= master-bin.000004
+--let $binlog_start= 4
+--source include/show_binlog_events.inc
+
+SET DEBUG_SYNC= "now SIGNAL con1_go";
+
+connection con1;
+reap;
+--echo *** master-bin.000004 checkpoint must show up now ***
+--source include/wait_for_binlog_checkpoint.inc
+
+# Todo: think about the error code returned, move to an appropriate test, or remove
+# connection default;
+#--error 1399
+# DROP TABLE t1;
+
+connection con1;
+XA ROLLBACK '1';
+SET debug_sync = 'reset';
+
+# Clean up.
+connection default;
+
+DROP TABLE t1;
+SET debug_sync = 'reset';
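
For context, the minimal two-phase XA flow that the sync point above intercepts
looks like this (illustrative only, not part of the patch):

    XA START '1';
    INSERT INTO t1 SET a=1;
    XA END '1';
    XA PREPARE '1';   -- binlogged as an XA_prepare event; the xid survives disconnect and restart
    XA COMMIT '1';    -- or XA ROLLBACK '1'
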
diff --git a/mysql-test/suite/binlog/t/binlog_xa_prepared.inc b/mysql-test/suite/binlog/t/binlog_xa_prepared.inc
new file mode 100644
index 00000000000..b6306791cf4
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_xa_prepared.inc
@@ -0,0 +1,102 @@
+--source include/have_innodb.inc
+--source include/have_perfschema.inc
+#
+# The test verifies binlogging of XA transactions and the state of prepared XA
+# transactions as far as the binlog is concerned.
+#
+# Prepared XA transactions can be disconnected from the client,
+# discovered from another connection and committed or rolled back
+# later. They also survive a server restart. The test runs two
+# loops, each consisting of generating prepared XAs, manipulating
+# them, and a server restart followed by completion of the XAs
+# that survived.
+#
+# A prepared XA does not become available to an external connection
+# until the connection that either disconnects voluntarily or is killed
+# has completed the necessary part of its cleanup.
+# Selecting from P_S.threads provides a way to detect that.
+#
+# Total number of connections, each performing one insert into the table
+--let $conn_number=20
+# Number of rollbacks and commits on either side of the server restart
+--let $rollback_number=5
+--let $commit_number=5
+# Number of transactions that are terminated before server restarts
+--let $term_number=`SELECT $rollback_number + $commit_number`
+# Instead of disconnecting, kill some connections once their
+# transactions have been prepared.
+--let $killed_number=5
+# Make some connections get disconnected by the shutdown rather than disconnect voluntarily
+--let $server_disconn_number=5
+--let $prepared_at_server_restart = `SELECT $conn_number - $term_number`
+# Number of "warmup" connections after the server restart; they all commit
+--let $post_restart_conn_number=10
+
+# Counter to be used in the GTID consistency check.
+# It is incremented for each non-XA transaction commit.
+# Variable local to this file, controlling the one-phase commit loop
+--let $one_phase_number = 5
+
+--connection default
+
+# Remove any preceding binlogs and clear the initialization-time
+# GTID executed info. In the following, all transactions are counted
+# for verification at the end of the test.
+if (`SELECT @@global.log_bin`)
+{
+ RESET MASTER;
+}
+
+# Disconnected and follower threads need synchronization
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+
+--eval call mtr.add_suppression("Found $prepared_at_server_restart prepared XA transactions")
+
+CREATE TABLE t (a INT) ENGINE=innodb;
+
+# The counter is incremented at the end of the post-restart phase to
+# reflect the number of loops done, for the correctness computation.
+--let $restart_number = 0
+--let $how_to_restart=restart_mysqld.inc
+--source suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
+
+--let $how_to_restart=kill_and_restart_mysqld.inc
+--source suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
+
+--connection default
+
+# A few XAs that commit in one phase, not subject to the server restart
+# nor to reconnect.
+# This part of the test is related to the mysqlbinlog recovery examination below.
+--let $k = 0
+while ($k < $one_phase_number)
+{
+ --eval XA START 'one_phase_trx_$k'
+ --eval INSERT INTO t SET a=$k
+ --eval XA END 'one_phase_trx_$k'
+ --eval XA COMMIT 'one_phase_trx_$k' ONE PHASE
+
+ --inc $k
+}
+
+SELECT SUM(a) FROM t;
+DROP TABLE t;
+DROP VIEW v_processlist;
+
+let $outfile= $MYSQLTEST_VARDIR/tmp/mysqlbinlog.sql;
+if (`SELECT @@global.log_bin`)
+{
+ # Recording proper samples of binlogged prepared XA:s
+ --source include/show_binlog_events.inc
+ --exec $MYSQL_BINLOG -R --to-last-log master-bin.000001 > $outfile
+}
+
+--echo All transactions must have been completed, so the following should list nothing:
+XA RECOVER;
+
+if (`SELECT @@global.log_bin`)
+{
+ --exec $MYSQL test < $outfile
+ --remove_file $outfile
+ XA RECOVER;
+}
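
For context on the completion step performed after each restart above: a
prepared transaction that survived is discovered with XA RECOVER and then
completed by its xid. A minimal sketch (the xid 'trx_0' and the shown output
row are illustrative):

    XA RECOVER;
    -- formatID  gtrid_length  bqual_length  data
    -- 1         5             0             trx_0
    XA COMMIT 'trx_0';    -- or XA ROLLBACK 'trx_0'
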
diff --git a/mysql-test/suite/binlog/t/binlog_xa_prepared_disconnect.test b/mysql-test/suite/binlog/t/binlog_xa_prepared_disconnect.test
new file mode 100644
index 00000000000..2a3184030cf
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_xa_prepared_disconnect.test
@@ -0,0 +1,11 @@
+###############################################################################
+# Bug#12161 Xa recovery and client disconnection
+# Testing new server options and binary logging prepared XA transaction.
+###############################################################################
+
+#
+# MIXED mode is chosen because formats are varied inside the sourced tests.
+#
+--source include/have_binlog_format_mixed.inc
+
+--source suite/binlog/t/binlog_xa_prepared.inc
diff --git a/mysql-test/suite/rpl/include/rpl_xa_mixed_engines.inc b/mysql-test/suite/rpl/include/rpl_xa_mixed_engines.inc
new file mode 100644
index 00000000000..0707a04090a
--- /dev/null
+++ b/mysql-test/suite/rpl/include/rpl_xa_mixed_engines.inc
@@ -0,0 +1,183 @@
+#
+# The test file is invoked from rpl.rpl_xa_survive_disconnect_mixed_engines
+#
+# The test file is organized into three sections: setup, run and cleanup.
+# The main logic resides in the run section, which generates
+# three types of XA transaction: two kinds of mixed ones and one on a
+# non-transactional table.
+#
+# param $command         one of 'setup', 'run' or 'cleanup'
+# param $xa_terminate    how to conclude: 'XA COMMIT' or 'XA ROLLBACK'
+# param $one_phase       'one_phase' option, can be used with XA COMMIT above
+# param $xa_prepare_opt  '1' or empty, to test with and without XA PREPARE
+# param $xid             arbitrary name for the XA transaction, defaults to 'xa_trx'
+# Note: the quotes '' are merely for emphasis, not part of the value.
+#
+
+if ($command == setup)
+{
+ # Test randomizes the following variable's value:
+ SET @@session.binlog_direct_non_transactional_updates := if(floor(rand()*10)%2,'ON','OFF');
+ CREATE TABLE t (a INT) ENGINE=innodb;
+ CREATE TABLE tm (a INT) ENGINE=myisam;
+}
+if (!$xid)
+{
+ --let $xid=xa_trx
+}
+if ($command == run)
+{
+ ## Non-temporary table cases
+ # Non transactional table goes first
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO tm VALUES (1);
+ INSERT INTO t VALUES (1);
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ # Transactional table goes first
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO t VALUES (2);
+ INSERT INTO tm VALUES (2);
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ # The pure non-transactional table
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO tm VALUES (3);
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ ## Temporary tables
+ # create outside xa use at the tail
+ CREATE TEMPORARY TABLE tmp_i LIKE t;
+ CREATE TEMPORARY TABLE tmp_m LIKE tm;
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO t VALUES (4);
+ INSERT INTO tm VALUES (4);
+ INSERT INTO tmp_i VALUES (4);
+ INSERT INTO tmp_m VALUES (4);
+ INSERT INTO t SELECT * FROM tmp_i;
+ INSERT INTO tm SELECT * FROM tmp_m;
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ # temporary tables at the head
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO tmp_i VALUES (5);
+ INSERT INTO tmp_m VALUES (5);
+ INSERT INTO t SELECT * FROM tmp_i;
+ INSERT INTO tm SELECT * FROM tmp_m;
+ INSERT INTO t VALUES (5);
+ INSERT INTO tm VALUES (5);
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ # create inside xa use at the tail
+ DROP TEMPORARY TABLE tmp_i;
+ DROP TEMPORARY TABLE tmp_m;
+
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO t VALUES (6);
+ INSERT INTO tm VALUES (6);
+ CREATE TEMPORARY TABLE tmp_i LIKE t;
+ CREATE TEMPORARY TABLE tmp_m LIKE tm;
+ INSERT INTO tmp_i VALUES (6);
+ INSERT INTO tmp_m VALUES (6);
+ INSERT INTO t SELECT * FROM tmp_i;
+ INSERT INTO tm SELECT * FROM tmp_m;
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ # use at the head
+ DROP TEMPORARY TABLE tmp_i;
+ DROP TEMPORARY TABLE tmp_m;
+ --eval XA START '$xid'
+ --disable_warnings
+ CREATE TEMPORARY TABLE tmp_i LIKE t;
+ CREATE TEMPORARY TABLE tmp_m LIKE tm;
+ INSERT INTO tmp_i VALUES (7);
+ INSERT INTO tmp_m VALUES (7);
+ INSERT INTO t SELECT * FROM tmp_i;
+ INSERT INTO tm SELECT * FROM tmp_m;
+ INSERT INTO t VALUES (7);
+ INSERT INTO tm VALUES (7);
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ # use at the tail and drop
+ --eval XA START '$xid'
+ --disable_warnings
+ INSERT INTO t VALUES (8);
+ INSERT INTO tm VALUES (8);
+ INSERT INTO tmp_i VALUES (8);
+ INSERT INTO tmp_m VALUES (8);
+ INSERT INTO t SELECT * FROM tmp_i;
+ INSERT INTO tm SELECT * FROM tmp_m;
+ DROP TEMPORARY TABLE tmp_i;
+ DROP TEMPORARY TABLE tmp_m;
+ --enable_warnings
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+
+ ## Ineffective transactional table operation case
+
+ --eval XA START '$xid'
+ UPDATE t SET a = 99 where a = -1;
+ --eval XA END '$xid'
+ if ($xa_prepare_opt)
+ {
+ --eval XA PREPARE '$xid'
+ }
+ --eval $xa_terminate '$xid' $one_phase
+}
+
+if ($command == cleanup)
+{
+ DROP TABLE t, tm;
+}
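
The caller, rpl.rpl_xa_survive_disconnect_mixed_engines, is expected to source
this file once per combination of the parameters described in the header. An
illustrative invocation (the parameter values here are an assumption based on
the header comments, not taken from the caller test):

    --let $command= run
    --let $xa_terminate= XA COMMIT
    --let $one_phase= ONE PHASE
    --let $xa_prepare_opt= 1
    --source suite/rpl/include/rpl_xa_mixed_engines.inc
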
diff --git a/mysql-test/suite/rpl/r/rpl_parallel_optimistic_xa.result b/mysql-test/suite/rpl/r/rpl_parallel_optimistic_xa.result
new file mode 100644
index 00000000000..4136f1885db
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_parallel_optimistic_xa.result
@@ -0,0 +1,51 @@
+include/master-slave.inc
+[connection master]
+call mtr.add_suppression("Deadlock found when trying to get lock; try restarting transaction");
+call mtr.add_suppression("WSREP: handlerton rollback failed");
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+connection master;
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+connection slave;
+include/stop_slave.inc
+SET @old_parallel_threads = @@GLOBAL.slave_parallel_threads;
+SET @@global.slave_parallel_threads = 7;
+SET @old_parallel_mode = @@GLOBAL.slave_parallel_mode;
+SET @@global.slave_parallel_mode ='optimistic';
+SET @old_gtid_cleanup_batch_size = @@GLOBAL.gtid_cleanup_batch_size;
+SET @@global.gtid_cleanup_batch_size = 1000000;
+CHANGE MASTER TO master_use_gtid=slave_pos;
+connection master;
+CREATE TABLE t0 (a int, b INT) ENGINE=InnoDB;
+CREATE TABLE t1 (a int PRIMARY KEY, b INT) ENGINE=InnoDB;
+INSERT INTO t1 VALUES (1, 0);
+include/save_master_gtid.inc
+connection slave;
+include/start_slave.inc
+include/sync_with_master_gtid.inc
+include/stop_slave.inc
+connection master;
+include/save_master_gtid.inc
+connection slave;
+include/start_slave.inc
+include/sync_with_master_gtid.inc
+include/diff_tables.inc [master:t0, slave:t0]
+include/diff_tables.inc [master:t1, slave:t1]
+connection slave;
+include/stop_slave.inc
+set global log_warnings=default;
+SET GLOBAL slave_parallel_mode=@old_parallel_mode;
+SET GLOBAL slave_parallel_threads=@old_parallel_threads;
+include/start_slave.inc
+connection master;
+DROP VIEW v_processlist;
+DROP TABLE t0, t1;
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+SELECT COUNT(*) <= 5*@@GLOBAL.gtid_cleanup_batch_size
+FROM mysql.gtid_slave_pos;
+COUNT(*) <= 5*@@GLOBAL.gtid_cleanup_batch_size
+1
+SET GLOBAL gtid_cleanup_batch_size= @old_gtid_cleanup_batch_size;
+connection master;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_parallel_optimistic_xa_lsu_off.result b/mysql-test/suite/rpl/r/rpl_parallel_optimistic_xa_lsu_off.result
new file mode 100644
index 00000000000..4136f1885db
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_parallel_optimistic_xa_lsu_off.result
@@ -0,0 +1,51 @@
+include/master-slave.inc
+[connection master]
+call mtr.add_suppression("Deadlock found when trying to get lock; try restarting transaction");
+call mtr.add_suppression("WSREP: handlerton rollback failed");
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+connection master;
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+connection slave;
+include/stop_slave.inc
+SET @old_parallel_threads = @@GLOBAL.slave_parallel_threads;
+SET @@global.slave_parallel_threads = 7;
+SET @old_parallel_mode = @@GLOBAL.slave_parallel_mode;
+SET @@global.slave_parallel_mode ='optimistic';
+SET @old_gtid_cleanup_batch_size = @@GLOBAL.gtid_cleanup_batch_size;
+SET @@global.gtid_cleanup_batch_size = 1000000;
+CHANGE MASTER TO master_use_gtid=slave_pos;
+connection master;
+CREATE TABLE t0 (a int, b INT) ENGINE=InnoDB;
+CREATE TABLE t1 (a int PRIMARY KEY, b INT) ENGINE=InnoDB;
+INSERT INTO t1 VALUES (1, 0);
+include/save_master_gtid.inc
+connection slave;
+include/start_slave.inc
+include/sync_with_master_gtid.inc
+include/stop_slave.inc
+connection master;
+include/save_master_gtid.inc
+connection slave;
+include/start_slave.inc
+include/sync_with_master_gtid.inc
+include/diff_tables.inc [master:t0, slave:t0]
+include/diff_tables.inc [master:t1, slave:t1]
+connection slave;
+include/stop_slave.inc
+set global log_warnings=default;
+SET GLOBAL slave_parallel_mode=@old_parallel_mode;
+SET GLOBAL slave_parallel_threads=@old_parallel_threads;
+include/start_slave.inc
+connection master;
+DROP VIEW v_processlist;
+DROP TABLE t0, t1;
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+SELECT COUNT(*) <= 5*@@GLOBAL.gtid_cleanup_batch_size
+FROM mysql.gtid_slave_pos;
+COUNT(*) <= 5*@@GLOBAL.gtid_cleanup_batch_size
+1
+SET GLOBAL gtid_cleanup_batch_size= @old_gtid_cleanup_batch_size;
+connection master;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_parallel_xa_same_xid.result b/mysql-test/suite/rpl/r/rpl_parallel_xa_same_xid.result
new file mode 100644
index 00000000000..03fe5157623
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_parallel_xa_same_xid.result
@@ -0,0 +1,23 @@
+include/master-slave.inc
+[connection master]
+connection slave;
+call mtr.add_suppression("WSREP: handlerton rollback failed");
+include/stop_slave.inc
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+SET @old_parallel_threads = @@GLOBAL.slave_parallel_threads;
+SET @old_parallel_mode = @@GLOBAL.slave_parallel_mode;
+SET @@global.slave_parallel_mode ='optimistic';
+include/start_slave.inc
+connection master;
+CREATE TABLE t1 (a INT, b INT) ENGINE=InnoDB;
+CREATE TABLE t2 (a INT AUTO_INCREMENT PRIMARY KEY, b INT) ENGINE=InnoDB;
+include/sync_slave_sql_with_master.inc
+include/diff_tables.inc [master:t1, slave:t1]
+connection slave;
+include/stop_slave.inc
+SET GLOBAL slave_parallel_threads=@old_parallel_threads;
+SET GLOBAL slave_parallel_mode=@old_parallel_mode;
+include/start_slave.inc
+connection master;
+DROP TABLE t1, t2;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_temporary_errors.result b/mysql-test/suite/rpl/r/rpl_temporary_errors.result
index 8654fe218dc..c126871e460 100644
--- a/mysql-test/suite/rpl/r/rpl_temporary_errors.result
+++ b/mysql-test/suite/rpl/r/rpl_temporary_errors.result
@@ -3,7 +3,7 @@ include/master-slave.inc
call mtr.add_suppression("Deadlock found");
call mtr.add_suppression("Can't find record in 't.'");
connection master;
-CREATE TABLE t1 (a INT PRIMARY KEY, b INT);
+CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=innodb;
INSERT INTO t1 VALUES (1,1), (2,2), (3,3), (4,4);
connection slave;
SHOW STATUS LIKE 'Slave_retried_transactions';
@@ -11,34 +11,67 @@ Variable_name Value
Slave_retried_transactions 0
set @@global.slave_exec_mode= 'IDEMPOTENT';
UPDATE t1 SET a = 5, b = 47 WHERE a = 1;
-SELECT * FROM t1;
+SELECT * FROM t1 ORDER BY a;
a b
-5 47
2 2
3 3
4 4
+5 47
connection master;
UPDATE t1 SET a = 5, b = 5 WHERE a = 1;
-SELECT * FROM t1;
+SELECT * FROM t1 ORDER BY a;
a b
-5 5
2 2
3 3
4 4
+5 5
connection slave;
set @@global.slave_exec_mode= default;
SHOW STATUS LIKE 'Slave_retried_transactions';
Variable_name Value
Slave_retried_transactions 0
-SELECT * FROM t1;
+SELECT * FROM t1 ORDER BY a;
a b
-5 47
2 2
3 3
4 4
+5 47
include/check_slave_is_running.inc
connection slave;
call mtr.add_suppression("Slave SQL.*Could not execute Update_rows event on table test.t1");
+call mtr.add_suppression("Slave SQL for channel '': worker thread retried transaction");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped");
+connection slave;
+set @save_innodb_lock_wait_timeout=@@global.innodb_lock_wait_timeout;
+set @save_slave_transaction_retries=@@global.slave_transaction_retries;
+set @@global.innodb_lock_wait_timeout=1;
+set @@global.slave_transaction_retries=2;
+include/restart_slave.inc
+connection slave1;
+BEGIN;
+INSERT INTO t1 SET a = 6, b = 7;
+connection master;
+INSERT INTO t1 SET a = 99, b = 99;
+XA START 'xa1';
+INSERT INTO t1 SET a = 6, b = 6;
+XA END 'xa1';
+XA PREPARE 'xa1';
+connection slave;
+include/wait_for_slave_sql_error.inc [errno=1213,1205]
+set @@global.innodb_lock_wait_timeout=1;
+set @@global.slave_transaction_retries=100;
+include/restart_slave.inc
+Warnings:
+Note 1255 Slave already has been stopped
+connection slave1;
+ROLLBACK;
+connection master;
+XA COMMIT 'xa1';
+include/sync_slave_sql_with_master.inc
+connection slave;
+include/assert.inc [XA transaction record must be in the table]
+set @@global.innodb_lock_wait_timeout=@save_innodb_lock_wait_timeout;
+set @@global.slave_transaction_retries= @save_slave_transaction_retries;
connection master;
DROP TABLE t1;
connection slave;
diff --git a/mysql-test/suite/rpl/r/rpl_xa.result b/mysql-test/suite/rpl/r/rpl_xa.result
new file mode 100644
index 00000000000..3420f2348e2
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa.result
@@ -0,0 +1,48 @@
+include/master-slave.inc
+[connection master]
+connection master;
+create table t1 (a int, b int) engine=InnoDB;
+insert into t1 values(0, 0);
+xa start 't';
+insert into t1 values(1, 2);
+xa end 't';
+xa prepare 't';
+xa commit 't';
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+connection master;
+xa start 't';
+insert into t1 values(3, 4);
+xa end 't';
+xa prepare 't';
+xa rollback 't';
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+connection master;
+SET pseudo_slave_mode=1;
+create table t2 (a int) engine=InnoDB;
+xa start 't';
+insert into t1 values (5, 6);
+xa end 't';
+xa prepare 't';
+xa start 's';
+insert into t2 values (0);
+xa end 's';
+xa prepare 's';
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+xa recover;
+formatID gtrid_length bqual_length data
+1 1 0 t
+1 1 0 s
+connection master;
+xa commit 't';
+xa commit 's';
+SET pseudo_slave_mode=0;
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+include/diff_tables.inc [master:t2, slave:t2]
+connection master;
+drop table t1, t2;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_gap_lock.result b/mysql-test/suite/rpl/r/rpl_xa_gap_lock.result
new file mode 100644
index 00000000000..cb760abe2d2
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_gap_lock.result
@@ -0,0 +1,44 @@
+include/master-slave.inc
+[connection master]
+connection slave;
+SET @saved_innodb_limit_optimistic_insert_debug = @@GLOBAL.innodb_limit_optimistic_insert_debug;
+SET @@GLOBAL.innodb_limit_optimistic_insert_debug = 2;
+connection master;
+CREATE TABLE t1 (
+c1 INT NOT NULL,
+KEY(c1)
+) ENGINE=InnoDB;
+CREATE TABLE t2 (
+c1 INT NOT NULL,
+FOREIGN KEY(c1) REFERENCES t1(c1)
+) ENGINE=InnoDB;
+INSERT INTO t1 VALUES (1), (3), (4);
+connection master1;
+XA START 'XA1';
+INSERT INTO t1 values(2);
+XA END 'XA1';
+connection master;
+XA START 'XA2';
+INSERT INTO t2 values(3);
+XA END 'XA2';
+XA PREPARE 'XA2';
+connection master1;
+XA PREPARE 'XA1';
+XA COMMIT 'XA1';
+connection master;
+XA COMMIT 'XA2';
+include/sync_slave_sql_with_master.inc
+include/stop_slave.inc
+DROP TABLE t2, t1;
+RESET SLAVE;
+RESET MASTER;
+connection master;
+Restore binary log from the master into the slave
+include/diff_tables.inc [master:test.t1, slave:test.t1]
+include/diff_tables.inc [master:test.t2, slave:test.t2]
+DROP TABLE t2, t1;
+connection slave;
+CHANGE MASTER TO MASTER_LOG_FILE='LOG_FILE', MASTER_LOG_POS=LOG_POS;
+SET @@GLOBAL.innodb_limit_optimistic_insert_debug = @saved_innodb_limit_optimistic_insert_debug;
+include/start_slave.inc
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_gtid_pos_auto_engine.result b/mysql-test/suite/rpl/r/rpl_xa_gtid_pos_auto_engine.result
new file mode 100644
index 00000000000..a7ed0f97ea2
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_gtid_pos_auto_engine.result
@@ -0,0 +1,64 @@
+include/master-slave.inc
+[connection master]
+connection slave;
+call mtr.add_suppression("The automatically created table.*name may not be entirely in lowercase");
+include/stop_slave.inc
+CHANGE MASTER TO master_use_gtid=slave_pos;
+SET @@global.gtid_pos_auto_engines="innodb";
+include/start_slave.inc
+connection master;
+create table t1 (a int, b int) engine=InnoDB;
+insert into t1 values(0, 0);
+xa start 't';
+insert into t1 values(1, 2);
+xa end 't';
+xa prepare 't';
+xa commit 't';
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+connection master;
+xa start 't';
+insert into t1 values(3, 4);
+xa end 't';
+xa prepare 't';
+xa rollback 't';
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+connection master;
+SET pseudo_slave_mode=1;
+create table t2 (a int) engine=InnoDB;
+xa start 't';
+insert into t1 values (5, 6);
+xa end 't';
+xa prepare 't';
+xa start 's';
+insert into t2 values (0);
+xa end 's';
+xa prepare 's';
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+SELECT @@global.gtid_slave_pos = CONCAT(domain_id,"-",server_id,"-",seq_no) FROM mysql.gtid_slave_pos WHERE seq_no = (SELECT DISTINCT max(seq_no) FROM mysql.gtid_slave_pos);
+@@global.gtid_slave_pos = CONCAT(domain_id,"-",server_id,"-",seq_no)
+1
+xa recover;
+formatID gtrid_length bqual_length data
+1 1 0 t
+1 1 0 s
+connection master;
+xa commit 't';
+xa commit 's';
+SET pseudo_slave_mode=0;
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+include/diff_tables.inc [master:t2, slave:t2]
+connection master;
+drop table t1, t2;
+connection slave;
+include/stop_slave.inc
+SET @@global.gtid_pos_auto_engines="";
+SET @@session.sql_log_bin=0;
+DROP TABLE mysql.gtid_slave_pos_InnoDB;
+SET @@session.sql_log_bin=1;
+include/start_slave.inc
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect.result b/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect.result
new file mode 100644
index 00000000000..d2ed1e2c235
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect.result
@@ -0,0 +1,319 @@
+include/master-slave.inc
+[connection master]
+connection master;
+call mtr.add_suppression("Found 2 prepared XA transactions");
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+CREATE TABLE d1.t (a INT) ENGINE=innodb;
+CREATE TABLE d2.t (a INT) ENGINE=innodb;
+connect master_conn1, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@session.binlog_format= statement;
+XA START '1-stmt';
+INSERT INTO d1.t VALUES (1);
+XA END '1-stmt';
+XA PREPARE '1-stmt';
+disconnect master_conn1;
+connection master;
+connect master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@session.binlog_format= row;
+XA START '1-row';
+INSERT INTO d2.t VALUES (1);
+XA END '1-row';
+XA PREPARE '1-row';
+disconnect master_conn2;
+connection master;
+XA START '2';
+INSERT INTO d1.t VALUES (2);
+XA END '2';
+XA PREPARE '2';
+XA COMMIT '2';
+XA COMMIT '1-row';
+XA COMMIT '1-stmt';
+include/show_binlog_events.inc
+Log_name Pos Event_type Server_id End_log_pos Info
+master-bin.000001 # Gtid # # BEGIN GTID #-#-#
+master-bin.000001 # Query # # use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Found 2 prepared XA transactions' COLLATE 'latin1_swedish_ci'))
+master-bin.000001 # Query # # COMMIT
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_processlist` AS SELECT * FROM performance_schema.threads where type = 'FOREGROUND'
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # CREATE DATABASE d1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # CREATE DATABASE d2
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE TABLE d1.t (a INT) ENGINE=innodb
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE TABLE d2.t (a INT) ENGINE=innodb
+master-bin.000001 # Gtid # # XA START X'312d73746d74',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO d1.t VALUES (1)
+master-bin.000001 # Query # # XA END X'312d73746d74',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'312d73746d74',X'',1
+master-bin.000001 # Gtid # # XA START X'312d726f77',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO d2.t VALUES (1)
+master-bin.000001 # Table_map # # table_id: # (d2.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'312d726f77',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'312d726f77',X'',1
+master-bin.000001 # Gtid # # XA START X'32',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO d1.t VALUES (2)
+master-bin.000001 # Query # # XA END X'32',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'32',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'32',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'312d726f77',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'312d73746d74',X'',1
+include/sync_slave_sql_with_master.inc
+include/stop_slave.inc
+connection master;
+connect master2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master2;
+SET @@session.binlog_format= statement;
+XA START '3-stmt';
+INSERT INTO d1.t VALUES (3);
+XA END '3-stmt';
+XA PREPARE '3-stmt';
+disconnect master2;
+connect master2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master2;
+SET @@session.binlog_format= row;
+XA START '3-row';
+INSERT INTO d2.t VALUES (4);
+XA END '3-row';
+XA PREPARE '3-row';
+disconnect master2;
+connection master;
+connect master2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master2;
+XA START '4';
+SELECT * FROM d1.t;
+a
+1
+2
+XA END '4';
+XA PREPARE '4';
+disconnect master2;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_10';
+INSERT INTO d1.t VALUES (10);
+INSERT INTO d2.t VALUES (10);
+XA END 'bulk_trx_10';
+XA PREPARE 'bulk_trx_10';
+disconnect master_bulk_conn10;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_9';
+INSERT INTO d1.t VALUES (9);
+INSERT INTO d2.t VALUES (9);
+XA END 'bulk_trx_9';
+XA PREPARE 'bulk_trx_9';
+disconnect master_bulk_conn9;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_8';
+INSERT INTO d1.t VALUES (8);
+INSERT INTO d2.t VALUES (8);
+XA END 'bulk_trx_8';
+XA PREPARE 'bulk_trx_8';
+disconnect master_bulk_conn8;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_7';
+INSERT INTO d1.t VALUES (7);
+INSERT INTO d2.t VALUES (7);
+XA END 'bulk_trx_7';
+XA PREPARE 'bulk_trx_7';
+disconnect master_bulk_conn7;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_6';
+INSERT INTO d1.t VALUES (6);
+INSERT INTO d2.t VALUES (6);
+XA END 'bulk_trx_6';
+XA PREPARE 'bulk_trx_6';
+disconnect master_bulk_conn6;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_5';
+INSERT INTO d1.t VALUES (5);
+INSERT INTO d2.t VALUES (5);
+XA END 'bulk_trx_5';
+XA PREPARE 'bulk_trx_5';
+disconnect master_bulk_conn5;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_4';
+INSERT INTO d1.t VALUES (4);
+INSERT INTO d2.t VALUES (4);
+XA END 'bulk_trx_4';
+XA PREPARE 'bulk_trx_4';
+disconnect master_bulk_conn4;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_3';
+INSERT INTO d1.t VALUES (3);
+INSERT INTO d2.t VALUES (3);
+XA END 'bulk_trx_3';
+XA PREPARE 'bulk_trx_3';
+disconnect master_bulk_conn3;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_2';
+INSERT INTO d1.t VALUES (2);
+INSERT INTO d2.t VALUES (2);
+XA END 'bulk_trx_2';
+XA PREPARE 'bulk_trx_2';
+disconnect master_bulk_conn2;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_1';
+INSERT INTO d1.t VALUES (1);
+INSERT INTO d2.t VALUES (1);
+XA END 'bulk_trx_1';
+XA PREPARE 'bulk_trx_1';
+disconnect master_bulk_conn1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+include/sync_slave_sql_with_master.inc
+include/stop_slave.inc
+connection master;
+XA COMMIT 'bulk_trx_10';
+XA ROLLBACK 'bulk_trx_9';
+XA COMMIT 'bulk_trx_8';
+XA ROLLBACK 'bulk_trx_7';
+XA COMMIT 'bulk_trx_6';
+XA ROLLBACK 'bulk_trx_5';
+XA COMMIT 'bulk_trx_4';
+XA ROLLBACK 'bulk_trx_3';
+XA COMMIT 'bulk_trx_2';
+XA ROLLBACK 'bulk_trx_1';
+include/rpl_restart_server.inc [server_number=1]
+connection slave;
+include/start_slave.inc
+connection master;
+*** '3-stmt','3-row' xa-transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 6 0 3-stmt
+1 5 0 3-row
+XA COMMIT '3-stmt';
+XA ROLLBACK '3-row';
+include/sync_slave_sql_with_master.inc
+connection master;
+connect master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+INSERT INTO d1.t VALUES (64);
+XA END '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+XA PREPARE '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+disconnect master_conn2;
+connection master;
+connect master_conn3, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+INSERT INTO d1.t VALUES (0);
+XA END X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+XA PREPARE X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+disconnect master_conn3;
+connection master;
+disconnect master_conn4;
+connection master;
+XA COMMIT '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+XA COMMIT X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+XA COMMIT 'RANDOM XID'
+include/sync_slave_sql_with_master.inc
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn10;
+XA START 'one_phase_10';
+INSERT INTO d1.t VALUES (10);
+INSERT INTO d2.t VALUES (10);
+XA END 'one_phase_10';
+XA COMMIT 'one_phase_10' ONE PHASE;
+disconnect master_bulk_conn10;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn9;
+XA START 'one_phase_9';
+INSERT INTO d1.t VALUES (9);
+INSERT INTO d2.t VALUES (9);
+XA END 'one_phase_9';
+XA COMMIT 'one_phase_9' ONE PHASE;
+disconnect master_bulk_conn9;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn8;
+XA START 'one_phase_8';
+INSERT INTO d1.t VALUES (8);
+INSERT INTO d2.t VALUES (8);
+XA END 'one_phase_8';
+XA COMMIT 'one_phase_8' ONE PHASE;
+disconnect master_bulk_conn8;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn7;
+XA START 'one_phase_7';
+INSERT INTO d1.t VALUES (7);
+INSERT INTO d2.t VALUES (7);
+XA END 'one_phase_7';
+XA COMMIT 'one_phase_7' ONE PHASE;
+disconnect master_bulk_conn7;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn6;
+XA START 'one_phase_6';
+INSERT INTO d1.t VALUES (6);
+INSERT INTO d2.t VALUES (6);
+XA END 'one_phase_6';
+XA COMMIT 'one_phase_6' ONE PHASE;
+disconnect master_bulk_conn6;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn5;
+XA START 'one_phase_5';
+INSERT INTO d1.t VALUES (5);
+INSERT INTO d2.t VALUES (5);
+XA END 'one_phase_5';
+XA COMMIT 'one_phase_5' ONE PHASE;
+disconnect master_bulk_conn5;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn4;
+XA START 'one_phase_4';
+INSERT INTO d1.t VALUES (4);
+INSERT INTO d2.t VALUES (4);
+XA END 'one_phase_4';
+XA COMMIT 'one_phase_4' ONE PHASE;
+disconnect master_bulk_conn4;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn3;
+XA START 'one_phase_3';
+INSERT INTO d1.t VALUES (3);
+INSERT INTO d2.t VALUES (3);
+XA END 'one_phase_3';
+XA COMMIT 'one_phase_3' ONE PHASE;
+disconnect master_bulk_conn3;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn2;
+XA START 'one_phase_2';
+INSERT INTO d1.t VALUES (2);
+INSERT INTO d2.t VALUES (2);
+XA END 'one_phase_2';
+XA COMMIT 'one_phase_2' ONE PHASE;
+disconnect master_bulk_conn2;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn1;
+XA START 'one_phase_1';
+INSERT INTO d1.t VALUES (1);
+INSERT INTO d2.t VALUES (1);
+XA END 'one_phase_1';
+XA COMMIT 'one_phase_1' ONE PHASE;
+disconnect master_bulk_conn1;
+connection master;
+include/sync_slave_sql_with_master.inc
+include/diff_tables.inc [master:d1.t, slave:d1.t]
+include/diff_tables.inc [master:d2.t, slave:d2.t]
+connection master;
+DELETE FROM d1.t;
+DELETE FROM d2.t;
+DROP TABLE d1.t, d2.t;
+DROP DATABASE d1;
+DROP DATABASE d2;
+DROP VIEW v_processlist;
+include/sync_slave_sql_with_master.inc
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect_lsu_off.result b/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect_lsu_off.result
new file mode 100644
index 00000000000..d2ed1e2c235
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect_lsu_off.result
@@ -0,0 +1,319 @@
+include/master-slave.inc
+[connection master]
+connection master;
+call mtr.add_suppression("Found 2 prepared XA transactions");
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+CREATE TABLE d1.t (a INT) ENGINE=innodb;
+CREATE TABLE d2.t (a INT) ENGINE=innodb;
+connect master_conn1, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@session.binlog_format= statement;
+XA START '1-stmt';
+INSERT INTO d1.t VALUES (1);
+XA END '1-stmt';
+XA PREPARE '1-stmt';
+disconnect master_conn1;
+connection master;
+connect master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+SET @@session.binlog_format= row;
+XA START '1-row';
+INSERT INTO d2.t VALUES (1);
+XA END '1-row';
+XA PREPARE '1-row';
+disconnect master_conn2;
+connection master;
+XA START '2';
+INSERT INTO d1.t VALUES (2);
+XA END '2';
+XA PREPARE '2';
+XA COMMIT '2';
+XA COMMIT '1-row';
+XA COMMIT '1-stmt';
+include/show_binlog_events.inc
+Log_name Pos Event_type Server_id End_log_pos Info
+master-bin.000001 # Gtid # # BEGIN GTID #-#-#
+master-bin.000001 # Query # # use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Found 2 prepared XA transactions' COLLATE 'latin1_swedish_ci'))
+master-bin.000001 # Query # # COMMIT
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v_processlist` AS SELECT * FROM performance_schema.threads where type = 'FOREGROUND'
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # CREATE DATABASE d1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # CREATE DATABASE d2
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE TABLE d1.t (a INT) ENGINE=innodb
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # use `test`; CREATE TABLE d2.t (a INT) ENGINE=innodb
+master-bin.000001 # Gtid # # XA START X'312d73746d74',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO d1.t VALUES (1)
+master-bin.000001 # Query # # XA END X'312d73746d74',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'312d73746d74',X'',1
+master-bin.000001 # Gtid # # XA START X'312d726f77',X'',1 GTID #-#-#
+master-bin.000001 # Annotate_rows # # INSERT INTO d2.t VALUES (1)
+master-bin.000001 # Table_map # # table_id: # (d2.t)
+master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
+master-bin.000001 # Query # # XA END X'312d726f77',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'312d726f77',X'',1
+master-bin.000001 # Gtid # # XA START X'32',X'',1 GTID #-#-#
+master-bin.000001 # Query # # use `test`; INSERT INTO d1.t VALUES (2)
+master-bin.000001 # Query # # XA END X'32',X'',1
+master-bin.000001 # XA_prepare # # XA PREPARE X'32',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'32',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'312d726f77',X'',1
+master-bin.000001 # Gtid # # GTID #-#-#
+master-bin.000001 # Query # # XA COMMIT X'312d73746d74',X'',1
+include/sync_slave_sql_with_master.inc
+include/stop_slave.inc
+connection master;
+connect master2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master2;
+SET @@session.binlog_format= statement;
+XA START '3-stmt';
+INSERT INTO d1.t VALUES (3);
+XA END '3-stmt';
+XA PREPARE '3-stmt';
+disconnect master2;
+connect master2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master2;
+SET @@session.binlog_format= row;
+XA START '3-row';
+INSERT INTO d2.t VALUES (4);
+XA END '3-row';
+XA PREPARE '3-row';
+disconnect master2;
+connection master;
+connect master2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master2;
+XA START '4';
+SELECT * FROM d1.t;
+a
+1
+2
+XA END '4';
+XA PREPARE '4';
+disconnect master2;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_10';
+INSERT INTO d1.t VALUES (10);
+INSERT INTO d2.t VALUES (10);
+XA END 'bulk_trx_10';
+XA PREPARE 'bulk_trx_10';
+disconnect master_bulk_conn10;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_9';
+INSERT INTO d1.t VALUES (9);
+INSERT INTO d2.t VALUES (9);
+XA END 'bulk_trx_9';
+XA PREPARE 'bulk_trx_9';
+disconnect master_bulk_conn9;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_8';
+INSERT INTO d1.t VALUES (8);
+INSERT INTO d2.t VALUES (8);
+XA END 'bulk_trx_8';
+XA PREPARE 'bulk_trx_8';
+disconnect master_bulk_conn8;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_7';
+INSERT INTO d1.t VALUES (7);
+INSERT INTO d2.t VALUES (7);
+XA END 'bulk_trx_7';
+XA PREPARE 'bulk_trx_7';
+disconnect master_bulk_conn7;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_6';
+INSERT INTO d1.t VALUES (6);
+INSERT INTO d2.t VALUES (6);
+XA END 'bulk_trx_6';
+XA PREPARE 'bulk_trx_6';
+disconnect master_bulk_conn6;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_5';
+INSERT INTO d1.t VALUES (5);
+INSERT INTO d2.t VALUES (5);
+XA END 'bulk_trx_5';
+XA PREPARE 'bulk_trx_5';
+disconnect master_bulk_conn5;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_4';
+INSERT INTO d1.t VALUES (4);
+INSERT INTO d2.t VALUES (4);
+XA END 'bulk_trx_4';
+XA PREPARE 'bulk_trx_4';
+disconnect master_bulk_conn4;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_3';
+INSERT INTO d1.t VALUES (3);
+INSERT INTO d2.t VALUES (3);
+XA END 'bulk_trx_3';
+XA PREPARE 'bulk_trx_3';
+disconnect master_bulk_conn3;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_2';
+INSERT INTO d1.t VALUES (2);
+INSERT INTO d2.t VALUES (2);
+XA END 'bulk_trx_2';
+XA PREPARE 'bulk_trx_2';
+disconnect master_bulk_conn2;
+connection master;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START 'bulk_trx_1';
+INSERT INTO d1.t VALUES (1);
+INSERT INTO d2.t VALUES (1);
+XA END 'bulk_trx_1';
+XA PREPARE 'bulk_trx_1';
+disconnect master_bulk_conn1;
+connection master;
+connection slave;
+include/start_slave.inc
+connection master;
+include/sync_slave_sql_with_master.inc
+include/stop_slave.inc
+connection master;
+XA COMMIT 'bulk_trx_10';
+XA ROLLBACK 'bulk_trx_9';
+XA COMMIT 'bulk_trx_8';
+XA ROLLBACK 'bulk_trx_7';
+XA COMMIT 'bulk_trx_6';
+XA ROLLBACK 'bulk_trx_5';
+XA COMMIT 'bulk_trx_4';
+XA ROLLBACK 'bulk_trx_3';
+XA COMMIT 'bulk_trx_2';
+XA ROLLBACK 'bulk_trx_1';
+include/rpl_restart_server.inc [server_number=1]
+connection slave;
+include/start_slave.inc
+connection master;
+*** '3-stmt','3-row' xa-transactions must be in the list ***
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 6 0 3-stmt
+1 5 0 3-row
+XA COMMIT '3-stmt';
+XA ROLLBACK '3-row';
+include/sync_slave_sql_with_master.inc
+connection master;
+connect master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+INSERT INTO d1.t VALUES (64);
+XA END '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+XA PREPARE '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+disconnect master_conn2;
+connection master;
+connect master_conn3, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+XA START X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+INSERT INTO d1.t VALUES (0);
+XA END X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+XA PREPARE X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+disconnect master_conn3;
+connection master;
+disconnect master_conn4;
+connection master;
+XA COMMIT '0123456789012345678901234567890123456789012345678901234567890124','0123456789012345678901234567890123456789012345678901234567890124',4294967292;
+XA COMMIT X'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF',X'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000',0;
+XA COMMIT 'RANDOM XID'
+include/sync_slave_sql_with_master.inc
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn10;
+XA START 'one_phase_10';
+INSERT INTO d1.t VALUES (10);
+INSERT INTO d2.t VALUES (10);
+XA END 'one_phase_10';
+XA COMMIT 'one_phase_10' ONE PHASE;
+disconnect master_bulk_conn10;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn9;
+XA START 'one_phase_9';
+INSERT INTO d1.t VALUES (9);
+INSERT INTO d2.t VALUES (9);
+XA END 'one_phase_9';
+XA COMMIT 'one_phase_9' ONE PHASE;
+disconnect master_bulk_conn9;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn8;
+XA START 'one_phase_8';
+INSERT INTO d1.t VALUES (8);
+INSERT INTO d2.t VALUES (8);
+XA END 'one_phase_8';
+XA COMMIT 'one_phase_8' ONE PHASE;
+disconnect master_bulk_conn8;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn7;
+XA START 'one_phase_7';
+INSERT INTO d1.t VALUES (7);
+INSERT INTO d2.t VALUES (7);
+XA END 'one_phase_7';
+XA COMMIT 'one_phase_7' ONE PHASE;
+disconnect master_bulk_conn7;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn6;
+XA START 'one_phase_6';
+INSERT INTO d1.t VALUES (6);
+INSERT INTO d2.t VALUES (6);
+XA END 'one_phase_6';
+XA COMMIT 'one_phase_6' ONE PHASE;
+disconnect master_bulk_conn6;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn5;
+XA START 'one_phase_5';
+INSERT INTO d1.t VALUES (5);
+INSERT INTO d2.t VALUES (5);
+XA END 'one_phase_5';
+XA COMMIT 'one_phase_5' ONE PHASE;
+disconnect master_bulk_conn5;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn4;
+XA START 'one_phase_4';
+INSERT INTO d1.t VALUES (4);
+INSERT INTO d2.t VALUES (4);
+XA END 'one_phase_4';
+XA COMMIT 'one_phase_4' ONE PHASE;
+disconnect master_bulk_conn4;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn3;
+XA START 'one_phase_3';
+INSERT INTO d1.t VALUES (3);
+INSERT INTO d2.t VALUES (3);
+XA END 'one_phase_3';
+XA COMMIT 'one_phase_3' ONE PHASE;
+disconnect master_bulk_conn3;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn2;
+XA START 'one_phase_2';
+INSERT INTO d1.t VALUES (2);
+INSERT INTO d2.t VALUES (2);
+XA END 'one_phase_2';
+XA COMMIT 'one_phase_2' ONE PHASE;
+disconnect master_bulk_conn2;
+connect master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,;
+connection master_bulk_conn1;
+XA START 'one_phase_1';
+INSERT INTO d1.t VALUES (1);
+INSERT INTO d2.t VALUES (1);
+XA END 'one_phase_1';
+XA COMMIT 'one_phase_1' ONE PHASE;
+disconnect master_bulk_conn1;
+connection master;
+include/sync_slave_sql_with_master.inc
+include/diff_tables.inc [master:d1.t, slave:d1.t]
+include/diff_tables.inc [master:d2.t, slave:d2.t]
+connection master;
+DELETE FROM d1.t;
+DELETE FROM d2.t;
+DROP TABLE d1.t, d2.t;
+DROP DATABASE d1;
+DROP DATABASE d2;
+DROP VIEW v_processlist;
+include/sync_slave_sql_with_master.inc
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect_mixed_engines.result b/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect_mixed_engines.result
new file mode 100644
index 00000000000..09bfffc0da4
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_xa_survive_disconnect_mixed_engines.result
@@ -0,0 +1,373 @@
+include/master-slave.inc
+[connection master]
+connection master;
+CALL mtr.add_suppression("Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT");
+SET @@session.binlog_direct_non_transactional_updates := if(floor(rand()*10)%2,'ON','OFF');
+CREATE TABLE t (a INT) ENGINE=innodb;
+CREATE TABLE tm (a INT) ENGINE=myisam;
+=== COMMIT ===
+XA START 'xa_trx';
+INSERT INTO tm VALUES (1);
+INSERT INTO t VALUES (1);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+XA START 'xa_trx';
+INSERT INTO t VALUES (2);
+INSERT INTO tm VALUES (2);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+XA START 'xa_trx';
+INSERT INTO tm VALUES (3);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+XA START 'xa_trx';
+INSERT INTO t VALUES (4);
+INSERT INTO tm VALUES (4);
+INSERT INTO tmp_i VALUES (4);
+INSERT INTO tmp_m VALUES (4);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+XA START 'xa_trx';
+INSERT INTO tmp_i VALUES (5);
+INSERT INTO tmp_m VALUES (5);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (5);
+INSERT INTO tm VALUES (5);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+INSERT INTO t VALUES (6);
+INSERT INTO tm VALUES (6);
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (6);
+INSERT INTO tmp_m VALUES (6);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (7);
+INSERT INTO tmp_m VALUES (7);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (7);
+INSERT INTO tm VALUES (7);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+XA START 'xa_trx';
+INSERT INTO t VALUES (8);
+INSERT INTO tm VALUES (8);
+INSERT INTO tmp_i VALUES (8);
+INSERT INTO tmp_m VALUES (8);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+XA START 'xa_trx';
+UPDATE t SET a = 99 where a = -1;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+XA COMMIT 'xa_trx' ;
+include/sync_slave_sql_with_master.inc
+connection master;
+=== COMMIT ONE PHASE ===
+XA START 'xa_trx';
+INSERT INTO tm VALUES (1);
+INSERT INTO t VALUES (1);
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+XA START 'xa_trx';
+INSERT INTO t VALUES (2);
+INSERT INTO tm VALUES (2);
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+XA START 'xa_trx';
+INSERT INTO tm VALUES (3);
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+XA START 'xa_trx';
+INSERT INTO t VALUES (4);
+INSERT INTO tm VALUES (4);
+INSERT INTO tmp_i VALUES (4);
+INSERT INTO tmp_m VALUES (4);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+XA START 'xa_trx';
+INSERT INTO tmp_i VALUES (5);
+INSERT INTO tmp_m VALUES (5);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (5);
+INSERT INTO tm VALUES (5);
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+INSERT INTO t VALUES (6);
+INSERT INTO tm VALUES (6);
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (6);
+INSERT INTO tmp_m VALUES (6);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (7);
+INSERT INTO tmp_m VALUES (7);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (7);
+INSERT INTO tm VALUES (7);
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+XA START 'xa_trx';
+INSERT INTO t VALUES (8);
+INSERT INTO tm VALUES (8);
+INSERT INTO tmp_i VALUES (8);
+INSERT INTO tmp_m VALUES (8);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+XA START 'xa_trx';
+UPDATE t SET a = 99 where a = -1;
+XA END 'xa_trx';
+XA COMMIT 'xa_trx' ONE PHASE;
+include/sync_slave_sql_with_master.inc
+connection master;
+=== ROLLBACK with PREPARE ===
+XA START 'xa_trx';
+INSERT INTO tm VALUES (1);
+INSERT INTO t VALUES (1);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO t VALUES (2);
+INSERT INTO tm VALUES (2);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO tm VALUES (3);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+XA START 'xa_trx';
+INSERT INTO t VALUES (4);
+INSERT INTO tm VALUES (4);
+INSERT INTO tmp_i VALUES (4);
+INSERT INTO tmp_m VALUES (4);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO tmp_i VALUES (5);
+INSERT INTO tmp_m VALUES (5);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (5);
+INSERT INTO tm VALUES (5);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+INSERT INTO t VALUES (6);
+INSERT INTO tm VALUES (6);
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (6);
+INSERT INTO tmp_m VALUES (6);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (7);
+INSERT INTO tmp_m VALUES (7);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (7);
+INSERT INTO tm VALUES (7);
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO t VALUES (8);
+INSERT INTO tm VALUES (8);
+INSERT INTO tmp_i VALUES (8);
+INSERT INTO tmp_m VALUES (8);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+UPDATE t SET a = 99 where a = -1;
+XA END 'xa_trx';
+XA PREPARE 'xa_trx';
+xa rollback 'xa_trx' ;
+include/sync_slave_sql_with_master.inc
+connection master;
+=== ROLLBACK with no PREPARE ===
+XA START 'xa_trx';
+INSERT INTO tm VALUES (1);
+INSERT INTO t VALUES (1);
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO t VALUES (2);
+INSERT INTO tm VALUES (2);
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO tm VALUES (3);
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+XA START 'xa_trx';
+INSERT INTO t VALUES (4);
+INSERT INTO tm VALUES (4);
+INSERT INTO tmp_i VALUES (4);
+INSERT INTO tmp_m VALUES (4);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO tmp_i VALUES (5);
+INSERT INTO tmp_m VALUES (5);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (5);
+INSERT INTO tm VALUES (5);
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+INSERT INTO t VALUES (6);
+INSERT INTO tm VALUES (6);
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (6);
+INSERT INTO tmp_m VALUES (6);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA START 'xa_trx';
+CREATE TEMPORARY TABLE tmp_i LIKE t;
+CREATE TEMPORARY TABLE tmp_m LIKE tm;
+INSERT INTO tmp_i VALUES (7);
+INSERT INTO tmp_m VALUES (7);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+INSERT INTO t VALUES (7);
+INSERT INTO tm VALUES (7);
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+INSERT INTO t VALUES (8);
+INSERT INTO tm VALUES (8);
+INSERT INTO tmp_i VALUES (8);
+INSERT INTO tmp_m VALUES (8);
+INSERT INTO t SELECT * FROM tmp_i;
+INSERT INTO tm SELECT * FROM tmp_m;
+DROP TEMPORARY TABLE tmp_i;
+DROP TEMPORARY TABLE tmp_m;
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+Warnings:
+Warning 1196 Some non-transactional changed tables couldn't be rolled back
+XA START 'xa_trx';
+UPDATE t SET a = 99 where a = -1;
+XA END 'xa_trx';
+xa rollback 'xa_trx' ;
+include/sync_slave_sql_with_master.inc
+include/diff_tables.inc [master:tm, slave:tm]
+connection master;
+DROP TABLE t, tm;
+include/sync_slave_sql_with_master.inc
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa.test b/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa.test
new file mode 100644
index 00000000000..35c22d1e92e
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa.test
@@ -0,0 +1,235 @@
+# The tests verify concurrent execution of replicated XA transactions
+# (MDEV-742) in the parallel optimistic mode.
+
+--source include/have_innodb.inc
+--source include/have_perfschema.inc
+--source include/master-slave.inc
+
+# Tests' global declarations
+--let $trx = _trx_
+
+call mtr.add_suppression("Deadlock found when trying to get lock; try restarting transaction");
+call mtr.add_suppression("WSREP: handlerton rollback failed");
+#call mtr.add_suppression("Can't find record in 't1'");
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+
+--connection master
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+--save_master_pos
+
+# Prepare to restart slave into optimistic parallel mode
+--connection slave
+--sync_with_master
+--source include/stop_slave.inc
+SET @old_parallel_threads = @@GLOBAL.slave_parallel_threads;
+SET @@global.slave_parallel_threads = 7;
+SET @old_parallel_mode = @@GLOBAL.slave_parallel_mode;
+SET @@global.slave_parallel_mode ='optimistic';
+# Run the first part of the test with high batch size and see that
+# old rows remain in the table.
+SET @old_gtid_cleanup_batch_size = @@GLOBAL.gtid_cleanup_batch_size;
+SET @@global.gtid_cleanup_batch_size = 1000000;
+
+CHANGE MASTER TO master_use_gtid=slave_pos;
+
+# The load generator creates XA transactions that get interleaved in the
+# binlog when they come from different connections. In the following block all
+# XA transactions of the same connection update the same data, which challenges
+# the correctness of the slave's optimistic scheduler.
+# The slave must eventually apply such a load, and apply it correctly (checked).
+
+--connection master
+CREATE TABLE t0 (a int, b INT) ENGINE=InnoDB;
+CREATE TABLE t1 (a int PRIMARY KEY, b INT) ENGINE=InnoDB;
+INSERT INTO t1 VALUES (1, 0);
+
+
+# I. Log a sequence of XA transactions from a single connection.
+#
+# The slave applier's task is to successfully execute the Prepare and
+# Complete parts of this sequence of XA transactions.
+
+--let $trx_num = 300
+--let $i = $trx_num
+--let $conn = master
+--disable_query_log
+while($i > 0)
+{
+ # 'decision' to commit 0, or rollback 1
+ --let $decision = `SELECT $i % 2`
+ --eval XA START '$conn$trx$i'
+ --eval UPDATE t1 SET b = 1 - 2 * $decision WHERE a = 1
+ --eval XA END '$conn$trx$i'
+ --let $one_phase = `SELECT IF(floor(rand()*10)%2, "ONE PHASE", 0)`
+ if (!$one_phase)
+ {
+ --eval XA PREPARE '$conn$trx$i'
+ --let $one_phase =
+ }
+
+ --let $term = COMMIT
+ if ($decision)
+ {
+ --let $term = ROLLBACK
+ --let $one_phase =
+ }
+ --eval XA $term '$conn$trx$i' $one_phase
+
+ --dec $i
+}
+--enable_query_log
+--source include/save_master_gtid.inc
+
+--connection slave
+--source include/start_slave.inc
+--source include/sync_with_master_gtid.inc
+--source include/stop_slave.inc
+
+
+# II. Log XA transactions from multiple connections, interleaved at random:
+#
+# in an outer loop ($i), once per connection,
+#   run an inner loop ($k) that
+#     starts and prepares an XA transaction;
+#     then either terminates it and continues the inner loop,
+#     or disconnects to break out of the inner loop;
+#   the disconnected connection's XA transaction is completed by the 'master'
+#   connection.
+#
+# Effectively the binlog must collect a well-mixed set of prepared and
+# terminated XA groups for the slave to handle.
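+#
+# As a sketch (a hypothetical interleaving of two connections; the names are
+# only illustrative), the binlog may end up with groups such as:
+#   XA START 'conn2_trx_1'; ...; XA END 'conn2_trx_1'; XA PREPARE 'conn2_trx_1'
+#   XA START 'conn1_trx_1'; ...; XA END 'conn1_trx_1'; XA PREPARE 'conn1_trx_1'
+#   XA COMMIT 'conn2_trx_1'
+#   XA ROLLBACK 'conn1_trx_1'
+# i.e. prepare and completion parts of different XIDs alternate freely.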
+
+--connection master
+# Total # of connections
+--let $conn_num=53
+
+--let $i = $conn_num
+--disable_query_log
+while($i > 0)
+{
+ --connect (master_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+--dec $i
+}
+--enable_query_log
+
+--let $i = $conn_num
+while($i > 0)
+{
+ --let $conn_i = conn$i
+ # $i2 indexes the current connection's "own" row
+ --let $i2 = `SELECT $i + 2`
+--disable_query_log
+ --connection master_conn$i
+--enable_query_log
+ --disable_query_log
+ --let $i_conn_id = `SELECT connection_id()`
+
+ --let $decision = 0
+ # the row id of the last connection that committed its XA
+ --let $c_max = 1
+ --let $k = 0
+ while ($decision < 3)
+ {
+ --inc $k
+ --eval XA START '$conn_i$trx$k'
+ # UPDATE depends on previously *committed* transactions
+ --eval UPDATE t1 SET b = b + $k + 1 WHERE a = $c_max
+ if (`SELECT $k % 2 = 1`)
+ {
+ --eval REPLACE INTO t1 VALUES ($i2, $k)
+ }
+ if (`SELECT $k % 2 = 0`)
+ {
+ --eval DELETE FROM t1 WHERE a = $i2
+ }
+ CREATE TEMPORARY TABLE tmp LIKE t0;
+ --eval INSERT INTO tmp SET a=$i, b= $k
+ INSERT INTO t0 SELECT * FROM tmp;
+ DROP TEMPORARY TABLE tmp;
+ --eval XA END '$conn_i$trx$k'
+
+ --let $term = COMMIT
+ --let $decision = `SELECT (floor(rand()*10 % 10) + ($i+$k)) % 4`
+ if ($decision == 1)
+ {
+ --let $term = ROLLBACK
+ }
+ if ($decision < 2)
+ {
+ --eval XA PREPARE '$conn_i$trx$k'
+ --eval XA $term '$conn_i$trx$k'
+ # Iteration counter is taken care *now*
+ }
+ if ($decision == 2)
+ {
+ --eval XA COMMIT '$conn_i$trx$k' ONE PHASE
+ }
+ }
+
+ # $decision = 3
+ --eval XA PREPARE '$conn_i$trx$k'
+ # disconnect now
+ --disconnect master_conn$i
+ --connection master
+
+ --let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $i_conn_id
+ --source include/wait_condition.inc
+
+ --disable_query_log
+ --let $decision = `SELECT ($i+$k) % 2`
+ --let $term = COMMIT
+ if ($decision == 1)
+ {
+ --let $term = ROLLBACK
+ }
+ --eval XA $term '$conn_i$trx$k'
+ --let $c_max = $i2
+
+--dec $i
+}
+--enable_query_log
+--source include/save_master_gtid.inc
+
+--connection slave
+--source include/start_slave.inc
+--source include/sync_with_master_gtid.inc
+
+#
+# Overall consistency check
+#
+--let $diff_tables= master:t0, slave:t0
+--source include/diff_tables.inc
+--let $diff_tables= master:t1, slave:t1
+--source include/diff_tables.inc
+
+
+#
+# Clean up.
+#
+--connection slave
+--source include/stop_slave.inc
+set global log_warnings=default;
+SET GLOBAL slave_parallel_mode=@old_parallel_mode;
+SET GLOBAL slave_parallel_threads=@old_parallel_threads;
+--source include/start_slave.inc
+
+--connection master
+DROP VIEW v_processlist;
+DROP TABLE t0, t1;
+--source include/save_master_gtid.inc
+
+--connection slave
+--source include/sync_with_master_gtid.inc
+# Check that old rows are deleted from mysql.gtid_slave_pos.
+# Deletion is asynchronous, so use wait_condition.inc.
+# Also, there is a small amount of non-determinism in the deletion of old
+# rows, so it is not guaranteed that there can never be more than
+# @@gtid_cleanup_batch_size rows in the table; so allow a bit of slack
+# here.
+let $wait_condition=
+ SELECT COUNT(*) <= 5*@@GLOBAL.gtid_cleanup_batch_size
+ FROM mysql.gtid_slave_pos;
+--source include/wait_condition.inc
+eval $wait_condition;
+SET GLOBAL gtid_cleanup_batch_size= @old_gtid_cleanup_batch_size;
+
+--connection master
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa_lsu_off-slave.opt b/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa_lsu_off-slave.opt
new file mode 100644
index 00000000000..88cf77fd281
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa_lsu_off-slave.opt
@@ -0,0 +1 @@
+--log-slave-updates=OFF
diff --git a/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa_lsu_off.test b/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa_lsu_off.test
new file mode 100644
index 00000000000..f82b522eefe
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_parallel_optimistic_xa_lsu_off.test
@@ -0,0 +1,2 @@
+# --log-slave-updates OFF version of rpl_parallel_optimistic_xa
+--source rpl_parallel_optimistic_xa.test
diff --git a/mysql-test/suite/rpl/t/rpl_parallel_xa_same_xid.test b/mysql-test/suite/rpl/t/rpl_parallel_xa_same_xid.test
new file mode 100644
index 00000000000..888dd2f177b
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_parallel_xa_same_xid.test
@@ -0,0 +1,138 @@
+# The tests verify concurrent execution of replicated XA transactions
+# (MDEV-742) in the parallel optimistic mode.
+# They prove that the optimistic scheduler handles XA transactions that share
+# the same xid: despite running in parallel, there must be no conflicts
+# caused by multiple transactions using the same xid.
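+#
+# A sketch of the load generated below (values are illustrative only): the
+# same xid is reused by consecutive, non-overlapping transactions, e.g.
+#   XA START 'xid_1'; INSERT ...; XA END 'xid_1'; XA PREPARE 'xid_1'; XA COMMIT 'xid_1';
+#   XA START 'xid_1'; INSERT ...; XA END 'xid_1'; XA PREPARE 'xid_1'; XA ROLLBACK 'xid_1';
+# and on the slave these may be scheduled to different parallel workers.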
+
+--source include/have_binlog_format_mixed_or_row.inc
+--source include/have_innodb.inc
+--source include/have_perfschema.inc
+--source include/master-slave.inc
+
+--let $xid_num = 19
+--let $repeat = 17
+--let $workers = 7
+--connection slave
+call mtr.add_suppression("WSREP: handlerton rollback failed");
+
+--source include/stop_slave.inc
+# a measure against MDEV-20605
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+
+SET @old_parallel_threads = @@GLOBAL.slave_parallel_threads;
+--disable_query_log
+--eval SET @@global.slave_parallel_threads = $workers
+--enable_query_log
+SET @old_parallel_mode = @@GLOBAL.slave_parallel_mode;
+SET @@global.slave_parallel_mode ='optimistic';
+--source include/start_slave.inc
+
+--connection master
+CREATE TABLE t1 (a INT, b INT) ENGINE=InnoDB;
+
+--let $i = $xid_num
+--let $t = t1
+--disable_query_log
+while ($i)
+{
+--let $k = $repeat
+while ($k)
+{
+--eval XA START 'xid_$i'
+--eval INSERT INTO $t SET a=$i, b=$k
+--eval XA END 'xid_$i'
+--let $one_phase = `SELECT IF(floor(rand()*10)%2, "ONE PHASE", 0)`
+ if (!$one_phase)
+ {
+ --eval XA PREPARE 'xid_$i'
+ --eval XA COMMIT 'xid_$i'
+ }
+ if ($one_phase)
+ {
+ --eval XA COMMIT 'xid_$i' ONE PHASE
+ }
+
+ if (!$one_phase)
+ {
+ --eval XA START 'xid_$i'
+ --eval INSERT INTO $t SET a=$i, b=$k
+ --eval XA END 'xid_$i'
+ --eval XA PREPARE 'xid_$i'
+ --eval XA ROLLBACK 'xid_$i'
+ }
+
+--dec $k
+}
+
+--dec $i
+}
+--enable_query_log
+
+
+
+# A test like the one above, but with an execution environment that also
+# creates data conflicts. They will be resolved by the optimistic
+# scheduler as usual.
+
+CREATE TABLE t2 (a INT AUTO_INCREMENT PRIMARY KEY, b INT) ENGINE=InnoDB;
+
+--let $i = $xid_num
+--let $t = t2
+--disable_query_log
+while ($i)
+{
+--let $k = $repeat
+while ($k)
+{
+--eval XA START 'xid_$i'
+--eval INSERT INTO $t SET a=NULL, b=$k
+--eval UPDATE $t SET b=$k + 1 WHERE a=last_insert_id() % $workers
+--eval XA END 'xid_$i'
+--let $one_phase = `SELECT IF(floor(rand()*10)%2, "ONE PHASE", 0)`
+ if (!$one_phase)
+ {
+ --eval XA PREPARE 'xid_$i'
+ --eval XA COMMIT 'xid_$i'
+ }
+ if ($one_phase)
+ {
+ --eval XA COMMIT 'xid_$i' ONE PHASE
+ }
+
+--eval XA START 'xid_$i'
+--eval UPDATE $t SET b=$k + 1 WHERE a=last_insert_id() % $workers
+--eval DELETE FROM $t WHERE a=last_insert_id()
+--eval XA END 'xid_$i'
+--eval XA PREPARE 'xid_$i'
+--eval XA ROLLBACK 'xid_$i'
+
+--let $do_drop_create = `SELECT IF(floor(rand()*10)%100, 1, 0)`
+if ($do_drop_create)
+{
+ DROP TABLE t1;
+ CREATE TABLE t1 (a INT, b INT) ENGINE=InnoDB;
+}
+--dec $k
+}
+
+--dec $i
+}
+--enable_query_log
+
+--source include/sync_slave_sql_with_master.inc
+--let $diff_tables= master:t1, slave:t1
+--source include/diff_tables.inc
+
+#
+# Clean up.
+#
+--connection slave
+--source include/stop_slave.inc
+SET GLOBAL slave_parallel_threads=@old_parallel_threads;
+SET GLOBAL slave_parallel_mode=@old_parallel_mode;
+--source include/start_slave.inc
+
+--connection master
+DROP TABLE t1, t2;
+
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_temporary_errors.test b/mysql-test/suite/rpl/t/rpl_temporary_errors.test
index 6392fb90b9b..85e16afa270 100644
--- a/mysql-test/suite/rpl/t/rpl_temporary_errors.test
+++ b/mysql-test/suite/rpl/t/rpl_temporary_errors.test
@@ -6,7 +6,7 @@ call mtr.add_suppression("Deadlock found");
call mtr.add_suppression("Can't find record in 't.'");
connection master;
-CREATE TABLE t1 (a INT PRIMARY KEY, b INT);
+CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=innodb;
INSERT INTO t1 VALUES (1,1), (2,2), (3,3), (4,4);
sync_slave_with_master;
SHOW STATUS LIKE 'Slave_retried_transactions';
@@ -14,20 +14,94 @@ SHOW STATUS LIKE 'Slave_retried_transactions';
# the following UPDATE t1 to pass the mode is switched temprorarily
set @@global.slave_exec_mode= 'IDEMPOTENT';
UPDATE t1 SET a = 5, b = 47 WHERE a = 1;
-SELECT * FROM t1;
+SELECT * FROM t1 ORDER BY a;
connection master;
UPDATE t1 SET a = 5, b = 5 WHERE a = 1;
-SELECT * FROM t1;
+SELECT * FROM t1 ORDER BY a;
#SHOW BINLOG EVENTS;
sync_slave_with_master;
set @@global.slave_exec_mode= default;
SHOW STATUS LIKE 'Slave_retried_transactions';
-SELECT * FROM t1;
+SELECT * FROM t1 ORDER BY a;
source include/check_slave_is_running.inc;
connection slave;
call mtr.add_suppression("Slave SQL.*Could not execute Update_rows event on table test.t1");
+call mtr.add_suppression("Slave SQL for channel '': worker thread retried transaction");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped");
+#
+# Bug#24764800 REPLICATION FAILING ON SLAVE WITH XAER_RMFAIL ERROR
+#
+# Verify that a temporarily failing replicated XA transaction completes
+# upon slave applier restart, after the previous
+# @@global.slave_transaction_retries retries were attempted in vain.
+#
+connection slave;
+
+set @save_innodb_lock_wait_timeout=@@global.innodb_lock_wait_timeout;
+set @save_slave_transaction_retries=@@global.slave_transaction_retries;
+
+# Slave applier parameters for the failed retry
+set @@global.innodb_lock_wait_timeout=1;
+set @@global.slave_transaction_retries=2;
+--source include/restart_slave_sql.inc
+
+# Set up the temporary error: a record is blocked by a slave-local transaction
+connection slave1;
+BEGIN;
+INSERT INTO t1 SET a = 6, b = 7;
+
+connection master;
+INSERT INTO t1 SET a = 99, b = 99; # slave applier warm up trx
+XA START 'xa1';
+INSERT INTO t1 SET a = 6, b = 6; # this record eventually must be found on slave
+XA END 'xa1';
+XA PREPARE 'xa1';
+
+connection slave;
+# convert_error(ER_LOCK_WAIT_TIMEOUT)
+--let $err_timeout= 1205
+# convert_error(ER_LOCK_DEADLOCK)
+--let $err_deadlock= 1213
+--let $slave_sql_errno=$err_deadlock,$err_timeout
+--let $show_slave_sql_error=
+--source include/wait_for_slave_sql_error.inc
+
+# b. Slave applier parameters for successful retry after restart
+set @@global.innodb_lock_wait_timeout=1;
+set @@global.slave_transaction_retries=100;
+
+--source include/restart_slave_sql.inc
+
+--let $last_retries= query_get_value(SHOW GLOBAL STATUS LIKE 'Slave_retried_transactions', Value, 1)
+--let $status_type=GLOBAL
+--let $status_var=Slave_retried_transactions
+--let $status_var_value=`SELECT 1 + $last_retries`
+--let $status_var_comparsion= >
+--source include/wait_for_status_var.inc
+
+# Release the record after just one retry
+connection slave1;
+ROLLBACK;
+
+connection master;
+XA COMMIT 'xa1';
+
+--source include/sync_slave_sql_with_master.inc
+
+# Proof of correctness: the committed XA is on the slave
+connection slave;
+--let $assert_text=XA transaction record must be in the table
+--let $assert_cond=count(*)=1 FROM t1 WHERE a=6 AND b=6
+--source include/assert.inc
+
+# Bug#24764800 cleanup:
+set @@global.innodb_lock_wait_timeout=@save_innodb_lock_wait_timeout;
+set @@global.slave_transaction_retries= @save_slave_transaction_retries;
+#
+# Total cleanup:
+#
connection master;
DROP TABLE t1;
--sync_slave_with_master
diff --git a/mysql-test/suite/rpl/t/rpl_xa.inc b/mysql-test/suite/rpl/t/rpl_xa.inc
new file mode 100644
index 00000000000..f1ba4cf8557
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa.inc
@@ -0,0 +1,73 @@
+#
+# This "body" file checks general properties of XA transaction replication
+# as of MDEV-7974.
+# Parameters (all optional):
+#   --let $rpl_xa_check= SELECT ...     consistency check to run on the slave
+#   --let $rpl_xa_check_lhs= ...        left-hand side of the check
+#   --let $rpl_xa_check_rhs= ...        right-hand side of the check
+#   --let $rpl_xa_verbose= 1            when set, also SELECT each side separately
+#
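+# Example invocation (a sketch mirroring rpl_xa_gtid_pos_auto_engine.test):
+#   --let $rpl_xa_check_lhs= @@global.gtid_slave_pos
+#   --let $rpl_xa_check_rhs= CONCAT(domain_id,"-",server_id,"-",seq_no) FROM mysql.gtid_slave_pos WHERE ...
+#   --let $rpl_xa_check= SELECT $rpl_xa_check_lhs = $rpl_xa_check_rhs
+#   --source rpl_xa.inc
+#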
+connection master;
+create table t1 (a int, b int) engine=InnoDB;
+insert into t1 values(0, 0);
+xa start 't';
+insert into t1 values(1, 2);
+xa end 't';
+xa prepare 't';
+xa commit 't';
+
+sync_slave_with_master;
+let $diff_tables= master:t1, slave:t1;
+source include/diff_tables.inc;
+
+connection master;
+
+xa start 't';
+insert into t1 values(3, 4);
+xa end 't';
+xa prepare 't';
+xa rollback 't';
+
+sync_slave_with_master;
+let $diff_tables= master:t1, slave:t1;
+source include/diff_tables.inc;
+
+connection master;
+--disable_warnings
+SET pseudo_slave_mode=1;
+--enable_warnings
+create table t2 (a int) engine=InnoDB;
+xa start 't';
+insert into t1 values (5, 6);
+xa end 't';
+xa prepare 't';
+xa start 's';
+insert into t2 values (0);
+xa end 's';
+xa prepare 's';
+--source include/save_master_gtid.inc
+
+connection slave;
+source include/sync_with_master_gtid.inc;
+if ($rpl_xa_check)
+{
+ --eval $rpl_xa_check
+ if ($rpl_xa_verbose)
+ {
+ --eval SELECT $rpl_xa_check_lhs
+ --eval SELECT $rpl_xa_check_rhs
+ }
+}
+xa recover;
+
+connection master;
+xa commit 't';
+xa commit 's';
+--disable_warnings
+SET pseudo_slave_mode=0;
+--enable_warnings
+sync_slave_with_master;
+let $diff_tables= master:t1, slave:t1;
+source include/diff_tables.inc;
+let $diff_tables= master:t2, slave:t2;
+source include/diff_tables.inc;
+
+connection master;
+drop table t1, t2;
diff --git a/mysql-test/suite/rpl/t/rpl_xa.test b/mysql-test/suite/rpl/t/rpl_xa.test
new file mode 100644
index 00000000000..05a1abe59ae
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa.test
@@ -0,0 +1,5 @@
+source include/have_innodb.inc;
+source include/master-slave.inc;
+
+source rpl_xa.inc;
+source include/rpl_end.inc;
diff --git a/mysql-test/suite/rpl/t/rpl_xa_gap_lock-slave.opt b/mysql-test/suite/rpl/t/rpl_xa_gap_lock-slave.opt
new file mode 100644
index 00000000000..4602a43ce25
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_gap_lock-slave.opt
@@ -0,0 +1 @@
+--transaction-isolation=READ-COMMITTED
diff --git a/mysql-test/suite/rpl/t/rpl_xa_gap_lock.test b/mysql-test/suite/rpl/t/rpl_xa_gap_lock.test
new file mode 100644
index 00000000000..9c48891b889
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_gap_lock.test
@@ -0,0 +1,137 @@
+# ==== Purpose ====
+#
+# This test will generate two XA transactions on the master in a way that
+# they will block each other on the slave if the transaction isolation level
+# used by the slave applier is more restrictive than the READ COMMITTED one.
+#
+# Consider:
+# E=execute, P=prepare, C=commit;
+# 1=first transaction, 2=second transaction;
+#
+# Master does: E1, E2, P2, P1, C1, C2
+# Slave does: E2, P2, E1, P1, C1, C2
+#
+# The transactions are designed so that, if the applier transaction isolation
+# level is more restrictive than READ COMMITTED, E1 will be blocked on
+# the slave waiting for gap locks to be released.
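+#
+# A sketch of the concrete scenario below (for readability, not asserted by
+# the test): t1 holds (1), (3), (4); XA2's INSERT of 3 into t2 makes the
+# foreign key check take a shared lock on t1's index record 3 and, under
+# REPEATABLE READ, on the gap before it; on the slave XA2 is prepared first
+# and keeps those locks, so XA1's INSERT of 2 into t1 would have to wait.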
+#
+# Step 1
+#
+# The test will verify that the transactions don't block each other because
+# the applier thread automatically changed the isolation level.
+#
+# Step 2
+#
+# The test will verify that applying the master's binary log dump on the slave
+# doesn't block, because mysqlbinlog specifies the isolation level to be used.
+#
+# ==== Related Bugs and Worklogs ====
+#
+# BUG#25040331: INTERLEAVED XA TRANSACTIONS MAY DEADLOCK SLAVE APPLIER WITH
+# REPEATABLE READ
+#
+--source include/have_debug.inc
+--source include/have_innodb.inc
+# The test case only makes sense for RBR
+--source include/have_binlog_format_row.inc
+--source include/master-slave.inc
+
+--connection slave
+# To hit the issue, we need to split the data into two pages.
+# This global variable will help us.
+SET @saved_innodb_limit_optimistic_insert_debug = @@GLOBAL.innodb_limit_optimistic_insert_debug;
+SET @@GLOBAL.innodb_limit_optimistic_insert_debug = 2;
+
+#
+# Step 1 - Using async replication
+#
+
+# Let's generate the workload on the master
+--connection master
+CREATE TABLE t1 (
+ c1 INT NOT NULL,
+ KEY(c1)
+) ENGINE=InnoDB;
+
+CREATE TABLE t2 (
+ c1 INT NOT NULL,
+ FOREIGN KEY(c1) REFERENCES t1(c1)
+) ENGINE=InnoDB;
+
+INSERT INTO t1 VALUES (1), (3), (4);
+
+--connection master1
+XA START 'XA1';
+INSERT INTO t1 values(2);
+XA END 'XA1';
+
+# This transaction will reference the gap where XA1
+# was inserted, and will be prepared and committed
+# before XA1, so the slave will prepare it (but will
+# not commit it) before preparing XA1.
+--connection master
+XA START 'XA2';
+INSERT INTO t2 values(3);
+XA END 'XA2';
+
+# The XA2 prepare should be binary logged first
+XA PREPARE 'XA2';
+
+# The XA1 prepare should be binary logged
+# after XA2 prepare and before XA2 commit.
+--connection master1
+XA PREPARE 'XA1';
+
+# The commit order doesn't matter much for the issue being tested.
+XA COMMIT 'XA1';
+--connection master
+XA COMMIT 'XA2';
+
+# Everything is fine if the slave can sync with the master.
+--source include/sync_slave_sql_with_master.inc
+
+#
+# Step 2 - Using mysqlbinlog dump to restore the slave
+#
+--source include/stop_slave.inc
+DROP TABLE t2, t1;
+RESET SLAVE;
+RESET MASTER;
+
+--connection master
+--let $master_data_dir= `SELECT @@datadir`
+--let $master_log_file= query_get_value(SHOW MASTER STATUS, File, 1)
+--let $mysql_server= $MYSQL --defaults-group-suffix=.2
+--echo Restore binary log from the master into the slave
+--exec $MYSQL_BINLOG --force-if-open $master_data_dir/$master_log_file | $mysql_server
+
+--let $diff_tables= master:test.t1, slave:test.t1
+--source include/diff_tables.inc
+--let $diff_tables= master:test.t2, slave:test.t2
+--source include/diff_tables.inc
+
+#
+# Cleanup
+#
+--let $master_file= query_get_value(SHOW MASTER STATUS, File, 1)
+--let $master_pos= query_get_value(SHOW MASTER STATUS, Position, 1)
+DROP TABLE t2, t1;
+
+## When GTID_MODE=OFF, we need to skip already applied transactions
+--connection slave
+#--let $gtid_mode= `SELECT @@GTID_MODE`
+#if ($gtid_mode == OFF)
+#{
+# --disable_query_log
+# --disable_result_log
+# --eval CHANGE MASTER TO MASTER_LOG_FILE='$master_file', MASTER_LOG_POS=$master_pos
+# --enable_result_log
+# --enable_query_log
+#}
+--replace_result $master_file LOG_FILE $master_pos LOG_POS
+--eval CHANGE MASTER TO MASTER_LOG_FILE='$master_file', MASTER_LOG_POS=$master_pos
+
+SET @@GLOBAL.innodb_limit_optimistic_insert_debug = @saved_innodb_limit_optimistic_insert_debug;
+--source include/start_slave.inc
+
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_gtid_pos_auto_engine.test b/mysql-test/suite/rpl/t/rpl_xa_gtid_pos_auto_engine.test
new file mode 100644
index 00000000000..b83493762c3
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_gtid_pos_auto_engine.test
@@ -0,0 +1,29 @@
+--source include/have_innodb.inc
+--source include/master-slave.inc
+
+--connection slave
+call mtr.add_suppression("The automatically created table.*name may not be entirely in lowercase");
+
+--source include/stop_slave.inc
+CHANGE MASTER TO master_use_gtid=slave_pos;
+
+SET @@global.gtid_pos_auto_engines="innodb";
+--source include/start_slave.inc
+--let $rpl_xa_check_lhs= @@global.gtid_slave_pos
+--let $rpl_xa_check_rhs= CONCAT(domain_id,"-",server_id,"-",seq_no) FROM mysql.gtid_slave_pos WHERE seq_no = (SELECT DISTINCT max(seq_no) FROM mysql.gtid_slave_pos)
+--let $rpl_xa_check=SELECT $rpl_xa_check_lhs = $rpl_xa_check_rhs
+--source rpl_xa.inc
+
+--connection slave
+--source include/stop_slave.inc
+SET @@global.gtid_pos_auto_engines="";
+SET @@session.sql_log_bin=0;
+DROP TABLE mysql.gtid_slave_pos_InnoDB;
+if (`SHOW COUNT(*) WARNINGS`)
+{
+ show tables in mysql like 'gtid_slave_pos%';
+}
+SET @@session.sql_log_bin=1;
+--source include/start_slave.inc
+
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect.test b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect.test
new file mode 100644
index 00000000000..1c33435473e
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect.test
@@ -0,0 +1,294 @@
+# BUG #12161 XA recovery and client disconnection
+# The test verifies that
+# a. disconnection does not lose a prepared transaction,
+#    so it can be committed from another connection;
+# b. the prepared transaction is logged;
+# c. interleaved prepared transactions are correctly applied on the slave.
+
+#
+# Both replication formats are checked through explicitly
+# setting @@binlog_format in the test.
+#
+--source include/have_innodb.inc
+--source include/have_binlog_format_mixed.inc
+#
+# A prepared XA transaction does not become available to an external connection
+# until the connection that either disconnects voluntarily or is killed
+# has completed the necessary part of its cleanup.
+# Selecting from performance_schema.threads provides a way to detect that.
+#
+--source include/have_perfschema.inc
+--source include/master-slave.inc
+
+--connection master
+call mtr.add_suppression("Found 2 prepared XA transactions");
+CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
+
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+
+CREATE TABLE d1.t (a INT) ENGINE=innodb;
+CREATE TABLE d2.t (a INT) ENGINE=innodb;
+
+connect (master_conn1, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--let $conn_id=`SELECT connection_id()`
+SET @@session.binlog_format= statement;
+XA START '1-stmt';
+INSERT INTO d1.t VALUES (1);
+XA END '1-stmt';
+XA PREPARE '1-stmt';
+
+--disconnect master_conn1
+
+--connection master
+
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+--source include/wait_condition.inc
+
+connect (master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--let $conn_id=`SELECT connection_id()`
+SET @@session.binlog_format= row;
+XA START '1-row';
+INSERT INTO d2.t VALUES (1);
+XA END '1-row';
+XA PREPARE '1-row';
+
+--disconnect master_conn2
+
+--connection master
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+--source include/wait_condition.inc
+
+XA START '2';
+INSERT INTO d1.t VALUES (2);
+XA END '2';
+XA PREPARE '2';
+XA COMMIT '2';
+
+XA COMMIT '1-row';
+XA COMMIT '1-stmt';
+source include/show_binlog_events.inc;
+
+# The proof: the slave is in sync with the tables updated by the prepared transactions.
+--source include/sync_slave_sql_with_master.inc
+
+--source include/stop_slave.inc
+
+#
+# Recover with Master server restart
+#
+--connection master
+
+connect (master2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--connection master2
+SET @@session.binlog_format= statement;
+XA START '3-stmt';
+INSERT INTO d1.t VALUES (3);
+XA END '3-stmt';
+XA PREPARE '3-stmt';
+--disconnect master2
+
+connect (master2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--connection master2
+SET @@session.binlog_format= row;
+XA START '3-row';
+INSERT INTO d2.t VALUES (4);
+XA END '3-row';
+XA PREPARE '3-row';
+--disconnect master2
+
+--connection master
+
+#
+# Testing read-only
+#
+connect (master2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--connection master2
+XA START '4';
+SELECT * FROM d1.t;
+XA END '4';
+XA PREPARE '4';
+--disconnect master2
+
+#
+# Log a few disconnected XA transactions for replication.
+#
+--let $bulk_trx_num=10
+--let $i = $bulk_trx_num
+
+while($i > 0)
+{
+ --connect (master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+ --let $conn_id=`SELECT connection_id()`
+
+ --eval XA START 'bulk_trx_$i'
+ --eval INSERT INTO d1.t VALUES ($i)
+ --eval INSERT INTO d2.t VALUES ($i)
+ --eval XA END 'bulk_trx_$i'
+ --eval XA PREPARE 'bulk_trx_$i'
+
+ --disconnect master_bulk_conn$i
+
+ --connection master
+ --let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+ --source include/wait_condition.inc
+
+ --dec $i
+}
+
+#
+# Prove that the slave applier is capable of resuming the prepared XA
+# transactions upon its restart.
+#
+--connection slave
+--source include/start_slave.inc
+--connection master
+--source include/sync_slave_sql_with_master.inc
+--source include/stop_slave.inc
+
+--connection master
+--let $i = $bulk_trx_num
+while($i > 0)
+{
+ --let $command=COMMIT
+ if (`SELECT $i % 2`)
+ {
+ --let $command=ROLLBACK
+ }
+ --eval XA $command 'bulk_trx_$i'
+ --dec $i
+}
+
+--let $rpl_server_number= 1
+--source include/rpl_restart_server.inc
+
+--connection slave
+--source include/start_slave.inc
+
+--connection master
+--echo *** '3-stmt','3-row' xa-transactions must be in the list ***
+XA RECOVER;
+XA COMMIT '3-stmt';
+XA ROLLBACK '3-row';
+
+--source include/sync_slave_sql_with_master.inc
+
+#
+# Testing replication with edge-case XID values and in both binlog formats.
+#
+
+--connection master
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+--source include/wait_condition.inc
+
+# Maximum-size XID, including the maximum formatID value
+connect (master_conn2, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--let $conn_id=`SELECT connection_id()`
+
+--let $gtrid=0123456789012345678901234567890123456789012345678901234567890124
+--let $bqual=0123456789012345678901234567890123456789012345678901234567890124
+--eval XA START '$gtrid','$bqual',4294967292
+ INSERT INTO d1.t VALUES (64);
+--eval XA END '$gtrid','$bqual',4294967292
+--eval XA PREPARE '$gtrid','$bqual',4294967292
+
+--disconnect master_conn2
+
+--connection master
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+--source include/wait_condition.inc
+
+# Maximum-size XID with non-ASCII characters
+connect (master_conn3, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--let $conn_id=`SELECT connection_id()`
+
+--let $gtrid_hex=FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
+--let $bqual_hex=00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+--eval XA START X'$gtrid_hex',X'$bqual_hex',0
+ INSERT INTO d1.t VALUES (0);
+--eval XA END X'$gtrid_hex',X'$bqual_hex',0
+--eval XA PREPARE X'$gtrid_hex',X'$bqual_hex',0
+
+--disconnect master_conn3
+
+--connection master
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+--source include/wait_condition.inc
+
+# Random XID
+--disable_query_log
+
+connect (master_conn4, 127.0.0.1,root,,test,$MASTER_MYPORT,);
+--let $conn_id=`SELECT connection_id()`
+
+--let $gtridlen=`SELECT 2*(1 + round(rand()*100) % 31)`
+--let $bquallen=`SELECT 2*(1 + round(rand()*100) % 31)`
+--let $gtrid_rand=`SELECT substring(concat(MD5(rand()), MD5(rand())), 1, $gtridlen)`
+--let $bqual_rand=`SELECT substring(concat(MD5(rand()), MD5(rand())), 1, $bquallen)`
+--let $formt_rand=`SELECT floor((rand()*10000000000) % 4294967293)`
+--eval XA START X'$gtrid_rand',X'$bqual_rand',$formt_rand
+ INSERT INTO d1.t VALUES (0);
+--eval XA END X'$gtrid_rand',X'$bqual_rand',$formt_rand
+--eval XA PREPARE X'$gtrid_rand',X'$bqual_rand',$formt_rand
+
+--enable_query_log
+
+--disconnect master_conn4
+
+--connection master
+--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
+--source include/wait_condition.inc
+
+--eval XA COMMIT '$gtrid','$bqual',4294967292
+--eval XA COMMIT X'$gtrid_hex',X'$bqual_hex',0
+--disable_query_log
+--echo XA COMMIT 'RANDOM XID'
+--eval XA COMMIT X'$gtrid_rand',X'$bqual_rand',$formt_rand
+--enable_query_log
+
+--source include/sync_slave_sql_with_master.inc
+
+#
+# Testing ONE PHASE
+#
+--let $onephase_trx_num=10
+--let $i = $onephase_trx_num
+while($i > 0)
+{
+ --connect (master_bulk_conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
+
+ --connection master_bulk_conn$i
+ --eval XA START 'one_phase_$i'
+ --eval INSERT INTO d1.t VALUES ($i)
+ --eval INSERT INTO d2.t VALUES ($i)
+ --eval XA END 'one_phase_$i'
+ --eval XA COMMIT 'one_phase_$i' ONE PHASE
+
+ --disconnect master_bulk_conn$i
+ --dec $i
+}
+--connection master
+--source include/sync_slave_sql_with_master.inc
+
+#
+# Overall consistency check
+#
+--let $diff_tables= master:d1.t, slave:d1.t
+--source include/diff_tables.inc
+--let $diff_tables= master:d2.t, slave:d2.t
+--source include/diff_tables.inc
+#
+# cleanup
+#
+--connection master
+
+DELETE FROM d1.t;
+DELETE FROM d2.t;
+DROP TABLE d1.t, d2.t;
+DROP DATABASE d1;
+DROP DATABASE d2;
+DROP VIEW v_processlist;
+
+--source include/sync_slave_sql_with_master.inc
+
+--source include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_lsu_off-slave.opt b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_lsu_off-slave.opt
new file mode 100644
index 00000000000..94c3650024f
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_lsu_off-slave.opt
@@ -0,0 +1,2 @@
+--log-slave-updates=off
+
diff --git a/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_lsu_off.test b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_lsu_off.test
new file mode 100644
index 00000000000..df3811df6ae
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_lsu_off.test
@@ -0,0 +1,8 @@
+# ==== Purpose ====
+# 'rpl_xa_survive_disconnect_lsu_off' verifies the same properties as the sourced file,
+# under the condition that the slave does not log its own updates
+# (lsu in the name stands for log_slave_updates).
+# Specifically, this mode aims at proving correct operation on the slave's
+# mysql.gtid_executed.
+
+--source ./rpl_xa_survive_disconnect.test
diff --git a/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_mixed_engines.test b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_mixed_engines.test
new file mode 100644
index 00000000000..f52a9630a87
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_xa_survive_disconnect_mixed_engines.test
@@ -0,0 +1,68 @@
+# BUG#12161 XA recovery and client disconnection
+#
+# The test verifies correct two-phase logging of an XA transaction and its
+# application in the case where the transaction updates both transactional
+# and non-transactional tables.
+# Transactions are terminated according to the parameters specified for
+# a sourced inc-file.
+
+--source include/have_innodb.inc
+--source include/master-slave.inc
+
+--connection master
+CALL mtr.add_suppression("Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT");
+
+--let $command=setup
+--source include/rpl_xa_mixed_engines.inc
+
+--echo === COMMIT ===
+--let $command=run
+--let $xa_terminate=XA COMMIT
+--let $xa_prepare_opt=1
+--source include/rpl_xa_mixed_engines.inc
+
+--source include/sync_slave_sql_with_master.inc
+--connection master
+
+--echo === COMMIT ONE PHASE ===
+
+--let $command=run
+--let $xa_terminate=XA COMMIT
+--let $one_phase=ONE PHASE
+--let $xa_prepare_opt=
+--source include/rpl_xa_mixed_engines.inc
+--let $one_phase=
+--source include/sync_slave_sql_with_master.inc
+--connection master
+
+--echo === ROLLBACK with PREPARE ===
+
+--let $command=run
+--let $xa_terminate=xa rollback
+--let $xa_prepare_opt=1
+--source include/rpl_xa_mixed_engines.inc
+
+--source include/sync_slave_sql_with_master.inc
+--connection master
+
+--echo === ROLLBACK with no PREPARE ===
+
+--let $command=run
+--let $xa_terminate=xa rollback
+--let $xa_prepare_opt=
+--source include/rpl_xa_mixed_engines.inc
+--let $xa_rollback_only=
+
+--source include/sync_slave_sql_with_master.inc
+
+--let $diff_tables= master:tm, slave:tm
+--source include/diff_tables.inc
+
+# Cleanup
+
+--connection master
+--let $command=cleanup
+--source include/rpl_xa_mixed_engines.inc
+
+--source include/sync_slave_sql_with_master.inc
+
+--source include/rpl_end.inc
diff --git a/sql/handler.cc b/sql/handler.cc
index 7d61252eea6..c72386c7e99 100644
--- a/sql/handler.cc
+++ b/sql/handler.cc
@@ -1300,6 +1300,14 @@ int ha_prepare(THD *thd)
}
}
+
+ DEBUG_SYNC(thd, "at_unlog_xa_prepare");
+
+ if (tc_log->unlog_xa_prepare(thd, all))
+ {
+ ha_rollback_trans(thd, all);
+ error=1;
+ }
}
DBUG_RETURN(error);
@@ -1853,7 +1861,8 @@ int ha_rollback_trans(THD *thd, bool all)
rollback without signalling following transactions. And in release
builds, we explicitly do the signalling before rolling back.
*/
- DBUG_ASSERT(!(thd->rgi_slave && thd->rgi_slave->did_mark_start_commit));
+ DBUG_ASSERT(!(thd->rgi_slave && thd->rgi_slave->did_mark_start_commit) ||
+ thd->transaction.xid_state.is_explicit_XA());
if (thd->rgi_slave && thd->rgi_slave->did_mark_start_commit)
thd->rgi_slave->unmark_start_commit();
}
diff --git a/sql/log.cc b/sql/log.cc
index 56e83bf2448..e13f8fbc88f 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -91,7 +91,13 @@ static bool binlog_savepoint_rollback_can_release_mdl(handlerton *hton,
static int binlog_commit(handlerton *hton, THD *thd, bool all);
static int binlog_rollback(handlerton *hton, THD *thd, bool all);
static int binlog_prepare(handlerton *hton, THD *thd, bool all);
+static int binlog_xa_recover_dummy(handlerton *hton, XID *xid_list, uint len);
+static int binlog_commit_by_xid(handlerton *hton, XID *xid);
+static int binlog_rollback_by_xid(handlerton *hton, XID *xid);
static int binlog_start_consistent_snapshot(handlerton *hton, THD *thd);
+static int binlog_flush_cache(THD *thd, binlog_cache_mngr *cache_mngr,
+ Log_event *end_ev, bool all, bool using_stmt,
+ bool using_trx);
static const LEX_CSTRING write_error_msg=
{ STRING_WITH_LEN("error writing to the binary log") };
@@ -1693,6 +1699,10 @@ int binlog_init(void *p)
{
binlog_hton->prepare= binlog_prepare;
binlog_hton->start_consistent_snapshot= binlog_start_consistent_snapshot;
+ binlog_hton->commit_by_xid= binlog_commit_by_xid;
+ binlog_hton->rollback_by_xid= binlog_rollback_by_xid;
+ // recover needs to be set to make xa{commit,rollback}_handlerton effective
+ binlog_hton->recover= binlog_xa_recover_dummy;
}
binlog_hton->flags= HTON_NOT_USER_SELECTABLE | HTON_HIDDEN;
return 0;
@@ -1765,7 +1775,8 @@ binlog_flush_cache(THD *thd, binlog_cache_mngr *cache_mngr,
DBUG_PRINT("enter", ("end_ev: %p", end_ev));
if ((using_stmt && !cache_mngr->stmt_cache.empty()) ||
- (using_trx && !cache_mngr->trx_cache.empty()))
+ (using_trx && !cache_mngr->trx_cache.empty()) ||
+ thd->transaction.xid_state.is_explicit_XA())
{
if (using_stmt && thd->binlog_flush_pending_rows_event(TRUE, FALSE))
DBUG_RETURN(1);
@@ -1837,6 +1848,17 @@ binlog_commit_flush_stmt_cache(THD *thd, bool all,
DBUG_RETURN(binlog_flush_cache(thd, cache_mngr, &end_evt, all, TRUE, FALSE));
}
+
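+/*
+  Helper for binlog_commit_flush_trx_cache() and
+  binlog_rollback_flush_trx_cache(): copies the leading query text
+  (e.g. "XA COMMIT ") into buf, appends the serialized XID after it, and
+  returns the total length of the statement text to be binlogged.
+*/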
+inline size_t serialize_with_xid(XID *xid, char *buf,
+ const char *query, size_t q_len)
+{
+ memcpy(buf, query, q_len);
+
+ return
+ q_len + strlen(static_cast<event_xid_t*>(xid)->serialize(buf + q_len));
+}
+
+
/**
This function flushes the trx-cache upon commit.
@@ -1850,11 +1872,28 @@ static inline int
binlog_commit_flush_trx_cache(THD *thd, bool all, binlog_cache_mngr *cache_mngr)
{
DBUG_ENTER("binlog_commit_flush_trx_cache");
- Query_log_event end_evt(thd, STRING_WITH_LEN("COMMIT"),
- TRUE, TRUE, TRUE, 0);
+
+ const char query[]= "XA COMMIT ";
+ const size_t q_len= sizeof(query) - 1; // do not count trailing 0
+ char buf[q_len + ser_buf_size]= "COMMIT";
+ size_t buflen= sizeof("COMMIT") - 1;
+
+ if (thd->lex->sql_command == SQLCOM_XA_COMMIT &&
+ thd->lex->xa_opt != XA_ONE_PHASE)
+ {
+ DBUG_ASSERT(thd->transaction.xid_state.is_explicit_XA());
+ DBUG_ASSERT(thd->transaction.xid_state.get_state_code() ==
+ XA_PREPARED);
+
+ buflen= serialize_with_xid(thd->transaction.xid_state.get_xid(),
+ buf, query, q_len);
+ }
+ Query_log_event end_evt(thd, buf, buflen, TRUE, TRUE, TRUE, 0);
+
DBUG_RETURN(binlog_flush_cache(thd, cache_mngr, &end_evt, all, FALSE, TRUE));
}
+
/**
This function flushes the trx-cache upon rollback.
@@ -1868,8 +1907,20 @@ static inline int
binlog_rollback_flush_trx_cache(THD *thd, bool all,
binlog_cache_mngr *cache_mngr)
{
- Query_log_event end_evt(thd, STRING_WITH_LEN("ROLLBACK"),
- TRUE, TRUE, TRUE, 0);
+ const char query[]= "XA ROLLBACK ";
+ const size_t q_len= sizeof(query) - 1; // do not count trailing 0
+ char buf[q_len + ser_buf_size]= "ROLLBACK";
+ size_t buflen= sizeof("ROLLBACK") - 1;
+
+ if (thd->transaction.xid_state.is_explicit_XA())
+ {
+ /* for not prepared use plain ROLLBACK */
+ if (thd->transaction.xid_state.get_state_code() == XA_PREPARED)
+ buflen= serialize_with_xid(thd->transaction.xid_state.get_xid(),
+ buf, query, q_len);
+ }
+ Query_log_event end_evt(thd, buf, buflen, TRUE, TRUE, TRUE, 0);
+
return (binlog_flush_cache(thd, cache_mngr, &end_evt, all, FALSE, TRUE));
}
@@ -1887,23 +1938,10 @@ static inline int
binlog_commit_flush_xid_caches(THD *thd, binlog_cache_mngr *cache_mngr,
bool all, my_xid xid)
{
- if (xid)
- {
- Xid_log_event end_evt(thd, xid, TRUE);
- return (binlog_flush_cache(thd, cache_mngr, &end_evt, all, TRUE, TRUE));
- }
- else
- {
- /*
- Empty xid occurs in XA COMMIT ... ONE PHASE.
- In this case, we do not have a MySQL xid for the transaction, and the
- external XA transaction coordinator will have to handle recovery if
- needed. So we end the transaction with a plain COMMIT query event.
- */
- Query_log_event end_evt(thd, STRING_WITH_LEN("COMMIT"),
- TRUE, TRUE, TRUE, 0);
- return (binlog_flush_cache(thd, cache_mngr, &end_evt, all, TRUE, TRUE));
- }
+ DBUG_ASSERT(xid); // replaced former treatment of ONE-PHASE XA
+
+ Xid_log_event end_evt(thd, xid, TRUE);
+ return (binlog_flush_cache(thd, cache_mngr, &end_evt, all, TRUE, TRUE));
}
/**
@@ -1959,17 +1997,67 @@ binlog_truncate_trx_cache(THD *thd, binlog_cache_mngr *cache_mngr, bool all)
DBUG_RETURN(error);
}
+
+inline bool is_preparing_xa(THD *thd)
+{
+ return
+ thd->transaction.xid_state.is_explicit_XA() &&
+ thd->lex->sql_command == SQLCOM_XA_PREPARE;
+}
+
+
static int binlog_prepare(handlerton *hton, THD *thd, bool all)
{
/*
- do nothing.
- just pretend we can do 2pc, so that MySQL won't
- switch to 1pc.
- real work will be done in MYSQL_BIN_LOG::log_and_order()
+ Do nothing unless the transaction is a user XA.
*/
+ return
+ !is_preparing_xa(thd) ? 0 : binlog_commit(NULL, thd, all);
+}
+
+
+static int binlog_xa_recover_dummy(handlerton *hton __attribute__((unused)),
+ XID *xid_list __attribute__((unused)),
+ uint len __attribute__((unused)))
+{
+ /* Does nothing. */
return 0;
}
+
+static int binlog_commit_by_xid(handlerton *hton, XID *xid)
+{
+ THD *thd= current_thd;
+
+ if (thd->transaction.xid_state.is_binlogged())
+ (void) thd->binlog_setup_trx_data();
+
+ DBUG_ASSERT(thd->lex->sql_command == SQLCOM_XA_COMMIT);
+
+ return binlog_commit(hton, thd, TRUE);
+}
+
+
+static int binlog_rollback_by_xid(handlerton *hton, XID *xid)
+{
+ THD *thd= current_thd;
+
+ if (thd->transaction.xid_state.is_binlogged())
+ (void) thd->binlog_setup_trx_data();
+
+ DBUG_ASSERT(thd->lex->sql_command == SQLCOM_XA_ROLLBACK);
+
+ return binlog_rollback(hton, thd, TRUE);
+}
+
+
+inline bool is_prepared_xa(THD *thd)
+{
+ return thd->transaction.xid_state.is_explicit_XA() &&
+ thd->transaction.xid_state.get_state_code() == XA_PREPARED;
+}
+
+
/*
We flush the cache wrapped in a beging/rollback if:
. aborting a single or multi-statement transaction and;
@@ -1992,7 +2080,55 @@ static bool trans_cannot_safely_rollback(THD *thd, bool all)
thd->wsrep_binlog_format() == BINLOG_FORMAT_MIXED) ||
(trans_has_updated_non_trans_table(thd) &&
ending_single_stmt_trans(thd,all) &&
- thd->wsrep_binlog_format() == BINLOG_FORMAT_MIXED));
+ thd->wsrep_binlog_format() == BINLOG_FORMAT_MIXED) ||
+ is_prepared_xa(thd));
+}
+
+
+/**
+  Specific log flusher invoked from binlog_commit() when an explicit
+  user XA transaction is being prepared (see is_preparing_xa()).
+*/
+static int binlog_commit_flush_xa_prepare(THD *thd, bool all,
+ binlog_cache_mngr *cache_mngr)
+{
+ XID *xid= thd->transaction.xid_state.get_xid();
+ {
+ // todo assert wsrep_simulate || is_open()
+
+ /*
+ Log the XA END event first.
+      We do not do that in trans_xa_end() because XA COMMIT ... ONE PHASE
+      is logged as a plain BEGIN/COMMIT, in which case the XA END must not
+      appear in the log.
+ */
+ const char query[]= "XA END ";
+ const size_t q_len= sizeof(query) - 1; // do not count trailing 0
+ char buf[q_len + ser_buf_size];
+ size_t buflen;
+ binlog_cache_data *cache_data;
+ IO_CACHE *file;
+
+ memcpy(buf, query, q_len);
+ buflen= q_len +
+ strlen(static_cast<event_xid_t*>(xid)->serialize(buf + q_len));
+ cache_data= cache_mngr->get_binlog_cache_data(true);
+ file= &cache_data->cache_log;
+ thd->lex->sql_command= SQLCOM_XA_END;
+ Query_log_event xa_end(thd, buf, buflen, true, false, true, 0);
+ if (mysql_bin_log.write_event(&xa_end, cache_data, file))
+ return 1;
+ thd->lex->sql_command= SQLCOM_XA_PREPARE;
+ }
+
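+  /*
+    A sketch of the resulting binlog group, assuming the transaction cache
+    already holds the XA START query event and the transaction's statement
+    or row events: Gtid, XA START <xid>, <events>, XA END <xid> (written
+    above), with XA_prepare_log_event as the group's end event flushed by
+    binlog_flush_cache() below.
+  */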
+ cache_mngr->using_xa= FALSE;
+ XA_prepare_log_event end_evt(thd, xid, FALSE);
+ /*
+    Remember that the prepare has been written to the binlog, so that this
+    fact can be recalled at commit time, possibly from another session.
+ */
+ if (thd->variables.option_bits & OPTION_BIN_LOG)
+ thd->transaction.xid_state.set_binlogged();
+ return (binlog_flush_cache(thd, cache_mngr, &end_evt, all, TRUE, TRUE));
}
@@ -2019,7 +2155,9 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
if (!cache_mngr)
{
- DBUG_ASSERT(WSREP(thd));
+ DBUG_ASSERT(WSREP(thd) ||
+ (thd->transaction.xid_state.is_explicit_XA() &&
+ !thd->transaction.xid_state.is_binlogged()));
DBUG_RETURN(0);
}
@@ -2038,7 +2176,8 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
error= binlog_commit_flush_stmt_cache(thd, all, cache_mngr);
}
- if (cache_mngr->trx_cache.empty())
+ if (cache_mngr->trx_cache.empty() &&
+ !thd->transaction.xid_state.is_binlogged())
{
/*
we're here because cache_log was flushed in MYSQL_BIN_LOG::log_xid()
@@ -2055,8 +2194,11 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
Otherwise, we accumulate the changes.
*/
if (likely(!error) && ending_trans(thd, all))
- error= binlog_commit_flush_trx_cache(thd, all, cache_mngr);
-
+ {
+ error= !is_preparing_xa(thd) ?
+ binlog_commit_flush_trx_cache (thd, all, cache_mngr) :
+ binlog_commit_flush_xa_prepare(thd, all, cache_mngr);
+ }
/*
This is part of the stmt rollback.
*/
@@ -2080,13 +2222,16 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
static int binlog_rollback(handlerton *hton, THD *thd, bool all)
{
DBUG_ENTER("binlog_rollback");
+
int error= 0;
binlog_cache_mngr *const cache_mngr=
(binlog_cache_mngr*) thd_get_ha_data(thd, binlog_hton);
if (!cache_mngr)
{
- DBUG_ASSERT(WSREP(thd));
+ DBUG_ASSERT(WSREP(thd) ||
+ (is_prepared_xa(thd) ||
+ !thd->transaction.xid_state.is_binlogged()));
DBUG_RETURN(0);
}
@@ -2101,15 +2246,16 @@ static int binlog_rollback(handlerton *hton, THD *thd, bool all)
*/
if (cache_mngr->stmt_cache.has_incident())
{
- error= mysql_bin_log.write_incident(thd);
+ error |= static_cast<int>(mysql_bin_log.write_incident(thd));
cache_mngr->reset(true, false);
}
else if (!cache_mngr->stmt_cache.empty())
{
- error= binlog_commit_flush_stmt_cache(thd, all, cache_mngr);
+ error |= binlog_commit_flush_stmt_cache(thd, all, cache_mngr);
}
- if (cache_mngr->trx_cache.empty())
+ if (cache_mngr->trx_cache.empty() &&
+ !thd->transaction.xid_state.is_binlogged())
{
/*
we're here because cache_log was flushed in MYSQL_BIN_LOG::log_xid()
@@ -7340,10 +7486,10 @@ MYSQL_BIN_LOG::write_transaction_to_binlog(THD *thd,
entry.all= all;
entry.using_stmt_cache= using_stmt_cache;
entry.using_trx_cache= using_trx_cache;
- entry.need_unlog= false;
+ entry.need_unlog= is_preparing_xa(thd);
ha_info= all ? thd->transaction.all.ha_list : thd->transaction.stmt.ha_list;
- for (; ha_info; ha_info= ha_info->next())
+ for (; !entry.need_unlog && ha_info; ha_info= ha_info->next())
{
if (ha_info->is_started() && ha_info->ht() != binlog_hton &&
!ha_info->ht()->commit_checkpoint_request)
@@ -7916,7 +8062,9 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
We already checked before that at least one cache is non-empty; if both
are empty we would have skipped calling into here.
*/
- DBUG_ASSERT(!cache_mngr->stmt_cache.empty() || !cache_mngr->trx_cache.empty());
+ DBUG_ASSERT(!cache_mngr->stmt_cache.empty() ||
+ !cache_mngr->trx_cache.empty() ||
+ current->thd->transaction.xid_state.is_explicit_XA());
if (unlikely((current->error= write_transaction_or_stmt(current,
commit_id))))
@@ -7925,7 +8073,7 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
strmake_buf(cache_mngr->last_commit_pos_file, log_file_name);
commit_offset= my_b_write_tell(&log_file);
cache_mngr->last_commit_pos_offset= commit_offset;
- if (cache_mngr->using_xa && cache_mngr->xa_xid)
+ if ((cache_mngr->using_xa && cache_mngr->xa_xid) || current->need_unlog)
{
/*
If all storage engines support commit_checkpoint_request(), then we
@@ -8160,7 +8308,8 @@ MYSQL_BIN_LOG::write_transaction_or_stmt(group_commit_entry *entry,
binlog_cache_mngr *mngr= entry->cache_mngr;
DBUG_ENTER("MYSQL_BIN_LOG::write_transaction_or_stmt");
- if (write_gtid_event(entry->thd, false, entry->using_trx_cache, commit_id))
+ if (write_gtid_event(entry->thd, is_prepared_xa(entry->thd),
+ entry->using_trx_cache, commit_id))
DBUG_RETURN(ER_ERROR_ON_WRITE);
if (entry->using_stmt_cache && !mngr->stmt_cache.empty() &&
@@ -9862,6 +10011,24 @@ int TC_LOG_BINLOG::unlog(ulong cookie, my_xid xid)
DBUG_RETURN(BINLOG_COOKIE_GET_ERROR_FLAG(cookie));
}
+
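+/*
+  Called from ha_prepare() after an XA PREPARE group has been written to the
+  binary log. If that write registered a commit checkpoint request
+  (need_unlog), it is accounted for here through unlog(), with a dummy xid.
+*/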
+int TC_LOG_BINLOG::unlog_xa_prepare(THD *thd, bool all)
+{
+ DBUG_ASSERT(is_preparing_xa(thd));
+
+ binlog_cache_mngr *cache_mngr= thd->binlog_setup_trx_data();
+ int cookie= 0;
+
+ if (!cache_mngr || !cache_mngr->need_unlog)
+ return 0;
+ else
+ cookie= BINLOG_COOKIE_MAKE(cache_mngr->binlog_id, cache_mngr->delayed_error);
+ cache_mngr->need_unlog= false;
+
+ return unlog(cookie, 1);
+}
+
+
void
TC_LOG_BINLOG::commit_checkpoint_notify(void *cookie)
{
@@ -10178,6 +10345,7 @@ int TC_LOG_BINLOG::recover(LOG_INFO *linfo, const char *last_log_name,
((last_gtid_standalone && !ev->is_part_of_group(typ)) ||
(!last_gtid_standalone &&
(typ == XID_EVENT ||
+ typ == XA_PREPARE_LOG_EVENT ||
(LOG_EVENT_IS_QUERY(typ) &&
(((Query_log_event *)ev)->is_commit() ||
((Query_log_event *)ev)->is_rollback()))))))
diff --git a/sql/log.h b/sql/log.h
index 8684eaba786..8e70d3c8f4c 100644
--- a/sql/log.h
+++ b/sql/log.h
@@ -61,6 +61,7 @@ class TC_LOG
bool need_prepare_ordered,
bool need_commit_ordered) = 0;
virtual int unlog(ulong cookie, my_xid xid)=0;
+ virtual int unlog_xa_prepare(THD *thd, bool all)= 0;
virtual void commit_checkpoint_notify(void *cookie)= 0;
protected:
@@ -115,6 +116,10 @@ class TC_LOG_DUMMY: public TC_LOG // use it to disable the logging
return 1;
}
int unlog(ulong cookie, my_xid xid) { return 0; }
+ int unlog_xa_prepare(THD *thd, bool all)
+ {
+ return 0;
+ }
void commit_checkpoint_notify(void *cookie) { DBUG_ASSERT(0); };
};
@@ -198,6 +203,10 @@ class TC_LOG_MMAP: public TC_LOG
int log_and_order(THD *thd, my_xid xid, bool all,
bool need_prepare_ordered, bool need_commit_ordered);
int unlog(ulong cookie, my_xid xid);
+ int unlog_xa_prepare(THD *thd, bool all)
+ {
+ return 0;
+ }
void commit_checkpoint_notify(void *cookie);
int recover();
@@ -695,6 +704,7 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
int log_and_order(THD *thd, my_xid xid, bool all,
bool need_prepare_ordered, bool need_commit_ordered);
int unlog(ulong cookie, my_xid xid);
+ int unlog_xa_prepare(THD *thd, bool all);
void commit_checkpoint_notify(void *cookie);
int recover(LOG_INFO *linfo, const char *last_log_name, IO_CACHE *first_log,
Format_description_log_event *fdle, bool do_xa);
diff --git a/sql/log_event.cc b/sql/log_event.cc
index 4c1c18fffff..ee44f7f1da4 100644
--- a/sql/log_event.cc
+++ b/sql/log_event.cc
@@ -1178,6 +1178,9 @@ Log_event* Log_event::read_log_event(const char* buf, uint event_len,
case XID_EVENT:
ev = new Xid_log_event(buf, fdle);
break;
+ case XA_PREPARE_LOG_EVENT:
+ ev = new XA_prepare_log_event(buf, fdle);
+ break;
case RAND_EVENT:
ev = new Rand_log_event(buf, fdle);
break;
@@ -1229,7 +1232,6 @@ Log_event* Log_event::read_log_event(const char* buf, uint event_len,
case PREVIOUS_GTIDS_LOG_EVENT:
case TRANSACTION_CONTEXT_EVENT:
case VIEW_CHANGE_EVENT:
- case XA_PREPARE_LOG_EVENT:
ev= new Ignorable_log_event(buf, fdle,
get_type_str((Log_event_type) event_type));
break;
@@ -2066,6 +2068,7 @@ Format_description_log_event(uint8 binlog_ver, const char* server_ver)
post_header_len[USER_VAR_EVENT-1]= USER_VAR_HEADER_LEN;
post_header_len[FORMAT_DESCRIPTION_EVENT-1]= FORMAT_DESCRIPTION_HEADER_LEN;
post_header_len[XID_EVENT-1]= XID_HEADER_LEN;
+ post_header_len[XA_PREPARE_LOG_EVENT-1]= XA_PREPARE_HEADER_LEN;
post_header_len[BEGIN_LOAD_QUERY_EVENT-1]= BEGIN_LOAD_QUERY_HEADER_LEN;
post_header_len[EXECUTE_LOAD_QUERY_EVENT-1]= EXECUTE_LOAD_QUERY_HEADER_LEN;
/*
@@ -2556,7 +2559,7 @@ Binlog_checkpoint_log_event::Binlog_checkpoint_log_event(
Gtid_log_event::Gtid_log_event(const char *buf, uint event_len,
const Format_description_log_event *description_event)
- : Log_event(buf, description_event), seq_no(0), commit_id(0)
+ : Log_event(buf, description_event), seq_no(0), commit_id(0), xid_pins_idx(0)
{
uint8 header_size= description_event->common_header_len;
uint8 post_header_len= description_event->post_header_len[GTID_EVENT-1];
@@ -2569,7 +2572,7 @@ Gtid_log_event::Gtid_log_event(const char *buf, uint event_len,
buf+= 8;
domain_id= uint4korr(buf);
buf+= 4;
- flags2= *buf;
+ flags2= *(buf++);
if (flags2 & FL_GROUP_COMMIT_ID)
{
if (event_len < (uint)header_size + GTID_HEADER_LEN + 2)
@@ -2577,8 +2580,24 @@ Gtid_log_event::Gtid_log_event(const char *buf, uint event_len,
seq_no= 0; // So is_valid() returns false
return;
}
- ++buf;
commit_id= uint8korr(buf);
+ buf+= 8;
+ }
+ if (flags2 & (FL_PREPARED_XA | FL_COMPLETED_XA))
+ {
+ uint32 temp= 0;
+
+ memcpy(&temp, buf, sizeof(temp));
+ xid.formatID= uint4korr(&temp);
+ buf += sizeof(temp);
+
+ xid.gtrid_length= (long) buf[0];
+ xid.bqual_length= (long) buf[1];
+ buf+= 2;
+
+ long data_length= xid.bqual_length + xid.gtrid_length;
+ memcpy(xid.data, buf, data_length);
+ buf+= data_length;
}
}
@@ -2764,7 +2783,7 @@ Rand_log_event::Rand_log_event(const char* buf,
Xid_log_event::
Xid_log_event(const char* buf,
const Format_description_log_event *description_event)
- :Log_event(buf, description_event)
+ :Xid_apply_log_event(buf, description_event)
{
/* The Post-Header is empty. The Variable Data part begins immediately. */
buf+= description_event->common_header_len +
@@ -2772,6 +2791,36 @@ Xid_log_event(const char* buf,
memcpy((char*) &xid, buf, sizeof(xid));
}
+/**************************************************************************
+ XA_prepare_log_event methods
+**************************************************************************/
+XA_prepare_log_event::
+XA_prepare_log_event(const char* buf,
+ const Format_description_log_event *description_event)
+ :Xid_apply_log_event(buf, description_event)
+{
+ uint32 temp= 0;
+ uint8 temp_byte;
+
+ buf+= description_event->common_header_len +
+ description_event->post_header_len[XA_PREPARE_LOG_EVENT-1];
+ memcpy(&temp_byte, buf, 1);
+ one_phase= (bool) temp_byte;
+ buf += sizeof(temp_byte);
+ memcpy(&temp, buf, sizeof(temp));
+ m_xid.formatID= uint4korr(&temp);
+ buf += sizeof(temp);
+ memcpy(&temp, buf, sizeof(temp));
+ m_xid.gtrid_length= uint4korr(&temp);
+ buf += sizeof(temp);
+ memcpy(&temp, buf, sizeof(temp));
+ m_xid.bqual_length= uint4korr(&temp);
+ buf += sizeof(temp);
+ memcpy(m_xid.data, buf, m_xid.gtrid_length + m_xid.bqual_length);
+
+ xid= NULL;
+}
+
/**************************************************************************
User_var_log_event methods
diff --git a/sql/log_event.h b/sql/log_event.h
index 15442bd5a97..a6543b70eb5 100644
--- a/sql/log_event.h
+++ b/sql/log_event.h
@@ -222,6 +222,7 @@ class String;
#define GTID_HEADER_LEN 19
#define GTID_LIST_HEADER_LEN 4
#define START_ENCRYPTION_HEADER_LEN 0
+#define XA_PREPARE_HEADER_LEN 0
/*
Max number of possible extra bytes in a replication event compared to a
@@ -664,6 +665,7 @@ enum Log_event_type
/* MySQL 5.7 events, ignored by MariaDB */
TRANSACTION_CONTEXT_EVENT= 36,
VIEW_CHANGE_EVENT= 37,
+ /* not ignored */
XA_PREPARE_LOG_EVENT= 38,
/*
@@ -3022,6 +3024,32 @@ class Rand_log_event: public Log_event
#endif
};
+
+class Xid_apply_log_event: public Log_event
+{
+public:
+#ifdef MYSQL_SERVER
+ Xid_apply_log_event(THD* thd_arg):
+ Log_event(thd_arg, 0, TRUE) {}
+#endif
+ Xid_apply_log_event(const char* buf,
+ const Format_description_log_event *description_event):
+ Log_event(buf, description_event) {}
+
+ ~Xid_apply_log_event() {}
+ bool is_valid() const { return 1; }
+private:
+#if defined(MYSQL_SERVER) && defined(HAVE_REPLICATION)
+ virtual int do_commit()= 0;
+ virtual int do_apply_event(rpl_group_info *rgi);
+ int do_record_gtid(THD *thd, rpl_group_info *rgi, bool in_trans,
+ void **out_hton);
+ enum_skip_reason do_shall_skip(rpl_group_info *rgi);
+ virtual const char* get_query()= 0;
+#endif
+};
+
+
/**
@class Xid_log_event
@@ -3034,18 +3062,22 @@ class Rand_log_event: public Log_event
typedef ulonglong my_xid; // this line is the same as in handler.h
#endif
-class Xid_log_event: public Log_event
+class Xid_log_event: public Xid_apply_log_event
{
- public:
- my_xid xid;
+public:
+ my_xid xid;
#ifdef MYSQL_SERVER
Xid_log_event(THD* thd_arg, my_xid x, bool direct):
- Log_event(thd_arg, 0, TRUE), xid(x)
+ Xid_apply_log_event(thd_arg), xid(x)
{
if (direct)
cache_type= Log_event::EVENT_NO_CACHE;
}
+ const char* get_query()
+ {
+ return "COMMIT /* implicit, from Xid_log_event */";
+ }
#ifdef HAVE_REPLICATION
void pack_info(Protocol* protocol);
#endif /* HAVE_REPLICATION */
@@ -3061,15 +3093,171 @@ class Xid_log_event: public Log_event
#ifdef MYSQL_SERVER
bool write();
#endif
- bool is_valid() const { return 1; }
private:
#if defined(MYSQL_SERVER) && defined(HAVE_REPLICATION)
- virtual int do_apply_event(rpl_group_info *rgi);
- enum_skip_reason do_shall_skip(rpl_group_info *rgi);
+ int do_commit();
#endif
};
+
+/**
+ @class XA_prepare_log_event
+
+  Similar to Xid_log_event except that
+  - it is specific to XA transactions
+  - it carries out the prepare logic rather than the final commit
+    when the @c one_phase member is off. The latter option exists only
+    for compatibility with upstream MySQL.
+
+  From the grouping perspective the event finalizes the current
+  "prepare" group that was started with Gtid_log_event, similarly to a
+  regular replicated transaction.
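+
+  In outline, an XA transaction is thus binlogged as two event groups:
+  a "prepare" group (Gtid_log_event carrying FL_PREPARED_XA and the xid,
+  the transaction's events, and a terminating XA_PREPARE_LOG_EVENT),
+  followed later by a "completion" group (Gtid_log_event with
+  FL_COMPLETED_XA and a Query_log_event containing XA COMMIT or
+  XA ROLLBACK).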
+*/
+
+/**
+  Serializes an XID, which is characterized by the last four arguments
+  of the function.
+  The serialized XID is presented in valid hex format and is returned to
+  the caller in the buffer pointed to by the first argument.
+  The buffer provided by the caller must be at least
+  8 + 2 * XIDDATASIZE + 4 * sizeof(XID::formatID) + 1 bytes; see the
+  {MYSQL_,}XID definitions.
+
+ @param buf pointer to a buffer allocated for storing serialized data
+ @param fmt formatID value
+ @param gln gtrid_length value
+ @param bln bqual_length value
+ @param dat data value
+
+ @return the value of the buffer pointer
+*/
+
+inline char *serialize_xid(char *buf, long fmt, long gln, long bln,
+ const char *dat)
+{
+ int i;
+ char *c= buf;
+ /*
+ Build a string consisting of the hex format representation of XID
+ as passed through fmt,gln,bln,dat argument:
+ X'hex11hex12...hex1m',X'hex21hex22...hex2n',11
+ and store it into buf.
+ */
+ c[0]= 'X';
+ c[1]= '\'';
+ c+= 2;
+ for (i= 0; i < gln; i++)
+ {
+ c[0]=_dig_vec_lower[((uchar*) dat)[i] >> 4];
+ c[1]=_dig_vec_lower[((uchar*) dat)[i] & 0x0f];
+ c+= 2;
+ }
+ c[0]= '\'';
+ c[1]= ',';
+ c[2]= 'X';
+ c[3]= '\'';
+ c+= 4;
+
+ for (; i < gln + bln; i++)
+ {
+ c[0]=_dig_vec_lower[((uchar*) dat)[i] >> 4];
+ c[1]=_dig_vec_lower[((uchar*) dat)[i] & 0x0f];
+ c+= 2;
+ }
+ c[0]= '\'';
+ sprintf(c+1, ",%lu", fmt);
+
+ return buf;
+}
+
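+/*
+  For illustration (example values): an XID with formatID= 11,
+  gtrid= "tr1" and bqual= "bq1" is serialized by the function above as
+    X'747231',X'627131',11
+  i.e. both components hex-encoded, with the formatID appended in decimal.
+*/
+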
+/*
+  The size of the string containing the serialized XID representation
+  is computed as the sum of
+  eight formatting symbols (X'',X'',),
+  plus 2 x XIDDATASIZE (2 due to hex encoding),
+  plus space for the decimal digits of XID::formatID,
+  plus one for the terminating '\0'.
+*/
+static const uint ser_buf_size=
+ 8 + 2 * MYSQL_XIDDATASIZE + 4 * sizeof(long) + 1;
+
+struct event_mysql_xid_t : MYSQL_XID
+{
+ char buf[ser_buf_size];
+ char *serialize()
+ {
+ return serialize_xid(buf, formatID, gtrid_length, bqual_length, data);
+ }
+};
+
+#ifndef MYSQL_CLIENT
+struct event_xid_t : XID
+{
+ char buf[ser_buf_size];
+
+ char *serialize(char *buf_arg)
+ {
+ return serialize_xid(buf_arg, formatID, gtrid_length, bqual_length, data);
+ }
+ char *serialize()
+ {
+ return serialize(buf);
+ }
+};
+#endif
+
+class XA_prepare_log_event: public Xid_apply_log_event
+{
+protected:
+
+  /* Size of the fixed XID fields (formatID, gtrid_length, bqual_length)
+     in the subheader written by write(). */
+ static const int xid_subheader_no_data= 12;
+ event_mysql_xid_t m_xid;
+ void *xid;
+ bool one_phase;
+
+public:
+#ifdef MYSQL_SERVER
+ XA_prepare_log_event(THD* thd_arg, XID *xid_arg, bool one_phase_arg):
+ Xid_apply_log_event(thd_arg), xid(xid_arg), one_phase(one_phase_arg)
+ {
+ cache_type= Log_event::EVENT_NO_CACHE;
+ }
+#ifdef HAVE_REPLICATION
+ void pack_info(Protocol* protocol);
+#endif /* HAVE_REPLICATION */
+#else
+ bool print(FILE* file, PRINT_EVENT_INFO* print_event_info);
+#endif
+ XA_prepare_log_event(const char* buf,
+ const Format_description_log_event *description_event);
+ ~XA_prepare_log_event() {}
+ Log_event_type get_type_code() { return XA_PREPARE_LOG_EVENT; }
+ int get_data_size()
+ {
+ return xid_subheader_no_data + m_xid.gtrid_length + m_xid.bqual_length;
+ }
+
+#ifdef MYSQL_SERVER
+ bool write();
+#endif
+
+private:
+#if defined(MYSQL_SERVER) && defined(HAVE_REPLICATION)
+ char query[sizeof("XA COMMIT ONE PHASE") + 1 + ser_buf_size];
+ int do_commit();
+ const char* get_query()
+ {
+ sprintf(query,
+ (one_phase ? "XA COMMIT %s ONE PHASE" : "XA PREPARE %s"),
+ m_xid.serialize());
+ return query;
+ }
+#endif
+};
+
+
/**
@class User_var_log_event
@@ -3382,8 +3570,13 @@ class Gtid_log_event: public Log_event
uint64 seq_no;
uint64 commit_id;
uint32 domain_id;
+#ifdef MYSQL_SERVER
+ event_xid_t xid;
+#else
+ event_mysql_xid_t xid;
+#endif
uchar flags2;
-
+ uint32 xid_pins_idx;
/* Flags2. */
/* FL_STANDALONE is set when there is no terminating COMMIT event. */
@@ -3410,6 +3603,10 @@ class Gtid_log_event: public Log_event
static const uchar FL_WAITED= 16;
/* FL_DDL is set for event group containing DDL. */
static const uchar FL_DDL= 32;
+  /* FL_PREPARED_XA is set for a prepared XA transaction. */
+  static const uchar FL_PREPARED_XA= 64;
+  /* FL_COMPLETED_XA is set for a committed or rolled-back XA transaction. */
+ static const uchar FL_COMPLETED_XA= 128;
#ifdef MYSQL_SERVER
Gtid_log_event(THD *thd_arg, uint64 seq_no, uint32 domain_id, bool standalone,
diff --git a/sql/log_event_client.cc b/sql/log_event_client.cc
index cae4842355a..0e57b82476b 100644
--- a/sql/log_event_client.cc
+++ b/sql/log_event_client.cc
@@ -3892,11 +3892,44 @@ Gtid_log_event::print(FILE *file, PRINT_EVENT_INFO *print_event_info)
buf, print_event_info->delimiter))
goto err;
}
- if (!(flags2 & FL_STANDALONE))
- if (my_b_printf(&cache, is_flashback ? "COMMIT\n%s\n" : "BEGIN\n%s\n", print_event_info->delimiter))
+ if ((flags2 & FL_PREPARED_XA) && !is_flashback)
+ {
+ my_b_write_string(&cache, "XA START ");
+ xid.serialize();
+ my_b_write(&cache, (uchar*) xid.buf, strlen(xid.buf));
+ if (my_b_printf(&cache, "%s\n", print_event_info->delimiter))
+ goto err;
+ }
+ else if (!(flags2 & FL_STANDALONE))
+ {
+ if (my_b_printf(&cache, is_flashback ? "COMMIT\n%s\n" : "BEGIN\n%s\n",
+ print_event_info->delimiter))
goto err;
+ }
return cache.flush_data();
err:
return 1;
}
+
+bool XA_prepare_log_event::print(FILE* file, PRINT_EVENT_INFO* print_event_info)
+{
+ Write_on_release_cache cache(&print_event_info->head_cache, file,
+ Write_on_release_cache::FLUSH_F, this);
+ m_xid.serialize();
+
+ if (!print_event_info->short_form)
+ {
+ print_header(&cache, print_event_info, FALSE);
+ if (my_b_printf(&cache, "\tXID = %s\n", m_xid.buf))
+ goto error;
+ }
+
+ if (my_b_printf(&cache, "XA PREPARE %s\n%s\n",
+ m_xid.buf, print_event_info->delimiter))
+ goto error;
+
+ return cache.flush_data();
+error:
+ return TRUE;
+}
diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc
index 202a41c2837..210d321d2cd 100644
--- a/sql/log_event_server.cc
+++ b/sql/log_event_server.cc
@@ -1487,6 +1487,7 @@ Query_log_event::Query_log_event(THD* thd_arg, const char* query_arg, size_t que
case SQLCOM_RELEASE_SAVEPOINT:
case SQLCOM_ROLLBACK_TO_SAVEPOINT:
case SQLCOM_SAVEPOINT:
+ case SQLCOM_XA_END:
use_cache= trx_cache= TRUE;
break;
default:
@@ -3218,6 +3219,26 @@ Gtid_log_event::Gtid_log_event(THD *thd_arg, uint64 seq_no_arg,
/* Preserve any DDL or WAITED flag in the slave's binlog. */
if (thd_arg->rgi_slave)
flags2|= (thd_arg->rgi_slave->gtid_ev_flags2 & (FL_DDL|FL_WAITED));
+
+ XID_STATE &xid_state= thd->transaction.xid_state;
+ if (is_transactional && xid_state.is_explicit_XA() &&
+ ((xid_state.get_state_code() == XA_IDLE &&
+ thd->lex->sql_command == SQLCOM_XA_PREPARE) ||
+ xid_state.get_state_code() == XA_PREPARED))
+ {
+ DBUG_ASSERT(thd->lex->xa_opt != XA_ONE_PHASE);
+ DBUG_ASSERT(xid_state.get_state_code() == XA_IDLE ||
+ xid_state.is_binlogged());
+
+ flags2|= xid_state.get_state_code() == XA_IDLE ?
+ FL_PREPARED_XA : FL_COMPLETED_XA;
+
+ xid.formatID= xid_state.get_xid()->formatID;
+ xid.gtrid_length= xid_state.get_xid()->gtrid_length;
+ xid.bqual_length= xid_state.get_xid()->bqual_length;
+ long data_length= xid.bqual_length + xid.gtrid_length;
+ memcpy(xid.data, xid_state.get_xid()->data, data_length);
+ }
}
@@ -3260,7 +3281,7 @@ Gtid_log_event::peek(const char *event_start, size_t event_len,
bool
Gtid_log_event::write()
{
- uchar buf[GTID_HEADER_LEN+2];
+ uchar buf[GTID_HEADER_LEN+2+sizeof(XID)];
size_t write_len;
int8store(buf, seq_no);
@@ -3272,8 +3293,22 @@ Gtid_log_event::write()
write_len= GTID_HEADER_LEN + 2;
}
else
+ write_len= 13;
+
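+  /*
+    Optional XA xid tail, present when FL_PREPARED_XA or FL_COMPLETED_XA is
+    set: 4-byte formatID, 1-byte gtrid_length, 1-byte bqual_length, then the
+    gtrid and bqual data back to back.
+  */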
+ if (flags2 & (FL_PREPARED_XA | FL_COMPLETED_XA))
+ {
+ int4store(&buf[write_len], xid.formatID);
+ buf[write_len +4]= (uchar) xid.gtrid_length;
+ buf[write_len +4+1]= (uchar) xid.bqual_length;
+ write_len+= 6;
+ long data_length= xid.bqual_length + xid.gtrid_length;
+ memcpy(buf+write_len, xid.data, data_length);
+ write_len+= data_length;
+ }
+
+ if (write_len < GTID_HEADER_LEN)
{
- bzero(buf+13, GTID_HEADER_LEN-13);
+ bzero(buf+write_len, GTID_HEADER_LEN-write_len);
write_len= GTID_HEADER_LEN;
}
return write_header(write_len) ||
@@ -3316,9 +3351,14 @@ Gtid_log_event::make_compatible_event(String *packet, bool *need_dummy_event,
void
Gtid_log_event::pack_info(Protocol *protocol)
{
- char buf[6+5+10+1+10+1+20+1+4+20+1];
+ char buf[6+5+10+1+10+1+20+1+4+20+1+ ser_buf_size+5 /* sprintf */];
char *p;
- p = strmov(buf, (flags2 & FL_STANDALONE ? "GTID " : "BEGIN GTID "));
+ p = strmov(buf, (flags2 & FL_STANDALONE ? "GTID " :
+ flags2 & FL_PREPARED_XA ? "XA START " : "BEGIN GTID "));
+ if (flags2 & FL_PREPARED_XA)
+ {
+ p += sprintf(p, "%s GTID ", xid.serialize());
+ }
p= longlong10_to_str(domain_id, p, 10);
*p++= '-';
p= longlong10_to_str(server_id, p, 10);
@@ -3378,16 +3418,37 @@ Gtid_log_event::do_apply_event(rpl_group_info *rgi)
bits|= (ulonglong)OPTION_RPL_SKIP_PARALLEL;
thd->variables.option_bits= bits;
DBUG_PRINT("info", ("Set OPTION_GTID_BEGIN"));
- thd->set_query_and_id(gtid_begin_string, sizeof(gtid_begin_string)-1,
- &my_charset_bin, next_query_id());
- thd->lex->sql_command= SQLCOM_BEGIN;
thd->is_slave_error= 0;
- status_var_increment(thd->status_var.com_stat[thd->lex->sql_command]);
- if (trans_begin(thd, 0))
+
+ char buf_xa[sizeof("XA START") + 1 + ser_buf_size];
+ if (flags2 & FL_PREPARED_XA)
{
- DBUG_PRINT("error", ("trans_begin() failed"));
- thd->is_slave_error= 1;
+ const char fmt[]= "XA START %s";
+
+ thd->lex->xid= &xid;
+ thd->lex->xa_opt= XA_NONE;
+ sprintf(buf_xa, fmt, xid.serialize());
+ thd->set_query_and_id(buf_xa, static_cast<uint32>(strlen(buf_xa)),
+ &my_charset_bin, next_query_id());
+ thd->lex->sql_command= SQLCOM_XA_START;
+ if (trans_xa_start(thd))
+ {
+ DBUG_PRINT("error", ("trans_xa_start() failed"));
+ thd->is_slave_error= 1;
+ }
+ }
+ else
+ {
+ thd->set_query_and_id(gtid_begin_string, sizeof(gtid_begin_string)-1,
+ &my_charset_bin, next_query_id());
+ thd->lex->sql_command= SQLCOM_BEGIN;
+ if (trans_begin(thd, 0))
+ {
+ DBUG_PRINT("error", ("trans_begin() failed"));
+ thd->is_slave_error= 1;
+ }
}
+ status_var_increment(thd->status_var.com_stat[thd->lex->sql_command]);
thd->update_stats();
if (likely(!thd->is_slave_error))
@@ -3771,46 +3832,58 @@ bool slave_execute_deferred_events(THD *thd)
/**************************************************************************
- Xid_log_event methods
+ Xid_apply_log_event methods
**************************************************************************/
#if defined(HAVE_REPLICATION)
-void Xid_log_event::pack_info(Protocol *protocol)
+
+int Xid_apply_log_event::do_record_gtid(THD *thd, rpl_group_info *rgi,
+ bool in_trans, void **out_hton)
{
- char buf[128], *pos;
- pos= strmov(buf, "COMMIT /* xid=");
- pos= longlong10_to_str(xid, pos, 10);
- pos= strmov(pos, " */");
- protocol->store(buf, (uint) (pos-buf), &my_charset_bin);
-}
-#endif
+ int err= 0;
+ Relay_log_info const *rli= rgi->rli;
+ rgi->gtid_pending= false;
+ err= rpl_global_gtid_slave_state->record_gtid(thd, &rgi->current_gtid,
+ rgi->gtid_sub_id,
+ in_trans, false, out_hton);
-bool Xid_log_event::write()
-{
- DBUG_EXECUTE_IF("do_not_write_xid", return 0;);
- return write_header(sizeof(xid)) ||
- write_data((uchar*)&xid, sizeof(xid)) ||
- write_footer();
-}
+ if (unlikely(err))
+ {
+ int ec= thd->get_stmt_da()->sql_errno();
+ /*
+ Do not report an error if this is really a kill due to a deadlock.
+ In this case, the transaction will be re-tried instead.
+ */
+ if (!is_parallel_retry_error(rgi, ec))
+ rli->report(ERROR_LEVEL, ER_CANNOT_UPDATE_GTID_STATE, rgi->gtid_info(),
+ "Error during XID COMMIT: failed to update GTID state in "
+ "%s.%s: %d: %s",
+ "mysql", rpl_gtid_slave_state_table_name.str, ec,
+ thd->get_stmt_da()->message());
+ thd->is_slave_error= 1;
+ }
+ return err;
+}
-#if defined(HAVE_REPLICATION)
-int Xid_log_event::do_apply_event(rpl_group_info *rgi)
+int Xid_apply_log_event::do_apply_event(rpl_group_info *rgi)
{
bool res;
int err;
- rpl_gtid gtid;
uint64 sub_id= 0;
- Relay_log_info const *rli= rgi->rli;
void *hton= NULL;
+ rpl_gtid gtid;
/*
- XID_EVENT works like a COMMIT statement. And it also updates the
- mysql.gtid_slave_pos table with the GTID of the current transaction.
-
+ An instance of this class such as XID_EVENT works like a COMMIT
+ statement. It updates mysql.gtid_slave_pos with the GTID of the
+ current transaction.
Therefore, it acts much like a normal SQL statement, so we need to do
THD::reset_for_next_command() as if starting a new statement.
+
+    XA_PREPARE_LOG_EVENT also updates the gtid table, *but* that update gets
+    committed as a separate "autocommit" transaction.
*/
thd->reset_for_next_command();
/*
@@ -3824,57 +3897,50 @@ int Xid_log_event::do_apply_event(rpl_group_info *rgi)
if (rgi->gtid_pending)
{
sub_id= rgi->gtid_sub_id;
- rgi->gtid_pending= false;
-
gtid= rgi->current_gtid;
- err= rpl_global_gtid_slave_state->record_gtid(thd, >id, sub_id, true,
- false, &hton);
- if (unlikely(err))
+
+ if (!thd->transaction.xid_state.is_explicit_XA())
{
- int ec= thd->get_stmt_da()->sql_errno();
- /*
- Do not report an error if this is really a kill due to a deadlock.
- In this case, the transaction will be re-tried instead.
- */
- if (!is_parallel_retry_error(rgi, ec))
- rli->report(ERROR_LEVEL, ER_CANNOT_UPDATE_GTID_STATE, rgi->gtid_info(),
- "Error during XID COMMIT: failed to update GTID state in "
- "%s.%s: %d: %s",
- "mysql", rpl_gtid_slave_state_table_name.str, ec,
- thd->get_stmt_da()->message());
- thd->is_slave_error= 1;
- return err;
+ if ((err= do_record_gtid(thd, rgi, true /* in_trans */, &hton)))
+ return err;
+
+ DBUG_EXECUTE_IF("gtid_fail_after_record_gtid",
+ {
+ my_error(ER_ERROR_DURING_COMMIT, MYF(0),
+ HA_ERR_WRONG_COMMAND);
+ thd->is_slave_error= 1;
+ return 1;
+ });
}
-
- DBUG_EXECUTE_IF("gtid_fail_after_record_gtid",
- { my_error(ER_ERROR_DURING_COMMIT, MYF(0), HA_ERR_WRONG_COMMAND);
- thd->is_slave_error= 1;
- return 1;
- });
}
- /* For a slave Xid_log_event is COMMIT */
- general_log_print(thd, COM_QUERY,
- "COMMIT /* implicit, from Xid_log_event */");
+ general_log_print(thd, COM_QUERY, get_query());
thd->variables.option_bits&= ~OPTION_GTID_BEGIN;
- res= trans_commit(thd); /* Automatically rolls back on error. */
- thd->mdl_context.release_transactional_locks();
+ res= do_commit();
+ if (!res && rgi->gtid_pending)
+ {
+ DBUG_ASSERT(!thd->transaction.xid_state.is_explicit_XA());
+ if ((err= do_record_gtid(thd, rgi, false, &hton)))
+ return err;
+ }
if (likely(!res) && sub_id)
rpl_global_gtid_slave_state->update_state_hash(sub_id, >id, hton, rgi);
/*
Increment the global status commit count variable
*/
- status_var_increment(thd->status_var.com_stat[SQLCOM_COMMIT]);
+ enum enum_sql_command cmd= !thd->transaction.xid_state.is_explicit_XA() ?
+ SQLCOM_COMMIT : SQLCOM_XA_PREPARE;
+ status_var_increment(thd->status_var.com_stat[cmd]);
return res;
}
Log_event::enum_skip_reason
-Xid_log_event::do_shall_skip(rpl_group_info *rgi)
+Xid_apply_log_event::do_shall_skip(rpl_group_info *rgi)
{
- DBUG_ENTER("Xid_log_event::do_shall_skip");
+ DBUG_ENTER("Xid_apply_log_event::do_shall_skip");
if (rgi->rli->slave_skip_counter > 0)
{
DBUG_ASSERT(!rgi->rli->get_flag(Relay_log_info::IN_TRANSACTION));
@@ -3898,9 +3964,108 @@ Xid_log_event::do_shall_skip(rpl_group_info *rgi)
#endif
DBUG_RETURN(Log_event::do_shall_skip(rgi));
}
+#endif /* HAVE_REPLICATION */
+
+/**************************************************************************
+ Xid_log_event methods
+**************************************************************************/
+
+#if defined(HAVE_REPLICATION)
+void Xid_log_event::pack_info(Protocol *protocol)
+{
+ char buf[128], *pos;
+ pos= strmov(buf, "COMMIT /* xid=");
+ pos= longlong10_to_str(xid, pos, 10);
+ pos= strmov(pos, " */");
+ protocol->store(buf, (uint) (pos-buf), &my_charset_bin);
+}
+
+
+int Xid_log_event::do_commit()
+{
+ bool res;
+ res= trans_commit(thd); /* Automatically rolls back on error. */
+ thd->mdl_context.release_transactional_locks();
+ return res;
+}
+#endif
+
+
+bool Xid_log_event::write()
+{
+ DBUG_EXECUTE_IF("do_not_write_xid", return 0;);
+ return write_header(sizeof(xid)) ||
+ write_data((uchar*)&xid, sizeof(xid)) ||
+ write_footer();
+}
+
+/**************************************************************************
+ XA_prepare_log_event methods
+**************************************************************************/
+
+#if defined(HAVE_REPLICATION)
+void XA_prepare_log_event::pack_info(Protocol *protocol)
+{
+ char query[sizeof("XA COMMIT ONE PHASE") + 1 + ser_buf_size];
+
+ sprintf(query,
+ (one_phase ? "XA COMMIT %s ONE PHASE" : "XA PREPARE %s"),
+ m_xid.serialize());
+
+ protocol->store(query, strlen(query), &my_charset_bin);
+}
+
+
+int XA_prepare_log_event::do_commit()
+{
+ int res;
+ xid_t xid;
+ xid.set(m_xid.formatID,
+ m_xid.data, m_xid.gtrid_length,
+ m_xid.data + m_xid.gtrid_length, m_xid.bqual_length);
+
+ thd->lex->xid= &xid;
+ if (!one_phase)
+ {
+ if ((res= thd->wait_for_prior_commit()))
+ return res;
+
+ thd->lex->sql_command= SQLCOM_XA_PREPARE;
+ res= trans_xa_prepare(thd);
+ }
+ else
+ {
+ res= trans_xa_commit(thd);
+ thd->mdl_context.release_transactional_locks();
+ }
+
+ return res;
+}
#endif // HAVE_REPLICATION
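+
+/*
+  The event body written below (the XA_PREPARE post-header is empty) is a
+  1-byte one_phase flag, a 4-byte formatID, a 4-byte gtrid_length and a
+  4-byte bqual_length, followed by the gtrid and bqual data.
+*/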
+bool XA_prepare_log_event::write()
+{
+ uchar data[1 + 4 + 4 + 4]= {one_phase,};
+ uint8 one_phase_byte= one_phase;
+
+ int4store(data+1, static_cast<XID*>(xid)->formatID);
+ int4store(data+(1+4), static_cast<XID*>(xid)->gtrid_length);
+ int4store(data+(1+4+4), static_cast<XID*>(xid)->bqual_length);
+
+ DBUG_ASSERT(xid_subheader_no_data == sizeof(data) - 1);
+
+ return write_header(sizeof(one_phase_byte) + xid_subheader_no_data +
+ static_cast<XID*>(xid)->gtrid_length +
+ static_cast<XID*>(xid)->bqual_length) ||
+ write_data(data, sizeof(data)) ||
+ write_data((uchar*) static_cast<XID*>(xid)->data,
+ static_cast<XID*>(xid)->gtrid_length +
+ static_cast<XID*>(xid)->bqual_length) ||
+ write_footer();
+}
+
+
/**************************************************************************
User_var_log_event methods
**************************************************************************/
@@ -8303,7 +8468,6 @@ bool event_that_should_be_ignored(const char *buf)
event_type == PREVIOUS_GTIDS_LOG_EVENT ||
event_type == TRANSACTION_CONTEXT_EVENT ||
event_type == VIEW_CHANGE_EVENT ||
- event_type == XA_PREPARE_LOG_EVENT ||
(uint2korr(buf + FLAGS_OFFSET) & LOG_EVENT_IGNORABLE_F))
return 1;
return 0;
diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc
index 4313840119e..fc6475a170b 100644
--- a/sql/rpl_parallel.cc
+++ b/sql/rpl_parallel.cc
@@ -27,6 +27,38 @@ struct rpl_parallel_thread_pool global_rpl_thread_pool;
static void signal_error_to_sql_driver_thread(THD *thd, rpl_group_info *rgi,
int err);
+struct XID_cache_insert_element
+{
+ XID *xid;
+ uint32 worker_idx;
+
+ XID_cache_insert_element(XID *xid_arg, uint32 idx_arg):
+ xid(xid_arg), worker_idx(idx_arg) {}
+};
+
+class XID_cache_element_para
+{
+public:
+ XID xid;
+ uint32 worker_idx;
+  Atomic_counter<int32_t> cnt; // consecutive same-xid XA:s queued for execution
+ static void lf_hash_initializer(LF_HASH *hash __attribute__((unused)),
+ XID_cache_element_para *element,
+ XID_cache_insert_element *new_element)
+ {
+ element->xid.set(new_element->xid);
+ element->worker_idx= new_element->worker_idx;
+ element->cnt= 1;
+ }
+ static uchar *key(const XID_cache_element_para *element, size_t *length,
+ my_bool)
+ {
+ *length= element->xid.key_length();
+ return element->xid.key();
+ }
+};
+
+
static int
rpt_handle_event(rpl_parallel_thread::queued_event *qev,
struct rpl_parallel_thread *rpt)
@@ -271,6 +303,33 @@ finish_event_group(rpl_parallel_thread *rpt, uint64 sub_id,
*/
thd->get_stmt_da()->reset_diagnostics_area();
wfc->wakeup_subsequent_commits(rgi->worker_error);
+
+ if (!rgi->current_xid.is_null())
+ {
+ Relay_log_info *rli= rgi->rli;
+ LF_PINS *pins= rli->parallel.rpl_xid_pins[rgi->xid_pins_idx];
+
+ DBUG_ASSERT(rgi->xid_pins_idx > 0 &&
+ rgi->xid_pins_idx <= opt_slave_parallel_threads);
+
+ XID_cache_element_para* el=
+ rli->parallel.xid_cache_search(&rgi->current_xid, pins);
+
+ if (el)
+ {
+      lf_hash_search_unpin(pins); // safe to unpin now: no one but us can delete
+      if (el->cnt-- == 1)         // comparison against the old value
+ (void) rli->parallel.xid_cache_delete(el, pins);
+
+ DBUG_ASSERT(el->cnt >= 0);
+ }
+ else
+ {
+      // A missing record is benign when replication resumes after XA PREPARE.
+ sql_print_warning(ER_THD(rli->sql_driver_thd, ER_XAER_NOTA));
+ }
+ }
+ rgi->current_xid.null();
}
@@ -672,12 +731,14 @@ convert_kill_to_deadlock_error(rpl_group_info *rgi)
static int
is_group_ending(Log_event *ev, Log_event_type event_type)
{
- if (event_type == XID_EVENT)
+ if (event_type == XID_EVENT || event_type == XA_PREPARE_LOG_EVENT)
return 1;
if (event_type == QUERY_EVENT) // COMMIT/ROLLBACK are never compressed
{
Query_log_event *qev = (Query_log_event *)ev;
- if (qev->is_commit())
+ if (qev->is_commit() ||
+ !strncmp(qev->query, STRING_WITH_LEN("XA COMMIT")) ||
+ !strncmp(qev->query, STRING_WITH_LEN("XA ROLLBACK")))
return 1;
if (qev->is_rollback())
return 2;
@@ -1269,6 +1330,12 @@ handle_rpl_parallel_thread(void *arg)
slave_output_error_info(rgi, thd);
signal_error_to_sql_driver_thread(thd, rgi, 1);
}
+ if (static_cast<Gtid_log_event*>(qev->ev)->
+ flags2 & Gtid_log_event::FL_COMPLETED_XA)
+ {
+ rgi->current_xid.set(&static_cast<Gtid_log_event*>(qev->ev)->xid);
+ rgi->xid_pins_idx= static_cast<Gtid_log_event*>(qev->ev)->xid_pins_idx;
+ }
}
group_rgi= rgi;
@@ -2090,24 +2157,71 @@ rpl_parallel_thread_pool::release_thread(rpl_parallel_thread *rpt)
If the flag `reuse' is set, the last worker thread will be returned again,
if it is still available. Otherwise a new worker thread is allocated.
+
+  XA transactions flagged as PREPARED or COMPLETED are handled as follows.
+
+  For a PREPARED XA, a record consisting of the xid and the chosen worker's
+  index is inserted into a local rli parallel xid hash.
+  The record also contains a usage counter to account for "duplicate" xids,
+  which may arise naturally when the same xid is reused by multiple
+  transactions on the master. Each arriving PREPARED XA increments the usage
+  counter of the record keyed by its xid.
+
+  For a COMPLETED XA, its xid is looked up in the local hash to find the
+  worker that executed the prepare. If a record is found, its worker is
+  reused; when it is not found (which may be benign), a new worker is
+  allocated by the regular rule.
+
+  While the driver thread is responsible for inserting an xid record into
+  the local hash and possibly incrementing its usage counter, the worker
+  assigned to that xid decrements the counter when the XA completes and
+  deletes the record once the counter drops to zero.
+  The index of the pins associated with the worker is passed to it through
+  Gtid_log_event::xid_pins_idx of the passed event.
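+
+  For example (hypothetical xid 'x1'): when the prepare group of 'x1' is
+  assigned to some worker, say #2, the record {x1 -> 2, cnt= 1} is inserted
+  into the hash; when the completion group for 'x1' arrives, the lookup
+  returns worker #2 and the XA COMMIT/ROLLBACK is scheduled on that same
+  worker, which decrements cnt and deletes the record once it reaches zero.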
*/
rpl_parallel_thread *
rpl_parallel_entry::choose_thread(rpl_group_info *rgi, bool *did_enter_cond,
- PSI_stage_info *old_stage, bool reuse)
+ PSI_stage_info *old_stage,
+ Gtid_log_event *gtid_ev)
{
uint32 idx;
Relay_log_info *rli= rgi->rli;
rpl_parallel_thread *thr;
+ bool reuse= gtid_ev == NULL;
idx= rpl_thread_idx;
+
if (!reuse)
{
+ if (gtid_ev->flags2 &
+ (Gtid_log_event::FL_COMPLETED_XA | Gtid_log_event::FL_PREPARED_XA))
+ {
+ LF_PINS *pins= rli->parallel.rpl_xid_pins[0];
+ XID_cache_element_para* el= rli->parallel.xid_cache_search(>id_ev->xid,
+ pins);
+
+ if (el)
+ {
+ idx= el->worker_idx;
+ lf_hash_search_unpin(pins);
+ goto idx_assigned;
+ }
+ else
+ {
+        // Further execution will determine whether this is really an error:
+        // an XA completion event may arrive without the corresponding
+        // prepare having been applied by this slave.
+ if (gtid_ev->flags2 & Gtid_log_event::FL_COMPLETED_XA)
+ sql_print_warning(ER_THD(rli->sql_driver_thd, ER_XAER_NOTA));
+ }
+ }
++idx;
if (idx >= rpl_thread_max)
idx= 0;
+
+idx_assigned:
rpl_thread_idx= idx;
}
thr= rpl_threads[idx];
+
if (thr)
{
*did_enter_cond= false;
@@ -2177,6 +2291,14 @@ rpl_parallel_entry::choose_thread(rpl_group_info *rgi, bool *did_enter_cond,
rpl_threads[idx]= thr= global_rpl_thread_pool.get_thread(&rpl_threads[idx],
this);
+ if (thr && gtid_ev)
+ {
+ if (gtid_ev->flags2 & Gtid_log_event::FL_PREPARED_XA)
+ (void) rli->parallel.xid_cache_replace(>id_ev->xid, idx);
+ else if (gtid_ev->flags2 & Gtid_log_event::FL_COMPLETED_XA)
+ gtid_ev->xid_pins_idx= idx + 1; // pass pins index to the assigned worker
+ }
+
return thr;
}
@@ -2205,12 +2327,21 @@ rpl_parallel::rpl_parallel() :
}
-void
-rpl_parallel::reset()
+bool
+rpl_parallel::reset(bool is_parallel)
{
my_hash_reset(&domain_hash);
current= NULL;
sql_thread_stopping= false;
+ if (is_parallel)
+ {
+ xid_cache_init();
+ rpl_xid_pins= new LF_PINS*[opt_slave_parallel_threads + 1];
+ for (ulong i= 0; i <= opt_slave_parallel_threads; i++)
+ if (!(rpl_xid_pins[i]= lf_hash_get_pins(&xid_cache_para)))
+ return true;
+ }
+ return false;
}
@@ -2473,6 +2604,76 @@ rpl_parallel::wait_for_workers_idle(THD *thd)
}
+void rpl_parallel::xid_cache_init()
+{
+ lf_hash_init(&xid_cache_para, sizeof(XID_cache_element_para),
+ LF_HASH_UNIQUE, 0, 0,
+ (my_hash_get_key) XID_cache_element_para::key, &my_charset_bin);
+ xid_cache_para.alloc.constructor= NULL;
+ xid_cache_para.alloc.destructor= NULL;
+ xid_cache_para.initializer=
+ (lf_hash_initializer) XID_cache_element_para::lf_hash_initializer;
+}
+
+
+void rpl_parallel::xid_cache_free()
+{
+ lf_hash_destroy(&xid_cache_para);
+}
+
+
+XID_cache_element_para* rpl_parallel::xid_cache_search(XID *xid, LF_PINS* pins)
+{
+ return (XID_cache_element_para*) lf_hash_search(&xid_cache_para, pins,
+ xid->key(),
+ xid->key_length());
+}
+
+/**
+  Insert the first xid-keyed record, or "replace" an existing one by
+  incrementing its usage counter.
+*/
+void rpl_parallel::xid_cache_replace(XID *xid, uint32 idx)
+{
+ LF_PINS *pins= rpl_xid_pins[idx + 1];
+ XID_cache_element_para* el= xid_cache_search(xid, pins);
+
+ if (el)
+ {
+ if (unlikely(el->cnt++ == 0))
+ {
+ lf_hash_search_unpin(pins);
+    while (xid_cache_search(xid, pins)) // record is being deleted; wait for it
+ (void) LF_BACKOFF();
+ lf_hash_search_unpin(pins);
+ (void) xid_cache_insert(xid, idx);
+ }
+
+ DBUG_ASSERT(el->cnt > 0);
+ }
+ else
+ {
+ (void) xid_cache_insert(xid, idx);
+ }
+ lf_hash_search_unpin(pins);
+}
+
+bool rpl_parallel::xid_cache_insert(XID *xid, uint32 idx)
+{
+ LF_PINS *pins= rpl_xid_pins[idx + 1];
+ XID_cache_insert_element new_element(xid, idx);
+
+ return lf_hash_insert(&xid_cache_para, pins, &new_element);
+}
+
+
+bool rpl_parallel::xid_cache_delete(XID_cache_element_para* el, LF_PINS *pins)
+{
+ return lf_hash_delete(&xid_cache_para, pins,
+ el->xid.key(), el->xid.key_length());
+}
+
+
/*
Handle seeing a GTID during slave restart in GTID mode. If we stopped with
different replication domains having reached different positions in the relay
@@ -2662,7 +2863,7 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
else
{
DBUG_ASSERT(rli->gtid_skip_flag == GTID_SKIP_TRANSACTION);
- if (typ == XID_EVENT ||
+ if (typ == XID_EVENT || typ == XA_PREPARE_LOG_EVENT ||
(typ == QUERY_EVENT && // COMMIT/ROLLBACK are never compressed
(((Query_log_event *)ev)->is_commit() ||
((Query_log_event *)ev)->is_rollback())))
@@ -2673,10 +2874,11 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
}
}
+ Gtid_log_event *gtid_ev= NULL;
if (typ == GTID_EVENT)
{
rpl_gtid gtid;
- Gtid_log_event *gtid_ev= static_cast<Gtid_log_event *>(ev);
+ gtid_ev= static_cast<Gtid_log_event *>(ev);
uint32 domain_id= (rli->mi->using_gtid == Master_info::USE_GTID_NO ||
rli->mi->parallel_mode <= SLAVE_PARALLEL_MINIMAL ?
0 : gtid_ev->domain_id);
@@ -2715,8 +2917,7 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
instead re-use a thread that we queued for previously.
*/
cur_thread=
- e->choose_thread(serial_rgi, &did_enter_cond, &old_stage,
- typ != GTID_EVENT);
+ e->choose_thread(serial_rgi, &did_enter_cond, &old_stage, gtid_ev);
if (!cur_thread)
{
/* This means we were killed. The error is already signalled. */
@@ -2734,7 +2935,6 @@ rpl_parallel::do_event(rpl_group_info *serial_rgi, Log_event *ev,
if (typ == GTID_EVENT)
{
- Gtid_log_event *gtid_ev= static_cast<Gtid_log_event *>(ev);
bool new_gco;
enum_slave_parallel_mode mode= rli->mi->parallel_mode;
uchar gtid_flags= gtid_ev->flags2;
diff --git a/sql/rpl_parallel.h b/sql/rpl_parallel.h
index 4579d0da9bc..b27bc63255e 100644
--- a/sql/rpl_parallel.h
+++ b/sql/rpl_parallel.h
@@ -2,7 +2,7 @@
#define RPL_PARALLEL_H
#include "log_event.h"
-
+#include "lf.h"
struct rpl_parallel;
struct rpl_parallel_entry;
@@ -345,10 +345,14 @@ struct rpl_parallel_entry {
group_commit_orderer *current_gco;
rpl_parallel_thread * choose_thread(rpl_group_info *rgi, bool *did_enter_cond,
- PSI_stage_info *old_stage, bool reuse);
+ PSI_stage_info *old_stage,
+ Gtid_log_event *gtid_ev);
int queue_master_restart(rpl_group_info *rgi,
Format_description_log_event *fdev);
};
+
+class XID_cache_element_para;
+
struct rpl_parallel {
HASH domain_hash;
rpl_parallel_entry *current;
@@ -356,13 +360,31 @@ struct rpl_parallel {
rpl_parallel();
~rpl_parallel();
- void reset();
+ bool reset(bool is_parallel);
rpl_parallel_entry *find(uint32 domain_id);
void wait_for_done(THD *thd, Relay_log_info *rli);
void stop_during_until();
bool workers_idle();
int wait_for_workers_idle(THD *thd);
int do_event(rpl_group_info *serial_rgi, Log_event *ev, ulonglong event_size);
+ void leave(THD *thd, Relay_log_info *rli)
+ {
+ wait_for_done(thd, rli);
+ for (ulong i= 0; i <= opt_slave_parallel_threads; i++)
+ lf_hash_put_pins(rpl_xid_pins[i]);
+ delete[] rpl_xid_pins;
+ xid_cache_free();
+ };
+ // XA related. API follows xa.h naming.
+ LF_HASH xid_cache_para;
+ LF_PINS **rpl_xid_pins;
+
+ void xid_cache_init();
+ void xid_cache_free();
+ bool xid_cache_insert(XID *xid, uint32 idx);
+ void xid_cache_replace(XID *xid, uint32 idx);
+ bool xid_cache_delete(XID_cache_element_para *el, LF_PINS *pins);
+ XID_cache_element_para *xid_cache_search(XID *xid, LF_PINS *pins);
};
diff --git a/sql/rpl_rli.cc b/sql/rpl_rli.cc
index 6d55b06b497..ab035afb276 100644
--- a/sql/rpl_rli.cc
+++ b/sql/rpl_rli.cc
@@ -35,7 +35,7 @@
#include "sql_table.h"
static int count_relay_log_space(Relay_log_info* rli);
-
+bool xa_trans_force_rollback(THD *thd);
/**
Current replication state (hash of last GTID executed, per replication
domain).
@@ -2103,13 +2103,15 @@ rpl_group_info::reinit(Relay_log_info *rli)
rpl_group_info::rpl_group_info(Relay_log_info *rli)
: thd(0), wait_commit_sub_id(0),
wait_commit_group_info(0), parallel_entry(0),
- deferred_events(NULL), m_annotate_event(0), is_parallel_exec(false)
+ deferred_events(NULL), m_annotate_event(0), is_parallel_exec(false),
+ xid_pins_idx(0)
{
reinit(rli);
bzero(¤t_gtid, sizeof(current_gtid));
mysql_mutex_init(key_rpl_group_info_sleep_lock, &sleep_lock,
MY_MUTEX_INIT_FAST);
mysql_cond_init(key_rpl_group_info_sleep_cond, &sleep_cond, NULL);
+ current_xid.null();
}
@@ -2230,6 +2232,14 @@ void rpl_group_info::cleanup_context(THD *thd, bool error)
if (unlikely(error))
{
+    /* Todo/fixme: does this still not hold? Sort out and optimize if it
+      does not.
+      trans_rollback() above does not roll back XA transactions.
+      It can be done only after the tables have been closed, which dictates
+      the placement below.
+ */
+ if (thd->transaction.xid_state.is_explicit_XA())
+ xa_trans_force_rollback(thd);
+
thd->mdl_context.release_transactional_locks();
if (thd == rli->sql_driver_thd)
diff --git a/sql/rpl_rli.h b/sql/rpl_rli.h
index 0e2e42fcb08..0b7d7f6a377 100644
--- a/sql/rpl_rli.h
+++ b/sql/rpl_rli.h
@@ -808,6 +808,10 @@ struct rpl_group_info
};
uchar killed_for_retry;
+  /* Stores the xid of the XA transaction being completed */
+  XID current_xid;
+  uint32 xid_pins_idx; /* index of xid pins used by the worker completing the XA */
+
rpl_group_info(Relay_log_info *rli_);
~rpl_group_info();
void reinit(Relay_log_info *rli);
diff --git a/sql/slave.cc b/sql/slave.cc
index 87c1cf6cb77..4b3ad57fe41 100644
--- a/sql/slave.cc
+++ b/sql/slave.cc
@@ -4230,7 +4230,7 @@ inline void update_state_of_relay_log(Relay_log_info *rli, Log_event *ev)
rli->clear_flag(Relay_log_info::IN_TRANSACTION);
}
}
- if (typ == XID_EVENT)
+ if (typ == XID_EVENT || typ == XA_PREPARE_LOG_EVENT)
rli->clear_flag(Relay_log_info::IN_TRANSACTION);
if (typ == GTID_EVENT &&
!(((Gtid_log_event*) ev)->flags2 & Gtid_log_event::FL_STANDALONE))
@@ -5434,7 +5434,6 @@ pthread_handler_t handle_slave_sql(void *arg)
But the master timestamp is reset by RESET SLAVE & CHANGE MASTER.
*/
rli->clear_error();
- rli->parallel.reset();
//tell the I/O thread to take relay_log_space_limit into account from now on
rli->ignore_log_space_limit= 0;
@@ -5601,7 +5600,9 @@ pthread_handler_t handle_slave_sql(void *arg)
}
#endif /* WITH_WSREP */
/* Read queries from the IO/THREAD until this thread is killed */
-
+ if (rli->parallel.reset(mi->using_parallel()))
+ rli->report(ERROR_LEVEL, ER_SLAVE_FATAL_ERROR, NULL,
+ "Error initializing parallel mode");
thd->set_command(COM_SLAVE_SQL);
while (!sql_slave_killed(serial_rgi))
{
@@ -5663,7 +5664,7 @@ pthread_handler_t handle_slave_sql(void *arg)
err:
if (mi->using_parallel())
- rli->parallel.wait_for_done(thd, rli);
+ rli->parallel.leave(thd, rli);
/* Thread stopped. Print the current replication position to the log */
{
@@ -7095,6 +7096,7 @@ static int queue_event(Master_info* mi,const char* buf, ulong event_len)
buf[EVENT_TYPE_OFFSET])) ||
(!mi->last_queued_gtid_standalone &&
((uchar)buf[EVENT_TYPE_OFFSET] == XID_EVENT ||
+ (uchar)buf[EVENT_TYPE_OFFSET] == XA_PREPARE_LOG_EVENT ||
((uchar)buf[EVENT_TYPE_OFFSET] == QUERY_EVENT && /* QUERY_COMPRESSED_EVENT would never be commmit or rollback */
Query_log_event::peek_is_commit_rollback(buf, event_len,
checksum_alg))))))
diff --git a/sql/sql_repl.cc b/sql/sql_repl.cc
index 5bfa29b72c4..fd7fa89f227 100644
--- a/sql/sql_repl.cc
+++ b/sql/sql_repl.cc
@@ -1651,7 +1651,7 @@ is_until_reached(binlog_send_info *info, ulong *ev_offset,
return false;
break;
case GTID_UNTIL_STOP_AFTER_TRANSACTION:
- if (event_type != XID_EVENT &&
+ if (event_type != XID_EVENT && event_type != XA_PREPARE_LOG_EVENT &&
(event_type != QUERY_EVENT || /* QUERY_COMPRESSED_EVENT would never be commmit or rollback */
!Query_log_event::peek_is_commit_rollback
(info->packet->ptr()+*ev_offset,
@@ -1886,7 +1886,7 @@ send_event_to_slave(binlog_send_info *info, Log_event_type event_type,
info->gtid_skip_group= GTID_SKIP_NOT;
return NULL;
case GTID_SKIP_TRANSACTION:
- if (event_type == XID_EVENT ||
+ if (event_type == XID_EVENT || event_type == XA_PREPARE_LOG_EVENT ||
(event_type == QUERY_EVENT && /* QUERY_COMPRESSED_EVENT would never be commmit or rollback */
Query_log_event::peek_is_commit_rollback(packet->ptr() + ev_offset,
len - ev_offset,
diff --git a/sql/xa.cc b/sql/xa.cc
index e4cad40318e..da6c9c93157 100644
--- a/sql/xa.cc
+++ b/sql/xa.cc
@@ -20,13 +20,11 @@
#include "sql_class.h"
#include "transaction.h"
+static bool slave_applier_reset_xa_trans(THD *thd);
/***************************************************************************
Handling of XA id cacheing
***************************************************************************/
-enum xa_states { XA_ACTIVE= 0, XA_IDLE, XA_PREPARED, XA_ROLLBACK_ONLY };
-
-
struct XID_cache_insert_element
{
enum xa_states xa_state;
@@ -78,6 +76,7 @@ class XID_cache_element
uint rm_error;
enum xa_states xa_state;
XID xid;
+ bool binlogged;
bool is_set(int32_t flag)
{ return m_state.load(std::memory_order_relaxed) & flag; }
void set(int32_t flag)
@@ -131,6 +130,7 @@ class XID_cache_element
{
DBUG_ASSERT(!element->is_set(ACQUIRED | RECOVERED));
element->rm_error= 0;
+ element->binlogged= false;
element->xa_state= new_element->xa_state;
element->xid.set(new_element->xid);
new_element->xid_cache_element= element;
@@ -158,6 +158,29 @@ static LF_HASH xid_cache;
static bool xid_cache_inited;
+bool XID_STATE::is_binlogged()
+{
+ return is_explicit_XA() && xid_cache_element->binlogged;
+}
+
+
+void XID_STATE::set_binlogged()
+{
+ if (xid_cache_element)
+ xid_cache_element->binlogged= true;
+}
+
+
+void XID_STATE::unset_binlogged()
+{
+ if (xid_cache_element)
+ xid_cache_element->binlogged= false;
+}
+
+
+enum xa_states XID_STATE::get_state_code() { return xid_cache_element->xa_state; }
+
+
bool THD::fix_xid_hash_pins()
{
if (!xid_hash_pins)
@@ -267,6 +290,7 @@ bool xid_cache_insert(XID *xid)
{
case 0:
new_element.xid_cache_element->set(XID_cache_element::RECOVERED);
+ new_element.xid_cache_element->binlogged= true;
break;
case 1:
res= 0;
@@ -308,7 +332,11 @@ static void xid_cache_delete(THD *thd, XID_cache_element *&element)
void xid_cache_delete(THD *thd, XID_STATE *xid_state)
{
- DBUG_ASSERT(xid_state->is_explicit_XA());
+ DBUG_ASSERT(xid_state->is_explicit_XA() || thd->lex->xa_opt == XA_ONE_PHASE);
+
+ if (!xid_state->is_explicit_XA())
+ return;
+
xid_cache_delete(thd, xid_state->xid_cache_element);
xid_state->xid_cache_element= 0;
}
@@ -380,7 +408,7 @@ static bool xa_trans_rolled_back(XID_cache_element *element)
@return TRUE if the rollback failed, FALSE otherwise.
*/
-static bool xa_trans_force_rollback(THD *thd)
+bool xa_trans_force_rollback(THD *thd)
{
bool rc= false;
@@ -389,8 +417,8 @@ static bool xa_trans_force_rollback(THD *thd)
my_error(ER_XAER_RMERR, MYF(0));
rc= true;
}
-
- thd->variables.option_bits&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
+ thd->variables.option_bits&=
+ ~(OPTION_BEGIN | OPTION_KEEP_LOG | OPTION_GTID_BEGIN);
thd->transaction.all.reset();
thd->server_status&=
~(SERVER_STATUS_IN_TRANS | SERVER_STATUS_IN_TRANS_READONLY);
@@ -492,6 +520,8 @@ bool trans_xa_end(THD *thd)
bool trans_xa_prepare(THD *thd)
{
+ int res= 1;
+
DBUG_ENTER("trans_xa_prepare");
if (!thd->transaction.xid_state.is_explicit_XA() ||
@@ -499,16 +529,40 @@ bool trans_xa_prepare(THD *thd)
thd->transaction.xid_state.er_xaer_rmfail();
else if (!thd->transaction.xid_state.xid_cache_element->xid.eq(thd->lex->xid))
my_error(ER_XAER_NOTA, MYF(0));
- else if (ha_prepare(thd))
+ else
{
- xid_cache_delete(thd, &thd->transaction.xid_state);
- my_error(ER_XA_RBROLLBACK, MYF(0));
+ /*
+ Acquire metadata lock which will ensure that COMMIT is blocked
+ by active FLUSH TABLES WITH READ LOCK (and vice versa COMMIT in
+ progress blocks FTWRL).
+
+ We allow FLUSHer to COMMIT; we assume FLUSHer knows what it does.
+ */
+ MDL_request mdl_request;
+ mdl_request.init(MDL_key::BACKUP, "", "", MDL_BACKUP_COMMIT,
+ MDL_STATEMENT);
+ if (thd->mdl_context.acquire_lock(&mdl_request,
+ thd->variables.lock_wait_timeout) ||
+ ha_prepare(thd))
+ {
+ if (!mdl_request.ticket)
+ ha_rollback_trans(thd, TRUE);
+ thd->variables.option_bits&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
+ thd->transaction.all.reset();
+ thd->server_status&=
+ ~(SERVER_STATUS_IN_TRANS | SERVER_STATUS_IN_TRANS_READONLY);
+ xid_cache_delete(thd, &thd->transaction.xid_state);
+ my_error(ER_XA_RBROLLBACK, MYF(0));
+ }
+ else
+ {
+ thd->transaction.xid_state.xid_cache_element->xa_state= XA_PREPARED;
+ res= thd->variables.pseudo_slave_mode || thd->slave_thread ?
+ slave_applier_reset_xa_trans(thd) : 0;
+ }
}
- else
- thd->transaction.xid_state.xid_cache_element->xa_state= XA_PREPARED;
- DBUG_RETURN(thd->is_error() ||
- thd->transaction.xid_state.xid_cache_element->xa_state != XA_PREPARED);
+ DBUG_RETURN(res);
}
@@ -523,11 +577,13 @@ bool trans_xa_prepare(THD *thd)
bool trans_xa_commit(THD *thd)
{
- bool res= TRUE;
+ bool res= true;
+ XID_STATE &xid_state= thd->transaction.xid_state;
+
DBUG_ENTER("trans_xa_commit");
- if (!thd->transaction.xid_state.is_explicit_XA() ||
- !thd->transaction.xid_state.xid_cache_element->xid.eq(thd->lex->xid))
+ if (!xid_state.is_explicit_XA() ||
+ !xid_state.xid_cache_element->xid.eq(thd->lex->xid))
{
if (thd->in_multi_stmt_transaction_mode() || thd->lex->xa_opt != XA_NONE)
{
@@ -543,7 +599,45 @@ bool trans_xa_commit(THD *thd)
if (auto xs= xid_cache_search(thd, thd->lex->xid))
{
res= xa_trans_rolled_back(xs);
+ /*
+ Acquire metadata lock which will ensure that COMMIT is blocked
+ by active FLUSH TABLES WITH READ LOCK (and vice versa COMMIT in
+ progress blocks FTWRL).
+
+ We allow FLUSHer to COMMIT; we assume FLUSHer knows what it does.
+ */
+ MDL_request mdl_request;
+ mdl_request.init(MDL_key::BACKUP, "", "", MDL_BACKUP_COMMIT,
+ MDL_STATEMENT);
+ if (thd->mdl_context.acquire_lock(&mdl_request,
+ thd->variables.lock_wait_timeout))
+ {
+ /*
+      We can't roll back an XA transaction on lock failure, because the
+      InnoDB redo log and the binlog are both involved in the rollback.
+      Return an error to the user for a retry.
+ */
+ DBUG_ASSERT(thd->is_error());
+
+ xs->acquired_to_recovered();
+ DBUG_RETURN(true);
+ }
+ DBUG_ASSERT(!xid_state.xid_cache_element);
+
+ DEBUG_SYNC(thd, "at_trans_xa_commit");
+ if (thd->wait_for_prior_commit())
+ {
+ DBUG_ASSERT(thd->is_error());
+
+ xs->acquired_to_recovered();
+ DBUG_RETURN(true);
+ }
+
+ xid_state.xid_cache_element= xs;
ha_commit_or_rollback_by_xid(thd->lex->xid, !res);
+ xid_state.xid_cache_element= 0;
+
+ res= res || thd->is_error();
xid_cache_delete(thd, xs);
}
else
@@ -551,19 +645,20 @@ bool trans_xa_commit(THD *thd)
DBUG_RETURN(res);
}
- if (xa_trans_rolled_back(thd->transaction.xid_state.xid_cache_element))
+ if (xa_trans_rolled_back(xid_state.xid_cache_element))
{
xa_trans_force_rollback(thd);
DBUG_RETURN(thd->is_error());
}
- else if (thd->transaction.xid_state.xid_cache_element->xa_state == XA_IDLE &&
+ else if (xid_state.xid_cache_element->xa_state == XA_IDLE &&
thd->lex->xa_opt == XA_ONE_PHASE)
{
+ xid_cache_delete(thd, &xid_state);
int r= ha_commit_trans(thd, TRUE);
if ((res= MY_TEST(r)))
my_error(r == 1 ? ER_XA_RBROLLBACK : ER_XAER_RMERR, MYF(0));
}
- else if (thd->transaction.xid_state.xid_cache_element->xa_state == XA_PREPARED &&
+ else if (xid_state.xid_cache_element->xa_state == XA_PREPARED &&
thd->lex->xa_opt == XA_NONE)
{
MDL_request mdl_request;
@@ -576,26 +671,30 @@ bool trans_xa_commit(THD *thd)
We allow FLUSHer to COMMIT; we assume FLUSHer knows what it does.
*/
mdl_request.init(MDL_key::BACKUP, "", "", MDL_BACKUP_COMMIT,
- MDL_TRANSACTION);
+ MDL_STATEMENT);
if (thd->mdl_context.acquire_lock(&mdl_request,
thd->variables.lock_wait_timeout))
{
- ha_rollback_trans(thd, TRUE);
+ /*
+        We can't roll back an XA transaction on lock failure, because the
+        InnoDB redo log and the binlog are both involved in the rollback.
+        Return an error to the user for a retry.
+ */
my_error(ER_XAER_RMERR, MYF(0));
+ DBUG_RETURN(true);
}
else
{
DEBUG_SYNC(thd, "trans_xa_commit_after_acquire_commit_lock");
- res= MY_TEST(ha_commit_one_phase(thd, 1));
- if (res)
+ if ((res= MY_TEST(ha_commit_one_phase(thd, 1))))
my_error(ER_XAER_RMERR, MYF(0));
}
}
else
{
- thd->transaction.xid_state.er_xaer_rmfail();
+ xid_state.er_xaer_rmfail();
DBUG_RETURN(TRUE);
}
@@ -604,7 +703,7 @@ bool trans_xa_commit(THD *thd)
thd->server_status&=
~(SERVER_STATUS_IN_TRANS | SERVER_STATUS_IN_TRANS_READONLY);
DBUG_PRINT("info", ("clearing SERVER_STATUS_IN_TRANS"));
- xid_cache_delete(thd, &thd->transaction.xid_state);
+ xid_cache_delete(thd, &xid_state);
trans_track_end_trx(thd);
@@ -623,10 +722,13 @@ bool trans_xa_commit(THD *thd)
bool trans_xa_rollback(THD *thd)
{
+ bool res= false;
+ XID_STATE &xid_state= thd->transaction.xid_state;
+
DBUG_ENTER("trans_xa_rollback");
- if (!thd->transaction.xid_state.is_explicit_XA() ||
- !thd->transaction.xid_state.xid_cache_element->xid.eq(thd->lex->xid))
+ if (!xid_state.is_explicit_XA() ||
+ !xid_state.xid_cache_element->xid.eq(thd->lex->xid))
{
if (thd->in_multi_stmt_transaction_mode())
{
@@ -641,8 +743,36 @@ bool trans_xa_rollback(THD *thd)
if (auto xs= xid_cache_search(thd, thd->lex->xid))
{
+ MDL_request mdl_request;
+ mdl_request.init(MDL_key::BACKUP, "", "", MDL_BACKUP_COMMIT,
+ MDL_STATEMENT);
+ if (thd->mdl_context.acquire_lock(&mdl_request,
+ thd->variables.lock_wait_timeout))
+ {
+ /*
+      We can't roll back an XA transaction on lock failure, because the
+      InnoDB redo log and the binlog are both involved in the rollback.
+      Return an error to the user for a retry.
+ */
+ DBUG_ASSERT(thd->is_error());
+
+ xs->acquired_to_recovered();
+ DBUG_RETURN(true);
+ }
xa_trans_rolled_back(xs);
+ DBUG_ASSERT(!xid_state.xid_cache_element);
+
+ DEBUG_SYNC(thd, "at_trans_xa_rollback");
+ if (thd->wait_for_prior_commit())
+ {
+ DBUG_ASSERT(thd->is_error());
+ xs->acquired_to_recovered();
+ DBUG_RETURN(true);
+ }
+
+ xid_state.xid_cache_element= xs;
ha_commit_or_rollback_by_xid(thd->lex->xid, 0);
+ xid_state.xid_cache_element= 0;
xid_cache_delete(thd, xs);
}
else
@@ -650,21 +780,35 @@ bool trans_xa_rollback(THD *thd)
DBUG_RETURN(thd->get_stmt_da()->is_error());
}
- if (thd->transaction.xid_state.xid_cache_element->xa_state == XA_ACTIVE)
+ if (xid_state.xid_cache_element->xa_state == XA_ACTIVE)
{
- thd->transaction.xid_state.er_xaer_rmfail();
+ xid_state.er_xaer_rmfail();
DBUG_RETURN(TRUE);
}
- DBUG_RETURN(xa_trans_force_rollback(thd));
+
+ MDL_request mdl_request;
+ mdl_request.init(MDL_key::BACKUP, "", "", MDL_BACKUP_COMMIT,
+ MDL_STATEMENT);
+ if (thd->mdl_context.acquire_lock(&mdl_request,
+ thd->variables.lock_wait_timeout))
+ {
+ /*
+      We can't roll back an XA transaction on lock failure, because the
+      InnoDB redo log and the binlog are both involved in the rollback.
+      Return an error to the user for a retry.
+ */
+ my_error(ER_XAER_RMERR, MYF(0));
+ DBUG_RETURN(true);
+ }
+
+ DBUG_RETURN(res != 0 || xa_trans_force_rollback(thd));
}
bool trans_xa_detach(THD *thd)
{
DBUG_ASSERT(thd->transaction.xid_state.is_explicit_XA());
-#if 1
- return xa_trans_force_rollback(thd);
-#else
+
if (thd->transaction.xid_state.xid_cache_element->xa_state != XA_PREPARED)
return xa_trans_force_rollback(thd);
thd->transaction.xid_state.xid_cache_element->acquired_to_recovered();
@@ -683,7 +827,6 @@ bool trans_xa_detach(THD *thd)
thd->transaction.all.ha_list= 0;
thd->transaction.all.no_2pc= 0;
return false;
-#endif
}
@@ -877,3 +1020,44 @@ bool mysql_xa_recover(THD *thd)
my_eof(thd);
DBUG_RETURN(0);
}
+
+
+/**
+  A collection of standard cleanup actions, specific to the (pseudo-) slave
+  applier, that resets the XA transaction state similarly to
+  @c ha_commit_one_phase.
+  The THD of the slave applier is dissociated from the transaction object in
+  the engine, which continues to exist there.
+
+  @param thd current thread
+ @return the value of is_error()
+*/
+
+static bool slave_applier_reset_xa_trans(THD *thd)
+{
+ thd->variables.option_bits&= ~(OPTION_BEGIN | OPTION_KEEP_LOG);
+ thd->server_status&=
+ ~(SERVER_STATUS_IN_TRANS | SERVER_STATUS_IN_TRANS_READONLY);
+ DBUG_PRINT("info", ("clearing SERVER_STATUS_IN_TRANS"));
+
+ thd->transaction.xid_state.xid_cache_element->acquired_to_recovered();
+ thd->transaction.xid_state.xid_cache_element= 0;
+
+ for (Ha_trx_info *ha_info= thd->transaction.all.ha_list, *ha_info_next;
+ ha_info; ha_info= ha_info_next)
+ {
+ ha_info_next= ha_info->next();
+ ha_info->reset();
+ }
+ thd->transaction.all.ha_list= 0;
+
+ ha_close_connection(thd);
+ thd->transaction.cleanup();
+ thd->transaction.all.reset();
+
+ DBUG_ASSERT(!thd->transaction.all.ha_list);
+ DBUG_ASSERT(!thd->transaction.all.no_2pc);
+
+ thd->has_waiter= false;
+
+ return thd->is_error();
+}
diff --git a/sql/xa.h b/sql/xa.h
index 7cf74efad35..507d07f638f 100644
--- a/sql/xa.h
+++ b/sql/xa.h
@@ -1,3 +1,5 @@
+#ifndef XA_INCLUDED
+#define XA_INCLUDED
/*
Copyright (c) 2000, 2016, Oracle and/or its affiliates.
Copyright (c) 2009, 2019, MariaDB Corporation.
@@ -16,17 +18,31 @@
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
*/
-
class XID_cache_element;
+enum xa_states { XA_ACTIVE= 0, XA_IDLE, XA_PREPARED, XA_ROLLBACK_ONLY };
struct XID_STATE {
XID_cache_element *xid_cache_element;
- bool check_has_uncommitted_xa() const;
bool is_explicit_XA() const { return xid_cache_element != 0; }
+ /*
+ Binary logging status of explicit "user" XA.
+ It is set to TRUE at XA PREPARE if the transaction was written
+ to the binlog.
+    It may be FALSE after preparing when the transaction did not modify
+    any transactional tables or binlogging is turned off.
+    In that case a subsequent XA COMMIT/ROLLBACK should not be binlogged.
+
+    A transaction recovered after server restart always sets it to TRUE.
+ */
+ bool is_binlogged();
+ bool check_has_uncommitted_xa() const;
void set_error(uint error);
void er_xaer_rmfail() const;
XID *get_xid() const;
+ void set_binlogged();
+ void unset_binlogged();
+ enum xa_states get_state_code();
};
void xid_cache_init(void);
@@ -42,3 +58,5 @@ bool trans_xa_commit(THD *thd);
bool trans_xa_rollback(THD *thd);
bool trans_xa_detach(THD *thd);
bool mysql_xa_recover(THD *thd);
+
+#endif /* XA_INCLUDED */
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index a36cc87a201..8a173001ed5 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -4844,6 +4844,7 @@ innobase_close_connection(
if (trx->has_logged_persistent()) {
trx_disconnect_prepared(trx);
} else {
+ trx_rollback_for_mysql(trx);
trx_deregister_from_2pc(trx);
goto rollback_and_free;
}
diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
index c5e48ff65ab..1004198ea0b 100644
--- a/storage/innobase/trx/trx0trx.cc
+++ b/storage/innobase/trx/trx0trx.cc
@@ -549,8 +549,10 @@ void trx_disconnect_prepared(trx_t *trx)
trx->read_view.close();
trx->is_recovered= true;
trx->mysql_thd= NULL;
+ trx->mysql_log_file_name = 0;
/* todo/fixme: suggest to do it at innodb prepare */
trx->will_lock= 0;
+ trx_sys.rw_trx_hash.put_pins(trx);
}
/****************************************************************//**
@@ -1390,8 +1392,21 @@ trx_commit_in_memory(
trx->release_locks();
}
- DEBUG_SYNC_C("after_trx_committed_in_memory");
-
+#ifndef DBUG_OFF
+ const bool debug_sync = trx->mysql_thd &&
+ trx->has_logged_persistent();
+ /* In case this function is called from a stack executing
+ THD::free_connection -> ...
+ innobase_connection_close() ->
+ trx_rollback_for_mysql... -> .
+ mysql's thd does not seem to have
+ thd->debug_sync_control defined any longer. However the stack
+ is possible only with a prepared trx not updating any data.
+ */
+ if (debug_sync) {
+ DEBUG_SYNC_C("after_trx_committed_in_memory");
+ }
+#endif
if (trx->read_only || !trx->rsegs.m_redo.rseg) {
MONITOR_INC(MONITOR_TRX_RO_COMMIT);
} else {
diff --git a/storage/rocksdb/ha_rocksdb.cc b/storage/rocksdb/ha_rocksdb.cc
index 8488f9ee963..272d2800319 100644
--- a/storage/rocksdb/ha_rocksdb.cc
+++ b/storage/rocksdb/ha_rocksdb.cc
@@ -3121,6 +3121,8 @@ class Rdb_transaction {
s_tx_list.erase(this);
RDB_MUTEX_UNLOCK_CHECK(s_tx_list_mutex);
}
+ virtual bool is_prepared() { return false; };
+ virtual void detach_prepared_tx() {};
};
/*
@@ -3157,7 +3159,16 @@ class Rdb_transaction_impl : public Rdb_transaction {
virtual bool is_writebatch_trx() const override { return false; }
- private:
+ bool is_prepared() {
+ return m_rocksdb_tx && rocksdb::Transaction::PREPARED == m_rocksdb_tx->GetState();
+ }
+
+ void detach_prepared_tx() {
+ DBUG_ASSERT(rocksdb::Transaction::PREPARED == m_rocksdb_tx->GetState());
+ m_rocksdb_tx = nullptr;
+ }
+
+private:
void release_tx(void) {
// We are done with the current active transaction object. Preserve it
// for later reuse.
@@ -3798,7 +3809,8 @@ static int rocksdb_close_connection(handlerton *const hton, THD *const thd) {
"disconnecting",
rc);
}
-
+ if (tx->is_prepared())
+ tx->detach_prepared_tx();
delete tx;
}
return HA_EXIT_SUCCESS;
@@ -5301,7 +5313,7 @@ static int rocksdb_init_func(void *const p) {
#ifdef MARIAROCKS_NOT_YET
rocksdb_hton->update_table_stats = rocksdb_update_table_stats;
#endif // MARIAROCKS_NOT_YET
-
+
/*
Not needed in MariaDB:
rocksdb_hton->flush_logs = rocksdb_flush_wal;
diff --git a/storage/rocksdb/mysql-test/rocksdb/r/xa.result b/storage/rocksdb/mysql-test/rocksdb/r/xa.result
index 12ae2b474b6..8cb6f39bbac 100644
--- a/storage/rocksdb/mysql-test/rocksdb/r/xa.result
+++ b/storage/rocksdb/mysql-test/rocksdb/r/xa.result
@@ -1,6 +1,7 @@
-#
-# MDEV-13155: XA recovery not supported for RocksDB (Just a testcase)
#
+# MDEV-742 fixes
+# MDEV-13155: XA recovery not supported for RocksDB
+# as well.
call mtr.add_suppression("Found .* prepared XA transactions");
connect con1,localhost,root,,test;
DROP TABLE IF EXISTS t1;
@@ -15,19 +16,55 @@ INSERT INTO t1 (a) VALUES (3);
INSERT INTO t1 (a) VALUES (4);
XA END 'xa2';
XA PREPARE 'xa2';
+connect con3,localhost,root,,test;
+XA START 'xa3';
+INSERT INTO t1 (a) VALUES (5);
+INSERT INTO t1 (a) VALUES (6);
+XA END 'xa3';
+XA PREPARE 'xa3';
+disconnect con3;
connection default;
SELECT * FROM t1;
a
+Must be all three XA:s in
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 3 0 xa3
+1 3 0 xa1
+1 3 0 xa2
# restart
connect con3,localhost,root,,test;
XA RECOVER;
formatID gtrid_length bqual_length data
+1 3 0 xa3
1 3 0 xa1
1 3 0 xa2
XA ROLLBACK 'xa1';
XA COMMIT 'xa2';
+XA ROLLBACK 'xa3';
+SELECT a FROM t1;
+a
+3
+4
+connect con4,localhost,root,,test;
+XA START 'xa4';
+INSERT INTO t1 (a) VALUES (7);
+INSERT INTO t1 (a) VALUES (8);
+XA END 'xa4';
+XA PREPARE 'xa4';
+connection default;
+# Now restart through graceful shutdown
+# restart
+connect con5,localhost,root,,test;
+Must have 'xa4'
+XA RECOVER;
+formatID gtrid_length bqual_length data
+1 3 0 xa4
+XA COMMIT 'xa4';
SELECT a FROM t1;
a
3
4
+7
+8
DROP TABLE t1;
diff --git a/storage/rocksdb/mysql-test/rocksdb/t/xa.test b/storage/rocksdb/mysql-test/rocksdb/t/xa.test
index f8f381f0580..0c23e71df8c 100644
--- a/storage/rocksdb/mysql-test/rocksdb/t/xa.test
+++ b/storage/rocksdb/mysql-test/rocksdb/t/xa.test
@@ -1,6 +1,7 @@
---echo #
---echo # MDEV-13155: XA recovery not supported for RocksDB (Just a testcase)
--echo #
+--echo # MDEV-742 fixes
+--echo # MDEV-13155: XA recovery not supported for RocksDB
+--echo # as well.
call mtr.add_suppression("Found .* prepared XA transactions");
@@ -22,17 +23,51 @@ INSERT INTO t1 (a) VALUES (3);
INSERT INTO t1 (a) VALUES (4);
XA END 'xa2';
XA PREPARE 'xa2';
-
+
+--connect (con3,localhost,root,,test)
+XA START 'xa3';
+INSERT INTO t1 (a) VALUES (5);
+INSERT INTO t1 (a) VALUES (6);
+XA END 'xa3';
+XA PREPARE 'xa3';
+--disconnect con3
+
--connection default
SELECT * FROM t1;
+--echo Must be all three XA:s in
+XA RECOVER;
+
--let $shutdown_timeout= 0
--source include/restart_mysqld.inc
--connect (con3,localhost,root,,test)
--disable_abort_on_error
-XA RECOVER;
+XA RECOVER; # like above
XA ROLLBACK 'xa1';
XA COMMIT 'xa2';
+XA ROLLBACK 'xa3';
SELECT a FROM t1;
+
+--connect (con4,localhost,root,,test)
+XA START 'xa4';
+INSERT INTO t1 (a) VALUES (7);
+INSERT INTO t1 (a) VALUES (8);
+XA END 'xa4';
+XA PREPARE 'xa4';
+
+--connection default
+--echo # Now restart through graceful shutdown
+--source include/restart_mysqld.inc
+
+
+--connect (con5,localhost,root,,test)
+--disable_abort_on_error
+
+--echo Must have 'xa4'
+XA RECOVER;
+XA COMMIT 'xa4';
+
+SELECT a FROM t1;
+
DROP TABLE t1;
diff --git a/storage/rocksdb/mysql-test/rocksdb_rpl/r/rpl_xa.result b/storage/rocksdb/mysql-test/rocksdb_rpl/r/rpl_xa.result
new file mode 100644
index 00000000000..b4713c68390
--- /dev/null
+++ b/storage/rocksdb/mysql-test/rocksdb_rpl/r/rpl_xa.result
@@ -0,0 +1,50 @@
+include/master-slave.inc
+[connection master]
+connection master;
+create table t1 (a int, b int) engine=InnoDB;
+insert into t1 values(0, 0);
+xa start 't';
+insert into t1 values(1, 2);
+xa end 't';
+xa prepare 't';
+xa commit 't';
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+connection master;
+xa start 't';
+insert into t1 values(3, 4);
+xa end 't';
+xa prepare 't';
+xa rollback 't';
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+connection master;
+SET pseudo_slave_mode=1;
+create table t2 (a int) engine=InnoDB;
+xa start 't';
+insert into t1 values (5, 6);
+xa end 't';
+xa prepare 't';
+xa start 's';
+insert into t2 values (0);
+xa end 's';
+xa prepare 's';
+include/save_master_gtid.inc
+connection slave;
+include/sync_with_master_gtid.inc
+xa recover;
+formatID gtrid_length bqual_length data
+1 1 0 t
+1 1 0 s
+connection master;
+xa commit 't';
+xa commit 's';
+SET pseudo_slave_mode=0;
+Warnings:
+Warning 1231 Slave applier execution mode not active, statement ineffective.
+connection slave;
+include/diff_tables.inc [master:t1, slave:t1]
+include/diff_tables.inc [master:t2, slave:t2]
+connection master;
+drop table t1, t2;
+include/rpl_end.inc
diff --git a/storage/rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.inc b/storage/rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.inc
new file mode 100644
index 00000000000..c1300c1e27a
--- /dev/null
+++ b/storage/rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.inc
@@ -0,0 +1,70 @@
+#
+# This "body" file checks general properties of XA transaction replication
+# as of MDEV-7974.
+# Parameters:
+# --let rpl_xa_check= SELECT ...
+#
+connection master;
+create table t1 (a int, b int) engine=InnoDB;
+insert into t1 values(0, 0);
+xa start 't';
+insert into t1 values(1, 2);
+xa end 't';
+xa prepare 't';
+xa commit 't';
+
+sync_slave_with_master;
+let $diff_tables= master:t1, slave:t1;
+source include/diff_tables.inc;
+
+connection master;
+
+xa start 't';
+insert into t1 values(3, 4);
+xa end 't';
+xa prepare 't';
+xa rollback 't';
+
+sync_slave_with_master;
+let $diff_tables= master:t1, slave:t1;
+source include/diff_tables.inc;
+
+connection master;
+SET pseudo_slave_mode=1;
+create table t2 (a int) engine=InnoDB;
+xa start 't';
+insert into t1 values (5, 6);
+xa end 't';
+xa prepare 't';
+xa start 's';
+insert into t2 values (0);
+xa end 's';
+xa prepare 's';
+--source include/save_master_gtid.inc
+
+connection slave;
+source include/sync_with_master_gtid.inc;
+if ($rpl_xa_check)
+{
+ --eval $rpl_xa_check
+ if ($rpl_xa_verbose)
+ {
+ --eval SELECT $rpl_xa_check_lhs
+ --eval SELECT $rpl_xa_check_rhs
+ }
+}
+xa recover;
+
+connection master;
+xa commit 't';
+xa commit 's';
+SET pseudo_slave_mode=0;
+
+sync_slave_with_master;
+let $diff_tables= master:t1, slave:t1;
+source include/diff_tables.inc;
+let $diff_tables= master:t2, slave:t2;
+source include/diff_tables.inc;
+
+connection master;
+drop table t1, t2;
diff --git a/storage/rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.test b/storage/rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.test
new file mode 100644
index 00000000000..7d667aa96d2
--- /dev/null
+++ b/storage/rocksdb/mysql-test/rocksdb_rpl/t/rpl_xa.test
@@ -0,0 +1,6 @@
+source include/have_rocksdb.inc;
+source include/master-slave.inc;
+source include/have_binlog_format_row.inc;
+
+source rpl_xa.inc;
+source include/rpl_end.inc;
diff --git a/storage/tokudb/mysql-test/tokudb_mariadb/r/xa.result b/storage/tokudb/mysql-test/tokudb_mariadb/r/xa.result
index 4724a0af926..34233b6fd8d 100644
--- a/storage/tokudb/mysql-test/tokudb_mariadb/r/xa.result
+++ b/storage/tokudb/mysql-test/tokudb_mariadb/r/xa.result
@@ -65,4 +65,5 @@ a
20
disconnect con1;
connection default;
+xa rollback 'testb',0x2030405060,11;
drop table t1;
diff --git a/storage/tokudb/mysql-test/tokudb_mariadb/t/xa.test b/storage/tokudb/mysql-test/tokudb_mariadb/t/xa.test
index dc5520a39b8..a6be07963f5 100644
--- a/storage/tokudb/mysql-test/tokudb_mariadb/t/xa.test
+++ b/storage/tokudb/mysql-test/tokudb_mariadb/t/xa.test
@@ -68,6 +68,9 @@ xa start 'zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz';
select * from t1;
disconnect con1;
+xa recover;
+
connection default;
+xa rollback 'testb',0x2030405060,11;
drop table t1;
Re: [Maria-developers] 9aace068f0c: MDEV-21743 Split up SUPER privilege to smaller privileges
by Sergei Golubchik 01 Mar '20
Hi, Alexander!
Just a couple of issues that were left open after our slack chat.
Everything else looks fine, nothing new below.
On Mar 01, Alexander Barkov wrote:
> revision-id: 9aace068f0c (mariadb-10.5.0-277-g9aace068f0c)
> parent(s): 607960c7722
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2020-03-01 11:49:25 +0400
> message:
>
> MDEV-21743 Split up SUPER privilege to smaller privileges
> +constexpr uint PRIVILEGE_T_MAX_BIT= 36;
> +
> +static_assert((privilege_t)(1ULL << PRIVILEGE_T_MAX_BIT) == LAST_CURRENT_ACL,
> + "LAST_CURRENT_ACL and PRIVILEGE_T_MAX_BIT do not match");
I'd still prefer to get rid of static_assert here.
E.g. with
constexpr uint PRIVILEGE_T_MAX_BIT= my_bit_log2(LAST_CURRENT_ACL);
(and fixing my_bit_log2 to work with ulonglong).
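For illustration, a constexpr ulonglong-capable helper could look roughly like
this (just a sketch; the name and exact form of the real my_bit_log2() may
differ):

constexpr uint my_bit_log2_ull(unsigned long long n)
{
  return n <= 1 ? 0 : 1 + my_bit_log2_ull(n >> 1);
}
constexpr uint PRIVILEGE_T_MAX_BIT= my_bit_log2_ull(LAST_CURRENT_ACL);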
> @@ -7151,6 +7150,18 @@ bool check_global_access(THD *thd, privilege_t want_access, bool no_errors)
> {
> #ifndef NO_EMBEDDED_ACCESS_CHECKS
> char command[128];
> + /*
> + The userstat plugin in
> + plugin/userstat/client_stats.cc
> + plugin/userstat/user_stats.cc
> + calls
> + check_global_access(SUPER_ACL | PROCESS_ACL)
> + when populating CLIENT_STATISTICS and USER_STATISTICS
> + INFORMATION_SCHEMA tables.
> + We should eventually fix it to use one privilege only and uncomment
> + the DBUG_ASSERT below:
> + */
> + //DBUG_ASSERT(my_count_bits(want_access) <= 1);
I think you can remove SUPER_ACL from the userstat plugin and uncomment
the assert above.
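That is, roughly this on the plugin side (only a sketch of the intent; the
exact call in plugin/userstat may look slightly different):

  /* today, as quoted above */
  check_global_access(thd, SUPER_ACL | PROCESS_ACL);
  /* after dropping SUPER_ACL there, so the assert can be uncommented */
  check_global_access(thd, PROCESS_ACL);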
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] 256753e8ae8: Clean up and speed up interfaces for binary row logging
by Sergei Golubchik 01 Mar '20
Hi, Michael!
Just a few comments, see below
On Feb 28, Michael Widenius wrote:
> revision-id: 256753e8ae8 (mariadb-10.5.0-269-g256753e8ae8)
> parent(s): 9fd961cf687
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-02-26 16:05:53 +0200
> message:
>
> Clean up and speed up interfaces for binary row logging
...
> diff --git a/mysql-test/suite/rpl/r/create_or_replace_mix.result b/mysql-test/suite/rpl/r/create_or_replace_mix.result
> index 661278aa7ef..6c83d27eef9 100644
> --- a/mysql-test/suite/rpl/r/create_or_replace_mix.result
> +++ b/mysql-test/suite/rpl/r/create_or_replace_mix.result
> @@ -223,26 +226,12 @@ Log_name Pos Event_type Server_id End_log_pos Info
> slave-bin.000001 # Gtid # # GTID #-#-#
> slave-bin.000001 # Query # # use `test`; create table t1 (a int)
> slave-bin.000001 # Gtid # # BEGIN GTID #-#-#
> -slave-bin.000001 # Annotate_rows # # insert into t1 values (0),(1),(2)
> -slave-bin.000001 # Table_map # # table_id: # (test.t1)
> -slave-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
> +slave-bin.000001 # Query # # use `test`; insert into t1 values (0),(1),(2)
why is it STATEMENT now?
> slave-bin.000001 # Query # # COMMIT
> -slave-bin.000001 # Gtid # # BEGIN GTID #-#-#
> -slave-bin.000001 # Query # # use `test`; CREATE TABLE `t2` (
> - `a` int(11) DEFAULT NULL
> -) ENGINE=MyISAM
> -slave-bin.000001 # Annotate_rows # # create table t2 engine=myisam select * from t1
> -slave-bin.000001 # Table_map # # table_id: # (test.t2)
> -slave-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
> -slave-bin.000001 # Query # # COMMIT
> -slave-bin.000001 # Gtid # # BEGIN GTID #-#-#
> -slave-bin.000001 # Query # # use `test`; CREATE OR REPLACE TABLE `t2` (
> - `a` int(11) DEFAULT NULL
> -) ENGINE=InnoDB
> -slave-bin.000001 # Annotate_rows # # create or replace table t2 engine=innodb select * from t1
> -slave-bin.000001 # Table_map # # table_id: # (test.t2)
> -slave-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F
> -slave-bin.000001 # Xid # # COMMIT /* XID */
> +slave-bin.000001 # Gtid # # GTID #-#-#
> +slave-bin.000001 # Query # # use `test`; create table t2 engine=myisam select * from t1
> +slave-bin.000001 # Gtid # # GTID #-#-#
> +slave-bin.000001 # Query # # use `test`; create or replace table t2 engine=innodb select * from t1
> connection server_1;
> drop table t1;
> #
> diff --git a/mysql-test/suite/rpl/t/rpl_foreign_key.test b/mysql-test/suite/rpl/t/rpl_foreign_key.test
> new file mode 100644
> index 00000000000..50be97af24d
> --- /dev/null
> +++ b/mysql-test/suite/rpl/t/rpl_foreign_key.test
> @@ -0,0 +1,18 @@
> +--source include/have_innodb.inc
> +--source include/have_binlog_format_row.inc
> +
> +CREATE TABLE t1 (
> + id INT,
> + k INT,
> + c CHAR(8),
> + KEY (k),
> + PRIMARY KEY (id),
> + FOREIGN KEY (id) REFERENCES t1 (k)
> +) ENGINE=InnoDB;
> +LOCK TABLES t1 WRITE;
> +SET SESSION FOREIGN_KEY_CHECKS= OFF;
> +SET AUTOCOMMIT=OFF;
> +INSERT INTO t1 VALUES (1,1,'foo');
> +DROP TABLE t1;
> +SET SESSION FOREIGN_KEY_CHECKS= ON;
> +SET AUTOCOMMIT=ON;
What kind of replication test is it? You don't check binlog events, you
don't compare master and slave, you don't even run anything on the slave
to check whether your statements were replicated correctly.
In fact, you don't have any slave, you have not included
master-slave.inc, you only have binlog, so this test should be in the
binlog suite, not in the rpl suite - it's a binlog test, not replication
test.
And even in the binlog suite it would make sense to see what's actually
in a binlog. Just as a test that it's ok.
> diff --git a/sql/ha_sequence.cc b/sql/ha_sequence.cc
> index 6cb9937ebb4..71da208d775 100644
> --- a/sql/ha_sequence.cc
> +++ b/sql/ha_sequence.cc
> @@ -202,7 +202,11 @@ int ha_sequence::write_row(const uchar *buf)
> DBUG_ENTER("ha_sequence::write_row");
> DBUG_ASSERT(table->record[0] == buf);
>
> - row_already_logged= 0;
> + /*
> + Log to binary log even if this function has been called before
> + (The function ends by setting row_logging to 0)
> + */
> + row_logging= row_logging_init;
this is a sequence-specific hack, so you should define row_logging_init
in ha_sequence class, not in the base handler class
> if (unlikely(sequence->initialized == SEQUENCE::SEQ_IN_PREPARE))
> {
> /* This calls is from ha_open() as part of create table */
> diff --git a/sql/handler.cc b/sql/handler.cc
> index 1e3f987b4e5..4096ae8b90f 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -6224,32 +6225,37 @@ bool ha_show_status(THD *thd, handlerton *db_type, enum ha_stat_type stat)
> 1 Row needs to be logged
> */
>
> -bool handler::check_table_binlog_row_based(bool binlog_row)
> +bool handler::check_table_binlog_row_based()
> {
> - if (table->versioned(VERS_TRX_ID))
> - return false;
> - if (unlikely((table->in_use->variables.sql_log_bin_off)))
> - return 0; /* Called by partitioning engine */
> #ifdef WITH_WSREP
> - if (!table->in_use->variables.sql_log_bin &&
> - wsrep_thd_is_applying(table->in_use))
> - return 0; /* wsrep patch sets sql_log_bin to silence binlogging
> - from high priority threads */
> #endif /* WITH_WSREP */
That's an empty #ifdef :)
> if (unlikely((!check_table_binlog_row_based_done)))
> {
> check_table_binlog_row_based_done= 1;
> check_table_binlog_row_based_result=
> - check_table_binlog_row_based_internal(binlog_row);
> + check_table_binlog_row_based_internal();
> }
> return check_table_binlog_row_based_result;
> }
>
> -bool handler::check_table_binlog_row_based_internal(bool binlog_row)
> +bool handler::check_table_binlog_row_based_internal()
> {
> THD *thd= table->in_use;
>
> +#ifdef WITH_WSREP
> + if (!thd->variables.sql_log_bin &&
> + wsrep_thd_is_applying(table->in_use))
> + {
> + /*
> + wsrep patch sets sql_log_bin to silence binlogging from high
> + priority threads
> + */
> + return 0;
> + }
> +#endif
> return (table->s->can_do_row_logging &&
> + !table->versioned(VERS_TRX_ID) &&
> + !(thd->variables.option_bits & OPTION_BIN_TMP_LOG_OFF) &&
> thd->is_current_stmt_binlog_format_row() &&
> /*
> Wsrep partially enables binary logging if it have not been
> @@ -6769,13 +6718,17 @@ int handler::ha_write_row(const uchar *buf)
> { error= write_row(buf); })
>
> MYSQL_INSERT_ROW_DONE(error);
> - if (likely(!error) && !row_already_logged)
> + if (likely(!error))
> {
> rows_changed++;
> - error= binlog_log_row(table, 0, buf, log_func);
> + if (row_logging)
> + {
> + Log_func *log_func= Write_rows_log_event::binlog_row_logging_function;
> + error= binlog_log_row(table, 0, buf, log_func);
> + }
> #ifdef WITH_WSREP
> - if (table_share->tmp_table == NO_TMP_TABLE &&
> - WSREP(ha_thd()) && (error= wsrep_after_row(ha_thd())))
> + if (WSREP_NNULL(ha_thd()) && table_share->tmp_table == NO_TMP_TABLE &&
why did you swap tests? NO_TMP_TABLE check is cheaper
(same in update and delete)
> + !error && (error= wsrep_after_row(ha_thd())))
> {
> DBUG_RETURN(error);
> }
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 102416ec0a6..9e40c2ae8c8 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -10506,10 +10506,10 @@ do_continue:;
> No additional logging of query is needed
> */
> binlog_done= 1;
> + DBUG_ASSERT(new_table->file->row_logging);
> new_table->mark_columns_needed_for_insert();
> thd->binlog_start_trans_and_stmt();
> - binlog_write_table_map(thd, new_table,
> - thd->variables.binlog_annotate_row_events);
> + thd->binlog_write_table_map(new_table, 1);
does it mean you force annotations for ALTER TABLE even if they were
configured off?
And why would ALTER TABLE generate row events anyway?
> }
> if (copy_data_between_tables(thd, table, new_table,
> alter_info->create_list, ignore,
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hello, Everyone! I am new here and to open source too. I got to know about MariaDB through GSoC.
I have an intermediate level of knowledge in SQL and have experience working in languages such as C, C++ and Java.
I did go through the list of tasks mentioned for MariaDB GSoC 2020, and I was particularly interested
in the tasks listed under MariaDB Server: Optimizer.
Among those: 1) Add FULL OUTER JOIN to MariaDB 2) Recursive CTE support for UPDATE (and DELETE) statements.
As I understand the functioning of both of these constructs to some extent, I believe that
I would be able to follow the control flow.
I would really love to contribute to the code base and be of help, but I just don't know where to start.
Re: [Maria-developers] 5ae74b4823a: mysqld --help will now load mysqld.options table
by Sergei Golubchik 28 Feb '20
Hi, Michael!
On Feb 28, Michael Widenius wrote:
> revision-id: 5ae74b4823a (mariadb-10.5.0-274-g5ae74b4823a)
> parent(s): ee73f2c6e71
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-02-26 16:05:54 +0200
> message:
>
> mysqld --help will now load mysqld.options table
mysql.plugin
>
> Changes:
> - Initalize Aria early to allow it to load mysql.plugin table with --help
> - Don't print 'aborting' when doing --help
> - Don't write 'loose' error messages on log_warning < 2 (2 is default)
> - Don't write warnings about disabled plugings when doing --help
> - Don't write aria_log_control or aria log files when doing --help
> - When using --help, open all Aria tables in read only mode (safety)
> - If aria_init() fails, do a cleanup(). (Frees used memory)
> - If aria_log_control is locked with --help, then don't wait 30 seconds
> but instead return at once without initialzing Aria plugin.
>
> diff --git a/sql/mysqld.cc b/sql/mysqld.cc
> index b2f8afca7a6..415a12f4783 100644
> --- a/sql/mysqld.cc
> +++ b/sql/mysqld.cc
> @@ -8511,8 +8511,8 @@ static void option_error_reporter(enum loglevel level, const char *format, ...)
> va_start(args, format);
>
> /* Don't print warnings for --loose options during bootstrap */
> - if (level == ERROR_LEVEL || !opt_bootstrap ||
> - global_system_variables.log_warnings)
> + if (level == ERROR_LEVEL ||
> + (!opt_bootstrap && global_system_variables.log_warnings > 1))
You've completely suppressed all --loose warnings during bootstrap.
Before your patch they were basically always enabled (because of
log_warnings==2).
I am not sure it's a good idea to disable warnings completely in
bootstrap.
In fact, I don't see why bootstrap should be special, so I'd simply
remove the !opt_bootstrap condition completely here. But if you want to keep
it you can do something like
global_system_variables.log_warnings > opt_bootstrap
which will make bootstrap a bit quieter than normal startup.
> {
> vprint_msg_to_log(level, format, args);
> }
> diff --git a/sql/sql_plugin.cc b/sql/sql_plugin.cc
> index d7d7fcca4a2..31de259a218 100644
> --- a/sql/sql_plugin.cc
> +++ b/sql/sql_plugin.cc
> @@ -1679,7 +1680,22 @@ int plugin_init(int *argc, char **argv, int flags)
> global_system_variables.table_plugin =
> intern_plugin_lock(NULL, plugin_int_to_ref(plugin_ptr));
> DBUG_SLOW_ASSERT(plugin_ptr->ref_count == 1);
> + }
> + /* Initialize Aria plugin so that we can load mysql.plugin */
> + plugin_ptr= plugin_find_internal(&Aria, MYSQL_STORAGE_ENGINE_PLUGIN);
> + DBUG_ASSERT(plugin_ptr || !mysql_mandatory_plugins[0]);
> + if (plugin_ptr)
> + {
> + DBUG_ASSERT(plugin_ptr->load_option == PLUGIN_FORCE);
>
> + if (plugin_initialize(&tmp_root, plugin_ptr, argc, argv, false))
> + {
> + if (!opt_help)
> + goto err_unlock;
> + plugin_ptr->state= PLUGIN_IS_DISABLED;
> + }
> + else
> + aria_loaded= 1;
> }
> mysql_mutex_unlock(&LOCK_plugin);
I think this should be done differently. In a completely opposite way.
I had unfortunately hard-coded MyISAM here. A proper fix here could be
to remove this special treatment of MyISAM instead of adding another
special treatment of Aria.
Then plugin_init() could work like:
* run dd_frm_type() for the mysql.plugin table - like it's done now
* instead of hard-coding MyISAM (and Aria), find this engine name
in the plugin_array[] (note, all builtin plugins are already there)
* initialize it and (if successful) load mysql.plugin table
It only concerns the sql_plugin.cc part of your commit. Your Aria part
of the commit is still needed, because a well-behaved engine has to be
read-only in --help.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] 044dfaff670: Replace handler::primary_key_is_clustered() with handler::ha_is_clustered_key()
by Sergei Golubchik 28 Feb '20
Hi, Michael!
On Feb 28, Michael Widenius wrote:
> revision-id: 044dfaff670 (mariadb-10.5.0-275-g044dfaff670)
> parent(s): 5ae74b4823a
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-02-26 16:05:55 +0200
> message:
>
> Replace handler::primary_key_is_clustered() with handler::ha_is_clustered_key()
>
> This was done to both simplify the code and also to be easier to handle
> storage engines that are clustered on some other index than the primary
> key.
No, I don't get it.
Old method meant "Check if the key is a clustered and a reference key".
Just "clustered" is marked with HA_CLUSTERED_INDEX.
So,
1. You've renamed the method but the name does not match the semantics.
It is called is_clustering_key() but it really means
"clustered AND reference key"
2. What other engine can there be where the reference key isn't the primary key?
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] MDEV-21580: Allow packed sort keys in sort buffer, part #3
by Sergey Petrunia 27 Feb '20
Hi Varun,
Please find more input below:
Input on code structure:
> SORT_INFO *filesort(THD *thd, TABLE *table, Filesort *filesort...
> {
> ...
> if (!(sort_keys= filesort->make_sortorder(thd, join, first_table_bit)))
> DBUG_RETURN(NULL); /* purecov: inspected */
The main effect of this function used to be to create Filesort::sortorder, and
so the name 'make_sortorder' used to make sense.
Now, it creates a Sort_keys object so the name is counter-intuitive.
..
> uint sort_len= sortlength(thd, sort_keys, &multi_byte_charset,
> &allow_packing_for_sortkeys);
Another counter-intuitive name. Maybe this should be a method of sort_keys
object with a different name?
And maybe this call and make_sortorder should be moved together since they're
logically doing the same thing?
...
> Sort_keys*
> Filesort::make_sortorder(THD *thd, JOIN *join, table_map first_table_bit)
> {
...
> if (!sortorder)
> sortorder= (SORT_FIELD*) thd->alloc(sizeof(SORT_FIELD) * count);
Using "if(!sortorder)" means the sort order can be already present? Is this
because we're inside a subquery which we are re-executing?
...
> Sort_keys* sort_keys= new Sort_keys(sort_keys_array);
But then, we create sort_keys every time, and we do it on a MEM_ROOT which
causes a potential O(#rows_examined) memory use. Is my logic correct?
I think we should only call make_sortorder() if this hasn't already been done.
Any objections to this?
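Something along these lines, as a sketch (assuming Filesort can cache the
Sort_keys pointer between executions; the member name here is hypothetical):

  if (!filesort->sort_keys)
    filesort->sort_keys= filesort->make_sortorder(thd, join, first_table_bit);
  if (!(sort_keys= filesort->sort_keys))
    DBUG_RETURN(NULL);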
>
> void Filesort_buffer::sort_buffer(const Sort_param *param, uint count)
> {
>
> ...
> qsort2_cmp cmp_func= param->using_packed_sortkeys() ?
> get_packed_keys_compare_ptr() :
> get_ptr_compare(size);
>
> my_qsort2(m_sort_keys, count, sizeof(uchar*), cmp_func,
> param->using_packed_sortkeys() ?
> (void*)param :
> (void*) &size);
This choose-the-comparison-function logic is duplicated in merge_buffers().
Can we have it in one place? I would add appropriate members into class Sort_keys
(or Sort_param).
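A rough sketch of what such members could look like (names and argument types
are hypothetical, just to show the intent of having the choice in one place):

  qsort2_cmp Sort_param::get_compare_function(size_t size)
  {
    return using_packed_sortkeys() ? get_packed_keys_compare_ptr()
                                   : get_ptr_compare(size);
  }

  void *Sort_param::get_compare_argument(size_t *size)
  {
    return using_packed_sortkeys() ? (void*) this : (void*) size;
  }

Then both sort_buffer() and merge_buffers() would just call these.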
> if (param->using_packed_addons() || param->using_packed_sortkeys())
> {
> /*
> The last record read is most likely not complete here.
> We need to loop through all the records, reading the length fields,
> and then "chop off" the final incomplete record.
> */
Why change the indentation of this block, from correct to incorrect?
Please move it back two spaces to the left.
> static uint make_sortkey(Sort_param *param, uchar *to, uchar *ref_pos)
>
This function only seems to access Sort_param members that relate to Sort_keys.
Did you consider making it accept Sort_keys as an argument?
This seems like a sound idea to me: make_sortkey() is only concerned with
making sort keys from original records. This is what Sort_keys class should be
handling. Sort_param on the other hand covers the entire sorting process:
the buffers, total # of rows, etc.
Please give it a try.
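I.e. roughly this shape (just to illustrate the idea; whatever extra Sort_param
state is really needed can stay as another argument):

  static uint make_sortkey(Sort_keys *sort_keys, uchar *to, uchar *ref_pos);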
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Re: [Maria-developers] 4c263b6d30b: MDEV-20632: Recursive CTE cycle detection using CYCLE clause
by Sergei Golubchik 27 Feb '20
Hi, Oleksandr!
On Feb 26, Oleksandr Byelkin wrote:
> On Wed, Feb 26, 2020 at 1:44 PM Sergei Golubchik <serg(a)mariadb.org> wrote:
>
> > Hi, Sanja!
> >
> > _some_ comments are below.
> >
> > The main thing, I still don't understand your changes in sql_select.cc.
> >
> > Why did you create separate counters and code paths with if() for CYCLE?
> > I'd think it could be just a generalization of DISTINCT
>
> DISTINCT makes all fields distinct, except hidden, I need also except come
> other (or better say by list of not hidden)
Exactly. I thought DISTINCT will be just CYCLE with an empty list of
non-distinct columns. Or something like that. Not like
if (cycle) {
cycle code
} else {
distinct code
}
> > > --- a/sql/sql_cte.cc
> > > +++ b/sql/sql_cte.cc
> > > @@ -982,6 +982,38 @@ With_element::rename_columns_of_derived_unit(THD *thd,
> >
> > are you sure what you're doing below can still be called
> > "rename_columns_of_derived_unit" ?
>
> I can rename it to prepare_olumns_of_derived_unit, is it OK?
anything you like :)
> > > +opt_cycle:
> > > + /* empty */
> > > + { $$= NULL; }
> > > + |
> > > + CYCLE_SYM
> > > + {
> > > + if (!Lex->curr_with_clause->with_recursive)
> > > + {
> > > + thd->parse_error(ER_SYNTAX_ERROR, $1.pos());
> > > + }
> > > + }
> > > + '(' with_column_list ')'
> >
> > Where did you see that the standard requires parentheses here?
>
> Maybe in ORACLE docs... but without it it creates a lot of conflicts.
The standard says no parentheses. What kind of conflicts do you get?
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hi Everyone!
This mail describes the usage and implementation of Lag Free Alter, and it
also answers the questions raised by Kristian and Simon J Mudd (in 2016).
Desc:- This splits an Alter into 2 different commits: START ALTER and
COMMIT/ROLLBACK ALTER. Start Alter is written to the binlog as soon as we get
the locks for the table, the alter then proceeds as usual, and at the time of
writing the binlog we write COMMIT Alter if the alter is successful, otherwise
ROLLBACK Alter.
Usage:- Using this feature is quite simple.
1. On master you have to turn on `BINLOG_SPLIT_ALTER` dynamic variable.
2. Slave must be using parallel replication.
Advanced Usage:-
So the alter is divided like this:
1. START identifier Actual_alter_stmt
2. COMMIT/ROLLBACK identifier Actual_alter_stmt, OR
2. COMMIT/ROLLBACK identifier ALTER
The identifier is the thread_id.
Questions by Simon Mudd.
>>* this behaviour should be configurable?
Yes.
>> - new global variable on the master to allow injection of this changed event stream?
Right , `BINLOG_SPLIT_ALTER`
>> - a new session variable set in the session where the command is triggered ?
Right , `BINLOG_SPLIT_ALTER`
>> - on slave a setting to indicate how many INPLACE ALTER TABLEs can run at once?
No setting so far, but I am thinking of maximum number of concurrent ALTERs =
slave_parallel_threads.
>>* how does a DBA monitor what’s going on?
>> - the progress and number of active DDL statements
So as there are 2 parts, progress will go like this:
1. Executing of start alter (this will take most of the time)
2. Waiting for the commit/rollback signal
3. Commit/Rollback Alter.
Number of active ALTERs: these create new threads, so the DBA can see them
that way,
or I am thinking of adding a variable in SHOW SLAVE INFO which will
show the active DDL.
>> - please consider adding counters/metrics for:
>> * number of “asynchronous DDLs” in progress / completed successfully / failed or rolled back
Okay, we can have counters for these metrics, if I get time to implement this.
>> * sizes of the DDL changes made so far
Not sure if we need this.
>> * number of threads busy in this state (maybe can be implied from SHOW PROCESSLIST or equivalent but nicer to have an easy to query metric)
Show processlist will show the busy threads.
Sample Output when concurrent alter is running on slave.
slave_parallel_threads= 10
Every 1.0s: mysql -uroot -S /home/sachin/alter/build/mysql-test/var/tmp/mysqld.2.sock -e show processlist
sachin-sp52: Wed Jan 22 16:56:00 2020
Id User Host db Command Time State Info Progress
1 system user NULL Daemon NULL InnoDB purge coordinator NULL 0.000
3 system user NULL Daemon NULL InnoDB purge worker NULL 0.000
2 system user NULL Daemon NULL InnoDB purge worker NULL 0.000
4 system user NULL Daemon NULL InnoDB purge worker NULL 0.000
5 system user NULL Daemon NULL InnoDB shutdown handler NULL 0.000
10 root localhost:44478 test Sleep 6 NULL 0.000
11 root localhost:44480 test Sleep 7 NULL 0.000
15 root localhost:44492 test Sleep 6 NULL 0.000
16 root localhost:44494 test Sleep 6 NULL 0.000
17 system user NULL Slave_IO 6 Waiting for master to send event NULL 0.000
19 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
22 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
20 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
23 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
24 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
25 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
26 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
21 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
27 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
28 system user NULL Slave_worker 6 Waiting for work from SQL thread NULL 0.000
18 system user NULL Slave_SQL 6 Slave has read all relay log; waiting for the slave I/O thread to update it NULL 0.000
29 system user test Slave_worker 6 NULL /*!100001 START 19 alter table t3 add column c int, force, algorithm=inplace */ 0.000
30 system user test Slave_worker 6 NULL /*!100001 START 17 alter table t1 add column c int, force, algorithm=inplace */ 0.000
31 system user test Slave_worker 6 NULL /*!100001 START 18 alter table t2 add column c int, force, algorithm=inplace */ 0.000
32 system user test Slave_worker 6 NULL /*!100001 START 20 alter table t4 add column c int, force, algorithm=inplace */ 0.000
33 system user test Slave_worker 6 NULL /*!100001 START 21 alter table t5 add column c int, force, algorithm=inplace */ 0.000
34 root localhost NULL Query 0 Init show processlist 0.000
>>* what happens if the commit/rollback ddl event never arrives? (the statement on the master could be skipped for one of several reasons)
So here are 2 options.
If the slave is still running, the DBA can manually run
COMMIT/ROLLBACK identifier alter_stmt/ALTER, where identifier is the same as the
START ALTER identifier,
so this will either commit or roll back the alter.
If the slave is not running:
since the slave is not running, all context is lost, so this will take as much
time as a normal alter.
>> - I assume that users on the slave see the old structure all the time until it completes
Right, we work on a copy for MyISAM, and in the other case InnoDB will take
care of it.
>> - would this have a storage or performance penalty on the slave if the commit/rollback DDL event never arrives?
Not performance but yes storage and ram penalty.
>> - can a “commit” / “rollback” be manually triggered by the DBA in such circumstances?
yes.
>>* what happens if the server crashes while this process (or these processes) are ongoing when the server starts up again?
So if we crash in START ALTER there is no issue, we can run it again, since it is
transactional (we are working on a copy, so ..).
If we crash after start alter and we receive COMMIT alter, then it will be
treated as a normal alter.
ROLLBACK ALTER will be instant.
If we crash in COMMIT ALTER, this will be the same as a crash in a normal ALTER TABLE,
so the DBA has to deal with it.
Architecture:-
Let's look at sample alter perf data for InnoDB and MyISAM.
MyISAM
+ 96.72% 0.00% mysqld mysqld [.] mysql_parse
+ 96.72% 0.00% mysqld mysqld [.] mysql_execute_command
+ 96.72% 0.00% mysqld mysqld [.] Sql_cmd_alter_table::execute
- 96.72% 0.00% mysqld mysqld [.] mysql_alter_table
- mysql_alter_table
- 96.63% copy_data_between_tables
+ 79.64% handler::ha_write_row
+ 6.24% TABLE::update_default_fields
+ 5.71% READ_RECORD::read_record
+ 1.87% do_copy_null
1.05% Field::do_field_int
+ 0.95% _mcount
InnoDB
+ 41.27% 0.00% mysqld mysqld [.] mysql_parse
+ 41.27% 0.00% mysqld mysqld [.] mysql_execute_command
+ 41.27% 0.00% mysqld mysqld [.] Sql_cmd_alter_table::execute
- 41.27% 0.00% mysqld mysqld [.] mysql_alter_table
- mysql_alter_table
- 41.27% mysql_inplace_alter_table
- 41.15% handler::ha_inplace_alter_table
ha_innobase::inplace_alter_table
- row_merge_build_indexes
- 41.00% row_merge_read_clustered_index
+ 22.88% row_merge_insert_index_tuples
+ 5.89% row_build_w_add_vcol
+ 5.20% BtrBulk::finish
+ 3.20% row_merge_buf_add
+ 0.87% page_cur_move_to_next
0.62% rec_get_offsets_func
So as we can see most of the work is done by copy_data_between_tables/
mysql_inplace_alter_table.
Master Side:-
There is not much change on the master side.
After locking the table and before executing these functions we write
START _id_ alter_stmt into the binlog.
And at the time of write_bin_log we write
COMMIT _id_ alter_stmt into the binlog.
So the alter statement is divided into 2, hence 2 GTIDs. START_ALTER
has the special flag FL_START_ALTER_E1(4), which is used on the slave side to
create a new worker thread for start alter processing.
Slave Side:-
This requires parallel replication on the slave side.
In do_event, when we get a gtid_event with FL_START_ALTER_E1, ::choose_thread
will call rpl_parallel_add_extra_worker, which creates a new worker thread
that does the start alter processing. The gtid_log_event and the next
Query_log_event will be scheduled to this new thread, and this thread
exits as soon as the alter is finished.
On the slave side, START ALTER is binlogged after successfully getting the MDL lock
and the thd lock. Until this point we are executed as DDL (so a new GCO), but after
getting the locks we call finish_event_group, so that new events can proceed in
parallel.
Then it continues to execute the code in mysql_alter_table until
we reach a non-transactional part (like renaming/dropping the table in MyISAM).
Before executing the non-transactional part it waits for a signal from another
worker thread telling it to either abort or proceed. We add an entry into mi->
start_alter_list, with the thread_id (of the master) as the key.
COMMIT/ROLLBACK Alter is treated as a normal query_log_event, so it is
assigned a normal worker. This command takes the SQLCOM_COMMIT_PREVIOUS path in
mysql_execute_command. We simply search for the thread_id in start_alter_list,
change the status from ::WAITING to ::COMMIT_ALTER and signal the wait condition;
if we don't find the thread_id we wait on the wait condition. So it is just a
simple consumer/producer between start alter and commit/rollback.
Questions by Kristian:-
>Can you explain what you mean by "true LOCK=NONE"? It is not clear from your
>description. I think you mean something that will allow to run an ALTER
>TABLE on the slave in parallel with the same statement on the master?
Yes
>There will be a number of things to consider with this. Previously, all
>transactions in the binlog have been atomic; this introduces related events
>that can be arbitrarily separated in the binlog.
>For example, suppose the slave stops in the middle between BEGIN_DDL_EVENT
>and COMMIT_DDL_EVENT/ROLLBACK_DDL_EVENT. What is then the slave's binlog
>position / GTID position?
Now we are using 2 GTIDs for ALTER, so it won't be an issue.
>Hopefully, the exiting pool of slave worker threads (currently used for
>parallel replication) can be used for this as well? Doesn't seem right to
>introduce a new kind of thread pool...
We are using the same (global) thread pool, but I am creating new threads for START
ALTER. Otherwise we would have a deadlock: suppose we have 5 concurrent START
ALTERs and just 5 slave worker threads; all workers would be waiting for
COMMIT/ROLLBACK, but we can't execute the commit/rollback because we don't have any
free worker. There can be more such cases when we have concurrent DML too,
so creating
a new thread was the safest option.
>Won't you need some mechanism to prevent following events to access the
>table before ALTER TABLE in a worker thread can acquire the metadata lock?
Right. Until locking, START ALTER is executed as DDL, so no following event
will be executed before locking.
And one more thing regarding locks: only one thread will be
doing the whole work of the alter (we can just assume that there is some
sleep on the slave),
so execution on master and slave will be equivalent.
>There are some nasty issues to be aware of with potential deadlocks related
>to FLUSH TABLES WITH GLOBAL READ LOCK and such, when multiple threads are
>trying to coordinate waiting and locking; there were already some quite hard
>issues with this for parallel replication.
I have to test for it.
Code branch bb-10.5-olter
Jira (Mdev-11675(https://jira.mariadb.org/browse/MDEV-11675))
Regards
Sachin
--
Regards
Sachin Setiya
Software Engineer at MariaDB
Re: [Maria-developers] ec1a860bd09: MDEV-21743 Split up SUPER privilege to smaller privileges
by Sergei Golubchik 26 Feb '20
Hi, Alexander!
Thanks!
Looks very straightforward.
See few comments below.
The main one - I am not sure I like the idea of creating numerous
aliases for privileges. This approach looks rather confusing to me.
On Feb 26, Alexander Barkov wrote:
> revision-id: ec1a860bd09 (mariadb-10.5.0-231-gec1a860bd09)
> parent(s): b8b5a6a2f9d
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2020-02-23 22:09:55 +0400
> message:
>
> MDEV-21743 Split up SUPER privilege to smaller privileges
> diff --git a/mysql-test/main/alter_user.test b/mysql-test/main/alter_user.test
> index 9ea98615272..60a36499a55 100644
> --- a/mysql-test/main/alter_user.test
> +++ b/mysql-test/main/alter_user.test
> @@ -30,7 +30,7 @@ alter user foo;
>
> --echo # Grant super privilege to the user.
> connection default;
> -grant super on *.* to foo;
> +grant READ_ONLY ADMIN on *.* to foo;
--echo comments are now wrong
> --echo # We now have super privilege. We should be able to run alter user.
> connect (b, localhost, foo);
> diff --git a/mysql-test/main/grant.result b/mysql-test/main/grant.result
> index e83083be4ed..1452ada11f5 100644
> --- a/mysql-test/main/grant.result
> +++ b/mysql-test/main/grant.result
> @@ -631,6 +634,10 @@ Super Server Admin To use KILL thread, SET GLOBAL, CHANGE MASTER, etc.
> Trigger Tables To use triggers
> Create tablespace Server Admin To create/alter/drop tablespaces
> Update Tables To update existing rows
> +Set user Server To create views and stored routines with a different definer
> +Federated admin Server To execute the CREATE SERVER, ALTER SERVER, DROP SERVER statements
> +Connection admin Server To skip connection related limits tests
^^^ allows KILL too
> +Read_only admin Server To perform write operations even if @@read_only=ON
> Usage Server Admin No privileges - allow connect only
> connect root,localhost,root,,test,$MASTER_MYPORT,$MASTER_MYSOCK;
> connection root;
> diff --git a/sql/lock.cc b/sql/lock.cc
> index 6f86c0a38f6..9a4024606f8 100644
> --- a/sql/lock.cc
> +++ b/sql/lock.cc
> @@ -114,7 +113,7 @@ lock_tables_check(THD *thd, TABLE **tables, uint count, uint flags)
> DBUG_ENTER("lock_tables_check");
>
> system_count= 0;
> - is_superuser= (thd->security_ctx->master_access & SUPER_ACL) != NO_ACL;
> + is_superuser= (thd->security_ctx->master_access & IGNORE_READ_ONLY_ACL) != NO_ACL;
may be then s/is_superuser/ignores_read_only/ ?
> log_table_write_query= (is_log_table_write_query(thd->lex->sql_command)
> || ((flags & MYSQL_LOCK_LOG_TABLE) != 0));
>
> diff --git a/sql/privilege.h b/sql/privilege.h
> index 5dbc0b6dbdf..f80e726d8d0 100644
> --- a/sql/privilege.h
> +++ b/sql/privilege.h
> @@ -59,24 +59,60 @@ enum privilege_t: unsigned long long
> TRIGGER_ACL = (1UL << 27),
> CREATE_TABLESPACE_ACL = (1UL << 28),
> DELETE_HISTORY_ACL = (1UL << 29), // Added in 10.3.4
> + SET_USER_ACL = (1UL << 30), // Added in 10.5.2
> + FEDERATED_ADMIN_ACL = (1UL << 31), // Added in 10.5.2
> + CONNECTION_ADMIN_ACL = (1ULL << 32), // Added in 10.5.2
> + READ_ONLY_ADMIN_ACL = (1ULL << 33), // Added in 10.5.2
> + REPL_SLAVE_ADMIN_ACL = (1ULL << 34), // Added in 10.5.2
> + REPL_MASTER_ADMIN_ACL = (1ULL << 35), // Added in 10.5.2
> + BINLOG_ADMIN_ACL = (1ULL << 36) // Added in 10.5.2
> /*
> - don't forget to update
> - 1. static struct show_privileges_st sys_privileges[]
> - 2. static const char *command_array[] and static uint command_lengths[]
> - 3. mysql_system_tables.sql and mysql_system_tables_fix.sql
> - 4. acl_init() or whatever - to define behaviour for old privilege tables
> - 5. sql_yacc.yy - for GRANT/REVOKE to work
> - 6. Add a new ALL_KNOWN_ACL_VERSION
> - 7. Change ALL_KNOWN_ACL to ALL_KNOWN_ACL_VERSION
> - 8. Update User_table_json::get_access()
> + don't forget to update:
> + In this file:
> + - Add a new LAST_version_ACL
> + - Fix PRIVILEGE_T_MAX_BIT
> + - Add a new ALL_KNOWN_ACL_version
> + - Change ALL_KNOWN_ACL to ALL_KNOWN_ACL_version
> + - Change GLOBAL_ACLS if needed
> + - Change SUPER_ADDED_SINCE_USER_TABLE_ACL if needed
> +
> + In other files:
> + - static struct show_privileges_st sys_privileges[]
> + - static const char *command_array[] and static uint command_lengths[]
> + - mysql_system_tables.sql and mysql_system_tables_fix.sql
> + - acl_init() or whatever - to define behaviour for old privilege tables
> + - Update User_table_json::get_access()
> + - sql_yacc.yy - for GRANT/REVOKE to work
> +
> + Important: the enum should contain only single-bit values.
> + In this case, debuggers print bit combinations in the readable form:
> + (gdb) p (privilege_t) (15)
> + $8 = (SELECT_ACL | INSERT_ACL | UPDATE_ACL | DELETE_ACL)
> +
> + Bit-OR combinations of the above values should be declared outside!
> */
> -
> - // A combination of all bits defined in 10.3.4 (and earlier)
> - ALL_KNOWN_ACL_100304 = (1UL << 30) - 1
> };
>
>
> -constexpr privilege_t ALL_KNOWN_ACL= ALL_KNOWN_ACL_100304;
> +// Version markers
> +constexpr privilege_t LAST_100304_ACL= DELETE_HISTORY_ACL;
> +constexpr privilege_t LAST_100502_ACL= BINLOG_ADMIN_ACL;
> +constexpr privilege_t LAST_CURRENT_ACL= LAST_100502_ACL;
> +constexpr uint PRIVILEGE_T_MAX_BIT= 36;
> +
> +static_assert((privilege_t)(1ULL << PRIVILEGE_T_MAX_BIT) == LAST_CURRENT_ACL,
> + "LAST_CURRENT_ACL and PRIVILEGE_T_MAX_BIT do not match");
why wouldn't you define PRIVILEGE_T_MAX_BIT via LAST_CURRENT_ACL instead?
> +
> +// A combination of all bits defined in 10.3.4 (and earlier)
> +constexpr privilege_t ALL_KNOWN_ACL_100304 =
> + (privilege_t) ((LAST_100304_ACL << 1) - 1);
> +
> +// A combination of all bits defined in 10.5.2
> +constexpr privilege_t ALL_KNOWN_ACL_100502=
> + (privilege_t) ((LAST_100502_ACL << 1) - 1);
> +
> +// A combination of all bits defined as of the current version
> +constexpr privilege_t ALL_KNOWN_ACL= ALL_KNOWN_ACL_100502;
>
>
> // Unary operators
> @@ -229,6 +280,104 @@ constexpr privilege_t SHOW_CREATE_TABLE_ACLS=
> constexpr privilege_t TMP_TABLE_ACLS=
> COL_DML_ACLS | ALL_TABLE_DDL_ACLS;
>
> +
> +
> +/*
> + If a VIEW has a `definer=invoker@host` clause and
> + the specified definer does not exists, then
> + - The invoker with REVEAL_MISSING_DEFINER_ACL gets:
> + ERROR: The user specified as a definer ('definer1'@'localhost') doesn't exist
> + - The invoker without MISSING_DEFINER_ACL gets a generic access error,
> + without revealing details that the definer does not exists.
> +
> + TODO: we should eventually test the same privilege when processing
> + other objects that have the DEFINER clause (e.g. routines, triggers).
> + Currently the missing definer is revealed for non-privileged invokers
> + in case of routines, triggers, etc.
> +*/
> +constexpr privilege_t REVEAL_MISSING_DEFINER_ACL= SUPER_ACL;
1.
why did you create these aliases? I don't think they make the
code simpler, on the contrary, now one can never know whether
say, IGNORE_READ_ONLY_ACL is a real privilege like in
GRANT IGNORE READ_ONLY ON *.* TO user@host
or it's just an alias.
2. REVEAL_MISSING_DEFINER_ACL should be SET_USER_ACL, I think
> +constexpr privilege_t DES_DECRYPT_ONE_ARG_ACL= SUPER_ACL;
> +constexpr privilege_t LOG_BIN_TRUSTED_SP_CREATOR_ACL= SUPER_ACL;
this could be SET_USER_ACL too
> +constexpr privilege_t DEBUG_ACL= SUPER_ACL;
> +constexpr privilege_t SET_GLOBAL_SYSTEM_VARIABLE_ACL= SUPER_ACL;
> +constexpr privilege_t SET_RESTRICTED_SESSION_SYSTEM_VARIABLE_ACL= SUPER_ACL;
> +
> +/* Privileges related to --read-only */
> +constexpr privilege_t IGNORE_READ_ONLY_ACL= READ_ONLY_ADMIN_ACL;
> +
> +/* Privileges related to connection handling */
> +constexpr privilege_t IGNORE_INIT_CONNECT_ACL= CONNECTION_ADMIN_ACL;
> +constexpr privilege_t IGNORE_MAX_USER_CONNECTIONS_ACL= CONNECTION_ADMIN_ACL;
> +constexpr privilege_t IGNORE_MAX_CONNECTIONS_ACL= CONNECTION_ADMIN_ACL;
> +constexpr privilege_t IGNORE_MAX_PASSWORD_ERRORS_ACL= CONNECTION_ADMIN_ACL;
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t KILL_OTHER_USER_PROCESS_ACL= CONNECTION_ADMIN_ACL;
> +
> +
> +/*
> + Binary log related privileges that are checked regardless
> + of active replication running.
> +*/
> +
> +/*
> + This command was renamed from "SHOW MASTER STATUS"
> + to "SHOW BINLOG STATUS" in 10.5.2.
> + Was SUPER_ACL | REPL_CLIENT_ACL prior to 10.5.2
> +*/
> +constexpr privilege_t STMT_SHOW_BINLOG_STATUS_ACL= REPL_CLIENT_ACL;
> +
> +// Was SUPER_ACL | REPL_CLIENT_ACL prior to 10.5.2
> +constexpr privilege_t STMT_SHOW_BINARY_LOGS_ACL= REPL_CLIENT_ACL;
> +
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_PURGE_BINLOG_ACL= BINLOG_ADMIN_ACL;
> +
> +// Was REPL_SLAVE_ACL prior to 10.5.2
> +constexpr privilege_t STMT_SHOW_BINLOG_EVENTS_ACL= PROCESS_ACL;
> +
> +
> +/*
> + Privileges for replication related statements
> + that are executed on the master.
> +*/
> +constexpr privilege_t COM_REGISTER_SLAVE_ACL= REPL_SLAVE_ACL;
> +constexpr privilege_t COM_BINLOG_DUMP_ACL= REPL_SLAVE_ACL;
> +// Was REPL_SLAVE_ACL prior to 10.5.2
> +constexpr privilege_t STMT_SHOW_SLAVE_HOSTS_ACL= REPL_MASTER_ADMIN_ACL;
> +
> +
> +/* Privileges for statements that are executed on the slave */
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_START_SLAVE_ACL= REPL_SLAVE_ADMIN_ACL;
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_STOP_SLAVE_ACL= REPL_SLAVE_ADMIN_ACL;
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_CHANGE_MASTER_ACL= REPL_SLAVE_ADMIN_ACL;
> +// Was (SUPER_ACL | REPL_CLIENT_ACL) prior to 10.5.2
> +constexpr privilege_t STMT_SHOW_SLAVE_STATUS_ACL= REPL_SLAVE_ADMIN_ACL;
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_BINLOG_ACL= REPL_SLAVE_ADMIN_ACL;
> +// Was REPL_SLAVE_ACL prior to 10.5.2
> +constexpr privilege_t STMT_SHOW_RELAYLOG_EVENTS_ACL= REPL_SLAVE_ADMIN_ACL;
> +
> +
> +/* Privileges for federated database related statements */
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_CREATE_SERVER_ACL= FEDERATED_ADMIN_ACL;
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_ALTER_SERVER_ACL= FEDERATED_ADMIN_ACL;
> +// Was SUPER_ACL prior to 10.5.2
> +constexpr privilege_t STMT_DROP_SERVER_ACL= FEDERATED_ADMIN_ACL;
> +
> +
> +/* Privileges related to processes */
> +constexpr privilege_t COM_PROCESS_INFO_ACL= PROCESS_ACL;
> +constexpr privilege_t STMT_SHOW_EXPLAIN_ACL= PROCESS_ACL;
> +constexpr privilege_t STMT_SHOW_ENGINE_STATUS_ACL= PROCESS_ACL;
> +constexpr privilege_t STMT_SHOW_ENGINE_MUTEX_ACL= PROCESS_ACL;
> +constexpr privilege_t STMT_SHOW_PROCESSLIST_ACL= PROCESS_ACL;
> +
> +
> /*
> Defines to change the above bits to how things are stored in tables
> This is needed as the 'host' and 'db' table is missing a few privileges
> diff --git a/sql/sql_connect.cc b/sql/sql_connect.cc
> index e2a3c482ae4..b096bfa7a95 100644
> --- a/sql/sql_connect.cc
> +++ b/sql/sql_connect.cc
> @@ -1246,7 +1245,8 @@ void prepare_new_connection_state(THD* thd)
> thd->set_command(COM_SLEEP);
> thd->init_for_queries();
>
> - if (opt_init_connect.length && !(sctx->master_access & SUPER_ACL))
> + if (opt_init_connect.length &&
> + (sctx->master_access & IGNORE_INIT_CONNECT_ACL) == NO_ACL)
dunno, I kind of like !(access & SOME_ACL)
why not to keep privilege_t -> bool?
> {
> execute_init_command(thd, &opt_init_connect, &LOCK_sys_init_connect);
> if (unlikely(thd->is_error()))
> diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
> index dac5b025821..7f3a436a4c2 100644
> --- a/sql/sql_parse.cc
> +++ b/sql/sql_parse.cc
> @@ -7138,8 +7138,7 @@ bool check_some_access(THD *thd, privilege_t want_access, TABLE_LIST *table)
> @warning
> One gets access right if one has ANY of the rights in want_access.
> This is useful as one in most cases only need one global right,
> - but in some case we want to check if the user has SUPER or
> - REPL_CLIENT_ACL rights.
> + but in some case we want to check multiple rights.
In what cases? Are there any left?
>
> @retval
> 0 ok
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] 4c263b6d30b: MDEV-20632: Recursive CTE cycle detection using CYCLE clause
by Sergei Golubchik 26 Feb '20
Hi, Sanja!
_some_ comments are below.
The main thing, I still don't understand your changes in sql_select.cc.
Why did you create separate counters and code paths with if() for CYCLE?
I'd think it could be just a generalization of DISTINCT
On Feb 20, Oleksandr Byelkin wrote:
> revision-id: 4c263b6d30b (mariadb-10.5.0-92-g4c263b6d30b)
> parent(s): 6f2e2285291
> author: Oleksandr Byelkin <sanja(a)mariadb.com>
> committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> timestamp: 2020-01-27 21:56:02 +0100
> message:
>
> MDEV-20632: Recursive CTE cycle detection using CYCLE clause
>
> Added CYCLE clause to recursive CTE.
>
> diff --git a/sql/item.h b/sql/item.h
> index c1f68a4f942..2cbaf880e00 100644
> --- a/sql/item.h
> +++ b/sql/item.h
> @@ -622,6 +622,13 @@ class st_select_lex_unit;
> class Item_func_not;
> class Item_splocal;
>
> +/* Item::common_flags */
> +/* Indicates that name of this Item autogenerated or set by user */
> +#define IS_AUTO_GENETATED_NAME 1
typo ^^^
> +/* Inticates that this item is in CYCLE clause of WITH */
typo ^^^
> +#define IS_IN_WITH_CYCLE 2
> +
> +
> /**
> String_copier that sends Item specific warnings.
> */
> diff --git a/sql/sql_cte.cc b/sql/sql_cte.cc
> index 5b3d3108da5..971844120bc 100644
> --- a/sql/sql_cte.cc
> +++ b/sql/sql_cte.cc
> @@ -982,6 +982,38 @@ With_element::rename_columns_of_derived_unit(THD *thd,
are you sure what you're doing below can still be called
"rename_columns_of_derived_unit" ?
> else
> make_valid_column_names(thd, select->item_list);
>
> + if (cycle_list)
> + {
> + List_iterator_fast<Item> it(select->item_list);
> + List_iterator_fast<LEX_CSTRING> nm(*cycle_list);
> + List_iterator_fast<LEX_CSTRING> nm_check(*cycle_list);
> + DBUG_ASSERT(cycle_list->elements != 0);
> + while (LEX_CSTRING *name= nm++)
> + {
> + Item *item;
> + // unique check
> + LEX_CSTRING *check;
> + nm_check.rewind();
> + while ((check= nm_check++) && check != name)
> + {
> + if (check->length == name->length &&
> + strncmp(check->str, name->str, name->length) == 0)
> + {
> + my_error(ER_DUP_FIELDNAME, MYF(0), check->str);
> + return true;
> + }
> + }
> + while ((item= it++) &&
> + (item->name.length != name->length ||
> + strncmp(item->name.str, name->str, name->length) != 0));
> + if (item == NULL)
> + {
> + my_error(ER_BAD_FIELD_ERROR, MYF(0), name->str, "CYCLE clause");
> + return true;
> + }
this pattern is used so often that we might want to have
a helper, like, List_iterator_fast<>::find()
not in this commit, though.
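Roughly what I mean, as a quick standalone sketch (the real iterator lives in
sql/sql_list.h; the find() signature below is only an illustration, not an
existing API):

#include <cstddef>

// Minimal stand-in for List_iterator_fast, just enough to show the shape of
// a find() helper. Not the real sql_list.h code.
template <class T> struct List_iterator_fast_sketch
{
  struct Node { T *data; Node *next; };
  Node *head, *cur;

  explicit List_iterator_fast_sketch(Node *h) : head(h), cur(h) {}
  void rewind() { cur= head; }
  T *next_elem()
  { T *d= cur ? cur->data : NULL; if (cur) cur= cur->next; return d; }

  // Proposed helper: return the first element for which pred() is true,
  // or NULL when the list is exhausted -- the very loop that is repeated
  // twice in the hunk above.
  template <class Pred> T *find(Pred pred)
  {
    rewind();
    T *el;
    while ((el= next_elem()) && !pred(el)) {}
    return el;
  }
};

With something like that, both the uniqueness check and the item lookup above
become a single find() call with a name-comparing predicate.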
> + item->common_flags|= IS_IN_WITH_CYCLE;
> + }
> + }
> unit->columns_are_renamed= true;
>
> return false;
> @@ -1425,6 +1457,21 @@ void With_clause::print(String *str, enum_query_type query_type)
> }
>
>
> +static void list_strlex_print(String *str, List<LEX_CSTRING> *list)
> +{
> + List_iterator_fast<LEX_CSTRING> li(*list);
> + bool first= TRUE;
> + while(LEX_CSTRING *col_name= li++)
> + {
> + if (first)
> + first= FALSE;
> + else
> + str->append(',');
> + str->append(col_name);
should be "append_identifier", shouldn't it?
add a test where a column name is a reserved word.
> + }
> +}
> +
> +
> /**
> @brief
> Print this with element
> @@ -1444,22 +1491,20 @@ void With_element::print(String *str, enum_query_type query_type)
> {
> List_iterator_fast<LEX_CSTRING> li(column_list);
> str->append('(');
> - for (LEX_CSTRING *col_name= li++; ; )
> - {
> - str->append(col_name);
> - col_name= li++;
> - if (!col_name)
> - {
> - str->append(')');
> - break;
> - }
> - str->append(',');
> - }
> + list_strlex_print(str, &column_list);
any other code that could make use of your list_strlex_print ?
> + str->append(')');
> }
> - str->append(STRING_WITH_LEN(" as "));
> - str->append('(');
> + str->append(STRING_WITH_LEN(" as ("));
> spec->print(str, query_type);
> str->append(')');
> +
> + if (cycle_list)
> + {
> + DBUG_ASSERT(cycle_list->elements != 0);
> + str->append(STRING_WITH_LEN(" CYCLE ("));
> + list_strlex_print(str, cycle_list);
> + str->append(") ");
> + }
> }
>
>
> diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
> index c115d9352aa..0d60ab8579f 100644
> --- a/sql/sql_yacc.yy
> +++ b/sql/sql_yacc.yy
> @@ -14704,9 +14704,30 @@ with_list_element:
> if (elem->set_unparsed_spec(thd, spec_start, $6.pos(),
> spec_start - query_start))
> MYSQL_YYABORT;
> + if ($7)
> + {
> + elem->set_cycle_list($7);
> + }
> }
> ;
>
> +opt_cycle:
> + /* empty */
> + { $$= NULL; }
> + |
> + CYCLE_SYM
> + {
> + if (!Lex->curr_with_clause->with_recursive)
> + {
> + thd->parse_error(ER_SYNTAX_ERROR, $1.pos());
> + }
> + }
> + '(' with_column_list ')'
Where did you see that the standard requires parentheses here?
> + {
> + $$= $4;
> + }
> + ;
> +
>
> opt_with_column_list:
> /* empty */
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hello Igor,
> @spetrunia: your "cleanup" of commit e637355156cb28388a291b0e3a5e9ee863b2854d
> actually changes the architecture: you forced the pushed filter to be the
> same as the filter of SQL layer.
I'm not sure I understand that.
I assume the layers are:
1. SQL layer
2. handler level (generic, with ha_xxx methods)
3. DS-MRR layer
4. The storage engine internals
handler::pushed_rowid_filter is at level 2.
Its value matches the rowid filter that has been pushed to level 4.
What the code does after my fix:
// Take the filter from level 2, put it to level 3 (DS-MRR)
rowid_filter= h_arg->pushed_rowid_filter;
// Inform level 4 that it doesn't need to do filtering
// (This also affects level 2 as handler::pushed_rowid_filter will be
// cleared)
h_arg->cancel_pushed_rowid_filter();
I see a slight problem here: handler->pushed_rowid_filter will be NULL
while the storage engine + MRR (levels 2, 3, 4 together) will actually still be
checking the rowid filter.
But this seems to be unavoidable in the current design (and it was also the case
before my cleanup patch).
> Although they are the same now for each engine where filters are allowed
> this might be not the case in future development.
I'm not sure what scenario you have in mind, could you please elaborate?
> I agree that accessing filter from the handler layer is not a clean solution,
> but in this case the owner of SQL layer filter must be the TABLE structure.
> Summing up: if we decide not to use a pushed filter in the engine we have to
> take the filter from SQL layer.
So, you're suggesting that taking the filter from handler::pushed_rowid_filter
is worse than taking it from table->reginfo.join_tab ?
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Re: [Maria-developers] MDEV-21580: Allow packed sort keys in sort buffer
by Sergey Petrunia 20 Feb '20
Hi Varun,
Please find some input below.
> commit f75829eebe96db55508cbc03c967e1c340da0cfc
> Author: Varun Gupta <varun.gupta(a)mariadb.com>
> Date: Fri Feb 7 02:30:06 2020 +0530
>
> MDEV-21580: Allow packed sort keys in sort buffer
>
> This task deals with packing the sort key inside the sort buffer, which would
> lead to efficient usage of the memory allocated for the sort buffer.
>
> The changes brought by this feature are
> 1) Sort buffers would have sort keys of variable length
> 2) The format for sort keys inside the sort buffer would look like
> |<sort_length><null_byte><key_part1><null_byte><key_part2>.......|
> sort_length is the extra bytes that are required to store the variable
> length of a sort key.
> 3) When packing of sort key is done we store the ORIGINAL VALUES inside
> the sort buffer and not the STRXFRM form (mem-comparable sort keys).
> 4) Special comparison function packed_keys_comparison() is introduced
> to compare 2 sort keys.
>
> diff --git a/mysql-test/main/mdev21580.result b/mysql-test/main/mdev21580.result
> new file mode 100644
> index 00000000000..c504b79d52f
> --- /dev/null
> +++ b/mysql-test/main/mdev21580.result
> @@ -0,0 +1,6427 @@
> +create table t1(a int);
> diff --git a/mysql-test/main/mdev21580.test b/mysql-test/main/mdev21580.test
This test is too long, I assume it will be gone after the "diff solution"
is implemented.
> diff --git a/sql/filesort.cc b/sql/filesort.cc
> index 763f9f59246..1cdc8e7af00 100644
> --- a/sql/filesort.cc
> +++ b/sql/filesort.cc
> @@ -215,16 +219,14 @@ SORT_INFO *filesort(THD *thd, TABLE *table, Filesort *filesort,
> error= 1;
> sort->found_rows= HA_POS_ERROR;
>
> - param.init_for_filesort(sortlength(thd, filesort->sortorder, s_length,
> - &multi_byte_charset),
> - table, max_rows, filesort->sort_positions);
> + param.sort_keys= sort_keys;
> + uint sort_len= sortlength(thd, sort_keys, &multi_byte_charset,
> + &allow_packing_for_sortkeys);
psergey:
I think this doesn't work as intended. Consider the two tests:
create table t2 (a text collate utf8_unicode_ci, b varchar(32));
insert into t2 select a,a from ten;
Q1: select * from t2 order by a;
Q2: select * from t2 order by a, b;
When debugging Q1, I can see allow_packing_for_sortkeys=false after the sortlength()
call. This is ok, I think.
When debugging Q2, I can see allow_packing_for_sortkeys=true after the
sortlength() call. This is wrong.
> @@ -491,12 +503,20 @@ uint Filesort::make_sortorder(THD *thd, JOIN *join, table_map first_table_bit)
> for (ord = order; ord; ord= ord->next)
> count++;
> if (!sortorder)
> - sortorder= (SORT_FIELD*) thd->alloc(sizeof(SORT_FIELD) * (count + 1));
> + sortorder= (SORT_FIELD*) thd->alloc(sizeof(SORT_FIELD) * count);
> + void *rawmem= thd->alloc(sizeof(Sort_keys));
> pos= sort= sortorder;
>
> if (!pos)
> DBUG_RETURN(0);
>
> + Sort_keys_array sort_keys_array(sortorder, count);
> + Sort_keys* sort_keys= new (rawmem) Sort_keys(sort_keys_array);
> +
psergey:
Why not inherit the class from Sql_alloc and use its operator new?
The above is probably correct, but it makes me wonder what it is doing every
time I encounter this piece of code.
> + if (!sort_keys)
> + DBUG_RETURN(0);
psergey:
You're checking this (which can't fail) but not checking the result of
the thd->alloc() call above (which can fail)?
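Just to illustrate the Sql_alloc route, a rough standalone sketch
(MEM_ROOT_sketch / Sql_alloc_sketch / Sort_keys_sketch are stand-ins, not the
real sql_alloc.h classes):

#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-in for MEM_ROOT: just enough to show the allocation path.
struct MEM_ROOT_sketch { void *alloc(size_t s) { return std::malloc(s); } };

// Stand-in for Sql_alloc: a mem_root-aware operator new.
struct Sql_alloc_sketch
{
  static void *operator new(size_t size, MEM_ROOT_sketch *root) noexcept
  { return root->alloc(size); }
  static void operator delete(void *, MEM_ROOT_sketch *) noexcept {}
  static void operator delete(void *) noexcept {}
};

// If Sort_keys inherited this, make_sortorder() could simply do
//   sort_keys= new (thd->mem_root) Sort_keys(sort_keys_array);
//   if (!sort_keys) DBUG_RETURN(0);
// and the NULL check would guard the allocation that can actually fail,
// instead of a placement-new over an unchecked thd->alloc() result.
struct Sort_keys_sketch : public Sql_alloc_sketch
{
  explicit Sort_keys_sketch(unsigned n) : n_keys(n) {}
  unsigned n_keys;
};

int main()
{
  MEM_ROOT_sketch root;
  Sort_keys_sketch *sk= new (&root) Sort_keys_sketch(3);
  return sk ? 0 : 1;
}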
> +
> + pos= sort_keys->begin();
> for (ord= order; ord; ord= ord->next, pos++)
> {
> Item *first= ord->item[0];
...
> +/*
> + @brief
> + Comparison function to compare two packed sort keys
> +
> + @param sort_param cmp argument
> + @param a_ptr packed sort key
> + @param b_ptr packed sort key
> +
> + @retval
> + >0 key a_ptr greater than b_ptr
> + =0 key a_ptr equal to b_ptr
> + <0 key a_ptr less than b_ptr
> +
> +*/
> +
psergey: function names typically start with a verb. Can this one follow the
convention?
> +int packed_keys_comparison(void *sort_param,
> + unsigned char **a_ptr, unsigned char **b_ptr)
> +{
> + int retval= 0;
> + size_t a_len, b_len;
> + Sort_param *param= (Sort_param*)sort_param;
> + Sort_keys *sort_keys= param->sort_keys;
> + uchar *a= *a_ptr;
> + uchar *b= *b_ptr;
> +
...
> @@ -772,6 +777,21 @@ static int rr_cmp(uchar *a,uchar *b)
> #endif
> }
>
> +
> +/**
> + Copy (unpack) values appended to sorted fields from a buffer back to
> + their regular positions specified by the Field::ptr pointers.
> +
> + @param addon_field Array of descriptors for appended fields
> + @param buff Buffer which to unpack the value from
> +
> + @note
> + The function is supposed to be used only as a callback function
> + when getting field values for the sorted result set.
> +
> + @return
> + void.
psergey: the above two lines do not have any value, please remove :)
> +*/
> template<bool Packed_addon_fields>
> inline void SORT_INFO::unpack_addon_fields(uchar *buff)
> {
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Re: [Maria-developers] 8790c095ab0: MDEV-21704 Add a new JSON field "version_id" into mysql.global_priv.priv
by Sergei Golubchik 20 Feb '20
Hi, Alexander!
Looks ok, just one test-related comment.
On Feb 20, Alexander Barkov wrote:
> revision-id: 8790c095ab0 (mariadb-10.5.0-225-g8790c095ab0)
> parent(s): 2c34315df6e
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2020-02-14 21:56:12 +0400
> message:
>
> MDEV-21704 Add a new JSON field "version_id" into mysql.global_priv.priv
>
> +--log-error=$MYSQLTEST_VARDIR/tmp/system_mysql_db_error_log.err
...
> +CREATE TABLE error_log (msg TEXT);
> +--disable_query_log
> +--eval LOAD DATA INFILE "$MYSQLTEST_VARDIR/tmp/system_mysql_db_error_log.err" INTO TABLE error_log;
> +--enable_query_log
> +
> +# On some platforms hours less than 10 can be logged without the leading zero:
> +# '2020-02-13 8:16:56' rather than
> +# '2020-02-13 08:16:56'
> +# Hence the space character in the hour part of the regular expression.
> +# Also, replace '\r' to '', so the output is equal on Linux and Windows.
> +
> +SELECT
> + REPLACE(
> + REGEXP_REPLACE(msg,
> + '^[0-9]{4}-[0-9]{2}-[0-9]{2} [ 0-9]{2}:[0-9]{2}:[0-9]{2}',
> + 'YYYY-MM-DD hh:mm:ss'),
> + '\r',''
> + ) AS msg
> +FROM error_log WHERE msg LIKE '%user%entry%';
instead of --log-error and LOAD DATA with REGEXP_REPLACE, please use
something like
--let SEARCH_FILE=$MYSQLTEST_VARDIR/log/mysqld.1.err
--let SEARCH_PATTERN="'user' entry 'bad_.*has a wrong"
--source include/search_pattern_in_file.inc
this will print that there were 4 matches. Alternatively you can use
four very specific SEARCH_PATTERN, one for each expected error.
It's easier than what you've used, keeps the error log in one file, and
doesn't run unnecessary sql on the server (otherwise debugging quickly
becomes very annoying, grr, replication tests).
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] bd98ef106c8: MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
by Sergei Golubchik 19 Feb '20
Hi, Aleksey!
1.
It should be possible to enable/disable auto-creation.
For example, CREATE TABLE ... PARTITION BY SYSTEM_TIME ... PARTITIONS AUTO;
this solves a few problems at once:
* a user explicitly tells when auto-creation should work
* you don't need to worry "if this name is already occupied"
* you can define that these partitions can never overflow (for INTERVAL)
* if you know that AUTO partitions never overflow, you can keep the old
behavior for ALTER TABLE ADD PARTITION.
2.
A separate thread is an interesting solution. It nicely avoids lots of
problems. Still:
* it's asynchronous, all tests have to take it into account
* it's racy. "low probability" or not, with our number of users they'll
hit it and will rightfully report it as a bug
* if LOCK TABLE prevents it, partitions can overflow
But I think these problems are easier to solve than those we'll face if
auto-creation happens in the main connection thread.
So, let's keep your approach with a thread.
But instead of going through the parser and mysql_execute_command,
create a function that takes a TABLE or a TABLE_SHARE and adds a
partition there. That will fix the "racy" part. This function can be
called from a new thread.
As for the LOCK TABLES - if you're under LOCK TABLES, you can simply
call that function (to add a partition) directly at the end of the main
INSERT/UPDATE/DELETE statement. That solves the last problem: LOCK
TABLES won't prevent auto-creation.
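To be concrete about the shape I mean, something like this (every name below
is invented, nothing of the sort exists in the tree yet):

// Invented names throughout; only the call structure matters.
class THD;
struct TABLE;

// Adds one empty history partition directly to the given open table,
// without going through the parser or mysql_execute_command().
static bool vers_add_auto_partition(THD *, TABLE *)
{
  /* the real code would extend the table's partition_info and do a
     fast ADD PARTITION; stubbed out here */
  return false;
}

// Normal path: called from the background thread.
static void auto_partition_thread_body(THD *thd, TABLE *table)
{
  (void) vers_add_auto_partition(thd, table);
}

// LOCK TABLES path: called at the end of the INSERT/UPDATE/DELETE that
// noticed that only E empty partitions are left.
static bool finish_dml_under_lock_tables(THD *thd, TABLE *table,
                                         bool need_new_partition)
{
  return need_new_partition ? vers_add_auto_partition(thd, table) : false;
}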
A couple of style comments:
1. create a one-liner make_partition_name (or something) with:
sprintf(move_ptr, "p%u", i)
and use it both in create_default_partition_names() and in your new
code. Just to make sure partition names are always generated uniformly
by the same function, and if we ever want to change it to "pp%u" it
only has to be done in one place (a quick sketch follows after point 2).
2. don't overload operators, use, say,
bool greater_than(size_t seconds)
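For point 1, the kind of one-liner I mean (name and signature are just a
suggestion):

#include <cstdio>

// The one place that decides what auto-generated partition names look like;
// create_default_partition_names() and the auto-creation code would both
// call it, so changing "p%u" to "pp%u" happens here and nowhere else.
static char *make_partition_name(char *buf, unsigned i)
{
  sprintf(buf, "p%u", i);
  return buf;
}

int main()
{
  char name[16];
  printf("%s\n", make_partition_name(name, 7));  // prints "p7"
  return 0;
}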
But overall it's pretty good. A nice idea with a separate thread.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
On Nov 07, Aleksey Midenkov wrote:
> revision-id: bd98ef106c8 (mariadb-10.4.7-55-gbd98ef106c8)
> parent(s): 6d1fa01e8f1
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-08-19 12:01:14 +0300
> message:
>
> MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
>
> When there are E empty partitions left, auto-create N new empty
> partitions.
>
> This scheme must not allow partition overflow. I.e. E-fill time must
> not exceed N-creation time. This means that low values for INTERVAL
> and LIMIT must not be allowed for auto-creation. In case when overflow
> is detected there is no need to do anything special: a warning will be
> issued and the user will run manual rebuild to redistribute records
> correctly. This is important because automatic ADD must be done fast,
> without forced rebuild, by the obvious reason.
>
> Initial version implements hard-coded values of 1 for E and N. As well
> as auto-creation threshold MinInterval = 1 hour, MinLimit = 1000.
>
> The name for newly added partition will be first chosen as "pX", where
> X is partition number and "p" is hard-coded name prefix. If this name
> is already occupied, the X will be incremented until the resulting
> name will be free to use.
>
> Auto-creation mechanism is applied to every table having LIMIT or
> INTERVAL clause. Note that there is no much sense in specifying
> explicit partition list in this case and this is covered by
> MDEV-19903. The syntax to explicitly turn it on/off as well as
> user-defined values for E, N and name prefix is subject for further
> discussion and feature requests.
>
> ALTER TABLE ADD PARTITION is now always fast. If there some history
> partition overflow occurs manual ALTER TABLE REBUILD PARTITION is
> needed.
Hi,
Looking for some assistance in getting a patch committed to resolve MDEV-19603.
Since Galera was upgraded from 3 to 4, the CMake bits include a hardcoded libdl,
which broke the build of 10.4+ on OpenBSD.
Re: [Maria-developers] [Commits] a906aae: MDEV-21610 Different query results from 10.4.11 to 10.4.12
by Sergey Petrunia 14 Feb '20
Hello Igor,
I've found a deficiency in the scheme used by the patch: it doesn't work if the
join buffer is re-filled multiple times.
On the first execution, everything works as intended. dsmrr_init() executes
these lines:
rowid_filter= h_arg->pushed_rowid_filter;
h_arg->cancel_pushed_rowid_filter();
then I can see that Mrr_ordered_rndpos_reader::refill_from_index_reader uses
the filter.
The MRR scan continues until it finishes. Then, the SQL layer fills the join buffer
again and calls multi_range_read_init() again, which calls dsmrr_init().
And here, h_arg->pushed_rowid_filter==NULL (as we've cleared it previously),
so the rowid filter is not used for the second MRR scan anymore.
Example that I used for debugging (maybe it's larger than necessary):
create table t10 (
pk int primary key,
a int,
b int,
filler char(100),
key(a),
key(b)
);
insert into t10 select
A.a + 1000 *B.a,
A.a + 1000 *B.a,
A.a + 1000 *B.a,
'filler-data=FILLER=DATA'
from one_k A, one_k B;
create table t11 (a int);
insert into t11 select a from one_k where a < 200;
set optimizer_switch='mrr=on';
set join_cache_level=6;
set join_buffer_size=128;
MariaDB [test]> explain select * from t11, t10 where t10.a=t11.a and t10.b < 300;
+------+-------------+-------+------------+---------------+------+---------+------------+--------+-----------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------------+---------------+------+---------+------------+--------+-----------------------------------------------------------------------------------------+
| 1 | SIMPLE | t11 | ALL | NULL | NULL | NULL | NULL | 200 | Using where |
| 1 | SIMPLE | t10 | ref|filter | a,b | a|b | 5|5 | test.t11.a | 1 (0%) | Using where; Using join buffer (flat, BKA join); Rowid-ordered scan; Using rowid filter |
+------+-------------+-------+------------+---------------+------+---------+------------+--------+-----------------------------------------------------------------------------------------+
select * from t11, t10 where t10.a=t11.a and t10.b < 300;
Ideas about the solution:
1. DS-MRR code should put the rowid filter back into the 'h_arg' handler when the scan
finishes (i.e. returns HA_ERR_END_OF_FILE)
2. DS-MRR code should store the rowid filter internally (and clear it when
dsmrr_close() is called).
I haven't investigated which is better.
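To show why keeping the filter inside DS-MRR (option 2) survives the refills,
here is a toy standalone model of the handoff (stub classes, not the real
handler / DsMrr_impl code):

#include <cassert>
#include <cstddef>

struct Rowid_filter_stub {};

// Stand-in for the handler: owns the pushed filter until DS-MRR takes it.
struct Handler_stub
{
  Rowid_filter_stub *pushed_rowid_filter= NULL;
  void cancel_pushed_rowid_filter() { pushed_rowid_filter= NULL; }
};

// Stand-in for DsMrr_impl: keeps the filter for the whole MRR scan object.
struct DsMrr_stub
{
  explicit DsMrr_stub(Handler_stub *h_arg) : h(h_arg), rowid_filter(NULL) {}

  Handler_stub *h;
  Rowid_filter_stub *rowid_filter;

  // Called on every join-buffer refill (from multi_range_read_init()).
  void dsmrr_init()
  {
    // Take the filter from the handler only when it is still there; on later
    // refills the handler's pointer is already NULL, so keep the saved one.
    if (h->pushed_rowid_filter)
    {
      rowid_filter= h->pushed_rowid_filter;
      h->cancel_pushed_rowid_filter();
    }
  }
  // Drop the filter only when the scan object is really closed.
  void dsmrr_close() { rowid_filter= NULL; }
};

int main()
{
  Handler_stub h;
  Rowid_filter_stub f;
  h.pushed_rowid_filter= &f;

  DsMrr_stub mrr(&h);
  mrr.dsmrr_init();                  // first refill: takes the filter
  assert(mrr.rowid_filter == &f);
  mrr.dsmrr_init();                  // second refill: the filter is still here
  assert(mrr.rowid_filter == &f);
  mrr.dsmrr_close();
  return 0;
}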
On Thu, Feb 13, 2020 at 10:55:56PM -0800, IgorBabaev wrote:
> revision-id: a906aaee26a7be57fe2db62214179476ec124486 (mariadb-10.4.11-38-ga906aae)
> parent(s): 7ea413ac2d80c7f03d1dbad90ac30ecddd8b2835
> author: Igor Babaev
> committer: Igor Babaev
> timestamp: 2020-02-13 22:55:56 -0800
> message:
>
> MDEV-21610 Different query results from 10.4.11 to 10.4.12
>
> This patch fixes the following defects/bugs.
> 1. If BKA[H] algorithm was used to join a table for which the optimizer
> had decided to employ a rowid filter the filter actually was not built.
> 2. The patch for the bug MDEV-21356 that added the code canceling pushing
> rowid filter into an engine for the table joined with employment of
> BKA[H] and MRR was not quite correct for Innodb engine because this
> cancellation was done after InnoDB code had already bound the the pushed
> filter to internal InnoDB structures.
>
> ---
> mysql-test/main/rowid_filter_innodb.result | 333 +++++++++++++++++++++++++++++
> mysql-test/main/rowid_filter_innodb.test | 153 +++++++++++++
> sql/multi_range_read.cc | 39 ++--
> sql/multi_range_read.h | 5 +-
> sql/opt_range.cc | 3 +-
> sql/sql_join_cache.cc | 2 +
> 6 files changed, 515 insertions(+), 20 deletions(-)
>
> diff --git a/mysql-test/main/rowid_filter_innodb.result b/mysql-test/main/rowid_filter_innodb.result
> index c59b95b..9423fb1 100644
> --- a/mysql-test/main/rowid_filter_innodb.result
> +++ b/mysql-test/main/rowid_filter_innodb.result
> @@ -2522,3 +2522,336 @@ id select_type table type possible_keys key key_len ref rows r_rows filtered r_f
> 1 SIMPLE t1 index a,b PRIMARY 4 NULL 3008 3008.00 1.36 0.00 Using where
> DROP TABLE t1;
> SET global innodb_stats_persistent= @stats.save;
> +#
> +# MDEV-21610: Using rowid filter with BKA+MRR
> +#
> +set @stats.save= @@innodb_stats_persistent;
> +set global innodb_stats_persistent=on;
> +CREATE TABLE acli (
> +id bigint(20) NOT NULL,
> +rid varchar(255) NOT NULL,
> +tp smallint(6) NOT NULL DEFAULT 0,
> +PRIMARY KEY (id),
> +KEY acli_rid (rid),
> +KEY acli_tp (tp)
> +) ENGINE=InnoDB DEFAULT CHARSET=utf8;
> +insert into acli(id,rid,tp) values
> +(184929059698905997,'ABABABABABABABABAB',103),
> +(184929059698905998,'ABABABABABABABABAB',121),
> +(283586039035985921,'00000000000000000000000000000000',103),
> +(2216474704108064678,'020BED6D07B741CE9B10AB2200FEF1DF',103),
> +(2216474704108064679,'020BED6D07B741CE9B10AB2200FEF1DF',121),
> +(3080602882609775593,'B5FCC8C7111E4E3CBC21AAF5012F59C2',103),
> +(3080602882609775594,'B5FCC8C7111E4E3CBC21AAF5012F59C2',121),
> +(3080602882609776594,'B5FCC8C7111E4E3CBC21AAF5012F59C2',121),
> +(3080602882609777595,'B5FCC8C7111E4E3CBC21AAF5012F59C2',121),
> +(4269412446747236214,'SCSCSCSCSCSCSCSC',103),
> +(4269412446747236215,'SCSCSCSCSCSCSCSC',121),
> +(6341490487802728356,'6072D47E513F4A4794BBAB2200FDB67D',103),
> +(6341490487802728357,'6072D47E513F4A4794BBAB2200FDB67D',121);
> +CREATE TABLE acei (
> +id bigint(20) NOT NULL,
> +aclid bigint(20) NOT NULL DEFAULT 0,
> +atp smallint(6) NOT NULL DEFAULT 0,
> +clus smallint(6) NOT NULL DEFAULT 0,
> +PRIMARY KEY (id),
> +KEY acei_aclid (aclid),
> +KEY acei_clus (clus)
> +) ENGINE=InnoDB DEFAULT CHARSET=utf8;
> +insert into acei(id,aclid,atp,clus) values
> +(184929059698905999,184929059698905997,0,1),
> +(184929059698906000,184929059698905997,0,1),
> +(184929059698906001,184929059698905997,1,1),
> +(184929059698906002,184929059698905998,1,1),
> +(283586039035985922,283586039035985921,1,1),
> +(2216474704108064684,2216474704108064678,0,1),
> +(2216474704108064685,2216474704108064678,0,1),
> +(2216474704108064686,2216474704108064678,1,1),
> +(2216474704108064687,2216474704108064679,1,1),
> +(3080602882609775595,3080602882609775593,0,1),
> +(3080602882609775596,3080602882609775593,0,1),
> +(3080602882609775597,3080602882609775593,1,1),
> +(3080602882609775598,3080602882609775594,1,1),
> +(3080602882609776595,3080602882609776594,1,1),
> +(3080602882609777596,3080602882609777595,1,1),
> +(4269412446747236216,4269412446747236214,0,1),
> +(4269412446747236217,4269412446747236214,0,1),
> +(4269412446747236218,4269412446747236214,1,1),
> +(4269412446747236219,4269412446747236215,1,1),
> +(6341490487802728358,6341490487802728356,0,1),
> +(6341490487802728359,6341490487802728356,0,1),
> +(6341490487802728360,6341490487802728356,1,1),
> +(6341490487802728361,6341490487802728357,1,1);
> +CREATE TABLE filt (
> +id bigint(20) NOT NULL,
> +aceid bigint(20) NOT NULL DEFAULT 0,
> +clid smallint(6) NOT NULL DEFAULT 0,
> +fh bigint(20) NOT NULL DEFAULT 0,
> +PRIMARY KEY (id),
> +KEY filt_aceid (aceid),
> +KEY filt_clid (clid),
> +KEY filt_fh (fh)
> +) ENGINE=InnoDB DEFAULT CHARSET=utf8;
> +insert into filt(id,aceid,clid,fh) values
> +(184929059698905999,184929059698905999,1,8948400944397203540),
> +(184929059698906000,184929059698906000,1,-3516039679025944536),
> +(184929059698906001,184929059698906001,1,-3516039679025944536),
> +(184929059698906002,184929059698906001,1,2965370193075218252),
> +(184929059698906003,184929059698906001,1,8948400944397203540),
> +(184929059698906004,184929059698906002,1,2478709353550777738),
> +(283586039035985922,283586039035985922,1,5902600816362013271),
> +(2216474704108064686,2216474704108064684,1,8948400944397203540),
> +(2216474704108064687,2216474704108064685,1,-7244708939311117030),
> +(2216474704108064688,2216474704108064686,1,-7244708939311117030),
> +(2216474704108064689,2216474704108064686,1,7489060986210282479),
> +(2216474704108064690,2216474704108064686,1,8948400944397203540),
> +(2216474704108064691,2216474704108064687,1,-3575268945274980038),
> +(3080602882609775595,3080602882609775595,1,8948400944397203540),
> +(3080602882609775596,3080602882609775596,1,-5420422472375069774),
> +(3080602882609775597,3080602882609775597,1,-5420422472375069774),
> +(3080602882609775598,3080602882609775597,1,8518228073041491534),
> +(3080602882609775599,3080602882609775597,1,8948400944397203540),
> +(3080602882609775600,3080602882609775598,1,6311439873746261694),
> +(3080602882609775601,3080602882609775598,1,6311439873746261694),
> +(3080602882609776595,3080602882609776595,1,-661101805245999843),
> +(3080602882609777596,3080602882609777596,1,-661101805245999843),
> +(3080602882609777597,3080602882609777596,1,2216865386202464067),
> +(4269412446747236216,4269412446747236216,1,8948400944397203540),
> +(4269412446747236217,4269412446747236217,1,-1143096194892676000),
> +(4269412446747236218,4269412446747236218,1,-1143096194892676000),
> +(4269412446747236219,4269412446747236218,1,5313391811364818290),
> +(4269412446747236220,4269412446747236218,1,8948400944397203540),
> +(4269412446747236221,4269412446747236219,1,7624499822621753835),
> +(6341490487802728358,6341490487802728358,1,8948400944397203540),
> +(6341490487802728359,6341490487802728359,1,8141092449587136068),
> +(6341490487802728360,6341490487802728360,1,8141092449587136068),
> +(6341490487802728361,6341490487802728360,1,1291319099896431785),
> +(6341490487802728362,6341490487802728360,1,8948400944397203540),
> +(6341490487802728363,6341490487802728361,1,6701841652906431497);
> +analyze table filt, acei, acli;
> +Table Op Msg_type Msg_text
> +test.filt analyze status Engine-independent statistics collected
> +test.filt analyze status OK
> +test.acei analyze status Engine-independent statistics collected
> +test.acei analyze status OK
> +test.acli analyze status Engine-independent statistics collected
> +test.acli analyze status OK
> +set @save_optimizer_switch=@@optimizer_switch;
> +set @save_join_cache_level=@@join_cache_level;
> +set optimizer_switch='mrr=off';
> +set join_cache_level=2;
> +set statement optimizer_switch='rowid_filter=off' for explain extended select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id select_type table type possible_keys key key_len ref rows filtered Extra
> +1 SIMPLE t index_merge PRIMARY,acli_rid,acli_tp acli_tp,acli_rid 2,767 NULL 2 100.00 Using intersect(acli_tp,acli_rid); Using where; Using index
> +1 SIMPLE a ref PRIMARY,acei_aclid acei_aclid 8 test.t.id 1 100.00 Using where
> +1 SIMPLE fi ref filt_aceid,filt_fh filt_aceid 8 test.a.id 1 17.14 Using where
> +Warnings:
> +Note 1003 select `test`.`t`.`id` AS `id`,`test`.`fi`.`id` AS `id`,`test`.`fi`.`aceid` AS `aceid`,`test`.`fi`.`clid` AS `clid`,`test`.`fi`.`fh` AS `fh` from `test`.`acli` `t` join `test`.`acei` `a` join `test`.`filt` `fi` where `test`.`t`.`tp` = 121 and `test`.`a`.`atp` = 1 and `test`.`fi`.`aceid` = `test`.`a`.`id` and `test`.`a`.`aclid` = `test`.`t`.`id` and `test`.`t`.`rid` = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and `test`.`fi`.`fh` in (6311439873746261694,-397087483897438286,8518228073041491534,-5420422472375069774)
> +set statement optimizer_switch='rowid_filter=off' for select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id id aceid clid fh
> +3080602882609775594 3080602882609775600 3080602882609775598 1 6311439873746261694
> +3080602882609775594 3080602882609775601 3080602882609775598 1 6311439873746261694
> +set statement optimizer_switch='rowid_filter=on' for explain extended select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id select_type table type possible_keys key key_len ref rows filtered Extra
> +1 SIMPLE t index_merge PRIMARY,acli_rid,acli_tp acli_tp,acli_rid 2,767 NULL 2 100.00 Using intersect(acli_tp,acli_rid); Using where; Using index
> +1 SIMPLE a ref PRIMARY,acei_aclid acei_aclid 8 test.t.id 1 100.00 Using where
> +1 SIMPLE fi ref|filter filt_aceid,filt_fh filt_aceid|filt_fh 8|8 test.a.id 1 (17%) 17.14 Using where; Using rowid filter
> +Warnings:
> +Note 1003 select `test`.`t`.`id` AS `id`,`test`.`fi`.`id` AS `id`,`test`.`fi`.`aceid` AS `aceid`,`test`.`fi`.`clid` AS `clid`,`test`.`fi`.`fh` AS `fh` from `test`.`acli` `t` join `test`.`acei` `a` join `test`.`filt` `fi` where `test`.`t`.`tp` = 121 and `test`.`a`.`atp` = 1 and `test`.`fi`.`aceid` = `test`.`a`.`id` and `test`.`a`.`aclid` = `test`.`t`.`id` and `test`.`t`.`rid` = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and `test`.`fi`.`fh` in (6311439873746261694,-397087483897438286,8518228073041491534,-5420422472375069774)
> +set statement optimizer_switch='rowid_filter=on' for select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id id aceid clid fh
> +3080602882609775594 3080602882609775600 3080602882609775598 1 6311439873746261694
> +3080602882609775594 3080602882609775601 3080602882609775598 1 6311439873746261694
> +set optimizer_switch='mrr=on';
> +set join_cache_level=6;
> +set statement optimizer_switch='rowid_filter=off' for explain extended select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id select_type table type possible_keys key key_len ref rows filtered Extra
> +1 SIMPLE t index_merge PRIMARY,acli_rid,acli_tp acli_tp,acli_rid 2,767 NULL 2 100.00 Using intersect(acli_tp,acli_rid); Using where; Using index
> +1 SIMPLE a ref PRIMARY,acei_aclid acei_aclid 8 test.t.id 1 100.00 Using where; Using join buffer (flat, BKA join); Rowid-ordered scan
> +1 SIMPLE fi ref filt_aceid,filt_fh filt_aceid 8 test.a.id 1 17.14 Using where; Using join buffer (incremental, BKA join); Rowid-ordered scan
> +Warnings:
> +Note 1003 select `test`.`t`.`id` AS `id`,`test`.`fi`.`id` AS `id`,`test`.`fi`.`aceid` AS `aceid`,`test`.`fi`.`clid` AS `clid`,`test`.`fi`.`fh` AS `fh` from `test`.`acli` `t` join `test`.`acei` `a` join `test`.`filt` `fi` where `test`.`t`.`tp` = 121 and `test`.`a`.`atp` = 1 and `test`.`fi`.`aceid` = `test`.`a`.`id` and `test`.`a`.`aclid` = `test`.`t`.`id` and `test`.`t`.`rid` = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and `test`.`fi`.`fh` in (6311439873746261694,-397087483897438286,8518228073041491534,-5420422472375069774)
> +set statement optimizer_switch='rowid_filter=off' for select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id id aceid clid fh
> +3080602882609775594 3080602882609775600 3080602882609775598 1 6311439873746261694
> +3080602882609775594 3080602882609775601 3080602882609775598 1 6311439873746261694
> +set statement optimizer_switch='rowid_filter=on' for explain extended select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id select_type table type possible_keys key key_len ref rows filtered Extra
> +1 SIMPLE t index_merge PRIMARY,acli_rid,acli_tp acli_tp,acli_rid 2,767 NULL 2 100.00 Using intersect(acli_tp,acli_rid); Using where; Using index
> +1 SIMPLE a ref PRIMARY,acei_aclid acei_aclid 8 test.t.id 1 100.00 Using where; Using join buffer (flat, BKA join); Rowid-ordered scan
> +1 SIMPLE fi ref|filter filt_aceid,filt_fh filt_aceid|filt_fh 8|8 test.a.id 1 (17%) 17.14 Using where; Using join buffer (incremental, BKA join); Rowid-ordered scan; Using rowid filter
> +Warnings:
> +Note 1003 select `test`.`t`.`id` AS `id`,`test`.`fi`.`id` AS `id`,`test`.`fi`.`aceid` AS `aceid`,`test`.`fi`.`clid` AS `clid`,`test`.`fi`.`fh` AS `fh` from `test`.`acli` `t` join `test`.`acei` `a` join `test`.`filt` `fi` where `test`.`t`.`tp` = 121 and `test`.`a`.`atp` = 1 and `test`.`fi`.`aceid` = `test`.`a`.`id` and `test`.`a`.`aclid` = `test`.`t`.`id` and `test`.`t`.`rid` = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and `test`.`fi`.`fh` in (6311439873746261694,-397087483897438286,8518228073041491534,-5420422472375069774)
> +set statement optimizer_switch='rowid_filter=on' for select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +id id aceid clid fh
> +3080602882609775594 3080602882609775600 3080602882609775598 1 6311439873746261694
> +3080602882609775594 3080602882609775601 3080602882609775598 1 6311439873746261694
> +set statement optimizer_switch='rowid_filter=on' for analyze format=json select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> +inner join filt fi on a.id = fi.aceid
> +where
> +t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> +t.tp = 121 and
> +a.atp = 1 and
> +fi.fh in (6311439873746261694,-397087483897438286,
> +8518228073041491534,-5420422472375069774);
> +ANALYZE
> +{
> + "query_block": {
> + "select_id": 1,
> + "r_loops": 1,
> + "r_total_time_ms": "REPLACED",
> + "table": {
> + "table_name": "t",
> + "access_type": "index_merge",
> + "possible_keys": ["PRIMARY", "acli_rid", "acli_tp"],
> + "key_length": "2,767",
> + "index_merge": {
> + "intersect": {
> + "range": {
> + "key": "acli_tp",
> + "used_key_parts": ["tp"]
> + },
> + "range": {
> + "key": "acli_rid",
> + "used_key_parts": ["rid"]
> + }
> + }
> + },
> + "r_loops": 1,
> + "rows": 2,
> + "r_rows": 3,
> + "r_total_time_ms": "REPLACED",
> + "filtered": 100,
> + "r_filtered": 100,
> + "attached_condition": "t.tp = 121 and t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2'",
> + "using_index": true
> + },
> + "block-nl-join": {
> + "table": {
> + "table_name": "a",
> + "access_type": "ref",
> + "possible_keys": ["PRIMARY", "acei_aclid"],
> + "key": "acei_aclid",
> + "key_length": "8",
> + "used_key_parts": ["aclid"],
> + "ref": ["test.t.id"],
> + "r_loops": 1,
> + "rows": 1,
> + "r_rows": 3,
> + "r_total_time_ms": "REPLACED",
> + "filtered": 100,
> + "r_filtered": 100
> + },
> + "buffer_type": "flat",
> + "buffer_size": "8Kb",
> + "join_type": "BKA",
> + "mrr_type": "Rowid-ordered scan",
> + "attached_condition": "a.atp = 1",
> + "r_filtered": 100
> + },
> + "block-nl-join": {
> + "table": {
> + "table_name": "fi",
> + "access_type": "ref",
> + "possible_keys": ["filt_aceid", "filt_fh"],
> + "key": "filt_aceid",
> + "key_length": "8",
> + "used_key_parts": ["aceid"],
> + "ref": ["test.a.id"],
> + "rowid_filter": {
> + "range": {
> + "key": "filt_fh",
> + "used_key_parts": ["fh"]
> + },
> + "rows": 6,
> + "selectivity_pct": 17.143,
> + "r_rows": 5,
> + "r_selectivity_pct": 40,
> + "r_buffer_size": "REPLACED",
> + "r_filling_time_ms": "REPLACED"
> + },
> + "r_loops": 1,
> + "rows": 1,
> + "r_rows": 2,
> + "r_total_time_ms": "REPLACED",
> + "filtered": 17.143,
> + "r_filtered": 100
> + },
> + "buffer_type": "incremental",
> + "buffer_size": "603",
> + "join_type": "BKA",
> + "mrr_type": "Rowid-ordered scan",
> + "attached_condition": "fi.fh in (6311439873746261694,-397087483897438286,8518228073041491534,-5420422472375069774)",
> + "r_filtered": 100
> + }
> + }
> +}
> +set optimizer_switch=@save_optimizer_switch;
> +set join_cache_level=@save_join_cache_level;
> +drop table filt, acei, acli;
> +set global innodb_stats_persistent= @stats.save;
> diff --git a/mysql-test/main/rowid_filter_innodb.test b/mysql-test/main/rowid_filter_innodb.test
> index 30e0ede..74349b8 100644
> --- a/mysql-test/main/rowid_filter_innodb.test
> +++ b/mysql-test/main/rowid_filter_innodb.test
> @@ -381,3 +381,156 @@ ORDER BY pk LIMIT 1;
>
> DROP TABLE t1;
> SET global innodb_stats_persistent= @stats.save;
> +
> +--echo #
> +--echo # MDEV-21610: Using rowid filter with BKA+MRR
> +--echo #
> +
> +set @stats.save= @@innodb_stats_persistent;
> +set global innodb_stats_persistent=on;
> +
> +CREATE TABLE acli (
> + id bigint(20) NOT NULL,
> + rid varchar(255) NOT NULL,
> + tp smallint(6) NOT NULL DEFAULT 0,
> + PRIMARY KEY (id),
> + KEY acli_rid (rid),
> + KEY acli_tp (tp)
> +) ENGINE=InnoDB DEFAULT CHARSET=utf8;
> +
> +insert into acli(id,rid,tp) values
> +(184929059698905997,'ABABABABABABABABAB',103),
> +(184929059698905998,'ABABABABABABABABAB',121),
> +(283586039035985921,'00000000000000000000000000000000',103),
> +(2216474704108064678,'020BED6D07B741CE9B10AB2200FEF1DF',103),
> +(2216474704108064679,'020BED6D07B741CE9B10AB2200FEF1DF',121),
> +(3080602882609775593,'B5FCC8C7111E4E3CBC21AAF5012F59C2',103),
> +(3080602882609775594,'B5FCC8C7111E4E3CBC21AAF5012F59C2',121),
> +(3080602882609776594,'B5FCC8C7111E4E3CBC21AAF5012F59C2',121),
> +(3080602882609777595,'B5FCC8C7111E4E3CBC21AAF5012F59C2',121),
> +(4269412446747236214,'SCSCSCSCSCSCSCSC',103),
> +(4269412446747236215,'SCSCSCSCSCSCSCSC',121),
> +(6341490487802728356,'6072D47E513F4A4794BBAB2200FDB67D',103),
> +(6341490487802728357,'6072D47E513F4A4794BBAB2200FDB67D',121);
> +
> +CREATE TABLE acei (
> + id bigint(20) NOT NULL,
> + aclid bigint(20) NOT NULL DEFAULT 0,
> + atp smallint(6) NOT NULL DEFAULT 0,
> + clus smallint(6) NOT NULL DEFAULT 0,
> + PRIMARY KEY (id),
> + KEY acei_aclid (aclid),
> + KEY acei_clus (clus)
> +) ENGINE=InnoDB DEFAULT CHARSET=utf8;
> +
> +insert into acei(id,aclid,atp,clus) values
> +(184929059698905999,184929059698905997,0,1),
> +(184929059698906000,184929059698905997,0,1),
> +(184929059698906001,184929059698905997,1,1),
> +(184929059698906002,184929059698905998,1,1),
> +(283586039035985922,283586039035985921,1,1),
> +(2216474704108064684,2216474704108064678,0,1),
> +(2216474704108064685,2216474704108064678,0,1),
> +(2216474704108064686,2216474704108064678,1,1),
> +(2216474704108064687,2216474704108064679,1,1),
> +(3080602882609775595,3080602882609775593,0,1),
> +(3080602882609775596,3080602882609775593,0,1),
> +(3080602882609775597,3080602882609775593,1,1),
> +(3080602882609775598,3080602882609775594,1,1),
> +(3080602882609776595,3080602882609776594,1,1),
> +(3080602882609777596,3080602882609777595,1,1),
> +(4269412446747236216,4269412446747236214,0,1),
> +(4269412446747236217,4269412446747236214,0,1),
> +(4269412446747236218,4269412446747236214,1,1),
> +(4269412446747236219,4269412446747236215,1,1),
> +(6341490487802728358,6341490487802728356,0,1),
> +(6341490487802728359,6341490487802728356,0,1),
> +(6341490487802728360,6341490487802728356,1,1),
> +(6341490487802728361,6341490487802728357,1,1);
> +
> +CREATE TABLE filt (
> + id bigint(20) NOT NULL,
> + aceid bigint(20) NOT NULL DEFAULT 0,
> + clid smallint(6) NOT NULL DEFAULT 0,
> + fh bigint(20) NOT NULL DEFAULT 0,
> + PRIMARY KEY (id),
> + KEY filt_aceid (aceid),
> + KEY filt_clid (clid),
> + KEY filt_fh (fh)
> +) ENGINE=InnoDB DEFAULT CHARSET=utf8;
> +
> +insert into filt(id,aceid,clid,fh) values
> +(184929059698905999,184929059698905999,1,8948400944397203540),
> +(184929059698906000,184929059698906000,1,-3516039679025944536),
> +(184929059698906001,184929059698906001,1,-3516039679025944536),
> +(184929059698906002,184929059698906001,1,2965370193075218252),
> +(184929059698906003,184929059698906001,1,8948400944397203540),
> +(184929059698906004,184929059698906002,1,2478709353550777738),
> +(283586039035985922,283586039035985922,1,5902600816362013271),
> +(2216474704108064686,2216474704108064684,1,8948400944397203540),
> +(2216474704108064687,2216474704108064685,1,-7244708939311117030),
> +(2216474704108064688,2216474704108064686,1,-7244708939311117030),
> +(2216474704108064689,2216474704108064686,1,7489060986210282479),
> +(2216474704108064690,2216474704108064686,1,8948400944397203540),
> +(2216474704108064691,2216474704108064687,1,-3575268945274980038),
> +(3080602882609775595,3080602882609775595,1,8948400944397203540),
> +(3080602882609775596,3080602882609775596,1,-5420422472375069774),
> +(3080602882609775597,3080602882609775597,1,-5420422472375069774),
> +(3080602882609775598,3080602882609775597,1,8518228073041491534),
> +(3080602882609775599,3080602882609775597,1,8948400944397203540),
> +(3080602882609775600,3080602882609775598,1,6311439873746261694),
> +(3080602882609775601,3080602882609775598,1,6311439873746261694),
> +(3080602882609776595,3080602882609776595,1,-661101805245999843),
> +(3080602882609777596,3080602882609777596,1,-661101805245999843),
> +(3080602882609777597,3080602882609777596,1,2216865386202464067),
> +(4269412446747236216,4269412446747236216,1,8948400944397203540),
> +(4269412446747236217,4269412446747236217,1,-1143096194892676000),
> +(4269412446747236218,4269412446747236218,1,-1143096194892676000),
> +(4269412446747236219,4269412446747236218,1,5313391811364818290),
> +(4269412446747236220,4269412446747236218,1,8948400944397203540),
> +(4269412446747236221,4269412446747236219,1,7624499822621753835),
> +(6341490487802728358,6341490487802728358,1,8948400944397203540),
> +(6341490487802728359,6341490487802728359,1,8141092449587136068),
> +(6341490487802728360,6341490487802728360,1,8141092449587136068),
> +(6341490487802728361,6341490487802728360,1,1291319099896431785),
> +(6341490487802728362,6341490487802728360,1,8948400944397203540),
> +(6341490487802728363,6341490487802728361,1,6701841652906431497);
> +
> +analyze table filt, acei, acli;
> +
> +let $q=
> +select t.id, fi.*
> +from (acli t inner join acei a on a.aclid = t.id)
> + inner join filt fi on a.id = fi.aceid
> + where
> + t.rid = 'B5FCC8C7111E4E3CBC21AAF5012F59C2' and
> + t.tp = 121 and
> + a.atp = 1 and
> + fi.fh in (6311439873746261694,-397087483897438286,
> + 8518228073041491534,-5420422472375069774);
> +
> +set @save_optimizer_switch=@@optimizer_switch;
> +set @save_join_cache_level=@@join_cache_level;
> +
> +set optimizer_switch='mrr=off';
> +set join_cache_level=2;
> +eval $without_filter explain extended $q;
> +eval $without_filter $q;
> +eval $with_filter explain extended $q;
> +eval $with_filter $q;
> +
> +set optimizer_switch='mrr=on';
> +set join_cache_level=6;
> +eval $without_filter explain extended $q;
> +eval $without_filter $q;
> +eval $with_filter explain extended $q;
> +eval $with_filter $q;
> +--source include/analyze-format.inc
> +eval $with_filter analyze format=json $q;
> +
> +set optimizer_switch=@save_optimizer_switch;
> +set join_cache_level=@save_join_cache_level;
> +
> +drop table filt, acei, acli;
> +
> +set global innodb_stats_persistent= @stats.save;
> diff --git a/sql/multi_range_read.cc b/sql/multi_range_read.cc
> index 7e4c2ed..daeb53d 100644
> --- a/sql/multi_range_read.cc
> +++ b/sql/multi_range_read.cc
> @@ -702,7 +702,8 @@ static int rowid_cmp_reverse(void *file, uchar *a, uchar *b)
> int Mrr_ordered_rndpos_reader::init(handler *h_arg,
> Mrr_index_reader *index_reader_arg,
> uint mode,
> - Lifo_buffer *buf)
> + Lifo_buffer *buf,
> + Rowid_filter *filter)
> {
> file= h_arg;
> index_reader= index_reader_arg;
> @@ -710,19 +711,7 @@ int Mrr_ordered_rndpos_reader::init(handler *h_arg,
> is_mrr_assoc= !MY_TEST(mode & HA_MRR_NO_ASSOCIATION);
> index_reader_exhausted= FALSE;
> index_reader_needs_refill= TRUE;
> -
> - /*
> - Currently usage of a rowid filter within InnoDB engine is not supported
> - if the table is accessed by the primary key.
> - With optimizer switches ''mrr' and 'mrr_sort_keys' are both enabled
> - any access by a secondary index is converted to the rndpos access. In
> - InnoDB the rndpos access is always uses the primary key.
> - Do not use pushed rowid filter if the table is accessed actually by the
> - primary key. Use the rowid filter outside the engine code (see
> - Mrr_ordered_rndpos_reader::refill_from_index_reader).
> - */
> - if (file->pushed_rowid_filter && file->primary_key_is_clustered())
> - file->cancel_pushed_rowid_filter();
> + rowid_filter= filter;
>
> return 0;
> }
> @@ -817,10 +806,8 @@ int Mrr_ordered_rndpos_reader::refill_from_index_reader()
> index_reader->position();
>
> /*
> - If the built rowid filter cannot be used at the engine level use it here.
> + If the built rowid filter cannot be used at the engine level, use it here.
> */
> - Rowid_filter *rowid_filter=
> - file->get_table()->reginfo.join_tab->rowid_filter;
> if (rowid_filter && !file->pushed_rowid_filter &&
> !rowid_filter->check((char *)index_rowid))
> continue;
> @@ -967,6 +954,7 @@ int DsMrr_impl::dsmrr_init(handler *h_arg, RANGE_SEQ_IF *seq_funcs,
> handler *h_idx;
> Mrr_ordered_rndpos_reader *disk_strategy= NULL;
> bool do_sort_keys= FALSE;
> + Rowid_filter *rowid_filter= NULL;
> DBUG_ENTER("DsMrr_impl::dsmrr_init");
> /*
> index_merge may invoke a scan on an object for which dsmrr_info[_const]
> @@ -1015,6 +1003,21 @@ int DsMrr_impl::dsmrr_init(handler *h_arg, RANGE_SEQ_IF *seq_funcs,
> if (!(keyno == table->s->primary_key && h_idx->primary_key_is_clustered()))
> {
> strategy= disk_strategy= &reader_factory.ordered_rndpos_reader;
> + if (h_arg->pushed_rowid_filter)
> + {
> + /*
> + Currently usage of a rowid filter within InnoDB engine is not supported
> + if the table is accessed by the primary key.
> + With optimizer switches ''mrr' and 'mrr_sort_keys' are both enabled
> + any access by a secondary index is converted to the rndpos access. In
> + InnoDB the rndpos access is always uses the primary key.
> + Do not use pushed rowid filter if the table is accessed actually by the
> + primary key. Use the rowid filter outside the engine code (see
> + Mrr_ordered_rndpos_reader::refill_from_index_reader).
> + */
> + rowid_filter= h_arg->pushed_rowid_filter;
> + h_arg->cancel_pushed_rowid_filter();
> + }
> }
>
> full_buf= buf->buffer;
> @@ -1101,7 +1104,7 @@ int DsMrr_impl::dsmrr_init(handler *h_arg, RANGE_SEQ_IF *seq_funcs,
> n_ranges, mode, &keypar, key_buffer,
> &buf_manager)) ||
> (res= disk_strategy->init(primary_file, index_strategy, mode,
> - &rowid_buffer)))
> + &rowid_buffer, rowid_filter)))
> {
> goto error;
> }
> diff --git a/sql/multi_range_read.h b/sql/multi_range_read.h
> index 0473fef..6be9537 100644
> --- a/sql/multi_range_read.h
> +++ b/sql/multi_range_read.h
> @@ -364,7 +364,7 @@ class Mrr_ordered_rndpos_reader : public Mrr_reader
> {
> public:
> int init(handler *file, Mrr_index_reader *index_reader, uint mode,
> - Lifo_buffer *buf);
> + Lifo_buffer *buf, Rowid_filter *filter);
> int get_next(range_id_t *range_info);
> int refill_buffer(bool initial);
> private:
> @@ -399,6 +399,9 @@ class Mrr_ordered_rndpos_reader : public Mrr_reader
> /* Buffer to store (rowid, range_id) pairs */
> Lifo_buffer *rowid_buffer;
>
> + /* Rowid filter to be checked against (if any) */
> + Rowid_filter *rowid_filter;
> +
> int refill_from_index_reader();
> };
>
> diff --git a/sql/opt_range.cc b/sql/opt_range.cc
> index c47da28..5f034c6 100644
> --- a/sql/opt_range.cc
> +++ b/sql/opt_range.cc
> @@ -2902,7 +2902,8 @@ int SQL_SELECT::test_quick_select(THD *thd, key_map keys_to_use,
> remove_nonrange_trees(&param, tree);
>
> /* Get best 'range' plan and prepare data for making other plans */
> - if ((range_trp= get_key_scans_params(&param, tree, FALSE, TRUE,
> + if ((range_trp= get_key_scans_params(&param, tree,
> + only_single_index_range_scan, TRUE,
> best_read_time)))
> {
> best_trp= range_trp;
> diff --git a/sql/sql_join_cache.cc b/sql/sql_join_cache.cc
> index 3a509b3..e9ad538 100644
> --- a/sql/sql_join_cache.cc
> +++ b/sql/sql_join_cache.cc
> @@ -2248,6 +2248,8 @@ enum_nested_loop_state JOIN_CACHE::join_matching_records(bool skip_last)
> if ((rc= join_tab_execution_startup(join_tab)) < 0)
> goto finish2;
>
> + join_tab->build_range_rowid_filter_if_needed();
> +
> /* Prepare to retrieve all records of the joined table */
> if (unlikely((error= join_tab_scan->open())))
> {
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
--
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Hi MariaDB developers,
I want to start working on this ticket, but I can't assign it to myself with my
current JIRA privileges.
Should someone higher in the chain do it for me, or do I need to get my JIRA
account fixed?
Thanks!
Re: [Maria-developers] 6887f16b61a: MDEV-17399 Add support for JSON_TABLE.
by Sergei Golubchik 06 Feb '20
Hi, Alexey!
There isn't much to review yet; I don't see how the different parts will
work together yet. But here you are, some preliminary comments.
On Feb 06, Alexey Botchkov wrote:
> revision-id: 6887f16b61a (mariadb-10.5.0-153-g6887f16b61a)
> parent(s): 74f76206369
> author: Alexey Botchkov <holyfoot(a)mariadb.com>
> committer: Alexey Botchkov <holyfoot(a)mariadb.com>
> timestamp: 2020-02-04 17:12:55 +0400
> message:
>
> MDEV-17399 Add support for JSON_TABLE.
>
> Syntax for the JSON_TABLE added.
Standard JSON_TABLE syntax is quite complex.
You are doing a subset of it, I presume.
It would be good to specify here, in the comment, what syntax exactly
you are implementing.
> ---
> sql/lex.h | 5 ++
> sql/sql_digest.cc | 1 +
> sql/sql_lex.h | 3 +
> sql/sql_yacc.yy | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> sql/table.cc | 1 +
> sql/table.h | 2 +
> sql/table_function.h | 89 ++++++++++++++++++++++++++++
> 7 files changed, 264 insertions(+), 1 deletion(-)
>
> diff --git a/sql/lex.h b/sql/lex.h
> index f3fc1513369..774f9c1517d 100644
> --- a/sql/lex.h
> +++ b/sql/lex.h
> @@ -208,6 +208,7 @@ static SYMBOL symbols[] = {
> { "ELSE", SYM(ELSE)},
> { "ELSEIF", SYM(ELSEIF_MARIADB_SYM)},
> { "ELSIF", SYM(ELSIF_MARIADB_SYM)},
> + { "EMPTY", SYM(EMPTY_SYM)},
> { "ENABLE", SYM(ENABLE_SYM)},
> { "ENCLOSED", SYM(ENCLOSED)},
> { "END", SYM(END)},
> @@ -414,6 +415,7 @@ static SYMBOL symbols[] = {
> { "NATIONAL", SYM(NATIONAL_SYM)},
> { "NATURAL", SYM(NATURAL)},
> { "NCHAR", SYM(NCHAR_SYM)},
> + { "NESTED", SYM(NESTED_SYM)},
> { "NEVER", SYM(NEVER_SYM)},
> { "NEW", SYM(NEW_SYM)},
> { "NEXT", SYM(NEXT_SYM)},
> @@ -448,6 +450,7 @@ static SYMBOL symbols[] = {
> { "OPTIONALLY", SYM(OPTIONALLY)},
> { "OR", SYM(OR_SYM)},
> { "ORDER", SYM(ORDER_SYM)},
> + { "ORDINALITY", SYM(ORDINALITY_SYM)},
> { "OTHERS", SYM(OTHERS_MARIADB_SYM)},
> { "OUT", SYM(OUT_SYM)},
> { "OUTER", SYM(OUTER)},
> @@ -460,6 +463,7 @@ static SYMBOL symbols[] = {
> { "PAGE_CHECKSUM", SYM(PAGE_CHECKSUM_SYM)},
> { "PARSER", SYM(PARSER_SYM)},
> { "PARSE_VCOL_EXPR", SYM(PARSE_VCOL_EXPR_SYM)},
> + { "PATH", SYM(PATH_SYM)},
> { "PERIOD", SYM(PERIOD_SYM)},
> { "PARTIAL", SYM(PARTIAL)},
> { "PARTITION", SYM(PARTITION_SYM)},
> @@ -742,6 +746,7 @@ static SYMBOL sql_functions[] = {
> { "FIRST_VALUE", SYM(FIRST_VALUE_SYM)},
> { "GROUP_CONCAT", SYM(GROUP_CONCAT_SYM)},
> { "JSON_ARRAYAGG", SYM(JSON_ARRAYAGG_SYM)},
> + { "JSON_TABLE", SYM(JSON_TABLE_SYM)},
> { "JSON_OBJECTAGG", SYM(JSON_OBJECTAGG_SYM)},
> { "LAG", SYM(LAG_SYM)},
> { "LEAD", SYM(LEAD_SYM)},
you've added them all as reserved words, is that intentional?
> diff --git a/sql/sql_lex.h b/sql/sql_lex.h
> index 308bebd9fa1..591de1e1583 100644
> --- a/sql/sql_lex.h
> +++ b/sql/sql_lex.h
> @@ -33,6 +33,7 @@
> #include "sql_tvc.h"
> #include "item.h"
> #include "sql_limit.h" // Select_limit_counters
> +#include "table_function.h" // Json_table_column
>
> /* Used for flags of nesting constructs */
> #define SELECT_NESTING_MAP_SIZE 64
> @@ -3249,6 +3250,8 @@ struct LEX: public Query_tables_list
> SQL_I_List<ORDER> proc_list;
> SQL_I_List<TABLE_LIST> auxiliary_table_list, save_list;
> Column_definition *last_field;
> + Json_table_column *cur_json_table_column, *json_table_column_nest;
> + Table_function_json_table *json_table;
Did you have to put everything in LEX?
It's pretty big already and it's never fully used: its members are all
allocated together at the same time, but are used for different parts of
the grammar, for different statements.
Could you try to pass that through the parser stack?
> Item_sum *in_sum_func;
> udf_func udf;
> HA_CHECK_OPT check_opt; // check/repair options
> diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
> index c00ba42846a..5ef1bbd0721 100644
> --- a/sql/sql_yacc.yy
> +++ b/sql/sql_yacc.yy
> @@ -207,6 +208,8 @@ void _CONCAT_UNDERSCORED(turn_parser_debug_on,yyparse)()
> Lex_for_loop_st for_loop;
> Lex_for_loop_bounds_st for_loop_bounds;
> Lex_trim_st trim;
> + Json_table_column::On_response json_on_response;
> +
> vers_history_point_t vers_history_point;
> struct
> {
> @@ -225,6 +228,7 @@ void _CONCAT_UNDERSCORED(turn_parser_debug_on,yyparse)()
>
> /* pointers */
> Create_field *create_field;
> + Json_table_column *json_table_column;
> Spvar_definition *spvar_definition;
> Row_definition_list *spvar_definition_list;
> const Type_handler *type_handler;
> @@ -308,6 +312,7 @@ void _CONCAT_UNDERSCORED(turn_parser_debug_on,yyparse)()
> enum vers_kind_t vers_range_unit;
> enum Column_definition::enum_column_versioning vers_column_versioning;
> enum plsql_cursor_attr_t plsql_cursor_attr;
> + enum Json_table_column::enum_on_type json_on_type;
> }
>
> %{
> @@ -489,6 +494,7 @@ End SQL_MODE_ORACLE_SPECIFIC */
> %token <kwd> ELSEIF_MARIADB_SYM
> %token <kwd> ELSE /* SQL-2003-R */
> %token <kwd> ELSIF_ORACLE_SYM /* PLSQL-R */
> +%token <kwd> EMPTY_SYM /* SQL-2016-R */
> %token <kwd> ENCLOSED
> %token <kwd> ESCAPED
> %token <kwd> EXCEPT_SYM /* SQL-2003-R */
> @@ -506,6 +512,7 @@ End SQL_MODE_ORACLE_SPECIFIC */
> %token <kwd> GRANT /* SQL-2003-R */
> %token <kwd> GROUP_CONCAT_SYM
> %token <rwd> JSON_ARRAYAGG_SYM
> +%token <rwd> JSON_TABLE_SYM
> %token <rwd> JSON_OBJECTAGG_SYM
> %token <kwd> GROUP_SYM /* SQL-2003-R */
> %token <kwd> HAVING /* SQL-2003-R */
> @@ -564,6 +571,7 @@ End SQL_MODE_ORACLE_SPECIFIC */
> %token <kwd> MOD_SYM /* SQL-2003-N */
> %token <kwd> NATURAL /* SQL-2003-R */
> %token <kwd> NEG
> +%token <kwd> NESTED_SYM /* SQL-2003-N */
No, NESTED is not in the SQL:2003 list of non-reserved words.
Please specify a correct comment tag for every keyword you're adding here.
> %token <kwd> NOT_SYM /* SQL-2003-R */
> %token <kwd> NO_WRITE_TO_BINLOG
> %token <kwd> NOW_SYM
> @@ -575,6 +583,7 @@ End SQL_MODE_ORACLE_SPECIFIC */
> %token <kwd> OPTIMIZE
> %token <kwd> OPTIONALLY
> %token <kwd> ORDER_SYM /* SQL-2003-R */
> +%token <kwd> ORDINALITY_SYM /* SQL-2003-N */
> %token <kwd> OR_SYM /* SQL-2003-R */
> %token <kwd> OTHERS_ORACLE_SYM /* SQL-2011-N, PLSQL-R */
> %token <kwd> OUTER
> @@ -585,6 +594,7 @@ End SQL_MODE_ORACLE_SPECIFIC */
> %token <kwd> PAGE_CHECKSUM_SYM
> %token <kwd> PARSE_VCOL_EXPR_SYM
> %token <kwd> PARTITION_SYM /* SQL-2003-R */
> +%token <kwd> PATH_SYM /* SQL-2003-N */
> %token <kwd> PERCENTILE_CONT_SYM
> %token <kwd> PERCENTILE_DISC_SYM
> %token <kwd> PERCENT_RANK_SYM
> @@ -1336,12 +1346,16 @@ End SQL_MODE_ORACLE_SPECIFIC */
>
> %type <type_handler> int_type real_type
>
> +%type <json_on_type> json_empty_or_error
> +%type <json_on_response> json_on_response
> +
> %type <Lex_field_type> type_with_opt_collate field_type
> field_type_numeric
> field_type_string
> field_type_lob
> field_type_temporal
> field_type_misc
> + json_table_field_type
>
> %type <Lex_dyncol_type> opt_dyncol_type dyncol_type
> numeric_dyncol_type temporal_dyncol_type string_dyncol_type
> @@ -1487,7 +1501,7 @@ End SQL_MODE_ORACLE_SPECIFIC */
> table_primary_derived table_primary_derived_opt_parens
> derived_table_list table_reference_list_parens
> nested_table_reference_list join_table_parens
> - update_table_list
> + update_table_list table_function
> %type <date_time_type> date_time_type;
> %type <interval> interval
>
> @@ -1642,6 +1656,9 @@ End SQL_MODE_ORACLE_SPECIFIC */
> opt_delete_gtid_domain
> asrow_attribute
> opt_constraint_no_id
> + json_table_columns_clause json_table_columns_list json_table_column
> + json_table_column_type json_opt_on_empty_or_error
> +
>
> %type <NONE> call sp_proc_stmts sp_proc_stmts1 sp_proc_stmt
> %type <NONE> sp_if_then_statements sp_case_then_statements
> @@ -11320,6 +11337,150 @@ join_table_list:
> derived_table_list { MYSQL_YYABORT_UNLESS($$=$1); }
> ;
>
> +json_table_columns_clause:
> + COLUMNS '(' json_table_columns_list ')'
> + {}
> + ;
> +
> +json_table_columns_list:
> + json_table_column
> + | json_table_columns_list ',' json_table_column
> + {}
> + ;
> +
> +json_table_column:
> + ident
> + {
> + LEX *lex=Lex;
> + Create_field *f= new (thd->mem_root) Create_field();
> +
> + if (unlikely(check_string_char_length(&$1, 0, NAME_CHAR_LEN,
> + system_charset_info, 1)))
> + my_yyabort_error((ER_TOO_LONG_IDENT, MYF(0), $1.str));
> +
> + lex->cur_json_table_column=
> + new (thd->mem_root) Json_table_column(f, lex->json_table_column_nest);
> +
> + if (unlikely(!f || !lex->cur_json_table_column))
> + MYSQL_YYABORT;
> +
> + lex->init_last_field(f, &$1, NULL);
> + }
> + json_table_column_type
> + {
> + LEX *lex=Lex;
> + lex->json_table->m_columns.push_back(lex->cur_json_table_column,
> + thd->mem_root);
> + }
> + ;
> +
> +json_table_column_type:
> + FOR_SYM ORDINALITY_SYM
> + {
> + Lex->cur_json_table_column->set(Json_table_column::FOR_ORDINALITY);
> + }
> + | json_table_field_type PATH_SYM TEXT_STRING_sys
> + json_opt_on_empty_or_error
> + {
> + Lex->last_field->set_attributes(thd, $1, Lex->charset,
> + COLUMN_DEFINITION_TABLE_FIELD);
> + Lex->cur_json_table_column->set(Json_table_column::PATH, &$3);
> + }
> + | json_table_field_type EXISTS PATH_SYM TEXT_STRING_sys
> + {
> + Lex->last_field->set_attributes(thd, $1, Lex->charset,
> + COLUMN_DEFINITION_TABLE_FIELD);
> + Lex->cur_json_table_column->set(Json_table_column::EXISTS_PATH, &$4);
> + }
> + | NESTED_SYM PATH_SYM TEXT_STRING_sys
> + {
> + LEX *lex=Lex;
> + lex->cur_json_table_column->set(Json_table_column::NESTED_PATH);
> + lex->json_table_column_nest= lex->cur_json_table_column;
> + }
> + json_table_columns_clause
> + {
> + LEX *lex=Lex;
> + lex->cur_json_table_column= lex->json_table_column_nest;
> + lex->json_table_column_nest= lex->cur_json_table_column->m_nest;
> + }
> + ;
> +
> +json_table_field_type:
> + field_type_numeric
> + | field_type_temporal
> + | field_type_string
> + ;
> +
> +json_opt_on_empty_or_error:
> + /* none */
> + {}
> + | json_on_response ON json_empty_or_error json_opt_on_empty_or_error
> + {
> + if ($3 == Json_table_column::ON_EMPTY)
> + {
> + Lex->cur_json_table_column->m_on_empty= $1;
> + }
> + else /* ON_ERROR */
> + {
> + Lex->cur_json_table_column->m_on_error= $1;
> + }
> + }
> + ;
> +
> +json_on_response:
> + ERROR_SYM
> + {
> + $$.m_response= Json_table_column::RESPONSE_ERROR;
> + }
> + | NULL_SYM
> + {
> + $$.m_response= Json_table_column::RESPONSE_NULL;
> + }
> + | DEFAULT TEXT_STRING_sys
> + {
> + $$.m_response= Json_table_column::RESPONSE_DEFAULT;
> + $$.m_default= $1;
> + }
> + ;
> +
> +json_empty_or_error:
> + ERROR_SYM
> + {
> + $$= Json_table_column::ON_EMPTY;
> + }
> + | EMPTY_SYM
> + {
> + $$= Json_table_column::ON_ERROR;
> + }
> + ;
> +
> +table_function:
> + JSON_TABLE_SYM '(' expr ',' TEXT_STRING_sys
It is difficult to see what you're parsing without a syntax spec in the
commit comment.
> + {
> + Table_function_json_table *jt=
> + new (thd->mem_root) Table_function_json_table($3, &$5);
> + if (unlikely(!jt))
> + MYSQL_YYABORT;
> + Lex->json_table= jt;
> + Lex->json_table_column_nest= NULL;
> + }
> + json_table_columns_clause ')' AS ident_table_alias
> + {
> +LEX_CSTRING tn;
> + SELECT_LEX *sel= Select;
> + sel->table_join_options= 0;
> +tn.str= "t1";
> +tn.length= 2;
What's that? Just to make the code compile for now?
Why not use $10 - that is your ident_table_alias, which is currently
ignored.
> + if (!($$= Select->add_table_to_list(thd,
> + new (thd->mem_root) Table_ident(&tn), &tn,
> + Select->get_table_join_options(),
> + YYPS->m_lock_type,
> + YYPS->m_mdl_type)))
> + MYSQL_YYABORT;
> + }
> + ;
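For reference, the rule as quoted parses JSON_TABLE(<expr>, '<json path>' COLUMNS(...)) AS <alias>,
and a minimal sketch of the change suggested above would be to pass $10 (the ident_table_alias)
instead of the hard-coded "t1". Whether the Table_ident should also be built from the alias is an
assumption here; everything else mirrors the quoted action:

  json_table_columns_clause ')' AS ident_table_alias
  {
    SELECT_LEX *sel= Select;
    sel->table_join_options= 0;
    /* sketch: use the parsed alias rather than a hard-coded name */
    if (!($$= sel->add_table_to_list(thd,
                  new (thd->mem_root) Table_ident(&$10), &$10,
                  sel->get_table_join_options(),
                  YYPS->m_lock_type,
                  YYPS->m_mdl_type)))
      MYSQL_YYABORT;
  }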
> +
> /*
> The ODBC escape syntax for Outer Join is: '{' OJ join_table '}'
> The parser does not define OJ as a token, any ident is accepted
> @@ -11527,6 +11688,7 @@ table_factor:
> $$= $1;
> }
> | table_reference_list_parens { $$= $1; }
> + | table_function { $$= $1; }
> ;
>
> table_primary_ident_opt_parens:
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
[Maria-developers] [DISCUSS] Don't see ARM release information in MariaDB download page
by bo zhaobo 03 Feb '20
Hi Team,
I would like to clarify something about the MariaDB download page and hope
the team can give some advice or help. I already raised this on maria-docs
but did not get a reply, so I am trying the dev channel.
When I want to download an aarch64 MariaDB package, I cannot find any
information about it on page [1].
However, when I look at the RPM/DEB packages, for example under '*Red Hat,
Fedora, and CentOS Packages*' and '*Debian and Ubuntu Packages*',
I can see ARM (aarch64/arm64) specific packages on the linked pages [2].
The documentation in [1] only describes OS/CPU as "*RedHat/CentOS/Fedora (x86,
x86_64, ppc64, ppc64le)*", so there is a difference between the documentation
and the packages that are actually released.
That is why I want to clarify this and discuss it with the team:
*How about adding 'aarch64' to the OS/CPU description?*
It would be great if the community could fix this, because it helps users
get what they want from the MariaDB download page (and the related MariaDB
docs).
Thanks very much.
BR
ZhaoBo
[1] https://downloads.mariadb.org/mariadb/10.5.0/
[2] http://ftp.igh.cnrs.fr/pub/mariadb//mariadb-10.5.0/yum/
http://ftp.igh.cnrs.fr/pub/mariadb//mariadb-10.5.0/repo/
778e96749bc: MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
by sujatha 23 Jan '20
revision-id: 778e96749bcadc5528bb7f214a711272f7413b96 (mariadb-10.1.43-66-g778e96749bc)
parent(s): 982294ac1680938ac9223fb64a64e21f0cbc322a
author: Sujatha
committer: Sujatha
timestamp: 2020-01-23 16:17:55 +0530
message:
MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
Problem:
-------
Accessing a member within 'xid_count_per_binlog' structure results in
following error when 'UBSAN' is enabled.
member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'
Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.
Fix:
---
Defined a parameterized constructor for 'xid_count_per_binlog' class.
---
sql/log.cc | 27 +++++++++++----------------
sql/log.h | 13 ++++++++++++-
2 files changed, 23 insertions(+), 17 deletions(-)
diff --git a/sql/log.cc b/sql/log.cc
index acf1f8f8a9c..0efef6d1e29 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -3216,7 +3216,7 @@ void MYSQL_BIN_LOG::cleanup()
DBUG_ASSERT(!binlog_xid_count_list.head());
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::cleanup(): Removing xid_list_entry "
"for %s (%lu)", b);
- my_free(b);
+ delete b;
}
mysql_mutex_destroy(&LOCK_log);
@@ -3580,17 +3580,9 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
*/
uint off= dirname_length(log_file_name);
uint len= strlen(log_file_name) - off;
- char *entry_mem, *name_mem;
- if (!(new_xid_list_entry = (xid_count_per_binlog *)
- my_multi_malloc(MYF(MY_WME),
- &entry_mem, sizeof(xid_count_per_binlog),
- &name_mem, len,
- NULL)))
+ new_xid_list_entry= new xid_count_per_binlog(log_file_name+off, (int)len);
+ if (!new_xid_list_entry)
goto err;
- memcpy(name_mem, log_file_name+off, len);
- new_xid_list_entry->binlog_name= name_mem;
- new_xid_list_entry->binlog_name_len= len;
- new_xid_list_entry->xid_count= 0;
/*
Find the name for the Initial binlog checkpoint.
@@ -3607,7 +3599,10 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
mysql_mutex_unlock(&LOCK_xid_list);
if (!b)
b= new_xid_list_entry;
- strmake(buf, b->binlog_name, b->binlog_name_len);
+ if (b->binlog_name)
+ strmake(buf, b->binlog_name, b->binlog_name_len);
+ else
+ goto err;
Binlog_checkpoint_log_event ev(buf, len);
DBUG_EXECUTE_IF("crash_before_write_checkpoint_event",
flush_io_cache(&log_file);
@@ -3711,7 +3706,7 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
{
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::open(): Removing xid_list_entry for "
"%s (%lu)", b);
- my_free(binlog_xid_count_list.get());
+ delete binlog_xid_count_list.get();
}
mysql_cond_broadcast(&COND_xid_list);
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::open(): Adding new xid_list_entry for "
@@ -3758,7 +3753,7 @@ Turning logging off for the whole duration of the MySQL server process. \
To turn it on again: fix the cause, \
shutdown the MySQL server and restart it.", name, errno);
if (new_xid_list_entry)
- my_free(new_xid_list_entry);
+ delete new_xid_list_entry;
if (file >= 0)
mysql_file_close(file, MYF(0));
close(LOG_CLOSE_INDEX);
@@ -4252,7 +4247,7 @@ bool MYSQL_BIN_LOG::reset_logs(THD *thd, bool create_new_log,
DBUG_ASSERT(b->xid_count == 0);
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::reset_logs(): Removing "
"xid_list_entry for %s (%lu)", b);
- my_free(binlog_xid_count_list.get());
+ delete binlog_xid_count_list.get();
}
mysql_cond_broadcast(&COND_xid_list);
reset_master_pending--;
@@ -9736,7 +9731,7 @@ TC_LOG_BINLOG::mark_xid_done(ulong binlog_id, bool write_checkpoint)
break;
WSREP_XID_LIST_ENTRY("TC_LOG_BINLOG::mark_xid_done(): Removing "
"xid_list_entry for %s (%lu)", b);
- my_free(binlog_xid_count_list.get());
+ delete binlog_xid_count_list.get();
}
mysql_mutex_unlock(&LOCK_xid_list);
diff --git a/sql/log.h b/sql/log.h
index b4c9b24a3a9..277e5c6f69c 100644
--- a/sql/log.h
+++ b/sql/log.h
@@ -587,7 +587,18 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
long xid_count;
/* For linking in requests to the binlog background thread. */
xid_count_per_binlog *next_in_queue;
- xid_count_per_binlog(); /* Give link error if constructor used. */
+ xid_count_per_binlog(char *log_file_name, uint log_file_name_len)
+ :binlog_id(0), xid_count(0)
+ {
+ binlog_name_len= log_file_name_len;
+ binlog_name= (char *) my_malloc(binlog_name_len, MYF(MY_ZEROFILL));
+ if (binlog_name)
+ memcpy(binlog_name, log_file_name, binlog_name_len);
+ }
+ ~xid_count_per_binlog()
+ {
+ my_free(binlog_name);
+ }
};
I_List<xid_count_per_binlog> binlog_xid_count_list;
mysql_mutex_t LOCK_binlog_background_thread;
d9716cfadb8: MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
by sujatha 23 Jan '20
revision-id: d9716cfadb8cd3448f25d8b411451f87064cecaf (mariadb-10.1.43-66-gd9716cfadb8)
parent(s): 982294ac1680938ac9223fb64a64e21f0cbc322a
author: Sujatha
committer: Sujatha
timestamp: 2020-01-23 16:00:41 +0530
message:
MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
Problem:
-------
Accessing a member within 'xid_count_per_binlog' structure results in
following error when 'UBSAN' is enabled.
member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'
Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.
Fix:
---
Defined a parameterized constructor for 'xid_count_per_binlog' class.
---
sql/log.cc | 27 +++++++++++----------------
sql/log.h | 14 +++++++++++++-
2 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/sql/log.cc b/sql/log.cc
index acf1f8f8a9c..0efef6d1e29 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -3216,7 +3216,7 @@ void MYSQL_BIN_LOG::cleanup()
DBUG_ASSERT(!binlog_xid_count_list.head());
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::cleanup(): Removing xid_list_entry "
"for %s (%lu)", b);
- my_free(b);
+ delete b;
}
mysql_mutex_destroy(&LOCK_log);
@@ -3580,17 +3580,9 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
*/
uint off= dirname_length(log_file_name);
uint len= strlen(log_file_name) - off;
- char *entry_mem, *name_mem;
- if (!(new_xid_list_entry = (xid_count_per_binlog *)
- my_multi_malloc(MYF(MY_WME),
- &entry_mem, sizeof(xid_count_per_binlog),
- &name_mem, len,
- NULL)))
+ new_xid_list_entry= new xid_count_per_binlog(log_file_name+off, (int)len);
+ if (!new_xid_list_entry)
goto err;
- memcpy(name_mem, log_file_name+off, len);
- new_xid_list_entry->binlog_name= name_mem;
- new_xid_list_entry->binlog_name_len= len;
- new_xid_list_entry->xid_count= 0;
/*
Find the name for the Initial binlog checkpoint.
@@ -3607,7 +3599,10 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
mysql_mutex_unlock(&LOCK_xid_list);
if (!b)
b= new_xid_list_entry;
- strmake(buf, b->binlog_name, b->binlog_name_len);
+ if (b->binlog_name)
+ strmake(buf, b->binlog_name, b->binlog_name_len);
+ else
+ goto err;
Binlog_checkpoint_log_event ev(buf, len);
DBUG_EXECUTE_IF("crash_before_write_checkpoint_event",
flush_io_cache(&log_file);
@@ -3711,7 +3706,7 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
{
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::open(): Removing xid_list_entry for "
"%s (%lu)", b);
- my_free(binlog_xid_count_list.get());
+ delete binlog_xid_count_list.get();
}
mysql_cond_broadcast(&COND_xid_list);
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::open(): Adding new xid_list_entry for "
@@ -3758,7 +3753,7 @@ Turning logging off for the whole duration of the MySQL server process. \
To turn it on again: fix the cause, \
shutdown the MySQL server and restart it.", name, errno);
if (new_xid_list_entry)
- my_free(new_xid_list_entry);
+ delete new_xid_list_entry;
if (file >= 0)
mysql_file_close(file, MYF(0));
close(LOG_CLOSE_INDEX);
@@ -4252,7 +4247,7 @@ bool MYSQL_BIN_LOG::reset_logs(THD *thd, bool create_new_log,
DBUG_ASSERT(b->xid_count == 0);
WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::reset_logs(): Removing "
"xid_list_entry for %s (%lu)", b);
- my_free(binlog_xid_count_list.get());
+ delete binlog_xid_count_list.get();
}
mysql_cond_broadcast(&COND_xid_list);
reset_master_pending--;
@@ -9736,7 +9731,7 @@ TC_LOG_BINLOG::mark_xid_done(ulong binlog_id, bool write_checkpoint)
break;
WSREP_XID_LIST_ENTRY("TC_LOG_BINLOG::mark_xid_done(): Removing "
"xid_list_entry for %s (%lu)", b);
- my_free(binlog_xid_count_list.get());
+ delete binlog_xid_count_list.get();
}
mysql_mutex_unlock(&LOCK_xid_list);
diff --git a/sql/log.h b/sql/log.h
index b4c9b24a3a9..e69c9c39eaf 100644
--- a/sql/log.h
+++ b/sql/log.h
@@ -587,7 +587,19 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
long xid_count;
/* For linking in requests to the binlog background thread. */
xid_count_per_binlog *next_in_queue;
- xid_count_per_binlog(); /* Give link error if constructor used. */
+ xid_count_per_binlog(char *log_file_name, uint log_file_name_len)
+ :binlog_name(log_file_name), binlog_name_len(log_file_name_len),
+ binlog_id(0), xid_count(0)
+ {
+ binlog_name_len= log_file_name_len;
+ binlog_name= (char *) my_malloc(binlog_name_len, MYF(MY_ZEROFILL));
+ if (binlog_name)
+ memcpy(binlog_name, log_file_name, binlog_name_len);
+ }
+ ~xid_count_per_binlog()
+ {
+ my_free(binlog_name);
+ }
};
I_List<xid_count_per_binlog> binlog_xid_count_list;
mysql_mutex_t LOCK_binlog_background_thread;
[Maria-developers] 43f27ae4778: MDEV-21490: binlog tests fail with valgrind: Conditional jump or move depends on uninitialised value in sql_ex_info::init
by sujatha 23 Jan '20
revision-id: 43f27ae4778b3e7be86897e49c182dc05080fd88 (mariadb-10.1.43-66-g43f27ae4778)
parent(s): 982294ac1680938ac9223fb64a64e21f0cbc322a
author: Sujatha
committer: Sujatha
timestamp: 2020-01-23 11:57:46 +0530
message:
MDEV-21490: binlog tests fail with valgrind: Conditional jump or move depends on uninitialised value in sql_ex_info::init
Problem:
=======
P1) Conditional jump or move depends on uninitialised value(s)
sql_ex_info::init(char const*, char const*, bool) (log_event.cc:3083)
code: All the following variables are not initialized.
----
return ((cached_new_format != -1) ? cached_new_format :
(cached_new_format=(field_term_len > 1 || enclosed_len > 1 ||
line_term_len > 1 || line_start_len > 1 || escaped_len > 1)));
P2) Conditional jump or move depends on uninitialised value(s)
Rows_log_event::Rows_log_event(char const*, unsigned
int, Format_description_log_event const*) (log_event.cc:9571)
Code: Uninitialized values is reported for 'var_header_len' variable.
----
if (var_header_len < 2 || event_len < static_cast<unsigned
int>(var_header_len + (post_start - buf)))
P3) Conditional jump or move depends on uninitialised value(s)
Table_map_log_event::pack_info(Protocol*) (log_event.cc:11553)
code:'m_table_id' is uninitialized.
----
void Table_map_log_event::pack_info(Protocol *protocol)
...
size_t bytes= my_snprintf(buf, sizeof(buf), "table_id: %lu (%s.%s)",
m_table_id, m_dbnam, m_tblnam);
Fix:
===
P1 - Fix)
Initialize cached_new_format,field_term_len, enclosed_len, line_term_len,
line_start_len, escaped_len members in default constructor.
P2 - Fix)
"var_header_len" is initialized by reading the event buffer. In case of an
invalid event the buffer will contain invalid data. Hence added a check to
validate the event data. If event_len is smaller than valid header length
return immediately.
P3 - Fix)
'm_table_id' within Table_map_log_event is initialized by reading data from
the event buffer. Use 'VALIDATE_BYTES_READ' macro to validate the current
state of the buffer. If it is invalid return immediately.
---
sql/log_event.cc | 7 +++++++
sql/log_event.h | 9 ++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/sql/log_event.cc b/sql/log_event.cc
index e8881c77f2b..9924b9a0493 100644
--- a/sql/log_event.cc
+++ b/sql/log_event.cc
@@ -9535,6 +9535,12 @@ Rows_log_event::Rows_log_event(const char *buf, uint event_len,
uint8 const post_header_len= description_event->post_header_len[event_type-1];
+ if (event_len < (uint)(common_header_len + post_header_len))
+ {
+ m_cols.bitmap= 0;
+ DBUG_VOID_RETURN;
+ }
+
DBUG_PRINT("enter",("event_len: %u common_header_len: %d "
"post_header_len: %d",
event_len, common_header_len,
@@ -11043,6 +11049,7 @@ Table_map_log_event::Table_map_log_event(const char *buf, uint event_len,
const char *post_start= buf + common_header_len;
post_start+= TM_MAPID_OFFSET;
+ VALIDATE_BYTES_READ(post_start, buf, event_len);
if (post_header_len == 6)
{
/* Master is of an intermediate source tree before 5.1.4. Id is 4 bytes */
diff --git a/sql/log_event.h b/sql/log_event.h
index 2c8dc3d7353..04fce65faeb 100644
--- a/sql/log_event.h
+++ b/sql/log_event.h
@@ -2057,7 +2057,14 @@ class Query_log_event: public Log_event
****************************************************************************/
struct sql_ex_info
{
- sql_ex_info() {} /* Remove gcc warning */
+ sql_ex_info():
+ cached_new_format(-1),
+ field_term_len(0),
+ enclosed_len(0),
+ line_term_len(0),
+ line_start_len(0),
+ escaped_len(0)
+ {} /* Remove gcc warning */
const char* field_term;
const char* enclosed;
const char* line_term;
Hi Georg,
Can you please review this cleanup patch?
It's a prerequisite for:
MDEV-17832 Protocol: extensions for Pluggable types and JSON, GEOMETRY
The idea is that this patch introduces a new file
libmariadb/mariadb_priv.h and adds prototypes of these functions:
void free_rows(MYSQL_DATA *cur);
int ma_multi_command(MYSQL *mysql, enum enum_multi_status status);
MYSQL_FIELD * unpack_fields(MYSQL_DATA *data,
MA_MEM_ROOT *alloc,uint fields,
my_bool default_value);
and removes these prototypes from mariadb_stmt.c.
I think it should be safer this way.
Later I'll add there more prototypes into libmariadb/mariadb_priv.h,
e.g. functions that will be added under terms of MDEV-17832.
Note, I also removed the "my_bool long_flag_protocol" parameter
from unpack_fields(), it's not used anyway.
Under terms of MDEV-17832 I'll have to add a "const MYSQL *mysql"
parameter. So if we ever need to check CLIENT_LONG_FLAG inside
unpack_fields() again, it can be done through the "mysql" parameter:
(mysql->server_capabilities & CLIENT_LONG_FLAG)
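For reference, a minimal sketch of what the new header could look like (the guard name and
the include list are assumptions; the prototypes are the ones listed above):

  /* libmariadb/mariadb_priv.h -- sketch only */
  #ifndef _mariadb_priv_h_
  #define _mariadb_priv_h_

  #include <mysql.h>      /* assumed to provide MYSQL, MYSQL_DATA, MYSQL_FIELD */
  #include <ma_global.h>  /* assumed to provide my_bool, uint, MA_MEM_ROOT */

  void free_rows(MYSQL_DATA *cur);
  int ma_multi_command(MYSQL *mysql, enum enum_multi_status status);
  MYSQL_FIELD *unpack_fields(MYSQL_DATA *data, MA_MEM_ROOT *alloc, uint fields,
                             my_bool default_value);

  #endif /* _mariadb_priv_h_ */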
Thanks!
Re: [Maria-developers] 9d2c63d45e4: MDEV-20076: SHOW GRANTS does not quote role names properly
by Sergei Golubchik 19 Jan '20
Hi, Oleksandr!
On Jan 19, Oleksandr Byelkin wrote:
> revision-id: 9d2c63d45e4 (mariadb-10.3.20-6-g9d2c63d45e4)
> parent(s): d4edb0510ec
> author: Oleksandr Byelkin <sanja(a)mariadb.com>
> committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> timestamp: 2019-11-14 09:32:54 +0100
> message:
>
> MDEV-20076: SHOW GRANTS does not quote role names properly
>
> Quotes added to output.
>
> diff --git a/mysql-test/main/grant5.test b/mysql-test/main/grant5.test
> index 649bba7d1ca..045cbf8fc86 100644
> --- a/mysql-test/main/grant5.test
> +++ b/mysql-test/main/grant5.test
> @@ -33,3 +33,26 @@ REVOKE EXECUTE ON PROCEDURE sp FROM u;
> --error ER_TABLE_NOT_LOCKED
> REVOKE PROCESS ON *.* FROM u;
> DROP TABLE t1;
> +
> +--echo #
> +--echo # MDEV-20076: SHOW GRANTS does not quote role names properly
> +--echo #
> +
> +create role 'role-1';
> +create role 'rock\'n\'roll';
> +create user 'user-1'@'localhost';
> +create user 'O\'Brien'@'localhost';
> +grant select on mysql.user to 'role-1';
> +grant select on mysql.user to 'rock\'n\'roll';
> +GRANT 'role-1' TO 'user-1'@'localhost';
> +GRANT 'rock\'n\'roll' TO 'O\'Brien'@'localhost';
> +show grants for 'role-1';
> +show grants for 'rock\'n\'roll';
> +show grants for 'user-1'@'localhost';
> +show grants for 'O\'Brien'@'localhost';
> +drop role 'role-1';
> +drop role 'rock\'n\'roll';
> +drop user 'user-1'@'localhost';
> +drop user 'O\'Brien'@'localhost';
> +
> +--echo # End of 10.3 tests
please add few tests when sql_quote_show_create=0. Like
1. sql_quote_show_create=0, role name is not reserved (e.g. foo)
2. sql_quote_show_create=0, role name is reserved (e.g. fetch)
3. sql_quote_show_create=0, role name is not a valid identifier (e.g. role-1)
Then ok to push.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] edc9059c31b: MDEV-18027: Running out of file descriptors and eventual crash
by Sergei Golubchik 19 Jan '20
Hi, Oleksandr!
Ok to push.
Just one comment, see below:
On Jan 19, Oleksandr Byelkin wrote:
> revision-id: edc9059c31b (mariadb-10.2.28-3-gedc9059c31b)
> parent(s): 0b1bc4bf76f
> author: Oleksandr Byelkin <sanja(a)mariadb.com>
> committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> timestamp: 2019-11-05 16:24:26 +0100
> message:
>
> MDEV-18027: Running out of file descriptors and eventual crash
>
> For automatic number of opened files limit take into account number of
> table instances for table cache
>
> diff --git a/sql/mysqld.cc b/sql/mysqld.cc
> index 34e5704bcfe..9907c17ee07 100644
> --- a/sql/mysqld.cc
> +++ b/sql/mysqld.cc
> @@ -4461,6 +4461,17 @@ static int init_common_variables()
> if (files < wanted_files && global_system_variables.log_warnings)
> sql_print_warning("Could not increase number of max_open_files to more than %u (request: %u)", files, wanted_files);
>
> + /*
> + If we required too much tc_instances than we reduce it but not more
> + then in half
The comment is wrong: you don't do "not more than in half", you can
reduce it all the way to 1, which, I think, is correct. So the code is
ok, but the comment is misleading; please remove the "but not more than
in half" part.
> + */
> + SYSVAR_AUTOSIZE_IF_CHANGED(tc_instances,
> + (uint32) MY_MIN(MY_MAX((files - extra_files -
> + max_connections)/
> + 2/tc_size,
> + 1),
> + tc_instances),
> + uint32);
> /*
> If we have requested too much file handles than we bring
> max_connections in supported bounds. Still leave at least
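As a stand-alone illustration of the clamp in the hunk above (the helper name and the
numbers are made up, not server code):

  #include <algorithm>
  #include <cstdint>

  static uint32_t autosized_tc_instances(uint32_t files, uint32_t extra_files,
                                         uint32_t max_connections,
                                         uint32_t tc_size,
                                         uint32_t tc_instances)
  {
    uint32_t wanted= (files - extra_files - max_connections) / 2 / tc_size;
    return std::min(std::max(wanted, uint32_t{1}), tc_instances);
  }

  /* e.g. files=2000, extra_files=30, max_connections=151, tc_size=2000 gives
     wanted == 0, so the result is clamped up to 1 no matter what tc_instances
     was -- i.e. it can indeed be reduced all the way down to 1. */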
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
[Maria-developers] Re: [Maria-discuss] [MariaDB 10.3.7] Hello, everyone. My process core dumped while calling mysql_ping().
by Ma Allen 15 Jan '20
Hi Markus,
Of the two functions, only MySqlWaitConnected() is called without a lock; the database reads use a mutex. What's more, the two functions belong to initialization and involve ONLY the main thread; no other threads call them.
Previously the program was written with mysqllib, not the MariaDB library currently used. mysqllib supports multiple threads concurrently using one connection, and we also followed its tutorial:
https://dev.mysql.com/doc/refman/5.7/en/c-api-threaded-clients.html
After replacing mysqllib with the MariaDB library, everything works well in the test environment except at one client's site.
Allen
________________________________
From: Maria-discuss <maria-discuss-bounces+mazhh=outlook.com(a)lists.launchpad.net> on behalf of Markus Mäkelä <markus.makela(a)mariadb.com>
Sent: 14 January 2020 15:40
To: maria-discuss(a)lists.launchpad.net <maria-discuss(a)lists.launchpad.net>
Subject: Re: [Maria-discuss] [Maria-developers] [MariaDB 10.3.7] Hello, everyone. My process core dumped while calling mysql_ping().
Hi,
Is the connection protected by a lock of some sorts? To my knowledge the library does not support concurrent use of a single connection. I'd recommend trying to first wrap all access to the connection with a mutex and see if that solves the problem.
Markus
On 1/14/20 08:33, Ma Allen wrote:
Environment:
MariaDB 10.3.7 database server & related headers and libraries
CentOS 7.3 or 7.4
./configure --prefix=/home/spiderflow --localstatedir=/home/spiderflow --enable-unix-socket --with-libnss-libraries=/usr/lib64 --with-libnss-includes=/usr/include/nss3 --with-libnspr-libraries=/usr/lib64 --with-libnspr-includes=/usr/include/nspr4 --enable-netmap --enable-rust --disable-gccmarch-native --enable-mysql --with-libmysql-includes=/usr/local/mariadb-10.3.7-linux-systemd-x86_64/include/mysql --with-libmysql-libraries=/usr/local/mariadb-10.3.7-linux-systemd-x86_64/lib
Phenomenon:
(gdb)
#0 ma_pvio_write (pvio=0x10101010101b2e0, buffer=buffer@entry=0x2510101 "\001", length=length@entry=5)
at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_pvio.c:350
#1 0x00007f7100249dea in ma_net_real_write (net=net@entry=0xee9de0 <g_mysql_info>, packet=0x2510101 "\001", len=5)
at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_net.c:335
#2 0x00007f7100249f60 in ma_net_flush (net=net@entry=0xee9de0 <g_mysql_info>) at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_net.c:166
#3 0x00007f710024a71a in ma_net_write_command (net=net@entry=0xee9de0 <g_mysql_info>, command=command@entry=14 '\016', packet=packet@entry=0x7f71003d6c67 "", len=0,
disable_flush=disable_flush@entry=0 '\000') at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_net.c:244
#4 0x00007f7100252ca4 in mthd_my_send_cmd (mysql=0xee9de0 <g_mysql_info>, command=<optimized out>, arg=0x7f71003d6c67 "", length=0, skipp_check=<optimized out>,
opt_arg=<optimized out>) at /home/buildbot/buildbot/build/libmariadb/libmariadb/mariadb_lib.c:394
#5 0x00007f71002518d0 in mysql_ping (mysql=0xee9de0 <g_mysql_info>) at /home/buildbot/buildbot/build/libmariadb/libmariadb/mariadb_lib.c:2552
#6 0x00000000006651b2 in MySqlWaitConnected (p_mysql_data=p_mysql_data@entry=0xee9de0 <g_mysql_info>) at ../srcSF/spiderFlow-proto-baseline.c:676
#7 0x0000000000668819 in LoadProtoConfInfos (pInfo=pInfo@entry=0xee9de0 <g_mysql_info>) at ../srcSF/spiderFlow-proto-baseline.c:835
#8 0x000000000066c53a in InitBaseLine () at ../srcSF/spiderFlow-proto-baseline.c:1069
#9 0x00000000005f308b in PostConfLoadedSetup (suri=0x10dbb80 <suricata>) at suricata.c:2900
#10 0x0000000000425ce8 in main (argc=5, argv=0x7fff1cdca258) at suricata.c:3072
(gdb)
Analysis:
The core dump always appears after running for a while, or sometimes after the server restarts. It occurs during initialization and involves ONLY ONE thread, the main thread of the whole process.
The related code consists of two functions, both of which check the connection and read data from MariaDB. The first function works fine; however, the connection check in the second one leads to the core dump.
The code that checks the connection is as follows:
void MySqlWaitConnected(MYSQL * p_mysql_data)
{
while (mysql_ping(p_mysql_data) != 0) {
SCLogInfo("Mysql ping error:%s", mysql_error(p_mysql_data));
#ifdef WIN32
Sleep(3000);
#else
sleep(3);
#endif
}
}
Previously, I solved what looks like a similar problem as follows:
Multiple threads shared ONE MySQL handle (the MYSQL data structure), and MySqlWaitConnected() was called before inserting data into MariaDB without any mutex. So when multiple threads did the same operation, the insert was interrupted by mysql_ping(), which led to a core dump. Afterwards, I commented out MySqlWaitConnected() and added a mutex around mysql_real_query() and mysql_commit().
Does anyone meet the same problem before? Any suggestion is appreciated. Thanks in advance.
Best Regards,
Allen Ma
_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers(a)lists.launchpad.net<mailto:maria-developers@lists.launchpad.net>
Unsubscribe : https://launchpad.net/~maria-developers
More help : https://help.launchpad.net/ListHelp
--
Markus Mäkelä, Senior Software Engineer
MariaDB Corporation
t: +358 40 7740484 | IRC: markusjm@freenode
Re: [Maria-developers] 10670133744: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
by Sergei Golubchik 14 Jan '20
Hi, Nikita!
On Dec 16, Nikita Malyavin wrote:
> revision-id: 10670133744 (mariadb-10.4.4-492-g10670133744)
> parent(s): 251c6e17269
> author: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> committer: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> timestamp: 2019-11-26 19:22:04 +1000
> message:
>
> MDEV-16978 Application-time periods: WITHOUT OVERLAPS
>
> * The overlaps check is implemented on a handler level per row command. It creates a separate cursor (actually, another handler instance) and caches it inside the original handler, when ha_update_row or ha_insert_row is issued. Cursor closes on unlocking the handler.
>
> * Containing the same key in index means unique constraint violation even in usual terms. So we fetch left and right neighbours and check that they have same key prefix, excluding from the key only the period part. If it doesnt match, then there's no such neighbour, and the check passes. Otherwise, we check if this neighbour intersects with the considered key.
>
> * the check does introduce new error and fails with ER_DUPP_KEY error. This might break REPLACE workflow and should be fixed separately
wrap long lines, please.
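The interval test behind the neighbour check described in the commit message is the standard
half-open overlap predicate; a stand-alone sketch (names are made up, this is not the patch's
code):

  struct Period { long start, end; };   /* half-open [start, end) */

  /* two periods overlap iff each one starts before the other one ends */
  static bool periods_overlap(const Period &a, const Period &b)
  {
    return a.start < b.end && b.start < a.end;
  }

With the rows ordered by period start inside one key prefix, only the nearest left and right
neighbours of the new row can overlap it, which is why only those two are fetched.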
> diff --git a/sql/handler.cc b/sql/handler.cc
> index 46a0c313c80..ed36f3c5bd3 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -6402,6 +6407,14 @@ int handler::ha_external_lock(THD *thd, int lock_type)
> mysql_audit_external_lock(thd, table_share, lock_type);
> }
>
> + if (lock_type == F_UNLCK && check_overlaps_handler)
> + {
> + check_overlaps_handler->ha_external_lock(table->in_use, F_UNLCK);
> + check_overlaps_handler->close();
> + check_overlaps_handler= NULL;
> + overlaps_error_key= -1;
> + }
I'm still thinking about how to avoid this overhead, or at least how to simplify the
code.
One option is to use HA_EXTRA_REMEMBER_POS
It doesn't nest, you're right, but it can be fixed.
Another, simpler, option is to use TABLE::update_handler.
This is a second auxiliary handler, a clone, exactly like yours, created
to check long uniques. So I don't see a need to create yet another clone
when you can simply use TABLE::update_handler. It's never used for scanning,
only for point lookups, so there is no position that you can disrupt.
upd: you seem to have got the same idea. and you're right, it should be
in the handler class, not in the TABLE as I originally wanted.
> +
> if (MYSQL_HANDLER_RDLOCK_DONE_ENABLED() ||
> MYSQL_HANDLER_WRLOCK_DONE_ENABLED() ||
> MYSQL_HANDLER_UNLOCK_DONE_ENABLED())
> @@ -6935,6 +6956,134 @@ void handler::set_lock_type(enum thr_lock_type lock)
> table->reginfo.lock_type= lock;
> }
>
> +int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data)
> +{
> + DBUG_ASSERT(new_data);
> + if (!table_share->period.unique_keys)
> + return 0;
> + if (table->versioned() && !table->vers_end_field()->is_max())
> + return 0;
> +
> + bool is_update= old_data != NULL;
> + if (!check_overlaps_buffer)
> + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root,
> + table_share->max_unique_length
> + + table_share->reclength);
> + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length;
> + auto *handler= this;
Please, add a comment that on INSERT handler->inited can be NONE
> + if (handler->inited != NONE)
> + {
> + if (!check_overlaps_handler)
> + {
> + check_overlaps_handler= clone(table_share->normalized_path.str,
> + &table_share->mem_root);
> + int error= -1;
> + if (check_overlaps_handler != NULL)
> + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK);
> + if (error)
> + return error;
> + }
> + handler= check_overlaps_handler;
> +
> + // Needs to compare record refs later is old_row_found()
> + if (is_update)
> + position(old_data);
> + }
> +
> + // Save and later restore this handler's keyread
> + int old_this_keyread= this->keyread;
what's that for? you aren't using `this` anywhere below.
Unless handler == this, but then this->keyread cannot be enabled.
> + DBUG_ASSERT(this->ha_end_keyread() == 0);
Eh. Never put any code with side effects into an assert.
Asserts are conditionally compiled.
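I.e. keep the call unconditional and assert only on its result, for example (sketch; the patch
already uses this pattern for end_error further down):

  int err= this->ha_end_keyread();  /* the side effect always runs */
  DBUG_ASSERT(err == 0);            /* only the check is compiled out in release builds */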
> +
> + int error= 0;
> +
> + for (uint key_nr= 0; key_nr < table_share->keys && !error; key_nr++)
> + {
> + const KEY &key_info= table->key_info[key_nr];
> + const uint key_parts= key_info.user_defined_key_parts;
> + if (!key_info.without_overlaps)
> + continue;
> +
> + if (is_update)
> + {
> + bool key_used= false;
> + for (uint k= 0; k < key_parts && !key_used; k++)
> + key_used= bitmap_is_set(table->write_set,
> + key_info.key_part[k].fieldnr - 1);
> + if (!key_used)
> + continue;
> + }
> +
> + error= handler->ha_index_init(key_nr, 0);
> + if (error)
> + return error;
> +
> + auto old_row_found= [is_update, old_data, record_buffer, this, handler](){
There is no reason to use a lambda here.
Could you rewrite that, please? In this particular case old_row_found() is
only used once, so you can inline your lambda there.
> + if (!is_update)
> + return false;
> + /* In case of update it could appear that the nearest neighbour is
> + * a record we are updating. It means, that there are no overlaps
> + * from this side.
> + *
> + * An assumption is made that during update we always have the last
> + * fetched row in old_data. Therefore, comparing ref's is enough
> + * */
> + DBUG_ASSERT(handler != this && inited != NONE);
as a general rule please use
DBUG_ASSERT(x);
DBUG_ASSERT(y);
and not
DBUG_ASSERT(x && y);
> + DBUG_ASSERT(ref_length == handler->ref_length);
> +
> + handler->position(record_buffer);
> + return memcmp(ref, handler->ref, ref_length) == 0;
> + };
> +
> + error= handler->ha_start_keyread(key_nr);
> + DBUG_ASSERT(!error);
> +
> + const uint period_field_length= key_info.key_part[key_parts - 1].length;
> + const uint key_base_length= key_info.key_length - 2 * period_field_length;
> +
> + key_copy(check_overlaps_buffer, new_data, &key_info, 0);
> +
> + /* Copy period_end to period_start.
> + * the value in period_end field is not significant, but anyway let's leave
> + * it defined to avoid uninitialized memory access
> + */
please format your multi-line comments to follow the existing server code
conventions
> + memcpy(check_overlaps_buffer + key_base_length,
> + check_overlaps_buffer + key_base_length + period_field_length,
> + period_field_length);
> +
> + /* Find row with period_start < (period_end of new_data) */
> + error = handler->ha_index_read_map(record_buffer,
> + check_overlaps_buffer,
> + key_part_map((1 << key_parts) - 1),
> + HA_READ_BEFORE_KEY);
> +
> + if (!error && old_row_found())
> + error= handler->ha_index_prev(record_buffer);
> +
> + if (!error && table->check_period_overlaps(key_info, key_info,
> + new_data, record_buffer) == 0)
> + error= HA_ERR_FOUND_DUPP_KEY;
> +
> + if (error == HA_ERR_KEY_NOT_FOUND || error == HA_ERR_END_OF_FILE)
> + error= 0;
> +
> + if (error == HA_ERR_FOUND_DUPP_KEY)
> + overlaps_error_key= key_nr;
> +
> + int end_error= handler->ha_end_keyread();
> + DBUG_ASSERT(!end_error);
> +
> + end_error= handler->ha_index_end();
> + if (!error && end_error)
> + error= end_error;
> + }
> +
> + // Restore keyread of this handler, if it was enabled
> + if (old_this_keyread < MAX_KEY)
> + DBUG_ASSERT(this->ha_start_keyread(old_this_keyread) == 0);
> +
> + return error;
> +}
> +
> #ifdef WITH_WSREP
> /**
> @details
> diff --git a/sql/key.cc b/sql/key.cc
> index bf50094a9e4..d4f33467e2b 100644
> --- a/sql/key.cc
> +++ b/sql/key.cc
> @@ -899,3 +899,45 @@ bool key_buf_cmp(KEY *key_info, uint used_key_parts,
> }
> return FALSE;
> }
> +
> +
a comment please. What does the function return? -1/0/1 ?
> +int key_period_compare_bases(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs)
> +{
> + uint base_part_nr= lhs_key.user_defined_key_parts - 2;
> + int cmp_res= 0;
> + for (uint part_nr= 0; !cmp_res && part_nr < base_part_nr; part_nr++)
> + {
> + Field *f= lhs_key.key_part[part_nr].field;
> + cmp_res= f->cmp(f->ptr_in_record(lhs),
> + rhs_key.key_part[part_nr].field->ptr_in_record(rhs));
> + }
> +
> + return cmp_res;
> +}
> +
a comment please. What does the function return?
> +int key_period_compare_periods(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs)
> +{
> + uint base_part_nr= lhs_key.user_defined_key_parts - 2;
> +
> + Field *lhs_fields[]= {lhs_key.key_part[base_part_nr].field,
> + lhs_key.key_part[base_part_nr + 1].field};
> +
> + Field *rhs_fields[]= {rhs_key.key_part[base_part_nr].field,
> + rhs_key.key_part[base_part_nr + 1].field};
> +
> + int cmp[2][2]; /* l1 > l2, l1 > r2, r1 > l2, r1 > r2 */
> + for (int i= 0; i < 2; i++)
> + {
> + for (int j= 0; j < 2; j++)
> + {
> + cmp[i][j]= lhs_fields[0]->cmp(lhs_fields[i]->ptr_in_record(lhs),
> + rhs_fields[j]->ptr_in_record(rhs));
> + }
> + }
> +
> + bool overlaps = (cmp[0][0] <= 0 && cmp[1][0] > 0)
> + || (cmp[0][0] >= 0 && cmp[0][1] < 0);
> + return overlaps ? 0 : cmp[0][0];
Couldn't this be simplified? Like
if (cmp[1][0] <= 0) // r1 <= l2
return -1;
if (cmp[0][1] >= 0) // l1 >= r2
return 1;
return 0;
and I think it'd be clearer to remove this cmp[][] array and write
the condition directly. Maybe with shortcuts like
const Field const * f=lhs_fields[0];
const uchar const * l1=lhs_fields[0]->ptr_in_record(lhs), ...
if (f->cmp(r1, l2) <= 0)
return -1;
if (f->cmp(l1, r2) >= 0)
return 1;
return 0;
> +}
> \ No newline at end of file
> diff --git a/sql/share/errmsg-utf8.txt b/sql/share/errmsg-utf8.txt
> index 36188e58624..bb2f0fc5296 100644
> --- a/sql/share/errmsg-utf8.txt
> +++ b/sql/share/errmsg-utf8.txt
> @@ -7943,3 +7943,9 @@ ER_WARN_HISTORY_ROW_START_TIME
> eng "Table `%s.%s` history row start '%s' is later than row end '%s'"
> ER_PART_STARTS_BEYOND_INTERVAL
> eng "%`s: STARTS is later than query time, first history partition may exceed INTERVAL value"
> +ER_KEY_CONTAINS_PERIOD_FIELDS
> + eng "Key %`s should not contain period fields in order to make it WITHOUT OVERLAPS"
I'd say "Key %`s cannot explicitly include column %`s"
or "WITHOUT OVERLAPS key %`s cannot explicitly include column %`s"
but I think the latter looks weird, I like the first variant more
> +ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED
> + eng "Period WITHOUT OVERLAPS is not implemented for partitioned tables"
I don't think we should create a separate error message for every
feature that doesn't work with partitioning. There's already
ER_FOREIGN_KEY_ON_PARTITIONED
eng "Foreign key clause is not yet supported in conjunction with partitioning"
ER_PARTITION_NO_TEMPORARY
eng "Cannot create temporary table with partitions"
ER_FULLTEXT_NOT_SUPPORTED_WITH_PARTITIONING
eng "FULLTEXT index is not supported for partitioned tables"
which is two error messages too many. So I'd just say
ER_FEATURE_NOT_SUPPORTED_WITH_PARTITIONING
end "Partitioned tables do not support %s"
where %s could be CREATE TEMPORARY TABLE, FOREIGN KEY, FULLTEXT, WITHOUT OVERLAPS,
and whatever else partitioned tables don't or won't support.
Note that you cannot remove old error messages, so just rename the first one
to ER_FEATURE_NOT_SUPPORTED_WITH_PARTITIONING, and keep the rest unused
but still present in the errmsg-utf8.txt file.
> +ER_PERIOD_WITHOUT_OVERLAPS_NON_UNIQUE
> + eng "Period WITHOUT OVERLAPS is only allowed for unique keys"
"Only UNIQUE or PRIMARY keys can be WITHOUT OVERLAPS"
> diff --git a/sql/table.cc b/sql/table.cc
> index 7ed5121a9c6..8b84fb3035d 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -1519,6 +1520,15 @@ bool read_extra2(const uchar *frm_image, size_t len, extra2_fields *fields)
> size_t length= extra2_read_len(&extra2, e2end);
> if (!length)
> DBUG_RETURN(true);
> +
> + auto fill_extra2= [extra2, length](LEX_CUSTRING *section){
> + if (section->str)
> + return true;
> + *section= {extra2, length};
> + return false;
> + };
don't use a lambda here either, make it a proper function
> +
> + bool fail= false;
> switch (type) {
> case EXTRA2_TABLEDEF_VERSION:
> if (fields->version.str) // see init_from_sql_statement_string()
> @@ -1726,11 +1725,26 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
> keyinfo= &first_keyinfo;
> thd->mem_root= &share->mem_root;
>
> + auto err= [thd, share, &handler_file, &se_plugin, old_root](){
> + share->db_plugin= NULL;
> + share->error= OPEN_FRM_CORRUPTED;
> + share->open_errno= my_errno;
> + delete handler_file;
> + plugin_unlock(0, se_plugin);
> + my_hash_free(&share->name_hash);
> +
> + if (!thd->is_error())
> + open_table_error(share, OPEN_FRM_CORRUPTED, share->open_errno);
> +
> + thd->mem_root= old_root;
> + return HA_ERR_NOT_A_TABLE;
> + };
Revert that too
> +
> if (write && write_frm_image(frm_image, frm_length))
> - goto err;
> + DBUG_RETURN(err());
>
> if (frm_length < FRM_HEADER_SIZE + FRM_FORMINFO_SIZE)
> - goto err;
> + DBUG_RETURN(err());
>
> share->frm_version= frm_image[2];
> /*
> @@ -2251,14 +2265,30 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
> pos+= period.constr_name.length;
>
> if (init_period_from_extra2(&period, pos, end))
> - goto err;
> + DBUG_RETURN(err());
> status_var_increment(thd->status_var.feature_application_time_periods);
> }
>
> + if (extra2.without_overlaps.str)
> + {
> + const uchar *key_pos= extra2.without_overlaps.str;
> + period.unique_keys= read_frm_keyno(key_pos);
> + for (uint k= 0; k < period.unique_keys; k++)
> + {
> + key_pos+= frm_keyno_size;
> + uint key_nr= read_frm_keyno(key_pos);
> + key_info[key_nr].without_overlaps= true;
> + }
> +
> + if ((period.unique_keys + 1) * frm_keyno_size
> + != extra2.without_overlaps.length)
> + DBUG_RETURN(err());
you can also check here that extra2.application_period.str != NULL
otherwise it's OPEN_FRM_CORRUPTED too
> + }
> +
> if (extra2.field_data_type_info.length &&
> field_data_type_info_array.parse(old_root, share->fields,
> extra2.field_data_type_info))
> - goto err;
> + DBUG_RETURN(err());
>
> for (i=0 ; i < share->fields; i++, strpos+=field_pack_length, field_ptr++)
> {
> @@ -8600,6 +8616,15 @@ void TABLE::evaluate_update_default_function()
> DBUG_VOID_RETURN;
> }
>
a comment please. What does the function return?
> +int TABLE::check_period_overlaps(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs)
> +{
> + int cmp_res= key_period_compare_bases(lhs_key, rhs_key, lhs, rhs);
> + if (cmp_res)
> + return cmp_res;
> +
> + return key_period_compare_periods(lhs_key, rhs_key, lhs, rhs);
> +}
>
> void TABLE::vers_update_fields()
> {
> diff --git a/sql/table.h b/sql/table.h
> index 043341db608..e871471101f 100644
> --- a/sql/table.h
> +++ b/sql/table.h
> @@ -1620,6 +1621,13 @@ struct TABLE
> int period_make_insert(Item *src, Field *dst);
> int insert_portion_of_time(THD *thd, const vers_select_conds_t &period_conds,
> ha_rows *rows_inserted);
> + /*
> + @return -1, lhs precedes rhs
> + 0, lhs overlaps rhs
> + 1, lhs succeeds rhs
> + */
ah, a comment, good :) but move it to the function definition, please
> + static int check_period_overlaps(const KEY &lhs_key, const KEY &rhs_key,
> + const uchar *lhs, const uchar *rhs);
> int delete_row();
> void vers_update_fields();
> void vers_update_end();
> @@ -1759,10 +1767,18 @@ class IS_table_read_plan;
>
> /** number of bytes used by field positional indexes in frm */
> constexpr uint frm_fieldno_size= 2;
> +/** number of bytes used by key position number in frm */
> +constexpr uint frm_keyno_size= 2;
> static inline uint16 read_frm_fieldno(const uchar *data)
> { return uint2korr(data); }
> -static inline void store_frm_fieldno(const uchar *data, uint16 fieldno)
> +static inline void store_frm_fieldno(uchar *data, uint16 fieldno)
> +{ int2store(data, fieldno); }
> +static inline uint16 read_frm_keyno(const uchar *data)
> +{ return uint2korr(data); }
> +static inline void store_frm_keyno(uchar *data, uint16 fieldno)
> { int2store(data, fieldno); }
> +static inline size_t extra2_str_size(size_t len)
> +{ return (len > 255 ? 3 : 1) + len; }
why did you move that? it's still not used anywhere outside of unireg.cc
>
> class select_unit;
> class TMP_TABLE_PARAM;
> diff --git a/sql/unireg.cc b/sql/unireg.cc
> index 7130b3e5d8a..ea5d739a97e 100644
> --- a/sql/unireg.cc
> +++ b/sql/unireg.cc
> @@ -390,7 +385,9 @@ LEX_CUSTRING build_frm_image(THD *thd, const LEX_CSTRING &table,
>
> if (create_info->period_info.name)
> {
> - extra2_size+= 1 + extra2_str_size(period_info_len);
> + // two extra2 sections are taken after 10.5
This is a confusing comment, it suggests that both extra2 sections
were added in 10.5. Remove the comment, please, it's not worth it.
> + extra2_size+= 2 + extra2_str_size(period_info_len)
> + + extra2_str_size(without_overlaps_len);
> }
>
> bool has_extra2_field_flags_= has_extra2_field_flags(create_fields);
> diff --git a/sql/unireg.h b/sql/unireg.h
> index 419fbc4bd80..873d6f681fc 100644
> --- a/sql/unireg.h
> +++ b/sql/unireg.h
> @@ -177,7 +177,8 @@ enum extra2_frm_value_type {
>
> EXTRA2_ENGINE_TABLEOPTS=128,
> EXTRA2_FIELD_FLAGS=129,
> - EXTRA2_FIELD_DATA_TYPE_INFO=130
> + EXTRA2_FIELD_DATA_TYPE_INFO=130,
> + EXTRA2_PERIOD_WITHOUT_OVERLAPS=131,
Please, try to create a table that uses WITHOUT OVERLAPS
and open it in 10.4. Just as a test, to make sure it works as expected.
> };
>
> enum extra2_field_flags {
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
[Maria-developers] [MariaDB 10.3.7] Hello, everyone. My process core dumped while calling mysql_ping().
by Ma Allen 14 Jan '20
Environment:
MariaDB 10.3.7 database server & related headers and libraries
CentOS 7.3 or 7.4
./configure --prefix=/home/spiderflow --localstatedir=/home/spiderflow --enable-unix-socket --with-libnss-libraries=/usr/lib64 --with-libnss-includes=/usr/include/nss3 --with-libnspr-libraries=/usr/lib64 --with-libnspr-includes=/usr/include/nspr4 --enable-netmap --enable-rust --disable-gccmarch-native --enable-mysql --with-libmysql-includes=/usr/local/mariadb-10.3.7-linux-systemd-x86_64/include/mysql --with-libmysql-libraries=/usr/local/mariadb-10.3.7-linux-systemd-x86_64/lib
Phenomenon:
(gdb)
#0 ma_pvio_write (pvio=0x10101010101b2e0, buffer=buffer@entry=0x2510101 "\001", length=length@entry=5)
at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_pvio.c:350
#1 0x00007f7100249dea in ma_net_real_write (net=net@entry=0xee9de0 <g_mysql_info>, packet=0x2510101 "\001", len=5)
at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_net.c:335
#2 0x00007f7100249f60 in ma_net_flush (net=net@entry=0xee9de0 <g_mysql_info>) at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_net.c:166
#3 0x00007f710024a71a in ma_net_write_command (net=net@entry=0xee9de0 <g_mysql_info>, command=command@entry=14 '\016', packet=packet@entry=0x7f71003d6c67 "", len=0,
disable_flush=disable_flush@entry=0 '\000') at /home/buildbot/buildbot/build/libmariadb/libmariadb/ma_net.c:244
#4 0x00007f7100252ca4 in mthd_my_send_cmd (mysql=0xee9de0 <g_mysql_info>, command=<optimized out>, arg=0x7f71003d6c67 "", length=0, skipp_check=<optimized out>,
opt_arg=<optimized out>) at /home/buildbot/buildbot/build/libmariadb/libmariadb/mariadb_lib.c:394
#5 0x00007f71002518d0 in mysql_ping (mysql=0xee9de0 <g_mysql_info>) at /home/buildbot/buildbot/build/libmariadb/libmariadb/mariadb_lib.c:2552
#6 0x00000000006651b2 in MySqlWaitConnected (p_mysql_data=p_mysql_data@entry=0xee9de0 <g_mysql_info>) at ../srcSF/spiderFlow-proto-baseline.c:676
#7 0x0000000000668819 in LoadProtoConfInfos (pInfo=pInfo@entry=0xee9de0 <g_mysql_info>) at ../srcSF/spiderFlow-proto-baseline.c:835
#8 0x000000000066c53a in InitBaseLine () at ../srcSF/spiderFlow-proto-baseline.c:1069
#9 0x00000000005f308b in PostConfLoadedSetup (suri=0x10dbb80 <suricata>) at suricata.c:2900
#10 0x0000000000425ce8 in main (argc=5, argv=0x7fff1cdca258) at suricata.c:3072
(gdb)
Analysis:
The core dump always appears after running for a while, or sometimes after the server restarts. It occurs during initialization and involves ONLY ONE thread, the main thread of the whole process.
The related code consists of two functions, both of which check the connection and read data from MariaDB. The first function works fine; however, the connection check in the second one leads to the core dump.
The code that checks the connection is as follows:
void MySqlWaitConnected(MYSQL * p_mysql_data)
{
while (mysql_ping(p_mysql_data) != 0) {
SCLogInfo("Mysql ping error:%s", mysql_error(p_mysql_data));
#ifdef WIN32
Sleep(3000);
#else
sleep(3);
#endif
}
}
Previously, I solved what looks like a similar problem as follows:
Multiple threads shared ONE MySQL handle (the MYSQL data structure), and MySqlWaitConnected() was called before inserting data into MariaDB without any mutex. So when multiple threads did the same operation, the insert was interrupted by mysql_ping(), which led to a core dump. Afterwards, I commented out MySqlWaitConnected() and added a mutex around mysql_real_query() and mysql_commit().
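For illustration, a minimal sketch of that locking approach (the wrapper names and g_mysql_lock are made up; g_mysql_info is the shared handle seen in the backtrace above):

  #include <mysql.h>
  #include <pthread.h>
  #include <string.h>

  extern MYSQL g_mysql_info;                      /* the shared handle */
  static pthread_mutex_t g_mysql_lock = PTHREAD_MUTEX_INITIALIZER;

  static int LockedPing(void)
  {
      int rc;
      pthread_mutex_lock(&g_mysql_lock);
      rc = mysql_ping(&g_mysql_info);
      pthread_mutex_unlock(&g_mysql_lock);
      return rc;
  }

  static int LockedQuery(const char *query)
  {
      int rc;
      pthread_mutex_lock(&g_mysql_lock);
      rc = mysql_real_query(&g_mysql_info, query, (unsigned long) strlen(query));
      pthread_mutex_unlock(&g_mysql_lock);
      return rc;
  }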
Does anyone meet the same problem before? Any suggestion is appreciated. Thanks in advance.
Best Regards,
Allen Ma
Re: [Maria-developers] [Commits] afdd6191d5d: Big Test added for sorting
by Sergey Petrunia 13 Jan '20
Hi Varun,
As discussed:
- Please fix function names
- Please fix the comments to explain what the function is generating
- Please don't use session variables.
Ok to push after this is addressed
On Fri, Jan 03, 2020 at 02:28:25AM +0530, Varun wrote:
> revision-id: afdd6191d5dcb004ec9ac0b908871ad8a370da34 (mariadb-10.4.11-18-gafdd6191d5d)
> parent(s): 59d4f2a373a7960a533e653877ab69a97e91444a
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2020-01-03 02:26:58 +0530
> message:
>
> Big Test added for sorting
>
> ---
> mysql-test/main/order_by_pack_big.result | 194 +++++++++++++++++++++++++++++++
> mysql-test/main/order_by_pack_big.test | 107 +++++++++++++++++
> 2 files changed, 301 insertions(+)
>
> diff --git a/mysql-test/main/order_by_pack_big.result b/mysql-test/main/order_by_pack_big.result
> new file mode 100644
> index 00000000000..66aad449c38
> --- /dev/null
> +++ b/mysql-test/main/order_by_pack_big.result
> @@ -0,0 +1,194 @@
> +set @save_rand_seed1= @@RAND_SEED1;
> +set @save_rand_seed2= @@RAND_SEED2;
> +set @@RAND_SEED1=810763568, @@RAND_SEED2=600681772;
> +create table t1(a int);
> +insert into t1 select seq from seq_1_to_10000 order by rand();
> +#
> +# parameters:
> +# mean mean for the column to be considered
> +# max_val max_value for the column to be considered
> +#
> +# This function also calculates the standard deviation
> +# which is required to convert standard normal distribution
> +# to normal distribution
I cannot make any sense of this.
The intent of this function is to generate random numbers with
the mean of `mean` and standard deviation of ...
(max_val - mean) /6 ?
> +#
> +CREATE FUNCTION f1(mean DOUBLE, max_val DOUBLE) RETURNS DOUBLE
> +BEGIN
> +DECLARE std_dev DOUBLE DEFAULT 0;
> +SET @z= (rand() + rand() + rand() + rand() + rand() + rand() +
> +rand() + rand() + rand() + rand() + rand() + rand() - 6);
Here we get mean=0, stddev=1.
> +SET std_dev= (max_val - mean)/6;
> +SET @z= std_dev*@z + mean;
ok so we have generated a random number with the mean 'mean' and std_dev as
shown above.
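A stand-alone C++ sketch of the same construction (illustration only, not part of the patch):
the sum of 12 uniform(0,1) samples minus 6 is approximately standard normal, which is then
scaled and shifted:

  #include <random>

  double approx_normal(double mean, double max_val, std::mt19937 &gen)
  {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double z= 0.0;
    for (int i= 0; i < 12; i++)
      z+= u(gen);                    /* sum of 12 U(0,1): mean 6, variance 1 */
    z-= 6.0;                         /* approximately N(0, 1) */
    double std_dev= (max_val - mean) / 6.0;
    return std_dev * z + mean;       /* approximately N(mean, std_dev^2) */
  }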
> +return @z;
Please do not use session variables for function-local computations.
The variables are in user-session scope, that is, this function will
eventually cause a surprise by overwriting user's @z.
> +END|
> +#
> +# parameters:
> +# len length of the random string to be generated
> +#
> +# This function generates a random string for the length passed
> +# as an argument with characters in the range of [A,Z]
> +#
> +CREATE function f2(len INT) RETURNS varchar(128)
> +BEGIN
> +DECLARE str VARCHAR(256) DEFAULT '';
> +DECLARE x INT DEFAULT 0;
> +WHILE (len > 0) DO
> +SET x =round(rand()*25);
> +SET str= CONCAT(str, CHAR(65 + x));
> +SET len= len-1;
> +END WHILE;
> +RETURN str;
> +END|
> +#
> +# parameters:
> +# mean mean for the column to be considered
> +# min_val min_value for the column to be considered
> +# max_val max_value for the column to be considered
> +#
> +CREATE function f3(mean DOUBLE, min_val DOUBLE, max_val DOUBLE) RETURNS INT
> +BEGIN
> +DECLARE r DOUBLE DEFAULT 0;
> +WHILE 1=1 DO
> +set r= f1(mean, max_val);
> +IF (r >= min_val) THEN
> +RETURN round(r);
> +end if;
> +END WHILE;
> +RETURN 0;
> +END|
> +create table t2 (id INT NOT NULL, a INT, b int);
> +insert into t2 select a, f3(12, 0, 64), f3(32, 0, 128) from t1;
> +CREATE TABLE t3(
> +id INT NOT NULL,
> +names VARCHAR(64),
> +address VARCHAR(128),
> +PRIMARY KEY (id)
> +);
> +#
> +# table t3 stores string calculated from the length stored in
> +# table t2
> +#
> +insert into t3 select id, f2(a), f2(b) from t2;
> +set sort_buffer_size=262144*10;
> +flush status;
> +select id,
> +MD5(group_concat(substring(names,1,3), substring(address,1,3)))
> +FROM t3
> +GROUP BY id DIV 100
> +ORDER BY id;
> +id MD5(group_concat(substring(names,1,3), substring(address,1,3)))
> +10 351239227a41de08388ea422f928cc29
> +149 67299eb34e363edabe31576890087e97
> +232 7ac931ef07a24ebe1293093ec6fa8f3d
> +311 8625cade62c8b45c63d8978f8968ebb5
> +430 362761f4180d40372667c8dd7cdcc436
> +502 5380af74db071a35fb1d2491368e641b
> +665 d3e3e2a2cb4e0de17c4f12e5b7745802
> +719 5d93632d4c30ec99802f7be7582f4f2d
> +883 27747ef400898c7eeeba3ebea8c42fb1
> +942 d1e4ae80ca57b99ee49201b658a7b040
> +1007 fceb25160237c8a3c262735b81d027ac
> +1134 cfa9c86c901aaace0e9e94dc6a837468
> +1226 4fb8e9ab9acdd251e7bc51db9e4d2f3b
> +1367 e17fa4948562b3411f0b64084de0c605
> +1486 85dd0f507e660600820f106dc8887edf
> +1502 5bf6015f936908eed31f5769ad4b0d72
> +1674 01f6c54ea21c4acd26f6c1df6abd793c
> +1781 6d38cd061db1f30e2e37cd7d9ac600ad
> +1803 2ac17a3853677ffde105735c92a9f2ea
> +1969 e1e2e39e9d26baebe23232a429783feb
> +2087 af67a443d21665bbb425a783f4e434fa
> +2111 1906e379e9ae0b3b580fa134d2a5a146
> +2268 2afaf9091f92fb8e409142552724a85e
> +2328 5a8fd5d24c9f7c7bcfbcde84a5b0cfe2
> +2416 d9a69c46523f71fce606c6d6c92ca516
> +2599 55a436a6fb744eefd6878473c34fa41e
> +2602 98317430fe15bcc9bb5968b5052c9106
> +2777 8b5c30ae940ff7f31839309b535e3a15
> +2858 0db2f3bcb138c2f91445c4205374a3b4
> +2922 fed051b9185591bc0aaebd1e1471944d
> +3027 f0cff102210e7fa32db222ac3444e4cf
> +3131 c2f3f5a92d4c2b45cadd9c8cbf04d1be
> +3220 8db6dfcca0461654dcb963fe2e1d8f41
> +3331 42031ed42643c755dfd936eb96b28ed5
> +3452 09f418c82012ff6789a6429be0c10f98
> +3519 7d26aac1dbbcff68b528b8c1d80a2c7b
> +3680 0ff5b4295168db71b997f6001bba7015
> +3799 3460724c5fc7271a0a3189bf275b9b89
> +3876 13f21a3dfc2bad54c12fffae7cdf8326
> +3937 a240132ca8905b8165bf6e16fa6e7b3a
> +4029 5fabf8408215c5bf896eda4e173a8a98
> +4158 c7829b1eeda97ff8c9b2a24ead3f6df6
> +4291 0d24e7e9da38dc44ffb43976560c4730
> +4355 bc804d019300149cb891b8fe8afbe445
> +4461 bb5a658677030b64ca3fd095f8a054fd
> +4544 e04f6bfc8dcb8d8014ce39e1b707ed0b
> +4646 06af0dd12faee32a07e785c4d56856b8
> +4714 d0c99cc1aead7d06e5323867867d4b00
> +4848 208d1ca5ade34053d92f96937f76380b
> +4935 3b62eb6129970e714bdc74565183e183
> +5014 9e19c021b79e32ea6fceb7ced26a3a68
> +5184 41fa16423738302b2fdd6cda8e52f2c9
> +5219 3ab8090c30c0206c1e30ce6cd76cb617
> +5349 bd3e73dd60fbd1819aa468d3d0e6999c
> +5400 80dc0e71fcbd2abfec9b585cc04a7545
> +5507 96ed16d40a9e6a1231bc88bd6b3f9c3e
> +5672 764347fc7e265a1478c890fa38d8c892
> +5725 6767ae39fec9b789b8b542080162af46
> +5849 41df99caa43ee3f3b162c66c3eb61a44
> +5941 0725e779ca53da50461ef0d3758d819d
> +6064 06d28bf28138d5726ab61e51a2e87edc
> +6135 b2567b682dd449e358e11c4fb7f7bb72
> +6289 8aa8131d32436add670fed1e7628b297
> +6329 127b1600d2a9f857501f0263536d200b
> +6404 266b87348831b9cc5b570e2b16c3006a
> +6580 f70b98a00f6adb163c0f89bb6bb6d1ad
> +6653 a13a591ba0c88985040c51fda2af7a72
> +6773 ee4306ceb6a3266617707a1ca637c328
> +6822 a8c368cc486b650f6254614535b5b051
> +6938 a7c160cec86018b78942b60b62b5b7fd
> +7085 eb360d65bc8080cd5879fb8ddee830cd
> +7180 c54bebbb560d9e9196a6f986022d4253
> +7290 4d1820f520483d785ba4e1c89b938f20
> +7390 0d3cd69b8e02fde232df802f3e9fc7a2
> +7449 7328ee3fe9383f891b9af5244c63a0e0
> +7589 467169481de385077ebcad083dd36b0b
> +7686 ae22b711e21ba0e0fe20ba713408263a
> +7713 e20cd84a1ee8bd1d743947c9c381731d
> +7844 bc3f0534e283616d6a4dbb0902c03fa6
> +7935 146ea350d8f1cfef44aa7470cf9e02f8
> +8059 3a88201a77ccbd8ce651eeb555c29fe5
> +8153 9db1e67ef602768b7182401905bacc26
> +8245 c5e6c51763b0bbc1a7e72fe1615f9440
> +8310 ee37ab957141c733350e21a6ed2176f5
> +8432 34ae43ecbfa6c96e12a8c315937d511f
> +8596 710f7c0bc4fadbdd859352b584b19d66
> +8647 df6f807e47599027749e1b09b04f6083
> +8742 5efcaddfa993721074a1691947ca611e
> +8856 40ad2459d26129770ac6ac2da757ad7e
> +8967 344f6b2c8242b9b3bbd09898a80ba4ee
> +9057 3084c365110820be5bbfc721f4b2f37d
> +9148 13b2a5aa09a1f107f656e848a963e8ea
> +9275 908187dba9416102a566b955b29f709e
> +9311 d6c8096f5763c6ebdaccb3e2cc3ae686
> +9488 62deb4d1a8900ea7cd7daa1909917490
> +9518 730ecae84924d86922c82152c191d0f6
> +9696 0a15d3446ba3d4b7ca8224633fbab666
> +9752 a74f840a4e599466799d4e0879533da0
> +9887 a7c29b0e5edfcd20572e0fda12a9e9aa
> +9903 e89c3ab708646a5d73683ea68c4e366a
> +10000 9cc0d2b033602eaea73fa9b2201b01b6
> +show status like '%sort%';
> +Variable_name Value
> +Sort_merge_passes 0
> +Sort_priority_queue_sorts 0
> +Sort_range 0
> +Sort_rows 10101
> +Sort_scan 2
> +set sort_buffer_size=default;
> +set @@RAND_SEED1= @save_rand_seed1;
> +set @@RAND_SEED2= @save_rand_seed2;
> +drop function f1;
> +drop function f2;
> +drop function f3;
> +drop table t1, t2, t3;
> diff --git a/mysql-test/main/order_by_pack_big.test b/mysql-test/main/order_by_pack_big.test
> new file mode 100644
> index 00000000000..021edfee13f
> --- /dev/null
> +++ b/mysql-test/main/order_by_pack_big.test
> @@ -0,0 +1,107 @@
> +--source include/big_test.inc
> +--source include/have_sequence.inc
> +--source include/have_64bit.inc
> +
> +set @save_rand_seed1= @@RAND_SEED1;
> +set @save_rand_seed2= @@RAND_SEED2;
> +set @@RAND_SEED1=810763568, @@RAND_SEED2=600681772;
> +
> +create table t1(a int);
> +insert into t1 select seq from seq_1_to_10000 order by rand();
> +delimiter |;
> +
> +--echo #
> +--echo # parameters:
> +--echo # mean mean for the column to be considered
> +--echo # max_val max_value for the column to be considered
> +--echo #
> +--echo # This function also calculates the standard deviation
> +--echo # which is required to convert standard normal distribution
> +--echo # to normal distribution
> +--echo #
> +
> +CREATE FUNCTION f1(mean DOUBLE, max_val DOUBLE) RETURNS DOUBLE
> +BEGIN
> + DECLARE std_dev DOUBLE DEFAULT 0;
> + SET @z= (rand() + rand() + rand() + rand() + rand() + rand() +
> + rand() + rand() + rand() + rand() + rand() + rand() - 6);
> + SET std_dev= (max_val - mean)/6;
> + SET @z= std_dev*@z + mean;
> + return @z;
> +END|
> +
> +--echo #
> +--echo # parameters:
> +--echo # len length of the random string to be generated
> +--echo #
> +--echo # This function generates a random string for the length passed
> +--echo # as an argument with characters in the range of [A,Z]
> +--echo #
> +
> +CREATE function f2(len INT) RETURNS varchar(128)
> +BEGIN
> + DECLARE str VARCHAR(256) DEFAULT '';
> + DECLARE x INT DEFAULT 0;
> + WHILE (len > 0) DO
> + SET x =round(rand()*25);
> + SET str= CONCAT(str, CHAR(65 + x));
> + SET len= len-1;
> + END WHILE;
> +RETURN str;
> +END|
> +
> +--echo #
> +--echo # parameters:
> +--echo # mean mean for the column to be considered
> +--echo # min_val min_value for the column to be considered
> +--echo # max_val max_value for the column to be considered
> +--echo #
> +
> +CREATE function f3(mean DOUBLE, min_val DOUBLE, max_val DOUBLE) RETURNS INT
> +BEGIN
> + DECLARE r DOUBLE DEFAULT 0;
> + WHILE 1=1 DO
> + set r= f1(mean, max_val);
> + IF (r >= min_val) THEN
> + RETURN round(r);
> + end if;
> + END WHILE;
> + RETURN 0;
> +END|
> +
> +delimiter ;|
> +
> +create table t2 (id INT NOT NULL, a INT, b int);
> +insert into t2 select a, f3(12, 0, 64), f3(32, 0, 128) from t1;
> +
> +CREATE TABLE t3(
> + id INT NOT NULL,
> + names VARCHAR(64),
> + address VARCHAR(128),
> + PRIMARY KEY (id)
> +);
> +
> +--echo #
> +--echo # table t3 stores string calculated from the length stored in
> +--echo # table t2
> +--echo #
> +
> +insert into t3 select id, f2(a), f2(b) from t2;
> +
> +set sort_buffer_size=262144*10;
> +flush status;
> +select id,
> + MD5(group_concat(substring(names,1,3), substring(address,1,3)))
> +FROM t3
> +GROUP BY id DIV 100
> +ORDER BY id;
> +show status like '%sort%';
> +set sort_buffer_size=default;
> +
> +set @@RAND_SEED1= @save_rand_seed1;
> +set @@RAND_SEED2= @save_rand_seed2;
> +
> +drop function f1;
> +drop function f2;
> +drop function f3;
> +drop table t1, t2, t3;
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
--
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
[Maria-developers] MariaDB and reproducible builds: Mroonga, RocksDB and TokuDB
by Otto Kekäläinen 09 Jan '20
Reproducible builds are important for software supply chain security.
See https://reproducible-builds.org/
All packages in Debian are tested for reproducibility. Currently the
latest MariaDB 10.3 build in Debian fails due to the build of the
Mroonga, RocksDB and TokuDB plugins. See
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/mariadb-…
I've filed issues about this at Mroonga and RocksDB now:
https://github.com/mroonga/mroonga/issues/298
https://github.com/facebook/rocksdb/issues/6276
TokuDB does not seem to accept any issues anymore and there are no
commits since 2015. The project is dead, right?
https://tokutek.atlassian.net/projects/DB/issues
https://github.com/percona/tokudb-engine
Re: [Maria-developers] c91ec05e01b: MDEV-20865 Store foreign key info in TABLE_SHARE
by Sergei Golubchik 07 Jan '20
Hi, Aleksey!
Just a question for now
(and a couple of style comments, I didn't look at the logic yet)
On Dec 04, Aleksey Midenkov wrote:
> revision-id: c91ec05e01b (mariadb-10.4.4-427-gc91ec05e01b)
> parent(s): e6d653b448d
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-11-26 13:04:07 +0300
> message:
>
> MDEV-20865 Store foreign key info in TABLE_SHARE
>
> 1. Access foreign keys via TABLE_SHARE::foreign_keys and
> TABLE_SHARE::referenced_keys;
>
> 2. Remove handler FK interface:
>
> - get_foreign_key_list()
> - get_parent_foreign_key_list()
> - referenced_by_foreign_key()
Good, that was the goal
> 3. Invalidate referenced shares on:
>
> - RENAME TABLE
> - DROP TABLE
> - RENAME COLUMN
> - CREATE TABLE
> - ADD FOREIGN KEY
>
> When foreign table is created or altered by the above operations all
> referenced shares are closed. This blocks the operation while any
> referenced shares are used (when at least one its TABLE instance is
> locked).
And this is the main question of this email:
Why do you close referenced tables?
Minor comments below
> diff --git a/sql/handler.h b/sql/handler.h
> index e913af1d15d..10984505f70 100644
> --- a/sql/handler.h
> +++ b/sql/handler.h
> @@ -1030,6 +1031,15 @@ struct TABLE_SHARE;
> struct HA_CREATE_INFO;
> struct st_foreign_key_info;
> typedef struct st_foreign_key_info FOREIGN_KEY_INFO;
> +class Table_ident;
> +class FK_list : public List<FOREIGN_KEY_INFO>
> +{
> +public:
> + /* Get all referenced tables for foreign key fk_name. */
> + bool get(THD *thd, std::set<Table_ident> &result, LEX_CSTRING &fk_name, bool foreign);
> + /* Get all referenced or foreign tables. */
> + bool get(THD *thd, std::set<Table_ident> &result, bool foreign);
Seems unnecessary. Copying a list into a std::set _could_ be justified, if
you'd later used it for quick checks "if this table belongs to a set" -
something you cannot quickly do with a List.
But as far as I can see you copy the List into std::set and then iterate
this set. This doesn't make much sense, you can iterate the original
list just fine.
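For illustration, something like this should be enough (a sketch only; the
field names are just what I recall FOREIGN_KEY_INFO providing):

  List_iterator_fast<FOREIGN_KEY_INFO> it(share->foreign_keys);
  while (FOREIGN_KEY_INFO *fk= it++)
  {
    /* use fk->referenced_db / fk->referenced_table directly,
       no intermediate std::set needed */
  }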
> +};
> typedef bool (stat_print_fn)(THD *thd, const char *type, size_t type_len,
> const char *file, size_t file_len,
> const char *status, size_t status_len);
> diff --git a/sql/lex_string.h b/sql/lex_string.h
> index a62609c6b60..769f4dcbf5e 100644
> --- a/sql/lex_string.h
> +++ b/sql/lex_string.h
> @@ -41,11 +41,47 @@ class Lex_cstring : public LEX_CSTRING
> str= start;
> length= end - start;
> }
> + Lex_cstring(const LEX_CSTRING &src)
> + {
> + str= src.str;
> + length= src.length;
> + }
> void set(const char *_str, size_t _len)
> {
> str= _str;
> length= _len;
> }
> + Lex_cstring *strdup_root(MEM_ROOT &mem_root)
The way you use it, it looks like you really need a constructor, not
a strdup.
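E.g. roughly (a sketch; the signature is just for illustration):

  Lex_cstring(MEM_ROOT *mem_root, const LEX_CSTRING &src)
  {
    str= src.str ? strmake_root(mem_root, src.str, src.length) : NULL;
    length= src.str ? src.length : 0;
  }

with the caveat that a constructor cannot report an out-of-memory failure by
returning NULL the way the current function does.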
> + {
> + Lex_cstring *dst=
> + (Lex_cstring *) alloc_root(&mem_root, sizeof(Lex_cstring));
> + if (!dst)
> + return NULL;
> + if (!str)
> + {
> + dst->str= NULL;
> + dst->length= 0;
> + return dst;
> + }
> + dst->str= (const char *) memdup_root(&mem_root, str, length + 1);
> + if (!dst->str)
> + return NULL;
> + dst->length= length;
> + return dst;
> + }
> + bool operator< (const Lex_cstring& rhs) const
> + {
> + return length < rhs.length || (length == rhs.length && memcmp(str, rhs.str, length) < 0);
> + }
> + bool operator== (const Lex_cstring& rhs) const
> + {
> + return length == rhs.length && 0 == memcmp(str, rhs.str, length);
> + }
> + bool operator> (const Lex_cstring& rhs) const
> + {
> + return length > rhs.length || (length == rhs.length && memcmp(str, rhs.str, length) > 0);
> + }
Nope. We've been here before, haven't we? Don't overload operators.
If you want a method to compare, call it cmp(), or greater_than(), or something
btw your comparison has quite weird and unconventional semantics: it orders by length first, so e.g. 'b' sorts before 'aa', which is not the usual lexicographic order.
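A conventional comparison would be byte-wise first, with length only as a
tie-breaker, e.g. (a sketch only):

  int cmp(const Lex_cstring &rhs) const
  {
    size_t len= MY_MIN(length, rhs.length);
    int res= len ? memcmp(str, rhs.str, len) : 0;
    if (res)
      return res;
    return length < rhs.length ? -1 : length > rhs.length ? 1 : 0;
  }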
> +
> };
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] MDEV-21318: Wrong results with window functions and implicit grouping
by Sergey Petrunia 20 Dec '19
Hi Varun,
> revision-id: fc34657511a9aa08dd92f7363dc53f58934f9673 (mariadb-10.2.29-62-gfc34657511a)
> parent(s): f0aa073f2bf3d8d85b3d028df89cdb4cdfc4002d
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2019-12-17 16:40:06 +0530
> message:
>
> MDEV-21318: Wrong results with window functions and implicit grouping
>
> The issue here is for degenerate joins we should execute the window
> function but it is not getting executed in all the cases.
>
> To get the window function values window function needs to be executed
> always. This currently does not happen in few cases
> where the join would return 0 or 1 row like
> 1) IMPOSSIBLE WHERE
> 2) IMPOSSIBLE HAVING
> 3) MIN/MAX optimization
> 4) EMPTY CONST TABLE
> 5) ZERO LIMIT
>
> The fix is to make sure that window functions get executed
> and the temporary table is setup for the execution of window functions
>
> ---
> mysql-test/r/win.result | 51 ++++++++++++++++++++++++++++++++++++++++++++
> mysql-test/t/win.test | 56 +++++++++++++++++++++++++++++++++++++++++++++++++
> sql/sql_select.cc | 33 ++++++++++++++++++++++++++++-
> sql/sql_select.h | 1 +
> 4 files changed, 140 insertions(+), 1 deletion(-)
...
> diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> index c9cb533aa33..f6943b18cee 100644
> --- a/sql/sql_select.cc
> +++ b/sql/sql_select.cc
> @@ -1447,6 +1447,7 @@ JOIN::optimize_inner()
> zero_result_cause= "Zero limit";
> }
> table_count= top_join_tab_count= 0;
> + implicit_grouping_with_window_funcs();
not needed + redundant (I'll elaborate below)
> error= 0;
> goto setup_subq_exit;
> }
> @@ -1502,6 +1503,7 @@ JOIN::optimize_inner()
> zero_result_cause= "No matching min/max row";
> table_count= top_join_tab_count= 0;
> error=0;
> + implicit_grouping_with_window_funcs();
not needed + redundant (I'll elaborate below)
> goto setup_subq_exit;
> }
> if (res > 1)
> @@ -1517,6 +1519,7 @@ JOIN::optimize_inner()
> tables_list= 0; // All tables resolved
> select_lex->min_max_opt_list.empty();
> const_tables= top_join_tab_count= table_count;
> + implicit_grouping_with_window_funcs();
This doesn't seem to be needed?
> /*
> Extract all table-independent conditions and replace the WHERE
> clause with them. All other conditions were computed by opt_sum_query
> @@ -1615,6 +1618,7 @@ JOIN::optimize_inner()
> zero_result_cause= "no matching row in const table";
> DBUG_PRINT("error",("Error: %s", zero_result_cause));
> error= 0;
> + implicit_grouping_with_window_funcs();
OK
> goto setup_subq_exit;
> }
> if (!(thd->variables.option_bits & OPTION_BIG_SELECTS) &&
> @@ -1639,6 +1643,7 @@ JOIN::optimize_inner()
> zero_result_cause=
> "Impossible WHERE noticed after reading const tables";
> select_lex->mark_const_derived(zero_result_cause);
> + implicit_grouping_with_window_funcs();
Ok
> goto setup_subq_exit;
> }
>
> @@ -1781,6 +1786,7 @@ JOIN::optimize_inner()
> zero_result_cause=
> "Impossible WHERE noticed after reading const tables";
> select_lex->mark_const_derived(zero_result_cause);
> + implicit_grouping_with_window_funcs();
> goto setup_subq_exit;
There is no test coverage for this case. Please add.
> }
>
> @@ -18225,7 +18231,8 @@ void set_postjoin_aggr_write_func(JOIN_TAB *tab)
> }
> }
> else if (join->sort_and_group && !tmp_tbl->precomputed_group_by &&
> - !join->sort_and_group_aggr_tab && join->tables_list)
> + !join->sort_and_group_aggr_tab && join->tables_list &&
> + join->top_join_tab_count)
> {
> DBUG_PRINT("info",("Using end_write_group"));
> aggr->set_write_func(end_write_group);
> @@ -26924,6 +26931,30 @@ Item *remove_pushed_top_conjuncts(THD *thd, Item *cond)
> return cond;
> }
>
> +/*
> + There are 5 cases in which we shortcut the join optimization process as we
> + conclude that the join would be a degenerate one
> + 1) IMPOSSIBLE WHERE
> + 2) IMPOSSIBLE HAVING
> + 3) MIN/MAX optimization (@see opt_sum_query)
> + 4) EMPTY CONST TABLE
> + 5) ZERO LIMIT
> + If a window function is present in any of the above cases then to get the
> + result of the window function, we need to execute it. So we need to
Do we really need to do this for cases #2 and #5? In these cases no output is
produced, so why compute the window functions?
> + create a temporary table for its execution. Here we need to take in mind
> + that aggregate functions and non-aggregate function need not be executed.
> +
> +*/
> +
> +
> +void JOIN::implicit_grouping_with_window_funcs()
The name of the function doesn't look right. It's a noun ("subject") while
most functions start with verb (e.g. JOIN::optimize_...)
handle_implicit_grouping_with_window_funcs() isn't ideal either but would be
better I think.
> +{
> + if (select_lex->have_window_funcs() && send_row_on_empty_set())
> + {
> + const_tables= top_join_tab_count= table_count= 0;
> + }
> +}
> +
> /**
> @} (end of group Query_Optimizer)
> */
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Re: [Maria-developers] [MariaDB/server] Proper locking for mysql.gtid_slave_pos truncation (84b437d)
by Kristian Nielsen 18 Dec '19
Sergey Vojtovich <notifications(a)github.com> writes:
> ATTN @dr-m, @andrelkin, @SachinSetiya, @knielsen
So IIUC, this is about incorrect usage of ha_truncate() in
rpl_slave_state::truncate_state_table().
This is used only for
SET GLOBAL gtid_slave_pos = "..."
when all slave threads are stopped and nothing else is accessing the
gtid_pos table.
So it's fine to use ha_truncate() if that can be done easily (and
correctly). But it would also be fine just to loop and delete all rows one
by one in a normal transaction, if that is simpler. gtid_slave_pos is a
small table, there are normally only a few rows per active replication
domain.
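Roughly this kind of loop, as a sketch only (error handling and the
transaction wrapping omitted, and I have not checked it against the actual
truncate_state_table() code):

  int err;
  if (!(err= table->file->ha_rnd_init_with_error(1)))
  {
    while (!(err= table->file->ha_rnd_next(table->record[0])))
      if ((err= table->file->ha_delete_row(table->record[0])))
        break;
    table->file->ha_rnd_end();
  }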
I'm not myself very familiar with details of metadata locking etc. around
ha_truncate().
But looking at the code now, I don't understand why it only truncates one
table? If --gtid-pos-auto-engines is in effect, there could be multiple
tables... shouldn't they all be cleared when setting the gtid_slave_pos
variable? If so, maybe the delete-rows-one-by-one approach is in any case
preferable over ha_truncate, since it can then be done transactionally, to
not leave an inconsistent gtid_slave_pos state if one truncate fails and
another succeeds?
Hope this helps,
- Kristian.
Re: [Maria-developers] f72427f463d: MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
by andrei.elkin@pp.inet.fi 17 Dec '19
Sujatha, Kristian, howdy.
Sorry, I managed to send my view without actually cc-ing Kristian, as claimed.
X-From-Line: nobody Wed Nov 27 19:27:51 2019
From: Andrei Elkin <andrei.elkin(a)mariadb.com>
To: sujatha <sujatha.sivakumar(a)mariadb.com>
Cc: <commits(a)mariadb.org>, <maria-developers(a)lists.launchpad.net>
Subject: Re: f72427f463d: MDEV-20923:UBSAN: member access within address
Let me try that again now.
Kristian, the question was about `xid_count_per_binlog' struct
comment -- ..
struct xid_count_per_binlog : public ilink {
char *binlog_name;
uint binlog_name_len;
ulong binlog_id;
/* Total prepared XIDs and pending checkpoint requests in this binlog. */
long xid_count;
/* For linking in requests to the binlog background thread. */
xid_count_per_binlog *next_in_queue;
xid_count_per_binlog(); /* Give link error if constructor used. */
.. -----------------------------^
};
Now I am guessing the constructor was declared but never defined, so that any
inadvertent use would fail at link time. Was that related to de-POD-ing of
the object (because of the constructor)? It's pretty obscure.
We would appreciate your explanation.
The earlier review mail is quoted further below, with one more [@]-marked note added, Sujatha.
Cheers,
Andrei
-------------------------------------------------------------------------------
Sujatha, hello.
The patch looks quite good. I have two suggestions though.
The first is to move the allocation of what is to be freed into the new
parameterized constructor, so that it matches the destructor's free().
Secondly, I suggest leaving the explicit zero-argument constructor
intact. Kristian is cc-d to help us decide.
My comments are below.
Cheers,
Andrei
sujatha <sujatha.sivakumar(a)mariadb.com> writes:
> revision-id: f72427f463d316a54ebf87c2e84c73947e3c5fe4 (mariadb-10.1.43-5-gf72427f463d)
> parent(s): 13db50fc03e7312e6c01b06c7e4af69f69ba5382
> author: Sujatha
> committer: Sujatha
> timestamp: 2019-11-12 16:11:16 +0530
> message:
>
> MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
>
> Problem:
> -------
> Accessing a member within 'xid_count_per_binlog' structure results in
> following error when 'UBSAN' is enabled.
>
> member access within address 0xXXX which does not point to an object of type
> 'xid_count_per_binlog'
>
> Analysis:
> ---------
> The problem appears to be that no constructor for 'xid_count_per_binlog' is
> being called, and thus the vtable will not be initialized.
>
> Fix:
> ---
> Defined a parameterized constructor for 'xid_count_per_binlog' class.
>
> ---
> sql/log.cc | 28 ++++++++++++++--------------
> sql/log.h | 9 ++++++++-
> 2 files changed, 22 insertions(+), 15 deletions(-)
>
> diff --git a/sql/log.cc b/sql/log.cc
> index acf1f8f8a9c..2b8b67febef 100644
> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -3216,7 +3216,7 @@ void MYSQL_BIN_LOG::cleanup()
> DBUG_ASSERT(!binlog_xid_count_list.head());
> WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::cleanup(): Removing xid_list_entry "
> "for %s (%lu)", b);
> - my_free(b);
> + delete b;
> }
>
> mysql_mutex_destroy(&LOCK_log);
> @@ -3580,17 +3580,17 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
> */
> uint off= dirname_length(log_file_name);
> uint len= strlen(log_file_name) - off;
> - char *entry_mem, *name_mem;
> - if (!(new_xid_list_entry = (xid_count_per_binlog *)
> - my_multi_malloc(MYF(MY_WME),
> - &entry_mem, sizeof(xid_count_per_binlog),
> - &name_mem, len,
> - NULL)))
> + char *name_mem;
> + name_mem= (char *) my_malloc(len, MYF(MY_ZEROFILL));
> + if (!name_mem)
> goto err;
> memcpy(name_mem, log_file_name+off, len);
That is, both the my_malloc and the memcpy should move into the new
constructor, to match [*] (see the label below).
> - new_xid_list_entry->binlog_name= name_mem;
> - new_xid_list_entry->binlog_name_len= len;
> - new_xid_list_entry->xid_count= 0;
> + new_xid_list_entry= new xid_count_per_binlog(name_mem, (int)len);
> + if (!new_xid_list_entry)
> + {
> + my_free(name_mem);
> + goto err;
> + }
>
> /*
> Find the name for the Initial binlog checkpoint.
> @@ -3711,7 +3711,7 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
> {
> WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::open(): Removing xid_list_entry for "
> "%s (%lu)", b);
> - my_free(binlog_xid_count_list.get());
> + delete binlog_xid_count_list.get();
> }
> mysql_cond_broadcast(&COND_xid_list);
> WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::open(): Adding new xid_list_entry for "
> @@ -3758,7 +3758,7 @@ Turning logging off for the whole duration of the MySQL server process. \
> To turn it on again: fix the cause, \
> shutdown the MySQL server and restart it.", name, errno);
> if (new_xid_list_entry)
> - my_free(new_xid_list_entry);
> + delete new_xid_list_entry;
[@] With `delete` we don't need the non-null check (deleting a null pointer is a no-op), so the `if()` can be removed.
> if (file >= 0)
> mysql_file_close(file, MYF(0));
> close(LOG_CLOSE_INDEX);
> @@ -4252,7 +4252,7 @@ bool MYSQL_BIN_LOG::reset_logs(THD *thd, bool create_new_log,
> DBUG_ASSERT(b->xid_count == 0);
> WSREP_XID_LIST_ENTRY("MYSQL_BIN_LOG::reset_logs(): Removing "
> "xid_list_entry for %s (%lu)", b);
> - my_free(binlog_xid_count_list.get());
> + delete binlog_xid_count_list.get();
> }
> mysql_cond_broadcast(&COND_xid_list);
> reset_master_pending--;
> @@ -9736,7 +9736,7 @@ TC_LOG_BINLOG::mark_xid_done(ulong binlog_id, bool write_checkpoint)
> break;
> WSREP_XID_LIST_ENTRY("TC_LOG_BINLOG::mark_xid_done(): Removing "
> "xid_list_entry for %s (%lu)", b);
> - my_free(binlog_xid_count_list.get());
> + delete binlog_xid_count_list.get();
> }
>
> mysql_mutex_unlock(&LOCK_xid_list);
> diff --git a/sql/log.h b/sql/log.h
> index b4c9b24a3a9..30a55e577a4 100644
> --- a/sql/log.h
> +++ b/sql/log.h
> @@ -587,7 +587,14 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
> long xid_count;
> /* For linking in requests to the binlog background thread. */
> xid_count_per_binlog *next_in_queue;
> - xid_count_per_binlog(); /* Give link error if constructor used. */
Maybe we should leave it as is, since it seems to be there for catching
inadvertent object constructions.
I'm cc-ing Kristian to clear this up.
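For what it's worth, if we do keep the guard, C++11 lets us make the intent
explicit and turn the link error into a compile error (assuming C++11 is
acceptable in that part of the tree):

  xid_count_per_binlog()= delete; /* catch inadvertent default construction */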
> + xid_count_per_binlog(char *log_file_name, uint log_file_name_len)
> + :binlog_name(log_file_name), binlog_name_len(log_file_name_len),
> + binlog_id(0), xid_count(0)
> + {}
> + ~xid_count_per_binlog()
> + {
> + my_free(binlog_name);
[*]
> + }
> };
> I_List<xid_count_per_binlog> binlog_xid_count_list;
> mysql_mutex_t LOCK_binlog_background_thread;
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
Re: [Maria-developers] 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
by Sergei Golubchik 17 Dec '19
Hi, Nikita!
On Nov 15, Nikita Malyavin wrote:
> revision-id: 6ac19d09c66 (mariadb-10.4.4-280-g6ac19d09c66)
> parent(s): 67ddb6507d5
> author: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> committer: Nikita Malyavin <nikitamalyavin(a)gmail.com>
> timestamp: 2019-08-20 17:56:07 +1000
> message:
>
> MDEV-16978 Application-time periods: WITHOUT OVERLAPS
>
> * The overlaps check is implemented on a handler level per row
> command. It creates a separate cursor (actually, another handler
> instance) and caches it inside the original handler, when
> ha_update_row or ha_insert_row is issued. Cursor closes on unlocking
> the handler.
>
> * Containing the same key in index means unique constraint violation
> even in usual terms. So we fetch left and right neighbours and check
> that they have same key prefix, excluding from the key only the period
> part. If it doesnt match, then there's no such neighbour, and the
> check passes. Otherwise, we check if this neighbour intersects with
> the considered key.
>
> * the check does introduce new error and fails with ER_DUPP_KEY error.
> This might break REPLACE workflow and should be fixed separately
>
> diff --git a/sql/handler.cc b/sql/handler.cc
> index 94cffd69b75..560f6316602 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -6913,6 +6928,130 @@ void handler::set_lock_type(enum thr_lock_type lock)
> table->reginfo.lock_type= lock;
> }
>
> +int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data)
> +{
> + DBUG_ASSERT(new_data);
> + if (!table_share->period.unique_keys)
> + return 0;
> + if (table->versioned() && !table->vers_end_field()->is_max())
> + return 0;
> +
> + bool is_update= old_data != NULL;
> + if (!check_overlaps_buffer)
> + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root,
> + table_share->max_unique_length
> + + table_share->reclength);
> + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length;
> + auto *handler= this;
> + if (handler->inited != NONE)
> + {
> + if (!check_overlaps_handler)
> + {
> + check_overlaps_handler= clone(table_share->normalized_path.str,
> + &table_share->mem_root);
wouldn't it be simpler and cheaper to use HA_EXTRA_REMEMBER_POS and
HA_EXTRA_RESTORE_POS, as we've discussed?
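I.e. something along these lines (a sketch only; I haven't checked how it
interacts with the scan that is currently open on this handler):

  extra(HA_EXTRA_REMEMBER_POS);  /* remember where the ongoing scan is */
  /* ... do the index reads for the overlap check on this same handler ... */
  extra(HA_EXTRA_RESTORE_POS);   /* restore the position, no cloned handler */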
> + int error= -1;
> + if (check_overlaps_handler != NULL)
> + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK);
> + if (error)
> + return error;
> + }
> + handler= check_overlaps_handler;
> + }
why "if (handler->inited != NONE)" ?
What happens if it is NONE?
> +
> + for (uint key_nr= 0; key_nr < table_share->keys; key_nr++)
> + {
> + const KEY *key_info= table->key_info + key_nr;
> + const uint key_parts= key_info->user_defined_key_parts;
> + if (!key_info->without_overlaps)
> + continue;
> +
> + key_copy(check_overlaps_buffer, new_data, key_info, 0);
> + if (is_update)
> + {
> + bool key_used= false;
> + for (uint k= 0; k < key_parts && !key_used; k++)
> + key_used= bitmap_is_set(table->write_set,
> + key_info->key_part[k].fieldnr - 1);
> + if (!key_used)
> + continue;
> + }
Good.
> +
> + int error= handler->ha_index_init(key_nr, 0);
> + if (error)
> + return error;
> +
> + for (int run= 0; run < 2; run++)
> + {
> + if (run == 0)
> + {
> + error = handler->ha_index_read_map(record_buffer,
> + check_overlaps_buffer,
> + key_part_map((1 << key_parts) - 1),
> + HA_READ_KEY_OR_PREV);
> + if (error == HA_ERR_KEY_NOT_FOUND)
> + continue;
> + }
> + else
> + {
> + error = handler->ha_index_next(record_buffer);
> + if (error == HA_ERR_END_OF_FILE)
> + continue;
> + }
> +
> + if (error)
> + {
> + handler->ha_index_end();
> + return error;
> + }
I found this "for (int run= 0; run < 2; run++)" rather confusing.
Particularly as you have an if() with different code for the first and the
second run.
it would be clearer to unroll the loop and move the common code (period
comparison) into a function. Like
handler->ha_index_read_map(...);
compare_periods();
handler->ha_index_next();
compare_periods();
But it can be done better. You search for the key with the same value
and a period start <= the period start of the new row. And then you
have to check two rows for overlaps. If you search for a key with the
period start <= the _period end_ of the new row, then you only
need to check one row (maybe one more, if updating).
Also, why do you store both period ends in the index? It seems like it'd
be enough to store only one end - only the start or only the end. Both
ends help if you use keyreads, but you don't. On second thought, perhaps
you should use keyreads; there's no need to fetch the whole row
for this overlap check. Then again, it might not work
with HA_EXTRA_REMEMBER_POS.
> +
> + /* In case of update it could appear that the nearest neighbour is
> + * a record we are updating. It means, that there are no overlaps
> + * from this side. */
> + if (is_update && memcmp(old_data + table->s->null_bytes,
> + record_buffer + table->s->null_bytes,
> + table->s->reclength - table->s->null_bytes) == 0)
> + {
> + continue;
> + }
I'd rather compare row positions here.
> +
> + uint period_key_part_nr= key_parts - 2;
> + int cmp_res= 0;
> + for (uint part_nr= 0; !cmp_res && part_nr < period_key_part_nr; part_nr++)
> + {
> + Field *f= key_info->key_part[part_nr].field;
> + cmp_res= f->cmp(f->ptr_in_record(new_data),
> + f->ptr_in_record(record_buffer));
> + }
> + if (cmp_res)
> + continue; /* key is different => no overlaps */
> +
> + int period_cmp[2][2]= {/* l1 > l2, l1 > r2, r1 > l2, r1 > r2 */};
> + for (int i= 0; i < 2; i++)
> + {
> + for (int j= 0; j < 2; j++)
> + {
> + Field *lhs= key_info->key_part[period_key_part_nr + i].field;
> + Field *rhs= key_info->key_part[period_key_part_nr + j].field;
> +
> + period_cmp[i][j]= lhs->cmp(lhs->ptr_in_record(new_data),
> + rhs->ptr_in_record(record_buffer));
> + }
> + }
this can be simplified too if you'll change the code as I suggested
above to do only one index read - you'll only need one comparison,
instead of four.
> +
> + if ((period_cmp[0][0] <= 0 && period_cmp[1][0] > 0)
> + || (period_cmp[0][0] >= 0 && period_cmp[0][1] < 0))
> + {
> + handler->ha_index_end();
> + return HA_ERR_FOUND_DUPP_KEY;
> + }
> + }
> + error= handler->ha_index_end();
> + if (error)
> + return error;
> + }
> + return 0;
> +}
> +
> #ifdef WITH_WSREP
> /**
> @details
> diff --git a/sql/handler.h b/sql/handler.h
> index 63d0bf2215c..71debf9bab7 100644
> --- a/sql/handler.h
> +++ b/sql/handler.h
> @@ -3237,6 +3243,8 @@ class handler :public Sql_alloc
> DBUG_ASSERT(cached_table_flags < (HA_LAST_TABLE_FLAG << 1));
> return cached_table_flags;
> }
> + /** PRIMARY KEY WITHOUT OVERLAPS check is done globally */
what do you mean "globally"?
> + int ha_check_overlaps(const uchar *old_data, const uchar* new_data);
> /**
> These functions represent the public interface to *users* of the
> handler class, hence they are *not* virtual. For the inheritance
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 9bb1d98152b..5aba86003c6 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -4549,25 +4552,57 @@ static bool vers_prepare_keys(THD *thd, HA_CREATE_INFO *create_info,
> if (key->type != Key::PRIMARY && key->type != Key::UNIQUE)
> continue;
>
> - Key_part_spec *key_part= NULL;
> - List_iterator<Key_part_spec> part_it(key->columns);
> - while ((key_part=part_it++))
> + if (create_info->versioned())
> {
> - if (!my_strcasecmp(system_charset_info,
> - row_start_field,
> - key_part->field_name.str) ||
> -
> - !my_strcasecmp(system_charset_info,
> - row_end_field,
> - key_part->field_name.str))
> - break;
> + Key_part_spec *key_part=NULL;
> + List_iterator<Key_part_spec> part_it(key->columns);
> + while ((key_part=part_it++))
> + {
> + if (row_start_field.streq(key_part->field_name) ||
> + row_end_field.streq(key_part->field_name))
> + break;
> + }
> + if (!key_part)
> + key->columns.push_back(new Key_part_spec(&row_end_field, 0));
> }
> - if (key_part)
> - continue; // Key already contains Sys_start or Sys_end
> + }
>
> - Key_part_spec *key_part_sys_end_col=
> - new (thd->mem_root) Key_part_spec(&create_info->vers_info.as_row.end, 0);
> - key->columns.push_back(key_part_sys_end_col);
> + key_it.rewind();
> + while ((key=key_it++))
> + {
> + if (key->without_overlaps)
> + {
> + if (key->type != Key::PRIMARY && key->type != Key::UNIQUE)
> + {
> + my_error(ER_PERIOD_WITHOUT_OVERLAPS_NON_UNIQUE, MYF(0), key->period.str);
> + return true;
> + }
> + if (!create_info->period_info.is_set()
> + || !key->period.streq(create_info->period_info.name))
> + {
> + my_error(ER_PERIOD_NOT_FOUND, MYF(0), key->period.str);
> + return true;
> + }
> + if (thd->work_part_info)
> + {
> + my_error(ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED, MYF(0));
why?
> + return true;
> + }
> + const auto &period_start= create_info->period_info.period.start;
> + const auto &period_end= create_info->period_info.period.end;
> + List_iterator<Key_part_spec> part_it(key->columns);
> + while (Key_part_spec *key_part= part_it++)
> + {
> + if (period_start.streq(key_part->field_name)
> + || period_end.streq(key_part->field_name))
> + {
> + my_error(ER_KEY_CONTAINS_PERIOD_FIELDS, MYF(0), key->name.str);
> + return true;
> + }
> + }
> + key->columns.push_back(new Key_part_spec(&period_start, 0));
> + key->columns.push_back(new Key_part_spec(&period_end, 0));
> + }
> }
>
> return false;
> diff --git a/sql/table.h b/sql/table.h
> index 2b866159fe0..6511960488e 100644
> --- a/sql/table.h
> +++ b/sql/table.h
> @@ -1730,10 +1731,18 @@ class IS_table_read_plan;
>
> /** number of bytes used by field positional indexes in frm */
> constexpr uint frm_fieldno_size= 2;
> +/** number of bytes used by key position number in frm */
> +constexpr uint frm_keyno_size= 2;
> static inline uint16 read_frm_fieldno(const uchar *data)
> { return uint2korr(data); }
> -static inline void store_frm_fieldno(const uchar *data, uint16 fieldno)
> +static inline void store_frm_fieldno(uchar *data, uint16 fieldno)
> { int2store(data, fieldno); }
> +static inline uint16 read_frm_keyno(const uchar *data)
> +{ return uint2korr(data); }
> +static inline void store_frm_keyno(uchar *data, uint16 fieldno)
> +{ int2store(data, fieldno); }
> +static inline size_t frm_ident_stored_size(size_t len)
> +{ return len + (len > 255 ? 3 : 1); }
this is exactly the extra2_str_size() function. Don't duplicate it, reuse it
(and rename it, if you'd like a more generic name)
>
> class select_unit;
> class TMP_TABLE_PARAM;
> diff --git a/sql/unireg.cc b/sql/unireg.cc
> index 7130b3e5d8a..1e95242786e 100644
> --- a/sql/unireg.cc
> +++ b/sql/unireg.cc
> @@ -485,6 +486,16 @@ LEX_CUSTRING build_frm_image(THD *thd, const LEX_CSTRING &table,
> store_frm_fieldno(pos, get_fieldno_by_name(create_info, create_fields,
> create_info->period_info.period.end));
> pos+= frm_fieldno_size;
> + store_frm_keyno(pos, create_info->period_info.unique_keys);
> + pos+= frm_keyno_size;
> + for (uint key= 0; key < keys; key++)
> + {
> + if (key_info[key].without_overlaps)
> + {
> + store_frm_keyno(pos, key);
> + pos+= frm_keyno_size;
> + }
> + }
This will make 10.5 frms look corrupted in 10.4. Better use a new EXTRA2
flag for that. With a number > EXTRA2_ENGINE_IMPORTANT.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hi,
Please release MariaDB 10.4 for CentOS 8.
https://downloads.mariadb.org/mariadb/repositories/
does not show CentOS 8.
thanks
--
Thomas Stephen Lee
Re: [Maria-developers] 387839e44cc: Plugin for Hashicorp Vault Key Management System
by Sergei Golubchik 04 Dec '19
Hi, Julius!
Just a couple of comments, see below:
On Dec 04, Julius Goryavsky wrote:
> revision-id: 387839e44cc (mariadb-10.4.10-179-g387839e44cc)
> parent(s): 8115a02283d
> author: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> committer: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> timestamp: 2019-11-15 13:37:35 +0100
> message:
>
> Plugin for Hashicorp Vault Key Management System
> diff --git a/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> new file mode 100644
> index 00000000000..32c0e6417a3
> --- /dev/null
> +++ b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
...
> + if (tolen_len > max_token_size)
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Maximum allowed token length exceeded",
> + 0);
> + return true;
> + }
> + size_t buf_len = x_vault_token_len + tolen_len + 1;
typo: "token_len" I think
> +#ifdef _WIN32
> + char *token_header = (char *) _alloca(sizeof(char) * buf_len);
> +#else
> + char *token_header = (char *) alloca(sizeof(char) * buf_len);
> +#endif
Please, remove all these ifdefs and add one at the very top of the file:
#ifdef _WIN32
#define alloca _alloca
#endif
Here I presume that you've tested it in buildbot and alloca is indeed
present on all builders and we don't need a fallback here.
Also remove sizeof(char) everywhere, it's always 1. C99 standard says
6.5.3.4 The sizeof operator
...
When applied to an operand that has type char, unsigned char, or signed
char, (or a qualified version thereof) the result is 1.
> + snprintf(token_header, buf_len, "%s%s", x_vault_token, token);
> + curl_errbuf[0] = '\0';
> + if ((list= curl_slist_append(list, token_header)) == NULL ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, curl_errbuf)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION,
> + write_response_memory)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEDATA,
> + &read_data_stream)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list)) !=
> + CURLE_OK ||
...
> + /*
> + If the result of the conversion does not fit into an
> + unsigned integer or if an error (for example, overflow)
> + occurred during the conversion, then we need to return
> + an error:
> + */
> +#if ULONG_MAX > UINT_MAX
> + if (version > UINT_MAX || errno)
> +#else
> + if (errno)
> +#endif
did you verify that compilation fails without it?
Better to keep the number of ifdefs in the code to a minimum, so don't
add this one unless absolutely necessary.
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + return (unsigned int) version;
> +}
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] [Commits] 3a11b49bb5b: MDEV-20900: IN predicate to IN subquery conversion causes performance regression
by Sergey Petrunia 03 Dec '19
Hi Varun,
As far as I understand, the intent of the patch is that this check:
+ for (uint i=0; i < n; i++)
+ {
+ if (item1->element_index(i)->cmp_type() !=
+ item2->element_index(i)->cmp_type())
matches this check in subquery_types_allow_materialization()
if (!inner->type_handler()->subquery_type_allows_materialization(inner,
outer))
which has implementations like so:
bool Type_handler_real_result::
subquery_type_allows_materialization(const Item *inner,
const Item *outer) const
{
DBUG_ASSERT(inner->cmp_type() == REAL_RESULT);
return outer->cmp_type() == REAL_RESULT;
}
// .. and the same for other datatypes ...
I would like to see comments in both places (cmp_row_types and
subquery_types_allow_materialization()) explaining that.
Note that some Type_handler::subquery_types_allow_materialization() have a more
complex implementation, for example
Type_handler_string_result::subquery_types_allow_materialization() compares
collations.
Is there any reason why cmp_row_types() doesn't use the same call as
subquery_types_allow_materialization does:
if (!inner->type_handler()->subquery_type_allows_materialization(inner,
outer))
?
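I.e. roughly (a sketch; which argument plays the "inner" and which the
"outer" role here would need checking against how the TVC is turned into
the subquery):

  static bool cmp_row_types(Item *item1, Item *item2)
  {
    uint n= item1->cols();
    if (item2->check_cols(n))
      return true;
    for (uint i= 0; i < n; i++)
    {
      Item *inner= item2->element_index(i);
      Item *outer= item1->element_index(i);
      if (!inner->type_handler()->subquery_type_allows_materialization(inner,
                                                                       outer))
        return true;
    }
    return false;
  }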
On Mon, Nov 04, 2019 at 04:38:02PM +0530, Varun wrote:
> revision-id: 3a11b49bb5b716538f98c6a212bbbfa6fc9b7a88 (mariadb-10.3.17-145-g3a11b49bb5b)
> parent(s): 162f475c4be81dfbceed093ad03d114b4c69a3c0
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2019-11-04 16:28:25 +0530
> message:
>
> MDEV-20900: IN predicate to IN subquery conversion causes performance regression
>
> Disable the IN predicate to IN subquery conversion when the types on the left and
> right hand side of the IN predicate are not of comparable type.
>
> ---
> mysql-test/main/opt_tvc.result | 53 ++++++++++++++++++++++++++++++++++++++----
> mysql-test/main/opt_tvc.test | 31 ++++++++++++++++++++++++
> sql/sql_tvc.cc | 27 ++++++++++++++++++++-
> 3 files changed, 106 insertions(+), 5 deletions(-)
>
> diff --git a/mysql-test/main/opt_tvc.result b/mysql-test/main/opt_tvc.result
> index 5329a9f64be..a68e70e8a25 100644
> --- a/mysql-test/main/opt_tvc.result
> +++ b/mysql-test/main/opt_tvc.result
> @@ -629,11 +629,9 @@ SELECT * FROM t1 WHERE i IN (NULL, NULL, NULL, NULL, NULL);
> i
> EXPLAIN EXTENDED SELECT * FROM t1 WHERE i IN (NULL, NULL, NULL, NULL, NULL);
> id select_type table type possible_keys key key_len ref rows filtered Extra
> -1 PRIMARY t1 ALL NULL NULL NULL NULL 3 100.00
> -1 PRIMARY <derived3> ALL NULL NULL NULL NULL 5 100.00 Using where; FirstMatch(t1); Using join buffer (flat, BNL join)
> -3 DERIVED NULL NULL NULL NULL NULL NULL NULL NULL No tables used
> +1 SIMPLE t1 ALL NULL NULL NULL NULL 3 100.00 Using where
> Warnings:
> -Note 1003 /* select#1 */ select `test`.`t1`.`i` AS `i` from `test`.`t1` semi join ((values (NULL),(NULL),(NULL),(NULL),(NULL)) `tvc_0`) where `test`.`t1`.`i` = `tvc_0`.`_col_1`
> +Note 1003 select `test`.`t1`.`i` AS `i` from `test`.`t1` where `test`.`t1`.`i` in (NULL,NULL,NULL,NULL,NULL)
> SET in_predicate_conversion_threshold= default;
> DROP TABLE t1;
> #
> @@ -687,3 +685,50 @@ f1 f2
> 1 1
> DROP TABLE t1,t2,t3;
> SET @@in_predicate_conversion_threshold= default;
> +#
> +# MDEV-20900: IN predicate to IN subquery conversion causes performance regression
> +#
> +create table t1(a int, b int);
> +insert into t1 select seq-1, seq-1 from seq_1_to_10;
> +set in_predicate_conversion_threshold=2;
> +explain select * from t1 where t1.a IN ("1","2","3","4");
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 SIMPLE t1 ALL NULL NULL NULL NULL 10 Using where
> +select * from t1 where t1.a IN ("1","2","3","4");
> +a b
> +1 1
> +2 2
> +3 3
> +4 4
> +set in_predicate_conversion_threshold=0;
> +explain select * from t1 where t1.a IN ("1","2","3","4");
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 SIMPLE t1 ALL NULL NULL NULL NULL 10 Using where
> +select * from t1 where t1.a IN ("1","2","3","4");
> +a b
> +1 1
> +2 2
> +3 3
> +4 4
> +set in_predicate_conversion_threshold=2;
> +explain select * from t1 where (t1.a,t1.b) in (("1","1"),(2,2),(3,3),(4,4));
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 SIMPLE t1 ALL NULL NULL NULL NULL 10 Using where
> +select * from t1 where (t1.a,t1.b) in (("1","1"),(2,2),(3,3),(4,4));
> +a b
> +1 1
> +2 2
> +3 3
> +4 4
> +set in_predicate_conversion_threshold=0;
> +explain select * from t1 where (t1.a,t1.b) in (("1","1"),(2,2),(3,3),(4,4));
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 SIMPLE t1 ALL NULL NULL NULL NULL 10 Using where
> +select * from t1 where (t1.a,t1.b) in (("1","1"),(2,2),(3,3),(4,4));
> +a b
> +1 1
> +2 2
> +3 3
> +4 4
> +drop table t1;
> +SET @@in_predicate_conversion_threshold= default;
> diff --git a/mysql-test/main/opt_tvc.test b/mysql-test/main/opt_tvc.test
> index 7319dbdc9e8..e4e8c6d7919 100644
> --- a/mysql-test/main/opt_tvc.test
> +++ b/mysql-test/main/opt_tvc.test
> @@ -3,6 +3,7 @@
> #
> source include/have_debug.inc;
> source include/default_optimizer_switch.inc;
> +source include/have_sequence.inc;
>
> create table t1 (a int, b int);
>
> @@ -397,3 +398,33 @@ SELECT * FROM t3 WHERE (f1,f2) IN ((2, 2), (1, 2), (3, 5), (1, 1));
> DROP TABLE t1,t2,t3;
>
> SET @@in_predicate_conversion_threshold= default;
> +
> +--echo #
> +--echo # MDEV-20900: IN predicate to IN subquery conversion causes performance regression
> +--echo #
> +
> +create table t1(a int, b int);
> +insert into t1 select seq-1, seq-1 from seq_1_to_10;
> +
> +set in_predicate_conversion_threshold=2;
> +
> +let $query= select * from t1 where t1.a IN ("1","2","3","4");
> +eval explain $query;
> +eval $query;
> +
> +set in_predicate_conversion_threshold=0;
> +eval explain $query;
> +eval $query;
> +
> +set in_predicate_conversion_threshold=2;
> +let $query= select * from t1 where (t1.a,t1.b) in (("1","1"),(2,2),(3,3),(4,4));
> +eval explain $query;
> +eval $query;
> +
> +set in_predicate_conversion_threshold=0;
> +eval explain $query;
> +eval $query;
> +
> +drop table t1;
> +SET @@in_predicate_conversion_threshold= default;
> +
> diff --git a/sql/sql_tvc.cc b/sql/sql_tvc.cc
> index 816c6fe1089..78c7c34a81a 100644
> --- a/sql/sql_tvc.cc
> +++ b/sql/sql_tvc.cc
> @@ -796,6 +796,31 @@ bool Item_subselect::wrap_tvc_into_select(THD *thd, st_select_lex *tvc_sl)
> }
>
>
> +/*
> + @brief
> + Check whether the items are of comparable type or not
> +
> + @retval
> + 0 comparable
> + 1 not comparable
> +*/
> +
> +static bool cmp_row_types(Item* item1, Item* item2)
> +{
> + uint n= item1->cols();
> + if (item2->check_cols(n))
> + return true;
> +
> + for (uint i=0; i < n; i++)
> + {
> + if (item1->element_index(i)->cmp_type() !=
> + item2->element_index(i)->cmp_type())
> + return true;
> + }
> + return false;
> +}
> +
> +
> /**
> @brief
> Transform IN predicate into IN subquery
> @@ -843,7 +868,7 @@ Item *Item_func_in::in_predicate_to_in_subs_transformer(THD *thd,
>
> for (uint i=1; i < arg_count; i++)
> {
> - if (!args[i]->const_item())
> + if (!args[i]->const_item() || cmp_row_types(args[0], args[i]))
> return this;
> }
>
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
--
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Re: [Maria-developers] 387839e44cc: Plugin for Hashicorp Vault Key Management System
by Sergei Golubchik 02 Dec '19
Hi, Julius!
Looks good! A couple of minor comments, see below
On Dec 02, Julius Goryavsky wrote:
> revision-id: 387839e44cc (mariadb-10.4.10-179-g387839e44cc)
> parent(s): 8115a02283d
> author: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> committer: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> timestamp: 2019-11-15 13:37:35 +0100
> message:
>
> Plugin for Hashicorp Vault Key Management System
>
> diff --git a/plugin/hashicorp_key_management/CMakeLists.txt b/plugin/hashicorp_key_management/CMakeLists.txt
> new file mode 100644
> index 00000000000..4b5f79afa66
> --- /dev/null
> +++ b/plugin/hashicorp_key_management/CMakeLists.txt
> @@ -0,0 +1,11 @@
> +INCLUDE(FindCURL)
> +IF(NOT CURL_FOUND)
> + # Can't build plugin
> + RETURN()
> +ENDIF()
> +
> +INCLUDE_DIRECTORIES(${CMAKE_SOURCE_DIR}/sql ${CURL_INCLUDE_DIR})
> +
> +MYSQL_ADD_PLUGIN(HASHICORP_KEY_MANAGEMENT
> + hashicorp_key_management_plugin.cc
> + LINK_LIBRARIES ${CURL_LIBRARIES})
> diff --git a/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> new file mode 100644
> index 00000000000..26ee7b7cd3d
> --- /dev/null
> +++ b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> @@ -0,0 +1,325 @@
> +/* Copyright (C) 2019 MariaDB Corporation
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; version 2 of the License.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program; if not, write to the Free Software
> + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA */
> +
> +#include <mysql/plugin_encryption.h>
> +#include <mysqld_error.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <limits.h>
> +#include <errno.h>
> +#include <string>
> +#include <sstream>
> +#include <curl/curl.h>
> +
> +static char* vault_url;
> +static char* token;
> +static char* vault_ca;
> +
> +static MYSQL_SYSVAR_STR(vault_ca, vault_ca,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Path to Certificate Authority (CA) bundle (is a file "
> + "that contains root and intermediate certificates)",
> + NULL, NULL, "");
> +
> +static MYSQL_SYSVAR_STR(vault_url, vault_url,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "HTTP[s] URL that is used to connect to the Hashicorp Vault server",
> + NULL, NULL, "https://127.0.0.1:8200/v1/secret");
> +
> +static MYSQL_SYSVAR_STR(token, token,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY | PLUGIN_VAR_NOSYSVAR,
> + "Authentication token that passed to the Hashicorp Vault "
> + "in the request header",
> + NULL, NULL, "");
> +
> +static struct st_mysql_sys_var* settings[] = {
> + MYSQL_SYSVAR(vault_url),
> + MYSQL_SYSVAR(token),
> + MYSQL_SYSVAR(vault_ca),
> + NULL
> +};
> +
> +static std::string get_error_from_curl (CURLcode curl_code, char *curl_errbuf)
> +{
> + size_t len = strlen(curl_errbuf);
> + std::ostringstream stream;
> + if (curl_code != CURLE_OK)
> + {
> + stream << "CURL returned this error code: " << curl_code;
> + stream << " with error message : ";
> + if (len)
> + stream << curl_errbuf;
> + else
> + stream << curl_easy_strerror(curl_code);
> + }
> + return stream.str();
> +}
> +
> +#define max_response_size 65536
> +
> +static size_t write_response_memory (void *contents, size_t size, size_t nmemb,
> + void *userp)
> +{
> + size_t realsize = size * nmemb;
> + if (size != 0 && realsize / size != nmemb)
> + return 0; // overflow detected
> + std::ostringstream *read_data = static_cast<std::ostringstream*>(userp);
> + size_t current_length = read_data->tellp();
> + if (current_length + realsize > max_response_size)
> + return 0; // response size limit exceeded
> + read_data->write(static_cast<char*>(contents), realsize);
> + if (!read_data->good())
> + return 0;
> + return realsize;
> +}
> +
> +#define timeout 300
> +
> +static bool curl_run (char *url, std::string *response)
> +{
> + char curl_errbuf [CURL_ERROR_SIZE];
> + struct curl_slist *list = NULL;
> + std::ostringstream read_data_stream;
> + CURLcode curl_res = CURLE_OK;
> + long http_code = 0;
> + CURL *curl = curl_easy_init();
> + if (curl == NULL)
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Cannot initialize curl session",
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + std::string token_header = std::string("X-Vault-Token:") +
> + std::string(token);
> + curl_errbuf[0] = '\0';
> + if ((list= curl_slist_append(list, token_header.c_str())) == NULL ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, curl_errbuf)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION,
> + write_response_memory)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEDATA,
> + &read_data_stream)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list)) !=
> + CURLE_OK ||
> + /*
> + The options CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST are
> + set explicitly to withstand possible future changes in curl defaults:
> + */
> + (curl_res= curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L)) !=
> + CURLE_OK ||
> + (vault_ca != NULL && strlen(vault_ca) != 0 &&
vault_ca still cannot be NULL :)
> + (curl_res= curl_easy_setopt(curl, CURLOPT_CAINFO, vault_ca)) !=
> + CURLE_OK) ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_USE_SSL, CURLUSESSL_ALL)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, timeout)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout)) !=
> + CURLE_OK)
> + {
if you get an error here you won't free the list, so it'll be a memory leak.
In fact, I don't understand why you need two if()s here; just combine
them into one. The error message is the same anyway, and the second if()
already has curl_slist_free_all().
> + my_printf_error(ER_UNKNOWN_ERROR,
> + get_error_from_curl(curl_res, curl_errbuf).c_str(),
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + if ((curl_res = curl_easy_setopt(curl, CURLOPT_URL, url)) != CURLE_OK ||
> + (curl_res = curl_easy_perform(curl)) != CURLE_OK ||
> + (curl_res = curl_easy_getinfo (curl, CURLINFO_RESPONSE_CODE,
> + &http_code)) != CURLE_OK)
> + {
> + curl_slist_free_all(list);
> + my_printf_error(ER_UNKNOWN_ERROR,
> + get_error_from_curl(curl_res, curl_errbuf).c_str(),
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + curl_slist_free_all(list);
> + if (http_code == 404)
> + {
> + *response = std::string("");
> + return false;
> + }
> + *response = read_data_stream.str();
> + return http_code < 200 || http_code >= 300;
> +}
> +
> +static unsigned int get_latest_version (uint key_id)
> +{
> + char url[2048];
> + std::string response_str;
> + const char *response;
> + const char *js, *ver;
> + int js_len, ver_len;
> + size_t response_len;
> + snprintf(url, sizeof(url), "%s/%u", vault_url, key_id);
> + if (curl_run(url, &response_str)) {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Unable to get key data",
> + 0);
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + response = response_str.c_str();
> + response_len = response_str.size();
> + if (json_get_object_key(response, response + response_len,
> + "metadata", &js, &js_len) != JSV_OBJECT)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + if (json_get_object_key(js, js+js_len,"version",
> + &ver, &ver_len) != JSV_NUMBER)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + /*
> + Here we may use atoi, but it has undefined overflow behavior
> + and does not work with unsigned integers, so use strtoul:
> + */
> + unsigned long version = strtoul(ver, NULL, 10);
> + /*
> + If the result of the conversion does not fit into an
> + unsigned integer or if an error (for example, overflow)
> + occurred during the conversion, then we need to return
> + an error:
> + */
> + if (version > UINT_MAX || errno)
this might be an error with gcc, "condition is always true"
double check this please, no need to fix anything if there's no error
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + return version;
this might be an error with VS, because of the implicit ulong->uint cast.
double check this please, no need to fix anything if there's no error
> +}
> +
> +static unsigned int get_key_from_vault (unsigned int key_id,
> + unsigned int key_version,
> + unsigned char *dstbuf,
> + unsigned int *buflen)
> +{
> + char url[2048];
> + std::string response_str;
> + const char *response;
> + const char *js, *ver, *key;
> + int js_len, key_len, ver_len;
> + size_t response_len;
> + if (key_version != ENCRYPTION_KEY_VERSION_INVALID)
> + snprintf(url, sizeof(url), "%s/%u?%u", vault_url, key_id, key_version);
> + else
> + snprintf(url, sizeof(url), "%s/%u", vault_url, key_id);
> + if (curl_run(url, &response_str))
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Unable to get key data",
> + 0);
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + response = response_str.c_str();
> + response_len = response_str.size();
> +#ifndef NDEBUG
> + /*
> + An internal check that is needed only for debugging the plugin
> + operation - in order to ensure that we get from the Hashicorp Vault
> + server exactly the version of the key that is needed:
> + */
> + if (key_version != ENCRYPTION_KEY_VERSION_INVALID)
> + {
> + if (json_get_object_key(response, response + response_len,
> + "metadata", &js, &js_len) != JSV_OBJECT)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + if (json_get_object_key(js, js+js_len,"version",
> + &ver, &ver_len) != JSV_NUMBER)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + /*
> + Here we may use atoi, but it has undefined overflow behavior
> + and does not work with unsigned integers, so use strtoul:
> + */
> + unsigned long version = strtoul(ver, NULL, 10);
> + /*
> + If the result of the conversion does not fit into an
> + unsigned integer or if an error (for example, overflow)
> + occurred during the conversion, then we need to return
> + an error:
> + */
> + if (errno || key_version != version)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + }
> +#endif
> + if (json_get_object_key(response, response + response_len,
> + "data", &js, &js_len) != JSV_OBJECT)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + if (json_get_object_key(js, js + js_len, "data",
> + &key, &key_len) != JSV_STRING)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + if (*buflen < (unsigned) key_len) {
> + *buflen= key_len;
> + return ENCRYPTION_KEY_BUFFER_TOO_SMALL;
> + }
> + *buflen= key_len;
> + memcpy(dstbuf, key, key_len);
> + return 0;
> +}
> +
> +struct st_mariadb_encryption hashicorp_key_management_plugin= {
> + MariaDB_ENCRYPTION_INTERFACE_VERSION,
> + get_latest_version,
> + get_key_from_vault,
> + 0, 0, 0, 0, 0
> +};
> +
> +static int hashicorp_key_management_plugin_init(void *p)
> +{
> + curl_global_init(CURL_GLOBAL_ALL);
> + return 0;
> +}
> +
> +static int hashicorp_key_management_plugin_deinit(void *p)
> +{
> + curl_global_cleanup();
> + return 0;
> +}
> +
> +/*
> + Plugin library descriptor
> +*/
> +maria_declare_plugin(hashicorp_key_management)
> +{
> + MariaDB_ENCRYPTION_PLUGIN,
> + &hashicorp_key_management_plugin,
> + "hashicorp_key_management",
> + "MariaDB Corporation",
> + "HashiCorp Vault key management plugin",
> + PLUGIN_LICENSE_GPL,
> + hashicorp_key_management_plugin_init,
> + hashicorp_key_management_plugin_deinit,
> + 0x0100 /* 1.0 */,
> + NULL, /* status variables */
> + settings,
> + "1.0",
> + MariaDB_PLUGIN_MATURITY_ALPHA
> +}
> +maria_declare_plugin_end;
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] 387839e44cc: Plugin for Hashicorp Vault Key Management System
by Sergei Golubchik 28 Nov '19
Hi, Julius!
On Nov 28, Julius Goryavsky wrote:
> Hi, Sergey,
>
> many thanks for the detailed review, I fixed all the flaws and left only
> what raises questions or where I have explanations:
>
> >> + if ((list= curl_slist_append(list, token_header.c_str())) == NULL ||
> >> + (list= curl_slist_append(list, "Content-Type: application/json"))
> == NULL ||
> >
> > Why do you specify Content-Type header if you aren't sending any content?
> > Seems redundant.
>
> I have already come across servers that respond with the wrong content-type
> if the input request had a different content-type or if the input request
> does not contain a content-type at all. Fortunately, the Hashicorp server
> works without setting the correct content-type in the input, but it may be
> more reliable to set it explicitly?
I don't understand. Content-Type in the request header does not mean
that you *accept* application/json, it means you are *sending*
application/json. And you are not, you are sending no body at
all, just the header.
> >> + (vault_ca != NULL && strlen(vault_ca) != 0 && (curl_res=
> curl_easy_setopt(curl, CURLOPT_CAINFO , vault_ca)) != CURLE_OK) ||
>
> > How can vault_ca == NULL here?
> > I think it cannot, the check is redundant too.
>
> I don't know; do we have a guarantee that if this option is not set, the
> corresponding variable will always point to an empty zero-terminated
> string and will never just be NULL instead? (This is the path to the file
> with the CA bundle. In principle, a Hashicorp Vault can work without it, it
> is needed only if there is no root or intermediate certificates on the
> other side.)
Yes, I believe vault_ca can never be NULL. The default value is "", so
it'll be either that or a string specified on the command line.
> > > + std::string ver_str = std::string(ver, ver_len);
> > > + return strtoul(ver_str.c_str(), NULL, 10);
>
> > Why do you need to do that? create std::string, copy/reallocate it?
> > What's the point? Just do atol(ver)
>
> Does json_get_object_key return a new zero-terminated string? atol cannot
> be applied to a fragment with a length (without trailing zero), as far as I
> know.
No, it doesn't return a zero-terminated string. It returns a pointer
into a valid json string. So you can be sure that ver[ver_len] is a
non-digit character (it'll be '\n' or maybe '}'), which is enough for
atol().
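A tiny hedged illustration, assuming ver points into the raw JSON buffer
exactly as json_get_object_key() left it:

/* Sketch only: for a response like  {..., "version": 42}  where ver points
   at the '4', both calls stop at the first non-digit ('}'), so the missing
   '\0' terminator is harmless: */
unsigned long v1= strtoul(ver, NULL, 10);  /* 42 */
long v2= atol(ver);                        /* 42 */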
> >> + return ENCRYPTION_KEY_VERSION_INVALID;
> >> + return decode_data(key, key_len, dstbuf, buflen);
>
> > Why? I don't see that in hashicorp docs that the key is base64-encoded
>
> But after all, as far as I understand at this level, the key for us is
> arbitrary binary data? What if it contains all kinds of quotation marks,
> commas, or UTF-8 escape codes and all kinds of unprintable characters like
> a zero? I thought that it should be base64-encoded if it is binary data, so
> that we can pass it through JSON encoding in the form of text that will not
> cause a failure in the JSON parser.
I believe that with the proper escaping json can handle any binary data
just fine.
With your base64 requirement you basically mean that one cannot use 3rd
party tools to create/manage keys, unless the tool has an option to
store the keys base64-encoded.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] 387839e44cc: Plugin for Hashicorp Vault Key Management System
by Sergei Golubchik 27 Nov '19
Hi, Julius!
On Nov 27, Julius Goryavsky wrote:
> revision-id: 387839e44cc (mariadb-10.4.10-179-g387839e44cc)
> parent(s): 8115a02283d
> author: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> committer: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> timestamp: 2019-11-15 13:37:35 +0100
> message:
>
> Plugin for Hashicorp Vault Key Management System
>
> diff --git a/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> new file mode 100644
> index 00000000000..6de1d9d1f3a
> --- /dev/null
> +++ b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> @@ -0,0 +1,330 @@
> +/* Copyright (C) 2019 MariaDB Corporation
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; version 2 of the License.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program; if not, write to the Free Software
> + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA */
> +
> +#include <mysql/plugin.h>
This is redundant, because plugin_encryption.h includes plugin.h.
But it does no harm either, so if you want to include it explicitly
for clarity - sure, do it.
> +#include <mysql/plugin_encryption.h>
> +#include <mysqld_error.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <string>
> +#include <sstream>
> +#include <curl/curl.h>
> +
> +static char* vault_url;
> +static char* token;
> +static char* vault_ca;
> +
> +static MYSQL_SYSVAR_STR(vault_ca, vault_ca,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Path to Certificate Authority (CA) bundle (is a file "
> + "that contains root and intermediate certificates)",
> + NULL, NULL, "");
> +
> +static MYSQL_SYSVAR_STR(vault_url, vault_url,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "HTTP[s] URL that is used to connect to the Hashicorp Vault server",
> + NULL, NULL, "https://127.0.0.1:8200/v1/secret");
> +
> +static MYSQL_SYSVAR_STR(token, token,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY | PLUGIN_VAR_NOSYSVAR,
> + "Authentication token that passed to the Hashicorp Vault "
> + "in the request header",
> + NULL, NULL, "");
> +
> +static struct st_mysql_sys_var* settings[] = {
> + MYSQL_SYSVAR(vault_url),
> + MYSQL_SYSVAR(token),
> + MYSQL_SYSVAR(vault_ca),
> + NULL
> +};
> +
> +static std::string get_error_from_curl (CURLcode curl_code, char *curl_errbuf)
> +{
> + size_t len = strlen(curl_errbuf);
> + std::ostringstream stream;
> + if (curl_code != CURLE_OK)
> + {
> + stream << "CURL returned this error code: " << curl_code;
> + stream << " with error message : ";
> + if (len)
> + stream << curl_errbuf;
> + else
> + stream << curl_easy_strerror(curl_code);
> + }
> + return stream.str();
> +}
> +
> +#define max_response_size 65536
> +
> +static size_t write_response_memory (void *contents, size_t size, size_t nmemb, void *userp)
> +{
> + size_t realsize = size * nmemb;
> + if (size != 0 && realsize / size != nmemb)
> + return 0; // overflow
> + std::ostringstream *read_data = static_cast<std::ostringstream*>(userp);
> + size_t ss_pos = read_data->tellp();
> + read_data->seekp(0, std::ios::end);
> + size_t number_of_read_bytes = read_data->tellp();
> + read_data->seekp(ss_pos);
> + if (number_of_read_bytes + realsize > max_response_size)
> + return 0; // response size limit exceeded
> + read_data->write(static_cast<char*>(contents), realsize);
> + if (!read_data->good())
> + return 0;
> + return realsize;
> +}
> +
> +#define timeout 300
> +
> +static bool setup_curl_session (CURL *curl, char *curl_errbuf,
> + curl_slist **list_ptr,
> + std::ostringstream &read_data_stream)
> +{
> + struct curl_slist *list = NULL;
> + std::string token_header = std::string("X-Vault-Token:") +
> + std::string(token);
> + CURLcode curl_res = CURLE_OK;
> + read_data_stream.str("");
> + read_data_stream.clear();
Seems redundant, the stream was just created
> + curl_errbuf[0] = '\0';
> + if ((list= curl_slist_append(list, token_header.c_str())) == NULL ||
> + (list= curl_slist_append(list, "Content-Type: application/json")) == NULL ||
Why do you specify Content-Type header if you aren't sending any content?
Seems redundant.
> + (curl_res= curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, curl_errbuf)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_response_memory)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEDATA, static_cast<void *>(&read_data_stream))) != CURLE_OK ||
Do you need to cast here?
Seems redundant.
> + (curl_res= curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L)) != CURLE_OK ||
This is redundant, CURLOPT_SSL_VERIFYPEER is 1 by default,
CURLOPT_SSL_VERIFYHOST is 2 by default.
But again, if you want to be explicit for the sake of clarity - feel free to
> + (vault_ca != NULL && strlen(vault_ca) != 0 && (curl_res= curl_easy_setopt(curl, CURLOPT_CAINFO, vault_ca)) != CURLE_OK) ||
How can vault_ca == NULL here?
I think it cannot, the check is redundant too.
> + (curl_res= curl_easy_setopt(curl, CURLOPT_USE_SSL, CURLUSESSL_ALL)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, timeout)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout)) != CURLE_OK ||
> + (curl_res = curl_easy_setopt(curl, CURLOPT_HTTP_VERSION, (long)CURL_HTTP_VERSION_1_1)) != CURLE_OK)
Why?
> + {
> + *list_ptr = list;
> + my_printf_error(ER_UNKNOWN_ERROR,
> + get_error_from_curl(curl_res, curl_errbuf).c_str(),
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + *list_ptr = list;
> + return false;
> +}
> +
> +static bool curl_run (char * url, const char * * response)
> +{
> + char curl_errbuf [CURL_ERROR_SIZE];
> + struct curl_slist *list = NULL;
> + std::ostringstream read_data_stream;
> + CURLcode curl_res = CURLE_OK;
> + long http_code = 0;
> + CURL *curl = curl_easy_init();
> + if (curl == NULL)
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Cannot initialize curl session",
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + if (setup_curl_session(curl, curl_errbuf, &list, read_data_stream) ||
your setup_curl_session doesn't make much sense now. It could've been useful if one
could prepare the session once and then reuse it.
But the way you do it now, it just draws a rather arbitrary line
where some of the curl setup happens inside the setup_curl_session function
and the rest in the caller.
Better remove setup_curl_session and put that big if() here.
> + (curl_res = curl_easy_setopt(curl, CURLOPT_URL, url)) != CURLE_OK ||
> + (curl_res = curl_easy_perform(curl)) != CURLE_OK ||
> + (curl_res = curl_easy_getinfo (curl, CURLINFO_RESPONSE_CODE,
> + &http_code)) != CURLE_OK)
> + {
> + if (list != NULL)
> + {
> + curl_slist_free_all(list);
> + }
> + my_printf_error(ER_UNKNOWN_ERROR,
> + get_error_from_curl(curl_res, curl_errbuf).c_str(),
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + if (list != NULL)
> + {
> + curl_slist_free_all(list);
> + }
> + if (http_code == 404)
> + {
> + *response = NULL;
> + return false;
> + }
> + *response = read_data_stream.str().c_str();
Really? Who owns this string? Will read_data_stream free it automatically
when it gets destructed?
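To spell out the lifetime problem, a hedged sketch (response_str below is a
hypothetical std::string* out-parameter; the later revision of the patch does
essentially this):

/* read_data_stream.str() builds a temporary std::string; the pointer from
   its c_str() is already invalid once this statement finishes: */
*response = read_data_stream.str().c_str();  /* dangling pointer */

/* one way out: let the caller own a real copy of the bytes */
*response_str = read_data_stream.str();      /* response_str: std::string* */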
> + return http_code < 200 || http_code >= 300;
> +}
> +
> +static unsigned int get_latest_version (uint key_id)
> +{
> + char url[2048];
> + const char *response;
> + const char *js, *ver;
> + int js_len, ver_len;
> + size_t response_len;
> + snprintf(url, sizeof(url), "%s/%u", vault_url, key_id);
> + if (curl_run(url, &response)) {
> + my_printf_error(ER_UNKNOWN_ERROR, "Unable to get key data", 0);
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + response_len = strlen(response);
> + if (json_get_object_key(response, response + response_len,
> + "metadata", &js, &js_len) != JSV_OBJECT)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + if (json_get_object_key(js, js+js_len,"version",
> + &ver, &ver_len) != JSV_NUMBER)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + std::string ver_str = std::string(ver, ver_len);
> + return strtoul(ver_str.c_str(), NULL, 10);
Why do you need to do that? create std::string, copy/reallocate it?
What's the point? Just do atol(ver)
> +}
> +
> +static int decode_data (const char *src, int src_len,
> + unsigned char* dstbuf, unsigned *buflen)
> +{
> + int length_of_memory_needed_for_decode =
> + my_base64_needed_decoded_length(src_len);
> + char* data= (char *)malloc(
> + length_of_memory_needed_for_decode * sizeof(char));
> + int decoded_length=
> + my_base64_decode(src, src_len, data, NULL, 0);
> + if (decoded_length <= 0)
> + {
> + memset(data, 0, length_of_memory_needed_for_decode);
> + free(data);
> + return ENCRYPTION_KEY_BUFFER_TOO_SMALL;
> + }
> + if (*buflen < (unsigned) decoded_length) {
> + *buflen= decoded_length;
> + free(data);
> + return ENCRYPTION_KEY_BUFFER_TOO_SMALL;
> + }
> + *buflen= decoded_length;
> + memcpy(dstbuf, data, decoded_length);
> + free(data);
> + return 0;
> +}
> +
> +static unsigned int get_key_from_vault (unsigned int key_id,
> + unsigned int key_version,
> + unsigned char* dstbuf,
> + unsigned int *buflen)
> +{
> + char url[2048];
> + const char *response;
> + const char *js, *ver, *key;
> + int js_len, key_len, ver_len;
> + size_t response_len;
> + if (key_version != ENCRYPTION_KEY_VERSION_INVALID)
> + snprintf(url, sizeof(url), "%s/%u?%u", vault_url, key_id, key_version);
> + else
> + snprintf(url, sizeof(url), "%s/%u", vault_url, key_id);
> + if (curl_run(url, &response))
> + {
> + my_printf_error(ER_UNKNOWN_ERROR, "Unable to get key data", 0);
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + response_len = strlen(response);
> + if (key_version != ENCRYPTION_KEY_VERSION_INVALID)
> + {
> + if (json_get_object_key(response, response + response_len,
> + "metadata", &js, &js_len) != JSV_OBJECT)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + if (json_get_object_key(js, js+js_len,"version",
> + &ver, &ver_len) != JSV_NUMBER)
> + {
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> +#ifndef NDEBUG
> + /*
> + An internal check that is needed only for debugging the plugin
> + operation - in order to ensure that we get from the Hashicorp Vault
> + server exactly the version of the key that is needed:
> + */
> + std::string ver_str = std::string(ver, ver_len);
same redundant std::string
> + if (strtoul(ver_str.c_str(), NULL, 10) != key_version)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> +#endif
> + }
> + if (json_get_object_key(response, response + response_len,
> + "data", &js, &js_len) != JSV_OBJECT)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + if (json_get_object_key(js, js + js_len, "data", &key, &key_len) != JSV_STRING)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + return decode_data(key, key_len, dstbuf, buflen);
Why? I don't see that in hashicorp docs that the key is base64-encoded
> +}
> +
> +struct st_mariadb_encryption hashicorp_key_management_plugin= {
> + MariaDB_ENCRYPTION_INTERFACE_VERSION,
> + get_latest_version,
> + get_key_from_vault,
> + 0, 0, 0, 0, 0
> +};
> +
> +static int hashicorp_key_management_plugin_init(void *p)
> +{
> + curl_global_init(CURL_GLOBAL_ALL);
> + return 0;
> +}
> +
> +static int hashicorp_key_management_plugin_deinit(void *p)
> +{
> + curl_global_cleanup();
> + return 0;
> +}
> +
> +/*
> + Plugin library descriptor
> +*/
> +maria_declare_plugin(hashicorp_key_management)
> +{
> + MariaDB_ENCRYPTION_PLUGIN,
> + &hashicorp_key_management_plugin,
> + "hashicorp_key_management",
> + "MariaDB Corporation",
> + "HashiCorp Vault key management plugin",
> + PLUGIN_LICENSE_GPL,
> + hashicorp_key_management_plugin_init,
> + hashicorp_key_management_plugin_deinit,
> + 0x0100 /* 1.0 */,
> + NULL, /* status variables */
> + settings,
> + "1.0",
> + MariaDB_PLUGIN_MATURITY_ALPHA
> +}
> +maria_declare_plugin_end;
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hi, Julius!
First, please combine your two commits in one.
There's no historical value in keeping the first overcomplicated
implementation in the tree.
See other comments below:
On Nov 23, Julius Goryavsky wrote:
> revision-id: c23c5671a60 (mariadb-10.4.10-180-gc23c5671a60)
> parent(s): 387839e44cc
> author: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> committer: Julius Goryavsky <julius.goryavsky(a)mariadb.com>
> timestamp: 2019-11-22 18:38:39 +0100
> message:
>
> Simplified version
>
> diff --git a/plugin/hashicorp_key_management/CMakeLists.txt b/plugin/hashicorp_key_management/CMakeLists.txt
> new file mode 100644
> index 00000000000..95ff2434976
> --- /dev/null
> +++ b/plugin/hashicorp_key_management/CMakeLists.txt
> @@ -0,0 +1,20 @@
> +SET(HASHICORP_KEY_MANAGEMENT_PLUGIN_SOURCES
> + hashicorp_key_management_plugin.cc)
> +
> +IF (WITH_CURL)
> + INCLUDE(FindCURL)
> + IF(CURL_FOUND)
> + ELSE()
> + # Can't build plugin
> + RETURN()
> + ENDIF()
> +ELSE()
> + # Can't build plugin
> + RETURN()
> +ENDIF()
Better just
INCLUDE(FindCURL)
IF(NOT CURL_FOUND)
RETURN()
ENDIF()
No need to check for -DWITH_CURL=ON,
it should be enough to specify -DHASHICORP_KEY_MANAGEMENT_PLUGIN=ON
if one wants a plugin, the plugin should implicitly look for curl.
> +
> +INCLUDE_DIRECTORIES(${CMAKE_SOURCE_DIR}/sql ${CURL_INCLUDE_DIR})
> +
> +MYSQL_ADD_PLUGIN(HASHICORP_KEY_MANAGEMENT
> + ${HASHICORP_KEY_MANAGEMENT_PLUGIN_SOURCES}
I wouldn't introduce HASHICORP_KEY_MANAGEMENT_PLUGIN_SOURCES variable
just for one file, but do as you like
> + LINK_LIBRARIES ${CURL_LIBRARIES})
> diff --git a/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> new file mode 100644
> index 00000000000..3dc5214323b
> --- /dev/null
> +++ b/plugin/hashicorp_key_management/hashicorp_key_management_plugin.cc
> @@ -0,0 +1,405 @@
> +/* Copyright (C) 2019 MariaDB Corporation
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; version 2 of the License.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program; if not, write to the Free Software
> + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA */
> +
> +#include <my_global.h>
> +#include <typelib.h>
> +#include "sql_error.h"
these three headers shouldn't be included, only use plugin services, do
not include server headers at random.
> +#include <mysqld_error.h>
> +#include <mysql/plugin_encryption.h>
> +#include <string.h>
> +#include <string>
> +#include <sstream>
> +#include <curl/curl.h>
> +
> +static char* vault_url;
> +static char* secret_mount_point;
> +static char* token;
> +static char* vault_ca;
> +
> +static unsigned long encryption_algorithm;
> +
> +static const char *encryption_algorithm_names[]=
> +{
> + "aes_cbc",
> +#ifdef HAVE_EncryptAes128Ctr
> + "aes_ctr",
> +#endif
> + 0
> +};
This is what file plugin does, I don't think you need to copy it here.
Just use the default algorithm.
> +
> +static TYPELIB encryption_algorithm_typelib=
> +{
> + array_elements(encryption_algorithm_names)-1,"",
> + encryption_algorithm_names, NULL
> +};
> +
> +static MYSQL_SYSVAR_STR(vault_ca, vault_ca,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Path to Certificate Authority (CA) bundle",
> + NULL, NULL, "");
> +
> +static MYSQL_SYSVAR_STR(vault_url, vault_url,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Hashicorp Vault URL",
> + NULL, NULL, "");
> +
> +static MYSQL_SYSVAR_STR(secret_mount_point, secret_mount_point,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Secret mount point",
> + NULL, NULL, "");
I don't see a point in splitting the url in two parts that are _always_
used together. Please, remove the "secret_mount_point" variable, the
user will specify it as a part of the url.
> +
> +static MYSQL_SYSVAR_STR(token, token,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Authentication token",
> + NULL, NULL, "");
Here, perhaps, I'd specify the PLUGIN_VAR_NOSYSVAR flag, to hide the token
from SHOW VARIABLES, because the token is all one needs to retrieve
the encryption keys.
> +
> +#ifdef HAVE_EncryptAes128Ctr
> +#define recommendation ", aes_ctr is the recommended one"
> +#else
> +#define recommendation ""
> +#endif
> +static MYSQL_SYSVAR_ENUM(encryption_algorithm, encryption_algorithm,
> + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
> + "Encryption algorithm to use" recommendation ".",
> + NULL, NULL, 0, &encryption_algorithm_typelib);
> +
> +static struct st_mysql_sys_var* settings[] = {
> + MYSQL_SYSVAR(vault_url),
> + MYSQL_SYSVAR(secret_mount_point),
> + MYSQL_SYSVAR(token),
> + MYSQL_SYSVAR(vault_ca),
> + MYSQL_SYSVAR(encryption_algorithm),
> + NULL
> +};
> +
> +static char curl_errbuf [CURL_ERROR_SIZE];
> +
> +static std::string get_error_from_curl (CURLcode curl_code)
> +{
> + size_t len = strlen(curl_errbuf);
> + std::ostringstream stream;
> + if (curl_code != CURLE_OK)
> + {
> + stream << "CURL returned this error code: " << curl_code;
> + stream << " with error message : ";
> + if (len)
> + stream << curl_errbuf;
> + else
> + stream << curl_easy_strerror(curl_code);
> + }
> + return stream.str();
> +}
> +
> +#define max_response_size 65536
> +
> +static size_t write_response_memory(void *contents, size_t size, size_t nmemb, void *userp)
> +{
> + size_t realsize = size * nmemb;
> + if (size != 0 && realsize / size != nmemb)
> + return 0; // overflow
> + std::ostringstream *read_data = static_cast<std::ostringstream*>(userp);
> + size_t ss_pos = read_data->tellp();
> + read_data->seekp(0, std::ios::end);
> + size_t number_of_read_bytes = read_data->tellp();
> + read_data->seekp(ss_pos);
> + if (number_of_read_bytes + realsize > max_response_size)
> + return 0; // response size limit exceeded
> + read_data->write(static_cast<char*>(contents), realsize);
> + if (!read_data->good())
> + return 0;
> + return realsize;
> +}
> +
> +std::ostringstream read_data_stream;
> +
> +#define timeout 300
> +
> +static struct curl_slist *list = NULL;
> +
> +static bool setup_curl_session (CURL * curl)
> +{
> + std::string token_header = std::string("X-Vault-Token:") + std::string(token);
> + CURLcode curl_res = CURLE_OK;
> + read_data_stream.str("");
> + read_data_stream.clear();
> + curl_errbuf[0] = '\0';
> + if (list != NULL)
> + {
> + curl_slist_free_all(list);
> + list = NULL;
> + }
1. why do you recreate the list? isn't it always the same?
2. there's no protection against concurrent access. this probably won't
work at all when many threads try to request keys in parallel.
Note: I don't think you need mutex protection here; better to populate
the list once in the plugin initializer and then reuse it.
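A hedged sketch of that suggestion (names taken from the quoted patch, error
handling kept minimal, not a verified drop-in):

static struct curl_slist *list= NULL;

static int hashicorp_key_management_plugin_init(void *p)
{
  curl_global_init(CURL_GLOBAL_ALL);
  std::string token_header= std::string("X-Vault-Token:") + token;
  /* built once here; worker threads only read it afterwards */
  list= curl_slist_append(list, token_header.c_str());
  return list == NULL;
}

static int hashicorp_key_management_plugin_deinit(void *p)
{
  curl_slist_free_all(list);
  curl_global_cleanup();
  return 0;
}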
> + if ((list= curl_slist_append(list, token_header.c_str())) == NULL ||
> + (list= curl_slist_append(list, "Content-Type: application/json")) ==
> + NULL ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, curl_errbuf)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION,
> + write_response_memory)) != CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_WRITEDATA,
> + static_cast<void *>(&read_data_stream))) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_HTTPHEADER, list)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L)) !=
> + CURLE_OK ||
> + (vault_ca != NULL && strlen(vault_ca) != 0 &&
> + (curl_res= curl_easy_setopt(curl, CURLOPT_CAINFO, vault_ca)) !=
> + CURLE_OK) ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_USE_SSL, CURLUSESSL_ALL)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, timeout)) !=
> + CURLE_OK ||
> + (curl_res= curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout)) !=
> + CURLE_OK ||
> + (curl_res = curl_easy_setopt(curl, CURLOPT_HTTP_VERSION,
> + (long)CURL_HTTP_VERSION_1_1)) !=
> + CURLE_OK)
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + get_error_from_curl(curl_res).c_str(),
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + return false;
> +}
> +
> +static bool curl_run (char * url, const char * * response)
> +{
> + CURLcode curl_res = CURLE_OK;
> + long http_code = 0;
> + CURL *curl = curl_easy_init();
the manual for curl_easy_init says
If you did not already call curl_global_init(3), curl_easy_init(3) does
it automatically. This may be lethal in multi-threaded cases, since
curl_global_init(3) is not thread-safe, and it may result in resource
problems because there is no corresponding cleanup.
You are strongly advised to not allow this automatic behaviour, by
calling curl_global_init(3) yourself properly. See the description in
libcurl(3) of global environment requirements for details of how to use
this function.
> + if (curl == NULL)
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Cannot initialize curl session",
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + if (setup_curl_session(curl) ||
> + (curl_res = curl_easy_setopt(curl, CURLOPT_URL, url)) != CURLE_OK ||
> + (curl_res = curl_easy_perform(curl)) != CURLE_OK ||
> + (curl_res = curl_easy_getinfo (curl, CURLINFO_RESPONSE_CODE, &http_code)) != CURLE_OK)
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + get_error_from_curl(curl_res).c_str(),
> + ME_ERROR_LOG_ONLY);
> + return true;
> + }
> + if (http_code == 404)
> + {
> + *response = NULL;
> + return false;
> + }
> + *response = read_data_stream.str().c_str();
> + return http_code < 200 || http_code >= 300;
> +}
> +
> +static unsigned int get_latest_version (uint key_id)
> +{
> + char url[2048];
> + const char *response;
> + const char *js, *ver;
> + int js_len, ver_len;
> + size_t response_len;
> + snprintf(url, sizeof(url), "%s/v1/%s/%u",
> + vault_url, secret_mount_point, key_id);
> + if (curl_run(url, &response)) {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Unable to get key data",
> + 0);
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + response_len = strlen(response);
> + if (json_get_object_key(response, response + response_len,
> + "metadata", &js, &js_len) != JSV_OBJECT)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + if (json_get_object_key(js, js+js_len,"version", &ver, &ver_len) != JSV_NUMBER)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + std::string ver_str = std::string(ver, ver_len);
> + return strtoul(ver_str.c_str(), NULL, 10);
> +}
> +
> +static int decode_data (const char *src, int src_len,
> + unsigned char* dstbuf, unsigned *buflen)
> +{
> + int length_of_memory_needed_for_decode =
> + my_base64_needed_decoded_length(src_len);
> + char* data= (char *)my_malloc(
> + length_of_memory_needed_for_decode * sizeof(char), MYF(0));
> + int decoded_length=
> + my_base64_decode(src, src_len, data, NULL, 0);
> + if (decoded_length <= 0)
> + {
> + memset(data, 0, length_of_memory_needed_for_decode);
> + my_free(data);
> + return ENCRYPTION_KEY_BUFFER_TOO_SMALL;
> + }
> + if (*buflen < (unsigned) decoded_length) {
> + *buflen= decoded_length;
> + my_free(data);
you cannot use my_malloc and my_free here, the plugin won't link.
use malloc/free instead.
> + return ENCRYPTION_KEY_BUFFER_TOO_SMALL;
> + }
> + *buflen= decoded_length;
> + memcpy(dstbuf, data, decoded_length);
> + my_free(data);
> + return 0;
> +}
> +
> +static unsigned int get_key_from_vault (unsigned int key_id,
> + unsigned int key_version,
> + unsigned char* dstbuf,
> + unsigned int *buflen)
> +{
> + char url[2048];
> + const char *response;
> + const char *js, *ver, *key;
> + int js_len, key_len, ver_len;
> + size_t response_len;
> + if (key_version != ENCRYPTION_KEY_VERSION_INVALID)
> + snprintf(url, sizeof(url), "%s/v1/%s/%u?%u",
> + vault_url, secret_mount_point, key_id, key_version);
> + else
> + snprintf(url, sizeof(url), "%s/v1/%s/%u",
> + vault_url, secret_mount_point, key_id);
> + if (curl_run(url, &response))
> + {
> + my_printf_error(ER_UNKNOWN_ERROR,
> + "Unable to get key data",
> + 0);
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + }
> + response_len = strlen(response);
> + if (key_version != ENCRYPTION_KEY_VERSION_INVALID)
> + {
> + if (json_get_object_key(response, response + response_len,
> + "metadata", &js, &js_len) != JSV_OBJECT)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + if (json_get_object_key(js, js+js_len,"version", &ver, &ver_len) != JSV_NUMBER)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + std::string ver_str = std::string(ver, ver_len);
> + if (strtoul(ver_str.c_str(), NULL, 10) != key_version)
> + return ENCRYPTION_KEY_VERSION_INVALID;
Can this happen? You've requested a specific version, can Hashicorp
return a _different_ version? That'd be strange.
> + }
> + if (json_get_object_key(response, response + response_len,
> + "data", &js, &js_len) != JSV_OBJECT)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + if (json_get_object_key(js, js + js_len, "data", &key, &key_len) != JSV_STRING)
> + return ENCRYPTION_KEY_VERSION_INVALID;
> + return decode_data(key, key_len, dstbuf, buflen);
> +}
> +
> +// let's simplify the condition below
> +#ifndef HAVE_EncryptAes128Gcm
> +#define MY_AES_GCM MY_AES_CTR
> +#ifndef HAVE_EncryptAes128Ctr
> +#define MY_AES_CTR MY_AES_CBC
> +#endif
> +#endif
> +
> +static inline enum my_aes_mode mode(int flags)
> +{
> + /*
> + If encryption_algorithm is AES_CTR then
> + if no-padding, use AES_CTR
> + else use AES_GCM (like CTR but appends a "checksum" block)
> + else
> + use AES_CBC
> + */
> + if (encryption_algorithm)
> + if (flags & ENCRYPTION_FLAG_NOPAD)
> + return MY_AES_CTR;
> + else
> + return MY_AES_GCM;
> + else
> + return MY_AES_CBC;
> +}
> +
> +static int ctx_init(void *ctx, const unsigned char* key, unsigned int klen,
> + const unsigned char* iv, unsigned int ivlen, int flags,
> + unsigned int key_id, unsigned int key_version)
> +{
> + return my_aes_crypt_init(ctx, mode(flags), flags, key, klen, iv, ivlen);
> +}
> +
> +static int ctx_update(void *ctx, const unsigned char *src, unsigned int slen,
> + unsigned char *dst, unsigned int *dlen)
> +{
> + return my_aes_crypt_update(ctx, src, slen, dst, dlen);
> +}
> +
> +
> +static int ctx_finish(void *ctx, unsigned char *dst, unsigned int *dlen)
> +{
> + return my_aes_crypt_finish(ctx, dst, dlen);
> +}
> +
> +static unsigned int get_length(unsigned int slen, unsigned int key_id,
> + unsigned int key_version)
> +{
> + return my_aes_get_size(mode(0), slen);
> +}
> +
> +static uint ctx_size(uint, uint)
> +{
> + return my_aes_ctx_size(mode(0));
> +}
> +
> +struct st_mariadb_encryption hashicorp_key_management_plugin= {
> + MariaDB_ENCRYPTION_INTERFACE_VERSION,
> + get_latest_version,
> + get_key_from_vault,
> + ctx_size,
> + ctx_init,
> + ctx_update,
> + ctx_finish,
> + get_length
you've seen my template, I intentionally did not define any functions
beyond get_latest_version and get_key_from_vault. You don't need to
either, just put 0, don't copy this code from the file plugin.
> +};
> +
> +static int hashicorp_key_management_plugin_init(void *p)
> +{
> + return 0;
> +}
if you don't need to initialize the plugin, don't create an initializer
at all. What's the point of a dummy one?
> +
> +static int hashicorp_key_management_plugin_deinit(void *p)
> +{
> + read_data_stream.str("");
> + read_data_stream.clear();
> + return 0;
> +}
> +
> +/*
> + Plugin library descriptor
> +*/
> +maria_declare_plugin(hashicorp_key_management)
> +{
> + MariaDB_ENCRYPTION_PLUGIN,
> + &hashicorp_key_management_plugin,
> + "hashicorp_key_management",
> + "MariaDB Corporation",
you don't want to have your name here? :)
> + "HashiCorp Vault key management plugin",
> + PLUGIN_LICENSE_GPL,
> + hashicorp_key_management_plugin_init,
> + hashicorp_key_management_plugin_deinit,
> + 0x0100 /* 1.0 */,
> + NULL, /* status variables */
> + settings,
> + "1.0",
> + MariaDB_PLUGIN_MATURITY_STABLE
> +}
> +maria_declare_plugin_end;
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hi Otto!
Otto Kekäläinen <otto(a)debian.org>,
11/11/2019 – 15:06:24 (+0100):
> Hello Faustin!
>
> I have been updating the MariaDB 10.3 packaging in Debian to compat level
> 12, and that includes some changes to how the systemd services are
> installed. Do you have any comments?
>
> https://salsa.debian.org/mariadb-team/mariadb-10.3/commits/bugfix/systemd-r…
See my comment on the d14de1c0 WIP commit.
> Are there some systemd issues regarding 10.3 that you think we should put
> in while at this?
Yes, and it is related to all the extra work that is needed to support
both the mysql and mariadb systemctl commands (see my comment and Ondrej's
commit referenced in it).
IMO we should definitively remove the aliases from the systemd unit files
(also suggested by M. Biebl in #932289), but I did not have time to get
things moving (https://github.com/MariaDB/server/pull/1172 or MDEV-15526),
and if possible I believe that we should try to upstream it first.
So for the moment I have no other changes to suggest in salsa.
Faustin
Hi Otto!
I have not had time to look at this yet. I believe I will have a bit
of time before the weekend and surely next week.
Faustin
Otto Kekäläinen <otto(a)debian.org>,
11/11/2019 – 15:06:24 (+0100):
> Hello Faustin!
>
> I have been updating the MariaDB 10.3 packaging in Debian to compat level
> 12, and that includes some changes to how the systemd services are
> installed. Do you have any comments?
>
> https://salsa.debian.org/mariadb-team/mariadb-10.3/commits/bugfix/systemd-r…
>
> Are there some systemd issues regarding 10.3 that you think we should put
> in while at this?
>
> - Otto

Re: [Maria-developers] c790763e1f0: MDEV-19903 Setup default partitions for system versioning
by Sergei Golubchik 06 Nov '19
Hi, Aleksey!
Looks good!
Ok to push
On Nov 06, Aleksey Midenkov wrote:
> revision-id: c790763e1f0 (mariadb-10.4.7-51-gc790763e1f0)
> parent(s): f10c9ac941e
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-08-19 12:00:33 +0300
> message:
>
> MDEV-19903 Setup default partitions for system versioning
>
> Implement syntax like:
>
> create table t1 (x int) with system versioning partition by system_time;
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] 4de59fa0f92: MENT-360: ASAN heap-use-after-free in strmake_root / Query_arena::strmake
by Sergei Golubchik 06 Nov '19
Hi, Oleksandr!
On Nov 06, Oleksandr Byelkin wrote:
> revision-id: 4de59fa0f92 (mariadb-10.2.27-129-g4de59fa0f92)
> parent(s): 4b21495343d
> author: Oleksandr Byelkin <sanja(a)mariadb.com>
> committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> timestamp: 2019-10-18 15:14:20 +0200
> message:
>
> MENT-360: ASAN heap-use-after-free in strmake_root / Query_arena::strmake
>
> Do not allow thd->query point to freed memory, use safe method to set query.
>
> ---
> mysql-test/r/processlist.result | 32 +++++++++++++++++++++++
> mysql-test/t/processlist.test | 58 +++++++++++++++++++++++++++++++++++++++++
> sql/sql_prepare.cc | 7 ++++-
> 3 files changed, 96 insertions(+), 1 deletion(-)
>
> diff --git a/mysql-test/r/processlist.result b/mysql-test/r/processlist.result
> index ab518d961ef..066154feaf1 100644
> --- a/mysql-test/r/processlist.result
> +++ b/mysql-test/r/processlist.result
> @@ -43,3 +43,35 @@ Message Incorrect string value: '\xF0\x9F\x98\x8Eyy...' for column `information_
> #
> # End of 10.1 tests
> #
> +#
> +# MENT-360: ASAN heap-use-after-free in strmake_root /
> +# Query_arena::strmake
> +#
> +CREATE PROCEDURE pr1()
> +BEGIN
> +DECLARE CONTINUE HANDLER FOR 1146 SET @a = 1;
> +DECLARE CONTINUE HANDLER FOR 1243 SET @a = 1;
> +LOOP
> +PREPARE stmt FROM "CREATE TEMPORARY TABLE ps AS SELECT * FROM non_existing_table";
> +EXECUTE stmt;
> +END LOOP;
> +END $
> +CREATE PROCEDURE pr2()
> +BEGIN
> +DECLARE CONTINUE HANDLER FOR 1094 SET @a = 1;
> +LOOP
> +SHOW EXPLAIN FOR 1;
> +END LOOP;
> +END $
> +connect con1,localhost,root,,;
> +CALL pr1();
> +connect con2,localhost,root,,;
> +CALL pr2();
> +connection default;
> +KILL 6;
> +KILL 7;
^^^ you forgot replace_result
also, mysqltest was designed for deterministic test cases, it'd be
better to rewrite this test to be deterministic, if possible.
> +DROP PROCEDURE pr1;
> +DROP PROCEDURE pr2;
> +#
> +# end of Enterprise tests
> +#
> diff --git a/sql/sql_prepare.cc b/sql/sql_prepare.cc
> index 525f09d611b..d0043979135 100644
> --- a/sql/sql_prepare.cc
> +++ b/sql/sql_prepare.cc
> @@ -2885,6 +2885,11 @@ void mysql_sql_stmt_prepare(THD *thd)
>
> if (stmt->prepare(query.str, (uint) query.length))
> {
> + /*
> + stmt->prepare() sets thd->query_string, so we have to reset it back,
> + before it will point to uninitialised memory
> + */
> + thd->set_query(orig_query);
> /* Statement map deletes the statement on erase */
> thd->stmt_map.erase(stmt);
> }
> @@ -2902,7 +2907,7 @@ void mysql_sql_stmt_prepare(THD *thd)
> But here we should restore the original query so it's mentioned in
> logs properly.
> */
> - thd->set_query_inner(orig_query);
> + thd->set_query(orig_query);
> DBUG_VOID_RETURN;
> }
may be better to have set_query() only once, like in
bool res=stmt->prepare(query.str, (uint) query.length);
thd->set_query(orig_query);
if (res) ...
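Slightly expanded, a hedged sketch of that shape (same names as in the quoted
patch; not checked against the rest of mysql_sql_stmt_prepare()):

bool failed= stmt->prepare(query.str, (uint) query.length);
/* restore the original query exactly once, on both paths */
thd->set_query(orig_query);
if (failed)
{
  /* Statement map deletes the statement on erase */
  thd->stmt_map.erase(stmt);
}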
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] aeb3432849e: MDEV-20076: SHOW GRANTS does not quote role names properly
by Sergei Golubchik 06 Nov '19
Hi, Oleksandr!
On Nov 06, Oleksandr Byelkin wrote:
> revision-id: aeb3432849e (mariadb-10.3.19-2-gaeb3432849e)
> parent(s): 5d3bd2b75b5
> author: Oleksandr Byelkin <sanja(a)mariadb.com>
> committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> timestamp: 2019-11-06 12:35:19 +0100
> message:
>
> MDEV-20076: SHOW GRANTS does not quote role names properly
>
> Quotes added to output.
>
> +#
> +# MDEV-20076: SHOW GRANTS does not quote role names properly
> +#
> +create role 'role-1';
> +create user 'user-1'@'localhost';
> +grant select on mysql.user to 'role-1';
> +GRANT 'role-1' TO 'user-1'@'localhost';
> +show grants for 'role-1';
> +Grants for role-1
> +GRANT USAGE ON *.* TO 'role-1'
> +GRANT SELECT ON `mysql`.`user` TO 'role-1'
> +show grants for 'user-1'@'localhost';
> +Grants for user-1@localhost
> +GRANT 'role-1' TO 'user-1'@'localhost'
> +GRANT USAGE ON *.* TO 'user-1'@'localhost'
> +drop role 'role-1';
> +drop user 'user-1'@'localhost';
Please add a test for names that contain a single quote. Like
create role `role'1`;
create user `user'1`@localhost;
and the rest as in your test above.
Or, if you want it to look a bit less artificial
create role `rock'n'roll`;
create user `O'Brien`@localhost;
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] d5352b8154d: MDEV-20015 Assertion `!in_use->is_error()' failed in TABLE::update_virtual_field
by Sergei Golubchik 31 Oct '19
Hi, Aleksey!
On Oct 25, Aleksey Midenkov wrote:
> revision-id: d5352b8154d (mariadb-10.2.25-54-gd5352b8154d)
> parent(s): 1153950ad0a
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-07-22 15:40:06 +0300
> message:
>
> MDEV-20015 Assertion `!in_use->is_error()' failed in TABLE::update_virtual_field
>
> Preserve and restore statement DA.
This is strange. Diagnostics areas aren't supposed to be temporarily
created on the stack frame; they aren't arenas.
Why is TABLE::update_virtual_field() called at all if there's already
an error?
> diff --git a/sql/table.cc b/sql/table.cc
> index f5b5bad99cc..65611d78bde 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -7682,15 +7682,25 @@ int TABLE::update_virtual_fields(handler *h, enum_vcol_update_mode update_mode)
>
> int TABLE::update_virtual_field(Field *vf)
> {
> - DBUG_ASSERT(!in_use->is_error());
> + Diagnostics_area *stmt_da= NULL;
> + Diagnostics_area tmp_stmt_da(in_use->query_id, false, true);
> + bool error;
> Query_arena backup_arena;
> DBUG_ENTER("TABLE::update_virtual_field");
> + if (unlikely(in_use->is_error()))
> + {
> + stmt_da= in_use->get_stmt_da();
> + in_use->set_stmt_da(&tmp_stmt_da);
> + }
> in_use->set_n_backup_active_arena(expr_arena, &backup_arena);
> bitmap_clear_all(&tmp_set);
> vf->vcol_info->expr->walk(&Item::update_vcol_processor, 0, &tmp_set);
> vf->vcol_info->expr->save_in_field(vf, 0);
> in_use->restore_active_arena(expr_arena, &backup_arena);
> - DBUG_RETURN(in_use->is_error());
> + error= in_use->is_error();
> + if (stmt_da)
> + in_use->set_stmt_da(stmt_da);
> + DBUG_RETURN(error);
> }
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] f6269a85ad7: MDEV-17553 Enable setting start datetime for interval partitioned history of system versioned tables
by Sergei Golubchik 31 Oct '19
Hi, Aleksey!
On Sep 26, Aleksey Midenkov wrote:
> revision-id: f6269a85ad7 (mariadb-10.4.7-38-gf6269a85ad7)
> parent(s): a9ca752f1a9
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-08-19 11:58:56 +0300
> message:
>
> MDEV-17553 Enable setting start datetime for interval partitioned history of system versioned tables
>
> * Interactive STARTS syntax
I wouldn't use the word "interactive" here, better say "Explicit STARTS
syntax"
> * SHOW CREATE
> * Default STARTS rounding depending on INTERVAL type
This is questionable.
I don't see why it's much better than the old behavior. It kind of makes
sense, but that's all.
If it had worked that way from the start, it would be fine.
But now, I'm not sure it offers benefits important enough to justify
breaking compatibility.
> * Warn when STARTS timestamp is further than INTERVAL
>
> [Closes tempesta-tech/mariadb#574]
>
> === Dependency hints (auto-detected by git-deps) ===
> 336c0139a89 MDEV-16370 row-based binlog events (updates by primary key) can not be applied multiple times to system versioned tables
^^^
remove these dependency hints before pushing, please
> diff --git a/mysql-test/suite/versioning/r/partition.result b/mysql-test/suite/versioning/r/partition.result
> index 9e532824414..3606c4e407f 100644
> --- a/mysql-test/suite/versioning/r/partition.result
> +++ b/mysql-test/suite/versioning/r/partition.result
> @@ -1,3 +1,4 @@
> +call mtr.add_suppression("need more HISTORY partitions");
why did these warnings suddenly start to show up in the error log?
> set system_versioning_alter_history=keep;
> # Check conventional partitioning on temporal tables
> create or replace table t1 (
> @@ -266,11 +267,11 @@ x
> 6
> insert into t1 values (7), (8);
> Warnings:
> -Warning 4114 Versioned table `test`.`t1`: partition `p1` is full, add more HISTORY partitions
> +Warning 4114 Versioned table `test`.`t1`: last HISTORY partition (`p1`) is out of LIMIT, need more HISTORY partitions
Hmm, "is out of LIMIT" sounds strange, I liked "is full" more.
Is it that important to say LIMIT or INTERVAL here?
> ### warn about full partition
> delete from t1;
> Warnings:
> -Warning 4114 Versioned table `test`.`t1`: partition `p1` is full, add more HISTORY partitions
> +Warning 4114 Versioned table `test`.`t1`: last HISTORY partition (`p1`) is out of LIMIT, need more HISTORY partitions
> select * from t1 partition (p1) order by x;
> x
> 4
> diff --git a/mysql-test/suite/versioning/r/partition_rotation.result b/mysql-test/suite/versioning/r/partition_rotation.result
> index 69b30a56bd6..f6db36b117b 100644
> --- a/mysql-test/suite/versioning/r/partition_rotation.result
> +++ b/mysql-test/suite/versioning/r/partition_rotation.result
> @@ -55,4 +57,265 @@ i
> explain partitions select * from t1 for system_time all where row_end = @ts;
> id select_type table partitions type possible_keys key key_len ref rows Extra
> 1 SIMPLE t1 p1_p1sp0,p1_p1sp1 # NULL NULL NULL NULL # #
> -drop table t1;
> +## INTERVAL ... STARTS
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day starts 'a'
> +(partition p0 history, partition pn current);
> +ERROR HY000: Wrong parameters for partitioned `t1`: wrong value for 'STARTS'
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day starts '00:00:00'
> +(partition p0 history, partition pn current);
> +ERROR HY000: Wrong parameters for partitioned `t1`: wrong value for 'STARTS'
There are clear and well defined rules how to cast TIME to TIMESTAMP, so
we can allow that. But it might be confusing, so if you'd like, let's
keep it your way and lift this restriction later if needed.
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day starts '2000-00-01 00:00:00'
> +(partition p0 history, partition pn current);
> +ERROR HY000: Wrong parameters for partitioned `t1`: wrong value for 'STARTS'
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day starts '2000-01-01 00:00:00'
> +(partition p0 history, partition pn current);
> +show create table t1;
> +Table Create Table
> +t1 CREATE TABLE `t1` (
> + `i` int(11) DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
> + PARTITION BY SYSTEM_TIME INTERVAL 1 DAY STARTS TIMESTAMP'2000-01-01 00:00:00'
> +(PARTITION `p0` HISTORY ENGINE = MyISAM,
> + PARTITION `pn` CURRENT ENGINE = MyISAM)
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day starts 946684800
Now, this is very questionable. There is a well defined rule how to
cast numbers to temporal values, and this isn't it. If you want to allow
numbers, it has to be
partition by system_time interval 1 day starts 20000101000000
Whatever the server is doing internally when parsing frms isn't really
relevant for the end user, is it?
> +(partition p0 history, partition pn current);
> +show create table t1;
> +Table Create Table
> +t1 CREATE TABLE `t1` (
> + `i` int(11) DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
> + PARTITION BY SYSTEM_TIME INTERVAL 1 DAY STARTS TIMESTAMP'2000-01-01 00:00:00'
> +(PARTITION `p0` HISTORY ENGINE = MyISAM,
> + PARTITION `pn` CURRENT ENGINE = MyISAM)
> +# Test STARTS warning
> +set timestamp= unix_timestamp('2000-01-01 00:00:00');
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day
> +(partition p0 history, partition pn current);
> +show create table t1;
> +Table Create Table
> +t1 CREATE TABLE `t1` (
> + `i` int(11) DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
> + PARTITION BY SYSTEM_TIME INTERVAL 1 DAY STARTS TIMESTAMP'2000-01-01 00:00:00'
> +(PARTITION `p0` HISTORY ENGINE = MyISAM,
> + PARTITION `pn` CURRENT ENGINE = MyISAM)
> +# no warning
> +create or replace table t1 (i int) with system versioning
> +partition by system_time interval 1 day starts '2000-01-02 00:00:00'
Why no warning here?
If I insert and delete a row right now, it'll happen at 2000-01-01 00:00:00.
So it'll be before STARTS; what partition should it go into?
As far as I understand, STARTS normally must be before NOW(),
not before NOW()+INTERVAL.
> diff --git a/sql/partition_info.cc b/sql/partition_info.cc
> index ba6fc8a49ec..119c00dfce6 100644
> --- a/sql/partition_info.cc
> +++ b/sql/partition_info.cc
> @@ -2381,6 +2383,111 @@ static bool strcmp_null(const char *a, const char *b)
> return true;
> }
>
> +/**
> + Assign INTERVAL and STARTS for SYSTEM_TIME partitions.
> +
> + @return true on error
> +*/
> +
> +bool partition_info::vers_set_interval(THD* thd, Item* interval,
> + interval_type int_type, Item* starts,
> + const char *table_name)
why do you pass the table name as an argument now?
> +{
> + DBUG_ASSERT(part_type == VERSIONING_PARTITION);
> +
> + const bool interactive= !table;
what does that mean?
* first, SQL is not interactive, using this adjective is confusing,
* second you didn't write if(!table) because you wanted to use a
self-documenting variable name, I presume. This name isn't helping, it
confuses even more :(
please, either use a really self-documenting name, or just write
if(!table) with a comment.
... okay, now I see what you mean. Better use a comment:
if (!table)
{
/* called from mysql_unpack_partition() not from mysql_parse() */
> + MYSQL_TIME ltime;
> + uint err;
> + vers_info->interval.type= int_type;
> +
> + /* 1. assign INTERVAL to interval.step */
> + if (interval->fix_fields_if_needed_for_scalar(thd, &interval))
> + return true;
> + bool error= get_interval_value(thd, interval, int_type, &vers_info->interval.step) ||
> + vers_info->interval.step.neg || vers_info->interval.step.second_part ||
> + !(vers_info->interval.step.year || vers_info->interval.step.month ||
> + vers_info->interval.step.day || vers_info->interval.step.hour ||
> + vers_info->interval.step.minute || vers_info->interval.step.second);
> + if (error) {
indentation {
> + my_error(ER_PART_WRONG_VALUE, MYF(0), table_name, "INTERVAL");
> + return true;
> + }
> +
> + /* 2. assign STARTS to interval.start */
> + if (starts)
> + {
I've already commented on the following, but here it is again:
> + if (starts->fix_fields_if_needed_for_scalar(thd, &starts))
> + return true;
> + switch (starts->result_type())
> + {
> + case INT_RESULT:
> + case DECIMAL_RESULT:
> + case REAL_RESULT:
> + if (starts->val_int() > TIMESTAMP_MAX_VALUE)
> + goto interval_starts_error;
> + vers_info->interval.start= (time_t) starts->val_int();
> + break;
wrong conversion to temporal
> + case STRING_RESULT:
> + case TIME_RESULT:
> + {
> + Datetime::Options opt(TIME_NO_ZERO_DATE | TIME_NO_ZERO_IN_DATE, thd);
> + starts->get_date(thd, &ltime, opt);
> + vers_info->interval.start= TIME_to_timestamp(thd, &ltime, &err);
> + if (err)
> + goto interval_starts_error;
> + break;
> + }
> + case ROW_RESULT:
> + default:
> + goto interval_starts_error;
> + }
> + if (interactive)
> + {
> + my_tz_OFFSET0->gmt_sec_to_TIME(&ltime, thd->query_start());
> + if (date_add_interval(thd, &ltime, int_type, vers_info->interval.step))
> + return true;
I think this isn't needed
> + my_time_t boundary= my_tz_OFFSET0->TIME_to_gmt_sec(&ltime, &err);
> + if (vers_info->interval.start > boundary) {
> + push_warning_printf(thd, Sql_condition::WARN_LEVEL_WARN,
> + ER_PART_STARTS_BEYOND_INTERVAL,
> + ER_THD(thd, ER_PART_STARTS_BEYOND_INTERVAL),
> + table_name);
> + }
> + }
> + }
> + else // calculate default STARTS depending on INTERVAL
> + {
> + thd->variables.time_zone->gmt_sec_to_TIME(&ltime, thd->query_start());
> + if (vers_info->interval.step.second)
> + goto interval_set_starts;
> + ltime.second= 0;
I think this isn't needed
> + if (vers_info->interval.step.minute)
> + goto interval_set_starts;
> + ltime.minute= 0;
> + if (vers_info->interval.step.hour)
> + goto interval_set_starts;
> + ltime.hour= 0;
> + if (vers_info->interval.step.day)
> + goto interval_set_starts;
> + ltime.day= 1;
> + if (vers_info->interval.step.month)
> + goto interval_set_starts;
> + ltime.month= 1;
> + DBUG_ASSERT(vers_info->interval.step.year);
> +
> +interval_set_starts:
> + vers_info->interval.start= TIME_to_timestamp(thd, &ltime, &err);
> + if (err)
> + goto interval_starts_error;
> + }
> +
> + return false;
> +
> +interval_starts_error:
> + my_error(ER_PART_WRONG_VALUE, MYF(0), table_name, "STARTS");
> + return true;
> +}
> +
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org

Re: [Maria-developers] 02362a77d0d: MDEV-19848 Server crashes in check_vcol_forward_refs upon INSERT DELAYED into table with long blob key
by Sergei Golubchik 29 Oct '19
Hi, Sachin!
Looks good now, thanks.
See comments below
On Oct 29, Sachin Setiya wrote:
> revision-id: 02362a77d0d (mariadb-10.4.7-111-g02362a77d0d)
> parent(s): 52f3829b95c
> author: Sachin <sachin.setiya(a)mariadb.com>
> committer: Sachin <sachin.setiya(a)mariadb.com>
> timestamp: 2019-10-14 16:10:54 +0530
> message:
>
> MDEV-19848 Server crashes in check_vcol_forward_refs upon INSERT DELAYED into table with long blob key
>
> Problem:- Insert delayed is not working with Long Unique Index.
> It is failing with
> 1. INSERT DELAYED INTO t1 VALUES();
> 2. INSERT DELAYED INTO t1 VALUES(1);
> 3. Potential Race condition When Insert DELAYED gets dup key error(After fix),
> And it will change original table key_info by calling
> re_setup_keyinfo_hash, And second thread is in check_duplicate_long_entries
> 4. Insert delayed into INVISIBLE COLUMN will also not work.
>
> There are 4 main issue
>
> 1. while calling make_new_field we forgot to & LONG_UNIQUE_HASH_FIELD
> flag into new field flags.
>
> 2. New field created created into get_local_table by make_new_field does
> not respect old field visibility, Assigning old field visibility will
> solve Problem 4 and part of problem 2.
>
> 3. As we know Problem 3 race condition is caused because table and
> delayed table share same key_info, So we will make a copy of original table
> key_info in get_local_table.
>
> 4. In parse_vcol_defs we have this code block
> keypart->field->vcol_info=
> table->field[keypart->field->field_index]->vcol_info;
> Which is wrong because we should not change original
> table->field->vcol_info with vcol_info which is create on delayed
> thread.
>
> ---
> mysql-test/main/long_unique_bugs.result | 29 +++++++++++++++++++++++++++++
> mysql-test/main/long_unique_bugs.test | 32 ++++++++++++++++++++++++++++++++
> sql/field.cc | 7 ++++++-
> sql/sql_insert.cc | 31 +++++++++++++++++++++++++++++++
> sql/table.cc | 31 ++++++++++++++++++++++++++++++-
> sql/table.h | 2 ++
> 6 files changed, 130 insertions(+), 2 deletions(-)
>
> diff --git a/mysql-test/main/long_unique_bugs.test b/mysql-test/main/long_unique_bugs.test
> index 13a4e1367a0..365393aa2ea 100644
> --- a/mysql-test/main/long_unique_bugs.test
> +++ b/mysql-test/main/long_unique_bugs.test
> @@ -340,3 +340,35 @@ while ($count)
> --eval $insert_stmt
> --enable_query_log
> drop table t1;
> +#
> +# MDEV-19848 Server crashes in check_vcol_forward_refs upon INSERT DELAYED into table with long blob key
> +#
> +CREATE TABLE t1 (a blob, UNIQUE(a)) ENGINE=MyISAM;
> +INSERT DELAYED t1 () VALUES (1);
> +INSERT t1 () VALUES (2);
> +# Cleanup
> +DROP TABLE t1;
> +CREATE TABLE t1 (a char(50), UNIQUE(a(10)) USING HASH);
> +INSERT DELAYED t1 () VALUES (1);
> +INSERT t1 () VALUES (2);
> +# Cleanup
> +DROP TABLE t1;
> +CREATE TABLE t1 (
> + a CHAR(128),
> + b CHAR(128) AS (a),
> + c varchar(5000),
> + UNIQUE(c,b(64))
> +) ENGINE=myisam;
> +INSERT DELAYED t1 (a,c) VALUES (1,1);
> +--sleep 1
> +INSERT t1 (a,c) VALUES (2,2);
> +INSERT t1 (a,c) VALUES (3,3);
> +drop table t1;
> +create table t1(a int , b int invisible);
> +insert into t1 values(1);
> +insert delayed into t1(a,b) values(2,2);
> +--echo #Should not fails
"Should not fail"
> +insert delayed into t1 values(2);
I suspect there's a race condition here.
You cannot know that INSERT DELAYED will insert rows before you do your
SELECT.
Other tests usually do FLUSH TABLE t1 before SELECT.
Alternatively, you can use the include/wait_condition.inc file
to wait for, say, SELECT COUNT(*) = 3.
> +select a,b from t1 order by a;
> +# Cleanup
> +DROP TABLE t1;
> diff --git a/sql/field.cc b/sql/field.cc
> index 0eb53f40a54..d1187237f23 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -2373,8 +2373,13 @@ Field *Field::make_new_field(MEM_ROOT *root, TABLE *new_table,
> tmp->flags&= (NOT_NULL_FLAG | BLOB_FLAG | UNSIGNED_FLAG |
> ZEROFILL_FLAG | BINARY_FLAG | ENUM_FLAG | SET_FLAG |
> VERS_SYS_START_FLAG | VERS_SYS_END_FLAG |
> - VERS_UPDATE_UNVERSIONED_FLAG);
> + VERS_UPDATE_UNVERSIONED_FLAG | LONG_UNIQUE_HASH_FIELD);
> tmp->reset_fields();
> + /*
> + Calling make_new_field will return a VISIBLE field, If caller function
> + wants original visibility he should change it later.
"visibility it should change"
> + This is done because view created on invisible fields are visible.
"because invisible fields explicitly named in a view become visible"
> + */
> tmp->invisible= VISIBLE;
> return tmp;
> }
> diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> index fe6c5fa8ec4..d80044ec37d 100644
> --- a/sql/sql_insert.cc
> +++ b/sql/sql_insert.cc
> @@ -2607,6 +2607,11 @@ TABLE *Delayed_insert::get_local_table(THD* client_thd)
> {
> if (!(*field= (*org_field)->make_new_field(client_thd->mem_root, copy, 1)))
> goto error;
> + /*
> + We want same visibility as of original table because we are just creating
> + a clone for delayed insert.
> + */
> + (*field)->invisible= (*org_field)->invisible;
I'm surprised it matters. I'd thought that all sanity checks, and in
particular whether the number of columns matches the number of values, are
done in the connection user thread before submitting the job to the
delayed insert thread.
> (*field)->unireg_check= (*org_field)->unireg_check;
> (*field)->orig_table= copy; // Remove connection
> (*field)->move_field_offset(adjust_ptrs); // Point at copy->record[0]
> @@ -2621,7 +2626,33 @@ TABLE *Delayed_insert::get_local_table(THD* client_thd)
> if (share->virtual_fields || share->default_expressions ||
> share->default_fields)
> {
> + /*
> + If we have long unique table then delayed insert can modify key structure
> + (re/setup_keyinfo_hash_all) of original table when it gets insert error,
> + parse_vcol_defs will also modify key_info structure. So it is better to
> + clone the table->key_info for copy table.
> + We will not be cloning key_part_info or even changing any field ptr.
> + Because re/setup_keyinfo_hash_all only modify key_info array. So it will
> + be like having new key_info array for copy table with old key_part_info
> + ptr.
> + */
> + if (share->long_unique_table)
> + {
> + KEY *key_info;
> + if (!(key_info= (KEY*) client_thd->alloc(share->keys*sizeof(KEY))))
> + goto error;
> + copy->key_info= key_info;
> + memcpy(key_info, table->key_info, sizeof(KEY)*share->keys);
> + }
> + /*
> + parse_vcol_defs expects key_infos to be in user defined format.
> + */
> + copy->setup_keyinfo_hash_all();
> bool error_reported= FALSE;
> + /*
> + We won't be calling re_setup_keyinfo_hash because parse_vcol_defs changes
> + key_infos to storage engine format
> + */
> if (unlikely(parse_vcol_defs(client_thd, client_thd->mem_root, copy,
> &error_reported,
> VCOL_INIT_DEPENDENCY_FAILURE_IS_WARNING)))
> diff --git a/sql/table.cc b/sql/table.cc
> index 718d0dce072..d0fcd449098 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -1223,7 +1223,16 @@ bool parse_vcol_defs(THD *thd, MEM_ROOT *mem_root, TABLE *table,
> new (mem_root) Item_field(thd, keypart->field),
> new (mem_root) Item_int(thd, length));
> list_item->fix_fields(thd, NULL);
> - keypart->field->vcol_info=
> + /*
> + Do not change the vcol_info when vcol_info->expr is not NULL
> + This will happen in the case of Delayed_insert::get_local_table()
> + And if we change the vcol_info in Delayed insert , then original
> + table field->vcol_info will be created on delayed insert thread
> + mem_root.
> + */
> + if (!keypart->field->vcol_info ||
> + !keypart->field->vcol_info->expr)
> + keypart->field->vcol_info=
> table->field[keypart->field->field_index]->vcol_info;
> }
> else
> @@ -9084,6 +9093,26 @@ void re_setup_keyinfo_hash(KEY *key_info)
> key_info->ext_key_parts= 1;
> key_info->flags&= ~HA_NOSAME;
> }
> +
> +/*
> + call setup_keyinfo_hash for all keys in table
> + */
> +void TABLE::setup_keyinfo_hash_all()
> +{
> + for (uint i= 0; i < s->keys; i++)
> + if (key_info[i].algorithm == HA_KEY_ALG_LONG_HASH)
> + setup_keyinfo_hash(&key_info[i]);
also... I remember you wanted to rename these methods to something more
descriptive.
> +}
> +
> +/*
> + call re_setup_keyinfo_hash for all keys in table
> + */
> +void TABLE::re_setup_keyinfo_hash_all()
don't add this method, you aren't using it anywhere
> +{
> + for (uint i= 0; i < s->keys; i++)
> + if (key_info[i].algorithm == HA_KEY_ALG_LONG_HASH)
> + re_setup_keyinfo_hash(&key_info[i]);
> +}
> /**
> @brief clone of current handler.
> Creates a clone of handler used in update for
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] 96569b793a3: MDEV-19761 - Before Trigger not processed for Not Null Column
by Sergei Golubchik 29 Oct '19
Hi, Anel!
On Oct 29, Anel Husakovic wrote:
> revision-id: 96569b793a3 (mariadb-10.1.39-121-g96569b793a3)
> parent(s): e32f29b7f31
> author: Anel Husakovic <anel(a)mariadb.org>
> committer: Anel Husakovic <anel(a)mariadb.org>
> timestamp: 2019-07-24 23:50:29 -0700
> message:
>
> MDEV-19761 - Before Trigger not processed for Not Null Column
here you should have a commit comment explaining the bug and your
solution.
> ---
> mysql-test/r/trigger.result | 6 +++-
> mysql-test/r/trigger_no_defaults-11698.result | 29 +++++++++++++++++++
> mysql-test/r/trigger_null-8605.result | 2 +-
> mysql-test/suite/funcs_1/r/innodb_trig_09.result | 4 +--
> mysql-test/suite/funcs_1/r/memory_trig_09.result | 4 +--
> mysql-test/suite/funcs_1/r/myisam_trig_09.result | 4 +--
> mysql-test/t/trigger_no_defaults-11698.test | 37 ++++++++++++++++++++----
> sql/sql_base.cc | 23 +++++++++++++++
> sql/sql_trigger.cc | 2 +-
> sql/sql_trigger.h | 4 +++
> 10 files changed, 97 insertions(+), 18 deletions(-)
>
> diff --git a/mysql-test/r/trigger.result b/mysql-test/r/trigger.result
> index 8852a622251..04bb9d9be68 100644
> --- a/mysql-test/r/trigger.result
> +++ b/mysql-test/r/trigger.result
> @@ -366,6 +366,8 @@ create trigger trg1 before update on t1 for each row set @a:= @a + new.j - old.j
> create trigger trg2 after update on t1 for each row set @b:= "Fired";
> set @a:= 0, @b:= "";
> update t1, t2 set j = j + 10 where t1.i = t2.i;
> +Warnings:
> +Warning 1048 Column 'k' cannot be null
This looks wrong. Why is there suddenly a warning here?
> select @a, @b;
> @a @b
> 10 Fired
> @@ -1235,11 +1237,13 @@ insert into t1 values (1,1), (2,2), (3,3);
> create trigger t1_bu before update on t1 for each row
> set new.j = new.j + 10;
> update t1 set i= i+ 10 where j > 2;
> +Warnings:
> +Warning 1048 Column 'j' cannot be null
and here
> select * from t1;
> diff --git a/sql/sql_base.cc b/sql/sql_base.cc
> index e8bdff8b48f..d9fe37c83da 100644
> --- a/sql/sql_base.cc
> +++ b/sql/sql_base.cc
> @@ -8959,6 +8959,29 @@ void switch_to_nullable_trigger_fields(List<Item> &items, TABLE *table)
>
> while ((item= it++))
> item->walk(&Item::switch_to_nullable_fields_processor, 1, (uchar*)field);
> + uint16 cnt=0;
> + uchar *nptr;
> + nptr= (uchar*)alloc_root(&table->mem_root, (table->s->fields - table->s->null_fields + 7)/8);
> + // First find null_ptr for NULL field in case of mixed NULL and NOT NULL fields
> + for (Field **f= field; *f; f++)
> + {
> + if (table->field[cnt]->null_ptr)
> + {
> + nptr= table->field[cnt]->null_ptr;
> + break;
> + }
> + }
> + for (Field **f= field; *f; f++)
> + {
> + if (!table->field[cnt]->null_ptr)
> + {
> + (*f)->null_bit= 1<<(cnt+1);
> + (*f)->flags&= ~(NOT_NULL_FLAG);
> + (*f)->null_ptr= nptr;
> + }
> + cnt++;
> + }
> + bzero(nptr, (table->s->fields - table->s->null_fields + 7)/8);
what are you doing here?
> table->triggers->reset_extra_null_bitmap();
> }
> }
> diff --git a/sql/sql_trigger.cc b/sql/sql_trigger.cc
> index 4ecd8139921..7de2fde0e5c 100644
> --- a/sql/sql_trigger.cc
> +++ b/sql/sql_trigger.cc
> @@ -1118,7 +1118,7 @@ bool Table_triggers_list::prepare_record_accessors(TABLE *table)
>
> f->flags= (*fld)->flags;
> f->null_ptr= null_ptr;
> - f->null_bit= null_bit;
> + f->null_bit= (*fld)->null_bit;
> if (null_bit == 128)
> null_ptr++, null_bit= 1;
> else
> diff --git a/sql/sql_trigger.h b/sql/sql_trigger.h
> index f451dfda1ee..c1174bfd36e 100644
> --- a/sql/sql_trigger.h
> +++ b/sql/sql_trigger.h
> @@ -226,6 +226,10 @@ class Table_triggers_list: public Sql_alloc
> trigger_table->s->null_fields + 7)/8;
> bzero(extra_null_bitmap, null_bytes);
> }
> + uchar *get_extra_null_bitmap() const
> + {
> + return extra_null_bitmap;
> + }
This is not used anywhere
> private:
> bool prepare_record_accessors(TABLE *table);
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] e752eed5354: MDEV-19751 Wrong partitioning by KEY() after key dropped
by Sergei Golubchik 29 Oct '19
Hi, Aleksey!
On Oct 29, Aleksey Midenkov wrote:
> revision-id: e752eed5354 (mariadb-10.4.7-47-ge752eed5354)
> parent(s): 8403148452a
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2019-08-19 12:00:13 +0300
> message:
>
> MDEV-19751 Wrong partitioning by KEY() after key dropped
>
> Empty partition by key() clause implies that key might be changed by
> ALTER, but inplace algorithm must not be used since repartitioning is
> required for a different key.
good and very clear comment.
> diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc
> index 10b9c74d868..d76ff5e2e76 100644
> --- a/sql/sql_partition.cc
> +++ b/sql/sql_partition.cc
> @@ -5903,6 +5903,45 @@ the generated partition syntax in a correct manner.
> *partition_changed= TRUE;
> }
> }
> + /*
> + Prohibit inplace when key takes part in partitioning expression
> + and is altered (dropped).
> + */
> + if (!*partition_changed && tab_part_info->part_field_array)
> + {
> + KEY *key_info= table->key_info;
> + List_iterator_fast<Alter_drop> drop_it(alter_info->drop_list);
> + for (uint key= 0; key < table->s->keys; key++, key_info++)
> + {
> + if (key_info->flags & HA_INVISIBLE_KEY)
> + continue;
> + const char *key_name= key_info->name.str;
> + const Alter_drop *drop;
> + drop_it.rewind();
> + while ((drop= drop_it++))
> + {
> + if (drop->type == Alter_drop::KEY &&
> + 0 == my_strcasecmp(system_charset_info, key_name, drop->name))
> + break;
> + }
> + if (!drop)
> + continue;
> + for (uint kp= 0; kp < key_info->user_defined_key_parts; ++kp)
> + {
> + const KEY_PART_INFO &key_part= key_info->key_part[kp];
> + for (Field **part_field= tab_part_info->part_field_array;
> + *part_field; ++part_field)
> + {
> + if (*part_field == key_part.field)
> + {
> + *partition_changed= TRUE;
> + goto search_finished;
> + }
> + } // for (part_field)
> + } // for (key_part)
> + } // for (key_info)
Hmm, if I read it correctly, you iterate over all key parts of every
dropped key and see if this field is also part of the partitioning
expression.
This sounds kind of strange. What if there are two indexes that use the
column `a`, say
CREATE TABLE t1 (a int, b int, index(a), index(b,a))
and you partition by key(a). Dropping the second index does not change
the partitioning, does it?
On the other hand, if it's ALTER TABLE ... MODIFY `a` then it can change
partitioning, even when the key hasn't changed, right?
So, I suspect your check is too strong in some cases and too loose in
others.
> + search_finished:;
> + }
> }
> if (thd->work_part_info)
> {
> diff --git a/storage/connect/mysql-test/connect/r/part_file.result b/storage/connect/mysql-test/connect/r/part_file.result
> index 3dabd946b50..480de51cce6 100644
> --- a/storage/connect/mysql-test/connect/r/part_file.result
> +++ b/storage/connect/mysql-test/connect/r/part_file.result
> @@ -308,12 +308,19 @@ EXPLAIN PARTITIONS SELECT * FROM t1 WHERE id = 10;
> id select_type table partitions type possible_keys key key_len ref rows Extra
> 1 SIMPLE t1 1 ref XID XID 4 const 1
> DROP INDEX XID ON t1;
> +Warnings:
> +Warning 1105 Data repartition in 1 is unchecked
> +Warning 1105 Data repartition in 2 is unchecked
> +Warning 1105 Data repartition in 3 is unchecked
why did the result file change?
could you add a short explanation of it to the commit comment, please?
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
[Maria-developers] Please review MDEV-18319 BIGINT UNSIGNED Performance issue
by Alexander Barkov 29 Oct '19
Hi Sergei,
Please review a patch for MDEV-18319
https://github.com/MariaDB/server/commit/8a990ad17746927c6d395ec755a262eda5…
Thanks
Re: [Maria-developers] 7a331ec820b: MDEV-18244 Server crashes in ha_innobase::update_thd / ... / ha_partition::update_next_auto_inc_val.
by Sergei Golubchik 25 Oct '19
Hi, Alexey!
On Oct 25, Alexey Botchkov wrote:
> revision-id: 7a331ec820b (mariadb-10.3.18-65-g7a331ec820b)
> parent(s): 716d396bb3b
> author: Alexey Botchkov <holyfoot(a)mariadb.com>
> committer: Alexey Botchkov <holyfoot(a)mariadb.com>
> timestamp: 2019-10-22 01:40:48 +0400
> message:
>
> MDEV-18244 Server crashes in ha_innobase::update_thd / ... / ha_partition::update_next_auto_inc_val.
>
> Autoincrement calculation should check if tables were opened.
>
> ---
> mysql-test/main/partition_innodb.result | 13 +++++++++++++
> mysql-test/main/partition_innodb.test | 15 +++++++++++++++
> sql/ha_partition.cc | 2 ++
> 3 files changed, 30 insertions(+)
>
> diff --git a/mysql-test/main/partition_innodb.result b/mysql-test/main/partition_innodb.result
> index f3d24347ff9..93a73a785fe 100644
> --- a/mysql-test/main/partition_innodb.result
> +++ b/mysql-test/main/partition_innodb.result
> @@ -1028,5 +1028,18 @@ COUNT(*)
> 2
> DROP TABLE t1;
> #
> +# MDEV-18244 Server crashes in ha_innobase::update_thd / ... / ha_partition::update_next_auto_inc_val
> +#
> +CREATE TABLE t1 (a INT)
> +ENGINE=InnoDB
> +PARTITION BY RANGE (a) (
> +PARTITION p0 VALUES LESS THAN (6),
> +PARTITION pn VALUES LESS THAN MAXVALUE
> +);
> +INSERT INTO t1 VALUES (4),(5),(6);
> +ALTER TABLE t1 MODIFY a INT AUTO_INCREMENT PRIMARY KEY;
> +UPDATE t1 PARTITION (p0) SET a = 3 WHERE a = 5;
> +DROP TABLE t1;
> +#
> # End of 10.3 tests
> #
> diff --git a/mysql-test/main/partition_innodb.test b/mysql-test/main/partition_innodb.test
> index 629bc29e758..1e0af06b4da 100644
> --- a/mysql-test/main/partition_innodb.test
> +++ b/mysql-test/main/partition_innodb.test
> @@ -1105,6 +1105,21 @@ INSERT INTO t1 VALUES (1, 7, 8, 9), (2, NULL, NULL, NULL), (3, NULL, NULL, NULL)
> SELECT COUNT(*) FROM t1 WHERE x IS NULL AND y IS NULL AND z IS NULL;
> DROP TABLE t1;
>
> +--echo #
> +--echo # MDEV-18244 Server crashes in ha_innobase::update_thd / ... / ha_partition::update_next_auto_inc_val
> +--echo #
> +
> +CREATE TABLE t1 (a INT)
> + ENGINE=InnoDB
> + PARTITION BY RANGE (a) (
> + PARTITION p0 VALUES LESS THAN (6),
> + PARTITION pn VALUES LESS THAN MAXVALUE
> + );
> +INSERT INTO t1 VALUES (4),(5),(6);
> +ALTER TABLE t1 MODIFY a INT AUTO_INCREMENT PRIMARY KEY;
> +UPDATE t1 PARTITION (p0) SET a = 3 WHERE a = 5;
> +DROP TABLE t1;
> +
> --echo #
> --echo # End of 10.3 tests
> --echo #
> diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> index 5a78249644d..e461532db1a 100644
> --- a/sql/ha_partition.cc
> +++ b/sql/ha_partition.cc
> @@ -8174,6 +8174,8 @@ int ha_partition::info(uint flag)
> ("checking all partitions for auto_increment_value"));
> do
> {
> + if (!bitmap_is_set(&m_opened_partitions, (uint)(file_array - m_file)))
> + continue;
> file= *file_array;
> file->info(HA_STATUS_AUTO | no_lock_flag);
> set_if_bigger(auto_increment_value,
It seems that the intention was to find the max auto_increment_value
over all partitions. If you skip some, it won't be max anymore.
On the other hand, I don't see why UPDATE should get the max auto-inc of
all partitions. Maybe the correct fix would be not to do it in UPDATE?
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] 52f3829b95c: MDEV-20001 Potential dangerous regression: INSERT INTO >=100 rows fail for myisam table with HASH indexes
by Sergei Golubchik 25 Oct '19
Hi, Sachin!
This is ok to push.
But, still, about Aria. When you push the fix for MDEV-18791, it will be
very easy to forget this one.
Maybe you could add the same test case for Aria now? It'll fail with
something like a "not supported" error, which is fine; just use --error.
But when you fix MDEV-18791 this test will be enabled and it'll ensure
you won't forget to fix Aria for bulk inserts.
On Oct 25, Sachin Setiya wrote:
> revision-id: 52f3829b95c (mariadb-10.4.7-110-g52f3829b95c)
> parent(s): 93d2211e02f
> author: Sachin <sachin.setiya(a)mariadb.com>
> committer: Sachin <sachin.setiya(a)mariadb.com>
> timestamp: 2019-10-09 21:17:32 +0530
> message:
>
> MDEV-20001 Potential dangerous regression: INSERT INTO >=100 rows fail for myisam table with HASH indexes
>
> Problem:-
>
> So the issue is when we do bulk insert with rows
> > MI_MIN_ROWS_TO_DISABLE_INDEXES(100) , We try to disable the indexes to
> speedup insert. But current logic also disables the long unique indexes.
>
> Solution:- In ha_myisam::start_bulk_insert if we find long hash index
> (HA_KEY_ALG_LONG_HASH) we will not disable the index.
>
> This commit also refactors the mi_disable_indexes_for_rebuild function,
> Since this is function is called at only one place, it is inlined into
> start_bulk_insert
>
> mi_clear_key_active is added into myisamdef.h because now it is also used
> in ha_myisam.cc file.
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] a0e3bd09251: Part1: MDEV-20837 Add MariaDB_FUNCTION_PLUGIN
by Sergei Golubchik 16 Oct '19
Hi, Alexander!
Just a couple of comments, see below:
On Oct 16, Alexander Barkov wrote:
> revision-id: a0e3bd09251 (mariadb-10.4.4-419-ga0e3bd09251)
> parent(s): 22b645ef529
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2019-10-16 16:26:29 +0400
> message:
>
> Part1: MDEV-20837 Add MariaDB_FUNCTION_PLUGIN
>
> - Defining MariaDB_FUNCTION_PLUGIN
> - Changing the code in /plugins/type_inet/ and /plugins/type_test/
> to use MariaDB_FUNCTION_PLUGIN instead of MariaDB_FUNCTION_COLLECTION_PLUGIN.
> diff --git a/include/mysql/plugin_function.h b/include/mysql/plugin_function.h
> new file mode 100644
> index 00000000000..4ce416612e9
> --- /dev/null
> +++ b/include/mysql/plugin_function.h
> @@ -0,0 +1,58 @@
> +#ifndef MARIADB_PLUGIN_FUNCTION_INCLUDED
> +#define MARIADB_PLUGIN_FUNCTION_INCLUDED
> +/* Copyright (C) 2019, Alexander Barkov and MariaDB
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; version 2 of the License.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program; if not, write to the Free Software
> + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA */
> +
> +/**
> + @file
> +
> + Function Plugin API.
> +
> + This file defines the API for server plugins that manage functions.
> +*/
> +
> +#ifdef __cplusplus
> +
> +#include <mysql/plugin.h>
> +
> +/*
> + API for function plugins. (MariaDB_FUNCTION_PLUGIN)
> +*/
> +#define MariaDB_FUNCTION_INTERFACE_VERSION (MYSQL_VERSION_ID << 8)
> +
> +
> +class Plugin_function
> +{
> + int m_interface_version;
> + Create_func *m_builder;
> +public:
> + Plugin_function(int interface_version, Create_func *builder)
> + :m_interface_version(interface_version),
> + m_builder(builder)
> + { }
Why do you need this ^^^ constructor?
> + Plugin_function(Create_func *builder)
> + :m_interface_version(MariaDB_FUNCTION_INTERFACE_VERSION),
> + m_builder(builder)
> + { }
> + Create_func *create_func()
> + {
> + return m_builder;
> + }
> +};
> +
> +
> +#endif /* __cplusplus */
> +
> +#endif /* MARIADB_PLUGIN_FUNCTION_INCLUDED */
> diff --git a/plugin/type_inet/mysql-test/type_inet/func_inet_plugin.result b/plugin/type_inet/mysql-test/type_inet/func_inet_plugin.result
> index 4663ae485e2..a9422f2e4fd 100644
> --- a/plugin/type_inet/mysql-test/type_inet/func_inet_plugin.result
> +++ b/plugin/type_inet/mysql-test/type_inet/func_inet_plugin.result
> @@ -15,14 +16,94 @@ PLUGIN_LICENSE,
> PLUGIN_MATURITY,
> PLUGIN_AUTH_VERSION
> FROM INFORMATION_SCHEMA.PLUGINS
> -WHERE PLUGIN_TYPE='FUNCTION COLLECTION'
> - AND PLUGIN_NAME='func_inet';
> -PLUGIN_NAME func_inet
> +WHERE PLUGIN_TYPE='FUNCTION'
> + AND PLUGIN_NAME IN
> +('inet_aton',
> +'inet_ntoa',
> +'inet6_aton',
> +'inet6_ntoa',
> +'is_ipv4',
> +'is_ipv6',
> +'is_ipv4_compat',
> +'is_ipv4_mapped')
> +ORDER BY PLUGIN_NAME;
> +---- ----
> +PLUGIN_NAME inet6_aton
> PLUGIN_VERSION 1.0
> PLUGIN_STATUS ACTIVE
> -PLUGIN_TYPE FUNCTION COLLECTION
> +PLUGIN_TYPE FUNCTION
> PLUGIN_AUTHOR MariaDB Corporation
> -PLUGIN_DESCRIPTION Function collection test
> +PLUGIN_DESCRIPTION Function INET6_ATON()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
INET* functions should probably be Alpha, not Experimental.
You expect them to be GA one day, don't you?
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME inet6_ntoa
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function INET6_NTOA()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME inet_aton
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function INET_ATON()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME inet_ntoa
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function INET_NTOA()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME is_ipv4
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function IS_IPV4()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME is_ipv4_compat
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function IS_IPV4_COMPAT()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME is_ipv4_mapped
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function IS_IPV4_MAPPED()
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Experimental
> +PLUGIN_AUTH_VERSION 1.0
> +---- ----
> +PLUGIN_NAME is_ipv6
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE FUNCTION
> +PLUGIN_AUTHOR MariaDB Corporation
> +PLUGIN_DESCRIPTION Function IS_IPV6()
> PLUGIN_LICENSE GPL
> PLUGIN_MATURITY Experimental
> PLUGIN_AUTH_VERSION 1.0
> diff --git a/sql/item_create.cc b/sql/item_create.cc
> index e8eb76dfc12..e316723cedf 100644
> --- a/sql/item_create.cc
> +++ b/sql/item_create.cc
> @@ -5704,12 +5657,26 @@ void item_create_cleanup()
> {
> DBUG_ENTER("item_create_cleanup");
> my_hash_free(& native_functions_hash);
> -#ifdef HAVE_SPATIAL
> - plugin_function_collection_geometry.deinit();
> -#endif
> DBUG_VOID_RETURN;
> }
>
> +
> +static Create_func *
> +function_plugin_find_native_function_builder(THD *thd, const LEX_CSTRING &name)
> +{
> + plugin_ref plugin;
> + if ((plugin= my_plugin_lock_by_name(thd, &name, MariaDB_FUNCTION_PLUGIN)))
> + {
> + Create_func *builder=
> + reinterpret_cast<Plugin_function*>(plugin_decl(plugin)->info)->
> + create_func();
> + plugin_unlock(thd, plugin);
> + return builder;
> + }
> + return NULL;
So, function plugins cannot be unlocked either?
> +}
> +
> +
> Create_func *
> find_native_function_builder(THD *thd, const LEX_CSTRING *name)
> {
> @@ -5724,16 +5691,10 @@ find_native_function_builder(THD *thd, const LEX_CSTRING *name)
> if (func && (builder= func->builder))
> return builder;
>
> - if ((builder= Plugin_find_native_func_builder_param(*name).find(thd)))
> + if ((builder= function_plugin_find_native_function_builder(thd, *name)))
this is what I mean by "special code path for plugins" and want to get
rid of. But not in this commit, I agree.
> return builder;
>
> -#ifdef HAVE_SPATIAL
> - if (!builder)
> - builder= plugin_function_collection_geometry.
> - find_native_function_builder(thd, *name);
> -#endif
> -
> - return builder;
> + return NULL;
> }
>
> Create_qfunc *
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] c776ed4d6ec: MDEV-19848 Server crashes in check_vcol_forward_refs upon INSERT DELAYED into table with long blob key
by Sergei Golubchik 14 Oct '19
Hi, Sachin!
On Aug 20, Sachin Setiya wrote:
> revision-id: c776ed4d6ec (mariadb-10.4.5-153-gc776ed4d6ec)
> parent(s): 4ca016237f1
> author: Sachin <sachin.setiya(a)mariadb.com>
> committer: Sachin <sachin.setiya(a)mariadb.com>
> timestamp: 2019-07-30 03:42:21 +0530
> message:
>
> MDEV-19848 Server crashes in check_vcol_forward_refs upon INSERT DELAYED into table with long blob key
>
> There are 2 issues
>
> 1st:- in make_new_field when we & into new field flag we forget
> LONG_UNIQUE_HASH_FIELD Flag.
>
> 2nd:- We are calling parse_vcol_defs on keyinfo , but they are not in right
> form. We should call setup_keyinfo_hash_all before calling parse_vcol_defs
This looks quite suspicious.
Ideally, Delayed_insert::get_local_table() should just create a copy of
the table, but not modify the original table. In your case it constantly
modifies KEY's of the original table. It's kind of dirty.
As for the real bug, see what parse_vcol_defs() is doing for long unique
fields. In particular, it sets
keypart->field->vcol_info=
table->field[keypart->field->field_index]->vcol_info;
but the keypart here is in the original Delayed_insert::table,
while field and vcol_info is in a copy (created in
Delayed_insert::get_local_table). This seems to be just wrong.
I'd really prefer Delayed_insert::get_local_table to not change anything
in the Delayed_insert::table.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Hi!
I've been using MySQL/MariaDB for two decades but have more recently been
working with Elasticsearch. I knew to expect an inverted index to speed up
querying full text fields, but I've been surprised (and a bit annoyed) at
how fast ES can query structured data. (In my case, I'm largely looking
for intersections of a number of varchar fields with lowish cardinality,
e.g. WHERE country = 'US' AND client = 'Microsoft' AND status =
'Completed'.)
Elasticsearch seems to have several things going on, but I think a core
aspect, to use RDBMS terminology, is that each column is indexed, and index
unions/intersections are used if the WHERE clause references multiple
columns.
I've heard that MySQL/MariaDB has the ability to merge indexes, but I've
rarely observed it in person. Googling for it yields a bunch of
StackOverflow posts complaining how slow it is, with responses agreeing and
explaining how to disable it.
If I'm reading the MySQL/MariaDB code correctly, it looks like MariaDB will
intersect indexes by looping through each index, reading the rowids of all
matching keys and then, at the end (or once the buffer is full), checking
whether each rowid is present in each index.
I wonder if there's an opportunity to speed this up. If we first read in
the rowids from one index (ideally the one with the fewest matches), we
could tell the storage engine that, when reading the next index, it should
skip over rowids lower than the next candidate intersection. In the best
case scenario, I think this could enable InnoDB to use its page directory
to skip past some of the keys, improving the performance from O(n) to O(log
n).
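To make the idea concrete, here is a rough, self-contained sketch -- plain C++ with a made-up cursor type, not MariaDB's actual handler interfaces -- of intersecting sorted rowid streams by seeking each index forward to the current candidate instead of collecting all rowids up front:

  // Sketch only: "IndexCursor" is hypothetical; a real engine could seek
  // inside the B-tree (e.g. via the page directory) instead of scanning.
  #include <cstdint>
  #include <vector>

  using rowid_t= uint64_t;

  struct IndexCursor
  {
    const std::vector<rowid_t> *rowids;  // stand-in for one index range scan
    size_t pos= 0;
    // Position on the first rowid >= bound; false means the scan is exhausted.
    bool seek_at_least(rowid_t bound, rowid_t *out)
    {
      while (pos < rowids->size() && (*rowids)[pos] < bound)
        pos++;                           // this is the step that could become O(log n)
      if (pos == rowids->size())
        return false;
      *out= (*rowids)[pos];
      return true;
    }
  };

  // Round-robin "leapfrog" intersection: each cursor is asked to catch up
  // to the current candidate; a rowid is emitted once all cursors agree.
  std::vector<rowid_t> intersect(std::vector<IndexCursor> &cursors)
  {
    std::vector<rowid_t> result;
    if (cursors.empty())
      return result;
    rowid_t candidate= 0;
    size_t agreed= 0, i= 0;
    for (;;)
    {
      rowid_t r;
      if (!cursors[i].seek_at_least(candidate, &r))
        return result;                   // one stream ran out: we are done
      if (r == candidate)
        agreed++;
      else
      {
        candidate= r;                    // a larger candidate; start counting again
        agreed= 1;
      }
      if (agreed == cursors.size())
      {
        result.push_back(candidate);
        candidate++;                     // look for the next common rowid
        agreed= 0;
      }
      i= (i + 1) % cursors.size();
    }
  }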
That said, this is all new to me. Maybe there's an obvious reason this
wouldn't make much of an improvement, or maybe I've overlooked that it's
already been done. However, if it looks promising to you folk, and it's
something you'd consider merging, I'd be willing to attempt writing a PR
for it.
Thank you,
David Sickmiller
Re: [Maria-developers] Missing memory barrier in parallel replication error handler in wait_for_prior_commit()?
by Kristian Nielsen 11 Oct '19
sujatha <sujatha.sivakumar(a)mariadb.com> writes:
> I have a doubt. A simple fix as per the earlier mail discussion would
> be to swap the
>
> order of assignments as shown below.
>
> wakeup_error= true
> waitee= NULL
> Why cannot we use the simpler approach. Please provide your inputs.
This is because of the need for memory barriers, to prevent compiler and/or
modern high-performance CPUs from re-ordering the memory accesses.
There is nothing in C that requires the compiler to make the two assignments
in the same order as in the source code. And even if it does, the CPU is
free to perform the writes and/or the corresponding loads in the opposite
order (the x86 architecture doesn't do that, but other architectures do).
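Just to illustrate the pairing that is needed -- this is only a sketch using standard C++11 atomics, not how the actual server code is written (it has its own primitives and locking):

  #include <atomic>

  struct waiter_sketch                      // stand-in for wait_for_commit
  {
    bool wakeup_error;                      // written by the waker before waitee is cleared
    std::atomic<void *> waitee;             // NULL means "wakeup has happened"
  };

  void wakeup(waiter_sketch *w, bool err)
  {
    w->wakeup_error= err;
    // Release store: orders the wakeup_error write before the clearing of
    // waitee, both for the compiler and for the CPU.
    w->waitee.store(nullptr, std::memory_order_release);
  }

  bool try_get_result(waiter_sketch *w, bool *err)
  {
    // Acquire load: if we observe waitee == NULL, the wakeup_error written
    // before the matching release store is guaranteed to be visible too.
    if (w->waitee.load(std::memory_order_acquire) != nullptr)
      return false;                         // still being waited on
    *err= w->wakeup_error;
    return true;
  }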
There are many resources on the net on the need for memory barriers, for
example this from one of your collegues:
https://mariadb.org/wp-content/uploads/2017/11/2017-11-Memory-barriers.pdf
But it will be difficult to make a test case, I think, because after
correcting the order of the stores there is nowhere to put a debug-sleep,
and there is also no easy way to force the CPU to use the incorrect memory
ordering.
- Kristian.
Re: [Maria-developers] 4ca016237f1: MDEV-20001 Potential dangerous regression: INSERT INTO >=100 rows fail for myisam table with HASH indexes
by Sergei Golubchik 09 Oct '19
Hi, Sachin!
On Aug 20, Sachin Setiya wrote:
> revision-id: 4ca016237f1 (mariadb-10.4.5-152-g4ca016237f1)
> parent(s): 4a5cd407289
> author: Sachin <sachin.setiya(a)mariadb.com>
> committer: Sachin <sachin.setiya(a)mariadb.com>
> timestamp: 2019-07-29 19:33:05 +0530
> message:
>
> MDEV-20001 Potential dangerous regression: INSERT INTO >=100 rows fail for myisam table with HASH indexes
>
> Dont deactivate the long unique keys on bulk insert.
>
> diff --git a/storage/myisam/ha_myisam.cc b/storage/myisam/ha_myisam.cc
> index f478e01e441..c1169737911 100644
> --- a/storage/myisam/ha_myisam.cc
> +++ b/storage/myisam/ha_myisam.cc
> @@ -1749,7 +1749,16 @@ void ha_myisam::start_bulk_insert(ha_rows rows, uint flags)
> else
> {
> my_bool all_keys= MY_TEST(flags & HA_CREATE_UNIQUE_INDEX_BY_SORT);
> - mi_disable_indexes_for_rebuild(file, rows, all_keys);
> + if (table->s->long_unique_table)
> + {
> + ulonglong hash_key_map= 0ULL;
> + for(uint i= 0; i < table->s->keys; i++)
> + if (table->key_info[i].algorithm == HA_KEY_ALG_LONG_HASH)
> + mi_set_key_active(hash_key_map, i);
> + mi_disable_indexes_for_rebuild(file, rows, all_keys, hash_key_map);
> + }
> + else
> + mi_disable_indexes_for_rebuild(file, rows, all_keys, 0ULL);
I agree with the fix, but here's a comment about the implementation.
mi_disable_indexes_for_rebuild() is _only_ used here, nowhere else.
So, I think I'd just remove mi_disable_indexes_for_rebuild(), inline it
here, and do everything in one loop.
Note that you also need to fix the Aria engine, as it has almost the same
code. And add a test for it, please.
> }
> }
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] [Commits] d48294d15c4: MDEV-13694: Wrong result upon GROUP BY with orderby_uses_equalities=on
by Sergey Petrunia 09 Oct '19
Hi Varun,
This patch is a step in the right direction, but I think we're not quite there
yet.
The patch adds logic about semi-join materialization into the filesort code.
This makes things really messy.
I think some isolation would be beneficial. The first suggestion:
In Sort_param (as this structure is visible in find_all_keys), add members that control
the behavior inside find_all_keys:
- unpack_function (default is NULL)
- bool set_all_read_bits (default is FALSE)
Now, Sort_param is allocated in filesort(), so we will have to add the same
members in "class Filesort" and have filesort() copy them to Sort_param
(necessary because find_all_keys() doesn't see the "Filesort" object).
Then, these members can be set in the SQL layer e.g. in JOIN::add_sorting_to_table
which creates the Filesort structure. (It would be ideal if semi-join code
set them, but this is probably not possible to achieve as Filesort is set up
after the semi-join code is set up).
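Roughly, the above would look something like this (only a sketch; the member names and exact placement are of course up to you):

  struct TABLE;                             // opaque in this sketch
  typedef void (*Unpack_func)(TABLE *table);

  // On "class Filesort": set by the SQL layer (e.g. JOIN::add_sorting_to_table)
  // when the sorted table is an SJM scan table.
  struct Filesort_sketch
  {
    Unpack_func unpack_function= nullptr;   // default: no unpacking
    bool set_all_read_bits= false;          // default: leave the read set alone
  };

  // On Sort_param: filled in by filesort() from the Filesort object, because
  // find_all_keys() only sees Sort_param.
  struct Sort_param_sketch
  {
    Unpack_func unpack_function;
    bool set_all_read_bits;

    void copy_from(const Filesort_sketch &fs)
    {
      unpack_function= fs.unpack_function;
      set_all_read_bits= fs.set_all_read_bits;
    }
  };

  // ... and then find_all_keys() just does, after reading each row:
  //   if (param->unpack_function)
  //     param->unpack_function(sort_form);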
If after all these changes "JOIN_TAB::unpacking_to_base_table_fields()" is
still there, please rename it to "unpack_..." so that it follows the
established convention.
That's all the input.
On Sat, Sep 28, 2019 at 02:12:01PM +0530, Varun wrote:
> revision-id: d48294d15c4895215d5facef97fc80c03cd6b4b0 (mariadb-10.4.4-341-gd48294d15c4)
> parent(s): a340af922361e3958e5d6653c8b840771db282f2
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2019-09-28 13:06:44 +0530
> message:
>
> MDEV-13694: Wrong result upon GROUP BY with orderby_uses_equalities=on
>
> For the case when the SJM scan table is the first table in the join order,
> then if we want to do the sorting on the SJM scan table, then we need to
> make sure that we unpack the values to base table fields in two cases:
> 1) Reading the SJM table and writing the sort-keys inside the sort-buffer
> 2) Reading the sorted data from the sort file
>
> ---
> mysql-test/main/order_by.result | 138 +++++++++++++++++++++++++++++++++++++++-
> mysql-test/main/order_by.test | 34 ++++++++++
> sql/filesort.cc | 17 +++++
> sql/opt_subselect.cc | 10 ++-
> sql/records.cc | 13 ++++
> sql/records.h | 1 +
> sql/sql_select.cc | 89 ++++++++++----------------
> sql/sql_select.h | 4 +-
> sql/table.h | 1 +
> 9 files changed, 246 insertions(+), 61 deletions(-)
>
> diff --git a/mysql-test/main/order_by.result b/mysql-test/main/order_by.result
> index b059cc686cd..e74583670fc 100644
> --- a/mysql-test/main/order_by.result
> +++ b/mysql-test/main/order_by.result
> @@ -3322,7 +3322,7 @@ WHERE books.library_id = 8663 AND
> books.scheduled_for_removal=0 )
> ORDER BY wings.id;
> id select_type table type possible_keys key key_len ref rows filtered Extra
> -1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using temporary; Using filesort
> +1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 2 100.00 Using filesort
> 1 PRIMARY wings eq_ref PRIMARY PRIMARY 4 test.books.wings_id 1 100.00
> 2 MATERIALIZED books ref library_idx library_idx 4 const 2 100.00 Using where
> Warnings:
> @@ -3436,3 +3436,139 @@ Note 1003 select `test`.`t4`.`a` AS `a`,`test`.`t4`.`b` AS `b`,`test`.`t4`.`c` A
> set histogram_size=@tmp_h, histogram_type=@tmp_ht, use_stat_tables=@tmp_u,
> optimizer_use_condition_selectivity=@tmp_o;
> drop table t1,t2,t3,t4;
> +#
> +# MDEV-13694: Wrong result upon GROUP BY with orderby_uses_equalities=on
> +#
> +CREATE TABLE t1 (a INT, b int, primary key(a));
> +CREATE TABLE t2 (a INT, b INT);
> +INSERT INTO t1 (a,b) VALUES (58,1),(96,2),(273,3),(23,4),(231,5),(525,6),
> +(2354,7),(321421,3),(535,2),(4535,3);
> +INSERT INTO t2 (a,b) VALUES (58,3),(96,3),(273,3);
> +# Join order should have the SJM scan table as the first table for both
> +# the queries with GROUP BY and ORDER BY clause.
> +EXPLAIN SELECT t1.a
> +FROM t1
> +WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> +ORDER BY t1.a DESC;
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 3 Using filesort
> +1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 test.t2.a 1 Using index
> +2 MATERIALIZED t2 ALL NULL NULL NULL NULL 3 Using where
> +EXPLAIN FORMAT=JSON SELECT t1.a
> +FROM t1
> +WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> +ORDER BY t1.a DESC;
> +EXPLAIN
> +{
> + "query_block": {
> + "select_id": 1,
> + "read_sorted_file": {
> + "filesort": {
> + "sort_key": "t1.a desc",
> + "table": {
> + "table_name": "<subquery2>",
> + "access_type": "ALL",
> + "possible_keys": ["distinct_key"],
> + "rows": 3,
> + "filtered": 100,
> + "materialized": {
> + "unique": 1,
> + "query_block": {
> + "select_id": 2,
> + "table": {
> + "table_name": "t2",
> + "access_type": "ALL",
> + "rows": 3,
> + "filtered": 100,
> + "attached_condition": "t2.b = 3 and t2.a is not null"
> + }
> + }
> + }
> + }
> + }
> + },
> + "table": {
> + "table_name": "t1",
> + "access_type": "eq_ref",
> + "possible_keys": ["PRIMARY"],
> + "key": "PRIMARY",
> + "key_length": "4",
> + "used_key_parts": ["a"],
> + "ref": ["test.t2.a"],
> + "rows": 1,
> + "filtered": 100,
> + "using_index": true
> + }
> + }
> +}
> +SELECT t1.a
> +FROM t1
> +WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> +ORDER BY t1.a DESC;
> +a
> +273
> +96
> +58
> +EXPLAIN SELECT t1.a, group_concat(t1.b)
> +FROM t1
> +WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> +GROUP BY t1.a DESC;
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY <subquery2> ALL distinct_key NULL NULL NULL 3 Using filesort
> +1 PRIMARY t1 eq_ref PRIMARY PRIMARY 4 test.t2.a 1
> +2 MATERIALIZED t2 ALL NULL NULL NULL NULL 3 Using where
> +EXPLAIN FORMAT=JSON SELECT t1.a, group_concat(t1.b)
> +FROM t1
> +WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> +GROUP BY t1.a DESC;
> +EXPLAIN
> +{
> + "query_block": {
> + "select_id": 1,
> + "read_sorted_file": {
> + "filesort": {
> + "sort_key": "t1.a desc",
> + "table": {
> + "table_name": "<subquery2>",
> + "access_type": "ALL",
> + "possible_keys": ["distinct_key"],
> + "rows": 3,
> + "filtered": 100,
> + "materialized": {
> + "unique": 1,
> + "query_block": {
> + "select_id": 2,
> + "table": {
> + "table_name": "t2",
> + "access_type": "ALL",
> + "rows": 3,
> + "filtered": 100,
> + "attached_condition": "t2.b = 3 and t2.a is not null"
> + }
> + }
> + }
> + }
> + }
> + },
> + "table": {
> + "table_name": "t1",
> + "access_type": "eq_ref",
> + "possible_keys": ["PRIMARY"],
> + "key": "PRIMARY",
> + "key_length": "4",
> + "used_key_parts": ["a"],
> + "ref": ["test.t2.a"],
> + "rows": 1,
> + "filtered": 100
> + }
> + }
> +}
> +SELECT t1.a, group_concat(t1.b)
> +FROM t1
> +WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> +GROUP BY t1.a DESC;
> +a group_concat(t1.b)
> +273 3
> +96 2
> +58 1
> +DROP TABLE t1, t2;
> diff --git a/mysql-test/main/order_by.test b/mysql-test/main/order_by.test
> index 934c503302f..b3e43d27e2f 100644
> --- a/mysql-test/main/order_by.test
> +++ b/mysql-test/main/order_by.test
> @@ -2276,3 +2276,37 @@ set histogram_size=@tmp_h, histogram_type=@tmp_ht, use_stat_tables=@tmp_u,
> optimizer_use_condition_selectivity=@tmp_o;
>
> drop table t1,t2,t3,t4;
> +
> +
> +--echo #
> +--echo # MDEV-13694: Wrong result upon GROUP BY with orderby_uses_equalities=on
> +--echo #
> +
> +CREATE TABLE t1 (a INT, b int, primary key(a));
> +CREATE TABLE t2 (a INT, b INT);
> +
> +INSERT INTO t1 (a,b) VALUES (58,1),(96,2),(273,3),(23,4),(231,5),(525,6),
> + (2354,7),(321421,3),(535,2),(4535,3);
> +INSERT INTO t2 (a,b) VALUES (58,3),(96,3),(273,3);
> +
> +--echo # Join order should have the SJM scan table as the first table for both
> +--echo # the queries with GROUP BY and ORDER BY clause.
> +
> +let $query= SELECT t1.a
> + FROM t1
> + WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> + ORDER BY t1.a DESC;
> +
> +eval EXPLAIN $query;
> +eval EXPLAIN FORMAT=JSON $query;
> +eval $query;
> +
> +let $query= SELECT t1.a, group_concat(t1.b)
> + FROM t1
> + WHERE t1.a IN (SELECT a FROM t2 WHERE b=3)
> + GROUP BY t1.a DESC;
> +
> +eval EXPLAIN $query;
> +eval EXPLAIN FORMAT=JSON $query;
> +eval $query;
> +DROP TABLE t1, t2;
> diff --git a/sql/filesort.cc b/sql/filesort.cc
> index 3f4291cfb1f..e5c83293e9f 100644
> --- a/sql/filesort.cc
> +++ b/sql/filesort.cc
> @@ -716,11 +716,21 @@ static ha_rows find_all_keys(THD *thd, Sort_param *param, SQL_SELECT *select,
> *found_rows= 0;
> ref_pos= &file->ref[0];
> next_pos=ref_pos;
> + JOIN_TAB *tab= sort_form->reginfo.join_tab;
> + JOIN *join= tab ? tab->join : NULL;
> + bool first_is_in_sjm_nest= FALSE;
>
> DBUG_EXECUTE_IF("show_explain_in_find_all_keys",
> dbug_serve_apcs(thd, 1);
> );
>
> + if (join && join->table_count != join->const_tables &&
> + (join->join_tab + join->const_tables == tab))
> + {
> + TABLE_LIST *tbl_for_first= sort_form->pos_in_table_list;
> + first_is_in_sjm_nest= tbl_for_first && tbl_for_first->is_sjm_scan_table();
> + }
> +
> if (!quick_select)
> {
> next_pos=(uchar*) 0; /* Find records in sequence */
> @@ -756,13 +766,20 @@ static ha_rows find_all_keys(THD *thd, Sort_param *param, SQL_SELECT *select,
> goto err;
> }
>
> + if (first_is_in_sjm_nest)
> + sort_form->column_bitmaps_set(save_read_set, save_write_set);
> +
> DEBUG_SYNC(thd, "after_index_merge_phase1");
> for (;;)
> {
> if (quick_select)
> error= select->quick->get_next();
> else /* Not quick-select */
> + {
> error= file->ha_rnd_next(sort_form->record[0]);
> + if (first_is_in_sjm_nest)
> + tab->unpacking_to_base_table_fields();
> + }
> if (unlikely(error))
> break;
> file->position(sort_form->record[0]);
> diff --git a/sql/opt_subselect.cc b/sql/opt_subselect.cc
> index 87458357865..f837a6394af 100644
> --- a/sql/opt_subselect.cc
> +++ b/sql/opt_subselect.cc
> @@ -4252,11 +4252,11 @@ bool setup_sj_materialization_part2(JOIN_TAB *sjm_tab)
> sjm_tab->type= JT_ALL;
>
> /* Initialize full scan */
> - sjm_tab->read_first_record= join_read_record_no_init;
> + sjm_tab->read_first_record= join_init_read_record;
> sjm_tab->read_record.copy_field= sjm->copy_field;
> sjm_tab->read_record.copy_field_end= sjm->copy_field +
> sjm->sjm_table_cols.elements;
> - sjm_tab->read_record.read_record_func= rr_sequential_and_unpack;
> + sjm_tab->read_record.read_record_func= read_record_func_for_rr_and_unpack;
> }
>
> sjm_tab->bush_children->end[-1].next_select= end_sj_materialize;
> @@ -7105,3 +7105,9 @@ bool Item_in_subselect::pushdown_cond_for_in_subquery(THD *thd, Item *cond)
> thd->lex->current_select= save_curr_select;
> DBUG_RETURN(FALSE);
> }
> +
> +
> +bool TABLE_LIST::is_sjm_scan_table()
> +{
> + return is_active_sjm() && sj_mat_info->is_sj_scan;
> +}
> diff --git a/sql/records.cc b/sql/records.cc
> index 3d709182a4e..f6885f773d5 100644
> --- a/sql/records.cc
> +++ b/sql/records.cc
> @@ -709,3 +709,16 @@ static int rr_cmp(uchar *a,uchar *b)
> return (int) a[7] - (int) b[7];
> #endif
> }
> +
> +
> +int read_record_func_for_rr_and_unpack(READ_RECORD *info)
> +{
> + int error;
> + if ((error= info->read_record_func_and_unpack_calls(info)))
> + return error;
> +
> + for (Copy_field *cp= info->copy_field; cp != info->copy_field_end; cp++)
> + (*cp->do_copy)(cp);
> +
> + return error;
> +}
> diff --git a/sql/records.h b/sql/records.h
> index faf0d13c9a9..037a06b9d34 100644
> --- a/sql/records.h
> +++ b/sql/records.h
> @@ -55,6 +55,7 @@ struct READ_RECORD
> TABLE *table; /* Head-form */
> Unlock_row_func unlock_row;
> Read_func read_record_func;
> + Read_func read_record_func_and_unpack_calls;
> THD *thd;
> SQL_SELECT *select;
> uint ref_length, reclength, rec_cache_size, error_offset;
> diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> index 36d9eda3383..28bc57c692f 100644
> --- a/sql/sql_select.cc
> +++ b/sql/sql_select.cc
> @@ -14015,37 +14015,8 @@ remove_const(JOIN *join,ORDER *first_order, COND *cond,
> can be used without tmp. table.
> */
> bool can_subst_to_first_table= false;
> - bool first_is_in_sjm_nest= false;
> - if (first_is_base_table)
> - {
> - TABLE_LIST *tbl_for_first=
> - join->join_tab[join->const_tables].table->pos_in_table_list;
> - first_is_in_sjm_nest= tbl_for_first->sj_mat_info &&
> - tbl_for_first->sj_mat_info->is_used;
> - }
> - /*
> - Currently we do not employ the optimization that uses multiple
> - equalities for ORDER BY to remove tmp table in the case when
> - the first table happens to be the result of materialization of
> - a semi-join nest ( <=> first_is_in_sjm_nest == true).
> -
> - When a semi-join nest is materialized and scanned to look for
> - possible matches in the remaining tables for every its row
> - the fields from the result of materialization are copied
> - into the record buffers of tables from the semi-join nest.
> - So these copies are used to access the remaining tables rather
> - than the fields from the result of materialization.
> -
> - Unfortunately now this so-called 'copy back' technique is
> - supported only if the rows are scanned with the rr_sequential
> - function, but not with other rr_* functions that are employed
> - when the result of materialization is required to be sorted.
> -
> - TODO: either to support 'copy back' technique for the above case,
> - or to get rid of this technique altogether.
> - */
> if (optimizer_flag(join->thd, OPTIMIZER_SWITCH_ORDERBY_EQ_PROP) &&
> - first_is_base_table && !first_is_in_sjm_nest &&
> + first_is_base_table &&
> order->item[0]->real_item()->type() == Item::FIELD_ITEM &&
> join->cond_equal)
> {
> @@ -19922,19 +19893,6 @@ do_select(JOIN *join, Procedure *procedure)
> }
>
>
> -int rr_sequential_and_unpack(READ_RECORD *info)
> -{
> - int error;
> - if (unlikely((error= rr_sequential(info))))
> - return error;
> -
> - for (Copy_field *cp= info->copy_field; cp != info->copy_field_end; cp++)
> - (*cp->do_copy)(cp);
> -
> - return error;
> -}
> -
> -
> /**
> @brief
> Instantiates temporary table
> @@ -21223,6 +21181,8 @@ bool test_if_use_dynamic_range_scan(JOIN_TAB *join_tab)
>
> int join_init_read_record(JOIN_TAB *tab)
> {
> + bool need_unpacking= FALSE;
> + JOIN *join= tab->join;
> /*
> Note: the query plan tree for the below operations is constructed in
> save_agg_explain_data.
> @@ -21232,6 +21192,12 @@ int join_init_read_record(JOIN_TAB *tab)
> if (tab->filesort && tab->sort_table()) // Sort table.
> return 1;
>
> + if (join->top_join_tab_count != join->const_tables)
> + {
> + TABLE_LIST *tbl= tab->table->pos_in_table_list;
> + need_unpacking= tbl ? tbl->is_sjm_scan_table() : FALSE;
> + }
> +
> tab->build_range_rowid_filter_if_needed();
>
> DBUG_EXECUTE_IF("kill_join_init_read_record",
> @@ -21249,16 +21215,6 @@ int join_init_read_record(JOIN_TAB *tab)
> if (!tab->preread_init_done && tab->preread_init())
> return 1;
>
> -
> - if (init_read_record(&tab->read_record, tab->join->thd, tab->table,
> - tab->select, tab->filesort_result, 1,1, FALSE))
> - return 1;
> - return tab->read_record.read_record();
> -}
> -
> -int
> -join_read_record_no_init(JOIN_TAB *tab)
> -{
> Copy_field *save_copy, *save_copy_end;
>
> /*
> @@ -21268,12 +21224,20 @@ join_read_record_no_init(JOIN_TAB *tab)
> save_copy= tab->read_record.copy_field;
> save_copy_end= tab->read_record.copy_field_end;
>
> - init_read_record(&tab->read_record, tab->join->thd, tab->table,
> - tab->select, tab->filesort_result, 1, 1, FALSE);
> +
> + if (init_read_record(&tab->read_record, tab->join->thd, tab->table,
> + tab->select, tab->filesort_result, 1, 1, FALSE))
> + return 1;
>
> tab->read_record.copy_field= save_copy;
> tab->read_record.copy_field_end= save_copy_end;
> - tab->read_record.read_record_func= rr_sequential_and_unpack;
> +
> + if (need_unpacking)
> + {
> + tab->read_record.read_record_func_and_unpack_calls=
> + tab->read_record.read_record_func;
> + tab->read_record.read_record_func = read_record_func_for_rr_and_unpack;
> + }
>
> return tab->read_record.read_record();
> }
> @@ -28981,6 +28945,19 @@ void build_notnull_conds_for_inner_nest_of_outer_join(JOIN *join,
> }
>
>
> +/*
> + @brief
> + Unpacking temp table fields to base table fields.
> +*/
> +
> +void JOIN_TAB::unpacking_to_base_table_fields()
> +{
> + for (Copy_field *cp= read_record.copy_field;
> + cp != read_record.copy_field_end; cp++)
> + (*cp->do_copy)(cp);
> +}
> +
> +
> /**
> @} (end of group Query_Optimizer)
> */
> diff --git a/sql/sql_select.h b/sql/sql_select.h
> index 4f7bf49f635..545d4a788cc 100644
> --- a/sql/sql_select.h
> +++ b/sql/sql_select.h
> @@ -223,7 +223,7 @@ typedef enum_nested_loop_state
> (*Next_select_func)(JOIN *, struct st_join_table *, bool);
> Next_select_func setup_end_select_func(JOIN *join, JOIN_TAB *tab);
> int rr_sequential(READ_RECORD *info);
> -int rr_sequential_and_unpack(READ_RECORD *info);
> +int read_record_func_for_rr_and_unpack(READ_RECORD *info);
> Item *remove_pushed_top_conjuncts(THD *thd, Item *cond);
> Item *and_new_conditions_to_optimized_cond(THD *thd, Item *cond,
> COND_EQUAL **cond_eq,
> @@ -676,6 +676,7 @@ typedef struct st_join_table {
> table_map remaining_tables);
> bool fix_splitting(SplM_plan_info *spl_plan, table_map remaining_tables,
> bool is_const_table);
> + void unpacking_to_base_table_fields();
> } JOIN_TAB;
>
>
> @@ -2352,7 +2353,6 @@ create_virtual_tmp_table(THD *thd, Field *field)
>
> int test_if_item_cache_changed(List<Cached_item> &list);
> int join_init_read_record(JOIN_TAB *tab);
> -int join_read_record_no_init(JOIN_TAB *tab);
> void set_position(JOIN *join,uint idx,JOIN_TAB *table,KEYUSE *key);
> inline Item * and_items(THD *thd, Item* cond, Item *item)
> {
> diff --git a/sql/table.h b/sql/table.h
> index 1a7e5fbd4dc..35ba9bbb95d 100644
> --- a/sql/table.h
> +++ b/sql/table.h
> @@ -2622,6 +2622,7 @@ struct TABLE_LIST
> */
> const char *get_table_name() const { return view != NULL ? view_name.str : table_name.str; }
> bool is_active_sjm();
> + bool is_sjm_scan_table();
> bool is_jtbm() { return MY_TEST(jtbm_subselect != NULL); }
> st_select_lex_unit *get_unit();
> st_select_lex *get_single_select();
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
--
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
[Maria-developers] GSoC: On completing the MDEV-6017 (Add support for Indexes on Expressions)
by Alexey Mogilyovkin 01 Oct '19
Hello, I would like to finish the task I was working on during GSoC.
Nikita told me that there are some architectural questions about my code.
Can you tell me about them in more detail?
Re: [Maria-developers] 3f38da9145c: MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN
by Sergei Golubchik 27 Sep '19
Hi, Alexander!
On Sep 16, Alexander Barkov wrote:
> revision-id: 3f38da9145c (mariadb-10.4.4-251-g3f38da9145c)
> parent(s): e6ff3f9d1c8
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2019-07-12 07:53:55 +0400
> message:
>
> MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN
> diff --git a/include/mysql/plugin.h b/include/mysql/plugin.h
> index 85e52a247af..92703b626ac 100644
> --- a/include/mysql/plugin.h
> +++ b/include/mysql/plugin.h
> @@ -611,6 +612,22 @@ struct handlerton;
> int interface_version;
> };
>
> +/*
> + API for data type plugin. (MariaDB_DATA_TYPE_PLUGIN)
> +*/
> +#define MariaDB_DATA_TYPE_INTERFACE_VERSION (MYSQL_VERSION_ID << 8)
> +
> +/**
> + Data type plugin descriptor
> +*/
> +#ifdef __cplusplus
> +struct st_mariadb_data_type
> +{
> + int interface_version;
> + const class Type_handler *type_handler;
> +};
> +#endif
Plugin-wise it'd be better to have a separate plugin_data_type.h
and put the Type_handler definition there, so that as much of the API
as possible gets into the .pp file and we can detect when it changes.
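That is, something like this (a sketch only; it is mostly moving what you already have, plus whatever part of Type_handler plugins must see):

  /* include/mysql/plugin_data_type.h -- sketch */
  #ifndef MARIADB_PLUGIN_DATA_TYPE_INCLUDED
  #define MARIADB_PLUGIN_DATA_TYPE_INCLUDED

  #ifdef __cplusplus

  #include <mysql/plugin.h>

  #define MariaDB_DATA_TYPE_INTERFACE_VERSION (MYSQL_VERSION_ID << 8)

  /* The Type_handler declaration (or the part of it that plugins need)
     would go here, so it ends up in the .pp file as well. */

  struct st_mariadb_data_type
  {
    int interface_version;
    const class Type_handler *type_handler;
  };

  #endif /* __cplusplus */
  #endif /* MARIADB_PLUGIN_DATA_TYPE_INCLUDED */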
> +
> /*************************************************************************
> st_mysql_value struct for reading values from mysqld.
> Used by server variables framework to parse user-provided values.
> diff --git a/plugin/type_test/CMakeLists.txt b/plugin/type_test/CMakeLists.txt
> new file mode 100644
> index 00000000000..b85168d1bd2
> --- /dev/null
> +++ b/plugin/type_test/CMakeLists.txt
> @@ -0,0 +1,17 @@
> +# Copyright (c) 2016, MariaDB corporation. All rights reserved.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; version 2 of the License.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1335 USA
> +
> +MYSQL_ADD_PLUGIN(type_test plugin.cc RECOMPILE_FOR_EMBEDDED
> + MODULE_ONLY COMPONENT Test)
Add at least two new plugins, please.
Maybe in separate commits, as you like.
But at least two.
And practical ones, not just ones to test the API.
> diff --git a/plugin/type_test/mysql-test/type_test/type_test_int8-debug.test b/plugin/type_test/mysql-test/type_test/type_test_int8-debug.test
> new file mode 100644
> index 00000000000..30e313481d6
> --- /dev/null
> +++ b/plugin/type_test/mysql-test/type_test/type_test_int8-debug.test
> @@ -0,0 +1,11 @@
> +--echo #
> +--echo # MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN
> +--echo #
> +
> +SET SESSION debug_dbug="+d,frm_data_type_info";
> +
> +CREATE TABLE t1 (a TEST_INT8);
> +SHOW CREATE TABLE t1;
> +DROP TABLE t1;
> +
> +SET SESSION debug_dbug="-d,frm_data_type_info";
Don't do that; always save the old debug_dbug value instead and restore it later.
SET @old_debug_dbug=@@debug_dbug;
SET @@debug_dbug="+d,frm_data_type_info";
...
SET @@debug_dbug=@old_debug_dbug;
This is because after "+d,xxx" you have one keyword enabled, as in
@@debug_dbug="d,xxx", and after "-d,xxx" you have no keywords enabled,
as in @@debug_dbug="d", which means "all keywords enabled" for dbug.
> diff --git a/plugin/type_test/mysql-test/type_test/type_test_int8.result b/plugin/type_test/mysql-test/type_test/type_test_int8.result
> new file mode 100644
> index 00000000000..758f94904c1
> --- /dev/null
> +++ b/plugin/type_test/mysql-test/type_test/type_test_int8.result
> @@ -0,0 +1,144 @@
> +#
> +# MDEV-20016 Add MariaDB_DATA_TYPE_PLUGIN
> +#
> +SELECT
> +PLUGIN_NAME,
> +PLUGIN_VERSION,
> +PLUGIN_STATUS,
> +PLUGIN_TYPE,
> +PLUGIN_AUTHOR,
> +PLUGIN_DESCRIPTION,
> +PLUGIN_LICENSE,
> +PLUGIN_MATURITY,
> +PLUGIN_AUTH_VERSION
> +FROM INFORMATION_SCHEMA.PLUGINS
> +WHERE PLUGIN_TYPE='DATA TYPE'
> + AND PLUGIN_NAME='TEST_INT8';
> +PLUGIN_NAME TEST_INT8
> +PLUGIN_VERSION 1.0
> +PLUGIN_STATUS ACTIVE
> +PLUGIN_TYPE DATA TYPE
> +PLUGIN_AUTHOR MariaDB
> +PLUGIN_DESCRIPTION Data type TEST_INT8
> +PLUGIN_LICENSE GPL
> +PLUGIN_MATURITY Alpha
> +PLUGIN_AUTH_VERSION 1.0
> +CREATE TABLE t1 (a TEST_INT8);
> +SHOW CREATE TABLE t1;
> +Table Create Table
> +t1 CREATE TABLE `t1` (
> + `a` test_int8 DEFAULT NULL
Is the type name a reserved word, or an arbitrary identifier?
Should we support (and print it as)
`a` `test_int8` DEFAULT NULL
?
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t1;
> +SELECT CAST('100' AS TEST_INT8) AS cast;
> +cast
> +100
> +BEGIN NOT ATOMIC
> +DECLARE a TEST_INT8 DEFAULT 256;
> +SELECT HEX(a), a;
> +END;
> +$$
> +HEX(a) a
> +100 256
> +CREATE FUNCTION f1(p TEST_INT8) RETURNS TEST_INT8 RETURN 1;
> +SHOW CREATE FUNCTION f1;
> +Function sql_mode Create Function character_set_client collation_connection Database Collation
> +f1 STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION CREATE DEFINER=`root`@`localhost` FUNCTION `f1`(p TEST_INT8) RETURNS test_int8
> +RETURN 1 latin1 latin1_swedish_ci latin1_swedish_ci
Use enable_metadata for a part (or the whole) of this test
and show how mysql --column-type-info handles it.
> +SELECT f1(10);
> +f1(10)
> +1
> +DROP FUNCTION f1;
> +CREATE TABLE t1 (a TEST_INT8);
> +CREATE TABLE t2 AS SELECT a FROM t1;
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `a` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (a TEST_INT8);
> +CREATE TABLE t2 AS SELECT COALESCE(a,a) FROM t1;
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `COALESCE(a,a)` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (a TEST_INT8);
> +CREATE TABLE t2 AS SELECT LEAST(a,a) FROM t1;
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `LEAST(a,a)` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (a TEST_INT8);
> +INSERT INTO t1 VALUES (1),(2);
> +CREATE TABLE t2 AS SELECT MIN(a), MAX(a) FROM t1;
> +SELECT * FROM t2;
> +MIN(a) MAX(a)
> +1 2
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `MIN(a)` test_int8 DEFAULT NULL,
> + `MAX(a)` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (id INT, a TEST_INT8);
> +INSERT INTO t1 VALUES (1,1),(1,2),(2,1),(2,2);
> +CREATE TABLE t2 AS SELECT id, MIN(a), MAX(a) FROM t1 GROUP BY id;
> +SELECT * FROM t2;
> +id MIN(a) MAX(a)
> +1 1 2
> +2 1 2
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `id` int(11) DEFAULT NULL,
> + `MIN(a)` test_int8 DEFAULT NULL,
> + `MAX(a)` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (a TEST_INT8);
> +INSERT INTO t1 VALUES (1),(2);
> +CREATE TABLE t2 AS SELECT DISTINCT a FROM t1;
> +SELECT * FROM t2;
> +a
> +1
> +2
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `a` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (a TEST_INT8);
> +INSERT INTO t1 VALUES (1);
> +CREATE TABLE t2 AS SELECT (SELECT a FROM t1) AS c1;
> +SELECT * FROM t2;
> +c1
> +1
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `c1` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
> +CREATE TABLE t1 (a TEST_INT8);
> +CREATE TABLE t2 AS SELECT a FROM t1 UNION SELECT a FROM t1;
> +SHOW CREATE TABLE t2;
> +Table Create Table
> +t2 CREATE TABLE `t2` (
> + `a` test_int8 DEFAULT NULL
> +) ENGINE=MyISAM DEFAULT CHARSET=latin1
> +DROP TABLE t2;
> +DROP TABLE t1;
Please also cover CREATE ... LIKE, ALTER TABLE, SHOW COLUMNS, and I_S.COLUMNS.
Does it support indexing?
> diff --git a/plugin/type_test/plugin.cc b/plugin/type_test/plugin.cc
> new file mode 100644
> index 00000000000..ea70c70f786
> --- /dev/null
> +++ b/plugin/type_test/plugin.cc
> @@ -0,0 +1,120 @@
> +/*
> + Copyright (c) 2000, 2015, Oracle and/or its affiliates.
> + Copyright (c) 2009, 2019, MariaDB
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; version 2 of the License.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program; if not, write to the Free Software
> + Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
> +
> +#include <my_global.h>
> +#include <sql_class.h> // THD
> +#include <mysql/plugin.h>
> +#include "sql_type.h"
> +
> +
> +class Field_test_int8 :public Field_longlong
> +{
> +public:
> + Field_test_int8(const LEX_CSTRING &name, const Record_addr &addr,
> + enum utype unireg_check_arg,
> + uint32 len_arg, bool zero_arg, bool unsigned_arg)
> + :Field_longlong(addr.ptr(), len_arg, addr.null_ptr(), addr.null_bit(),
> + Field::NONE, &name, zero_arg, unsigned_arg)
> + {}
> + void sql_type(String &res) const
> + {
> + CHARSET_INFO *cs= res.charset();
> + res.length(cs->cset->snprintf(cs,(char*) res.ptr(),res.alloced_length(),
> + "test_int8"));
Isn't there an easier way of setting a String to a specific value?
Like, res.copy() or something?
By the way, why do you need to do it at all, if the parent's Field::sql_type
method could set the value from type_handler()->name()?
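I.e. something like this in the base class (a sketch only; types that print
extra attributes such as length or charset would of course keep their own
overrides):

  void Field::sql_type(String &res) const
  {
    /* take the printable type name straight from the handler */
    const LEX_CSTRING n= type_handler()->name().lex_cstring();
    res.set_ascii(n.str, n.length);
  }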
> + // UNSIGNED and ZEROFILL flags are not supported by the parser yet.
> + // add_zerofill_and_unsigned(res);
> + }
> + const Type_handler *type_handler() const;
> +};
> +
> +
> +class Type_handler_test_int8: public Type_handler_longlong
> +{
> +public:
> + const Name name() const override
> + {
> + static Name name(STRING_WITH_LEN("test_int8"));
I'd prefer this being picked up automatically from the plugin name.
Like it's done for engines, I_S tables, auth plugins, ft parsers, etc.
> + return name;
> + }
> + bool Column_definition_data_type_info_image(Binary_string *to,
> + const Column_definition &def)
> + const override
> + {
> + return to->append(Type_handler_test_int8::name().lex_cstring());
> + }
Not sure why this has to be overridden by the derived class.
> + Field *make_table_field(MEM_ROOT *root,
> + const LEX_CSTRING *name,
> + const Record_addr &addr,
> + const Type_all_attributes &attr,
> + TABLE *table) const override
> + {
> + return new (root)
> + Field_test_int8(*name, addr, Field::NONE,
> + attr.max_char_length(),
> + 0/*zerofill*/,
> + attr.unsigned_flag);
> + }
> +
> + Field *make_table_field_from_def(TABLE_SHARE *share, MEM_ROOT *root,
> + const LEX_CSTRING *name,
> + const Record_addr &rec, const Bit_addr &bit,
> + const Column_definition_attributes *attr,
> + uint32 flags) const override
> + {
> + return new (root)
> + Field_test_int8(*name, rec, attr->unireg_check,
> + (uint32) attr->length,
> + f_is_zerofill(attr->pack_flag) != 0,
> + f_is_dec(attr->pack_flag) == 0);
> + }
> +};
Why do you need two different methods? I'd expect one
that does new Field_test_int8() to be enough.
The second can be in the parent class, calling make_table_field().
Or both could be in the parent class, calling a single
virtual method that actually does new (root) Field_test_int8(...).
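I.e. something along these lines (only a sketch; make_field() is an
illustrative name, not an existing method, and the parameter plumbing is
simplified):

  // plugin side: the only per-type override that is really needed
  Field *make_field(MEM_ROOT *root, const LEX_CSTRING *name,
                    const Record_addr &addr, uint32 length,
                    bool zerofill, bool unsigned_arg) const
  {
    return new (root) Field_test_int8(*name, addr, Field::NONE,
                                      length, zerofill, unsigned_arg);
  }

  // parent class: both entry points can then just forward, e.g.
  Field *Type_handler_longlong::make_table_field_from_def(
            TABLE_SHARE *share, MEM_ROOT *root, const LEX_CSTRING *name,
            const Record_addr &rec, const Bit_addr &bit,
            const Column_definition_attributes *attr, uint32 flags) const
  {
    return make_field(root, name, rec, (uint32) attr->length,
                      f_is_zerofill(attr->pack_flag) != 0,
                      f_is_dec(attr->pack_flag) == 0);
  }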
> +
> +static Type_handler_test_int8 type_handler_test_int8;
> +
> +
> +const Type_handler *Field_test_int8::type_handler() const
> +{
> + return &type_handler_test_int8;
> +}
> +
> +
> +/*************************************************************************/
> +
> +static struct st_mariadb_data_type data_type_test_plugin=
> +{
> + MariaDB_DATA_TYPE_INTERFACE_VERSION,
> + &type_handler_test_int8
> +};
It'd be more interesting to have a distinct type, not just an alias
for BIGINT, e.g. a 7-byte integer.
As an example it's a pretty empty plugin; it doesn't show
why this API was ever created. Please add something non-trivial to it.
And some real, non-test plugin in a separate commit.
> +
> +
> +maria_declare_plugin(type_geom)
> +{
> + MariaDB_DATA_TYPE_PLUGIN, // the plugin type (see include/mysql/plugin.h)
> + &data_type_test_plugin, // pointer to type-specific plugin descriptor
> + "TEST_INT8", // plugin name
> + "MariaDB", // plugin author
MariaDB Corporation?
> + "Data type TEST_INT8", // the plugin description
> + PLUGIN_LICENSE_GPL, // the plugin license (see include/mysql/plugin.h)
> + 0, // Pointer to plugin initialization function
> + 0, // Pointer to plugin deinitialization function
> + 0x0100, // Numeric version 0xAABB means AA.BB veriosn
> + NULL, // Status variables
> + NULL, // System variables
> + "1.0", // String version representation
> + MariaDB_PLUGIN_MATURITY_ALPHA // Maturity (see include/mysql/plugin.h)*/
Use EXPERIMENTAL for test plugins (ALPHA kind of implies it'll become BETA, GAMMA
and STABLE eventually).
> +}
> +maria_declare_plugin_end;
> diff --git a/sql/sql_type.cc b/sql/sql_type.cc
> index 5cecd9f50f7..988619cb5f9 100644
> --- a/sql/sql_type.cc
> +++ b/sql/sql_type.cc
> @@ -121,8 +121,23 @@ bool Type_handler_data::init()
>
>
> const Type_handler *
> -Type_handler::handler_by_name(const LEX_CSTRING &name)
> +Type_handler::handler_by_name(THD *thd, const LEX_CSTRING &name)
> {
> + plugin_ref plugin;
> + if ((plugin= my_plugin_lock_by_name(thd, &name, MariaDB_DATA_TYPE_PLUGIN)))
> + {
> + /*
> + INSTALL PLUGIN is not fully supported for data type plugins yet.
Why? What's not supported?
> + Fow now we have only mandatory built-in plugins
> + and dynamic plugins for test purposes,
> + Should be safe to unlock the plugin immediately.
> + */
> + const Type_handler *ph= reinterpret_cast<st_mariadb_data_type*>
> + (plugin_decl(plugin)->info)->type_handler;
> + plugin_unlock(thd, plugin);
> + return ph;
> + }
> +
> #ifdef HAVE_SPATIAL
> const Type_handler *ha= type_collection_geometry.handler_by_name(name);
> if (ha)
> diff --git a/sql/sql_type.h b/sql/sql_type.h
> index 8f2d4d0c49d..b01330f30e4 100644
> --- a/sql/sql_type.h
> +++ b/sql/sql_type.h
> @@ -3221,9 +3221,9 @@ class Information_schema_character_attributes
> class Type_handler
> {
> protected:
> - static const Name m_version_default;
> - static const Name m_version_mysql56;
> - static const Name m_version_mariadb53;
> + static const MYSQL_PLUGIN_IMPORT Name m_version_default;
> + static const MYSQL_PLUGIN_IMPORT Name m_version_mysql56;
> + static const MYSQL_PLUGIN_IMPORT Name m_version_mariadb53;
Why do you need that? The parent's behavior should always be fine for plugins.
I don't think plugins should know about this at all.
> String *print_item_value_csstr(THD *thd, Item *item, String *str) const;
> String *print_item_value_temporal(THD *thd, Item *item, String *str,
> const Name &type_name, String *buf) const;
> @@ -5129,9 +5130,9 @@ class Type_handler_bool: public Type_handler_long
>
> class Type_handler_longlong: public Type_handler_general_purpose_int
> {
> - static const Name m_name_longlong;
> - static const Type_limits_int m_limits_sint64;
> - static const Type_limits_int m_limits_uint64;
> + static const MYSQL_PLUGIN_IMPORT Name m_name_longlong;
> + static const MYSQL_PLUGIN_IMPORT Type_limits_int m_limits_sint64;
> + static const MYSQL_PLUGIN_IMPORT Type_limits_int m_limits_uint64;
Same here.
> public:
> virtual ~Type_handler_longlong() {}
> const Name name() const { return m_name_longlong; }
> diff --git a/sql/table.cc b/sql/table.cc
> index ea333cb2ecd..70b3814df6c 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -2291,12 +2291,20 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
>
> if (field_data_type_info_array.count())
> {
> + const LEX_CSTRING &info= field_data_type_info_array.
> + element(i).type_info();
> DBUG_EXECUTE_IF("frm_data_type_info",
> push_warning_printf(thd, Sql_condition::WARN_LEVEL_NOTE,
> ER_UNKNOWN_ERROR, "DBUG: [%u] name='%s' type_info='%.*s'",
> i, share->fieldnames.type_names[i],
> - (uint) field_data_type_info_array.element(i).type_info().length,
> - field_data_type_info_array.element(i).type_info().str););
> + (uint) info.length, info.str););
> +
> + if (info.length)
> + {
> + const Type_handler *h= Type_handler::handler_by_name(thd, info);
> + if (h)
> + handler= h;
> + }
I don't see where you handle the "unknown type" error.
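E.g. something like (a sketch only; the exact error code and cleanup path are
up to you):

      if (info.length)
      {
        const Type_handler *h= Type_handler::handler_by_name(thd, info);
        if (!h)
        {
          /* sketch: complain instead of silently keeping the built-in
             handler taken from the frm */
          my_printf_error(ER_UNKNOWN_ERROR,
                          "Unknown data type '%.*s' for column '%s'", MYF(0),
                          (int) info.length, info.str,
                          share->fieldnames.type_names[i]);
          goto err;  /* or whatever the right error path is here */
        }
        handler= h;
      }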
> }
> }
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
2
3

Re: [Maria-developers] [Commits] 673e2537249: Fix missing memory barrier in wait_for_commit
by Kristian Nielsen 26 Sep '19
26 Sep '19
Hi Andrei,
> > I noticed another thing with the wait_for_commit error handling while
> > looking at the MDEV-18648 patch.
> So practically, we seem have to turn the types of the pair into
> std::atomic<> to store into, in the right order, and load from (the way
> it is).
Yes. Something like this? There is a bit of noise in the patch due to
changing the type of waitee to atomic and adjusting all code that accesses
it. But it's basically this in wakeup:
this->wakeup_error= wakeup_error;
waitee.store(NULL, std::memory_order_release);
and this in the waiter fast path (one place in sql_class.h and one in log.cc):
if (waitee.load(std::memory_order_acquire))
return wait_for_prior_commit2(thd);
else
return wakeup_error;
Other access are protected by lock or happen within the waiting thread only,
so they can just use the memory_order_relaxed access semantics without
barriers.
Unfortunately I don't see how to do a useful test case. It is not really
possible to force load or store reordering (on x86 I believe the CPU in fact
enforces ordering between loads and stores). I couldn't think of a test
that would fail without the patch and succeed with it.
If you agree with the patch, do you have an opinion on which version it
should go into? I'd suggest 10.4, since (IIUC) that version started the use
of std::atomic.
(Patch also available here:
https://github.com/knielsen/server/commit/673e253724979fd9fe43a4a22bd7e1b2c…)
- Kristian.
Kristian Nielsen <knielsen(a)knielsen-hq.org> writes:
> revision-id: 673e253724979fd9fe43a4a22bd7e1b2c3a5269e (mariadb-10.4.4-333-g673e2537249)
> parent(s): 8887effe13ad87ba0460d4d3068fb5696f089bb0
> author: Kristian Nielsen
> committer: Kristian Nielsen
> timestamp: 2019-09-26 17:43:26 +0200
> message:
>
> Fix missing memory barrier in wait_for_commit
>
> The function wait_for_commit::wait_for_prior_commit() has a fast path
> where it checks without locks if wakeup_subsequent_commits() has
> already been called. This check was missing a memory barrier. The
> waitee thread does two writes to variables `waitee' and
> `wakeup_error', and if the waiting thread sees the first write it
> _must_ also see the second or incorrect behaviour will occur. This
> requires memory barriers between both the writes (release semantics)
> and the reads (acquire semantics) of those two variables.
>
> Other accesses to these variables are done under lock or where only
> one thread will be accessing them, and can be done without barriers
> (relaxed sematics).
>
> ---
> sql/log.cc | 19 +++++++++++++------
> sql/sql_class.cc | 25 ++++++++++++++-----------
> sql/sql_class.h | 17 ++++++++++++-----
> 3 files changed, 39 insertions(+), 22 deletions(-)
>
> diff --git a/sql/log.cc b/sql/log.cc
> index 4f51a9a9c17..a88d5147898 100644
> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -7467,8 +7467,10 @@ MYSQL_BIN_LOG::queue_for_group_commit(group_commit_entry *orig_entry)
> */
> wfc= orig_entry->thd->wait_for_commit_ptr;
> orig_entry->queued_by_other= false;
> - if (wfc && wfc->waitee)
> + if (wfc && wfc->waitee.load(std::memory_order_acquire))
> {
> + wait_for_commit *loc_waitee;
> +
> mysql_mutex_lock(&wfc->LOCK_wait_commit);
> /*
> Do an extra check here, this time safely under lock.
> @@ -7480,10 +7482,10 @@ MYSQL_BIN_LOG::queue_for_group_commit(group_commit_entry *orig_entry)
> before setting the flag, so there is no risk that we can queue ahead of
> it.
> */
> - if (wfc->waitee && !wfc->waitee->commit_started)
> + if ((loc_waitee= wfc->waitee.load(std::memory_order_relaxed)) &&
> + !loc_waitee->commit_started)
> {
> PSI_stage_info old_stage;
> - wait_for_commit *loc_waitee;
>
> /*
> By setting wfc->opaque_pointer to our own entry, we mark that we are
> @@ -7505,7 +7507,8 @@ MYSQL_BIN_LOG::queue_for_group_commit(group_commit_entry *orig_entry)
> &wfc->LOCK_wait_commit,
> &stage_waiting_for_prior_transaction_to_commit,
> &old_stage);
> - while ((loc_waitee= wfc->waitee) && !orig_entry->thd->check_killed(1))
> + while ((loc_waitee= wfc->waitee.load(std::memory_order_relaxed)) &&
> + !orig_entry->thd->check_killed(1))
> mysql_cond_wait(&wfc->COND_wait_commit, &wfc->LOCK_wait_commit);
> wfc->opaque_pointer= NULL;
> DBUG_PRINT("info", ("After waiting for prior commit, queued_by_other=%d",
> @@ -7523,14 +7526,18 @@ MYSQL_BIN_LOG::queue_for_group_commit(group_commit_entry *orig_entry)
> do
> {
> mysql_cond_wait(&wfc->COND_wait_commit, &wfc->LOCK_wait_commit);
> - } while (wfc->waitee);
> + } while (wfc->waitee.load(std::memory_order_relaxed));
> }
> else
> {
> /* We were killed, so remove us from the list of waitee. */
> wfc->remove_from_list(&loc_waitee->subsequent_commits_list);
> mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit);
> - wfc->waitee= NULL;
> + /*
> + This is the thread clearing its own status, it is no longer on
> + the list of waiters. So no memory barriers are needed here.
> + */
> + wfc->waitee.store(NULL, std::memory_order_relaxed);
>
> orig_entry->thd->EXIT_COND(&old_stage);
> /* Interrupted by kill. */
> diff --git a/sql/sql_class.cc b/sql/sql_class.cc
> index 4eab241232b..ca179a39dc1 100644
> --- a/sql/sql_class.cc
> +++ b/sql/sql_class.cc
> @@ -7230,7 +7230,7 @@ wait_for_commit::reinit()
> {
> subsequent_commits_list= NULL;
> next_subsequent_commit= NULL;
> - waitee= NULL;
> + waitee.store(NULL, std::memory_order_relaxed);
> opaque_pointer= NULL;
> wakeup_error= 0;
> wakeup_subsequent_commits_running= false;
> @@ -7308,8 +7308,9 @@ wait_for_commit::wakeup(int wakeup_error)
>
> */
> mysql_mutex_lock(&LOCK_wait_commit);
> - waitee= NULL;
> this->wakeup_error= wakeup_error;
> + /* Memory barrier to make wakeup_error visible to the waiter thread. */
> + waitee.store(NULL, std::memory_order_release);
> /*
> Note that it is critical that the mysql_cond_signal() here is done while
> still holding the mutex. As soon as we release the mutex, the waiter might
> @@ -7340,9 +7341,10 @@ wait_for_commit::wakeup(int wakeup_error)
> void
> wait_for_commit::register_wait_for_prior_commit(wait_for_commit *waitee)
> {
> - DBUG_ASSERT(!this->waitee /* No prior registration allowed */);
> + DBUG_ASSERT(!this->waitee.load(std::memory_order_relaxed)
> + /* No prior registration allowed */);
> wakeup_error= 0;
> - this->waitee= waitee;
> + this->waitee.store(waitee, std::memory_order_relaxed);
>
> mysql_mutex_lock(&waitee->LOCK_wait_commit);
> /*
> @@ -7351,7 +7353,7 @@ wait_for_commit::register_wait_for_prior_commit(wait_for_commit *waitee)
> see comments on wakeup_subsequent_commits2() for details.
> */
> if (waitee->wakeup_subsequent_commits_running)
> - this->waitee= NULL;
> + this->waitee.store(NULL, std::memory_order_relaxed);
> else
> {
> /*
> @@ -7381,7 +7383,8 @@ wait_for_commit::wait_for_prior_commit2(THD *thd)
> thd->ENTER_COND(&COND_wait_commit, &LOCK_wait_commit,
> &stage_waiting_for_prior_transaction_to_commit,
> &old_stage);
> - while ((loc_waitee= this->waitee) && likely(!thd->check_killed(1)))
> + while ((loc_waitee= this->waitee.load(std::memory_order_relaxed)) &&
> + likely(!thd->check_killed(1)))
> mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
> if (!loc_waitee)
> {
> @@ -7404,14 +7407,14 @@ wait_for_commit::wait_for_prior_commit2(THD *thd)
> do
> {
> mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
> - } while (this->waitee);
> + } while (this->waitee.load(std::memory_order_relaxed));
> if (wakeup_error)
> my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
> goto end;
> }
> remove_from_list(&loc_waitee->subsequent_commits_list);
> mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit);
> - this->waitee= NULL;
> + this->waitee.store(NULL, std::memory_order_relaxed);
>
> wakeup_error= thd->killed_errno();
> if (!wakeup_error)
> @@ -7513,7 +7516,7 @@ wait_for_commit::unregister_wait_for_prior_commit2()
> wait_for_commit *loc_waitee;
>
> mysql_mutex_lock(&LOCK_wait_commit);
> - if ((loc_waitee= this->waitee))
> + if ((loc_waitee= this->waitee.load(std::memory_order_relaxed)))
> {
> mysql_mutex_lock(&loc_waitee->LOCK_wait_commit);
> if (loc_waitee->wakeup_subsequent_commits_running)
> @@ -7526,7 +7529,7 @@ wait_for_commit::unregister_wait_for_prior_commit2()
> See comments on wakeup_subsequent_commits2() for more details.
> */
> mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit);
> - while (this->waitee)
> + while (this->waitee.load(std::memory_order_relaxed))
> mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
> }
> else
> @@ -7534,7 +7537,7 @@ wait_for_commit::unregister_wait_for_prior_commit2()
> /* Remove ourselves from the list in the waitee. */
> remove_from_list(&loc_waitee->subsequent_commits_list);
> mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit);
> - this->waitee= NULL;
> + this->waitee.store(NULL, std::memory_order_relaxed);
> }
> }
> wakeup_error= 0;
> diff --git a/sql/sql_class.h b/sql/sql_class.h
> index 72cb8148895..1c81739865b 100644
> --- a/sql/sql_class.h
> +++ b/sql/sql_class.h
> @@ -20,6 +20,7 @@
>
> /* Classes in mysql */
>
> +#include <atomic>
> #include "dur_prop.h"
> #include <waiting_threads.h>
> #include "sql_const.h"
> @@ -2018,7 +2019,7 @@ struct wait_for_commit
> /*
> The LOCK_wait_commit protects the fields subsequent_commits_list and
> wakeup_subsequent_commits_running (for a waitee), and the pointer
> - waiterr and associated COND_wait_commit (for a waiter).
> + waitee and associated COND_wait_commit (for a waiter).
> */
> mysql_mutex_t LOCK_wait_commit;
> mysql_cond_t COND_wait_commit;
> @@ -2032,8 +2033,14 @@ struct wait_for_commit
>
> When this is cleared for wakeup, the COND_wait_commit condition is
> signalled.
> +
> + This pointer is protected by LOCK_wait_commit. But there is also a "fast
> + path" where the waiter compares this to NULL without holding the lock.
> + Such read must be done with acquire semantics (and all corresponding
> + writes done with release semantics). This ensures that a wakeup with error
> + is reliably detected as (waitee==NULL && wakeup_error != 0).
> */
> - wait_for_commit *waitee;
> + std::atomic<wait_for_commit *> waitee;
> /*
> Generic pointer for use by the transaction coordinator to optimise the
> waiting for improved group commit.
> @@ -2068,7 +2075,7 @@ struct wait_for_commit
> Quick inline check, to avoid function call and locking in the common case
> where no wakeup is registered, or a registered wait was already signalled.
> */
> - if (waitee)
> + if (waitee.load(std::memory_order_acquire))
> return wait_for_prior_commit2(thd);
> else
> {
> @@ -2096,7 +2103,7 @@ struct wait_for_commit
> }
> void unregister_wait_for_prior_commit()
> {
> - if (waitee)
> + if (waitee.load(std::memory_order_relaxed))
> unregister_wait_for_prior_commit2();
> else
> wakeup_error= 0;
> @@ -2118,7 +2125,7 @@ struct wait_for_commit
> }
> next_ptr_ptr= &cur->next_subsequent_commit;
> }
> - waitee= NULL;
> + waitee.store(NULL, std::memory_order_relaxed);
> }
>
> void wakeup(int wakeup_error);
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
1
0

Re: [Maria-developers] [Commits] e07caf401c2: MDEV-20645: Replication consistency is broken as workers miss the error notification from an earlier failed group.
by Kristian Nielsen 23 Sep '19
23 Sep '19
sujatha <sujatha.sivakumar(a)mariadb.com> writes:
> revision-id: e07caf401c26cf8144899336d103e4c7aafd3d7a (mariadb-10.1.41-45-ge07caf401c2)
> MDEV-20645: Replication consistency is broken as workers miss the error notification from an earlier failed group.
Great that you could come up with a testcase like this to trigger the error!
Also, thanks for the detailed description of the issue in the commit mail,
that made it much easier to understand and comment on.
- Kristian.
1
0

[Maria-developers] Missing memory barrier in parallel replication error handler in wait_for_prior_commit()?
by Kristian Nielsen 22 Sep '19
22 Sep '19
Hi Andrei (Cc: Sujatha),
I noticed another thing with the wait_for_commit error handling while
looking at the MDEV-18648 patch.
We have code like this:
// Wakeup code in wait_for_commit::wakeup():
mysql_mutex_lock(&LOCK_wait_commit);
waitee= NULL;
this->wakeup_error= wakeup_error;
// Wait code in wait_for_prior_commit():
if (waitee)
return wait_for_prior_commit2(thd);
else
{
if (wakeup_error)
my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
So the waiter code runs a "fast path" without locks. It is ok if we race on
the assignment of NULL to the wait_for_commit::waitee variable, because then the
waiter will take the slow path and do proper locking.
But it looks like there is a race as follows:
1. wakeup() sets waitee= NULL
2. wait_for_prior_commit() sees waitee==NULL _and_ wakeup_error==0, and
incorrectly returns without error.
3. wakeup() sets wait_for_commit::wakeup_error too late.
It is not enough of course to swap the assignments in wakeup(). A
write-write memory barrier is needed between them in wakeup(), and a
corresponding read-read barrier is needed in wait_for_prior_commit().
With proper barriers, the waiter cannot see the write of waitee=NULL without
also seeing the write to wakeup_error. So it will either return with
non-zero wakeup_error or take the slow path with proper locking. Both of
which are fine.
Andrei, what do you think? Can you see something in the above analysis that
I missed? It seems like such an obvious miss in the code that I wonder how
it remained undetected for so long (or how I could have written that in the
first place...). But I suppose the race is very unlikely to hit in practice,
especially in a code path that only triggers in the error case...
- Kristian.
1
0

Re: [Maria-developers] [Commits] cde9170709c: MDEV-18648: slave_parallel_mode= optimistic default in 10.5
by Kristian Nielsen 22 Sep '19
22 Sep '19
sujatha <sujatha.sivakumar(a)mariadb.com> writes:
> Thank you for the review comments. You are right. Setting
> rgi->worker_error=1 for the 2nd transaction is the right way to handle it.
> With this, upon reaching 'finish_event_group', the 2nd will notify the 3rd
> transaction that something went wrong during the prior commit execution.
Ok, great if this fix works.
>> Also, this fix doesn't seem to belong with the other MDEV-18648 changes, it
>> is unrelated to what the default parallel replication mode is. So please do
>> it in a separate commit.
>The missing transaction issue is possible, only in the case of
>slave_parallel_mode='optimistic'.
I *think* it is also possible in conservative mode. Not between different
batches of transactions that group-committed together on the master, as you
described. But between transactions in one gco. Though I did not check the
code deeply for this.
But my point was actually that this problem must also exist in earlier
versions of MariaDB, where optimistic is not the default, but the user
manually enables it. So it should be considered for fixing at least in 10.4
I would assume?
- Kristian.
1
0

Re: [Maria-developers] c3d2998a038: MDEV-16130 wrong error message adding AS ROW START to versioned table
by Sergei Golubchik 21 Sep '19
21 Sep '19
Hi, Aleksey!
On Aug 30, Aleksey Midenkov wrote:
> revision-id: c3d2998a038 (versioning-1.0.3-71-gc3d2998a038)
> parent(s): 611488e3d90
> author: Aleksey Midenkov <midenok(a)gmail.com>
> committer: Aleksey Midenkov <midenok(a)gmail.com>
> timestamp: 2018-05-23 22:45:08 +0300
> message:
>
> MDEV-16130 wrong error message adding AS ROW START to versioned table
>
> Closes tempesta-tech/mariadb#494
>
> ---
> mysql-test/suite/versioning/r/alter.result | 12 ++++++++----
> mysql-test/suite/versioning/t/alter.test | 7 ++++++-
> sql/handler.cc | 3 ++-
> sql/share/errmsg-utf8.txt | 2 +-
> sql/sql_table.cc | 6 ------
> 5 files changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/mysql-test/suite/versioning/r/alter.result b/mysql-test/suite/versioning/r/alter.result
> index fafcf3c30b0..666420dc2e5 100644
> --- a/mysql-test/suite/versioning/r/alter.result
> +++ b/mysql-test/suite/versioning/r/alter.result
> @@ -76,13 +76,17 @@ t CREATE TABLE `t` (
> `a` int(11) DEFAULT NULL
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1
> alter table t add column trx_start timestamp(6) as row start;
> -ERROR HY000: Table `t` is not system-versioned
> +ERROR HY000: Can not add system property AS ROW START/END for field `trx_start`
This is very strange wording. What is a "system property"? The standard
has no such concept. Neither does MariaDB; it's not used anywhere in the
manual, as far as I know. Here you can use, for example,
ER_VERS_DUPLICATE_ROW_START_END:
Duplicate ROW START column `trx_start`
> alter table t add system versioning;
> show create table t;
> Table Create Table
> t CREATE TABLE `t` (
> `a` int(11) DEFAULT NULL
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
> +alter table t add column trx_start timestamp(6) as row start;
> +ERROR HY000: Can not add system property AS ROW START/END for field `trx_start`
> +alter table t modify a int as row start;
> +ERROR HY000: Can not add system property AS ROW START/END for field `a`
> alter table t add column b int;
> show create table t;
> Table Create Table
> @@ -527,6 +531,6 @@ use test;
> # MDEV-15956 Strange ER_UNSUPPORTED_ACTION_ON_GENERATED_COLUMN upon ALTER on versioning column
> create or replace table t1 (i int, j int as (i), s timestamp(6) as row start, e timestamp(6) as row end, period for system_time(s,e)) with system versioning;
> alter table t1 modify s timestamp(6) as row start;
> -ERROR HY000: Can not change system versioning field `s`
> +ERROR HY000: Can not add system property AS ROW START/END for field `s`
This doesn't look right either. The statement does not add anything; the
field `s` is already AS ROW START. Why is it an error at all?
> drop database test;
> create database test;
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
2
1

Re: [Maria-developers] [Commits] cde9170709c: MDEV-18648: slave_parallel_mode= optimistic default in 10.5
by Kristian Nielsen 20 Sep '19
20 Sep '19
sujatha <sujatha.sivakumar(a)mariadb.com> writes:
Hi Sujutha,
> @sql/sql_class.h
> Moved 'wait_for_prior_commit(THD *thd)' method inside sql_class.cc
>
> @sql/sql_class.cc
> Added code to check for 'stop_on_error_sub_id' for event groups which get skipped
> and don't have any preceding group to wait for.
This looks like the wrong fix. The wait_for_commit mechanism is a low-level
API; it should not deal with things like stop_on_error_sub_id.
> 1,2,3 are scheduled in parallel.
> Since the above is true 'skip_event_group=true' is set. Simply call
> 'wait_for_prior_commit' to wakeup all waiters. Group '2' didn't had any
> waitee and its execution is skipped. Hence its wakeup_error=0.It sends a
> positive wakeup signal to '3'. Which commits. This results in a missed
> transaction. i.e 33 is missed.
I think this is the real problem. 2 is stopping because of an error; it must
not call wakeup_subsequent_commits() without propagating the error, since that
breaks the whole semantics of wait_for_commit.
Maybe it's enough just to also set rgi->worker_error here, when
skip_event_group is set due to error in an earlier group?
if (unlikely(entry->stop_on_error_sub_id <= rgi->wait_commit_sub_id))
skip_event_group= true;
Then wakeup_subsequent_commits() will correctly inform the following
transaction about the error so it doesn't try to commit on its own.
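I.e. roughly (just a sketch of the suggestion):
  if (unlikely(entry->stop_on_error_sub_id <= rgi->wait_commit_sub_id))
  {
    skip_event_group= true;
    /* propagate the error, so the following group will not commit on its own */
    rgi->worker_error= 1;
  }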
There is already similar code in the retry_event_group() function:
if (entry->stop_on_error_sub_id == (uint64) ULONGLONG_MAX ||
rgi->gtid_sub_id < entry->stop_on_error_sub_id)
...
else {
err= rgi->worker_error= 1;
Then all of the changes to sql_class.h and sql_class.cc can be omitted.
Also, this fix doesn't seem to belong with the other MDEV-18648 changes; it
is unrelated to what the default parallel replication mode is. So please do
it in a separate commit.
Hope this helps,
- Kristian.
> diff --git a/sql/sql_class.cc b/sql/sql_class.cc
> index 4eab241232b..5ba9c5fe456 100644
> --- a/sql/sql_class.cc
> +++ b/sql/sql_class.cc
> @@ -7365,6 +7365,33 @@ wait_for_commit::register_wait_for_prior_commit(wait_for_commit *waitee)
> }
>
>
> +int wait_for_commit::wait_for_prior_commit(THD *thd)
> +{
> + /*
> + Quick inline check, to avoid function call and locking in the common case
> + where no wakeup is registered, or a registered wait was already signalled.
> + */
> + if (waitee)
> + return wait_for_prior_commit2(thd);
> + else
> + {
> + if (wakeup_error)
> + my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
> + else
> + {
> + rpl_group_info* rgi= thd->rgi_slave;
> + if (rgi && rgi->is_parallel_exec &&
> + rgi->parallel_entry->stop_on_error_sub_id < (uint64)ULONGLONG_MAX &&
> + rgi->gtid_sub_id >= rgi->parallel_entry->stop_on_error_sub_id)
> + {
> + my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
> + wakeup_error= ER_PRIOR_COMMIT_FAILED;
> + }
> + }
> + return wakeup_error;
> + }
> +}
> +
> /*
> Wait for commit of another transaction to complete, as already registered
> with register_wait_for_prior_commit(). If the commit already completed,
> @@ -7387,6 +7414,17 @@ wait_for_commit::wait_for_prior_commit2(THD *thd)
> {
> if (wakeup_error)
> my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
> + else
> + {
> + rpl_group_info* rgi= thd->rgi_slave;
> + if (rgi && rgi->is_parallel_exec &&
> + rgi->parallel_entry->stop_on_error_sub_id < (uint64)ULONGLONG_MAX &&
> + rgi->gtid_sub_id >= rgi->parallel_entry->stop_on_error_sub_id)
> + {
> + my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
> + wakeup_error= ER_PRIOR_COMMIT_FAILED;
> + }
> + }
> goto end;
> }
> /*
> diff --git a/sql/sql_class.h b/sql/sql_class.h
> index 72cb8148895..0a0a1aa9fa1 100644
> --- a/sql/sql_class.h
> +++ b/sql/sql_class.h
> @@ -2062,21 +2062,6 @@ struct wait_for_commit
> bool commit_started;
>
> void register_wait_for_prior_commit(wait_for_commit *waitee);
> - int wait_for_prior_commit(THD *thd)
> - {
> - /*
> - Quick inline check, to avoid function call and locking in the common case
> - where no wakeup is registered, or a registered wait was already signalled.
> - */
> - if (waitee)
> - return wait_for_prior_commit2(thd);
> - else
> - {
> - if (wakeup_error)
> - my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));
> - return wakeup_error;
> - }
> - }
> void wakeup_subsequent_commits(int wakeup_error_arg)
> {
> /*
> @@ -2123,6 +2108,7 @@ struct wait_for_commit
>
> void wakeup(int wakeup_error);
>
> + int wait_for_prior_commit(THD *thd);
> int wait_for_prior_commit2(THD *thd);
> void wakeup_subsequent_commits2(int wakeup_error);
> void unregister_wait_for_prior_commit2();
1
0
Hi Igor,
We discussed the problem with error messages with Sergei, and we both
agreed that a new error message like:
Subquery is not allowed in '%s'
would be more informative, and it's easy to do.
Why not replace:
bool expr_allows_subselect;
with
const char *clause_that_disallows_subselect;
?
So we can change the grammar to something like this:
| opt_generated_always AS
{ Lex->clause_that_disallows_subselect= "GENERATED ALWAYS AS"; }
virtual_column_func
{ Lex->clause_that_disallows_subselect= NULL; }
The same for other places where we disallow subselects.
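And wherever a subquery is actually parsed, the check could then be as simple
as this (a sketch; ER_SUBQUERY_NOT_ALLOWED is only an illustrative name for
the new error above):
  if (Lex->clause_that_disallows_subselect)
  {
    my_error(ER_SUBQUERY_NOT_ALLOWED, MYF(0),
             Lex->clause_that_disallows_subselect);
    MYSQL_YYABORT;
  }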
This can be done in 10.4 before your patch, as a separate commit.
1
0
Hi Igor,
I have the following suggestions so far.
Can you please fix them and replace the commit in bb-10.4-igor?
1. In statements where a subquery is not allowed, the error message becomes
confusing:
EXECUTE stmt USING (SELECT 1);
< ERROR 1064 (42000): ... syntax to use near 'SELECT 1)' at line 1
> ERROR 1064 (42000): ... syntax to use near ')' at line 1
Please change SELECT and WITH to have the '<kwd>' type, like this:
%token <kwd> SELECT_SYM /* SQL-2003-R */
%token <kwd> WITH /* SQL-2003-R */
Then in the grammar like this:
query_specification_start:
SELECT_SYM
{
SELECT_LEX *sel;
LEX *lex= Lex;
if (!(sel= lex->alloc_select(TRUE)) ||
lex->push_select(sel))
MYSQL_YYABORT;
sel->init_select();
sel->braces= FALSE;
you will be able to use $1.pos() to get the position of the keyword,
and later use this position to generate the error properly.
2. The change in sql_cte.cc is not relevant to this task.
Please make a separate MDEV for this and push it to 10.4, then rebase
MDEV-19956.
3. There are a few mistakes in comments:
+ <query expression body> ::
+ <query term>
+ | <query expression body> UNION [ ALL | DISTINCT ] <query term>
+ | <query expression body> EXCEPT [ DISTINCT ]
+ Consider the production rule of the SQL Standard
+ subquery:
+ '(' query_expression_no_with_clause ')'
+ The latter can be re-written into
+ subquery:
+ '(' query_expression_body_ext_parens ')'
+ | '(' with_clause query_expression_no_with_clause ')'
Please fix them as discussed on Slack.
Thanks.
1
0
Hi,
I am implementing a threadpool system for AIX. The AIX equivalent of epoll / kqueue is pollset (and IOCP, but only a partial implementation). However, pollset has only a level-triggered mode while MariaDB needs edge-triggered (see the comments in sql/threadpool_generic.h). Adding pollset support to MariaDB would be difficult, and probably not very efficient, as we would need to simulate the edge-triggered behavior.
Obviously, AIX has poll and select support, but the MariaDB threadpool does not use them. Is there a reason not to implement the threadpool through poll or select? No interest? Performance issues?
MariaDB currently works on AIX without the threadpool; in terms of efficiency, do you know what could be gained by using a threadpool, either with poll/select or with a more modern mechanism?
As far as I know, the SunOS/Solaris/Illumos mechanism (called "port") is also level-triggered only, but I could not find the specific functions to manage this.
Thanks!
Etienne Guesnet.
4
6
Hi!
By now, all of you have probably received the results of this year's Google
Summer of Code. I want to congratulate you all for the work that you have
done this year, students and mentors alike.
We have had 4 projects this year:
1. EXCEPT ALL and INTERSECT ALL operations - Ruihang Xia
https://github.com/MariaDB/server/pull/1378
2. INSERT ... RETURNING - Rucha Deodar
https://medium.com/@ruchad1998/google-summer-of-code-2019-add-returning-to-…
3. UPDATE ... RETURNING - Miroslav Koberskii
https://gist.github.com/Mup0c/43c781e2135e55bf5126e0f8c60e2a24
4. Support for indexes on expressions - Alexey Mogilyovkin
https://gist.github.com/ZeroICQ/f0c35a20cae7065368dae31d7ad9b001
Implementing these projects certainly took a lot of effort. The first one
is already merged into MariaDB and the rest will probably follow, after a
bit more polishing. The MariaDB community would be grateful if you stuck
around to watch over your projects, extend them and improve them over time.
I am sure there was a lot to learn from this experience. Now that you have
gotten a feel for what it means to contribute significant features to
MariaDB, you are in a great position to help others! In this sense, I
encourage you to keep being active in the community and share what you have
learned. :)
Thank you for a great summer!
Vicențiu Ciorbaru
GSoC MariaDB Foundation Admin
1
0

[Maria-developers] Review for MDEV-19708 RBR replication loses data silently ignoring important column attributes
by Alexander Barkov 02 Sep '19
02 Sep '19
Hi Sachin,
I've reviewed the code in bb-10.5-19708.
It looks good.
I suggest some cleanups.
- Remove Field::binlog_type_info_slave, it's not used.
- Remove Field::binlog_type_info_master.
Please don't cache Binlog_type_info inside the Field itself.
Let the caller code cache it in an array of Binlog_type_info pointers.
- Change this method in Field:
virtual Binlog_type_info *binlog_type_info();
to
virtual Binlog_type_info *binlog_type_info(MEM_ROOT *) const;
Let's pass mem_root as a parameter instead of using table->mem_root.
Please add the 'const' qualifier!
- Restore the 'const' qualifier to
Field::enum_conv_type rpl_conv_type_from
- Instead of doing this:
Binlog_type_info * Field_new_decimal::binlog_type_info()
{
Field_num::binlog_type_info();
this->binlog_type_info_master->m_metadata= precision + (decimals() << 8);
this->binlog_type_info_master->m_metadata_size= 2;
return this->binlog_type_info_master;
}
Please add a number of Binlog_type_info constructor helpers,
in addition to the current one.
For example, for numeric types this would be good:
Binlog_type_info(uchar type_code,
uint16 metadata,
uint8 metadata_size,
binlog_signess_t signess)
:m_type_code(type_code),
m_metadata(metadata),
m_metadata_size(metadata_size),
m_signess(signess),
m_cs(NULL),
m_enum_typelib(NULL),
m_set_typelib(NULL),
m_geom_type(GEOM_GEOMETRY)
{ };
so you can do something simple like this:
Binlog_type_info Field_new_decimal::binlog_type_info(MEM_ROOT *root) const
{
return new (root)
Binlog_type_info(type(),
precision + (decimals() << 8), 2,
signess_arg);
}
- It seems you always set signess to SIGNESS_NOT_RELEVANT.
Where is signess set to SIGNED or UNSIGNED?
- This is something we've never used before:
+#include <vector>
+#include <string>
+#include <functional>
+#include <memory>
We need to discuss this with Serg.
Is it possible to avoid this?
- There is a new test file
mysql-test/suite/rpl/t/rpl_mdev_19708.test.diff
but I could not find a corresponding .result.diff.
Where is it?
- Please use error names instead of numbers in tests:
Instead of:
--error 1231
SET GLOBAL binlog_row_metadata = -1;
it should be:
--error ER_WRONG_VALUE_FOR_VAR
SET GLOBAL binlog_row_metadata = -1;
You can find error names in include/mysqld_ername.h
Thanks.
2
1

Re: [Maria-developers] 899c0b3ec6e: Part2: MDEV-18156 Assertion `0' failed or `btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH
by Sergei Golubchik 30 Aug '19
30 Aug '19
Hi, Alexander!
On Aug 30, Alexander Barkov wrote:
> revision-id: 899c0b3ec6e (mariadb-10.2.26-51-g899c0b3ec6e)
> parent(s): e4415549e53
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2019-08-29 12:35:19 +0400
> message:
>
> Part2: MDEV-18156 Assertion `0' failed or `btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH
>
> This patch allows the server to open old tables that have
> "bad" generated columns (i.e. indexed virtual generated columns,
> persistent generated columns) that depend on sql_mode,
> for general things like SELECT, INSERT, DROP, etc.
> Warning are issued in such cases.
>
> Only these commands are now disallowed and return an error:
> - CREATE TABLE introducing a "bad" generated column
> - ALTER TABLE introducing a "bad" generated column
> - CREATE INDEX introdicing a "bad" generated column
> (i.e. adding an index on a virtual generated column
> that depends on sql_mode).
>
> Note, these commands are allowed:
> - ALTER TABLE removing a "bad" generate column
> - ALTER TABLE removing an index from a "bad" virtual generated column
> - DROP INDEX removing an index from a "bad" virtual generated column
> but only if the table does not have any "bad" columns as a result.
>
> diff --git a/sql/field.cc b/sql/field.cc
> index e2b745743d2..b720db19ebf 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -1428,16 +1428,24 @@ void Field::load_data_set_value(const char *pos, uint length,
> }
>
>
> -void Field::error_generated_column_function_is_not_allowed(THD *thd) const
> +void Field::error_generated_column_function_is_not_allowed(THD *thd,
> + bool error) const
> {
> StringBuffer<64> tmp;
> vcol_info->expr->print(&tmp, (enum_query_type)
> (QT_TO_SYSTEM_CHARSET |
> QT_ITEM_IDENT_SKIP_DB_NAMES |
> QT_ITEM_IDENT_SKIP_TABLE_NAMES));
> - my_error(ER_GENERATED_COLUMN_FUNCTION_IS_NOT_ALLOWED, MYF(0),
> - tmp.c_ptr(), vcol_info->get_vcol_type_name(),
> - const_cast<const char*>(field_name));
> + if (error)
> + my_error(ER_GENERATED_COLUMN_FUNCTION_IS_NOT_ALLOWED, MYF(0),
> + tmp.c_ptr(), vcol_info->get_vcol_type_name(),
> + const_cast<const char*>(field_name));
> + else
> + push_warning_printf(thd, Sql_condition::WARN_LEVEL_WARN,
> + ER_GENERATED_COLUMN_FUNCTION_IS_NOT_ALLOWED,
> + ER_THD(thd, ER_GENERATED_COLUMN_FUNCTION_IS_NOT_ALLOWED),
> + tmp.c_ptr(), vcol_info->get_vcol_type_name(),
> + const_cast<const char*>(field_name));
an easier way of doing it would be
my_error(ER_GENERATED_COLUMN_FUNCTION_IS_NOT_ALLOWED,
MYF(error ? 0 : ME_WARNING),
tmp.c_ptr(), vcol_info->get_vcol_type_name(),
const_cast<const char*>(field_name));
> }
>
>
> diff --git a/sql/table.h b/sql/table.h
> index 7786679982f..23a61345242 100644
> --- a/sql/table.h
> +++ b/sql/table.h
> @@ -325,6 +325,20 @@ enum tmp_table_type
> };
> enum release_type { RELEASE_NORMAL, RELEASE_WAIT_FOR_DROP };
>
> +
> +enum vcol_init_mode
> +{
> + VCOL_INIT_DEPENDENCY_FAILURE_IS_WARNING= 1,
> + VCOL_INIT_DEPENDENCY_FAILURE_IS_ERROR= 2
> + /*
> + There will be a new flags soon,
Better say "There may be new flags here",
because maybe there won't be any new flags soon after all,
if this fix of yours turns out to be sufficient :)
> + e.g. to automatically remove sql_mode dependency:
> + GENERATED ALWAYS AS (char_col) ->
> + GENERATED ALWAYS AS (RTRIM(char_col))
> + */
> +};
> +
> +
> enum enum_vcol_update_mode
> {
> VCOL_UPDATE_FOR_READ= 0,
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
1
0

Re: [Maria-developers] 5ba0b11c5d1: MDEV-20257 Fix crash in Grant_table_base::init_read_record
by Sergei Golubchik 29 Aug '19
29 Aug '19
Hi, Robert!
On Aug 28, Robert Bindar wrote:
> revision-id: 5ba0b11c5d1 (mariadb-10.4.7-35-g5ba0b11c5d1)
> parent(s): 4a490d1a993
> author: Robert Bindar <robert(a)mariadb.org>
> committer: Robert Bindar <robert(a)mariadb.org>
> timestamp: 2019-08-27 11:38:52 +0300
> message:
>
> MDEV-20257 Fix crash in Grant_table_base::init_read_record
>
> The bug shows up when a 10.4+ server starts on a <10.4 datadir
> with a crashed mysql.user table.
> Grant_tables::open_and_lock tries to open the list of requested
> tables (host, db, global_priv,..) and because global_priv doesn't
> exist (pre-10.4 datadir), it falls back to opening mysql.user.
> The first open_tables call for [host, db, global_priv,..] works
> just fine. The second open_tables call (trying to open mysql.user
> alone) sees the crashed user table and performs 3 steps:
> - closes all the open tables
> - attempts a table fix for mysql.user (succeeds)
> - opens the tables initially passed as arguments
> In an ideal world, the first step should only close the tables
> passed as arguments (i.e. mysql.user), but for some reasons
> close_tables_for_reopen makes a close_thread_tables call which
> closes everything in THD::open_tables (nevertheless, this is
> probably the intended behavior).
> This side effect causes all the tables opened in the first
> open_tables call to now be closed and Grant_table_base::init_read_record
> tries to read from a released table.
>
> diff --git a/mysql-test/main/mdev20257.test b/mysql-test/main/mdev20257.test
> new file mode 100644
> index 00000000000..8506292d57c
> --- /dev/null
> +++ b/mysql-test/main/mdev20257.test
> @@ -0,0 +1,51 @@
> +--source include/not_embedded.inc
> +--echo #
> +--echo # MDEV 20257 Server crashes in Grant_table_base::init_read_record
> +--echo # upon crash-upgrade
> +--echo #
> +
> +let $MYSQLD_DATADIR= `select @@datadir`;
> +
> +rename table mysql.global_priv to mysql.global_priv_bak;
> +rename table mysql.user to mysql.user_bak;
> +rename table mysql.db to mysql.db_bak;
> +rename table mysql.proxies_priv to mysql.proxies_priv_bak;
> +rename table mysql.roles_mapping to mysql.roles_mapping_bak;
> +
> +--source include/shutdown_mysqld.inc
> +
> +# Bring in a crashed user table
> +# Ideally we should've copied only the crashed mysql.user, but this
> +# would make mysqld crash in some other place before hitting the
> +# crash spot described in MDEV-20257 which is what we're trying to
> +# test against.
I don't understand that. Where does it crash then?
What difference does it make that you replace all tables - are they all
corrupted?
> +--copy_file std_data/mdev20257/user.frm $MYSQLD_DATADIR/mysql/user.frm
> +--copy_file std_data/mdev20257/user.MYD $MYSQLD_DATADIR/mysql/user.MYD
> +--copy_file std_data/mdev20257/user.MYI $MYSQLD_DATADIR/mysql/user.MYI
> +
> +--copy_file std_data/mdev20257/host.frm $MYSQLD_DATADIR/mysql/host.frm
> +--copy_file std_data/mdev20257/host.MYD $MYSQLD_DATADIR/mysql/host.MYD
> +--copy_file std_data/mdev20257/host.MYI $MYSQLD_DATADIR/mysql/host.MYI
> +
> +--copy_file std_data/mdev20257/db.frm $MYSQLD_DATADIR/mysql/db.frm
> +--copy_file std_data/mdev20257/db.MYD $MYSQLD_DATADIR/mysql/db.MYD
> +--copy_file std_data/mdev20257/db.MYI $MYSQLD_DATADIR/mysql/db.MYI
> +--copy_file std_data/mdev20257/db.opt $MYSQLD_DATADIR/mysql/db.opt
> +
> +--source include/start_mysqld.inc
> +
> +call mtr.add_suppression("Table \'\..mysql\.user\' is marked as crashed and should be repaired");
> +call mtr.add_suppression("Checking table: \'\..mysql\.user\'");
> +call mtr.add_suppression("mysql.user: 1 client is using or hasn't closed the table properly");
> +call mtr.add_suppression("Missing system table mysql..*; please run mysql_upgrade to create it");
> +
> +# Cleanup
> +drop table mysql.user;
> +drop table mysql.db;
> +drop table mysql.host;
> +rename table mysql.global_priv_bak to mysql.global_priv;
> +rename table mysql.user_bak to mysql.user;
> +rename table mysql.db_bak to mysql.db;
> +rename table mysql.proxies_priv_bak to mysql.proxies_priv;
> +rename table mysql.roles_mapping_bak to mysql.roles_mapping;
> +
> diff --git a/mysql-test/std_data/mdev20257/db.opt b/mysql-test/std_data/mdev20257/db.opt
> --- /dev/null
> +++ b/mysql-test/std_data/mdev20257/db.opt
> @@ -0,0 +1,2 @@
> +default-character-set=latin1
> +default-collation=latin1_swedish_ci
why is that?
> diff --git a/sql/sql_acl.cc b/sql/sql_acl.cc
> index 847d2bd777b..e028d084703 100644
> --- a/sql/sql_acl.cc
> +++ b/sql/sql_acl.cc
> - DBUG_RETURN(res);
> + first = build_open_list(tables, which_tables, lock_type, true);
> +
> + uint counter;
> + if (int rv= really_open(thd, first, &counter))
> + DBUG_RETURN(rv);
> +
> + TABLE *user_table= tables[USER_TABLE].table;
> + if ((which_tables & Table_user) && !user_table)
> + {
> + close_thread_tables(thd);
> + first = build_open_list(tables, which_tables, lock_type, false);
Hmm... Here you always close/reopen all tables, just to account for the
case when the user table is found corrupted.
I'd argue that a corrupted user table happens rarely (compared to a
non-corrupted one), so it makes sense to optimize for the normal use case.
That is, only open the user table as before. After it's opened, you
check whether the other tables are still open; if they aren't, you can close
and reopen everything. This way you only close/reopen all tables if the
user table was corrupted.
Makes sense?
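I.e. in the fallback branch, roughly (only a sketch, reusing the names from
your patch; DB_TABLE and the exact build_open_list() arguments are
illustrative here):

  /* open only mysql.user here, as the old code did */
  first= build_open_list(tables, Table_user, lock_type, true);
  if (int rv= really_open(thd, first, &counter))
    DBUG_RETURN(rv);

  /* a repair of mysql.user closes everything opened so far; only in that
     rare case pay for a full close and reopen of all the grant tables */
  if (!tables[DB_TABLE].table /* or any other previously opened table */)
  {
    close_thread_tables(thd);
    first= build_open_list(tables, which_tables, lock_type, false);
    if (int rv= really_open(thd, first, &counter))
      DBUG_RETURN(rv);
  }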
> + if (int rv= really_open(thd, first, &counter))
> + DBUG_RETURN(rv);
> +
> + p_user_table= &m_user_table_tabular;
> + user_table= tables[USER_TABLE + 1].table;
> + }
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
2
1

Re: [Maria-developers] 5f1cd8fe7a5: MDEV-18156 Assertion `0' failed or `btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH
by Sergei Golubchik 28 Aug '19
28 Aug '19
Hi, Alexander!
On Aug 28, Alexander Barkov wrote:
> revision-id: 5f1cd8fe7a5 (mariadb-10.2.26-50-g5f1cd8fe7a5)
> parent(s): 9de2e60d749
> author: Alexander Barkov <bar(a)mariadb.com>
> committer: Alexander Barkov <bar(a)mariadb.com>
> timestamp: 2019-08-26 15:28:32 +0400
> message:
>
> MDEV-18156 Assertion `0' failed or `btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH
>
This is not a trivial one-liner fix. Before pushing,
please add a comment explaining the problem and your fix.
>
> diff --git a/mysql-test/suite/vcol/r/vcol_sql_mode.result b/mysql-test/suite/vcol/r/vcol_sql_mode.result
> new file mode 100644
> index 00000000000..6af46dffb80
> --- /dev/null
> +++ b/mysql-test/suite/vcol/r/vcol_sql_mode.result
> @@ -0,0 +1,234 @@
> +#
> +# Start of 10.2 tests
> +#
> +#
> +# MDEV-18156 Assertion `0' failed or `btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH
> +#
> +#
> +# PAD_CHAR_TO_FULL_LENGTH + various virtual column data types
> +#
> +CREATE TABLE t1 (a CHAR(5), v CHAR(5) AS (a) VIRTUAL, KEY(v));
> +DROP TABLE t1;
Please add SHOW CREATE after every CREATE, to show what automatic
expression rewriting happens (or does not happen) here.
> +CREATE TABLE t1 (a CHAR(5), v INT AS (a) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v TIME AS (a) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (c CHAR(8), v BINARY(8) AS (c), KEY(v));
> +ERROR HY000: Function or expression '`c`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
Better to do --enable_warnings at the beginning of the test.
It'll serve two purposes: you won't need to do SHOW WARNINGS manually,
and it'll make it clear when there are no warnings after a CREATE.
> +Level Code Message
> +Error 1901 Function or expression '`c`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
Let's make it a proper sentence: "Expression `c` implicitly depends on the @@sql_mode value PAD_CHAR_TO_FULL_LENGTH"
> +CREATE TABLE t1 (a CHAR(5), v BIT(64) AS (a) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression '`a`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (a) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression '`a`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v TEXT AS (a) VIRTUAL, KEY(v(100)));
> +ERROR HY000: Function or expression '`a`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a`' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +# PAD_CHAR_TO_FULL_LENGTH + TRIM resolving dependency
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (RTRIM(a)) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v TEXT AS (RTRIM(a)) VIRTUAL, KEY(v(100)));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(TRAILING ' ' FROM a)) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v TEXT AS (TRIM(TRAILING ' ' FROM a)) VIRTUAL, KEY(v(100)));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(BOTH ' ' FROM a)) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v TEXT AS (TRIM(BOTH ' ' FROM a)) VIRTUAL, KEY(v(100)));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(TRAILING NULL FROM a)) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(BOTH NULL FROM a)) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +# PAD_CHAR_TO_FULL_LENGTH + TRIM not resolving dependency
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(LEADING ' ' FROM a)) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression 'trim(leading ' ' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
nice :)
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(leading ' ' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v TEXT AS (TRIM(LEADING ' ' FROM a)) VIRTUAL, KEY(v(100)));
> +ERROR HY000: Function or expression 'trim(leading ' ' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(leading ' ' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(TRAILING '' FROM a)) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression 'trim(trailing '' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(trailing '' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(BOTH '' FROM a)) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression 'trim(both '' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(both '' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(TRAILING 'x' FROM a)) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression 'trim(trailing 'x' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(trailing 'x' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (TRIM(BOTH 'x' FROM a)) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression 'trim(both 'x' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(both 'x' from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +# PAD_CHAR_TO_FULL_LENGTH + TRIM(... non_constant FROM a)
> +CREATE TABLE t1 (
> +a CHAR(5),
> +b CHAR(5),
> +v TEXT AS (TRIM(TRAILING b FROM a)) VIRTUAL, KEY(v(100)));
> +ERROR HY000: Function or expression 'trim(trailing `b` from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'trim(trailing `b` from `a`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +# PAD_CHAR_TO_FULL_LENGTH + RPAD resolving dependency
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (RPAD(a,5,' ')) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (RPAD(a,6,' ')) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (RPAD(a,6,NULL)) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (RPAD(a,NULL,' ')) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +# PAD_CHAR_TO_FULL_LENGTH + RPAD not resolving dependency
> +CREATE TABLE t1 (a CHAR(5), v VARCHAR(5) AS (RPAD(a,4,' ')) VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression 'rpad(`a`,4,' ')' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'rpad(`a`,4,' ')' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (
> +a CHAR(5),
> +b CHAR(5),
> +v VARCHAR(5) AS (RPAD(a,NULL,b)) VIRTUAL,
> +KEY(v)
> +);
> +ERROR HY000: Function or expression 'rpad(`a`,NULL,`b`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'rpad(`a`,NULL,`b`)' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +# PAD_CHAR_TO_FULL_LENGTH + comparison
> +CREATE TABLE t1 (a CHAR(5), v INT AS (a='a') VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (
> +a CHAR(5) CHARACTER SET latin1 COLLATE latin1_nopad_bin,
> +v INT AS (a='a') VIRTUAL, KEY(v)
> +);
> +ERROR HY000: Function or expression '`a` = 'a'' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a` = 'a'' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +# PAD_CHAR_TO_FULL_LENGTH + LIKE
> +CREATE TABLE t1 (a CHAR(5), v INT AS (a LIKE 'a%') VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v INT AS (a LIKE NULL) VIRTUAL, KEY(v));
> +DROP TABLE t1;
> +CREATE TABLE t1 (a CHAR(5), v INT AS (a LIKE 'a') VIRTUAL, KEY(v));
> +ERROR HY000: Function or expression '`a` like 'a'' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a` like 'a'' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +#
> +# Testing NO_UNSIGNED_SUBTRACTION
> +#
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c INT GENERATED ALWAYS AS (a-b) VIRTUAL,
> +KEY (c)
> +);
> +ERROR HY000: Function or expression '`a` - `b`' cannot be used in the GENERATED ALWAYS AS clause of `c`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a` - `b`' cannot be used in the GENERATED ALWAYS AS clause of `c`
> +Warning 1105 depends on system variable @@sql_mode value NO_UNSIGNED_SUBTRACTION
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c INT GENERATED ALWAYS AS (CAST(a AS SIGNED)-b) VIRTUAL,
> +KEY (c)
> +);
> +ERROR HY000: Function or expression 'cast(`a` as signed) - `b`' cannot be used in the GENERATED ALWAYS AS clause of `c`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'cast(`a` as signed) - `b`' cannot be used in the GENERATED ALWAYS AS clause of `c`
> +Warning 1105 depends on system variable @@sql_mode value NO_UNSIGNED_SUBTRACTION
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c INT GENERATED ALWAYS AS (a-CAST(b AS SIGNED)) VIRTUAL,
> +KEY (c)
> +);
> +ERROR HY000: Function or expression '`a` - cast(`b` as signed)' cannot be used in the GENERATED ALWAYS AS clause of `c`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression '`a` - cast(`b` as signed)' cannot be used in the GENERATED ALWAYS AS clause of `c`
> +Warning 1105 depends on system variable @@sql_mode value NO_UNSIGNED_SUBTRACTION
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c INT GENERATED ALWAYS AS (CAST(a AS SIGNED)-CAST(b AS SIGNED)) VIRTUAL,
> +KEY (c)
> +);
> +DROP TABLE t1;
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c INT GENERATED ALWAYS AS (CAST(a AS DECIMAL(20,0))-CAST(b AS DECIMAL(20,0))) VIRTUAL,
> +KEY (c)
> +);
> +DROP TABLE t1;
> +#
> +# Comnination: PAD_CHAR_TO_FULL_LENGTH + NO_UNSIGNED_SUBTRACTION
> +#
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c CHAR(5),
> +v VARCHAR(5) GENERATED ALWAYS AS (RPAD(c,a-b,' ')) VIRTUAL,
> +KEY (v)
> +);
> +ERROR HY000: Function or expression 'rpad(`c`,`a` - `b`,' ')' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'rpad(`c`,`a` - `b`,' ')' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value NO_UNSIGNED_SUBTRACTION
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +CREATE TABLE t1 (
> +a INT UNSIGNED,
> +b INT UNSIGNED,
> +c CHAR(5),
> +v VARCHAR(5) GENERATED ALWAYS AS (RPAD(c,CAST(a AS DECIMAL(20,1))-b,' ')) VIRTUAL,
> +KEY (v)
> +);
> +ERROR HY000: Function or expression 'rpad(`c`,cast(`a` as decimal(20,1)) - `b`,' ')' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +SHOW WARNINGS;
> +Level Code Message
> +Error 1901 Function or expression 'rpad(`c`,cast(`a` as decimal(20,1)) - `b`,' ')' cannot be used in the GENERATED ALWAYS AS clause of `v`
> +Warning 1105 depends on system variable @@sql_mode value PAD_CHAR_TO_FULL_LENGTH
> +#
> +# End of 10.2 tests
> +#
> diff --git a/sql/field.cc b/sql/field.cc
> index e9eaa440952..28caea78c7d 100644
> --- a/sql/field.cc
> +++ b/sql/field.cc
> @@ -1428,6 +1428,35 @@ void Field::load_data_set_value(const char *pos, uint length,
> }
>
>
> +void Field::error_generated_column_function_is_not_allowed(THD *thd) const
> +{
> + StringBuffer<64> tmp;
> + vcol_info->expr->print(&tmp, QT_TO_SYSTEM_CHARSET);
> + my_error(ER_GENERATED_COLUMN_FUNCTION_IS_NOT_ALLOWED, MYF(0),
> + tmp.c_ptr(), vcol_info->get_vcol_type_name(),
> + (const char *) field_name /*add .str on merge to 10.3 */);
better const_cast<const char*>(field_name)
and it won't need a comment - the compiler will catch it
> +}
> +
> +
this needs a comment. It may be as simple as "see sql_mode.h".
btw, you might need this one-line comment in a few other places too
> +bool Field::check_vcol_sql_mode_dependency(THD *thd) const
> +{
> + DBUG_ASSERT(vcol_info);
> + if (!(part_of_key.is_clear_all() && key_start.is_clear_all()))
1. why not for persistent?
2. why both part_of_key and key_start? I thought part_of_key means "anywhere
in the key", that is, it's a superset of key_start.
> + {
> + Sql_mode_dependency dep=
> + vcol_info->expr->value_depends_on_sql_mode() &
> + Sql_mode_dependency(~0, ~can_handle_sql_mode_dependency_on_store());
> + if (dep)
> + {
> + error_generated_column_function_is_not_allowed(thd);
> + dep.push_dependency_warnings(thd);
> + return true;
> + }
> + }
> + return false;
> +}
> +
> +
> /**
> Numeric fields base class constructor.
> */
> diff --git a/sql/item_func.h b/sql/item_func.h
> index 81a7e948085..5be62427852 100644
> --- a/sql/item_func.h
> +++ b/sql/item_func.h
> @@ -773,11 +773,16 @@ class Item_func_plus :public Item_func_additive_op
>
> class Item_func_minus :public Item_func_additive_op
> {
> + Sql_mode_dependency m_sql_mode_dependency;
> public:
> Item_func_minus(THD *thd, Item *a, Item *b):
> Item_func_additive_op(thd, a, b) {}
> const char *func_name() const { return "-"; }
> enum precedence precedence() const { return ADD_PRECEDENCE; }
> + Sql_mode_dependency value_depends_on_sql_mode() const
> + {
> + return m_sql_mode_dependency;
1. no soft_to_hard here?
2. If you do need soft_to_hard here, then maybe it's better to do
soft_to_hard not in value_depends_on_sql_mode() at all,
but in just one place - in Field::check_vcol_sql_mode_dependency()?
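Something like this, just to show where the single call could live (a sketch
only, reusing your Field::check_vcol_sql_mode_dependency() code from above;
I haven't re-checked the exact Sql_mode_dependency semantics here):

    Sql_mode_dependency dep=
      (vcol_info->expr->value_depends_on_sql_mode() &
       Sql_mode_dependency(~0, ~can_handle_sql_mode_dependency_on_store())).
      soft_to_hard();      // the one place where soft deps become hard
    if (dep)
    {
      error_generated_column_function_is_not_allowed(thd);
      dep.push_dependency_warnings(thd);
      return true;
    }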
> + }
> longlong int_op();
> double real_op();
> my_decimal *decimal_op(my_decimal *);
> diff --git a/sql/item_strfunc.cc b/sql/item_strfunc.cc
> index 753b6134419..846920bc8c2 100644
> --- a/sql/item_strfunc.cc
> +++ b/sql/item_strfunc.cc
> @@ -2103,6 +2103,38 @@ void Item_func_trim::print(String *str, enum_query_type query_type)
> }
>
>
> +/*
> + RTRIM(expr)
> + TRIM(TRAILING ' ' FROM expr)
> + remove argument's soft dependency on PAD_CHAR_TO_FULL_LENGTH:
> +*/
> +Sql_mode_dependency Item_func_trim::value_depends_on_sql_mode() const
> +{
> + DBUG_ASSERT(fixed);
> + if (arg_count == 1) // RTRIM(expr)
> + return (args[0]->value_depends_on_sql_mode() &
> + Sql_mode_dependency(~0, ~MODE_PAD_CHAR_TO_FULL_LENGTH)).
> + soft_to_hard();
> + // TRIM(... FROM expr)
> + DBUG_ASSERT(arg_count == 2);
> + if (!args[1]->value_depends_on_sql_mode_const_item())
> + return Item_func::value_depends_on_sql_mode();
> + StringBuffer<64> trimstrbuf;
> + String *trimstr= args[1]->val_str(&trimstrbuf);
> + if (!trimstr)
> + return Sql_mode_dependency(); // will return NULL
> + if (trimstr->length() == 0)
> + return Item_func::value_depends_on_sql_mode(); // will trim nothing
> + if (trimstr->lengthsp() != 0)
> + return Item_func::value_depends_on_sql_mode(); // will trim not only spaces
what about TRIM(TRAILING '  ' FROM expr), with a two-space trim string?
it'll trim an even number of spaces and might leave one space untrimmed.
> + // TRIM(TRAILING ' ' FROM expr)
> + return ((args[0]->value_depends_on_sql_mode() |
> + args[1]->value_depends_on_sql_mode()) &
> + Sql_mode_dependency(~0, ~MODE_PAD_CHAR_TO_FULL_LENGTH)).
> + soft_to_hard();
> +}
> +
> +
> /* Item_func_password */
>
> bool Item_func_password::fix_fields(THD *thd, Item **ref)
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Re: [Maria-developers] 25d6f575a50: MDEV-19705: Assertion `tmp >= 0' failed in best_access_path
by Sergei Golubchik 26 Aug '19
Hi, Varun!
On Aug 26, Varun Gupta wrote:
> revision-id: 25d6f575a50 (mariadb-10.4.7-31-g25d6f575a50)
> parent(s): 7b4de10477a
> author: Varun Gupta <varun.gupta(a)mariadb.com>
> committer: Varun Gupta <varun.gupta(a)mariadb.com>
> timestamp: 2019-08-22 02:38:38 +0530
> message:
>
> MDEV-19705: Assertion `tmp >= 0' failed in best_access_path
>
> The reason for hitting the assert is that rec_per_key estimates have some garbage value.
> So the solution to fix this would be for long unique keys to use rec_per_key for only 1 keypart.
>
> ---
> mysql-test/main/long_unique.result | 15 +++++++++++++++
> mysql-test/main/long_unique.test | 14 ++++++++++++++
> sql/table.cc | 13 ++++++++++++-
> 3 files changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/sql/table.cc b/sql/table.cc
> index 1ab4df0f7cf..fa57693e7eb 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -828,6 +829,16 @@ static bool create_key_infos(const uchar *strpos, const uchar *frm_image_end,
> {
> keyinfo->key_length= HA_HASH_KEY_LENGTH_WITHOUT_NULL;
> key_part++; // reserved for the hash value
> + /*
> + but hash keys have a flag "only whole key", so for hash keys one should never
> + look at keyinfo->rec_per_key for partial keys, and they cannot possibly be calculated
> + so here we make sure that the keyinfo->rec_per_key[keyinfo->user_defined_key_parts-1]
> + would point to the slot we reserved for the keyinfo which holds the unique key with
> + hash
> + */
> + ulong *dest= (keyinfo->rec_per_key + keyinfo->user_defined_key_parts - 1);
> + memcpy(&dest, &rec_per_key, sizeof(ulong*));
Sorry, I don't understand what you're doing here.
I only meant something like
keyinfo->rec_per_key-= keyinfo->user_defined_key_parts - 1;
> + *rec_per_key++= 0;
> }
>
> /*
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Report for week 12:
Hi!
This week I merged some test cases in the respective files into one so that
the tests stay under 80 lines, and I also improved the documentation.
Documentation for INSERT...RETURNING:
https://docs.google.com/document/d/1EknNlh-J72cUlCg9rvcujUeZdzu5uzm0hmHG3qg…
REPLACE...RETURNING:
https://docs.google.com/document/d/1UMy5fi-j6yIObYIRwTteaJKHU5F-wPoEAVgVbeu…
I also fixed the feature_insert_returning system variable by adding some
code and improving the test file. The test now passes.
The insert_returning_datatypes.result and replace_returning_datatypes.result
files were showing up as binary data. To fix this, it was suggested to
create the .reject and .result files, hexdump them, diff the dumps, and give
a printable ASCII character as the input in the BIT field, because BIT is
printed as VARCHAR.
One thing I noticed was that, even after fixing the files, they were still
showing as binary, which was probably because of the diffs. So I removed the
existing files from my repo, pushed new files, then cleaned up the commits
again and fixed the 80-characters-per-line rule for the new tests. My GitHub
repo is up to date with the latest changes.
Regards,
Rucha Deodhar
Re: [Maria-developers] e2a82094332: MENT-253 Server Audit v2 does not work with PS protocol: server crashes in filter_query_type.
by Sergei Golubchik 14 Aug '19
Hi, Alexey!
On Aug 14, Alexey Botchkov wrote:
> revision-id: e2a82094332 (mariadb-10.4.7-123-ge2a82094332)
> parent(s): 691654721b3
> author: Alexey Botchkov <holyfoot(a)mariadb.com>
> committer: Alexey Botchkov <holyfoot(a)mariadb.com>
> timestamp: 2019-08-07 11:49:54 +0400
> message:
>
> MENT-253 Server Audit v2 does not work with PS protocol: server crashes in filter_query_type.
>
> Set the thd->query_string where possible.
> server_audit2 should not crash when gets empty event->query.
>
> ---
> plugin/server_audit2/server_audit.c | 15 ++++++++++++---
> sql/sql_prepare.cc | 22 ++++++++++++++++++++++
> 2 files changed, 34 insertions(+), 3 deletions(-)
Test case? With normal and error code paths?
> diff --git a/sql/sql_prepare.cc b/sql/sql_prepare.cc
> index 74723d5bd91..d2cb4e9b372 100644
> --- a/sql/sql_prepare.cc
> +++ b/sql/sql_prepare.cc
> @@ -2600,6 +2600,14 @@ void mysqld_stmt_prepare(THD *thd, const char *packet, uint packet_length)
>
> if (stmt->prepare(packet, packet_length))
> {
> + /*
> + Set the thd->query_string with the current query so the
> + audit plugin gets the meaningful notification.
say that it's an error case. Like
Prepare failed.
Set the thd->query_string with the current query so the
audit plugin gets the meaningful notification when it gets
the error notification.
> + */
> + if (alloc_query(thd, stmt->query_string.str(), stmt->query_string.length()))
> + {
> + thd->set_query(0, 0);
> + }
> /* Statement map deletes statement on erase */
> thd->stmt_map.erase(stmt);
> thd->clear_last_stmt();
> @@ -3184,6 +3192,13 @@ static void mysql_stmt_execute_common(THD *thd,
> char llbuf[22];
> my_error(ER_UNKNOWN_STMT_HANDLER, MYF(0), static_cast<int>(sizeof(llbuf)),
> llstr(stmt_id, llbuf), "mysqld_stmt_execute");
> + /*
> + Set thd->query_string with the stmt_id so the
> + audit plugin gets the meaningful notification.
Same here.
> + */
> + if (alloc_query(thd, llbuf, strlen(llbuf)))
> + thd->set_query(0, 0);
> +
> DBUG_VOID_RETURN;
> }
> stmt->read_types= read_types;
> @@ -3939,6 +3954,13 @@ bool Prepared_statement::prepare(const char *packet, uint packet_len)
> DBUG_RETURN(TRUE);
> }
>
> + /*
> + We'd like to have thd->query to be set to the actual query
> + after the function ends.
> + This value will be sent to audit pulgins later.
1. typo: "pulgins"
2. Mention explicitly that "query string is allocated in the stmt arena,
not in the thd arena. But it's safe here, because stmt always has a longer
lifetime than thd arena." - or something along these lines.
> + */
> + stmt_backup.query_string= thd->query_string;
> +
> old_stmt_arena= thd->stmt_arena;
> thd->stmt_arena= this;
>
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
Report for week 11:
Hello!
This week I worked on cleaning up the code, commits and test cases.
I was a little less familiar with git rebase -i, so I tried a couple of
things on a dummy repo first and then cleaned up the commits on my repo. I
also worked on the coding style review and removed all the trailing
whitespace and tabs mentioned in the review. To remove the last argument of
mysql_prepare_insert() I tried a couple of things, like not calling
setup_fields() and setup_wild() separately for INSERT...SELECT...RETURNING
and calling std::swap if select_insert is true, like so:
if (!lex->returning_list.is_empty())
{
  if (select_insert)
    // std::swap()
  setup_fields() and setup_wild()
  if (select_insert)
    // std::swap()
}
and also changing where the above functions are called. But that didn't
work, so now it is called separately for INSERT...SELECT...RETURNING and the
other variants. I added a separate function to sql_base.cc and sql_base.h
which calls setup_fields() and setup_wild(), so that not only can the last
argument be removed but the lines of code can also be reduced. All
occurrences of
  (wild_num && setup_wild(thd, table_list, field_list, NULL, wild_num,
                          &select_lex->hidden_bit_fields)) ||
  setup_fields(thd, Ref_ptr_array(),
               field_list, MARK_COLUMNS_READ, NULL, NULL, 0)
can be replaced with setup_returning_fields(). (This is used in
INSERT...RETURNING, INSERT...SELECT...RETURNING, DELETE...RETURNING and
UPDATE...RETURNING too.)
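A rough sketch of what I mean (the name setup_returning_fields() and the
exact signature are mine and still open to change):

bool setup_returning_fields(THD *thd, TABLE_LIST *table_list,
                            List<Item> &field_list, uint wild_num,
                            SELECT_LEX *select_lex)
{
  /* expand '*' first, then resolve the remaining fields/expressions */
  return (wild_num && setup_wild(thd, table_list, field_list, NULL, wild_num,
                                 &select_lex->hidden_bit_fields)) ||
         setup_fields(thd, Ref_ptr_array(), field_list, MARK_COLUMNS_READ,
                      NULL, NULL, 0);
}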
Some lines of the test cases were longer than 80 characters, so I fixed
that and added some more test cases (INSERT...IGNORE...RETURNING and fields
with auto_increment). Finally, I merged all the commits made this week into
one commit. My GitHub repo is up to date with the latest changes.
Here is the link for the same:
https://github.com/rucha174/server/commit/0d7eb4198cc0aadba42acb4960c175702…
Regards,
Rucha Deodhar
Hi Rucha!
This is going to be a very thorough review looking at coding style. I'll
nitpick on every line of code where I find something, just be patient and
let's clean up everything.
You might consider my picking on the coding style a bit pedantic, but it
really is important: it helps in searching, it helps in understanding the
code faster, and it helps in looking up history faster.
The good part is that once you get used to writing things properly, you won't
even have to think about it actively. It will just come naturally.
I suspect you do not have your editor configured to show whitespaces and
show you tabs versus spaces. Please look into your editor's settings to
highlight both tabs and spaces.
There are scripts that remove trailing whitespaces for you, across whole
files. I don't recommend them for MariaDB code: we still have such remnants
from MySQL times, and because we care more about preserving the immediate
history for running `git blame`, we did not fully fix all files. For any
other project, I highly recommend you use them.
Some of these comments might slightly contradict what I said before. That is
because I wanted us to get things working first and I didn't want to burden
you with too many details initially. I'll try to explain myself clearly now
so you understand why some things are bad ideas.
review also as a general guideline for any other code you write, anywhere.
With these hints, tricks, ideas, you'll produce much higher quality code in
the long run. It will take you longer to write sometimes, but it really makes
a difference. :)
Also, you may have noticed that some parts of our code do not agree with my
general guidelines. You have to take into account that this code was written
over the course of many years, by different people, with different levels of
experience or available time to implement something. Try not to use it as an
excuse for new code that you write. It's up to you to always write the best
possible version *with the time allowed*. We have adequate time now, so we
can afford to look for better solutions.
With all that said, this took me a while to write and I may still
have missed a few things. Feel free to point out anything that looks strange.
Now, let's get to the review.
>
> diff --git a/sql/sql_class.h b/sql/sql_class.h
> index 18ccf992d1e..9e4bb1d4579 100644
> --- a/sql/sql_class.h
> +++ b/sql/sql_class.h
> @@ -5475,7 +5476,13 @@ class select_dump :public select_to_file {
>
> class select_insert :public select_result_interceptor {
> public:
> + select_result* sel_result;
The * should be next to the variable name. Not everybody does this, but the
code around it does, so let's be consistent.
> TABLE_LIST *table_list;
> + /*
> + Points to the table in which we're inserting records. We need to save the
> + insert table before it is masked in the parsing phase.
> + */
> + TABLE_LIST *insert_table;
This sort of code and comment is usually a bad idea:
1. We are adding an extra member, because of a hack in a different part of
the code.
2. We are mentioning a specific flow that this member is good for. This
already begins to hint at a bad design decision. Sometimes there's no way
around it, but this guarantees lack of re-usability. If, in the future, we
fix the "masking in the parsing phase" to no longer mask the first table
(which we probably will do, because we have to in order to allow INSERT
within other queries), the comment will become outdated. One needs to
remember this too.
Forgetting to update comments happens easily. Here is one example:
one checks the class initially, finds the appropriate member to use, and
uses it in a different context than the original author thought of. The
class itself is then not changed, so it does not show up at review time, and
the reviewer doesn't think to check the class member's definition either.
You now have outdated comments.
To prevent this from happening, avoid as much as possible explaining where
a member is used; rather, explain what kind of data it is supposed to store.
The first sentence accomplishes that goal.
> TABLE *table;
> List<Item> *fields;
> ulonglong autoinc_value_of_last_inserted_row; // autogenerated or not
> @@ -5484,7 +5491,7 @@ class select_insert :public select_result_interceptor {
> select_insert(THD *thd_arg, TABLE_LIST *table_list_par,
> TABLE *table_par, List<Item> *fields_par,
> List<Item> *update_fields, List<Item> *update_values,
> - enum_duplicates duplic, bool ignore);
> + enum_duplicates duplic, bool ignore,select_result* sel_ret_list,TABLE_LIST *save_first);
This is an artifact of our long-standing code base. TABS mixed with spaces. :(
Let's clean up all of this select_insert's constructor. Do not keep the tabs,
we are modifying it anyway. Turn all these tabs to proper spaces. Make sure
the function arguments match the beginning parenthesis like so:
<function_name>(TYPE1 arg1, TYPE2 arg2,
TYPE3 arg3, TYPE4 arg4,
TYPE5 arg5)
> ~select_insert();
> int prepare(List<Item> &list, SELECT_LEX_UNIT *u);
> virtual int prepare2(JOIN *join);
> @@ -5520,7 +5527,7 @@ class select_create: public select_insert {
> List<Item> &select_fields,enum_duplicates duplic, bool ignore,
> TABLE_LIST *select_tables_arg):
> select_insert(thd_arg, table_arg, NULL, &select_fields, 0, 0, duplic,
> - ignore),
> + ignore, NULL, NULL),
Sigh... again, we have a historical background of mixing NULLs with 0s.
Ok here I guess.
> create_table(table_arg),
> create_info(create_info_par),
> select_tables(select_tables_arg),
> diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> index f3a548d7265..4bb19cef726 100644
> --- a/sql/sql_insert.cc
> +++ b/sql/sql_insert.cc
> @@ -119,6 +119,18 @@ static bool check_view_insertability(THD *thd, TABLE_LIST *view);
> @returns false if success.
> */
>
> +/*
> + Swaps the context before and after calling setup_fields() and setup_wild() in
> + INSERT...SELECT when LEX::returning_list is not empty.
One more space to indent the comment text, for both lines.
> +*/
> +template <typename T>
> +void swap_context(T& cxt1, T& cxt2)
> +{
> + T temp = cxt1;
> + cxt1 = cxt2;
> + cxt2 = temp;
> +
> +}
Unneeded empty line before the closing }.
Also, you should add 2 empty lines after the function. Delete the tabs in
the function, replace them with 2 spaces.
Also, this function is not used just to swap contexts any more. Rename
it to swap.
Can we not use std::swap instead?
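For example (both operands here are plain pointers, so std::swap from
<utility> should just work):

  #include <utility>

  std::swap(insert_table, select_lex->table_list.first);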
> static bool check_view_single_update(List<Item> &fields, List<Item> *values,
> TABLE_LIST *view, table_map *map,
> bool insert)
> @@ -688,6 +700,10 @@ Field **TABLE::field_to_fill()
>
> /**
> INSERT statement implementation
> +
Trailing whitespace.
> + SYNOPSIS
> + mysql_insert()
> + result NULL if returning_list is empty
I'd change this to: NULL if the insert is not outputting results via the
'RETURNING' clause.
>
> @note Like implementations of other DDL/DML in MySQL, this function
> relies on the caller to close the thread tables. This is done in the
> @@ -700,7 +716,7 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> List<Item> &update_fields,
> List<Item> &update_values,
> enum_duplicates duplic,
> - bool ignore)
> + bool ignore, select_result* result)
Remove tabs, use spaces, align properly, like explained above.
> {
> bool retval= true;
> int error, res;
> @@ -719,6 +735,8 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> List_item *values;
> Name_resolution_context *context;
> Name_resolution_context_state ctx_state;
> + SELECT_LEX* select_lex = thd->lex->first_select_lex();
> + List<Item>& returning_list = thd->lex->returning_list;
Generally this kind of use of references is a bad idea. I highly suggest you
use only const references, otherwise things can turn very confusing,
very quickly. If you read the Google C++ coding style guide, you'll see they
allow only const references in any part of their code.
> #ifndef EMBEDDED_LIBRARY
> char *query= thd->query();
> /*
> @@ -775,9 +793,13 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
>
> if (mysql_prepare_insert(thd, table_list, table, fields, values,
> update_fields, update_values, duplic, &unused_conds,
> - FALSE))
> - goto abort;
> -
> + FALSE,result?true:false))
Missing a few spaces: one after the ',', one before and one after the '?',
and one before and one after the ':'.
> + goto abort;
Tabs instead of spaces.
> +
Trailing whitespaces.
> + /* Prepares LEX::returing_list if it is not empty */
> + if (result)
> + (void)result->prepare(returning_list, NULL);
Tabs instead of spaces.
> +
Tabs, spaces, trailing whitespaces. Make sure each new line that contains
no text is just that, a newline.
> /* mysql_prepare_insert sets table_list->table if it was not set */
> table= table_list->table;
>
> @@ -947,6 +969,18 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> goto values_loop_end;
> }
> }
> + /*
> + If statement returns result set, we need to send the result set metadata
> + to the client so that it knows that it has to expect an EOF or ERROR.
> + At this point we have all the required information to send the result set metadata.
> + */
When adding comments like these, you need to indent the text 2 more spaces.
You crossed the 80 character limit with the "metadata" word. That needs to be
put on a new line.
> + if (result)
> + {
> + if (unlikely(result->send_result_set_metadata(returning_list,
Tabs again.
> + Protocol::SEND_NUM_ROWS |
> + Protocol::SEND_EOF)))
> + goto values_loop_end;
The goto indentation is ok, but Protocol should be aligned to the last open
parenthesis.
> + }
>
> THD_STAGE_INFO(thd, stage_update);
> do
> @@ -1060,7 +1094,7 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> error= 1;
> break;
> }
> -
> +
You added an extra line here, but introduced a tab and trailing whitespaces,
not good.
> #ifndef EMBEDDED_LIBRARY
> if (lock_type == TL_WRITE_DELAYED)
> {
> @@ -1072,9 +1106,25 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> }
> else
> #endif
> - error=write_record(thd, table ,&info);
> + error=write_record(thd, table ,&info);
You didn't change this line's logic at all; you just introduced an extra
tab plus whitespaces. Please clean it up.
> if (unlikely(error))
> break;
> + /*
> + We send the rows after writing them to table so that the correct information
> + is sent to the client. Example INSERT...ON DUPLICAT KEY UPDATE and values
> + set to auto_increment. Write record handles auto_increment updating values
> + if there is a duplicate key. We want to send the rows to the client only
> + after these operations are carried out. Otherwise it shows 0 to the client
> + if the fields that are incremented automatically are not given explicitly
> + and the irreplaced values in case of ON DUPLICATE KEY UPDATE (even if
> + the values are replaced or incremented while writing record.
> + Hence it shows different result set to the client)
> + */
Good comment! Trailing whitespaces however are still present and you are
breaking the 80 character rule.
> + if (result && result->send_data(returning_list) < 0)
> + {
> + error = 1;
Remove space before =.
> + break;
> + }
> thd->get_stmt_da()->inc_current_row_for_warning();
> }
> its.rewind();
> @@ -1233,19 +1283,30 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> retval= thd->lex->explain->send_explain(thd);
> goto abort;
> }
> +
Trailing whitespaces. Please fix.
> if ((iteration * values_list.elements) == 1 && (!(thd->variables.option_bits & OPTION_WARNINGS) ||
> !thd->cuted_fields))
> - {
> - my_ok(thd, info.copied + info.deleted +
> + {
> + /*
> + Client expects an EOF/Ok packet if result set metadata was sent. If
> + LEX::returning_list is not empty and the statement returns result set
> + we send EOF which is the indicator of the end of the row stream.
> + Else we send Ok packet i.e when the statement returns only the status
> + information
> + */
> + if (result)
> + result->send_eof();
> + else
> + my_ok(thd, info.copied + info.deleted +
> ((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
> - info.touched : info.updated),
> - id);
> + info.touched : info.updated),id);
> }
> else
> - {
> + {
You didn't change the logic here, just added an extra trailing whitespace.
> char buff[160];
> ha_rows updated=((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
> info.touched : info.updated);
> +
I guess I am ok with an extra line here, but it should not have an extra tab.
Please, only newlines, no extra characters.
> if (ignore)
> sprintf(buff, ER_THD(thd, ER_INSERT_INFO), (ulong) info.records,
> (lock_type == TL_WRITE_DELAYED) ? (ulong) 0 :
> @@ -1255,8 +1316,12 @@ bool mysql_insert(THD *thd,TABLE_LIST *table_list,
> sprintf(buff, ER_THD(thd, ER_INSERT_INFO), (ulong) info.records,
> (ulong) (info.deleted + updated),
> (long) thd->get_stmt_da()->current_statement_warn_count());
> - ::my_ok(thd, info.copied + info.deleted + updated, id, buff);
> + if (result)
> + result->send_eof();
> + else
Tabs, please use spaces.
> + ::my_ok(thd, info.copied + info.deleted + updated, id, buff);
> }
> +
Trailing whitespaces. :(
> thd->abort_on_warning= 0;
> if (thd->lex->current_select->first_cond_optimization)
> {
> @@ -1470,6 +1535,7 @@ static void prepare_for_positional_update(TABLE *table, TABLE_LIST *tables)
> be taken from table_list->table)
> where Where clause (for insert ... select)
> select_insert TRUE if INSERT ... SELECT statement
> + with_returning_list TRUE if returning_list is not empty
Trailing whitespace. (I know the rest of the function comments also have
trailing whitespaces. Perfect time to fix all that and remove all their
tabs.)
>
> TODO (in far future)
> In cases of:
> @@ -1490,7 +1556,7 @@ bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list,
> TABLE *table, List<Item> &fields, List_item *values,
> List<Item> &update_fields, List<Item> &update_values,
> enum_duplicates duplic, COND **where,
> - bool select_insert)
> + bool select_insert, bool with_returning_list)
One too many spaces after the last comma.
> {
> SELECT_LEX *select_lex= thd->lex->first_select_lex();
> Name_resolution_context *context= &select_lex->context;
> @@ -1556,8 +1622,21 @@ bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list,
> */
> table_list->next_local= 0;
> context->resolve_in_table_list_only(table_list);
> +
Trailing whitespaces.
> + /*
> + Perform checks like all given fields exists, if exists fill struct with
Trailing whitespace.
> + current data and expand all '*' in given fields for LEX::returning_list.
> + */
Trailing whitespaces.
> + if(with_returning_list)
Trailing whitespaces. Put a space after if.
> + {
> + res= ((select_lex->with_wild && setup_wild(thd, table_list,
Trailing whitespace.
> + thd->lex->returning_list,NULL, select_lex->with_wild,
Trailing whitespace, tabs instead of spaces.
> + &select_lex->hidden_bit_fields)) ||
> + setup_fields(thd, Ref_ptr_array(),
> + thd->lex->returning_list, MARK_COLUMNS_READ, 0, NULL, 0));
Tabs instead of spaces, swap them to spaces. Make sure to follow the 80
character max line length.
> + }
>
> - res= (setup_fields(thd, Ref_ptr_array(),
> + res= (res||setup_fields(thd, Ref_ptr_array(),
Please Add a space before and after ||.
> *values, MARK_COLUMNS_READ, 0, NULL, 0) ||
> check_insert_fields(thd, context->table_list, fields, *values,
> !insert_into_view, 0, &map));
> @@ -1724,7 +1803,7 @@ int write_record(THD *thd, TABLE *table,COPY_INFO *info)
> table->file->insert_id_for_cur_row= insert_id_for_cur_row;
> bool is_duplicate_key_error;
> if (table->file->is_fatal_error(error, HA_CHECK_ALL))
> - goto err;
> + goto err;
A good fix, not strictly necessary, but a good fix.
> is_duplicate_key_error=
> table->file->is_fatal_error(error, HA_CHECK_ALL & ~HA_CHECK_DUP);
> if (!is_duplicate_key_error)
> @@ -3557,29 +3636,39 @@ bool Delayed_insert::handle_inserts(void)
> TRUE Error
> */
>
> -bool mysql_insert_select_prepare(THD *thd)
> +bool mysql_insert_select_prepare(THD *thd,select_result *sel_res)
Put a space after last comma (,).
This function could use a bit of tweaking, but let's fix all the code style
issues first.
> {
> LEX *lex= thd->lex;
> SELECT_LEX *select_lex= lex->first_select_lex();
> DBUG_ENTER("mysql_insert_select_prepare");
> -
> + List<Item>& returning_list = thd->lex->returning_list;
For assignment operators no space before it. Like so:
List<Item>& returning_list= thd->lex->returning_list;
>
>
> /*
> SELECT_LEX do not belong to INSERT statement, so we can't add WHERE
> clause if table is VIEW
> */
> -
> + /*
> + Passing with_returning_list (last argument) as false otherwise
> + setup_field() and setup_wild() will be called twice. 1) in mysql_prepare_insert()
> + and 2) time in select_insert::prepare(). We want to call it only once:
> + in select_insert::prepare()
> + */
Ok, this is an interesting comment. It's much better if we can write code that
prevents this sort of reasoning. Pragmatically, why do we end up calling it
twice? Can we somehow refactor it such that this is not a problem in the
first place?
For example, here it kind of makes sense to just compute with_returning_list
within the function, by looking at thd->lex->returning_list.
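I.e. something as simple as this at the top of mysql_prepare_insert(),
instead of passing yet another bool parameter around (just a sketch):

  bool with_returning_list= !thd->lex->returning_list.is_empty();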
> if (mysql_prepare_insert(thd, lex->query_tables,
> lex->query_tables->table, lex->field_list, 0,
> lex->update_list, lex->value_list, lex->duplicates,
> - &select_lex->where, TRUE))
> + &select_lex->where, TRUE,false))
Put a space after last comma. And let's use FALSE.
> DBUG_RETURN(TRUE);
>
> + /* If LEX::returning_list is not empty, we prepare the list */
Uhm, the comment doesn't seem to match the code. You are checking sel_res,
yet the comment says that if LEX::returning_list is not empty, prepare the
list.
> + if (sel_res)
> + (void)sel_res->prepare(returning_list, NULL);
Here we have a tab instead of spaces again.
> +
> DBUG_ASSERT(select_lex->leaf_tables.elements != 0);
> List_iterator<TABLE_LIST> ti(select_lex->leaf_tables);
> TABLE_LIST *table;
> uint insert_tables;
>
> +
This line is not necessary.
> if (select_lex->first_cond_optimization)
> {
> /* Back up leaf_tables list. */
> @@ -3607,6 +3696,7 @@ bool mysql_insert_select_prepare(THD *thd)
> while ((table= ti++) && insert_tables--)
> ti.remove();
>
> +
This line is not necessary.
> DBUG_RETURN(FALSE);
> }
>
> @@ -3617,7 +3707,7 @@ select_insert::select_insert(THD *thd_arg, TABLE_LIST *table_list_par,
> List<Item> *update_fields,
> List<Item> *update_values,
> enum_duplicates duplic,
> - bool ignore_check_option_errors):
> + bool ignore_check_option_errors,select_result *result, TABLE_LIST *save_first):
This line is still way too long. I remember mentioning this in a previous
review, please look more closely and wrap this properly.
> select_result_interceptor(thd_arg),
> table_list(table_list_par), table(table_par), fields(fields_par),
> autoinc_value_of_last_inserted_row(0),
> @@ -3630,6 +3720,8 @@ select_insert::select_insert(THD *thd_arg, TABLE_LIST *table_list_par,
> info.update_values= update_values;
> info.view= (table_list_par->view ? table_list_par : 0);
> info.table_list= table_list_par;
> + sel_result= result;
> + insert_table= save_first;
Trailing whitespace, you added a tab after =. Please delete the tab.
> }
>
>
> @@ -3637,10 +3729,11 @@ int
> select_insert::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
> {
> LEX *lex= thd->lex;
> - int res;
> + int res=0;
Space after =.
> table_map map= 0;
> SELECT_LEX *lex_current_select_save= lex->current_select;
> DBUG_ENTER("select_insert::prepare");
> + SELECT_LEX* select_lex = thd->lex->first_select_lex();
Delete space before =.
>
> unit= u;
>
> @@ -3651,7 +3744,33 @@ select_insert::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
> */
> lex->current_select= lex->first_select_lex();
>
> - res= (setup_fields(thd, Ref_ptr_array(),
> + /*
> + We want the returning_list to point to insert table. But the context is masked.
> + So we swap it with the context saved during parsing stage.
Tab mixed with spaces. Only use spaces. Trailing whitespace, here and on
the line before.
> + */
> + if(sel_result)
Trailing whitespace.
> + {
Trailing whitespace.
> + swap_context(insert_table,select_lex->table_list.first);
> + swap_context(select_lex->context.saved_table_list,select_lex->context.table_list);
> + swap_context(select_lex->context.saved_name_resolution_table,select_lex->context.first_name_resolution_table);
Line length exceeds 80 characters.
> +
> + /*
> + Perform checks for LEX::returning_list like we do for other variant of INSERT.
> + */
> + res=((select_lex->with_wild && setup_wild(thd, table_list,
> + thd->lex->returning_list, NULL, select_lex->with_wild,
> + &select_lex->hidden_bit_fields)) ||
> + setup_fields(thd, Ref_ptr_array(),
> + thd->lex->returning_list, MARK_COLUMNS_READ, 0, NULL, 0));
Tabs instead of spaces and mixed spaces. Not good. Also trailing whitespaces.
Also, the parameters of setup_wild should be aligned to the first parenthesis
of setup_wild. setup_fields should be aligned to the first parenthesis here.
I also think there's one too many parentheses here; the outermost ones are
not necessary.
> +
> + /* Swap it back to retore the previous state for the rest of the function */
Tabs instead of spaces.
> +
> + swap_context(insert_table,select_lex->table_list.first);
> + swap_context(select_lex->context.saved_table_list,select_lex->context.table_list);
> + swap_context(select_lex->context.saved_name_resolution_table, select_lex->context.first_name_resolution_table);
Tabs and spaces mixed here. Watch line length.
> + }
> +
> + res= res || (setup_fields(thd, Ref_ptr_array(),
> values, MARK_COLUMNS_READ, 0, NULL, 0) ||
> check_insert_fields(thd, table_list, *fields, values,
> !insert_into_view, 1, &map));
The line that starts with "values" should be aligned to the first open
parenthesis.
> @@ -3822,13 +3941,26 @@ select_insert::prepare(List<Item> &values, SELECT_LEX_UNIT *u)
>
> int select_insert::prepare2(JOIN *)
> {
> +
Trailing whitespace.
> DBUG_ENTER("select_insert::prepare2");
> + List<Item>& returning_list = thd->lex->returning_list;
Delete space before =
> + LEX* lex = thd->lex;
This function does not use the lex local variable at all, only via thd->lex.
This line should be removed.
> if (thd->lex->current_select->options & OPTION_BUFFER_RESULT &&
> thd->locked_tables_mode <= LTM_LOCK_TABLES &&
> !thd->lex->describe)
> table->file->ha_start_bulk_insert((ha_rows) 0);
> if (table->validate_default_values_of_unset_fields(thd))
> DBUG_RETURN(1);
> +
> + /* Same as the other variants of INSERT */
> + if (sel_result)
> + {
> + if(unlikely(sel_result->send_result_set_metadata(returning_list,
> + Protocol::SEND_NUM_ROWS |
> + Protocol::SEND_EOF)))
The Protocol should be aligned to the last open parenthesis.
You are mixing tabs with spaces here. Convert to spaces.
> +
> + DBUG_RETURN(1);
Mixing tabs with spaces again, convert to spaces.
> + }
> DBUG_RETURN(0);
> }
>
> @@ -3842,6 +3974,8 @@ void select_insert::cleanup()
> select_insert::~select_insert()
> {
> DBUG_ENTER("~select_insert");
> + sel_result=NULL;
Add a space after =.
> + insert_table=NULL;
Add a space after =.
> if (table && table->is_created())
> {
> table->next_number_field=0;
> @@ -3857,6 +3991,8 @@ select_insert::~select_insert()
> int select_insert::send_data(List<Item> &values)
> {
> DBUG_ENTER("select_insert::send_data");
> + LEX* lex = thd->lex;
Delete space before =.
>
> + List<Item>& returning_list = thd->lex->returning_list;
Delete space before =.
> bool error=0;
>
> if (unit->offset_limit_cnt)
> @@ -3889,8 +4025,18 @@ int select_insert::send_data(List<Item> &values)
> DBUG_RETURN(1);
> }
> }
> -
> +
Trailing whitespaces added for no good reason.
> error= write_record(thd, table, &info);
> +
Trailing whitespaces.
> + /*
> + Sending the result set to the cliet after writing record. The reason is same
> + as other variants of insert.
> + */
> + if (sel_result && sel_result->send_data(returning_list) < 0)
> + {
> + error = 1;
Delete space before =.
> + DBUG_RETURN(1);
> + }
> table->vers_write= table->versioned();
> table->auto_increment_field_not_null= FALSE;
>
> @@ -4024,7 +4170,7 @@ bool select_insert::send_ok_packet() {
> char message[160]; /* status message */
> ulonglong row_count; /* rows affected */
> ulonglong id; /* last insert-id */
> -
> + LEX *lex=thd->lex;
This function does not use this variable at all. Please delete it if it's not
used. All it does is produce a compiler warning.
> DBUG_ENTER("select_insert::send_ok_packet");
>
> if (info.ignore)
> @@ -4045,8 +4191,15 @@ bool select_insert::send_ok_packet() {
> (thd->arg_of_last_insert_id_function ?
> thd->first_successful_insert_id_in_prev_stmt :
> (info.copied ? autoinc_value_of_last_inserted_row : 0));
> -
> - ::my_ok(thd, row_count, id, message);
> +
Trailing whitespace
> + /*
> + Client expects an EOF/Ok packet If LEX::returning_list is not empty and
Trailing whitespace
> + if result set meta was sent. See explanation for other variants of INSERT.
> + */
> + if (sel_result)
> + sel_result->send_eof();
> + else
> + ::my_ok(thd, row_count, id, message);
>
> DBUG_RETURN(false);
> }
> diff --git a/sql/sql_insert.h b/sql/sql_insert.h
> index a37ed1f31e5..314817b53d3 100644
> --- a/sql/sql_insert.h
> +++ b/sql/sql_insert.h
> @@ -27,11 +27,11 @@ bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list, TABLE *table,
> List<Item> &fields, List_item *values,
> List<Item> &update_fields,
> List<Item> &update_values, enum_duplicates duplic,
> - COND **where, bool select_insert);
> + COND **where, bool select_insert, bool with_returning_list);
This line exceeds 80 characters. Please wrap it to 80 characters by moving
the final function parameter to a new line.
> bool mysql_insert(THD *thd,TABLE_LIST *table,List<Item> &fields,
> List<List_item> &values, List<Item> &update_fields,
> List<Item> &update_values, enum_duplicates flag,
> - bool ignore);
> + bool ignore, select_result* result);
> void upgrade_lock_type_for_insert(THD *thd, thr_lock_type *lock_type,
> enum_duplicates duplic,
> bool is_multi_insert);
> diff --git a/sql/sql_lex.h b/sql/sql_lex.h
> index 0e1d17d13f0..00cf3025efe 100644
> --- a/sql/sql_lex.h
> +++ b/sql/sql_lex.h
> @@ -3091,6 +3092,8 @@ struct LEX: public Query_tables_list
> SELECT_LEX *current_select;
> /* list of all SELECT_LEX */
> SELECT_LEX *all_selects_list;
> + /* List of fields and expression for returning part of insert*/
Add a space before closing the comment.
> + List<Item> returning_list;
> /* current with clause in parsing if any, otherwise 0*/
> With_clause *curr_with_clause;
> /* pointer to the first with clause in the current statement */
> diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
> index 01d0ed1c383..852078b254b 100644
> --- a/sql/sql_parse.cc
> +++ b/sql/sql_parse.cc
> @@ -4489,6 +4489,7 @@ mysql_execute_command(THD *thd)
> case SQLCOM_INSERT:
> {
> WSREP_SYNC_WAIT(thd, WSREP_SYNC_WAIT_BEFORE_INSERT_REPLACE);
> + select_result * sel_result = NULL;
Tabs instead of spaces. Delete the space after *. Delete the space before =.
> DBUG_ASSERT(first_table == all_tables && first_table != 0);
>
> WSREP_SYNC_WAIT(thd, WSREP_SYNC_WAIT_BEFORE_INSERT_REPLACE);
> @@ -4509,9 +4510,43 @@ mysql_execute_command(THD *thd)
> break;
>
> MYSQL_INSERT_START(thd->query());
> +
> + Protocol* UNINIT_VAR(save_protocol);
> + bool replaced_protocol = false;
Tabs instead of spaces. Delete space before =
> +
> + if (!thd->lex->returning_list.is_empty())
> + {
> + /*increment the status variable. This is useful for feedback plugin*/
I would remove this comment, the function name is suggestive enough.
> + status_var_increment(thd->status_var.feature_insert_returning);
> +
> + /*This is INSERT ... RETURNING. It will return output to the client */
Space after start of comment.
> + if (thd->lex->analyze_stmt)
> + {
> + /*
> + Actually, it is ANALYZE .. INSERT .. RETURNING. We need to produce
> + output and then discard it.
> + */
Trailing whitespace tab.
> + sel_result = new (thd->mem_root) select_send_analyze(thd);
Delete space before =
> + replaced_protocol = true;
Delete space before =
> + save_protocol = thd->protocol;
Delete space before =
> + thd->protocol = new Protocol_discard(thd);
Delete space before =
> + }
> + else
> + {
> + if (!lex->result && !(sel_result = new (thd->mem_root) select_send(thd)))
Delete space before =
> + goto error;
> + }
> + }
Tabs instead of spaces for this whole part.
> +
Trailing whitespaces.
> res= mysql_insert(thd, all_tables, lex->field_list, lex->many_values,
> lex->update_list, lex->value_list,
> - lex->duplicates, lex->ignore);
> + lex->duplicates, lex->ignore, lex->result ? lex->result : sel_result);
This line overflows the 80 character line rule.
Wrap the last parameter to another line. It also makes things a lot easier
to read.
> + if (replaced_protocol)
> + {
Tabs instead of spaces for these 2 lines.
> + delete thd->protocol;
> + thd->protocol= save_protocol;
> + }
> + delete sel_result;
Tab instead of spaces.
> MYSQL_INSERT_DONE(res, (ulong) thd->get_row_count_func());
> /*
> If we have inserted into a VIEW, and the base table has
> @@ -4547,6 +4582,8 @@ mysql_execute_command(THD *thd)
> {
> WSREP_SYNC_WAIT(thd, WSREP_SYNC_WAIT_BEFORE_INSERT_REPLACE);
> select_insert *sel_result;
> + TABLE_LIST *save_first= NULL;
Tab instead of space after =.
> + select_result* result = NULL;
Delete space after =. Move * close to result not to select_result.
> bool explain= MY_TEST(lex->describe);
> DBUG_ASSERT(first_table == all_tables && first_table != 0);
> WSREP_SYNC_WAIT(thd, WSREP_SYNC_WAIT_BEFORE_UPDATE_DELETE);
> @@ -4595,6 +4632,34 @@ mysql_execute_command(THD *thd)
> Only the INSERT table should be merged. Other will be handled by
> select.
> */
> +
> + Protocol* UNINIT_VAR(save_protocol);
> + bool replaced_protocol = false;
Tabs instead of spaces. Delete space before =.
> +
> + if (!thd->lex->returning_list.is_empty())
> + {
Tabs instead of spaces.
> + /*increment the status variable. This is useful for feedback plugin*/
I would remove this comment, the function name is suggestive enough.
> + status_var_increment(thd->status_var.feature_insert_returning);
> +
> + /* This is INSERT ... RETURNING. It will return output to the client */
> + if (thd->lex->analyze_stmt)
> + {
> + /*
> + Actually, it is ANALYZE .. INSERT .. RETURNING. We need to produce
> + output and then discard it.
> + */
> + result = new (thd->mem_root) select_send_analyze(thd);
> + replaced_protocol = true;
> + save_protocol = thd->protocol;
> + thd->protocol = new Protocol_discard(thd);
> + }
> + else
> + {
> + if (!lex->result && !(result = new (thd->mem_root) select_send(thd)))
> + goto error;
> + }
> + }
> +
> /* Skip first table, which is the table we are inserting in */
> TABLE_LIST *second_table= first_table->next_local;
> /*
> @@ -4604,18 +4669,33 @@ mysql_execute_command(THD *thd)
> TODO: fix it by removing the front element (restoring of it should
> be done properly as well)
> */
> +
Trailing whitespaces for this line.
> + /*
> + Also, if items are present in returning_list, then we need those items
> + to point to INSERT table during setup_fields() and setup_wild(). But
> + it gets masked before that. So we save the values in saved_first,
> + saved_table_list and saved_first_name_resolution_context before they are masked.
> + We will later swap the saved values with the masked values if returning_list
> + is not empty in INSERT...SELECT...RETURNING.
This comment text needs to be indented 2 more spaces. Remove trailing
whitespaces. Also, make sure to keep within 80 characters per line.
> + */
> +
> + TABLE_LIST *save_first=select_lex->table_list.first;
Tabs instead of spaces. Add space after =.
> select_lex->table_list.first= second_table;
> + select_lex->context.saved_table_list=select_lex->context.table_list;
Tabs instead of spaces. Add space after =.
> + select_lex->context.saved_name_resolution_table=
Tabs instead of spaces.
> + select_lex->context.first_name_resolution_table;
> select_lex->context.table_list=
> select_lex->context.first_name_resolution_table= second_table;
> - res= mysql_insert_select_prepare(thd);
> - if (!res && (sel_result= new (thd->mem_root) select_insert(thd,
> - first_table,
> - first_table->table,
> - &lex->field_list,
> - &lex->update_list,
> - &lex->value_list,
> - lex->duplicates,
> - lex->ignore)))
> + res= mysql_insert_select_prepare(thd,lex->result ? lex->result : result);
> + if (!res && (sel_result= new (thd->mem_root) select_insert(thd, first_table,
> + first_table->table,
> + &lex->field_list,
> + &lex->update_list,
> + &lex->value_list,
> + lex->duplicates,
> + lex->ignore,
> + lex->result ? lex->result : result,
> + save_first)))
This is ugly. I suggest you keep just new (thd->mem_root) on one line and wrap
everything to a new line. Something like:
+ if (!res &&
+ (sel_result= new (thd->mem_root)
+ select_insert(thd, first_table,
+ first_table->table,
+ &lex->field_list,
+ &lex->update_list,
+ &lex->value_list,
+ lex->duplicates,
+ lex->ignore,
+ lex->result ? lex->result : result,
+ save_first)))
>
> {
> if (lex->analyze_stmt)
> ((select_result_interceptor*)sel_result)->disable_my_ok_calls();
> @@ -4630,6 +4710,7 @@ mysql_execute_command(THD *thd)
> TODO: this is workaround. right way will be move invalidating in
> the unlock procedure.
> */
> +
Extra line added for no good reason.
> if (!res && first_table->lock_type == TL_WRITE_CONCURRENT_INSERT &&
> thd->lock)
> {
> @@ -4650,8 +4731,13 @@ mysql_execute_command(THD *thd)
> sel_result->abort_result_set();
> }
> delete sel_result;
> + delete result;
> + }
> + if (replaced_protocol)
> + {
> + delete thd->protocol;
> + thd->protocol= save_protocol;
> }
> -
No need to delete this empty line.
> if (!res && (explain || lex->analyze_stmt))
> res= thd->lex->explain->send_explain(thd);
>
> diff --git a/sql/sql_prepare.cc b/sql/sql_prepare.cc
> index 8088b6923f2..2f5d0f17451 100644
> --- a/sql/sql_prepare.cc
> +++ b/sql/sql_prepare.cc
> @@ -1296,7 +1296,7 @@ static bool mysql_test_insert(Prepared_statement *stmt,
>
> if (mysql_prepare_insert(thd, table_list, table_list->table,
> fields, values, update_fields, update_values,
> - duplic, &unused_conds, FALSE))
> + duplic, &unused_conds, FALSE,false))
FALSE mixed with false. As this function call uses FALSE, let's use FALSE too.
I know, the file is inconsistent, it uses FALSE and false at times. :(
Let's do our best here and not make things stranger than they have to be.
Space after last comma too.
Like so: duplic, &unused_conds, FALSE, FALSE))
> goto error;
>
> value_count= values->elements;
> @@ -2154,7 +2154,7 @@ static int mysql_insert_select_prepare_tester(THD *thd)
> thd->lex->first_select_lex()->context.first_name_resolution_table=
> second_table;
>
> - return mysql_insert_select_prepare(thd);
> + return mysql_insert_select_prepare(thd,NULL);
space after , please.
> }
>
>
> diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
> index 183a2504b70..c7706ca3c1f 100644
> --- a/sql/sql_yacc.yy
> +++ b/sql/sql_yacc.yy
> @@ -1911,11 +1912,14 @@ bool my_yyoverflow(short **a, YYSTYPE **b, size_t *yystacksize);
> %type <item_basic_constant> text_literal
>
> %type <item_list>
> - expr_list opt_udf_expr_list udf_expr_list when_list when_list_opt_else
> + opt_insert_update
Trailing whitespace.
> + opt_select_expressions expr_list
Trailing whitespace.
> + opt_udf_expr_list udf_expr_list when_list when_list_opt_else
> ident_list ident_list_arg opt_expr_list
> decode_when_list_oracle
> execute_using
> execute_params
> + select_item_list
>
> %type <sp_cursor_stmt>
> sp_cursor_stmt_lex
> @@ -9223,6 +9226,7 @@ query_specification_start:
> select_item_list
> {
> Select->parsing_place= NO_MATTER;
> + Select->item_list=*($5);
Space after =.
> }
> ;
>
> @@ -9544,9 +9548,24 @@ opt_lock_wait_timeout_new:
> }
> ;
>
> + /*
> + Here, we make select_item_list return List<Item> to prevent it from adding
> + everything to SELECT_LEX::item_list. If items are already there in the item_list
> + then using RETURNING with INSERT...SELECT is not possible because rules occuring
> + after insert_values add everything to SELECT_LEX::item_list.
> + */
This comment must be wrapped at 80 characters per line.
> +
Delete this final line.
> select_item_list:
> select_item_list ',' select_item
> + {
> + $1->push_back($3,thd->mem_root);
Space after ,
> + $$=$1;
Space after =
> + }
> | select_item
> + {
> + if (unlikely(!($$= List<Item>::make(thd->mem_root, $1))))
> + MYSQL_YYABORT;
> + }
> | '*'
> {
> Item *item= new (thd->mem_root)
> @@ -9554,24 +9573,23 @@ select_item_list:
> NULL, NULL, &star_clex_str);
> if (unlikely(item == NULL))
> MYSQL_YYABORT;
> - if (unlikely(add_item_to_list(thd, item)))
> + if (unlikely(!($$= List<Item>::make(thd->mem_root, item))))
> MYSQL_YYABORT;
> (thd->lex->current_select->with_wild)++;
> +
> + (thd->lex->current_select->with_wild)++;
This doubles with_wild (it is incremented twice). This is a bug.
> }
> ;
>
> select_item:
> remember_name select_sublist_qualified_asterisk remember_end
> {
> - if (unlikely(add_item_to_list(thd, $2)))
> - MYSQL_YYABORT;
> + $$=$2;
> }
> | remember_name expr remember_end select_alias
> {
> DBUG_ASSERT($1 < $3);
> -
> - if (unlikely(add_item_to_list(thd, $2)))
> - MYSQL_YYABORT;
> + $$=$2;
> if ($4.str)
> {
> if (unlikely(Lex->sql_command == SQLCOM_CREATE_VIEW &&
> @@ -13307,13 +13325,15 @@ insert:
> Select->set_lock_for_tables($3, true);
> Lex->current_select= Lex->first_select_lex();
> }
> - insert_field_spec opt_insert_update
> - {
> + insert_field_spec opt_insert_update opt_select_expressions
> + {
Trailing whitespace.
> + if ($9)
> + Lex->returning_list=*($9);
> Lex->pop_select(); //main select
> if (Lex->check_main_unit_semantics())
> MYSQL_YYABORT;
> }
> - ;
> + ;
Why change this? The semicolon is now indented wrong, and a trailing whitespace
tab is introduced after it.
>
> replace:
> REPLACE
> @@ -13331,13 +13351,15 @@ replace:
> Select->set_lock_for_tables($3, true);
> Lex->current_select= Lex->first_select_lex();
> }
> - insert_field_spec
> + insert_field_spec opt_select_expressions
> {
> + if ($7)
> + Lex->returning_list=*($7);
Space after =.
> Lex->pop_select(); //main select
> if (Lex->check_main_unit_semantics())
> MYSQL_YYABORT;
> }
> - ;
> + ;
Why change this? The semicolon is now indented wrong.
>
> insert_lock_option:
> /* empty */
> @@ -13389,8 +13411,8 @@ insert_table:
> ;
>
> insert_field_spec:
> - insert_values {}
> - | insert_field_list insert_values {}
> + insert_values
Trailing whitespace.
> + | insert_field_list insert_values
> | SET
> {
> LEX *lex=Lex;
> @@ -13732,6 +13754,8 @@ single_multi:
> {
> if ($3)
> Select->order_list= *($3);
> + if($5)
> + Select->item_list=*($5);
Space after if. Space after =.
> Lex->pop_select(); //main select
> }
> | table_wild_list
> @@ -13764,9 +13788,16 @@ single_multi:
> }
> ;
>
> + /*
Trailing whitespace.
> + Return NULL if the rule is empty else return the list of items
Trailing whitespace.
> + in the select expression
Trailing whitespace. Put a full stop at the end of the sentence.
> + */
The comment should start at the beginning of the line, not indented in this case.
> opt_select_expressions:
> - /* empty */
> - | RETURNING_SYM select_item_list
> + /* empty */ {$$=NULL;}
Space after =. Space after {. Space after ;
> + | RETURNING_SYM select_item_list
> + {
> + $$=$2;
Space after =.
> + }
> ;
>
> table_wild_list:
Report for week 10:
Hello!
This week I worked on adding the status variable feature_insert_returning, which
is useful for the feedback plugin. This variable is incremented each time
INSERT...RETURNING or REPLACE...RETURNING is used. Before beginning, I did some
reading about the feedback plugin and referred to feature_subquery for adding
feature_insert_returning. Initially I had difficulty finding the test file that
was to be updated: I was searching for it in the sys_vars test suite. A Zulip
conversation helped me find the right test file and also a step I had missed
for publishing the variable for SHOW STATUS. I have made the necessary changes
in the code and have updated the test and result file.
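As a rough sketch of how the counter is expected to behave (the capitalized
variable name and the table t are assumptions for illustration, following the
other Feature_% counters):
-- assuming a table t (a INT) already exists
SHOW GLOBAL STATUS LIKE 'Feature_insert_returning';
INSERT INTO t (a) VALUES (1) RETURNING a;
REPLACE INTO t (a) VALUES (2) RETURNING a;
SHOW GLOBAL STATUS LIKE 'Feature_insert_returning';
-- with the semantics described above, the second SHOW should report a value
-- two higher than the first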
I also found out that the current implementation was not showing the expected
output when AUTO_INCREMENT is used and the fields are not given explicitly:
with RETURNING, the value of the auto_increment field was shown as 0 for all
rows. I fixed that by calling send_data() after write_record(). This also
fixed another issue: the previous implementation of
INSERT ... ON DUPLICATE KEY UPDATE ... RETURNING was not showing the updated
value in the result set.
Example: if the table t1 has one row with id1=1 and val1='A', the statement
INSERT INTO t1(id1,val1) VALUES(1,'B') ON DUPLICATE KEY UPDATE val1='C'
RETURNING *;
was showing id1=1 and val1='B' in the result even though it actually stores
val1='C'. Now it is fixed and returns val1='C'.
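For reference, a minimal self-contained sketch of this scenario (the table
definition here is an assumption for illustration; it only requires id1 to be
a primary or unique key):
CREATE TABLE t1 (id1 INT PRIMARY KEY, val1 VARCHAR(10));
INSERT INTO t1 VALUES (1, 'A');
-- The duplicate key triggers the ON DUPLICATE KEY UPDATE branch; with the fix,
-- RETURNING reports the row as it was actually written:
INSERT INTO t1 (id1, val1) VALUES (1, 'B')
ON DUPLICATE KEY UPDATE val1 = 'C'
RETURNING id1, val1;
-- expected result: id1 = 1, val1 = 'C'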
I have updated the result file for insert_returning.
I also fixed line endings for tests and added more comments to make our
implementation more understandable. My github repo is up to date with the
latest changes.
Regards,
Rucha Deodhar.
Re: [Maria-developers] [Commits] 1e58c579f5b: MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log makes data on master and slave diverge
by Sachin Setiya 02 Aug '19
Hi Sujatha!
On Thu, Aug 1, 2019 at 4:16 PM sujatha <sujatha.sivakumar(a)mariadb.com> wrote:
>
> revision-id: 1e58c579f5b61644a74045a0470d2c62b343eb75 (mariadb-10.1.41-3-g1e58c579f5b)
> parent(s): 0bb8f33b55cc016d0ede86b97990a3ee30dcb069
> author: Sujatha
> committer: Sujatha
> timestamp: 2019-08-01 16:16:05 +0530
> message:
>
> MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log makes data on master and slave diverge
>
> Problem:
> =======
> Failed CREATE OR REPLACE TEMPORARY TABLE statement which dropped the table but
> failed at a later stage of creation of the temporary table is not written to
> the binary log in row based replication. This causes the slave to diverge.
>
> Analysis:
> ========
> CREATE OR REPLACE statements work as shown below.
>
> CREATE OR REPLACE TABLE table_name (a int);
> is basically the same as:
>
> DROP TABLE IF EXISTS table_name;
> CREATE TABLE table_name (a int);
>
> Hence every CREATE OR REPLACE TABLE command which dropped the table should be
> written to binary log, even when following CREATE TABLE part fails. In order
> to achieve this, during the execution of CREATE OR REPLACE command, when a
> table is dropped 'thd->log_current_statement' flag is set. When table creation
> results in an error within 'mysql_create_table' code, the error handling part
> looks for this flag. If it is set the failed CREATE OR REPLACE statement is
> written into the binary log in spite of the error. This ensures that the slave doesn't
> diverge from the master. In case of row based replication the error handling
> code returns very early, if the table is of type temporary. This is done based
> on the assumption that temporary tables are not replicated in row based
> replication.
>
> It fails to handle the cases where a temporary table was created as part of
> statement based replication at an earlier stage and the binary log format was
> changed to row because of an unsafe statement. In this case when a CREATE OR
> REPLACE statement is executed on this temporary table it will be dropped but the
> query will not be written to binary log. Hence slave diverges.
>
> Fix:
> ===
> In error handling code check the return status of create table operation. If
> it is successful and replication mode is row based and table is of type
> temporary then return. Otherwise proceed further to the code which checks for
> thd->log_current_statement flag and does appropriate logging.
>
> ---
> .../suite/rpl/r/rpl_create_or_replace_fail.result | 18 +++++++
> .../suite/rpl/t/rpl_create_or_replace_fail.test | 56 ++++++++++++++++++++++
> sql/sql_table.cc | 2 +-
> 3 files changed, 75 insertions(+), 1 deletion(-)
>
> diff --git a/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result
> new file mode 100644
> index 00000000000..57178f0efbe
> --- /dev/null
> +++ b/mysql-test/suite/rpl/r/rpl_create_or_replace_fail.result
> @@ -0,0 +1,18 @@
> +include/master-slave.inc
> +[connection master]
> +CREATE TEMPORARY TABLE t1 (a INT NOT NULL);
> +LOAD DATA INFILE 'x' INTO TABLE x;
> +ERROR 42S02: Table 'test.x' doesn't exist
> +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x);
> +ERROR HY000: Cannot create temporary table with partitions
> +"************** DROP TEMPORARY TABLE Should be present in Binary log **************"
> +include/show_binlog_events.inc
> +Log_name Pos Event_type Server_id End_log_pos Info
> +master-bin.000001 # Gtid # # GTID #-#-#
> +master-bin.000001 # Query # # use `test`; CREATE TEMPORARY TABLE t1 (a INT NOT NULL)
> +master-bin.000001 # Gtid # # GTID #-#-#
> +master-bin.000001 # Query # # use `test`; CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x)
> +CREATE TABLE t1 (b INT);
> +INSERT INTO t1 VALUES (NULL);
> +DROP TABLE t1;
> +include/rpl_end.inc
> diff --git a/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test
> new file mode 100644
> index 00000000000..e75f34b0b56
> --- /dev/null
> +++ b/mysql-test/suite/rpl/t/rpl_create_or_replace_fail.test
> @@ -0,0 +1,56 @@
> +# ==== Purpose ====
> +#
> +# Test verifies that failed CREATE OR REPLACE TEMPORARY TABLE statement which
> +# dropped the table but failed at a later stage of creation of temporary table
> +# is written to binarylog in row based replication.
> +#
> +# ==== Implementation ====
> +#
> +# Steps:
> +# 0 - Have mixed based replication mode.
> +# 1 - Create a temporary table. It will be replicated as mixed replication
> +# mode is in use.
> +# 2 - Execute an unsafe statement which will switch current statement
> +# binlog format to 'ROW'. i.e If binlog_format=MIXED, there are open
> +# temporary tables, and an unsafe statement is executed, then subsequent
> +# statements are logged in row format.
Why not just set the binlog format to ROW?
> +# 3 - Execute a CREATE OR REPLACE TEMPORARY TABLE statement which tries to
> +# create partitions on temporary table. Since it is not supported it will
> +# fail.
> +# 4 - Check the binary log output to ensure that the failed statement is
> +# written to the binary log.
> +# 5 - Slave should be up and running and in sync with master.
> +#
> +# ==== References ====
> +#
> +# MDEV-18930: Failed CREATE OR REPLACE TEMPORARY not written into binary log
> +# makes data on master and slave diverge
> +#
> +
> +--source include/have_partition.inc
> +--source include/have_binlog_format_mixed.inc
> +--source include/master-slave.inc
> +
> +CREATE TEMPORARY TABLE t1 (a INT NOT NULL);
> +
> +# Execute an unsafe statement which switches replication mode internally from
> +# "STATEMENT" to "ROW".
> +--error ER_NO_SUCH_TABLE
> +LOAD DATA INFILE 'x' INTO TABLE x;
> +
> +--error ER_PARTITION_NO_TEMPORARY
> +CREATE OR REPLACE TEMPORARY TABLE t1 (x INT) PARTITION BY HASH(x);
> +
> +--echo "************** DROP TEMPORARY TABLE Should be present in Binary log **************"
> +--source include/show_binlog_events.inc
> +
> +CREATE TABLE t1 (b INT);
> +INSERT INTO t1 VALUES (NULL);
> +--sync_slave_with_master
> +
> +# Cleanup
> +--connection master
> +DROP TABLE t1;
> +
> +--source include/rpl_end.inc
> +
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 656834c7852..fe923d73200 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -5092,7 +5092,7 @@ bool mysql_create_table(THD *thd, TABLE_LIST *create_table,
>
> err:
> /* In RBR we don't need to log CREATE TEMPORARY TABLE */
> - if (thd->is_current_stmt_binlog_format_row() && create_info->tmp_table())
> + if (!result && thd->is_current_stmt_binlog_format_row() && create_info->tmp_table())
> DBUG_RETURN(result);
Can we binary log only the drop part, not the create part?
>
> if (create_info->tmp_table())
> _______________________________________________
> commits mailing list
> commits(a)mariadb.org
> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
--
Regards
Sachin Setiya
Software Engineer at MariaDB
Report for week 9:
Hello!
This week I worked on documentation, added some comments to the code and
removed some lines of code from the parser, as it was still making use of
SELECT_LEX::item_list after opt_select_expressions.
I was referring to the DELETE...RETURNING test case and noticed that a case
for using stored functions was missing. I added that to INSERT...RETURNING
and REPLACE...RETURNING.
I also checked whether the implementation supports other functions such as
string functions, date-time functions, control flow functions, numeric
functions and secondary functions (since aggregate functions cannot be used,
I wanted to check if these functions can be used).
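For example, statements along these lines were exercised (the table and column
names here are illustrative assumptions):
INSERT INTO t2 (id, name) VALUES (1, 'abc')
  RETURNING id, UPPER(name), NOW(), IF(id > 1, 'big', 'small');
REPLACE INTO t2 (id, name) VALUES (1, 'xyz')
  RETURNING CONCAT(name, '-', id);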
I have made related changes in the documentation and my github repo is up
to date with the latest changes.
Regards,
Rucha Deodhar
23 Jul '19
Hello!
While trying to release 10.0.26 for Debian and Ubuntu I noticed git
chokes again on Windows line endings. As our .gitattributes states
that by default git should clean up the line endings, why do we keep
getting new Windows line endings in new code?
mariadb-10.0$ cat .gitattributes
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto
...
storage/connect/mysql-test/connect/std_data/*.txt -text
mariadb-10.0.26$ find . | xargs file | grep CRLF
./storage/connect/mysql-test/connect/std_data/boyswin.txt: ASCII text, with CRLF line terminators
./storage/connect/mysql-test/connect/std_data/expenses.txt: ASCII text, with CRLF line terminators
./storage/connect/mysql-test/connect/std_data/emp.txt: ASCII text, with CRLF line terminators
./mysql-test/std_data/loaddata7.dat: ASCII text, with CRLF line terminators
./mysql-test/r/loadxml.result: ASCII text, with CRLF, LF line terminators
./mysql-test/r/perror-win.result: ASCII text, with CRLF, LF line terminators
./mysql-test/r/mysql_binary_mode.result: ASCII English text, with CRLF, LF line terminators
./mysql-test/r/func_regexp_pcre.result: UTF-8 Unicode C++ program text, with CRLF, CR, LF line terminators
./pcre/testdata/grepoutputN: ASCII text, with CRLF, CR, LF line terminators
./pcre/testdata/greppatN4: ASCII text, with CRLF line terminators
This leads for me to:
dpkg-source: info: building mariadb-10.0 using existing
./mariadb-10.0_10.0.26.orig.tar.gz
dpkg-source: info: local changes detected, the modified files are:
mariadb-10.0/storage/connect/mysql-test/connect/std_data/boyswin.txt
mariadb-10.0/storage/connect/mysql-test/connect/std_data/emp.txt
mariadb-10.0/storage/connect/mysql-test/connect/std_data/expenses.txt
dpkg-source: error: aborting due to unexpected upstream changes, see
/tmp/mariadb-10.0_10.0.26-1.diff.Yzwwhw
Which means I have to do extra work when importing the upstream sources.
- Otto
Report for week 8:
Hello!
This week I worked on the parser. I removed the hack "swapping lists" we
had used earlier. Now the grammar uses values from the bison stack.
However, I was making the list rule return the SELECT_LEX::item_list.
This was fixed after the intermediate review.
I changed the grammar for INSERT...RETURNING and REPLACE...RETURNING, added
test and result files for REPLACE...RETURNING and modified the comments of
one test file.
I also worked on the intermediate review. Now the parser code doesn't rely
on SELECT_LEX::item_list. In REPLACE...RETURNING, a list was declared
inside the scope of "if" and using it outside that scope was illegal. So I
moved it to the scope outside the if and made the grammar return
&list if SELECT_LEX::item_list is not empty, else NULL.
My github repo is up to date with the latest changes.
Regards,
Rucha Deodhar
21 Jul '19
Hi Rucha!
I've spent quite a bit of time looking at the implementation. It is working and
passes almost all test cases, although you forgot to also commit the updated
insert_parser result file. Good job so far.
I want to give you something to work on and think about for a bit before we do
another overall overview of the whole project. Hopefully this will help you
across all your coding endeavours, not just now. What I want you to learn is
that it is important to think about how your code can and will evolve in the
future. Just getting things to work is ok up to a certain point, but not
implementing things in a decoupled fashion is what we call "hacks". These hacks
are to be avoided as much as possible as it leads to much more difficult work
down the line. Let's elaborate on the topic.
There are a number of problems with our current code (hacks). I am not referring
to yours in particular. All these hacks are a bit difficult to overcome in one
go. I do not expect you to fix them yourself, not during this GSoC at least, but
it is something to consider and if you are willing to work towards fixing some
of them, excellent!
Historically our codebase evolved from a simple parser to the "monster" that you
see now. Sometimes due to time constraints, things were not implemented in the
most extendable way, rather they made use of certain particular constructs, only
valid in the contexts they were written in. Keyword here is context, as you'll
soon see. Some of this concerns things you have interacted with so far in your
project. Here is an example that you've had to deal with:
A SELECT statement has what we call a SELECT LIST, this list is traditionally
stored in SELECT_LEX::item_list. This item_list however is overloaded at times.
For example DELETE ... RETURNING uses it to store the expressions part of the
RETURNING clause. This overloading generally works ok, unless you run into
situations like you did with your INSERT ... RETURNING project.
INSERT ... RETURNING clauses already made use of the item_list to store
something else. Now you were forced to introduce another list in SELECT_LEX,
which we called returning_list. Everything ok so far. But then, you ran into
another problem, the bison grammar rules do not put Items in the list you want.
They always used item_list, so you could not use your returning_list. So what
did we do then? Well, we came up with another hack! We suggested this hack to
you because it is something that works quickly and gets you unstuck: use the
same grammar rules as before, but swap the item_list with the
returning list temporarily, such that the grammar rules will put the right items
in the right list. This works, but it masks the underlying problem. The problem
is that we have a very rigid implementation for select_item_list grammar rule.
What we should do is to make select_item_list grammar return a list using the
bison stack. That means no relying on SELECT_LEX::item_list to store our result
temporarily. You have done steps towards fixing this, which brings us to your
code's current state. The problem with your implementation is that you have not
lifted the restriction of the grammar rules making use of
SELECT_LEX::item_list. Instead, you have masked it by returning the address of
that member on the bison stack. The funny part is that this works, but it is
still a hack. It is still a hack, because these grammar rules have a side effect of
modifying this member behind the scenes. A very, very, very (!) good rule is to
consider all side effects as bad, especially now that you are starting out. With
experience you might get away with them every now-and-then, but for now avoid
them like the plague and try to remove them whenever possible.
The current code makes use of a function I really don't like: the inline
function add_item_to_list(). It is one of those hacky functions that strictly
relies on context. The context is: take the thd->lex->current_select and put the
item passed as a parameter in the item_list. The diff I've attached shows an
implementation of how to use the bison stack properly. I took inspiration from
the "expr" rule. Unfortunately the hack chain goes deeper, but we need to do
baby steps and some things are best left outside your GSoC project, or we risk
going down a rabbit hole we might not get out of.
One other issue I've observed is with the REPLACE code here:
insert_field_spec
{
  if ($6)
  {
    List<Item> list;
    list=*($6);
    Lex->current_select->item_list.empty();
    $6=&list;
  }
}
opt_select_expressions
{
  if ($8)
    Lex->returning_list=*($8);
  if ($6)
    Lex->current_select->item_list=*($6);
  Lex->pop_select(); //main select
  if (Lex->check_main_unit_semantics())
    MYSQL_YYABORT;
}
This is really problematic and you probably wrote it because of
opt_select_expressions putting stuff into item_list. In the insert_field_spec
rule, you receive the address of current_select->item_list as parameter $6. You
store the value of this list (basically the pointer to the head element) in a
temporary variable that is only in scope during the if
statement (!). Attempting to use it after the if statement is over (via the $6
parameter) is illegal! It only works because the compiler did not choose to use
that area of memory for something else. Please find a way to store things here
properly.
After this is fixed, we'll spend the next month cleaning everything up,
documenting our code, the added test cases etc. We have an overall working
solution, but we need to make it as hack-free as possible, so we can later work
on supporting INSERT ... RETURNING in subqueries.
Let me know if anything in the email is unclear. It took me a while to write it,
trying to explain the reasoning, but I may have missed a few things. :)
Vicențiu
Re: [Maria-developers] MDEV-19429: Wrong query result with EXISTS and LIMIT 0
by Sergey Petrunia 19 Jul '19
Hi Sanja,
Ok to push after the below input is addressed.
> commit ab5fa406b4b314705cb87ffd74111a518b549ff4
> Author: Oleksandr Byelkin <sanja(a)mariadb.com>
> Date: Wed Jul 17 12:31:45 2019 +0200
>
> MDEV-19429: Wrong query result with EXISTS and LIMIT 0
>
> Check EXISTS LIMIT before rewriting.
>
> diff --git a/sql/item_subselect.cc b/sql/item_subselect.cc
> index afc42dc08d5..85d91181337 100644
> --- a/sql/item_subselect.cc
> +++ b/sql/item_subselect.cc
> @@ -1432,12 +1432,18 @@ void Item_exists_subselect::fix_length_and_dec()
> {
> DBUG_ENTER("Item_exists_subselect::fix_length_and_dec");
> init_length_and_dec();
> - /*
> - We need only 1 row to determine existence (i.e. any EXISTS that is not
> - an IN always requires LIMIT 1)
> - */
> - thd->change_item_tree(&unit->global_parameters->select_limit,
> - new Item_int((int32) 1));
> + // If limit is not set or it is constant more than 1
> + if (!unit->global_parameters->select_limit ||
> + (unit->global_parameters->select_limit->basic_const_item() &&
> + unit->global_parameters->select_limit->val_int() > 1))
> + {
> + /*
> + We need only 1 row to determine existence (i.e. any EXISTS that is not
> + an IN always requires LIMIT 1)
> + */
> + thd->change_item_tree(&unit->global_parameters->select_limit,
> + new Item_int((int32) 1));
Please fix indentation ^
> + }
> DBUG_PRINT("info", ("Set limit to 1"));
Please move the DBUG_PRINT into the if () {...} . Because right now it will
print "set limit to 1" even when it didn't set it.
> DBUG_VOID_RETURN;
> }
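For context, the wrong result this guards against can be reproduced with a
statement like the following (table names are illustrative):
-- Before the fix, the explicit LIMIT 0 in the subquery was unconditionally
-- replaced with LIMIT 1, so EXISTS could evaluate to TRUE even though the
-- subquery must return no rows:
SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 LIMIT 0);
-- With the fix, the LIMIT 0 is kept and EXISTS evaluates to FALSE.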
I also observe that the LIMIT clause is not printed in the EXPLAIN EXTENDED output:
mysql> explain extended select * from t10 where exists (select * from one_k where a >55 order by a limit 100 offset 50);
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | PRIMARY | t10 | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | |
| 2 | SUBQUERY | one_k | ALL | NULL | NULL | NULL | NULL | 1342 | 100.00 | Using where; Using filesort |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-----------------------------+
2 rows in set, 1 warning (6.34 sec)
mysql> show warnings\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: select `test`.`t10`.`a` AS `a` from `test`.`t10` where exists(select 1 from `test`.`one_k` where (`test`.`one_k`.`a` > 55) order by `test`.`one_k`.`a`)
1 row in set (0.00 sec)
^^^ Note the lack of LIMIT above. It's missing only for EXISTS subqueries; for
other kinds of subqueries it is there. I guess this is outside the scope of this
MDEV and should be filed separately.
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Re: [Maria-developers] Interaction between rpl_slave_state and rpl_binlog_state
by Kristian Nielsen 18 Jul '19
Andrei Elkin <andrei.elkin(a)mariadb.com> writes:
> I would also raise another but relevant topic of maintaining
>
> gtid_"executed"_pos
>
> which is an union of all GTID executed regardless of their arrival
> method. E.g some of foreign (to the recipient server) domains gtid:s may
> The master potentially could be (made) interested in such table
> should create a replication mode without necessary binlogging on the
> server.
Sorry, I don't follow.
Do you mean to create a new table of GTID positions, which will be updated
by all transactions, whether originating locally or replicated by a slave
thread?
Or do you mean to have the mysql.gtid_slave_pos table updated also by
locally originated transactions?
Or do you mean to have a new system variable @@gtid_executed_pos which is
constructed from the existing binlog and the mysql.gtid_slave_pos table, similar
to @@gtid_current_pos, but maybe including more GTIDs somehow?
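For reference, a quick sketch of how the positions that exist today can be
inspected (variable and table names as in current MariaDB; the output values
are illustrative):
SELECT @@gtid_slave_pos, @@gtid_binlog_pos, @@gtid_current_pos;
SELECT * FROM mysql.gtid_slave_pos;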
> This is not really something new as it's exactly how mysql implemented
> gtid bookkeeping.
I think MySQL originally kept track of GTIDs only in the binlog, right? And
binlog was required to be enabled for GTID to work? I do remember seeing
something that relaxed this requirement, but I haven't followed the details.
- Kristian.