developers
Threads by month
- ----- 2025 -----
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- 5 participants
- 6818 discussions
Re: [Maria-developers] 00f9df29224: MDEV-20787: Script dgcov.pl does not work
by Sergei Golubchik 23 Apr '20
by Sergei Golubchik 23 Apr '20
23 Apr '20
Hi, Anel!
On Apr 23, Anel Husakovic wrote:
> revision-id: 00f9df29224 (mariadb-10.2.29-23-g00f9df29224)
> parent(s): 6718d3bc324
> author: Anel Husakovic <anel(a)mariadb.org>
> committer: Anel Husakovic <anel(a)mariadb.org>
> timestamp: 2019-11-19 15:00:48 +0100
> message:
>
> MDEV-20787: Script dgcov.pl does not work
>
> Let's change CMakeList with `--coverage` flag as an alias for
> `-fprofile-arcs -ftest-coverage -lgcov` in addition.
> When the server is compiled with `-DENABLE_GCOV=ON`, from object files are generated
> `.gcno` and `.gcda` files.
> `./mtr --gcov is_check_constraint` is invoking the script calls
> `./dgcov.pl --purge`, `./mtr is_check_constraint`,
> `./dgcov.pl --generate>/var/last_changes.dgcov`.
> The `purge` flag is clearing `.gcda` files (and others extensions),
> while running the test new `.gcda` files are obtained.
> With `generate` flag, `gcov -i` (`intermediate format`) is called
> on obtained `<object-files-name>.gcda` files (`dbug.c.gcda` e.g.).
> The patch is tested on `gcov 6.3` and `gcov 7.4` versions
> and can be seen that resulting `.gcov` file for `6.3` creates
> `<full path>.gcov` (`dbug.c.gcda.gcov` e.g.) file,
> where `gcov 7.4` is still creating `object-file-names.gcov`(`dbug.c.gcov`) files
> as `gcov` in general is doing.
>
> diff --git a/CMakeLists.txt b/CMakeLists.txt
> index 0dcc2a75587..af025a0312f 100644
> --- a/CMakeLists.txt
> +++ b/CMakeLists.txt
> @@ -249,7 +249,7 @@ SET(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -DENABLED_DEBUG_SYNC")
>
> OPTION(ENABLE_GCOV "Enable gcov (debug, Linux builds only)" OFF)
> IF (ENABLE_GCOV)
> - MY_CHECK_AND_SET_COMPILER_FLAG("-fprofile-arcs -ftest-coverage -lgcov" DEBUG)
> + MY_CHECK_AND_SET_COMPILER_FLAG("--coverage" DEBUG)
Why?
Since what version does gcc support it?
> ENDIF()
>
> MY_CHECK_AND_SET_COMPILER_FLAG(-ggdb3 DEBUG)
> diff --git a/mysql-test/dgcov.pl b/mysql-test/dgcov.pl
> index fbc5540e697..47ffaca04ef 100755
> --- a/mysql-test/dgcov.pl
> +++ b/mysql-test/dgcov.pl
> @@ -68,8 +68,11 @@ if ($opt_purge)
> system($cmd)==0 or die "system($cmd): $? $!";
> exit 0;
> }
> -
> +my $gcov_vers= `gcov -v`;
> +$gcov_vers=~ s/\D//g;
> +$gcov_vers= substr($gcov_vers, 0, 1);
my gcov is 9.3.0. gcc version 10 is already out.
You get just the first digit, 10 will become 1.
Better to something like
$gcov_vers ~= s/^.*(\d+\.\d+\.\d+).*$/\1/s;
> find(\&gcov_one_file, $root);
> +undef $gcov_vers;
why?
> find(\&write_coverage, $root) if $opt_generate;
> exit 0 if $opt_only_gcov;
>
> @@ -162,7 +165,16 @@ sub gcov_one_file {
> }
>
> # now, read the generated file
> - open FH, '<', "$_.gcov" or die "open(<$_.gcov): $!";
> + if($gcov_vers<7)
> + {
> + open FH, '<', "$_.gcov" or die "open(<$_.gcov): $!";
> + }
> + else
> + {
> + my $f=substr $_, 0, -5;
> + open FH, '<', "$f.gcov" or die "open(<$f.gcov): $!";
> + undef $f;
> + }
I'd rather write it more compact, in two lines like
my $f = $gcov_vers < 7 ? $_ : substr $_, 0, -5;
open FH, '<', "$f.gcov" or die "open(<$f.gcov): $!";
> my $fname;
> while (<FH>) {
> chomp;
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
1
0
[Maria-developers] fb963a7fd3b: MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
by sujatha 21 Apr '20
by sujatha 21 Apr '20
21 Apr '20
revision-id: fb963a7fd3b3e0733fb7ad9ece6256aebd8c176c (mariadb-10.1.43-112-gfb963a7fd3b)
parent(s): ad4b70562bb94dd063eebde5189c6e730d3120a2
author: Sujatha
committer: Sujatha
timestamp: 2020-04-21 18:19:34 +0530
message:
MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
Problem:
=======
When run after master server crash --tc-heuristic-recover=rollback produces
inconsistent server state with binlog still containing transactions that were
rolled back by the option. Such way recovered server may not be used for
replication.
Fix:
===
During "--tc-heuristic-recover=ROLLBACK", query the storage engine to get
binlog file name and position corresponding to the last committed transaction.
This marks the consistent binlog state. If last_commit_file is not set then
checkpoint binary log file is considered as starting point.
Look for first transactional event beyond the consistent binlog state. This
will be the starting point for heuristic rollback. Consider this event
specific starting point as binlog truncation position. Now traverse the rest
of binlogs beyond this point. During this traversal check for the presence of
DDL or non transactional operations, as they cannot be safely rolled back. If
such events are found the truncation will not happen it will return an error.
If only transactional events are found beyond the binlog truncation position
it is safe to truncate binlog. The log gets truncated to the identified
position and the GTID state is adjusted accordingly. If there are more binary
logs beyond the being truncated file they are all removed.
If recovery is initiated based on last committed transaction specific binary
log name and position, and there is failure during recovery, then recovery may
be retried by remove the source of the failure. i.e
--tc-heuristic-recover=rollback may be retried.
Retry will not be successful if recovery is based on checkpoint file. As at
the end of tc-heuristic-recover a new binary log file is created which will
change the checkpoint file. Appropriate error message is written to error log
for each case.
---
.../r/binlog_heuristic_rollback_active_log.result | 18 +
.../binlog/r/binlog_truncate_multi_log.result | 33 ++
.../r/binlog_truncate_multi_log_unsafe.result | 30 ++
.../suite/binlog/r/binlog_truncate_none.result | 42 ++
.../binlog/r/binlog_truncate_retry_success.result | 28 +
.../t/binlog_heuristic_rollback_active_log.test | 75 +++
.../suite/binlog/t/binlog_truncate_multi_log.test | 87 +++
.../binlog/t/binlog_truncate_multi_log_unsafe.test | 106 ++++
.../suite/binlog/t/binlog_truncate_none.test | 130 +++++
.../binlog/t/binlog_truncate_retry_success.test | 84 +++
.../suite/rpl/r/rpl_heuristic_fail_over.result | 53 ++
.../suite/rpl/t/rpl_heuristic_fail_over.test | 160 ++++++
sql/log.cc | 589 ++++++++++++++++++++-
sql/log.h | 7 +-
sql/mysqld.h | 3 +-
storage/innobase/handler/ha_innodb.cc | 7 +
storage/innobase/log/log0recv.cc | 5 +-
storage/xtradb/handler/ha_innodb.cc | 7 +
storage/xtradb/log/log0recv.cc | 5 +-
19 files changed, 1457 insertions(+), 12 deletions(-)
diff --git a/mysql-test/suite/binlog/r/binlog_heuristic_rollback_active_log.result b/mysql-test/suite/binlog/r/binlog_heuristic_rollback_active_log.result
new file mode 100644
index 00000000000..eff95d05aac
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_heuristic_rollback_active_log.result
@@ -0,0 +1,18 @@
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+RESET MASTER;
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+INSERT INTO t VALUES (10);
+SET DEBUG_SYNC= "commit_before_update_end_pos SIGNAL con1_ready WAIT_FOR con1_go";
+INSERT INTO t VALUES (20);
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+# Kill the server
+"One row should be present in table 't'"
+SELECT COUNT(*) FROM t;
+COUNT(*)
+1
+"gtid_binlog_state should be 0-1-2
+SELECT @@GLOBAL.gtid_binlog_state;
+@@GLOBAL.gtid_binlog_state
+0-1-2
+DROP TABLE t;
diff --git a/mysql-test/suite/binlog/r/binlog_truncate_multi_log.result b/mysql-test/suite/binlog/r/binlog_truncate_multi_log.result
new file mode 100644
index 00000000000..1a3d49b5463
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_truncate_multi_log.result
@@ -0,0 +1,33 @@
+SET @old_max_binlog_size= @@GLOBAL.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+RESET MASTER;
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+FLUSH LOGS;
+"List of binary logs before rotation"
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+master-bin.000002 #
+SET DEBUG_SYNC= "commit_after_release_LOCK_log SIGNAL con1_ready WAIT_FOR con1_go";
+INSERT INTO t1 VALUES (2, REPEAT("x", 4100));
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+"List of binary logs after rotation"
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+master-bin.000002 #
+master-bin.000003 #
+# Kill the server
+"Zero rows shoule be present in table"
+SELECT COUNT(*) FROM t1;
+COUNT(*)
+0
+SELECT @@GLOBAL.gtid_current_pos;
+@@GLOBAL.gtid_current_pos
+0-1-1
+DROP TABLE t1;
+SELECT @@GLOBAL.gtid_binlog_state;
+@@GLOBAL.gtid_binlog_state
+0-1-2
diff --git a/mysql-test/suite/binlog/r/binlog_truncate_multi_log_unsafe.result b/mysql-test/suite/binlog/r/binlog_truncate_multi_log_unsafe.result
new file mode 100644
index 00000000000..5b7d45d81de
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_truncate_multi_log_unsafe.result
@@ -0,0 +1,30 @@
+SET @old_max_binlog_size= @@global.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+call mtr.add_suppression("Table '.*tm' is marked as crashed and should be repaired");
+call mtr.add_suppression("Got an error from unknown thread");
+call mtr.add_suppression("Checking table: '.*tm'");
+call mtr.add_suppression("Recovering table: '.*tm'");
+call mtr.add_suppression("tc-heuristic-recover cannot trim the binary log to");
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+RESET MASTER;
+CREATE TABLE ti (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+CREATE TABLE tm (f INT) ENGINE=MYISAM;
+SET DEBUG_SYNC= "commit_before_get_LOCK_commit_ordered SIGNAL con1_ready WAIT_FOR con1_go";
+INSERT INTO ti VALUES (2, REPEAT("x", 4100));
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+INSERT INTO tm VALUES (30);;
+# Kill the server
+FOUND /tc-heuristic-recover cannot trim the binary log to File/ in mysqld.1.err
+"Zero rows shoule be present in 'ti' table."
+SELECT COUNT(*) FROM ti;
+COUNT(*)
+0
+"One row must be present in 'tm' table."
+SELECT COUNT(*) FROM tm;
+COUNT(*)
+1
+SELECT @@GLOBAL.gtid_current_pos;
+@@GLOBAL.gtid_current_pos
+0-1-4
+DROP TABLE ti, tm;
diff --git a/mysql-test/suite/binlog/r/binlog_truncate_none.result b/mysql-test/suite/binlog/r/binlog_truncate_none.result
new file mode 100644
index 00000000000..bae87985208
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_truncate_none.result
@@ -0,0 +1,42 @@
+SET @old_max_binlog_size= @@global.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+RESET MASTER;
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+INSERT INTO t1 VALUES (1, REPEAT("x", 4100));
+CREATE TABLE t2 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+master-bin.000002 #
+# Kill the server
+"Zero records should be there."
+SELECT COUNT(*) FROM t1;
+COUNT(*)
+1
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+master-bin.000002 #
+master-bin.000003 #
+master-bin.000004 #
+DROP TABLE t1,t2;
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+RESET MASTER;
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+CREATE TABLE t2 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+# Kill the server
+"Zero records should be there."
+SELECT COUNT(*) FROM t1;
+COUNT(*)
+0
+SHOW TABLES;
+Tables_in_test
+t1
+t2
+DROP TABLE t1,t2;
diff --git a/mysql-test/suite/binlog/r/binlog_truncate_retry_success.result b/mysql-test/suite/binlog/r/binlog_truncate_retry_success.result
new file mode 100644
index 00000000000..cd0c4c23d05
--- /dev/null
+++ b/mysql-test/suite/binlog/r/binlog_truncate_retry_success.result
@@ -0,0 +1,28 @@
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+RESET MASTER;
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+INSERT INTO t1 VALUES (1,'dummy');
+SET DEBUG_SYNC= "commit_after_release_LOCK_log SIGNAL con1_ready WAIT_FOR con1_go";
+INSERT INTO t1 VALUES (2,'dummy');
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+"List of binary logs"
+show binary logs;
+Log_name File_size
+master-bin.000001 #
+# Kill the server
+TEST 1: Recovery with error
+TEST 2: Recovery without error
+"One row shoule be present in table"
+SELECT COUNT(*) FROM t1;
+COUNT(*)
+1
+"Two gtids should be present 0-1-2"
+SELECT @@GLOBAL.gtid_current_pos;
+@@GLOBAL.gtid_current_pos
+0-1-2
+"Two gtids should be present 0-1-3"
+DROP TABLE t1;
+SELECT @@GLOBAL.gtid_binlog_state;
+@@GLOBAL.gtid_binlog_state
+0-1-3
diff --git a/mysql-test/suite/binlog/t/binlog_heuristic_rollback_active_log.test b/mysql-test/suite/binlog/t/binlog_heuristic_rollback_active_log.test
new file mode 100644
index 00000000000..87af83a7a92
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_heuristic_rollback_active_log.test
@@ -0,0 +1,75 @@
+# ==== Purpose ====
+#
+# Test verifies the truncation of single binary log file.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Create table t1 and insert a row.
+# 1 - Insert an another row such that it gets written to binlog but commit
+# in engine fails as server crashed at this point.
+# 2 - Restart server with --tc-heuristic-recover=ROLLBACK
+# 3 - Upon server start 'master-bin.000001' will be truncated to contain
+# only the first insert
+#
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_innodb.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_binlog_format_statement.inc
+
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+connect(master,localhost,root,,);
+connect(master1,localhost,root,,);
+
+--connection master
+RESET MASTER;
+CREATE TABLE t ( f INT ) ENGINE=INNODB;
+INSERT INTO t VALUES (10);
+
+--connection master1
+# Hold insert after write to binlog and before "run_commit_ordered" in engine
+SET DEBUG_SYNC= "commit_before_update_end_pos SIGNAL con1_ready WAIT_FOR con1_go";
+send INSERT INTO t VALUES (20);
+
+--connection master
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --tc-heuristic-recover=ROLLBACK
+EOF
+--source include/wait_until_disconnected.inc
+
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart:
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection master
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--echo "One row should be present in table 't'"
+SELECT COUNT(*) FROM t;
+
+--echo "gtid_binlog_state should be 0-1-2
+SELECT @@GLOBAL.gtid_binlog_state;
+DROP TABLE t;
diff --git a/mysql-test/suite/binlog/t/binlog_truncate_multi_log.test b/mysql-test/suite/binlog/t/binlog_truncate_multi_log.test
new file mode 100644
index 00000000000..0a6e011627e
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_truncate_multi_log.test
@@ -0,0 +1,87 @@
+# ==== Purpose ====
+#
+# Test verifies truncation of multiple binary logs.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Create a table in innodb engine and execute FLUSH LOGS command to
+# generate a new binary log.
+# 1 - Set max_binlog_size= 4096. Insert a row such that the max_binlog_size
+# is reached and binary log gets rotated.
+# 2 - Using debug simulation make the server crash at a point where the DML
+# transaction is written to binary log but not committed in engine.
+# 3 - At the time of crash three binary logs will be there
+# master-bin.0000001, master-bin.000002 and master-bin.000003.
+# 4 - Restart server with --tc-heuristic-recover=ROLLBACK
+# 5 - Since the prepared DML in master-bin.000002 the binary log will be
+# truncated and master-bin.000003 will be removed.
+#
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_innodb.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+--source include/have_binlog_format_row.inc
+
+SET @old_max_binlog_size= @@GLOBAL.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+RESET MASTER;
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+FLUSH LOGS;
+connect(master1,localhost,root,,);
+connect(master2,localhost,root,,);
+
+--connection master1
+# Hold insert after write to binlog and before "run_commit_ordered" in engine.
+# Use "commit_after_release_LOCK_log" sync point. This point is reached after
+# the binary log end position is updated which actually triggers binlog to be
+# rotated.
+--echo "List of binary logs before rotation"
+--source include/show_binary_logs.inc
+SET DEBUG_SYNC= "commit_after_release_LOCK_log SIGNAL con1_ready WAIT_FOR con1_go";
+send INSERT INTO t1 VALUES (2, REPEAT("x", 4100));
+
+--connection master2
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+--echo "List of binary logs after rotation"
+--source include/show_binary_logs.inc
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --tc-heuristic-recover=ROLLBACK --debug-dbug=d,simulate_innodb_forget_commit_pos
+EOF
+--source include/wait_until_disconnected.inc
+
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart:
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--echo "Zero rows shoule be present in table"
+SELECT COUNT(*) FROM t1;
+
+SELECT @@GLOBAL.gtid_current_pos;
+
+DROP TABLE t1;
+SELECT @@GLOBAL.gtid_binlog_state;
+
diff --git a/mysql-test/suite/binlog/t/binlog_truncate_multi_log_unsafe.test b/mysql-test/suite/binlog/t/binlog_truncate_multi_log_unsafe.test
new file mode 100644
index 00000000000..cf4f3d34a24
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_truncate_multi_log_unsafe.test
@@ -0,0 +1,106 @@
+# ==== Purpose ====
+#
+# Test verifies truncation of multiple binary logs will report an error on
+# unsafe scenario.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Set max_binlog_size= 4096. Create a table 'ti' using transactional
+# storage engine. Do an insert such that the max_binlog_size is reached
+# and binary log gets rotated. Hold this thread at a position where
+# transaction is written to binary log but not committed in engine.
+# 1 - Create a table named 'tm' using non transactional storage engine.
+# 2 - Insert a row in 'tm' table. The DML will reach the engine and it is
+# also written to binary log.
+# 3 - Kill and restart the server with --tc-heuristic-recover=ROLLBACK
+# 4 - Check for binary log truncation unsafe message in error log.
+# 5 - No rows should be present in 'ti' table. One row should be present in
+# 'tm' table.
+#
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_innodb.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+--source include/have_binlog_format_row.inc
+
+SET @old_max_binlog_size= @@global.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+
+call mtr.add_suppression("Table '.*tm' is marked as crashed and should be repaired");
+call mtr.add_suppression("Got an error from unknown thread");
+call mtr.add_suppression("Checking table: '.*tm'");
+call mtr.add_suppression("Recovering table: '.*tm'");
+call mtr.add_suppression("tc-heuristic-recover cannot trim the binary log to");
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+RESET MASTER;
+
+CREATE TABLE ti (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+CREATE TABLE tm (f INT) ENGINE=MYISAM;
+
+connect(master1,localhost,root,,);
+connect(master2,localhost,root,,);
+
+--connection master1
+# Hold insert after write to binlog and before "run_commit_ordered" in engine
+SET DEBUG_SYNC= "commit_before_get_LOCK_commit_ordered SIGNAL con1_ready WAIT_FOR con1_go";
+--send INSERT INTO ti VALUES (2, REPEAT("x", 4100))
+
+--connection master2
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+--send INSERT INTO tm VALUES (30);
+
+--connection default
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --tc-heuristic-recover=ROLLBACK --debug-dbug=d,simulate_innodb_forget_commit_pos
+EOF
+--source include/wait_until_disconnected.inc
+
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart:
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+# Check error log for correct messages.
+let $log_error_= `SELECT @@GLOBAL.log_error`;
+if(!$log_error_)
+{
+ # MySQL Server on windows is started with --console and thus
+ # does not know the location of its .err log, use default location
+ let $log_error_ = $MYSQLTEST_VARDIR/log/mysqld.1.err;
+}
+--let SEARCH_FILE=$log_error_
+--let SEARCH_RANGE=-50000
+--let SEARCH_PATTERN=tc-heuristic-recover cannot trim the binary log to File
+--source include/search_pattern_in_file.inc
+
+--echo "Zero rows shoule be present in 'ti' table."
+SELECT COUNT(*) FROM ti;
+--echo "One row must be present in 'tm' table."
+--replace_regex /Table '.*tm/Table 'tm/
+--disable_warnings
+SELECT COUNT(*) FROM tm;
+--enable_warnings
+SELECT @@GLOBAL.gtid_current_pos;
+
+DROP TABLE ti, tm;
diff --git a/mysql-test/suite/binlog/t/binlog_truncate_none.test b/mysql-test/suite/binlog/t/binlog_truncate_none.test
new file mode 100644
index 00000000000..cc3fd4f0b37
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_truncate_none.test
@@ -0,0 +1,130 @@
+# ==== Purpose ====
+#
+# Test case verifies no binlog truncation happens when non transactional
+# events are found in binlog after the last committed transaction.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Set max_binlog_size= 4096. Create a table and do an insert such that
+# the max_binlog_size is reached and binary log gets rotated.
+# 1 - Create a table in newly created binary log and crash the server
+# 2 - Restart server with --tc-heuristic-recover=ROLLBACK
+# 3 - Recovery code will get the last committed DML specific postion and
+# will try to check if binlog can be truncated upto this position.
+# Since a DDL is present beyond this no truncation will happen.
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_innodb.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_binlog_format_row.inc
+
+SET @old_max_binlog_size= @@global.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+RESET MASTER;
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+INSERT INTO t1 VALUES (1, REPEAT("x", 4100));
+CREATE TABLE t2 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+--source include/show_binary_logs.inc
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --tc-heuristic-recover=ROLLBACK
+EOF
+--source include/wait_until_disconnected.inc
+
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart:
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--echo "Zero records should be there."
+SELECT COUNT(*) FROM t1;
+--source include/show_binary_logs.inc
+DROP TABLE t1,t2;
+
+# ==== Purpose ====
+#
+# Test case verifies no binlog truncation happens when only DDLs are present in
+# the binary log. Since none of the DMLs are performed in storage engine,
+# Engine will not have last committed transaction file name or position.
+# Truncation code should return success.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Create two table t1, t2
+# 1 - Kill and restart server with --tc-heuristic-recover=ROLLBACK
+# 2 - Only DDL statements are present in the binary log. Since
+# no DML was performed engine will not have last commited transaction
+# specific binary log name and position. Since no transactional events
+# are found, truncation code should simply return.
+#
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_innodb.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_binlog_format_row.inc
+
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+RESET MASTER;
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+CREATE TABLE t2 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+--source include/show_binary_logs.inc
+
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --tc-heuristic-recover=ROLLBACK --debug-dbug=d,simulate_innodb_forget_commit_pos
+EOF
+--source include/wait_until_disconnected.inc
+
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart:
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--echo "Zero records should be there."
+SELECT COUNT(*) FROM t1;
+SHOW TABLES;
+DROP TABLE t1,t2;
+
diff --git a/mysql-test/suite/binlog/t/binlog_truncate_retry_success.test b/mysql-test/suite/binlog/t/binlog_truncate_retry_success.test
new file mode 100644
index 00000000000..6e92667c9b4
--- /dev/null
+++ b/mysql-test/suite/binlog/t/binlog_truncate_retry_success.test
@@ -0,0 +1,84 @@
+# ==== Purpose ====
+#
+# Test verifies that tc-heuristic-recover=ROLLBACK can be retried.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Create a table in Innodb storage engine. Insert a row into the table.
+# During the commit of this DML engine will persist last committed
+# transaction specific binary log file name and position.
+# 1 - Do an another DML into the table, and simulate a crash in the middle
+# of DML commit, so that DML is present in binary log but not committed in
+# the storage engine.
+# 2 - Restart the server using --tc-heuristic-recover=ROLLBACK. Simulate an
+# error during the binary log rollback operation. Verify appropriate
+# error is reported in the error log.
+# 3 - Retry the --tc-heuristic-recver=ROLLBACK operation. Verify that
+# binary log gets truncated as expected.
+# 4 - Verify that global gtid state is according to the rolled back
+# transaction.
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_innodb.inc
+--source include/not_embedded.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+--source include/have_binlog_format_row.inc
+
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+RESET MASTER;
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+INSERT INTO t1 VALUES (1,'dummy');
+connect(master1,localhost,root,,);
+connect(master2,localhost,root,,);
+
+--connection master1
+SET DEBUG_SYNC= "commit_after_release_LOCK_log SIGNAL con1_ready WAIT_FOR con1_go";
+send INSERT INTO t1 VALUES (2,'dummy');
+
+--connection master2
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+--echo "List of binary logs"
+--source include/show_binary_logs.inc
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+
+#
+# Server restart
+#
+--echo TEST 1: Recovery with error
+--error 1
+--exec $MYSQLD_LAST_CMD --tc-heuristic-recover=ROLLBACK --debug-dbug=d,fault_injection_opening_binlog > $MYSQLTEST_VARDIR/log/mysqld.1.err 2>&1
+--source include/wait_until_disconnected.inc
+
+--echo TEST 2: Recovery without error
+--error 1
+--exec $MYSQLD_LAST_CMD --tc-heuristic-recover=ROLLBACK > $MYSQLTEST_VARDIR/log/mysqld.1.err 2>&1
+--source include/wait_until_disconnected.inc
+
+--source include/start_mysqld.inc
+
+--echo "One row shoule be present in table"
+SELECT COUNT(*) FROM t1;
+
+--echo "Two gtids should be present 0-1-2"
+SELECT @@GLOBAL.gtid_current_pos;
+
+--echo "Two gtids should be present 0-1-3"
+DROP TABLE t1;
+SELECT @@GLOBAL.gtid_binlog_state;
+
+--disconnect master1
+--disconnect master2
diff --git a/mysql-test/suite/rpl/r/rpl_heuristic_fail_over.result b/mysql-test/suite/rpl/r/rpl_heuristic_fail_over.result
new file mode 100644
index 00000000000..02fc47def41
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_heuristic_fail_over.result
@@ -0,0 +1,53 @@
+include/rpl_init.inc [topology=1->2]
+include/stop_slave.inc
+set global rpl_semi_sync_slave_enabled = 1;
+CHANGE MASTER TO master_use_gtid= current_pos;
+include/start_slave.inc
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+set global rpl_semi_sync_master_enabled = 1;
+set global rpl_semi_sync_master_wait_point=AFTER_SYNC;
+SET @old_max_binlog_size= @@global.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+INSERT INTO t1 VALUES (1, 'dummy1');
+SET DEBUG_SYNC= "commit_before_update_end_pos SIGNAL con1_ready WAIT_FOR con1_go";
+INSERT INTO t1 VALUES (2, REPEAT("x", 4100));
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+# Kill the server
+include/stop_slave.inc
+SELECT @@GLOBAL.gtid_current_pos;
+@@GLOBAL.gtid_current_pos
+0-1-5
+FOUND /tc-heuristic-recover: Truncated binlog File: \.\/master\-bin\.000001 of Size:[0-9]*, to Position */ in mysqld.1.err
+set global rpl_semi_sync_master_enabled = 1;
+set global rpl_semi_sync_master_wait_point=AFTER_SYNC;
+CHANGE MASTER TO master_host='127.0.0.1', master_port=SERVER_MYPORT_2, master_user='root', master_use_gtid=CURRENT_POS;
+set global rpl_semi_sync_slave_enabled = 1;
+include/start_slave.inc
+INSERT INTO t1 VALUES (3, 'dummy3');
+SELECT COUNT(*) FROM t1;
+COUNT(*)
+2
+SHOW VARIABLES LIKE 'gtid_current_pos';
+Variable_name Value
+gtid_current_pos 0-2-6
+SHOW VARIABLES LIKE 'gtid_current_pos';
+Variable_name Value
+gtid_current_pos 0-2-6
+DROP TABLE t1;
+set global rpl_semi_sync_master_enabled = 0;
+set global rpl_semi_sync_slave_enabled = 0;
+set global rpl_semi_sync_master_wait_point=default;
+set global rpl_semi_sync_master_enabled = 0;
+set global rpl_semi_sync_slave_enabled = 0;
+set global rpl_semi_sync_master_wait_point=default;
+include/stop_slave.inc
+RESET MASTER;
+RESET SLAVE;
+RESET MASTER;
+RESET SLAVE;
+CHANGE MASTER TO master_host='127.0.0.1', master_port=SERVER_MYPORT_1, master_user='root', master_use_gtid=no;
+include/start_slave.inc
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_heuristic_fail_over.test b/mysql-test/suite/rpl/t/rpl_heuristic_fail_over.test
new file mode 100644
index 00000000000..39b1f7545d0
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_heuristic_fail_over.test
@@ -0,0 +1,160 @@
+# ==== Purpose ====
+#
+# Test verifies replication failover scenario.
+#
+# ==== Implementation ====
+#
+# Steps:
+# 0 - Have two servers with id's 1 and 2. Enable semi-sync based
+# replication. Have semi sync master wait point as 'after_sync'.
+# 1 - Create a table and insert a row. While inserting second row simulate
+# a server crash at once the transaction is written to binlog, flushed
+# and synced but the binlog position is not updated.
+# 2 - Restart the server using tc-heuristic-recover=ROLLBACK
+# 3 - Post restart switch the crashed master to slave. Execute CHANGE MASTER
+# TO command to connect to server id 2.
+# 4 - Slave should be able to connect to master.
+# 5 - Execute some DML on new master with server id 2. Ensure that it gets
+# replicated to server id 1.
+# 6 - Verify the "gtid_current_pos" for correctness.
+# 7 - Clean up
+#
+# ==== References ====
+#
+# MDEV-21117: --tc-heuristic-recover=rollback is not replication safe
+
+
+--source include/have_semisync.inc
+--source include/have_innodb.inc
+--source include/have_log_bin.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+--source include/have_binlog_format_row.inc
+--let $rpl_topology=1->2
+--source include/rpl_init.inc
+
+--connection server_2
+--source include/stop_slave.inc
+set global rpl_semi_sync_slave_enabled = 1;
+CHANGE MASTER TO master_use_gtid= current_pos;
+--source include/start_slave.inc
+
+
+--connection server_1
+ALTER TABLE mysql.gtid_slave_pos ENGINE=InnoDB;
+set global rpl_semi_sync_master_enabled = 1;
+set global rpl_semi_sync_master_wait_point=AFTER_SYNC;
+SET @old_max_binlog_size= @@global.max_binlog_size;
+SET GLOBAL max_binlog_size= 4096;
+
+call mtr.add_suppression("Can't init tc log");
+call mtr.add_suppression("Aborting");
+
+CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
+INSERT INTO t1 VALUES (1, 'dummy1');
+--save_master_pos
+
+--connection server_2
+--sync_with_master
+
+--connection server_1
+connect(master1,localhost,root,,);
+connect(master2,localhost,root,,);
+
+--connection master1
+# Hold insert after write to binlog and before "run_commit_ordered" in engine
+SET DEBUG_SYNC= "commit_before_update_end_pos SIGNAL con1_ready WAIT_FOR con1_go";
+send INSERT INTO t1 VALUES (2, REPEAT("x", 4100));
+
+--connection master2
+SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
+--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+wait
+EOF
+
+--source include/kill_mysqld.inc
+--source include/wait_until_disconnected.inc
+
+--connection server_2
+--error 2003
+--source include/stop_slave.inc
+SELECT @@GLOBAL.gtid_current_pos;
+
+--connection server_1
+#
+# Server restart
+#
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart: --tc-heuristic-recover=ROLLBACK
+EOF
+--source include/wait_until_disconnected.inc
+
+--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+restart:
+EOF
+
+connection default;
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+--connection server_1
+--enable_reconnect
+--source include/wait_until_connected_again.inc
+
+# Check error log for correct messages.
+let $log_error_= `SELECT @@GLOBAL.log_error`;
+if(!$log_error_)
+{
+ # MySQL Server on windows is started with --console and thus
+ # does not know the location of its .err log, use default location
+ let $log_error_ = $MYSQLTEST_VARDIR/log/mysqld.1.err;
+}
+--let SEARCH_FILE=$log_error_
+--let SEARCH_RANGE=-50000
+--let SEARCH_PATTERN=tc-heuristic-recover: Truncated binlog File: \.\/master\-bin\.000001 of Size:[0-9]*, to Position *
+--source include/search_pattern_in_file.inc
+
+--connection server_2
+set global rpl_semi_sync_master_enabled = 1;
+set global rpl_semi_sync_master_wait_point=AFTER_SYNC;
+
+--connection server_1
+--replace_result $SERVER_MYPORT_2 SERVER_MYPORT_2
+eval CHANGE MASTER TO master_host='127.0.0.1', master_port=$SERVER_MYPORT_2, master_user='root', master_use_gtid=CURRENT_POS;
+set global rpl_semi_sync_slave_enabled = 1;
+--source include/start_slave.inc
+
+--connection server_2
+INSERT INTO t1 VALUES (3, 'dummy3');
+--save_master_pos
+
+--connection server_1
+--sync_with_master
+SELECT COUNT(*) FROM t1;
+SHOW VARIABLES LIKE 'gtid_current_pos';
+
+--connection server_2
+SHOW VARIABLES LIKE 'gtid_current_pos';
+DROP TABLE t1;
+set global rpl_semi_sync_master_enabled = 0;
+set global rpl_semi_sync_slave_enabled = 0;
+set global rpl_semi_sync_master_wait_point=default;
+--save_master_pos
+
+--connection server_1
+--sync_with_master
+set global rpl_semi_sync_master_enabled = 0;
+set global rpl_semi_sync_slave_enabled = 0;
+set global rpl_semi_sync_master_wait_point=default;
+--source include/stop_slave.inc
+RESET MASTER;
+RESET SLAVE;
+
+--connection server_2
+RESET MASTER;
+RESET SLAVE;
+--replace_result $SERVER_MYPORT_1 SERVER_MYPORT_1
+eval CHANGE MASTER TO master_host='127.0.0.1', master_port=$SERVER_MYPORT_1, master_user='root', master_use_gtid=no;
+--source include/start_slave.inc
+
+--source include/rpl_end.inc
diff --git a/sql/log.cc b/sql/log.cc
index 0efef6d1e29..66445944b40 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -3164,6 +3164,7 @@ MYSQL_BIN_LOG::MYSQL_BIN_LOG(uint *sync_period)
checksum_alg_reset(BINLOG_CHECKSUM_ALG_UNDEF),
relay_log_checksum_alg(BINLOG_CHECKSUM_ALG_UNDEF),
description_event_for_exec(0), description_event_for_queue(0),
+ last_commit_pos_offset(0),
current_binlog_id(0)
{
/*
@@ -3173,6 +3174,7 @@ MYSQL_BIN_LOG::MYSQL_BIN_LOG(uint *sync_period)
before main().
*/
index_file_name[0] = 0;
+ last_commit_pos_file[0]= 0;
bzero((char*) &index_file, sizeof(index_file));
bzero((char*) &purge_index_file, sizeof(purge_index_file));
}
@@ -4576,8 +4578,8 @@ int MYSQL_BIN_LOG::open_purge_index_file(bool destroy)
0, 0, MYF(MY_WME | MY_NABP | MY_WAIT_IF_FULL)))
{
error= 1;
- sql_print_error("MYSQL_BIN_LOG::open_purge_index_file failed to open register "
- " file.");
+ sql_print_error("MYSQL_BIN_LOG::open_purge_index_file failed to open "
+ "register file.");
}
}
DBUG_RETURN(error);
@@ -4742,7 +4744,7 @@ int MYSQL_BIN_LOG::purge_index_entry(THD *thd, ulonglong *reclaimed_space,
}
goto err;
}
-
+
error= 0;
DBUG_PRINT("info",("purging %s",log_info.log_file_name));
@@ -7877,6 +7879,7 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
first= false;
}
+ DEBUG_SYNC(leader->thd, "commit_before_update_end_pos");
/* update binlog_end_pos so it can be read by dump thread
*
* note: must be _after_ the RUN_HOOK(after_flush) or else
@@ -8964,7 +8967,7 @@ int TC_LOG_MMAP::open(const char *opt_name)
{
if (my_errno != ENOENT)
goto err;
- if (using_heuristic_recover())
+ if (using_heuristic_recover(opt_name))
return 1;
if ((fd= mysql_file_create(key_file_tclog, logname, CREATE_MODE,
O_RDWR | O_CLOEXEC, MYF(MY_WME))) < 0)
@@ -9497,14 +9500,40 @@ TC_LOG_MMAP tc_log_mmap;
1 heuristic recovery was performed
*/
-int TC_LOG::using_heuristic_recover()
+int TC_LOG::using_heuristic_recover(const char* opt_name)
{
+ LOG_INFO log_info;
+ int error;
+
if (!tc_heuristic_recover)
return 0;
sql_print_information("Heuristic crash recovery mode");
+
if (ha_recover(0))
+ {
sql_print_error("Heuristic crash recovery failed");
+ }
+
+ if (!strcmp(opt_name, opt_bin_logname) &&
+ tc_heuristic_recover == TC_HEURISTIC_RECOVER_ROLLBACK)
+ {
+ if ((error= mysql_bin_log.find_log_pos(&log_info, NullS, 1)))
+ {
+ if (error != LOG_INFO_EOF)
+ sql_print_error("tc-heuristic-recover failed in find_log_pos(). "
+ "Error: %d", error);
+ }
+ else
+ {
+ if ((error= heuristic_binlog_rollback()))
+ {
+ sql_print_error("Heuristic crash recovery of binary log failed.");
+ return 1;
+ }
+ }
+ }
+
sql_print_information("Please restart mysqld without --tc-heuristic-recover");
return 1;
}
@@ -9512,6 +9541,554 @@ int TC_LOG::using_heuristic_recover()
/****** transaction coordinator log for 2pc - binlog() based solution ******/
#define TC_LOG_BINLOG MYSQL_BIN_LOG
+/**
+ Truncates the current binlog to specified position. Removes the rest of binlogs
+ which are present after this binlog file.
+
+ @param truncate_file Holds the binlog name to be truncated
+ @param truncate_pos Position within binlog from where it needs to
+ truncated.
+
+ @retval true ok
+ @retval false error
+
+*/
+bool MYSQL_BIN_LOG::truncate_and_remove_binlogs(const char *truncate_file,
+ my_off_t truncate_pos)
+{
+ int error= 0;
+#ifdef HAVE_REPLICATION
+ LOG_INFO log_info;
+ THD *thd= current_thd;
+ my_off_t index_file_offset= 0;
+ File file= -1;
+ IO_CACHE cache;
+ MY_STAT s;
+ my_off_t binlog_size;
+
+ if ((error= find_log_pos(&log_info, truncate_file, 1)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to locate binary log file:%s."
+ "Error:%d", truncate_file, error);
+ goto end;
+ }
+
+ while (!(error= find_next_log(&log_info, 1)))
+ {
+ if (!index_file_offset)
+ {
+ index_file_offset= log_info.index_file_start_offset;
+ if ((error= open_purge_index_file(TRUE)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to open purge index "
+ "file:%s. Error:%d", purge_index_file_name, error);
+ goto end;
+ }
+ }
+ if ((error= register_purge_index_entry(log_info.log_file_name)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to copy %s to purge index"
+ " file. Error:%d", log_info.log_file_name, error);
+ goto end;
+ }
+ }
+
+ if (error != LOG_INFO_EOF)
+ {
+ sql_print_error("tc-heuristic-recover: Failed to find the next binlog to "
+ "add to purge index register. Error:%d", error);
+ goto end;
+ }
+
+ if (is_inited_purge_index_file())
+ {
+ if (!index_file_offset)
+ index_file_offset= log_info.index_file_start_offset;
+
+ if ((error= sync_purge_index_file()))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to flush purge index "
+ "file. Error:%d", error);
+ goto end;
+ }
+
+ // Trim index file
+ if ((error=
+ mysql_file_chsize(index_file.file, index_file_offset, '\n',
+ MYF(MY_WME))) ||
+ (error=
+ mysql_file_sync(index_file.file, MYF(MY_WME|MY_SYNC_FILESIZE))))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to trim binlog index "
+ "file:%s to offset:%llu. Error:%d", index_file_name,
+ index_file_offset);
+ mysql_file_close(index_file.file, MYF(MY_WME));
+ goto end;
+ }
+
+ /* Reset data in old index cache */
+ if ((error= reinit_io_cache(&index_file, READ_CACHE, (my_off_t) 0, 0, 1)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to reinit binlog index "
+ "file. Error:%d", error);
+ mysql_file_close(index_file.file, MYF(MY_WME));
+ goto end;
+ }
+
+ /* Read each entry from purge_index_file and delete the file. */
+ if ((error= purge_index_entry(thd, NULL, TRUE)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to process registered "
+ "files that would be purged.");
+ goto end;
+ }
+ }
+
+ DBUG_ASSERT(truncate_pos);
+
+ if ((file= mysql_file_open(key_file_binlog, truncate_file,
+ O_RDWR | O_BINARY, MYF(MY_WME))) < 0)
+ {
+ error= 1;
+ sql_print_error("tc-heuristic-recover: Failed to open binlog file:%s for "
+ "truncation.", truncate_file);
+ goto end;
+ }
+ my_stat(truncate_file, &s, MYF(0));
+ binlog_size= s.st_size;
+
+ /* Change binlog file size to truncate_pos */
+ if ((error=
+ mysql_file_chsize(file, truncate_pos, 0, MYF(MY_WME))) ||
+ (error= mysql_file_sync(file, MYF(MY_WME|MY_SYNC_FILESIZE))))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to trim the "
+ "binlog file:%s to size:%llu. Error:%d",
+ truncate_file, truncate_pos, error);
+ goto end;
+ }
+ else
+ {
+ sql_print_information("tc-heuristic-recover: Truncated binlog "
+ "File: %s of Size:%llu, to Position:%llu.",
+ truncate_file, binlog_size, truncate_pos);
+ }
+ if (!(error= init_io_cache(&cache, file, IO_SIZE, WRITE_CACHE,
+ (my_off_t) truncate_pos, 0, MYF(MY_WME|MY_NABP))))
+ {
+ /*
+ Write Stop_log_event to ensure clean end point for the binary log being
+ truncated.
+ */
+ Stop_log_event se;
+ se.checksum_alg= (enum_binlog_checksum_alg) binlog_checksum_options;
+ if ((error= write_event(&se, &cache)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to write stop event to "
+ "binary log. Errno:%d",
+ (cache.error == -1) ? my_errno : error);
+ goto end;
+ }
+ if ((error= flush_io_cache(&cache)) ||
+ (error= mysql_file_sync(file, MYF(MY_WME|MY_SYNC_FILESIZE))))
+ {
+ sql_print_error("tc-heuristic-recover: Faild to write stop event to "
+ "binary log. Errno:%d",
+ (cache.error == -1) ? my_errno : error);
+ }
+ }
+ else
+ sql_print_error("tc-heuristic-recover: Failed to initialize binary log "
+ "cache for writing stop event. Errno:%d",
+ (cache.error == -1) ? my_errno : error);
+
+end:
+ if (file >= 0)
+ {
+ end_io_cache(&cache);
+ mysql_file_close(file, MYF(MY_WME));
+ }
+ error= error || close_purge_index_file();
+#endif
+ return error > 0;
+}
+
+/**
+ Returns the checkpoint binlog file name found in the lastest binlog file.
+
+ @param checkpoint_file Holds the binlog checkpoint file name.
+
+ @retval 0 ok
+ @retval 1 error
+
+*/
+int TC_LOG_BINLOG::get_binlog_checkpoint_file(char* checkpoint_file)
+{
+ Log_event *ev= NULL;
+ bool binlog_checkpoint_found= false;
+ LOG_INFO log_info;
+ const char *errmsg;
+ IO_CACHE log;
+ File file;
+ Format_description_log_event fdle(BINLOG_VERSION);
+ char log_name[FN_REFLEN];
+ int error=1;
+
+ if (!fdle.is_valid())
+ return 1;
+
+ if ((error= find_log_pos(&log_info, NullS, 1)))
+ {
+ sql_print_error("tc-heuristic-recover: find_log_pos() failed to read first "
+ "binary log entry from index file.(error: %d)", error);
+ return error;
+ }
+
+ // Move to the last binary log.
+ do
+ {
+ strmake_buf(log_name, log_info.log_file_name);
+ } while (!(error= find_next_log(&log_info, 1)));
+
+ if (error != LOG_INFO_EOF)
+ {
+ sql_print_error("tc-heuristic-recover: find_next_log() failed "
+ "(error: %d)", error);
+ return error;
+ }
+ if ((file= open_binlog(&log, log_name, &errmsg)) < 0)
+ {
+ sql_print_error("tc-heuristic-recover failed to open the binlog:%s for "
+ "reading checkpoint file name. Error: %s",
+ log_info.log_file_name, errmsg);
+ return error;
+ }
+ while (!binlog_checkpoint_found &&
+ (ev=
+ Log_event::read_log_event(&log, 0, &fdle, opt_master_verify_checksum))
+ && ev->is_valid())
+ {
+ enum Log_event_type typ= ev->get_type_code();
+ if (typ == BINLOG_CHECKPOINT_EVENT)
+ {
+ size_t dir_len;
+ Binlog_checkpoint_log_event *cev= (Binlog_checkpoint_log_event *)ev;
+ if (cev->binlog_file_len >= FN_REFLEN)
+ {
+ sql_print_error("tc-heuristic-recover: Incorrect binlog checkpoint "
+ "event with too long file name found.");
+ delete ev;
+ ev= NULL;
+ end_io_cache(&log);
+ mysql_file_close(file, MYF(MY_WME));
+ return 1;
+ }
+ else
+ {
+ dir_len= dirname_length(log_name);
+ strmake(strnmov(checkpoint_file, log_name, dir_len),
+ cev->binlog_file_name, FN_REFLEN - 1 - dir_len);
+ binlog_checkpoint_found= true;
+ }
+ }
+ delete ev;
+ ev= NULL;
+ } // End of while
+ end_io_cache(&log);
+ mysql_file_close(file, MYF(MY_WME));
+ file= -1;
+ /*
+ Old binary log without checkpoint found, binlog truncation is not
+ possible. Hence return error.
+ */
+ if (!binlog_checkpoint_found)
+ return 1;
+
+ return 0;
+}
+
+
+/**
+ Truncates the binary log, according to the transactions that got rolled
+ back from engine, during heuristic-recover=ROLLBACK. Global GTID state is
+ adjusted as per the truncated binlog.
+
+ Called from @c TC_LOG::using_heuristic_recover(const char* opt_name)
+
+ @param opt_name The base name of binary log.
+
+ @return indicates success or failure of binlog rollback
+ @retval 0 success
+ @retval 1 failure
+
+*/
+int TC_LOG_BINLOG::heuristic_binlog_rollback()
+{
+ int error=0;
+#ifdef HAVE_REPLICATION
+ Log_event *ev= NULL;
+ char engine_commit_file[FN_REFLEN];
+ char binlog_truncate_file_name[FN_REFLEN];
+ char checkpoint_file[FN_REFLEN];
+ my_off_t binlog_truncate_pos= 0;
+ my_off_t engine_commit_pos= 0;
+ LOG_INFO log_info;
+ const char *errmsg;
+ IO_CACHE log;
+ File file=-1;
+ Format_description_log_event fdle(BINLOG_VERSION);
+ bool found_engine_commit_pos= false;
+ bool found_truncate_pos= false;
+ bool is_safe= true;
+ my_off_t tmp_truncate_pos=0;
+ rpl_gtid last_gtid;
+ bool last_gtid_standalone= false;
+ bool last_gtid_valid= false;
+ bool is_checkpoint_based_recovery= false;
+
+
+ DBUG_EXECUTE_IF("simulate_innodb_forget_commit_pos",
+ {
+ last_commit_pos_file[0]= 0;
+ };);
+
+ // Initialize engine_commit_file/pos
+ if (last_commit_pos_file[0] != 0)
+ {
+ strmake_buf(engine_commit_file, last_commit_pos_file);
+ engine_commit_pos= last_commit_pos_offset;
+ sql_print_information("tc-heuristic-recover: Initialising heuristic "
+ "rollback of binary log using last committed "
+ "transaction specific binary log name:%s and "
+ "position:%llu", last_commit_pos_file,
+ last_commit_pos_offset);
+ }
+ else
+ {
+ if ((error= get_binlog_checkpoint_file(checkpoint_file)))
+ {
+ is_checkpoint_based_recovery= true;
+ sql_print_error("tc-heuristic-recover: Failed to read latest checkpoint "
+ "binary log name.");
+ goto end;
+ }
+ strmake_buf(engine_commit_file, checkpoint_file);
+ sql_print_information("tc-heuristic-recover: Initialising heuristic "
+ "rollback of binary log using last checkpoint "
+ "file:%s.", engine_commit_file);
+ /*
+ If there is no engine specific commit file we are doing checkpoint file
+ based recovery. Hence we mark "found_engine_commit_pos" true, and start
+ looking for the first transactional event group with is the candidate for
+ rollback.
+ */
+ found_engine_commit_pos= true;
+ }
+
+ if ((error= find_log_pos(&log_info, engine_commit_file, 1)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to locate binary log file:%s "
+ "in index file. Error:%d", engine_commit_file, error);
+ goto end;
+ }
+ if ((file= open_binlog(&log, log_info.log_file_name, &errmsg)) < 0 ||
+ DBUG_EVALUATE_IF("fault_injection_opening_binlog", (errmsg="Unknown"),
+ FALSE))
+ {
+ error= 1;
+ sql_print_error("tc-heuristic-recover: Failed to open the binlog:%s for "
+ "recovery. Error:%s", log_info.log_file_name, errmsg);
+ goto end;
+ }
+
+
+ error= read_state_from_file();
+ if (error && error != 2)
+ {
+ sql_print_error("tc-heuristic-recover: Failed to load global gtid binlog "
+ "state from file");
+ goto end;
+ }
+ if (!fdle.is_valid())
+ {
+ error= 1;
+ sql_print_error("tc-heuristic-recover: Failed due to invalid format "
+ "description log event.");
+ goto end;
+ }
+
+ for(;;)
+ {
+ while (is_safe &&
+ (ev= Log_event::read_log_event(&log, 0, &fdle,
+ opt_master_verify_checksum)) && ev->is_valid())
+ {
+ enum Log_event_type typ= ev->get_type_code();
+ switch (typ)
+ {
+ case XID_EVENT:
+ if (ev->log_pos == engine_commit_pos)
+ {
+ found_engine_commit_pos= true;
+ }
+ break;
+ case GTID_LIST_EVENT:
+ {
+ Gtid_list_log_event *glev= (Gtid_list_log_event *)ev;
+ /* Initialise the binlog state from the Gtid_list event. */
+ if (!found_truncate_pos && glev->count > 0 &&
+ rpl_global_gtid_binlog_state.load(glev->list, glev->count))
+ {
+ error= 1;
+ sql_print_error("tc-heuristic-recover: Failed to read GTID List "
+ "event.");
+ goto end;
+ }
+ }
+ break;
+ case GTID_EVENT:
+ {
+ Gtid_log_event *gev= (Gtid_log_event *)ev;
+
+ /* Update the binlog state with any GTID logged after Gtid_list. */
+ last_gtid.domain_id= gev->domain_id;
+ last_gtid.server_id= gev->server_id;
+ last_gtid.seq_no= gev->seq_no;
+ last_gtid_standalone=
+ ((gev->flags2 & Gtid_log_event::FL_STANDALONE) ? true : false);
+ last_gtid_valid= true;
+ if (gev->flags2 & Gtid_log_event::FL_TRANSACTIONAL &&
+ !found_truncate_pos)
+ {
+ if ((engine_commit_pos == 0 || found_engine_commit_pos))
+ {
+ found_truncate_pos= true;
+ strmake_buf(binlog_truncate_file_name, log_info.log_file_name);
+ binlog_truncate_pos= tmp_truncate_pos;
+ }
+ }
+ else
+ {
+ if (found_truncate_pos)
+ is_safe= false;
+ }
+ }
+ break;
+ default:
+ /* Nothing. */
+ break;
+ }// End switch
+ if (!found_truncate_pos && last_gtid_valid &&
+ ((last_gtid_standalone && !ev->is_part_of_group(typ)) ||
+ (!last_gtid_standalone &&
+ (typ == XID_EVENT ||
+ (typ == QUERY_EVENT &&
+ (((Query_log_event *)ev)->is_commit() ||
+ ((Query_log_event *)ev)->is_rollback()))))))
+ {
+ if ((error= rpl_global_gtid_binlog_state.update_nolock(&last_gtid,
+ false)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to update GTID within "
+ "global gtid state.");
+ goto end;
+ }
+ last_gtid_valid= false;
+ }
+ // Used to identify the last group specific end position.
+ tmp_truncate_pos= ev->log_pos;
+ delete ev;
+ ev= NULL;
+ }// End While
+ if (file >= 0)
+ {
+ end_io_cache(&log);
+ mysql_file_close(file, MYF(MY_WME));
+ file= -1;
+ }
+ if (is_safe)
+ {
+ if ((error= find_next_log(&log_info, 1)))
+ {
+ if (error != LOG_INFO_EOF)
+ {
+ sql_print_error("tc-heuristic-recover: Failed to read next binary "
+ "log during recovery.");
+ goto end;
+ }
+ else
+ {
+ error= 0; // LOG_INFO_EOF= -1 is not an error.
+ break;
+ }
+ }
+ if ((file= open_binlog(&log, log_info.log_file_name, &errmsg)) < 0)
+ {
+ error= 1;
+ sql_print_error("tc-heuristic-recover: Failed to open the binlog:%s for "
+ "recovery. Error:%s", log_info.log_file_name, errmsg);
+ goto end;
+ }
+ }
+ else
+ break;
+ } //end of for(;;)
+ sql_print_information("tc-heuristic-recover: Binary log to be truncated "
+ "File:%s Pos:%llu.", binlog_truncate_file_name,
+ binlog_truncate_pos);
+ if (!found_truncate_pos)
+ goto end; // Nothing to truncate
+ else
+ DBUG_ASSERT(binlog_truncate_pos > 0);
+
+ if (is_safe)
+ {
+ if ((error= truncate_and_remove_binlogs(binlog_truncate_file_name,
+ binlog_truncate_pos)))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to trim the binary log to "
+ "File:%s Pos:%llu.", binlog_truncate_file_name,
+ binlog_truncate_pos);
+ goto end;
+ }
+ }
+ else
+ {
+ sql_print_error("tc-heuristic-recover cannot trim the binary log to "
+ "File:%s Pos:%llu as unsafe statements (non-trans/DDL) "
+ "statements are found beyond the truncation position.",
+ binlog_truncate_file_name, binlog_truncate_pos);
+ }
+ if ((error= write_state_to_file()))
+ {
+ sql_print_error("tc-heuristic-recover: Failed to write global gtid state "
+ "to file");
+ goto end;
+ }
+
+end:
+ if (file >= 0)
+ {
+ end_io_cache(&log);
+ mysql_file_close(file, MYF(MY_WME));
+ }
+ if (error)
+ {
+ if (is_checkpoint_based_recovery)
+ {
+ sql_print_error("tc-heuristic-recover failed during binary log rollback."
+ "Further retry attemps won't succeed.");
+ }
+ else
+ sql_print_error("tc-heuristic-recover failed during binary log rollback. "
+ "After removing the source of the failure "
+ "--tc-heuristic-recover=rollback may be retried.");
+ }
+
+#endif
+ return error;
+}
+
int TC_LOG_BINLOG::open(const char *opt_name)
{
int error= 1;
@@ -9526,7 +10103,7 @@ int TC_LOG_BINLOG::open(const char *opt_name)
return 1;
}
- if (using_heuristic_recover())
+ if (using_heuristic_recover(opt_name))
{
mysql_mutex_lock(&LOCK_log);
/* generate a new binlog to mask a corrupted one */
diff --git a/sql/log.h b/sql/log.h
index 277e5c6f69c..84c0a457b52 100644
--- a/sql/log.h
+++ b/sql/log.h
@@ -41,7 +41,8 @@ bool stmt_has_updated_non_trans_table(const THD* thd);
class TC_LOG
{
public:
- int using_heuristic_recover();
+ int using_heuristic_recover(const char* opt_name);
+ virtual int heuristic_binlog_rollback() { return 0; };
TC_LOG() {}
virtual ~TC_LOG() {}
@@ -694,6 +695,7 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
void commit_checkpoint_notify(void *cookie);
int recover(LOG_INFO *linfo, const char *last_log_name, IO_CACHE *first_log,
Format_description_log_event *fdle, bool do_xa);
+ int heuristic_binlog_rollback();
int do_binlog_recovery(const char *opt_name, bool do_xa_recovery);
#if !defined(MYSQL_CLIENT)
@@ -794,6 +796,9 @@ class MYSQL_BIN_LOG: public TC_LOG, private MYSQL_LOG
int purge_first_log(Relay_log_info* rli, bool included);
int set_purge_index_file_name(const char *base_file_name);
int open_purge_index_file(bool destroy);
+ bool truncate_and_remove_binlogs(const char *truncate_file,
+ my_off_t truncate_pos);
+ int get_binlog_checkpoint_file(char* checkpoint_file);
bool is_inited_purge_index_file();
int close_purge_index_file();
int clean_purge_index_file();
diff --git a/sql/mysqld.h b/sql/mysqld.h
index e939524dbff..515d5f39c3d 100644
--- a/sql/mysqld.h
+++ b/sql/mysqld.h
@@ -94,6 +94,7 @@ extern "C" MYSQL_PLUGIN_IMPORT CHARSET_INFO *system_charset_info;
extern MYSQL_PLUGIN_IMPORT CHARSET_INFO *files_charset_info ;
extern MYSQL_PLUGIN_IMPORT CHARSET_INFO *national_charset_info;
extern MYSQL_PLUGIN_IMPORT CHARSET_INFO *table_alias_charset;
+extern MYSQL_PLUGIN_IMPORT bool opt_bin_log;
/**
Character set of the buildin error messages loaded from errmsg.sys.
@@ -104,7 +105,7 @@ extern CHARSET_INFO *character_set_filesystem;
extern MY_BITMAP temp_pool;
extern bool opt_large_files, server_id_supplied;
-extern bool opt_update_log, opt_bin_log, opt_error_log;
+extern bool opt_update_log, opt_error_log;
extern my_bool opt_log, opt_bootstrap;
extern my_bool opt_backup_history_log;
extern my_bool opt_backup_progress_log;
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 4de2cdbeaec..552348ee2c5 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -16493,6 +16493,13 @@ innobase_xa_recover(
{
DBUG_ASSERT(hton == innodb_hton_ptr);
+ if (opt_bin_log)
+ {
+ mysql_bin_log.last_commit_pos_offset= trx_sys_mysql_bin_log_pos;
+ strmake_buf(mysql_bin_log.last_commit_pos_file,
+ trx_sys_mysql_bin_log_name);
+ }
+
if (len == 0 || xid_list == NULL) {
return(0);
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 3b3c7c23224..c8a742d95e9 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2,7 +2,7 @@
Copyright (c) 1997, 2017, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2012, Facebook Inc.
-Copyright (c) 2013, 2019, MariaDB Corporation.
+Copyright (c) 2013, 2020, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -3301,7 +3301,8 @@ void
recv_recovery_from_checkpoint_finish(void)
/*======================================*/
{
- if (recv_needed_recovery) {
+ extern MYSQL_PLUGIN_IMPORT bool opt_bin_log;
+ if (opt_bin_log) {
trx_sys_print_mysql_master_log_pos();
trx_sys_print_mysql_binlog_offset();
}
diff --git a/storage/xtradb/handler/ha_innodb.cc b/storage/xtradb/handler/ha_innodb.cc
index 2aafb1a44ee..a966f726a03 100644
--- a/storage/xtradb/handler/ha_innodb.cc
+++ b/storage/xtradb/handler/ha_innodb.cc
@@ -17165,6 +17165,13 @@ innobase_xa_recover(
{
DBUG_ASSERT(hton == innodb_hton_ptr);
+ if (opt_bin_log)
+ {
+ mysql_bin_log.last_commit_pos_offset= trx_sys_mysql_bin_log_pos;
+ strmake_buf(mysql_bin_log.last_commit_pos_file,
+ trx_sys_mysql_bin_log_name);
+ }
+
if (len == 0 || xid_list == NULL) {
return(0);
diff --git a/storage/xtradb/log/log0recv.cc b/storage/xtradb/log/log0recv.cc
index dd55d31218a..a4844507328 100644
--- a/storage/xtradb/log/log0recv.cc
+++ b/storage/xtradb/log/log0recv.cc
@@ -2,7 +2,7 @@
Copyright (c) 1997, 2017, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2012, Facebook Inc.
-Copyright (c) 2013, 2019, MariaDB Corporation.
+Copyright (c) 2013, 2020, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -3397,7 +3397,8 @@ void
recv_recovery_from_checkpoint_finish(void)
/*======================================*/
{
- if (recv_needed_recovery) {
+ extern MYSQL_PLUGIN_IMPORT bool opt_bin_log;
+ if (opt_bin_log) {
trx_sys_print_mysql_master_log_pos();
trx_sys_print_mysql_binlog_offset();
}
1
0
Re: [Maria-developers] 784cc5970dd: MDEV-19650: Privilege bug on MariaDB 10.4
by Sergei Golubchik 21 Apr '20
by Sergei Golubchik 21 Apr '20
21 Apr '20
Hi, Oleksandr!
On Apr 21, Oleksandr Byelkin wrote:
> >
> OK, INSERT probably is not needed, but we have tons (more than 40 I think)
> of DELETE in our tests (and so UPDATE should work).
Yes, the view is delete-able and some columns are update-able.
So, let's keep these privileges.
INSERT and LOAD never worked.
> Also we have not LOAD but this:
>
> rpl.rpl_current_user 'stmt' w7 [ fail ]
> Test ended at 2020-04-21 09:34:44
>
> CURRENT_TEST: rpl.rpl_current_user
> mysqltest: In included file "./include/diff_tables.inc":
> included from
> /home/sanja/maria/git/10.4/mysql-test/suite/rpl/t/rpl_current_user.test at
> line 65:
> At line 170: query 'SELECT * INTO OUTFILE '$_dt_outfile' FROM
> $_dt_database.$_dt_table ORDER BY `$_dt_column_list`' failed: 1356: View
> 'test.v_user' references invalid table(s) or column(s) or function(s) or
> definer/invoker of view lack rights to use them
This must be a bug. If one does
SELECT * INTO OUTFILE FROM some.view
than the current user should have FILE, not the view definer.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
1
0
Re: [Maria-developers] 784cc5970dd: MDEV-19650: Privilege bug on MariaDB 10.4
by Sergei Golubchik 20 Apr '20
by Sergei Golubchik 20 Apr '20
20 Apr '20
Hi, Oleksandr!
On Apr 20, Oleksandr Byelkin wrote:
> > > > > +
> > > > > +CREATE TEMPORARY TABLE tmp_user_sys LIKE global_priv;
> > > > > +INSERT INTO tmp_user_sys (Host,User,Priv) VALUES ('localhost','mariadb.sys','{"access":512,"plugin":"mysql_native_password","authentication_string":"","account_locked":true,"password_last_changed":0}');
> > > > > +INSERT INTO global_priv SELECT * FROM tmp_user_sys WHERE NOT @had_sys_user;
> > > > > +DROP TABLE tmp_user_sys;
> > > >
> > > > 1. This could've been simply INSERT IGNORE, I suspect
> > >
> > > Nope, the idea is do not insert more than needed.
> >
> > Why would INSERT IGNORE insert more than needed?
>
> I thought you propose insert ignore instead of where clause, in any case I
> do not see why it should help.
I mean, instead of four lines above, I think, one can simply do
INSERT IGNORE global_priv (Host,User,Priv) VALUES ('localhost','mariadb.sys','{"access":512,"plugin":"mysql_native_password","authentication_string":"","account_locked":true,"password_last_changed":0}');
> > > 2. why access:512 ? It's FILE_ACL, iirc.
> > >
> > > Because LOAD used in tests and so in reality probably, so we need
> > FILE_ACL
> >
> > We need FILE_ACL, but why mariadb.sys account needs it?
> > mysql.user is a read-only view of the mysql.global_priv table,
> > its owner doesn't need FILE or INSERT/UPDATE/DELETE.
>
> As I told we somewhere in test suite (not main, but I don't remember
> exactly) test which require the permission otherwise I would not return and
> add more rights.
Where? The view is clearly not insertable-into, so INSERT privilege is
meaningless. Some fields are updateable, and one can delele from it.
This was more accidental than intentional, but ok, let's not change it
in 10.4.
And why does it need FILE privilege? The view owner cannot possibly do
any LOAD DATA.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
1
0
Re: [Maria-developers] 784cc5970dd: MDEV-19650: Privilege bug on MariaDB 10.4
by Sergei Golubchik 20 Apr '20
by Sergei Golubchik 20 Apr '20
20 Apr '20
Hi, Oleksandr!
On Apr 20, Oleksandr Byelkin wrote:
> > Hi, Oleksandr!
> >
> > On Apr 20, Oleksandr Byelkin wrote:
> > > revision-id: 784cc5970dd (mariadb-10.4.11-68-g784cc5970dd)
> > > parent(s): c5e00fea102
> > > author: Oleksandr Byelkin <sanja(a)mariadb.com>
> > > committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> > > timestamp: 2020-02-20 14:06:09 +0100
> > > message:
> > >
> > > MDEV-19650: Privilege bug on MariaDB 10.4
> > >
> > > diff --git a/scripts/mysql_system_tables.sql
> > > index 29f2a4c1ef6..af852444d0c 100644
> > > --- a/scripts/mysql_system_tables.sql
> > > +++ b/scripts/mysql_system_tables.sql
> > > @@ -33,9 +33,17 @@ CREATE TABLE IF NOT EXISTS db ( Host char(60)
> > binary DEFAULT '' NOT NULL, Db c
> > > -- Remember for later if db table already existed
> > > set @had_db_table= @@warning_count != 0;
> > >
> > > -CREATE TABLE IF NOT EXISTS global_priv (Host char(60) binary DEFAULT
> > '', User char(80) binary DEFAULT '', Priv JSON NOT NULL DEFAULT '{}'
> > CHECK(JSON_VALID(Priv)), PRIMARY KEY Host (Host,User)) engine=Aria
> > transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Users and
> > global privileges';
> > > +CREATE TABLE IF NOT EXISTS global_priv (Host char(60) binary DEFAULT
> > '', User char(80) binary DEFAULT '', Priv JSON NOT NULL DEFAULT '{}'
> > CHECK(JSON_VALID(Priv)), PRIMARY KEY (Host,User)) engine=Aria
> > transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Users and
> > global privileges';
> > >
> > > -CREATE DEFINER=root@localhost SQL SECURITY DEFINER VIEW IF NOT EXISTS
> > user AS SELECT
> > > +set @had_sys_user= @@warning_count != 0 OR 0 <> (select count(*) from
> > mysql.global_priv where Host="localhost" and User="mariadb.sys");
> > > +
> > > +CREATE TEMPORARY TABLE tmp_user_sys LIKE global_priv;
> > > +INSERT INTO tmp_user_sys (Host,User,Priv) VALUES
> > ('localhost','mariadb.sys','{"access":512,"plugin":"mysql_native_password","authentication_string":"","account_locked":true,"password_last_changed":0}');
> > > +INSERT INTO global_priv SELECT * FROM tmp_user_sys WHERE NOT
> > @had_sys_user;
> > > +DROP TABLE tmp_user_sys;
> >
> > 1. This could've been simply INSERT IGNORE, I suspect
>
> Nope, the idea is do not insert more than needed.
Why would INSERT IGNORE insert more than needed?
> 2. why access:512 ? It's FILE_ACL, iirc.
>
> Because LOAD used in tests and so in reality probably, so we need FILE_ACL
We need FILE_ACL, but why mariadb.sys account needs it?
mysql.user is a read-only view of the mysql.global_priv table,
its owner doesn't need FILE or INSERT/UPDATE/DELETE.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
1
0
Re: [Maria-developers] 784cc5970dd: MDEV-19650: Privilege bug on MariaDB 10.4
by Sergei Golubchik 20 Apr '20
by Sergei Golubchik 20 Apr '20
20 Apr '20
Hi, Oleksandr!
On Apr 20, Oleksandr Byelkin wrote:
> revision-id: 784cc5970dd (mariadb-10.4.11-68-g784cc5970dd)
> parent(s): c5e00fea102
> author: Oleksandr Byelkin <sanja(a)mariadb.com>
> committer: Oleksandr Byelkin <sanja(a)mariadb.com>
> timestamp: 2020-02-20 14:06:09 +0100
> message:
>
> MDEV-19650: Privilege bug on MariaDB 10.4
>
> Also fixes:
> MDEV-21487: Implement option for mysql_upgrade that allows root@localhost to be replaced
> MDEV-21486: Implement option for mysql_install_db that allows root@localhost to be replaced
>
> Add user mariadb.sys to be definer of user view
> (and has right on underlying table global_priv for
> required operation over global_priv
> (SELECT,UPDATE,DELETE,INSERT,FILE))
>
> Also changed definer of gis functions in case of creation,
> but they work with any definer so upgrade script do not try
> to push this change.
>
> diff --git a/scripts/mysql_system_tables.sql b/scripts/mysql_system_tables.sql
> index 29f2a4c1ef6..af852444d0c 100644
> --- a/scripts/mysql_system_tables.sql
> +++ b/scripts/mysql_system_tables.sql
> @@ -33,9 +33,17 @@ CREATE TABLE IF NOT EXISTS db ( Host char(60) binary DEFAULT '' NOT NULL, Db c
> -- Remember for later if db table already existed
> set @had_db_table= @@warning_count != 0;
>
> -CREATE TABLE IF NOT EXISTS global_priv (Host char(60) binary DEFAULT '', User char(80) binary DEFAULT '', Priv JSON NOT NULL DEFAULT '{}' CHECK(JSON_VALID(Priv)), PRIMARY KEY Host (Host,User)) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Users and global privileges';
> +CREATE TABLE IF NOT EXISTS global_priv (Host char(60) binary DEFAULT '', User char(80) binary DEFAULT '', Priv JSON NOT NULL DEFAULT '{}' CHECK(JSON_VALID(Priv)), PRIMARY KEY (Host,User)) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Users and global privileges';
>
> -CREATE DEFINER=root@localhost SQL SECURITY DEFINER VIEW IF NOT EXISTS user AS SELECT
> +set @had_sys_user= @@warning_count != 0 OR 0 <> (select count(*) from mysql.global_priv where Host="localhost" and User="mariadb.sys");
> +
> +CREATE TEMPORARY TABLE tmp_user_sys LIKE global_priv;
> +INSERT INTO tmp_user_sys (Host,User,Priv) VALUES ('localhost','mariadb.sys','{"access":512,"plugin":"mysql_native_password","authentication_string":"","account_locked":true,"password_last_changed":0}');
> +INSERT INTO global_priv SELECT * FROM tmp_user_sys WHERE NOT @had_sys_user;
> +DROP TABLE tmp_user_sys;
1. This could've been simply INSERT IGNORE, I suspect
2. why access:512 ? It's FILE_ACL, iirc.
> +
> +CREATE DEFINER='mariadb.sys'@'localhost' SQL SECURITY DEFINER VIEW IF NOT EXISTS user AS SELECT
> Host,
> User,
> IF(JSON_VALUE(Priv, '$.plugin') IN ('mysql_native_password', 'mysql_old_password'), IFNULL(JSON_VALUE(Priv, '$.authentication_string'), ''), '') AS Password,
> @@ -101,6 +109,11 @@ CREATE TABLE IF NOT EXISTS servers ( Server_name char(64) NOT NULL DEFAULT '', H
>
> CREATE TABLE IF NOT EXISTS tables_priv ( Host char(60) binary DEFAULT '' NOT NULL, Db char(64) binary DEFAULT '' NOT NULL, User char(80) binary DEFAULT '' NOT NULL, Table_name char(64) binary DEFAULT '' NOT NULL, Grantor char(141) DEFAULT '' NOT NULL, Timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, Table_priv set('Select','Insert','Update','Delete','Create','Drop','Grant','References','Index','Alter','Create View','Show view','Trigger','Delete versioning rows') COLLATE utf8_general_ci DEFAULT '' NOT NULL, Column_priv set('Select','Insert','Update','References') COLLATE utf8_general_ci DEFAULT '' NOT NULL, PRIMARY KEY (Host,Db,User,Table_name), KEY Grantor (Grantor) ) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Table privileges';
>
> +CREATE TEMPORARY TABLE tmp_user_sys LIKE tables_priv;
> +INSERT INTO tmp_user_sys (Host,Db,User,Table_name,Grantor,Timestamp,Table_priv) VALUES ('localhost','mysql','mariadb.sys','global_priv','root@localhost','0','Select,Insert,Update,Delete');
why Insert,Update,Delete ?
> +INSERT INTO tables_priv SELECT * FROM tmp_user_sys WHERE NOT @had_sys_user;
> +DROP TABLE tmp_user_sys;
> +
> CREATE TABLE IF NOT EXISTS columns_priv ( Host char(60) binary DEFAULT '' NOT NULL, Db char(64) binary DEFAULT '' NOT NULL, User char(80) binary DEFAULT '' NOT NULL, Table_name char(64) binary DEFAULT '' NOT NULL, Column_name char(64) binary DEFAULT '' NOT NULL, Timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, Column_priv set('Select','Insert','Update','References') COLLATE utf8_general_ci DEFAULT '' NOT NULL, PRIMARY KEY (Host,Db,User,Table_name,Column_name) ) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_bin comment='Column privileges';
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
1
0
Re: [Maria-developers] MDEV-17399 Add support for JSON_TABLE: for ORDINALITY
by Sergey Petrunia 19 Apr '20
by Sergey Petrunia 19 Apr '20
19 Apr '20
Hi Alexey,
The standard says
"An ordinality column provides a sequential numbering of rows. Row numbering is
1-based."
Consider this example from the standard:
https://gist.github.com/spetrunia/d4d8564a3ed26148ae92035b24e1f294
SELECT
jt.rowseq, jt.name, jt.zip
FROM
bookclub,
JSON_TABLE(bookclub.jcol, "lax $"
COLUMNS ( rowSeq FOR ORDINALITY,
name VARCHAR(30) PATH 'lax $.Name',
zip CHAR(5) PATH 'lax $.address.postalCode')
) AS jt
+--------+---------+------+
| rowseq | name | zip |
+--------+---------+------+
| 0 | John Sm | 1 |
| 1 | Peter W | 9 |
| 2 | James L | NULL |
+--------+---------+------+
The numbering is 0-based, instead of 1-based.
What is not entirely clear for me is when the numbering should be reset.
For the above example, the SQL standard shows the resultset with rowseq 1-2-3
which hints that the numbering is "global".
MySQL-8 produces "1-1-1", which looks like they reset the numbering for every
scan of the JSON_TABLE output.
What are you thoughts on this?
If the numbering should be "global", should it be query-level global, or
subselect-level global?
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
1
1
Re: [Maria-developers] 8742d176bc2: Added support for more functions when using partitioned S3 tables
by Sergei Golubchik 18 Apr '20
by Sergei Golubchik 18 Apr '20
18 Apr '20
Hi, Michael!
On Apr 13, Michael Widenius wrote:
> revision-id: 8742d176bc2 (mariadb-10.5.2-127-g8742d176bc2)
> parent(s): bf32018be96
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-04-09 16:41:42 +0300
> message:
>
> Added support for more functions when using partitioned S3 tables
>
> MDEV-22088 S3 partitioning support
>
> All ALTER PARTITION commands should now work on S3 tables except
>
> REBUILD PARTITION
> TRUNCATE PARTITION
> REORGANIZE PARTITION
>
> In addition, PARTIONED S3 TABLES can also be replicated.
> This is achived by storing the partition tables .frm and .par file on S3
> for partitioned shared (S3) tables.
>
> The discovery methods are enchanced by allowing engines that supports
> discovery to also support of the partitioned tables .frm and .par file
>
> Things in more detail
>
> - The .frm and .par files of partitioned tables are stored in S3 and kept
> in sync.
> - Added hton callback create_partitioning_metadata to inform handler
> that metadata for a partitoned file has changed
> - Added back handler::discover_check_version() to be able to check if
> a table's or a part table's definition has changed.
> - Added handler::check_if_updates_are_ignored(). Needed for partitioning.
> - Renamed rebind() -> rebind_psi(), as it was before.
> - Changed CHF_xxx hadnler flags to an enum
> - Changed some checks from using table->file->ht to use
> table->file->partition_ht() to get discovery to work with partitioning.
> - If TABLE_SHARE::init_from_binary_frm_image() fails, ensure that we
> don't leave any .frm or .par files around.
> - Fixed that writefrm() doesn't leave unusable .frm files around
> - Appended extension to path for writefrm() to be able to reuse to function
> for creating .par files.
> - Added DBUG_PUSH("") to a a few functions that caused a lot of not
> critical tracing.
>
> diff --git a/mysql-test/suite/s3/replication_partition.test b/mysql-test/suite/s3/replication_partition.test
> new file mode 100644
> index 00000000000..8a177f8f075
> --- /dev/null
> +++ b/mysql-test/suite/s3/replication_partition.test
> @@ -0,0 +1,170 @@
> +--source include/have_s3.inc
> +--source include/have_partition.inc
> +--source include/master-slave.inc
master-slave should always be included last, after all other have_xxx includes.
> +--source include/have_binlog_format_mixed.inc
> +--source include/have_innodb.inc
> +--source include/have_sequence.inc
> +--source create_database.inc
> +
> +connection slave;
> +let $MYSQLD_DATADIR= `select @@datadir`;
> +--replace_result $database database
> +--eval use $database
> +connection master;
> +
> +--echo #
> +--echo # Check replication of parititioned S3 tables
> +--echo #
> +
> +CREATE TABLE t1 (
> + c1 INT DEFAULT NULL
> +) ENGINE=Aria
> + PARTITION BY HASH (c1)
> + PARTITIONS 3;
> +INSERT INTO t1 VALUE (1), (2), (101), (102), (201), (202);
> +ALTER TABLE t1 ENGINE=S3;
> +ALTER TABLE t1 ADD PARTITION PARTITIONS 6;
> +select sum(c1) from t1;
> +ALTER TABLE t1 ADD COLUMN c INT;
> +select sum(c1) from t1;
> +sync_slave_with_master;
> +show create table t1;
> +select sum(c1) from t1;
> +connection master;
> +drop table t1;
> +
> +--echo #
> +--echo # Checking that the slave is keeping in sync with changed partitions
> +--echo #
> +
> +CREATE TABLE t1 (
> + c1 int primary key,
> + c2 int DEFAULT NULL
> +) ENGINE=InnoDB
> + PARTITION BY RANGE (c1)
> + (PARTITION p1 VALUES LESS THAN (200),
> + PARTITION p2 VALUES LESS THAN (300),
> + PARTITION p3 VALUES LESS THAN (400));
> +insert into t1 select seq*100,seq*100 from seq_1_to_3;
> +alter table t1 engine=S3;
> +show create table t1;
> +
> +sync_slave_with_master;
> +select sum(c1) from t1;
> +--file_exists $MYSQLD_DATADIR/$database/t1.frm
> +--file_exists $MYSQLD_DATADIR/$database/t1.par
> +stop slave;
> +connection master;
> +ALTER TABLE t1 ADD PARTITION (PARTITION p4 VALUES LESS THAN (500));
> +connection slave;
> +show create table t1;
> +select sum(c1) from t1;
> +start slave;
> +connection master;
> +sync_slave_with_master;
> +select sum(c1)+0 from t1;
> +stop slave;
> +
> +# .frm amd .par files should not exists on the salve as it has just seen the
> +# ALTER TABLE which cased the removal of the .frm and .par files. The table
> +# from the above "select sum()" came from table cache and was used as it's
> +# id matches the one in S3
> +--error 1
> +--file_exists $MYSQLD_DATADIR/$database/t1.frm
> +--error 1
> +--file_exists $MYSQLD_DATADIR/$database/t1.par
> +# Flushing the table cache will force the .frm and .par files to be
> +# re-generated
> +flush tables;
> +select sum(c1)+0 from t1;
> +--file_exists $MYSQLD_DATADIR/$database/t1.frm
> +--file_exists $MYSQLD_DATADIR/$database/t1.par
> +
> +connection master;
> +drop table t1;
> +connection slave;
> +--file_exists $MYSQLD_DATADIR/$database/t1.par
> +--replace_result $database database
> +--error ER_NO_SUCH_TABLE
> +select sum(c1) from t1;
> +--error 1
> +--file_exists $MYSQLD_DATADIR/$database/t1.par
> +start slave;
> +connection master;
> +
> +--echo #
> +--echo # Check altering partitioned table to S3 and back
> +--echo # Checks also rename partitoned table and drop partition
> +--echo #
> +
> +CREATE TABLE t2 (
> + c1 int primary key,
> + c2 int DEFAULT NULL
> +) ENGINE=InnoDB
> + PARTITION BY RANGE (c1)
> + (PARTITION p1 VALUES LESS THAN (200),
> + PARTITION p2 VALUES LESS THAN (300),
> + PARTITION p3 VALUES LESS THAN (400));
> +insert into t2 select seq*100,seq*100 from seq_1_to_3;
> +alter table t2 engine=S3;
> +rename table t2 to t1;
> +alter table t1 drop partition p1;
> +sync_slave_with_master;
> +select sum(c1) from t1;
> +connection master;
> +alter table t1 engine=innodb;
> +sync_slave_with_master;
> +select sum(c1) from t1;
> +connection master;
> +drop table t1;
> +
> +--echo #
> +--echo # Check that slaves ignores changes to S3 tables.
> +--echo #
> +
> +connection master;
> +CREATE TABLE t1 (
> + c1 int primary key,
> + c2 int DEFAULT NULL
> +) ENGINE=InnoDB
> + PARTITION BY RANGE (c1)
> + (PARTITION p1 VALUES LESS THAN (200),
> + PARTITION p2 VALUES LESS THAN (300),
> + PARTITION p3 VALUES LESS THAN (400));
> +insert into t1 select seq*100,seq*100 from seq_1_to_3;
> +create table t2 like t1;
> +alter table t2 remove partitioning;
> +insert into t2 values (450,450);
> +sync_slave_with_master;
> +stop slave;
> +connection master;
> +alter table t1 engine=s3;
> +alter table t2 engine=s3;
> +ALTER TABLE t1 ADD PARTITION (PARTITION p4 VALUES LESS THAN (500));
> +alter table t1 exchange partition p4 with table t2;
> +select count(*) from t1;
> +drop table t1,t2;
> +connection slave;
> +start slave;
> +connection master;
> +sync_slave_with_master;
> +--replace_result $database database
> +--error ER_NO_SUCH_TABLE
> +select sum(c1) from t1;
> +connection master;
> +
> +--echo #
> +--echo # Check slave binary log
> +--echo #
> +
> +sync_slave_with_master;
> +--let $binlog_database=$database
> +--source include/show_binlog_events.inc
> +connection master;
> +
> +--echo #
> +--echo # clean up
> +--echo #
> +--source drop_database.inc
> +sync_slave_with_master;
> +--source include/rpl_end.inc
> diff --git a/mysys/my_symlink.c b/mysys/my_symlink.c
> index cbee78a7f5c..323ae69a39c 100644
> --- a/mysys/my_symlink.c
> +++ b/mysys/my_symlink.c
> @@ -154,7 +154,8 @@ int my_realpath(char *to, const char *filename, myf MyFlags)
> original name but will at least be able to resolve paths that starts
> with '.'.
> */
> - DBUG_PRINT("error",("realpath failed with errno: %d", errno));
> + if (MyFlags)
> + DBUG_PRINT("error",("realpath failed with errno: %d", errno));
This is kind of against the concept of dbug. dbug's idea is that one should
not edit the code every time, but put DBUG points one and enable/disable them
at run-time.
In cases like that you can specify precisely what you want to see in the trace
with complicated command-line --debug string. But I usually just let it
log everything and then use dbug/remove_function_from_trace.pl script.
> my_errno=errno;
> if (MyFlags & MY_WME)
> my_error(EE_REALPATH, MYF(0), filename, my_errno);
> diff --git a/sql/discover.cc b/sql/discover.cc
> index e49a2a3b0c0..7f3fe73c155 100644
> --- a/sql/discover.cc
> +++ b/sql/discover.cc
> @@ -99,23 +99,23 @@ int readfrm(const char *name, const uchar **frmdata, size_t *len)
>
>
> /*
> - Write the content of a frm data pointer
> - to a frm file.
> + Write the content of a frm data pointer to a frm or par file.
really? writefrm() writes to a .par file?
may be rename it to writefile() then?
>
> - @param path path to table-file "db/name"
> - @param frmdata frm data
> - @param len length of the frmdata
> + @param path full path to table-file "db/name.frm" or .par
> + @param db Database name. Only used for my_error()
> + @param table Table name. Only used for my_error()
> + @param data data to write to file
> + @param len length of the data
>
> @retval
> 0 ok
> @retval
> - 2 Could not write file
> + <> 0 Could not write file. In this case the file is not created
> */
>
> int writefrm(const char *path, const char *db, const char *table,
> - bool tmp_table, const uchar *frmdata, size_t len)
> + bool tmp_table, const uchar *data, size_t len)
> {
> - char file_name[FN_REFLEN+1];
> int error;
> int create_flags= O_RDWR | O_TRUNC;
> DBUG_ENTER("writefrm");
> diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> index 0ec1f2138ab..b8b5085f389 100644
> --- a/sql/ha_partition.cc
> +++ b/sql/ha_partition.cc
> @@ -629,7 +629,8 @@ int ha_partition::rename_table(const char *from, const char *to)
>
> SYNOPSIS
> create_partitioning_metadata()
> - name Full path of table name
> + path Full path of new table name
> + old_p Full path of old table name
This is a rather meaningless phrase, there cannot be a "path" of a "name"
May be, write "A path to the new/old frm file but without the extension"
instead?
> create_info Create info generated for CREATE TABLE
>
> RETURN VALUE
> @@ -678,6 +681,19 @@ int ha_partition::create_partitioning_metadata(const char *path,
> DBUG_RETURN(1);
> }
> }
> +
> + /* m_part_info is only NULL when we failed to create a partition table */
How can m_part_info be NULL here?
create_handler_file() dereferences m_part_info unconditionally, if
m_part_info would've been NULL it'll crash.
> + if (m_part_info)
> + {
> + part= m_part_info->partitions.head();
> + if ((part->engine_type)->create_partitioning_metadata &&
> + ((part->engine_type)->create_partitioning_metadata)(path, old_path,
> + action_flag))
> + {
> + my_error(ER_CANT_CREATE_HANDLER_FILE, MYF(0));
> + DBUG_RETURN(1);
> + }
> + }
> DBUG_RETURN(0);
> }
>
> @@ -1604,6 +1620,7 @@ int ha_partition::prepare_new_partition(TABLE *tbl,
>
> if (!(file->ht->flags & HTON_CAN_READ_CONNECT_STRING_IN_PARTITION))
> tbl->s->connect_string= p_elem->connect_string;
> + create_info->options|= HA_CREATE_TMP_ALTER;
This looks wrong. The partition isn't created as a temporary file.
HA_CREATE_TMP_ALTER is used for autiding and perfschema to distinguish
between normal tables, temporary tables and these alter-tmp-tables.
I see that you've hijacked this flag and check it inside s3, but it wasn't
designed for that, so change s3 instead, not what the flag means.
I don't think you need that if inside s3 at all, if hton flag says
HTON_TEMPORARY_NOT_SUPPORTED, then s3 won't be asked to create a temporary
table and you don't need a check for that in create.
> if ((error= file->ha_create(part_name, tbl, create_info)))
> {
> /*
> @@ -2361,14 +2379,20 @@ uint ha_partition::del_ren_table(const char *from, const char *to)
> const char *to_path= NULL;
> uint i;
> handler **file, **abort_file;
> + THD *thd= ha_thd();
> DBUG_ENTER("ha_partition::del_ren_table");
>
> - if (get_from_handler_file(from, ha_thd()->mem_root, false))
> - DBUG_RETURN(TRUE);
> + if (get_from_handler_file(from, thd->mem_root, false))
> + DBUG_RETURN(my_errno ? my_errno : ENOENT);
> DBUG_ASSERT(m_file_buffer);
> DBUG_PRINT("enter", ("from: (%s) to: (%s)", from, to ? to : "(nil)"));
> name_buffer_ptr= m_name_buffer_ptr;
> +
> file= m_file;
> + /* The command should be logged with IF EXISTS if using a shared table */
> + if (m_file[0]->ht->flags & HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE)
> + thd->replication_flags|= OPTION_IF_EXISTS;
I don't think this and the whole thd->replication_flags is needed,
because in the sql_table.cc you can check directly
table->file->partition_ht()->flags && HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE
> +
> if (to == NULL)
> {
> /*
> @@ -2424,7 +2453,35 @@ uint ha_partition::del_ren_table(const char *from, const char *to)
> goto rename_error;
> }
> }
> +
> + /* Update .par file in the handlers that supports it */
> + if ((*m_file)->ht->create_partitioning_metadata)
> + {
> + if (to == NULL)
> + error= (*m_file)->ht->create_partitioning_metadata(NULL, from,
> + CHF_DELETE_FLAG);
> + else
> + error= (*m_file)->ht->create_partitioning_metadata(to, from,
> + CHF_RENAME_FLAG);
why not just
error= (*m_file)->ht->create_partitioning_metadata(to, from);
the engine can distingush CREATE from RENAME from DELETE by looking
at whether 'to' or `from` is NULL.
> + DBUG_EXECUTE_IF("failed_create_partitioning_metadata",
> + { my_message_sql(ER_OUT_OF_RESOURCES,"Simulated crash",MYF(0));
> + error= 1;
> + });
> + if (error)
> + {
> + if (to)
> + {
> + (void) handler::rename_table(to, from);
> + (void) (*m_file)->ht->create_partitioning_metadata(from, to,
> + CHF_RENAME_FLAG);
> + goto rename_error;
> + }
> + else
> + save_error=error;
> + }
> + }
> DBUG_RETURN(save_error);
> +
> rename_error:
> name_buffer_ptr= m_name_buffer_ptr;
> for (abort_file= file, file= m_file; file < abort_file; file++)
> @@ -3729,6 +3787,16 @@ int ha_partition::rebind()
> #endif /* HAVE_M_PSI_PER_PARTITION */
>
>
> +/*
> + Check if the table definition has changed for the part tables
> + We use the first partition for the check.
> +*/
> +
> +int ha_partition::discover_check_version()
> +{
> + return m_file[0]->discover_check_version();
> +}
1. generally, you have to do it for every open partition, not just for the
first one.
2. m_file[0] partition is not necessarily open, you need to use the first
open partition here (and everywhere)
> +
> /**
> Clone the open and locked partitioning handler.
>
> @@ -11382,6 +11450,12 @@ int ha_partition::end_bulk_delete()
> }
>
>
> +bool ha_partition::check_if_updates_are_ignored(const char *op) const
> +{
> + return (handler::check_if_updates_are_ignored(op) ||
> + ha_check_if_updates_are_ignored(table->in_use, partition_ht(), op));
no you don't need this virtual method at all, simply do
- if (ha_check_if_updates_are_ignored(ha_thd(), ht, "DROP"))
+ if (ha_check_if_updates_are_ignored(ha_thd(), partition_ht(), "DROP"))
that's why partition_ht() was created in the first place.
> +}
> +
> /**
> Perform initialization for a direct update request.
>
> diff --git a/sql/handler.h b/sql/handler.h
> index 8c45c64bec8..eadbf28229c 100644
> --- a/sql/handler.h
> +++ b/sql/handler.h
> @@ -3183,7 +3200,10 @@ class handler :public Sql_alloc
>
> public:
> virtual void unbind_psi();
> - virtual int rebind();
> + virtual void rebind_psi();
this doesn't have to be virtual anymore, because you
moved the check into the virtual discover_check_version()
> + /* Return error if definition doesn't match for already opened table */
> + virtual int discover_check_version() { return 0; }
> +
> /**
> Put the handler in 'batch' mode when collecting
> table io instrumented events.
> diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
> index 53ccdb5d4c3..b1756b83056 100644
> --- a/sql/sql_parse.cc
> +++ b/sql/sql_parse.cc
> @@ -5906,6 +5906,8 @@ mysql_execute_command(THD *thd)
> case SQLCOM_CALL:
> case SQLCOM_REVOKE:
> case SQLCOM_GRANT:
> + if (thd->variables.option_bits & OPTION_IF_EXISTS)
> + lex->create_info.set(DDL_options_st::OPT_IF_EXISTS);
1. please remove the same (now redundant) code from
Sql_cmd_alter_table::execute()
2. better put it only for the relevant case. Like this
case SQLCOM_ALTER_TABLE:
if (thd->variables.option_bits & OPTION_IF_EXISTS)
lex->create_info.set(DDL_options_st::OPT_IF_EXISTS);
/* fall through */
case SQLCOM_ANALYZE_TABLE:
...
> DBUG_ASSERT(lex->m_sql_cmd != NULL);
> res= lex->m_sql_cmd->execute(thd);
> DBUG_PRINT("result", ("res: %d killed: %d is_error: %d",
> diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc
> index 7385f9059e1..351464939e2 100644
> --- a/sql/sql_partition.cc
> +++ b/sql/sql_partition.cc
> @@ -7057,6 +7059,10 @@ uint fast_alter_partition_table(THD *thd, TABLE *table,
> lpt->pack_frm_data= NULL;
> lpt->pack_frm_len= 0;
>
> + /* Add IF EXISTS to binlog if shared table */
> + if (table->file->partition_ht()->flags & HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE)
> + thd->variables.option_bits|= OPTION_IF_EXISTS;
> +
No need to, you already have that in mysql_alter_table(). Just fix the check
there, like this:
if (!if_exists &&
- (table->s->db_type()->flags & HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE))
+ (table->file->partition_ht()->flags & HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE))
> if (table->file->alter_table_flags(alter_info->flags) &
> HA_PARTITION_ONE_PHASE)
> {
> diff --git a/sql/sql_partition_admin.cc b/sql/sql_partition_admin.cc
> index ed77c0938f3..d29c014bdc2 100644
> --- a/sql/sql_partition_admin.cc
> +++ b/sql/sql_partition_admin.cc
> @@ -529,14 +530,46 @@ bool Sql_cmd_alter_table_exchange_partition::
> table_list->mdl_request.set_type(MDL_SHARED_NO_WRITE);
> if (unlikely(open_tables(thd, &table_list, &table_counter, 0,
> &alter_prelocking_strategy)))
> + {
> + if (thd->lex->if_exists() &&
> + thd->get_stmt_da()->sql_errno() == ER_NO_SUCH_TABLE)
> + {
> + /*
> + ALTER TABLE IF EXISTS was used on not existing table
> + We have to log the query on a slave as the table may be a shared one
> + from the master and we need to ensure that the next slave can see
> + the statement as this slave may not have the table shared
> + */
> + thd->clear_error();
> + if (thd->slave_thread &&
> + write_bin_log(thd, true, thd->query(), thd->query_length()))
> + DBUG_RETURN(true);
> + my_ok(thd);
> + DBUG_RETURN(false);
> + }
I don't like that. There are many Sql_cmd_alter_table* classes and lots of
duplicated code. Okay, you've put your OPTION_IF_EXISTS check into the big
switch in the mysql_execute_command() and thus avoided duplicating that code.
But this if() is also in mysql_alter_table() (and may be more?)
I'm thinking that doing many Sql_cmd_alter_table* classes was a bad idea
which leads to code duplication. How can we fix that?
> DBUG_RETURN(true);
> + }
>
> part_table= table_list->table;
> swap_table= swap_table_list->table;
>
> + if (part_table->file->check_if_updates_are_ignored("ALTER"))
> + {
> + if (thd->slave_thread &&
> + write_bin_log_with_if_exists(thd, true, false, true))
> + DBUG_RETURN(true);
> + my_ok(thd);
> + DBUG_RETURN(false);
> + }
another duplicated one
> +
> if (unlikely(check_exchange_partition(swap_table, part_table)))
> DBUG_RETURN(TRUE);
>
> + /* Add IF EXISTS to binlog if shared table */
> + if (part_table->file->partition_ht()->flags &
> + HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE)
> + force_if_exists= 1;
and again
> +
> /* set lock pruning on first table */
> partition_name= alter_info->partition_names.head();
> if (unlikely(table_list->table->part_info->
> @@ -784,16 +821,51 @@ bool Sql_cmd_alter_table_truncate_partition::execute(THD *thd)
> #endif /* WITH_WSREP */
>
> if (open_tables(thd, &first_table, &table_counter, 0))
> - DBUG_RETURN(true);
> + {
> + if (thd->lex->if_exists() &&
> + thd->get_stmt_da()->sql_errno() == ER_NO_SUCH_TABLE)
> + {
> + /*
> + ALTER TABLE IF EXISTS was used on not existing table
> + We have to log the query on a slave as the table may be a shared one
> + from the master and we need to ensure that the next slave can see
> + the statement as this slave may not have the table shared
> + */
> + thd->clear_error();
> + if (thd->slave_thread &&
> + write_bin_log(thd, true, thd->query(), thd->query_length()))
> + DBUG_RETURN(TRUE);
> + my_ok(thd);
> + DBUG_RETURN(FALSE);
> + }
> + DBUG_RETURN(TRUE);
> + }
see?
>
> - if (!first_table->table || first_table->view ||
> - first_table->table->s->db_type() != partition_hton)
> + if (!first_table->table || first_table->view)
> {
> my_error(ER_PARTITION_MGMT_ON_NONPARTITIONED, MYF(0));
> DBUG_RETURN(TRUE);
> }
>
> -
> + if (first_table->table->file->check_if_updates_are_ignored("ALTER"))
> + {
> + if (thd->slave_thread &&
> + write_bin_log_with_if_exists(thd, true, false, 1))
> + DBUG_RETURN(true);
> + my_ok(thd);
> + DBUG_RETURN(false);
> + }
> +
> + if (first_table->table->s->db_type() != partition_hton)
> + {
> + my_error(ER_PARTITION_MGMT_ON_NONPARTITIONED, MYF(0));
> + DBUG_RETURN(TRUE);
> + }
why did you split the check for ER_PARTITION_MGMT_ON_NONPARTITIONED?
> +
> + if (first_table->table->file->partition_ht()->flags &
> + HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE)
> + force_if_exists= 1;
> +
> /*
> Prune all, but named partitions,
> to avoid excessive calls to external_lock().
> diff --git a/sql/table.cc b/sql/table.cc
> index fe096835144..c6b42b7ba36 100644
> --- a/sql/table.cc
> +++ b/sql/table.cc
> @@ -25,7 +25,8 @@
> // primary_key_name
> #include "sql_parse.h" // free_items
> #include "strfunc.h" // unhex_type2
> -#include "sql_partition.h" // mysql_unpack_partition,
> +#include "ha_partition.h" // PART_EXT
> + // mysql_unpack_partition,
also included below
> // fix_partition_func, partition_info
> #include "sql_base.h"
> #include "create_options.h"
> @@ -42,6 +43,7 @@
> #include "rpl_filter.h"
> #include "sql_cte.h"
> #include "ha_sequence.h"
> +#include "ha_partition.h"
also included above
> #include "sql_show.h"
> #include "opt_trace.h"
>
> @@ -620,6 +622,9 @@ enum open_frm_error open_table_def(THD *thd, TABLE_SHARE *share, uint flags)
> {
> DBUG_ASSERT(flags & GTS_TABLE);
> DBUG_ASSERT(flags & GTS_USE_DISCOVERY);
> + /* Delete .frm and .par files */
> + mysql_file_delete_with_symlink(key_file_frm, path, "", MYF(0));
> + strxmov(path, share->normalized_path.str, PAR_EXT, NullS);
> mysql_file_delete_with_symlink(key_file_frm, path, "", MYF(0));
no need to strxmov, this third "" argument is the file extension, you can directly do
mysql_file_delete_with_symlink(key_file_frm, share->normalized_path.str, PAR_EXT, MYF(0));
also it should be key_file_partition_ddl_log not key_file_frm
(it's not a ddl log, but sql_table.cc uses key_file_partition_ddl_log for par files)
> file= -1;
> }
> @@ -1679,12 +1687,13 @@ class Field_data_type_info_array
>
> 42..46 are unused since 5.0 (were for RAID support)
> Also, there're few unused bytes in forminfo.
> -
> */
>
> int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
> const uchar *frm_image,
> - size_t frm_length)
> + size_t frm_length,
> + const uchar *par_image,
> + size_t par_length)
I don't think writing .par file belongs here, why not to do it in the caller?
Or, better, don't send .par file to S3 and don't discover it, it'll make
everything much simpler. .par can be created from the .frm, see sql_table.cc:
file= mysql_create_frm_image(thd, orig_db, orig_table_name, create_info,
alter_info, create_table_mode, key_info,
key_count, frm);
...
if (file->ha_create_partitioning_metadata(path, NULL, CHF_CREATE_FLAG))
goto err;
> {
> TABLE_SHARE *share= this;
> uint new_frm_ver, field_pack_length, new_field_pack_flag;
> @@ -1715,24 +1724,31 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
> uint len;
> uint ext_key_parts= 0;
> plugin_ref se_plugin= 0;
> - bool vers_can_native= false;
> + bool vers_can_native= false, frm_created= 0;
> Field_data_type_info_array field_data_type_info_array;
> -
> MEM_ROOT *old_root= thd->mem_root;
> Virtual_column_info **table_check_constraints;
> extra2_fields extra2;
> -
> DBUG_ENTER("TABLE_SHARE::init_from_binary_frm_image");
>
> keyinfo= &first_keyinfo;
> thd->mem_root= &share->mem_root;
>
> - if (write && write_frm_image(frm_image, frm_length))
> - goto err;
> -
> if (frm_length < FRM_HEADER_SIZE + FRM_FORMINFO_SIZE)
> goto err;
>
> + if (write)
> + {
> +#ifdef WITH_PARTITION_STORAGE_ENGINE
> + if (par_image)
> + if (write_par_image(par_image, par_length))
> + goto err;
> +#endif
> + frm_created= 1;
> + if (write_frm_image(frm_image, frm_length))
> + goto err;
> + }
why not to create it at the end? then you won't need to delete it on an error.
> +
> share->frm_version= frm_image[2];
> /*
> Check if .frm file created by MySQL 5.0. In this case we want to
> diff --git a/storage/maria/ha_maria.cc b/storage/maria/ha_maria.cc
> index ed9c5404a7f..1f99fd9d895 100644
> --- a/storage/maria/ha_maria.cc
> +++ b/storage/maria/ha_maria.cc
> @@ -2724,7 +2724,7 @@ void ha_maria::drop_table(const char *name)
> {
> DBUG_ASSERT(file->s->temporary);
> (void) ha_close();
> - (void) maria_delete_table_files(name, 1, 0);
> + (void) maria_delete_table_files(name, 1, MY_WME);
MYF(MY_WME) as always?
> }
>
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
2
1
Re: [Maria-developers] 9243921c84b: Make all #sql temporary table names uniform
by Sergei Golubchik 18 Apr '20
by Sergei Golubchik 18 Apr '20
18 Apr '20
Hi, Michael!
On Apr 13, Michael Widenius wrote:
> revision-id: 9243921c84b (mariadb-10.5.2-128-g9243921c84b)
> parent(s): 8742d176bc2
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-04-09 22:02:06 +0300
> message:
>
> Make all #sql temporary table names uniform
>
> The reason for this is to make all temporary file names similar and
> also to be able to figure out from where a #sql-xxx name orginates.
>
> New format is for most cases:
> #sql-name-current_pid-thread_id[-increment]
> Where name is one of subselect, alter, exchange, temptable or backup
>
> The execptions are:
typo, "exceptions"
> ALTER PARTITION shadow files:
> #sql-shadow-'original_table_name'
Please, add a thread_id here at the end. normally MDL should ensure that
no two threads can have a shadow for the same table at the same time,
but we have enough bugs as it is to introduce another vector when two
threads can overwrite each other temp files.
> Names from the temp pool:
> #sql-name-current_pid-pool_number
Would you mind if I drop temp pool completely in 10.5?
It was added by Jeremy in January 2001 with the comment
Added --temp-pool option to mysqld. This will cause temporary files
created to use a small set of filenames, to try and avoid problems
in the Linux kernel.
And I doubt it's still an issue in 2020.
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
3
4
Re: [Maria-developers] faf8de3aa98: Fixed some assert crashes related to keyread.
by Sergei Golubchik 18 Apr '20
by Sergei Golubchik 18 Apr '20
18 Apr '20
Hi, Michael!
On Apr 13, Michael Widenius wrote:
> revision-id: faf8de3aa98 (mariadb-10.5.2-121-gfaf8de3aa98)
> parent(s): 3cbe15bd78c
> author: Michael Widenius <monty(a)mariadb.com>
> committer: Michael Widenius <monty(a)mariadb.com>
> timestamp: 2020-04-09 01:37:02 +0300
> message:
>
> Fixed some assert crashes related to keyread.
>
> - MDEV-22062 Assertion `!table->file->keyread_enabled()' failed in
> close_thread_table()
> - MDEV-22077 table->no_keyread .. failed in join_read_first()
>
> diff --git a/mysql-test/main/keyread.test b/mysql-test/main/keyread.test
> index d9d3002d392..76d0f5674dd 100644
> --- a/mysql-test/main/keyread.test
> +++ b/mysql-test/main/keyread.test
> @@ -8,3 +8,14 @@ create view v1 as select * from t1 where f2 = 1;
> select distinct f1 from v1;
> drop view v1;
> drop table t1;
> +
> +#
> +# MDEV-22062 Assertion `!table->file->keyread_enabled()' failed in
> +# close_thread_table
> +#
> +
> +CREATE TABLE t1 (a INT NOT NULL, UNIQUE(a)) ENGINE=InnoDB;
> +INSERT INTO t1 VALUES (1),(2);
> +DELETE FROM t1 ORDER BY a LIMIT 1;
> +SELECT * FROM t1;
> +DROP TABLE t1;
where's a test case for MDEV-22077?
> diff --git a/sql/sql_delete.cc b/sql/sql_delete.cc
> index 19eabbb053c..2fc0de4345f 100644
> --- a/sql/sql_delete.cc
> +++ b/sql/sql_delete.cc
> @@ -547,10 +547,12 @@ bool mysql_delete(THD *thd, TABLE_LIST *table_list, COND *conds,
> else
> {
> ha_rows scanned_limit= query_plan.scanned_rows;
> + table->no_keyread= 1;
> query_plan.index= get_index_for_order(order, table, select, limit,
> &scanned_limit,
> &query_plan.using_filesort,
> &reverse);
> + table->no_keyread= 0;
why?
> if (!query_plan.using_filesort)
> query_plan.scanned_rows= scanned_limit;
> }
> diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> index 129dae9eedb..1f05156b96f 100644
> --- a/sql/sql_select.cc
> +++ b/sql/sql_select.cc
> @@ -23584,7 +23584,7 @@ test_if_skip_sort_order(JOIN_TAB *tab,ORDER *order,ha_rows select_limit,
> If ref_key used index tree reading only ('Using index' in EXPLAIN),
> and best_key doesn't, then revert the decision.
> */
> - if (table->covering_keys.is_set(best_key))
> + if (table->covering_keys.is_set(best_key) && !table->no_keyread)
> table->file->ha_start_keyread(best_key);
> else
> table->file->ha_end_keyread();
> @@ -28568,8 +28568,6 @@ test_if_cheaper_ordering(const JOIN_TAB *tab, ORDER *order, TABLE *table,
> if (new_used_key_parts != NULL)
> *new_used_key_parts= best_key_parts;
> table->file->ha_end_keyread();
> - if (is_best_covering && !table->no_keyread)
> - table->file->ha_start_keyread(best_key);
I find it very confusing that test_if_cheaper_ordering has a side effect
of ending keyread. It was ok, when it ended previous keyread and started
a new one. But just disabling - it's looks very strange.
> DBUG_RETURN(TRUE);
> }
Regards,
Sergei
VP of MariaDB Server Engineering
and security(a)mariadb.org
2
3