[Maria-discuss] Can't find log after Galera's SST
Hi, Almost everytime when I join node to Galera cluster (MariaDB 10.1.13) I'm getting an error after SST: 2016-04-13 21:09:37 139932281419712 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin 2016-04-13 21:09:37 139932281419712 [ERROR] Binlog file '/var/log/mysql/mariadb-bin.000023' not found in binlog index, needed for recovery. Aborting. 2016-04-13 21:09:37 139932281419712 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} 2016-04-13 21:09:37 139932281419712 [ERROR] Can't init tc log 2016-04-13 21:09:37 139932281419712 [ERROR] Aborting As I see, almost every SST rotates a log file and downloads last (in example above I have mariadb-bin.000024), and not previous. But for recovering it needs previous. Sometimes I'm lucky and it restores. Anyone knows why it rotates a log and is it possible to force it to load more than last log file? Igor Mazur
What SST method are you using? Based on this I expect you are using rsync. I imagine someone knows more detailed answers here, but from my understanding: Internally the bin/wsrep_sst_rsync script indicates it needs "flush tables" to the server and the server runs this "sst_flush_tables()" function that effectively runs: FLUSH TABLES WITH READ LOCK + reload_acl_and_cache(thd, REFRESH_ENGINE_LOG | REFRESH_BINARY_LOG, ...) Which I think is equivalent to FLUSH ENGINE LOGS / FLUSH BINARY LOGS. I think around this point it also sets innodb_disallow_writes to stop InnoDB's background I/O (so rsync can be consistent). At that point the bin/wsrep_sst_rsync script graps the last $BINLOG_N_FILES binary logs, which I believe is hardcoded to '1' and seems to be evaluated after this flush condition. But internally I guess InnoDB can have a prepared transaction related to a previous binlog. I've wrestled with similar problems writing backup recovery routines for MariaDB + snapshot backups (typically via LVM snapshots), where I typically have to go 2 logs back and it seems to eliminate this kind of failure. You might be able to bump up BINLOG_N_FILES in /usr/bin/wsrep_sst_rsync (or wherever it is located) and see if that resolves your race condition. Alternatively I think percona-xtrabackup has solved this problem[1] a while back as well, so a different wsrep-sst-method may be an option. Although xtrabackup has its own problems - in particularly I've found that xtrabackup currently doesn't handle lost+found directories too well[2], so you either have to avoid having the datadir directly under the mountpoint or use some other filesystem (such as XFS) that doesn't create lost+found. I suppose if you're using encryption, xtrabackup is also not currently an option. :) ~Andrew [1] https://bugs.launchpad.net/percona-xtrabackup/+bug/1254227 [2] https://bugs.launchpad.net/percona-xtrabackup/+bug/1272329 On Wed, Apr 13, 2016 at 4:27 PM, Igor Mazur <igor.kozlov@gmail.com> wrote:
Hi,
Almost everytime when I join node to Galera cluster (MariaDB 10.1.13) I'm getting an error after SST:
2016-04-13 21:09:37 139932281419712 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin 2016-04-13 21:09:37 139932281419712 [ERROR] Binlog file '/var/log/mysql/mariadb-bin.000023' not found in binlog index, needed for recovery. Aborting. 2016-04-13 21:09:37 139932281419712 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} 2016-04-13 21:09:37 139932281419712 [ERROR] Can't init tc log 2016-04-13 21:09:37 139932281419712 [ERROR] Aborting
As I see, almost every SST rotates a log file and downloads last (in example above I have mariadb-bin.000024), and not previous. But for recovering it needs previous. Sometimes I'm lucky and it restores.
Anyone knows why it rotates a log and is it possible to force it to load more than last log file?
Igor Mazur
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Sorry, I've forgotten to say that I use xtrabackup. 2016-04-14 2:04 GMT+03:00 Andrew Garner <muzazzi@gmail.com>:
What SST method are you using? Based on this I expect you are using rsync.
I imagine someone knows more detailed answers here, but from my understanding:
Internally the bin/wsrep_sst_rsync script indicates it needs "flush tables" to the server and the server runs this "sst_flush_tables()" function that effectively runs:
FLUSH TABLES WITH READ LOCK + reload_acl_and_cache(thd, REFRESH_ENGINE_LOG | REFRESH_BINARY_LOG, ...)
Which I think is equivalent to FLUSH ENGINE LOGS / FLUSH BINARY LOGS. I think around this point it also sets innodb_disallow_writes to stop InnoDB's background I/O (so rsync can be consistent).
At that point the bin/wsrep_sst_rsync script graps the last $BINLOG_N_FILES binary logs, which I believe is hardcoded to '1' and seems to be evaluated after this flush condition. But internally I guess InnoDB can have a prepared transaction related to a previous binlog.
I've wrestled with similar problems writing backup recovery routines for MariaDB + snapshot backups (typically via LVM snapshots), where I typically have to go 2 logs back and it seems to eliminate this kind of failure. You might be able to bump up BINLOG_N_FILES in /usr/bin/wsrep_sst_rsync (or wherever it is located) and see if that resolves your race condition.
Alternatively I think percona-xtrabackup has solved this problem[1] a while back as well, so a different wsrep-sst-method may be an option. Although xtrabackup has its own problems - in particularly I've found that xtrabackup currently doesn't handle lost+found directories too well[2], so you either have to avoid having the datadir directly under the mountpoint or use some other filesystem (such as XFS) that doesn't create lost+found. I suppose if you're using encryption, xtrabackup is also not currently an option. :)
~Andrew
[1] https://bugs.launchpad.net/percona-xtrabackup/+bug/1254227 [2] https://bugs.launchpad.net/percona-xtrabackup/+bug/1272329
On Wed, Apr 13, 2016 at 4:27 PM, Igor Mazur <igor.kozlov@gmail.com> wrote:
Hi,
Almost everytime when I join node to Galera cluster (MariaDB 10.1.13) I'm getting an error after SST:
2016-04-13 21:09:37 139932281419712 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin 2016-04-13 21:09:37 139932281419712 [ERROR] Binlog file '/var/log/mysql/mariadb-bin.000023' not found in binlog index, needed for recovery. Aborting. 2016-04-13 21:09:37 139932281419712 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} 2016-04-13 21:09:37 139932281419712 [ERROR] Can't init tc log 2016-04-13 21:09:37 139932281419712 [ERROR] Aborting
As I see, almost every SST rotates a log file and downloads last (in example above I have mariadb-bin.000024), and not previous. But for recovering it needs previous. Sometimes I'm lucky and it restores.
Anyone knows why it rotates a log and is it possible to force it to load more than last log file?
Igor Mazur
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Sorry for the noise. XtraBackup does flush binary logs and does copy the last binary log into the backup when gtids are detected (i.e. always on mariadb). That's what's rotating your logs and it does only copy the last log (post-rotate). I don't think that's configurable at all. I was able to reproduce this in my environment with identical results to yours. MariaDB 10.1.13 / Xtrabackup 2.3.4 on Debian 8. I just started a busy loop inserting a row constantly into a table and forced SST: [Note] Recovering after a crash using /var/log/mysql/mysql-bin [ERROR] Binlog file '/var/log/mysql/mysql-bin.000149' not found in binlog index, needed for recovery. Aborting. [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} [ERROR] Can't init tc log ERROR] Aborting Actual binlog changes each time while i'm writing to the cluster. I did find that commenting out the bits from wsrep_sst_xtrabackup-v2 that copies the binlog back to /var/log/mysql/ avoids this error - I guess the presence of a binlog here starts the recovery process. I'm not sure what the ramifications are of skipping that recovery. This issue looks like it's already reported here: https://jira.mariadb.org/browse/MDEV-9423 I will add my comments there and hope we can find a solution. On Wed, Apr 13, 2016 at 11:50 PM, Igor Mazur <igor.kozlov@gmail.com> wrote:
Sorry, I've forgotten to say that I use xtrabackup.
2016-04-14 2:04 GMT+03:00 Andrew Garner <muzazzi@gmail.com>:
What SST method are you using? Based on this I expect you are using rsync.
I imagine someone knows more detailed answers here, but from my understanding:
Internally the bin/wsrep_sst_rsync script indicates it needs "flush tables" to the server and the server runs this "sst_flush_tables()" function that effectively runs:
FLUSH TABLES WITH READ LOCK + reload_acl_and_cache(thd, REFRESH_ENGINE_LOG | REFRESH_BINARY_LOG, ...)
Which I think is equivalent to FLUSH ENGINE LOGS / FLUSH BINARY LOGS. I think around this point it also sets innodb_disallow_writes to stop InnoDB's background I/O (so rsync can be consistent).
At that point the bin/wsrep_sst_rsync script graps the last $BINLOG_N_FILES binary logs, which I believe is hardcoded to '1' and seems to be evaluated after this flush condition. But internally I guess InnoDB can have a prepared transaction related to a previous binlog.
I've wrestled with similar problems writing backup recovery routines for MariaDB + snapshot backups (typically via LVM snapshots), where I typically have to go 2 logs back and it seems to eliminate this kind of failure. You might be able to bump up BINLOG_N_FILES in /usr/bin/wsrep_sst_rsync (or wherever it is located) and see if that resolves your race condition.
Alternatively I think percona-xtrabackup has solved this problem[1] a while back as well, so a different wsrep-sst-method may be an option. Although xtrabackup has its own problems - in particularly I've found that xtrabackup currently doesn't handle lost+found directories too well[2], so you either have to avoid having the datadir directly under the mountpoint or use some other filesystem (such as XFS) that doesn't create lost+found. I suppose if you're using encryption, xtrabackup is also not currently an option. :)
~Andrew
[1] https://bugs.launchpad.net/percona-xtrabackup/+bug/1254227 [2] https://bugs.launchpad.net/percona-xtrabackup/+bug/1272329
On Wed, Apr 13, 2016 at 4:27 PM, Igor Mazur <igor.kozlov@gmail.com> wrote:
Hi,
Almost everytime when I join node to Galera cluster (MariaDB 10.1.13) I'm getting an error after SST:
2016-04-13 21:09:37 139932281419712 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin 2016-04-13 21:09:37 139932281419712 [ERROR] Binlog file '/var/log/mysql/mariadb-bin.000023' not found in binlog index, needed for recovery. Aborting. 2016-04-13 21:09:37 139932281419712 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} 2016-04-13 21:09:37 139932281419712 [ERROR] Can't init tc log 2016-04-13 21:09:37 139932281419712 [ERROR] Aborting
As I see, almost every SST rotates a log file and downloads last (in example above I have mariadb-bin.000024), and not previous. But for recovering it needs previous. Sometimes I'm lucky and it restores.
Anyone knows why it rotates a log and is it possible to force it to load more than last log file?
Igor Mazur
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Great! Thank you! I've just found this issue at Jira. Will try to understand more. What is strange - sometimes it synchronizes, but mostly - don't. It might be related to the last version of MariaDB 10.1.13 But, I remember, that I had some similar issue last summer (there was MariaDB 10.1.8) - but at that time, simple restart helped and I forgot about an issue. 2016-04-14 11:28 GMT+03:00 Andrew Garner <muzazzi@gmail.com>:
Sorry for the noise.
XtraBackup does flush binary logs and does copy the last binary log into the backup when gtids are detected (i.e. always on mariadb). That's what's rotating your logs and it does only copy the last log (post-rotate). I don't think that's configurable at all.
I was able to reproduce this in my environment with identical results to yours. MariaDB 10.1.13 / Xtrabackup 2.3.4 on Debian 8. I just started a busy loop inserting a row constantly into a table and forced SST:
[Note] Recovering after a crash using /var/log/mysql/mysql-bin [ERROR] Binlog file '/var/log/mysql/mysql-bin.000149' not found in binlog index, needed for recovery. Aborting. [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} [ERROR] Can't init tc log ERROR] Aborting
Actual binlog changes each time while i'm writing to the cluster.
I did find that commenting out the bits from wsrep_sst_xtrabackup-v2 that copies the binlog back to /var/log/mysql/ avoids this error - I guess the presence of a binlog here starts the recovery process. I'm not sure what the ramifications are of skipping that recovery.
This issue looks like it's already reported here: https://jira.mariadb.org/browse/MDEV-9423
I will add my comments there and hope we can find a solution.
On Wed, Apr 13, 2016 at 11:50 PM, Igor Mazur <igor.kozlov@gmail.com> wrote:
Sorry, I've forgotten to say that I use xtrabackup.
2016-04-14 2:04 GMT+03:00 Andrew Garner <muzazzi@gmail.com>:
What SST method are you using? Based on this I expect you are using
rsync.
I imagine someone knows more detailed answers here, but from my understanding:
Internally the bin/wsrep_sst_rsync script indicates it needs "flush tables" to the server and the server runs this "sst_flush_tables()" function that effectively runs:
FLUSH TABLES WITH READ LOCK + reload_acl_and_cache(thd, REFRESH_ENGINE_LOG | REFRESH_BINARY_LOG,
...)
Which I think is equivalent to FLUSH ENGINE LOGS / FLUSH BINARY LOGS. I think around this point it also sets innodb_disallow_writes to stop InnoDB's background I/O (so rsync can be consistent).
At that point the bin/wsrep_sst_rsync script graps the last $BINLOG_N_FILES binary logs, which I believe is hardcoded to '1' and seems to be evaluated after this flush condition. But internally I guess InnoDB can have a prepared transaction related to a previous binlog.
I've wrestled with similar problems writing backup recovery routines for MariaDB + snapshot backups (typically via LVM snapshots), where I typically have to go 2 logs back and it seems to eliminate this kind of failure. You might be able to bump up BINLOG_N_FILES in /usr/bin/wsrep_sst_rsync (or wherever it is located) and see if that resolves your race condition.
Alternatively I think percona-xtrabackup has solved this problem[1] a while back as well, so a different wsrep-sst-method may be an option. Although xtrabackup has its own problems - in particularly I've found that xtrabackup currently doesn't handle lost+found directories too well[2], so you either have to avoid having the datadir directly under the mountpoint or use some other filesystem (such as XFS) that doesn't create lost+found. I suppose if you're using encryption, xtrabackup is also not currently an option. :)
~Andrew
[1] https://bugs.launchpad.net/percona-xtrabackup/+bug/1254227 [2] https://bugs.launchpad.net/percona-xtrabackup/+bug/1272329
On Wed, Apr 13, 2016 at 4:27 PM, Igor Mazur <igor.kozlov@gmail.com>
wrote:
Hi,
Almost everytime when I join node to Galera cluster (MariaDB 10.1.13) I'm getting an error after SST:
2016-04-13 21:09:37 139932281419712 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin 2016-04-13 21:09:37 139932281419712 [ERROR] Binlog file '/var/log/mysql/mariadb-bin.000023' not found in binlog index, needed for recovery. Aborting. 2016-04-13 21:09:37 139932281419712 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback} 2016-04-13 21:09:37 139932281419712 [ERROR] Can't init tc log 2016-04-13 21:09:37 139932281419712 [ERROR] Aborting
As I see, almost every SST rotates a log file and downloads last (in example above I have mariadb-bin.000024), and not previous. But for recovering it needs previous. Sometimes I'm lucky and it restores.
Anyone knows why it rotates a log and is it possible to force it to load more than last log file?
Igor Mazur
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
participants (2)
-
Andrew Garner
-
Igor Mazur