[Maria-discuss] MariaDB 10.1: binlog stop updating
Hi, I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now. We are experiencing two major issues with the production trafic: * We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime * If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write. 2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887 Does someone know if they have been improvement in 10.2/10.3 on theses issues ?
Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset. Best, Jeff On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info> wrote:
Hi,
I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.
We are experiencing two major issues with the production trafic:
* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime
* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887
Does someone know if they have been improvement in 10.2/10.3 on theses issues ?
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hi, Yes we are currently using mixed replication, we switched from statement two years ago. I had read mixed return about row based replication with some queries, so mixed looked like a good compromise. Reading current mysql (8.0) it look like RBR is the defaut now so it should be stable indeed. I need to check but i suppose i could test switch for mixed to row wihtout much difficulties. In RBR there isn't any need of tmp space ? I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir Best, Benoit On 21/02/2019 02:29, Jeff Dyke wrote:
Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset.
Best, Jeff
On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info <mailto:benoit%2Bone@plessis.info>> wrote:
Hi,
I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.
We are experiencing two major issues with the production trafic:
* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime
* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887
Does someone know if they have been improvement in 10.2/10.3 on theses issues ?
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net <mailto:maria-discuss@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hum, it look like the issue is even more simple, if the tmp space is filled for too long, even if the mysql slave thread has nothing to do i end up with corrupted binlog ... On 21/02/2019 09:53, Benoit Plessis wrote:
Hi,
Yes we are currently using mixed replication, we switched from statement two years ago.
I had read mixed return about row based replication with some queries, so mixed looked like a good compromise. Reading current mysql (8.0) it look like RBR is the defaut now so it should be stable indeed.
I need to check but i suppose i could test switch for mixed to row wihtout much difficulties.
In RBR there isn't any need of tmp space ? I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir
Best, Benoit
On 21/02/2019 02:29, Jeff Dyke wrote:
Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset.
Best, Jeff
On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info <mailto:benoit%2Bone@plessis.info>> wrote:
Hi,
I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.
We are experiencing two major issues with the production trafic:
* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime
* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887
Does someone know if they have been improvement in 10.2/10.3 on theses issues ?
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net <mailto:maria-discuss@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Benoit, hello.
Hi,
Yes we are currently using mixed replication, we switched from statement two years ago.
I had read mixed return about row based replication with some queries, so mixed looked like a good compromise. Reading current mysql (8.0) it look like RBR is the defaut now so it should be stable indeed.
I need to check but i suppose i could test switch for mixed to row wihtout much difficulties.
In RBR there isn't any need of tmp space ?
It may use e.g through replication caches that hold statements of being executed transaction. To the issue itself
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3;
the error was caused by ROW format event and therefore does not do with
I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir
the parameter above but indeed check out @@global.tmpdir location and size available. Perhaps the Cheers, Andrei
Best, Benoit
On 21/02/2019 02:29, Jeff Dyke wrote:
Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset.
Best, Jeff
On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info> wrote:
Hi,
I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.
We are experiencing two major issues with the production trafic:
* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime
* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887
Does someone know if they have been improvement in 10.2/10.3 on theses issues ?
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hello Andrei, On 21/02/2019 14:19, andrei.elkin@pp.inet.fi wrote:
[...] To the issue itself
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3;
the error was caused by ROW format event and therefore does not do with
yeah that's what i feared.
I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir the parameter above but indeed check out @@global.tmpdir location and size available.
Yes as i said initially, the binlog corruption issues are related to saturations in tmpdir location. The initial issue is this strange huge increase in tmpdir space with mariadb 10.1 were previously 4Gb was largely enough and now even 5 times this (20Gb) isn't anymore ...
Perhaps the
someone ate a few words here ;) Thanks for your insight anyway!
It is possible that the two versions have different query plans, with the new version requiring more temp space. It is also possible that different temporary table engines are in use. Check to make sure you are using aria for temp tables and not innodb.
On Feb 21, 2019, at 8:31 AM, Benoit Plessis <benoit+one@plessis.info> wrote:
Hello Andrei,
On 21/02/2019 14:19, andrei.elkin@pp.inet.fi wrote: [...] To the issue itself
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3;
the error was caused by ROW format event and therefore does not do with
yeah that's what i feared.
I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir the parameter above but indeed check out @@global.tmpdir location and size available.
Yes as i said initially, the binlog corruption issues are related to saturations in tmpdir location.
The initial issue is this strange huge increase in tmpdir space with mariadb 10.1 were previously 4Gb was largely enough and now even 5 times this (20Gb) isn't anymore ...
Perhaps the
someone ate a few words here ;)
Thanks for your insight anyway!
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
On 21/02/2019 23:40, Justin Swanhart wrote:
It is possible that the two versions have different query plans, with the new version requiring more temp space. It is also possible that different temporary table engines are in use. Check to make sure you are using aria for temp tables and not innodb.
Yes i was using Aria for tmp storage engine, i switched to myisam like in mysql 5.5, and it seem to have a nice effect on tmp space used for now !
On 22/02/2019 00:03, Benoit Plessis wrote:
On 21/02/2019 23:40, Justin Swanhart wrote:
It is possible that the two versions have different query plans, with the new version requiring more temp space. It is also possible that different temporary table engines are in use. Check to make sure you are using aria for temp tables and not innodb.
Yes i was using Aria for tmp storage engine, i switched to myisam like in mysql 5.5, and it seem to have a nice effect on tmp space used for now !
It's going way better (for the /tmp space at least) with tmp_storage_engine as MyISAM, a request that take ~7Go of tmp storage with Aria storage fit nicely and "fastly" in the 512M memory table limit
participants (4)
-
andrei.elkin@pp.inet.fi
-
Benoit Plessis
-
Jeff Dyke
-
Justin Swanhart