[Maria-discuss] MariaDB 10.1: binlog stop updating

older
[Maria-discuss] “system user” in...

Benoit Plessis

20 Feb 2019 20 Feb '19

9:40 p.m.

Hi, I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now. We are experiencing two major issues with the production trafic: * We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime * If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write. 2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887 Does someone know if they have been improvement in 10.2/10.3 on theses issues ?

Show replies by date

Jeff Dyke

21 Feb 21 Feb

1:29 a.m.

Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset. Best, Jeff On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info> wrote:

...

Hi,

I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.

We are experiencing two major issues with the production trafic:

* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime

* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.

2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887

Does someone know if they have been improvement in 10.2/10.3 on theses issues ?

_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

Benoit Plessis

8:53 a.m.

Hi, Yes we are currently using mixed replication, we switched from statement two years ago. I had read mixed return about row based replication with some queries, so mixed looked like a good compromise. Reading current mysql (8.0) it look like RBR is the defaut now so it should be stable indeed. I need to check but i suppose i could test switch for mixed to row wihtout much difficulties. In RBR there isn't any need of tmp space ? I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir Best, Benoit On 21/02/2019 02:29, Jeff Dyke wrote:

...

Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset.

Best, Jeff

On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info <mailto:benoit%2Bone@plessis.info>> wrote:

Hi,

I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.

We are experiencing two major issues with the production trafic:

* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime

* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.

2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887

Does someone know if they have been improvement in 10.2/10.3 on theses issues ?

_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net <mailto:maria-discuss@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

Benoit Plessis

9:53 a.m.

Hum, it look like the issue is even more simple, if the tmp space is filled for too long, even if the mysql slave thread has nothing to do i end up with corrupted binlog ... On 21/02/2019 09:53, Benoit Plessis wrote:

...

Hi,

Yes we are currently using mixed replication, we switched from statement two years ago.

I had read mixed return about row based replication with some queries, so mixed looked like a good compromise. Reading current mysql (8.0) it look like RBR is the defaut now so it should be stable indeed.

I need to check but i suppose i could test switch for mixed to row wihtout much difficulties.

In RBR there isn't any need of tmp space ? I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir

Best, Benoit

On 21/02/2019 02:29, Jeff Dyke wrote:

...
Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset.

Best, Jeff

On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info <mailto:benoit%2Bone@plessis.info>> wrote:

Hi,

I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.

We are experiencing two major issues with the production trafic:

* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime

* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.

2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887

Does someone know if they have been improvement in 10.2/10.3 on theses issues ?

_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net <mailto:maria-discuss@lists.launchpad.net> Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

andrei.elkin＠pp.inet.fi

1:19 p.m.

Benoit, hello.

...

Hi,

Yes we are currently using mixed replication, we switched from statement two years ago.

I had read mixed return about row based replication with some queries, so mixed looked like a good compromise. Reading current mysql (8.0) it look like RBR is the defaut now so it should be stable indeed.

I need to check but i suppose i could test switch for mixed to row wihtout much difficulties.

In RBR there isn't any need of tmp space ?

It may use e.g through replication caches that hold statements of being executed transaction. To the issue itself

...

2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3;

the error was caused by ROW format event and therefore does not do with

...

I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir

the parameter above but indeed check out @@global.tmpdir location and size available. Perhaps the Cheers, Andrei

...

Best, Benoit

On 21/02/2019 02:29, Jeff Dyke wrote:

Hi based on one statement " If the replication process has to write a tmp file to execute the query", i'd just like to ask are you using STATEMENT or MIXED replication? If so do you have a reason not to use ROW? Ultimately, the last is preferred, and i realize this does not directly answer your question, but would be interested in starting here and would far most consistent and only take up space required for the row/dataset.

Best, Jeff

On Wed, Feb 20, 2019 at 4:40 PM Benoit Plessis <benoit+one@plessis.info> wrote:

Hi,

I have a few mysql cluster, previously on 5.5 (debian jessie) that we are upgrading in mariadb 10.1 (debian stretch) for now.

We are experiencing two major issues with the production trafic:

* We saw a huge increase in on disk tmp table space used (serveurs with 4Gb of free space in / (including /tmp) had to be added an additionnal 20Gb /tmp volume and even that is no enough everytime

* If the replication process has to write a tmp file to execute the query, and log slave updates is active then in the even of the /tmp volume being full, the update of the binlog will also fail erroneouslly with a "No space" error and will stop every following binlog write.

2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3; Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device"), Error_code: 1026; Writing one row to the row-based binary log failed, Error_code: 1534; handler error HA_ERR_RBR_LOGGING_FAILED; the event's master log mysql-bin.006843, end_log_pos 42848762, Gtid 0-70-467485, Internal MariaDB error code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device") Error_code: 3 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Error writing file '/var/lib/mysql/mysql-bin' (errno: 28 "No space left on device") Error_code: 1026 2019-02-20 15:24:12 140618945124096 [Warning] Slave: Writing one row to the row-based binary log failed Error_code: 15342019-02-20 15:24:12 140618945124096 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.006843' position 42814887

Does someone know if they have been improvement in 10.2/10.3 on theses issues ?

_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

Benoit Plessis

1:31 p.m.

Hello Andrei, On 21/02/2019 14:19, andrei.elkin@pp.inet.fi wrote:

...

[...] To the issue itself

...
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3;

the error was caused by ROW format event and therefore does not do with

yeah that's what i feared.

...

...
I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir the parameter above but indeed check out @@global.tmpdir location and size available.

Yes as i said initially, the binlog corruption issues are related to saturations in tmpdir location. The initial issue is this strange huge increase in tmpdir space with mariadb 10.1 were previously 4Gb was largely enough and now even 5 times this (20Gb) isn't anymore ...

...

Perhaps the

someone ate a few words here ;) Thanks for your insight anyway!

Justin Swanhart

10:40 p.m.

It is possible that the two versions have different query plans, with the new version requiring more temp space. It is also possible that different temporary table engines are in use. Check to make sure you are using aria for temp tables and not innodb.

...

On Feb 21, 2019, at 8:31 AM, Benoit Plessis <benoit+one@plessis.info> wrote:

Hello Andrei,

...
On 21/02/2019 14:19, andrei.elkin@pp.inet.fi wrote: [...] To the issue itself

...
2019-02-20 15:24:12 140618945124096 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table xxx.yyyy; Error writing file '/tmp/MLxClNzW' (Errcode: 28 "No space left on device"), Error_code: 3;

the error was caused by ROW format event and therefore does not do with

yeah that's what i feared.

...
...
I was looking at using a separate tmp space for the slave needs using slave_load_tmpdir the parameter above but indeed check out @@global.tmpdir location and size available.

Yes as i said initially, the binlog corruption issues are related to saturations in tmpdir location.

The initial issue is this strange huge increase in tmpdir space with mariadb 10.1 were previously 4Gb was largely enough and now even 5 times this (20Gb) isn't anymore ...

...
Perhaps the

someone ate a few words here ;)

Thanks for your insight anyway!

_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

Benoit Plessis

11:03 p.m.

On 21/02/2019 23:40, Justin Swanhart wrote:

...

It is possible that the two versions have different query plans, with the new version requiring more temp space. It is also possible that different temporary table engines are in use. Check to make sure you are using aria for temp tables and not innodb.

Yes i was using Aria for tmp storage engine, i switched to myisam like in mysql 5.5, and it seem to have a nice effect on tmp space used for now !

Benoit Plessis

22 Feb 22 Feb

9:45 a.m.

On 22/02/2019 00:03, Benoit Plessis wrote:

...

On 21/02/2019 23:40, Justin Swanhart wrote:

...
It is possible that the two versions have different query plans, with the new version requiring more temp space. It is also possible that different temporary table engines are in use. Check to make sure you are using aria for temp tables and not innodb.

Yes i was using Aria for tmp storage engine, i switched to myisam like in mysql 5.5, and it seem to have a nice effect on tmp space used for now !

It's going way better (for the /tmp space at least) with tmp_storage_engine as MyISAM, a request that take ~7Go of tmp storage with Aria storage fit nicely and "fastly" in the 512M memory table limit

2331

Age (days ago)

2333

Last active (days ago)

List overview

8 comments

4 participants

participants (4)

andrei.elkin＠pp.inet.fi
Benoit Plessis
Jeff Dyke
Justin Swanhart

[Maria-discuss] MariaDB 10.1: binlog stop updating

tags

participants (4)