[Maria-discuss] MariaDB server horribly slow on start
Hello all. I hope I'm at the right place to ask this question. I opened a bug here: https://jira.mariadb.org/browse/MDEV-28969, however I was told to use this mailing list instead.

We have 4 MariaDB servers in a Galera Cluster, and occasionally one of them has to be restarted (whether because of a crash, which I still have to open a bug for, or for maintenance). When that happens, the restarted server causes a huge slowdown on the whole cluster, and it lasts 10 to 30 minutes at the very least! And by huge, I mean huge: we end up with 500 to 800 pending queries on all servers, as you can see on the attached screenshots. I've attached the configuration of one of the servers for reference, in case it is the source of the issue.

Any way to solve this would be greatly appreciated.

Regards,
3C.
On 27 Jul 2022, at 11:58, William Edwards <wedwards@cyberfusion.nl> replied:
You seem to be focusing on effect. What is the cause? SST?
Hi,
On 27 Jul 2022, at 12:37, Cédric Counotte <cedric.counotte@1check.com> wrote:
Thanks for your reply!
If the server does an SST, the problem is far more dramatic than when it does an IST.
This morning one server crashed, and upon restarting it did an SST instead of an IST, and the issue was horrible. Even before becoming available, it blocked the donor for 15 minutes with messages like this one:
2022-07-27 12:02:42 7 [Note] WSREP: Processing event queue:... 20.9% ( 496/2376 events) complete.
Does the issue occur while these messages are logged?
For a while, the queue was even being processed more slowly than it was growing.
The same server crashed again, so I started another one and it did an SST. The problem was not as dramatic, but the event queue processing still lasted 5 minutes and blocked the donor completely for that time. On very rare occasions the SST does not cause such issues, but that is rare indeed (twice in 6 months, against 2 or 3 dozen occurrences of the issue), and I haven't changed any settings since!? Very confusing.
When servers do an SST, I usually kill the CHECK TABLE ... FOR UPGRADE statements that run, as they appear to slow things down even more.
Notably, this morning I had 3 servers running; one went haywire and caused another one to go down! I ended up with a single server, which I had to restart because it complained about not being wsrep ready.
It’s been a very bad day today as those 4 servers are in production and we received dozens of calls from our customers.
Again, I’d focus on cause. The effect is clear.
Now I'm back with 2 servers and will wait until tonight to restart the other 2 because of that issue.
IMO it's a bug, since on very rare occasions it starts smoothly. Still, I have found Galera to be unreliable, and my company is asking me to install a more reliable solution ASAP or we will lose customers! So any help would be much appreciated.
Whether something’s a bug is not an opinion.
I'm thinking of using 3 servers with replication instead, keeping load balancing by source IPs, but I'm worried that this might be less reliable. We have 2 spare servers in another location, synced with replication, but it has happened too often that after a server crash the replication would no longer start and had to be rebuilt entirely, which suggests that approach is not any more reliable either.
Sorry for the long story, but I’m no Galera expert
Then you could indeed wonder if your company should be using Galera …
and I’m having lots of issues I can’t find any info or solution about.
This is another issue I'm facing with replication, though it seems to be caused by the Galera cluster: https://jira.mariadb.org/browse/MDEV-29132
When processing the queue, the donor seems to block all queries. At least that's what it looked like, but maybe it is just even slower. I'm not sure what the cause is; I only notice this problem after a server restart, be it an SST or an IST.

We have scripts running every 1, 2, 3 and 5 minutes that process data on the DB, and for about 30 minutes after a server start I have to kill them or disable cron altogether to avoid worsening the issue. At some point I believed that was enough to cope with this slowness, but in fact it is not.

We are processing between 5,000 and 10,000 queries per second. In "normal circumstances", a single server is enough, but after a single server starts, even 4 servers cannot handle the delays. If I restart 2 servers, the issue is even more dramatic.

Sorry, I am trying to focus on the cause, but apart from restarting a server there is no other cause for the issue. I upgraded Ubuntu from 21.10 to 22.04 at night 3 days ago, and all servers did an IST, yet the slowness still occurred even though there is little traffic at night (something like 2,000 queries per second). All 4 servers ended up with 100+ queries stuck for entire minutes!

Is there a way to avoid dramatic slowness on server start? I've read about optimizer_search_depth, which can cause slow queries when different from 0, but regardless of the value I set the issue is exactly the same, so it's currently set to 0.
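(For reference, the number of queries actually executing at a given moment can be sampled with something like the query below. This is only a monitoring sketch using information_schema.PROCESSLIST; it is not taken from the thread or its attachments.)

  -- Count client threads currently executing a statement, excluding idle connections
  SELECT COUNT(*) AS running_queries
    FROM information_schema.PROCESSLIST
   WHERE COMMAND = 'Query'
     AND ID <> CONNECTION_ID();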
Trying to dig into the wsrep status, I found this:

On the donor node (30 minutes after the SST completed):
  wsrep_local_send_queue_avg  8.77932 (now 8.69799)
  wsrep_local_recv_queue_avg  0.192287

On the newly started node:
  wsrep_local_send_queue_avg  0.00315457
  wsrep_local_recv_queue_avg  61.1511 (now 52.7237)

The big values are decreasing slowly; the "now" values were taken after I finished writing this email.
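(For reference, these averages and the related flow-control counters can be sampled on each node with something like the statement below; a minimal sketch using the standard wsrep status variables. A persistently large wsrep_local_recv_queue on the catching-up node, together with a growing wsrep_flow_control_sent, would suggest that node is throttling the rest of the cluster through flow control.)

  -- Replication queue sizes and flow-control counters on this node
  SHOW GLOBAL STATUS WHERE Variable_name IN
      ('wsrep_local_state_comment',
       'wsrep_local_send_queue', 'wsrep_local_send_queue_avg',
       'wsrep_local_recv_queue', 'wsrep_local_recv_queue_avg',
       'wsrep_flow_control_paused', 'wsrep_flow_control_sent');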
I must add that the slowness occurs once the SST or IST has completed and the server becomes available. The donor also appears to be entirely blocked while the event queue is being processed.

While the SST itself is being performed, there is no issue.
On Wed, Jul 27, 2022 at 1:02 PM Gordan Bobic <gordan.bobic@gmail.com> replied:

IMO your biggest cause of problems is chasing the bleeding edge. You are on 10.6.x. I generally don't even deploy 10.5.x in production yet, except in very exceptional circumstances (simple async replication, no expectation of extreme performance).

Migrate to 10.4.x and stick with it until it goes EOL. Then maybe think about upgrading to 10.5.x in the 6 months following 10.4.x EOL.
I started with 10.4.x back in January; exact same issue.

Now Ubuntu 22.04 comes with MariaDB 10.6 and I was forced to upgrade! I'll test downgrading somehow, but on the MariaDB downloads page I can't find a 22.04 version?!
Download MariaDB Server - MariaDB.org <https://mariadb.org/download/?t=repo-config&d=21.10+%22impish%22&v=10.5&r_m=mva>
There is no supported downgrade path other than logical dump+restore. There are also no packages built for distros where the major version is older than what ships with the distro.

Since your queries seem to end up stuck in the commit stage, it could be related to redo log flushing, which behaves very erratically on 10.5+. If it lets the log fill up to 90% and the state transfer then hits, it could be that with the checkpoint age already high there just isn't enough headroom to avoid a massive stall. Purely guessing here without any telemetry. So it seems you are, most unfortunately, in for an awful lot of pain.

You seem to have multiple instances of some settings in your config; that is never a good sign. Do you have PMM set up on this cluster and fully instrumented?
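(If it helps to test that guess, the checkpoint age can be compared against the redo log capacity on the donor while a stall is happening. This is only a sketch: Innodb_checkpoint_age is assumed to be exposed as a status variable in this build; otherwise the same figure is the difference between "Log sequence number" and "Last checkpoint at" in SHOW ENGINE INNODB STATUS.)

  -- On the donor, during the slowdown: how close is the redo log to full?
  SHOW GLOBAL STATUS    WHERE Variable_name = 'Innodb_checkpoint_age';
  SHOW GLOBAL VARIABLES WHERE Variable_name = 'innodb_log_file_size';
  -- A checkpoint age that sits persistently close to the log size means commits
  -- are waiting on page flushing, i.e. the kind of stall described above.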
I'm already in pain 😉

The config I sent is a little old and I have since removed the duplicates; those were the result of trying to fiddle with some of the settings in the hope of improvements.

No idea what PMM is?

Do you have a bug tracking this behavior, so that maybe I can follow it or apply a workaround?

I'm not quite keen on downgrading (which I'll do using the spare servers I have and a little replication in between), but if it turns out to provide a more reliable alternative, I'm eager to do it!
On Wed, Jul 27, 2022 at 2:38 PM Cédric Counotte <cedric.counotte@1check.com> wrote:
No idea what PMM is?
https://www.percona.com/software/database-tools/percona-monitoring-and-management
I prefer v1, but in Percona's build of PMM v1 there are several bugs / missing features that are going to cause you problems with newer kernels and MariaDB 10.5+. I have patches to fix them, but my build is not yet publicly available. Still, v2 is far better than no PMM.
Do you have a bug tracking this behavior, so that maybe I can follow it or apply a workaround?
I cannot say I have looked. I'm generally happy to leave testing to others until I have an overwhelming reason to upgrade to a version with a lot of changes. And 10.5 was a rather big feature release.
I'm not quite keen on downgrading (which I'll do using the spare servers I have and a little replication in between), but if it turns out to provide a more reliable alternative, I'm eager to do it!
Replication from newer version to older version is also not officially supported. It usually works, though.
Ah! Thanks for the input about PMM. I'm actually using netdata, as you can see on the screenshot. And yes, it's better to have one than nothing; it has helped me numerous times to spot improvements and limits that could be changed to make things better.

I did an upgrade from 10.4 -> 10.5 -> 10.6; it kept saying SSTs are not supported, but it worked 😉 Not sure I'm glad it worked, now that I might have to revert back to 10.4 (and Ubuntu 20.04!).
On Wed, Jul 27, 2022 at 2:28 PM Ali <mp.x@bk.ru> replied:

Hi,

Can you attach the output of the global status and global variables?
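(One way to gather what is being asked for, as a sketch; both views are available from information_schema in MariaDB, so the results can be saved and attached.)

  -- Full snapshot of server settings and status counters
  SELECT * FROM information_schema.GLOBAL_VARIABLES ORDER BY VARIABLE_NAME;
  SELECT * FROM information_schema.GLOBAL_STATUS    ORDER BY VARIABLE_NAME;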
Attached both.

Later today (at night in Europe) I will be testing server restarts, so it might be a good time to change any settings that could cause this.

Thanks for your time.
On Wed, Jul 27, 2022 at 2:48 PM Gordan Bobic <gordan.bobic@gmail.com> wrote:
There is no supported downgrade path other than logical dump+restore. There are also no packages built for distros where the major version is older than what ships with the distro.
Since your queries seem to end up stuck in the commit stage, it could be related to redo log flushing, which behaves very erratically on 10.5+. If it lets the log fill up to 90% and the state transfer then hits, it could be that with the checkpoint age already high there just isn't enough headroom to avoid a massive stall. Purely guessing here without any telemetry.
I think you may be referring to InnoDB page flushing. There was some misunderstanding around that, and indeed some partly unintended or uninformed changes in behaviour (in 10.5.7 and 10.5.8) that were reverted later. It could be useful to read https://jira.mariadb.org/browse/MDEV-27295.

In MariaDB Server 10.6, the InnoDB locking was refactored. Thanks to that, the Galera cluster hang https://jira.mariadb.org/browse/MDEV-23328 only affected earlier major versions.

Best regards,

Marko
On Wed, Jul 27, 2022 at 3:08 PM Marko Mäkelä <marko.makela@mariadb.com> wrote:
I think that you may refer to InnoDB page flushing. There was some misunderstanding around that, and indeed some partly unintended or uninformed changes in behaviour (in 10.5.7 and 10.5.8) that were reverted later. It could be useful to read https://jira.mariadb.org/browse/MDEV-27295.
What version was it reverted in? I am still seeing the errant redo log flushing behaviour in 10.5.15. It looks like no flushing happens until the hwm is reached at about 85% full. It then tries to flush everything down to the lwm. And in between it doesn't do anything, even while everything is idle and it should be running down the
Reading this: https://jira.mariadb.org/browse/MDEV-27295

It's quite unclear when it was fixed or reverted.

That said, I read that the following setting might fix it:

  SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.001;

Is that correct, and should I try it to see if it helps?
You can try it, but that isn't a fix; it is a way to make the flushing run all the time at full rate.

If I remember correctly, the old behaviour was that flushing would happen at the innodb_io_capacity rate, and above the hwm it would kick into the innodb_io_capacity_max rate, or something along those lines. On 10.5+ you get only two speeds: 0, and whatever your disks can handle (which can also starve other I/O). Whatever the intended "improvement" was, the outcome is a substantial downgrade.
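(For context, the settings and counters being discussed can be inspected as below; a sketch only, and whether lowering innodb_max_dirty_pages_pct_lwm actually helps on this workload would still need to be measured.)

  -- Flushing-related settings currently in effect
  SHOW GLOBAL VARIABLES WHERE Variable_name IN
      ('innodb_io_capacity', 'innodb_io_capacity_max',
       'innodb_max_dirty_pages_pct', 'innodb_max_dirty_pages_pct_lwm');

  -- How much flushing work is outstanding right now
  SHOW GLOBAL STATUS WHERE Variable_name IN
      ('Innodb_buffer_pool_pages_dirty', 'Innodb_buffer_pool_pages_total',
       'Innodb_buffer_pool_pages_flushed');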
Hi Cédric!

Just to be sure, do you really need the 2x 1G log_file_size?

BR,
Jocelyn Fournier
10.5+ only uses a single log file, so that is 1x 1GB. And 1GB is tiny; IMO it should be the default these days. I would only consider something smaller if I were running on an older Raspberry Pi or something similarly constrained.
It’s not that tiny. This article is a bit old, but still valid:
https://www.percona.com/blog/2008/11/21/how-to-calculate-a-good-innodb-log-f...
If you don’t need a big redo log, reduce its size to avoid slow crash recovery.
Replaying a gigabyte of redo log takes seconds on hardware from this century. When it comes to redo log sizing, there are two bounds:
Upper bound: big enough to soak your peak hour of writes.
Lower bound: big enough that your checkpoint age never hits the effective redo log size. To provide a margin for error, I normally try to make sure it never exceeds 80% of the redo log size.
There is such a thing as overtuning. If you are seeing crashes regularly and have no redundant infrastructure for failover, tuning the redo log size to speed up crash recovery is wasting time fixing the wrong problem.
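To put numbers on those two bounds, the redo write rate can be sampled from Innodb_os_log_written, which counts bytes written to the redo log. A rough sketch, ideally run over a representative peak-hour window rather than the 60 seconds used here:

# Sample redo log writes over 60 seconds, then extrapolate to an hour.
before=$(mariadb -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written'" | awk '{print $2}')
sleep 60
after=$(mariadb -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written'" | awk '{print $2}')
echo "approx redo bytes written per hour: $(( (after - before) * 60 ))"
# Compare that figure, and the observed checkpoint age, against @@innodb_log_file_size.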
Considering the binlog files are 1GB and I get 144 of them per day, doesn't that mean my 1GB log file holds only about 10 minutes of writes? I suppose I should make it 6GB then, to follow that article 😉
Still, using Galera, I assume crashes are compensated for by another node, and 1GB of log should be enough for now.
FWIW the servers are using NVMe from OVH; 4 of them are on OVH gen 2 NVMe (whatever that means).
Perhaps you should increase its size then; anyway, PMM is great for seeing what a good setting is! :)
On Wed, Jul 27, 2022 at 4:26 PM jocelyn fournier <jocelyn.fournier@gmail.com> wrote:
It’s not that tiny, this article is a bit old, but still valid:
https://www.percona.com/blog/2008/11/21/how-to-calculate-a-good-innodb-log-f...
If you don’t need a big redo log, reduce its size to avoid slow crash recovery.
Starting with MariaDB Server 10.5, recovery should run faster than in earlier versions, thanks to the easier-to-parse log format (MDEV-12353) and some memory management optimizations such as MDEV-21351. Furthermore, MariaDB 10.5 should not run out of memory on recovery even when starting up with an innodb_buffer_pool_size that is smaller than the redo log (innodb_log_file_size); in earlier versions, the calculations to avoid running out of memory were inaccurate.
A too-small redo log may cause backup to fail, depending on the write rate of the server. A server-assisted backup (some ideas are in MDEV-14992) would solve that.
Marko
You mean the ib_logfile0, right? It's 1GB, yes, but I'm not sure why or what it helps (or not) with. Isn't that log only used in case of a crash? Because replication uses the mysql-bin.* log files, which I have also set to 1GB; I get 144 of them per day and keep one day's worth, just in case replication fails for that long (I've seen it stop very often, but that's another problem).
Now considering that:
>>> If it leaves the log to fill up to 90% and the state transfer hits, it could be that with the checkpoint age already high, there just isn't enough headroom to avoid a massive stall.
and
>>> SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.001;
and the fact that I have a 1GB log file and 1GB binlog files as well, could that be the problem? If so, would that mean that setting innodb_max_dirty_pages_pct_lwm or changing the redo log (or binlog) file size could fix that issue?
I just tested using this setting, but it made no difference: innodb_max_dirty_pages_pct_lwm=0.001
Tested with 2 existing nodes running and synced, and one new node being attached. As soon as the SST sync ended, the first 2 nodes started piling up queries, mostly stuck in commit. If the new node was used, all queries got stuck in opening tables. Please see the attached screenshots showing part of the queues, which quickly reach 100+.
It took about 20 minutes to recover from that situation without using the new node at all! It took another 20 minutes for the new node to become usable (i.e. not too slow).
I restarted the server, it did an IST, exact same result, just worse: pending queries above 200 this time! How is it possible for queries to be stuck for several minutes (10!!!) when in nominal state they take less than 0.2 seconds!?
I saw this in the logs just after the IST, even though table_open_cache_instances = 16 is set in the config file and the live value is indeed 1!?
2022-07-27 23:55:09 11 [Warning] Detected table cache mutex contention at instance 1: 30% waits. Additional table cache instance cannot be activated: consider raising table_open_cache_instances. Number of active instances: 1.
FWIW we have 350+ databases each with 100 tables, and doing a mariabackup requires ulimit -n 919200. The servers are processing about 3,000 queries/second during those tests, while it peaks at 10,000 during the day. A single server is capable of handling the peak load (verified earlier this morning)!
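When the whole cluster stalls like this after a node joins, the Galera status counters usually show whether flow control from the joiner (or the donor) is the cause. A quick diagnostic to run on any node during the stall, assuming only the mariadb client:

# wsrep_flow_control_paused close to 1.0 means the cluster spends almost all of its time paused;
# a growing wsrep_local_recv_queue points at the node that cannot keep up with the replication stream.
mariadb -e "SHOW GLOBAL STATUS WHERE Variable_name IN
  ('wsrep_local_state_comment', 'wsrep_local_recv_queue', 'wsrep_local_send_queue',
   'wsrep_flow_control_paused', 'wsrep_flow_control_sent', 'wsrep_cert_deps_distance')"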
You should hardly ever need to adjust table_open_cache_instances away from the defaults, but it sounds like you may need to bump your table_open_cache by 2-4x.
Are you using more than one node for writing? And does the new node get used for anything immediately after joining?
Concerning table_open_cache, it’s currently set to 13869 (however in config it’s set to 16384), global status shows this on new node, 9 hours after start:
+-----------------------------------+---------+
| Table_open_cache_active_instances | 1 |
| Table_open_cache_hits | 2136757 |
| Table_open_cache_misses | 185097 |
| Table_open_cache_overflows | 146153 |
+-----------------------------------+---------+
+---------------+--------+
| Opened_tables | 159629 |
+---------------+--------+
I’ve updated table_open_cache to 65536 on 2 servers and the cache_overflows stops increasing.
In /usr/lib/systemd/system/mariadb.service, I see those:
LimitNOFILE=32768
So I don’t understand why MariaDB decided to reduce the configured value? Not sure if changing the config will have any effect on the live value either? I’ll try to set both to 65536 this evening and see if it helps.
Is it safe to increase both limits? Maybe to the value I use during mariabackup, which is 919200?
All active nodes are used for writing, the HTTP load is spread evenly on all nodes. The ratio is 1.2% writes at 7780 read/sec with 125 write/sec on peak. Both read/writes are spread upon all active nodes using a load balancer using round-robin at the moment.
During yesterday’s test the existing 2 nodes were active at first. Seeing the queries starting to be stuck I decided to activate the new node to spread the load in hope of some improvement, however it just made things even worse, so I deactivated it again.
On Thu, Jul 28, 2022 at 10:56 AM Cédric Counotte <cedric.counotte@1check.com> wrote:
Concerning table_open_cache, it’s currently set to 13869 (however in config it’s set to 16384), global status shows this on new node, 9 hours after start:
That means you are out of file handles at systemd level.
systemctl edit mariadb
and add this to the override file:
[Service]
LimitNOFILE=1048576
then run:
systemctl daemon-reload
systemctl restart mariadb
Yes, this will probably trigger the problem you are having, but with some luck it may make it better in the future. Do that on all nodes.
+-----------------------------------+---------+
| Table_open_cache_active_instances | 1 |
| Table_open_cache_hits | 2136757 |
| Table_open_cache_misses | 185097 |
| Table_open_cache_overflows | 146153 |
+-----------------------------------+---------+
+---------------+--------+
| Opened_tables | 159629 |
+---------------+--------+
I’ve updated table_open_cache to 65536 on 2 servers and the cache_overflows stops increasing.
In /usr/lib/systemd/system/mariadb.service, I see those:
LimitNOFILE=32768
That is way too low. This needs to be big enough to cover the sum total of:
max_connections
table_open_cache x2 (because innodb_open_files is separate)
There is generally no harm in bumping LimitNOFILE much higher on modern kernels.
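That rule of thumb can be checked against a running node directly; the arithmetic below just follows the sum described above (the mariadbd process name assumes 10.4+, otherwise use mysqld):

# Rough minimum versus what the server negotiated and what systemd actually granted.
mariadb -e "SELECT @@max_connections + 2 * @@table_open_cache AS rough_min_nofile,
                   @@open_files_limit AS server_open_files_limit"
grep 'Max open files' /proc/$(pidof mariadbd)/limits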
So I don’t understand why MariaDB decided to reduce the configured value? Not sure if changing the config will have any effect on the live value either? I’ll try to set both to 65536 this evening and see if it helps.
Because it tries to make sure that your total of the above mentioned settings fits in the number of file handles it has available to it.
Is it safe to increase both limits? Maybe to the value I use during mariabackup, which is 919200?
Yes, see above. But it requires a daemon-reload and a restart of the service to take effect.
All active nodes are used for writing, the HTTP load is spread evenly on all nodes. The ratio is 1.2% writes at 7780 read/sec with 125 write/sec on peak. Both read/writes are spread upon all active nodes using a load balancer using round-robin at the moment.
That is likely a part of your problem. You should never ever use more than one Galera node for writing at a time. Performance will be WORSE than performance of a single node, and you will get deadlocks all over the place. You can use any of them for reading, but you should never use more than one at a time for writing. It is a little concerning that you managed to get as far as putting Galera into production for months without full awareness of this.
During yesterday’s test the existing 2 nodes were active at first. Seeing the queries starting to be stuck I decided to activate the new node to spread the load in hope of some improvement, however it just made things even worse, so I deactivated it again.
The more writable nodes you have the worse the performance will get. There should only ever be one writable node at a time.
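One common way to get that single-writer pattern with an off-the-shelf load balancer is to mark all but one backend as a backup, so the others only take traffic if the primary fails its health check. This is purely a hypothetical sketch with HAProxy on a separate load-balancer host and placeholder addresses; the setup in this thread uses the OVH load balancer, so the equivalent there would be its own primary/backup weighting:

# Appended to /etc/haproxy/haproxy.cfg on the load-balancer host (addresses are placeholders).
cat >> /etc/haproxy/haproxy.cfg <<'EOF'
listen mariadb_write
    bind *:3306
    mode tcp
    option tcplog
    # node1 takes all traffic; node2/node3 only take over if node1's health check fails
    server node1 192.0.2.11:3306 check
    server node2 192.0.2.12:3306 check backup
    server node3 192.0.2.13:3306 check backup
EOF
systemctl reload haproxy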
I've prepared all servers with that new setting, and this (is it ok or should I set it to 1048576 as well?):
table_open_cache = 65536
I'll do the server restart this evening to avoid creating problems during the day.
I did try to restart the backup cluster (2 nodes, one a slave of the main cluster) and it didn't seem to slow down the slave as it used to, so that might be the solution (or part of it)!
Thanks a lot for your time, will keep you posted later today when I restart a node of the main cluster.
On Thu, Jul 28, 2022 at 11:24 AM Cédric Counotte <cedric.counotte@1check.com> wrote:
I've prepared all servers with that new setting, and this (is it ok or should I set it to 1048576 as well?): table_open_cache = 65536
No, remember that LimitNOFILE has to fit at least the sum of all of the settings I mentioned, and still leave plenty of headroom. If you had PMM up and running you could see at a glance whether your table_open_cache is big enough.
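The table cache counters can also be watched without PMM; if Table_open_cache_overflows and Opened_tables keep climbing on a warm server, the cache is still too small. A minimal check with the mariadb client:

mariadb -e "SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Table_open_cache_hits', 'Table_open_cache_misses',
   'Table_open_cache_overflows', 'Opened_tables')"
# Re-run a few minutes apart: hits should dominate, and overflows should stay flat once the server is warm.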
I'll do the server restart this evening to avoid creating problems during the day.
I did try to restart the backup cluster (2 nodes, one a slave of the main cluster) and it didn't seem to slow down the slave as it used to, so that might be the solution (or part of it)!
Thanks a lot for your time, will keep you posted later today when I restart a node of the main cluster.
IMO you are probably experiencing bigger problems from writing to multiple nodes at the same time.
Thanks, I'll keep that in mind indeed. I was very surprised by the system limits (using mariabackup), so I couldn't imagine that being such an issue.
After a general failure a few days ago I was left with one node alone, and starting a second node produced the same issue while only one node was used for reading and writing. So writing to multiple nodes might be another issue I'll have to deal with later.
If I can solve that restart issue, that would remove a huge problem which takes down all our services entirely for 30 minutes when it happens.
So again, thanks a lot for your precious help; hopefully that setting is getting us in the right direction.
Well, it turns out the last attached server crashed and did an IST; the setting described above was already applied, and the issue didn't show up!?
I'll try again later today to confirm, but it looks very positive 😉
[ It might be a false positive, as I've seen 2 restarts go smoothly in the past, while far too many restarts have gone haywire. ]
Servers shouldn't be crashing. If they are crashing you need to establish why and deal with it.
Uptimes of years with MariaDB are not uncommon. Or at least months, even among the security conscious who patch with every release cycle.
Well, one server crashed twice a few days ago and I've asked my service provider (OVH) to look into it, but they asked me to test the hardware myself; I found an NVMe disk with 17000+ errors and am still waiting for their feedback on this.
Only our 2 oldest servers are experiencing crashes (6 months old only!), and it turns out the RAID NVMe have very different written data, one disk has 58TB (not a replacement) while the other is at 400+TB within the same RAID ! All other servers have identical written data size on both disks of their RAID, so it seems we got used disks and that those are having issues.
Still didn't have time to produce a crash dump and post an issue with those (to confirm the cause) as I kept having to deal with server restarts trying to reduce the slow issue for 30 minutes to one hour.
There were issues with the slave thread crashing, for which I posted an issue and had to update MariaDB to resolve it; there are still issues with slave threads stopping without reason, so I have written a script to restart them and posted an issue about that.
The original objective was to have 2 usable clusters in different sites, synched with each other using replication, however all those issues have not allowed us to move forward with this.
Not to mention the fact that we are now using OVH load balancer and that piece of hardware is sometimes thinking all our servers are down and starts showing error 503 to our customers while our servers are just running fine (no restart, no issue, nothing). So one more issue to deal with, for which we'll get a dedicated server and configure our own load balancer we can have control on.
Sorry for the long story, it's been quite a pain, and I feel like I'm looking at the end of the tunnel with your help.
On Thu, Jul 28, 2022 at 12:07 PM Cédric Counotte <cedric.counotte@1check.com> wrote:
Well, one server crashed twice a few days ago and I've asked my service provider (OVH) to look into it, but they asked me to test the hardware myself; I found an NVMe disk with 17000+ errors and am still waiting for their feedback on this.
It sounds like you need:
1) ZFS
2) Better monitoring
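On the monitoring side, the NVMe error counters mentioned earlier in the thread can be read directly with smartmontools; a small sketch, with the device name as a placeholder:

# Media/integrity errors and wear level straight from the drive (repeat for each member of the RAID).
smartctl -a /dev/nvme0n1 | grep -iE 'media and data integrity errors|error information log entries|percentage used'
# For Linux software RAID, also keep an eye on the array state:
cat /proc/mdstat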
Only our 2 oldest servers are experiencing crashes (6 months old only!), and it turns out the RAID NVMe have very different written data, one disk has 58TB (not a replacement) while the other is at 400+TB within the same RAID ! All other servers have identical written data size on both disks of their RAID, so it seems we got used disks and that those are having issues.
Welcome to the cloud. But this is not a bad thing, it's better than having multiple disks in the same array fail at the same time. ZFS would help you by catching those errors before the database ingests them. In normal non-ZFS RAID, it is plausible and even quite probable that the corrupted data will be loaded from disk and propagate to other nodes, either via a state transfer or via corrupted binlogs. ZFS prevents that by making sure every block's checksum is compared at read time and any errors that show up get recovered from other redundant disks. Under the current circumstances, I wouldn't trust your data integrity until you run a full extended table check on all tables on all nodes. And probably pt-table-checksum on all the tables between the nodes to make sure.
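For the extended table check suggested here, a per-node sketch (slow and I/O-heavy, so ideally one node at a time, off peak; pt-table-checksum from Percona Toolkit can then compare data between nodes, but read its Galera-specific notes first):

# Extended consistency check of every table on this node; keep the output to review anything flagged.
mariadb-check --all-databases --check --extended 2>&1 | tee /tmp/table-check.log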
Still didn't have time to produce a crash dump and post an issue with those (to confirm the cause) as I kept having to deal with server restarts trying to reduce the slow issue for 30 minutes to one hour.
You need to be careful with that: a state transfer from a node with failing disks can actually result in the corrupted data propagating to the node being bootstrapped.
There were issues with the slave thread crashing, for which I posted an issue and had to update MariaDB to resolve it; there are still issues with slave threads stopping without reason, so I have written a script to restart them and posted an issue about that.
I don't think you can meaningfully debug anything until you have verified that your hardware is reliable. Do your OVH servers have ECC memory?
The original objective was to have 2 usable cluster in different sites, synched with each other using replication, however all those issues have not allowed us to move forward with this.
With 4 nodes across 2 DCs, you are going to lose writability if you lose a DC even if it is the secondary DC. Your writes are also going to be very slow because with 4 nodes, all writes have to be acknowledged by 3 nodes - and the 3rd node is always going to be slow because it is connected over a WAN. I would seriously question whether Galera is the correct solution for you. And that's on top of writing to multiple nodes which will make things far worse on top.
Not to mention the fact that we are now using OVH load balancer and that piece of hardware is sometimes thinking all our servers are down and starts showing error 503 to our customers while our servers are just running fine (no restart, no issue, nothing). So one more issue to deal with, for which we'll get a dedicated server and configure our own load balancer we can have control on.
I think you need to take a long hard look at what you are trying to achieve and re-assess:
1) Whether it is actually achievable sensibly within the constraints you imposed
2) What the best workable compromise is between what you want and what you can reasonably have
Right now, I don't think you have a solution that is likely to be workable.
I'll look into ZFS and check all tables ASAP, thanks.
Note that with 2 DCs, those will be synched with replication: not one Galera cluster, but 2 Galera clusters. At the moment the spare DC is a replication slave of the first DC. Only the first DC is active; the second can be used for read-only purposes by a read-only data API, but it's barely used. The spare DC is there in case the active one goes down at our service provider (it happened in the past, fire hazard @ OVH!).
The biggest problem was really a starting node slowing down the rest of the cluster. Second to that is the replication slave stopping for no known reason. Other than those, I could set up replication both ways to have both DCs synched, not in real time but synched, then change the load balancing to source IP instead of round-robin, and it should work as planned.
Since we have experienced too many issues, and writing the same data to 6-8 nodes seems counterproductive (against NVMe max writes), we're thinking about splitting our big customers onto dedicated servers, completely separated from the rest. However this would require coding and implementing the redirection in our Android application and having dedicated URLs per customer, and our customers are in for it.
Anyway, one step at a time: first the issues that are causing huge problems, then we can focus on a better solution.
Well, one server crashed twice a few days ago and I've asked my service provider (OVH) to look into it, but they asked me to test the hardware myself. I found an NVMe disk with 17000+ errors and am still waiting for their feedback on this.
It sounds like you need: 1) ZFS 2) Better monitoring
Only our 2 oldest servers are experiencing crashes (and they are only 6 months old!), and it turns out the NVMe drives in their RAID have very different written-data totals: one disk has 58TB written (and it is not a replacement) while the other is at 400+TB within the same RAID! All other servers have identical written-data sizes on both disks of their RAID, so it seems we got used disks and those are the ones having issues.
Welcome to the cloud. But this is not a bad thing; it's better than having multiple disks in the same array fail at the same time.

ZFS would help you by catching those errors before the database ingests them. In normal non-ZFS RAID, it is plausible and even quite probable that the corrupted data will be loaded from disk and propagate to other nodes, either via a state transfer or via corrupted binlogs. ZFS prevents that by making sure every block's checksum is compared at read time, and any errors that show up get recovered from the other redundant disks.

Under the current circumstances, I wouldn't trust your data integrity until you run a full extended table check on all tables on all nodes. And probably pt-table-checksum on all the tables between the nodes to make sure.
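(For reference, a minimal sketch of the two checks mentioned above. The host name and user are placeholders, and pt-table-checksum has its own caveats on Galera, so check its documentation before running it against a production cluster.)

  # extended CHECK TABLE on every table of the local node
  mysqlcheck --all-databases --check --extended -u root -p
  # cross-node comparison with Percona Toolkit, run from one node
  pt-table-checksum --host=node1.example.com --user=checksum --ask-pass --no-check-binlog-format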
I still didn't have time to produce a crash dump and post an issue for those (to confirm the cause), as I kept having to deal with server restarts, trying to reduce the slowdown that lasts 30 minutes to one hour.
You need to be careful with that - a state transfer from a node with failing disks can actually result in the corrupted data propagating to the node being bootstrapped.
I had to restart a server earlier and, with this new setting, everything was smooth after the SST, for the second time in a row! I just started another server, and it did its SST and worked just fine immediately! Wow! I never imagined such an OS limit would be imposed and cause so much trouble. It ought to be documented if it isn't already (or I missed it). Thanks a ton for your help! That's a big problem solved!
I suggest you get PMM installed. With it, this troubleshooting would have taken someone familiar with it 1% of the time we all spent typing emails on this thread.
Hello all,

Again, I need help with a Galera cluster.

One of the Galera nodes has about 3800 files named galera.page.0000xxx (not galera.cache) of 128MB each, stored in /var/lib/mysql, which is about twice the size of the total DB!? And it keeps creating files every minute, never deleting any of them, for 3 days now.

The other nodes have no such files, only a single galera.cache file of 128MB.

I tried deleting some of them, but it doesn't reclaim disk space, and in about 2 days the disk will be entirely filled and the server will obviously stop functioning, if not the entire cluster.

I can't find any information about galera.page files, only about the galera.cache file, so I'm not sure it's related. As far as I could tell those are used when replicating information between nodes? Does that mean that the changes on that node are not replicated to the others?

Any help would be appreciated.

C.
Like: https://dba.stackexchange.com/questions/306131/mariadb-galera-cluster-fillin... and the linked MDEVs. I haven't followed your other thread to identify the versions you use. If the linked MDEVs aren't relevant to your newer version, please create a new issue. Comments on closed issues are almost always going to be forgotten.
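(For anyone searching the archive later: a minimal sketch of what one can look at when galera.page.* files pile up, based on the linked discussion. Reading the page files as overflow of the gcache ring buffer is the usual explanation, but it is an assumption here, not something confirmed for this particular node.)

  # a growing receive queue or a long donor episode can spill write-sets
  # past the gcache ring buffer into on-disk page files
  mysql -e "SHOW STATUS LIKE 'wsrep_local_recv_queue'"
  mysql -e "SHOW STATUS LIKE 'wsrep_flow_control_paused'"
  mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"
  # see which gcache settings the provider is actually running with
  mysql -e "SHOW VARIABLES LIKE 'wsrep_provider_options'" | tr ';' '\n' | grep -i gcache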
Because I'm on 10.6.7 under Ubuntu 22.04.1, I've already created a new issue: https://jira.mariadb.org/browse/MDEV-29223. I'm waiting for the end of business hours to restart the server, possibly switching the active cluster to the spare one.
I tried to follow but missed the solution. What OS limit are you talking about?

Thanks,
Eliezer

----
Eliezer Croitoru
NgTech, Tech Support
Mobile: +972-5-28704261
Email: ngtech1ltd@gmail.com
Web: https://ngtech.co.il/
My-Tube: https://tube.ngtech.co.il/
He was far too low on table_open_cache, which was in turn bumping into the LimitNOFILE limit.
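(A minimal sketch of that kind of fix on a systemd-based install, assuming the unit is named mariadb.service as it is on stock Ubuntu packages; the numeric values are assumptions and need to be sized to your own table count and workload.)

  # raise the per-process open-files limit for the mariadb unit
  sudo mkdir -p /etc/systemd/system/mariadb.service.d
  printf '[Service]\nLimitNOFILE=1048576\n' | sudo tee /etc/systemd/system/mariadb.service.d/limits.conf
  sudo systemctl daemon-reload
  # then raise table_open_cache / open_files_limit in my.cnf under [mysqld], e.g.:
  #   table_open_cache = 20000
  #   open_files_limit = 1048576
  sudo systemctl restart mariadb
  # verify what the server actually got
  mysql -e "SHOW GLOBAL VARIABLES LIKE 'open_files_limit'"
  mysql -e "SHOW GLOBAL VARIABLES LIKE 'table_open_cache'"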
participants (8)
- Ali
- Cédric Counotte
- Daniel Black
- Gordan Bobic
- jocelyn fournier
- Marko Mäkelä
- ngtech1ltd@gmail.com
- William Edwards