Well, one server crashed twice a few days ago and I asked my service provider (OVH) to look into it, but they asked me to test the hardware myself. I found an NVMe disk with 17000+ errors and am still waiting for their feedback on this. Only our 2 oldest servers are experiencing crashes (only 6 months old!), and it turns out their RAID NVMe disks have very different amounts of written data: one disk has 58TB (not a replacement) while the other is at 400+TB within the same RAID! All other servers have identical written data sizes on both disks of their RAID, so it seems we got used disks and that those are having issues.

I still haven't had time to produce a crash dump and post an issue with it (to confirm the cause), as I kept having to deal with server restarts, trying to reduce the slowness, which lasts 30 minutes to one hour. There were issues with the slave thread crashing, for which I posted an issue and had to update MariaDB to resolve. There are still issues with slave threads stopping without reason, so I have written a script to restart them and posted an issue about that.

The original objective was to have 2 usable clusters in different sites, synced with each other using replication, however all those issues have not allowed us to move forward with this.

Not to mention that we are now using the OVH load balancer, and that piece of hardware sometimes thinks all our servers are down and starts showing error 503 to our customers while our servers are running just fine (no restart, no issue, nothing). So that is one more issue to deal with, for which we'll get a dedicated server and configure our own load balancer that we have control over.

Sorry for the long story, it's been quite a pain, and I feel like I'm seeing the light at the end of the tunnel with your help.
-----Original Message-----
From: Gordan Bobic <gordan.bobic@gmail.com>
Sent: Thursday, 28 July 2022 10:56
To: Cédric Counotte <cedric.counotte@1check.com>
Cc: jocelyn fournier <jocelyn.fournier@gmail.com>; Marko Mäkelä <marko.makela@mariadb.com>; Mailing-List mariadb <maria-discuss@lists.launchpad.net>; Pierre LAFON <pierre.lafon@1check.com>
Subject: Re: [Maria-discuss] MariaDB server horribly slow on start

Servers shouldn't be crashing. If they are crashing you need to establish why and deal with it. Uptimes of years with MariaDB are not uncommon. Or at least months, even among the security conscious who patch with every release cycle.

On Thu, Jul 28, 2022 at 11:48 AM Cédric Counotte <cedric.counotte@1check.com> wrote:
Well, turns out the last attached server crashed and did an IST, the setting described below was already applied, and the issue didn't show up!?
I'll try again later today to confirm, but it looks very positive 😉
[ It might be a false positive, as I've seen 2 restarts go smoothly in the past while too many other restarts have gone haywire. ]
-----Original Message-----
From: Cédric Counotte
Sent: Thursday, 28 July 2022 10:24
To: Gordan Bobic <gordan.bobic@gmail.com>
Cc: jocelyn fournier <jocelyn.fournier@gmail.com>; Marko Mäkelä <marko.makela@mariadb.com>; Mailing-List mariadb <maria-discuss@lists.launchpad.net>; Pierre LAFON <pierre.lafon@1check.com>
Subject: RE: [Maria-discuss] MariaDB server horribly slow on start
I've prepared all servers with that new setting, and this (is it ok or should I set it to 1048576 as well?):
table_open_cache = 65536
I'll do the server restart this evening to avoid creating problems during the day.
I did try restarting the backup cluster (2 nodes, one of them a slave of the main cluster) and it didn't seem to slow down the slave as it used to, so that might be the solution (or part of it)!
Thanks a lot for your time, will keep you posted later today when I restart a node of the main cluster.
-----Original Message-----
From: Gordan Bobic <gordan.bobic@gmail.com>
Sent: Thursday, 28 July 2022 10:06
To: Cédric Counotte <cedric.counotte@1check.com>
Cc: jocelyn fournier <jocelyn.fournier@gmail.com>; Marko Mäkelä <marko.makela@mariadb.com>; Mailing-List mariadb <maria-discuss@lists.launchpad.net>; Pierre LAFON <pierre.lafon@1check.com>
Subject: Re: [Maria-discuss] MariaDB server horribly slow on start
On Thu, Jul 28, 2022 at 10:56 AM Cédric Counotte <cedric.counotte@1check.com> wrote:
Concerning table_open_cache, it’s currently set to 13869 (although the config sets it to 16384). Global status shows this on the new node, 9 hours after start:
That means you are out of file handles at systemd level.
systemctl edit mariadb
and add this to the override file:
[Service]
LimitNOFILE=1048576
systemctl daemon-reload
systemctl restart mariadb
Yes, this will probably trigger the problem you are having, but with some luck it may make it better in the future.
Do that on all nodes.
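
After the restart, it can be worth confirming that the running process really got the new limit. A minimal Python sketch, assuming a Linux host that exposes `/proc/<pid>/limits`; the helper name `parse_max_open_files` and the sample text are illustrative and not part of MariaDB or systemd:

```python
import re

def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        match = re.match(r"Max open files\s+(\S+)\s+(\S+)", line)
        if match:
            # 'unlimited' is reported as a word rather than a number.
            return tuple(None if v == "unlimited" else int(v) for v in match.groups())
    raise ValueError("no 'Max open files' line found")

# Sample in the layout of /proc/<pid>/limits (values are illustrative).
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            1048576              1048576              files\n"
)

soft, hard = parse_max_open_files(SAMPLE)
print(soft, hard)
```

On a live node, the text would come from reading `/proc/<pid>/limits` for the MariaDB server process instead of `SAMPLE`.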
+-----------------------------------+---------+
| Table_open_cache_active_instances | 1       |
| Table_open_cache_hits             | 2136757 |
| Table_open_cache_misses           | 185097  |
| Table_open_cache_overflows        | 146153  |
+-----------------------------------+---------+
+---------------+--------+
| Opened_tables | 159629 |
+---------------+--------+
I’ve updated table_open_cache to 65536 on 2 servers and Table_open_cache_overflows has stopped increasing.
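
To put the counters above in perspective, here is a small Python sketch that derives the table cache hit ratio from them. The function name is illustrative, and treating overflows per miss as a sign of an undersized cache is a rough heuristic, not an official metric:

```python
def table_cache_hit_ratio(hits, misses):
    """Share of table opens that were served from the table open cache."""
    total = hits + misses
    return hits / total if total else 0.0

# Values copied from the status output quoted above.
hits, misses, overflows = 2136757, 185097, 146153

ratio = table_cache_hit_ratio(hits, misses)
# Rough indicator: how many misses also forced an eviction from the cache.
overflow_share = overflows / misses

print(f"hit ratio: {ratio:.1%}, overflows per miss: {overflow_share:.1%}")
```

If the overflow counter keeps growing between two samples, the cache is still too small for the working set; once it stays flat, as reported above, the new size is holding.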
In /usr/lib/systemd/system/mariadb.service, I see those:
LimitNOFILE=32768
That is way too low. This needs to be big enough to cover the sum total of:
max_connections
table_open_cache x2 (because innodb_open_files is separate)
There is generally no harm in bumping LimitNOFILE much higher on modern kernels.
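
The sizing rule above can be sketched in a few lines of Python. This only encodes the rule of thumb from this message; the `max_connections` value is hypothetical (it is not stated in the thread), and the server needs some extra handles for its own files, so the result is best read as a lower bound:

```python
def required_file_handles(max_connections, table_open_cache):
    """Rule of thumb from this thread: max_connections + 2 * table_open_cache."""
    return max_connections + 2 * table_open_cache

MAX_CONNECTIONS = 5000    # hypothetical value, not given in the thread
TABLE_OPEN_CACHE = 65536  # value the servers were being moved to

needed = required_file_handles(MAX_CONNECTIONS, TABLE_OPEN_CACHE)

# Compare against the packaged default and the proposed override.
for limit in (32768, 1048576):
    verdict = "ok" if limit >= needed else "too low"
    print(f"LimitNOFILE={limit}: {verdict} (need at least {needed})")
```

With these numbers the packaged default of 32768 falls short, while 1048576 leaves plenty of headroom.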
So I don’t understand why MariaDB decided to reduce the configured value? Not sure if changing the config will have any effect on the live value either? I’ll try to set both to 65536 this evening and see if it helps.
Because it tries to make sure that your total of the above mentioned settings fits in the number of file handles it has available to it.
Is it safe to increase both limits? Maybe to the value I use during mariabackup, which is 919200?
Yes, see above. But it requires a daemon-reload and a restart of the service to take effect.
All active nodes are used for writing, and the HTTP load is spread evenly across all nodes. The ratio is 1.2% writes, at 7780 reads/sec with 125 writes/sec at peak. Both reads and writes are spread across all active nodes by a load balancer, using round-robin at the moment.
That is likely a part of your problem.
You should never ever use more than one Galera node for writing at a time.
Performance will be WORSE than performance of a single node, and you will get deadlocks all over the place.
You can use any of them for reading, but you should never use more than one at a time for writing.
It is a little concerning that you managed to get as far as putting Galera into production for months without full awareness of this.
During yesterday’s test the existing 2 nodes were active at first. Seeing the queries starting to get stuck, I decided to activate the new node to spread the load in the hope of some improvement, however it just made things even worse, so I deactivated it again.
The more writable nodes you have the worse the performance will get.
There should only ever be one writable node at a time.
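
As an illustration of the single-writer rule, here is a minimal Python sketch of the routing decision: every write goes to one designated node, while reads rotate across all nodes. The class and node names are hypothetical, and in practice this logic would live in the load balancer or proxy rather than in application code:

```python
import itertools

class SingleWriterRouter:
    """Send all writes to one node; spread reads round-robin over all nodes."""

    def __init__(self, nodes, writer):
        if writer not in nodes:
            raise ValueError("writer must be one of the nodes")
        self.writer = writer
        self._readers = itertools.cycle(nodes)

    def route(self, is_write):
        # Writes never leave the designated writer; reads take the next node in turn.
        return self.writer if is_write else next(self._readers)

router = SingleWriterRouter(["node1", "node2", "node3"], writer="node1")
writes = {router.route(is_write=True) for _ in range(5)}
reads = [router.route(is_write=False) for _ in range(4)]
print(writes, reads)
```

Failover (promoting another node to writer when the current one goes down) is deliberately left out of this sketch.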