You can try it, but that isn't a fix; it is a way to make the flushing run all the time at full rate. If I remember correctly, the old behaviour was that flushing would happen at the innodb_io_capacity rate and, above the hwm, it would kick into the innodb_io_capacity_max rate. Or something along those lines. On 10.5+ you get only two speeds: 0 and whatever your disks can handle (which can also starve other I/O). Whatever the intended "improvement" was, the outcome is a substantial downgrade.

On Wed, Jul 27, 2022 at 3:36 PM Cédric Counotte <cedric.counotte@1check.com> wrote:
Reading this: https://jira.mariadb.org/browse/MDEV-27295
It's quite unclear when it was fixed or reverted.
That said, I read that the following setting might fix it: SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.001;
Is that correct, and should I try it to see whether it helps?
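
For anyone wanting to try this, here is a minimal sketch of checking the relevant values before and after the change. The variable and status names are standard InnoDB ones; the 0.001 value is simply the one quoted above, and whether it actually helps is disputed elsewhere in this thread.

```sql
-- Current dirty-page thresholds (lwm = low-water mark).
SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct%';

-- I/O rates InnoDB is allowed to use for background flushing.
SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%';

-- The setting in question; it is dynamic, so no restart is needed.
SET GLOBAL innodb_max_dirty_pages_pct_lwm = 0.001;

-- Watch whether the dirty page count actually drains while idle.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
```

A value set with SET GLOBAL does not persist across a server restart, so the change is easy to back out.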
-----Original Message-----
From: Gordan Bobic <gordan.bobic@gmail.com>
Sent: Wednesday, July 27, 2022 14:29
To: Marko Mäkelä <marko.makela@mariadb.com>
Cc: Cédric Counotte <cedric.counotte@1check.com>; Mailing-List mariadb <maria-discuss@lists.launchpad.net>
Subject: Re: [Maria-discuss] MariaDB server horribly slow on start
On Wed, Jul 27, 2022 at 3:08 PM Marko Mäkelä <marko.makela@mariadb.com> wrote:
On Wed, Jul 27, 2022 at 2:48 PM Gordan Bobic <gordan.bobic@gmail.com> wrote:
There is no supported downgrade path other than logical dump+restore. There are also no packages built for distros where the major version is older than what ships with the distro.
Since your queries seem to end up stuck in commit stage, it could be related to redo log flushing, which behaves very erratically on 10.5+. If it leaves the log to fill up to 90% and the state transfer hits, it could be that with the checkpoint age already high, there just isn't enough headroom to avoid a massive stall. Purely guessing here without any telemetry.
I think that you may refer to InnoDB page flushing. There was some misunderstanding around that, and indeed some partly unintended or uninformed changes in behaviour (in 10.5.7 and 10.5.8) that were reverted later. It could be useful to read https://jira.mariadb.org/browse/MDEV-27295.
What version was it reverted in? I am still seeing the errant redo log flushing behaviour in 10.5.15. It looks like no flushing happens until the hwm is reached at about 85% full. It then tries to flush everything down to the lwm. And in between it doesn't do anything, even while everything is idle and it should be running down the