![](https://secure.gravatar.com/avatar/3bce7f356a04335a99fe8b1a74a04e02.jpg?s=120&d=mm&r=g)
Hi Julien,
First of all, is the version really 10.2.23, which was released in
March 2019? If you are actually using the latest 10.2.33 release, you
might be affected by MDEV-20638, which surprisingly introduced a
performance regression. Based on benchmarks, we reverted that change
from the upcoming 10.2, 10.3, and 10.4 releases (but not 10.5).
A significant source of background activity is the purge of version
history of old transactions. If SHOW ENGINE INNODB STATUS is reporting
a nonzero "History list length", some purge activity will be needed.
Before MariaDB 10.5, there was also a background merge of buffered
changes to secondary indexes.
I would suggest checking with http://poormansprofiler.org/ or "sudo
perf top" what is causing the I/O, and posting the stack traces. You
might even try attaching GDB to the server and setting a breakpoint on
fsync(), with breakpoint commands "finish" and "continue" so that you
can collect interesting stack traces.
Also, I would recommend you to try a later major version. The InnoDB
version in MariaDB 10.3 should scale better thanks to a lock-free hash
table for maintaining the set of active transactions. MariaDB 10.5
included further improvements. But, be aware that we are working on a
10.5 performance regression in page flushing, in MDEV-23399.
With best regards,
Marko Mäkelä
On Tue, Sep 15, 2020 at 1:25 PM Julien Muchembled
Hello,
We are evaluating new hardware by reproducing real-life workload on real-life data which works fine on an existing server. We copied over the software used to ensure an apple-to-apple comparison, which is based on MariaDB 10.2.23 and uses primarily InnoDB tables, with a few Mroonga tables which do not seem involved in the problem.
On the new hardware, we are seeing catastrophically bad performance, especially chain-deadlocks happening and being resolved but producing virtually no useful work at our usual parallelism level of 64 active connections.
If we reduce parallelism to 2 active connections, queries start to succeed in any meaningful proportion.
The database was freshly restored from a mysqldump (as we cannot interrupt the original database), could this have any effect on deadlocks ? The main tables involved in the deadlocking queries (job queues) are initially (in the test, and periodically in reality) empty, so it would seem surprising.
What should I check to debug further ?
Note that these deadlocks happen in production but they're negligible.
See attachment for mariadb configuration file.
Some more details: When we restore from mysqldump, we use the following extra options to speed things up: innodb_flush_log_at_trx_commit = 0 innodb_flush_method = nosync innodb_doublewrite = 0 sync_frm = 0 as during that period nothing of value can be lost (worst case we restart the restore from scratch). BTW, despite these settings, we are still noticing a lot of fsyncs. Is this expected ? Are we missing some other option ?
Also, after the import and without any connection, MariaDB was still producing a non-trivial amount of activity on the machine: 5% CPU, read <1MB/s, write 10MB/s, 60 fsync/s. I could not identify what is causing these, where should I look ? How can I tell when it will stabilise back to idle ? When I interrupt the benchmark workload, I see a similar resource usage.
I saw https://jira.mariadb.org/browse/MDEV-18698 And InnoDB seems to throttle its background activity: is there a way to tell InnoDB to perform its background tasks at maximum speed ? (the goal is that after the import, we can make a clean tarball that we extract when we want to launch the test case again)
Regards, Julien _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
-- Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation