[Maria-discuss] InnoDB: background jobs & fsyncs
Hello,

We are evaluating new hardware by reproducing a real-life workload on real-life data, which works fine on an existing server. We copied over the software used, to ensure an apples-to-apples comparison; it is based on MariaDB 10.2.23 and uses primarily InnoDB tables, with a few Mroonga tables which do not seem involved in the problem.

On the new hardware, we are seeing catastrophically bad performance, especially chain deadlocks happening and being resolved but producing virtually no useful work at our usual parallelism level of 64 active connections. If we reduce parallelism to 2 active connections, queries start to succeed in meaningful proportion.

The database was freshly restored from a mysqldump (as we cannot interrupt the original database); could this have any effect on deadlocks? The main tables involved in the deadlocking queries (job queues) are initially empty (in the test, and periodically in reality), so it would seem surprising.

What should I check to debug further? Note that these deadlocks happen in production, but there they are negligible.

See the attachment for the MariaDB configuration file.

Some more details: when we restore from mysqldump, we use the following extra options to speed things up:

innodb_flush_log_at_trx_commit = 0
innodb_flush_method = nosync
innodb_doublewrite = 0
sync_frm = 0

as during that period nothing of value can be lost (worst case, we restart the restore from scratch). BTW, despite these settings, we are still noticing a lot of fsyncs. Is this expected? Are we missing some other option?

Also, after the import and without any connection, MariaDB was still producing a non-trivial amount of activity on the machine: 5% CPU, read <1 MB/s, write 10 MB/s, 60 fsync/s. I could not identify what is causing these; where should I look? How can I tell when it will stabilise back to idle? When I interrupt the benchmark workload, I see similar resource usage.
I saw https://jira.mariadb.org/browse/MDEV-18698, and InnoDB seems to throttle its background activity: is there a way to tell InnoDB to perform its background tasks at maximum speed? (The goal is that after the import, we can make a clean tarball that we extract whenever we want to launch the test case again.)

Regards,
Julien
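For reference, the restore-time settings quoted above can be collected into a my.cnf fragment. This is a sketch, not from the thread; the option names and values are exactly those listed in the message, grouped under the usual [mysqld] section:

```ini
# Restore-only settings: crash safety is deliberately traded for import speed.
# Remove this fragment before putting the server back into normal service.
[mysqld]
innodb_flush_log_at_trx_commit = 0   # do not flush the redo log at each commit
innodb_flush_method            = nosync  # skip fsync() on data files
innodb_doublewrite             = 0   # disable the doublewrite buffer
sync_frm                       = 0   # do not sync .frm files on table creation
```

Keeping these in a separate include file makes it easy to drop them once the restore is done.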
Hi Julien,

First of all, is the version really 10.2.23, which was released in March 2019? If you are actually using the latest 10.2.33 release, you might be affected by MDEV-20638, which surprisingly introduced a performance regression. Based on benchmarks, we reverted that change from the upcoming 10.2, 10.3, and 10.4 releases (but not 10.5).

A significant source of background activity is the purge of the version history of old transactions. If SHOW ENGINE INNODB STATUS is reporting a nonzero "History list length", some purge activity will be needed. Before MariaDB 10.5, there was also a background merge of buffered changes to secondary indexes.

I would suggest checking with http://poormansprofiler.org/ or "sudo perf top" what is causing the I/O, and posting the stack traces. You might even try attaching GDB to the server and setting a breakpoint on fsync(), with the breakpoint commands "finish" and "continue", so that you can collect interesting stack traces.

Also, I would recommend trying a later major version. The InnoDB version in MariaDB 10.3 should scale better thanks to a lock-free hash table for maintaining the set of active transactions. MariaDB 10.5 includes further improvements. But be aware that we are working on a 10.5 performance regression in page flushing, in MDEV-23399.

With best regards,
Marko Mäkelä

On Tue, Sep 15, 2020 at 1:25 PM Julien Muchembled <jm@jmuchemb.eu> wrote:
Hello,
We are evaluating new hardware by reproducing a real-life workload on real-life data, which works fine on an existing server. We copied over the software used, to ensure an apples-to-apples comparison; it is based on MariaDB 10.2.23 and uses primarily InnoDB tables, with a few Mroonga tables which do not seem involved in the problem.
On the new hardware, we are seeing catastrophically bad performance, especially chain-deadlocks happening and being resolved but producing virtually no useful work at our usual parallelism level of 64 active connections.
If we reduce parallelism to 2 active connections, queries start to succeed in meaningful proportion.
The database was freshly restored from a mysqldump (as we cannot interrupt the original database); could this have any effect on deadlocks? The main tables involved in the deadlocking queries (job queues) are initially empty (in the test, and periodically in reality), so it would seem surprising.
What should I check to debug further?
Note that these deadlocks happen in production, but there they are negligible.
See attachment for mariadb configuration file.
Some more details: when we restore from mysqldump, we use the following extra options to speed things up:

innodb_flush_log_at_trx_commit = 0
innodb_flush_method = nosync
innodb_doublewrite = 0
sync_frm = 0

as during that period nothing of value can be lost (worst case, we restart the restore from scratch). BTW, despite these settings, we are still noticing a lot of fsyncs. Is this expected? Are we missing some other option?
Also, after the import and without any connection, MariaDB was still producing a non-trivial amount of activity on the machine: 5% CPU, read <1 MB/s, write 10 MB/s, 60 fsync/s. I could not identify what is causing these; where should I look? How can I tell when it will stabilise back to idle? When I interrupt the benchmark workload, I see similar resource usage.
I saw https://jira.mariadb.org/browse/MDEV-18698, and InnoDB seems to throttle its background activity: is there a way to tell InnoDB to perform its background tasks at maximum speed? (The goal is that after the import, we can make a clean tarball that we extract whenever we want to launch the test case again.)
Regards,
Julien
-- Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation
Thank you for your help.

On 15/09/2020 at 12:51, Marko Mäkelä wrote:
First of all, is the version really 10.2.23, which was released in March 2019?
Yes, it is. No upgrade since it was installed in April 2019.
A significant source of background activity is the purge of version history of old transactions. If SHOW ENGINE INNODB STATUS is reporting a nonzero "History list length", some purge activity will be needed. Before MariaDB 10.5, there was also a background merge of buffered changes to secondary indexes.
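The "History list length" figure mentioned above can be scraped from the status output to watch purge progress. A minimal sketch; the status text below is a hypothetical sample, and in practice you would pipe `mysql -e 'SHOW ENGINE INNODB STATUS\G'` into the same filter:

```shell
# Extract "History list length" from SHOW ENGINE INNODB STATUS output.
# In practice:
#   mysql -e 'SHOW ENGINE INNODB STATUS\G' | awk '/History list length/ {print $4}'
status_sample='TRANSACTIONS
------------
Trx id counter 1234
Purge done for trx'\''s n:o < 1200 undo n:o < 0
History list length 42'
hll=$(printf '%s\n' "$status_sample" | awk '/History list length/ {print $4}')
echo "history list length: $hll"   # a nonzero value means purge work remains
```

Polling this value until it stays near zero is a rough way to tell when the background purge has caught up.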
Good to know. I hoped to get an answer about how to tell InnoDB to complete its background tasks as soon as possible (even if background activity has no impact on performance, not having to wait so long to make a clean backup of the datadir remains useful), but I guess there's no way. Should I open a task in JIRA?
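As an editorial aside not raised in the thread: one commonly used way to get the datadir into a fully settled state before tarring it up is a slow shutdown, which makes InnoDB finish the purge of old version history and the merge of buffered secondary-index changes before the server exits. A sketch:

```sql
-- Request a "slow" shutdown: with innodb_fast_shutdown=0, InnoDB completes
-- purge and change-buffer merge before exiting, so the copied datadir has
-- no pending background work.
SET GLOBAL innodb_fast_shutdown = 0;
SHUTDOWN;  -- or stop the service normally; the setting applies to that shutdown
```

The shutdown can take a long time on a large history list, which is exactly the work being waited for.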
I would suggest checking with http://poormansprofiler.org/ or "sudo perf top" what is causing the I/O, and posting the stack traces.
I tried, but apparently without pertinent results. The NVMe drive seems too fast for background fsyncs to have an impact here. MariaDB reports a high number of fsyncs/s, but maybe some (most?) of them are somehow merged. With blktrace, I could verify that eatmydata (https://launchpad.net/libeatmydata) does eliminate the fsyncs done by mysqld, but it does not speed up anything. Same thing if I wait for background activity to finish: for the kind of foreground activity we have, it's not significantly faster.
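For reference, the "poor man's profiler" mentioned above boils down to taking gdb backtraces of all threads and counting identical stacks. The sketch below shows only the aggregation step on a hypothetical backtrace sample; in practice the input would come from `gdb -batch -ex 'thread apply all bt' -p "$(pidof mysqld)"`:

```shell
# Aggregate gdb "thread apply all bt" output into counted stack signatures,
# as the poor man's profiler does. The sample input below is hypothetical.
aggregate() {
  awk '
    /^#/      { f = ($2 ~ /^0x/) ? $4 : $2        # pick the frame function name
                stack = (stack == "") ? f : stack "," f }
    /^Thread/ { if (stack != "") print stack; stack = "" }
    END       { if (stack != "") print stack }
  ' | sort | uniq -c | sort -rn
}
sample='Thread 2 (LWP 101):
#0  0x00007f12aab0 in fsync () from /lib/libc.so.6
#1  os_file_flush (file=5) at os0file.cc:123

Thread 1 (LWP 100):
#0  0x00007f12aab0 in fsync () from /lib/libc.so.6
#1  os_file_flush (file=5) at os0file.cc:123'
result=$(printf '%s\n' "$sample" | aggregate)
echo "$result"   # both sample threads share the stack fsync,os_file_flush
```

The most frequent signature at the top is where the threads spend their time (or their waits).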
You might even try attaching GDB to the server and setting a breakpoint on fsync(), with breakpoint commands "finish" and "continue" so that you can collect interesting stack traces.
I didn't try.
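For completeness, the gdb session suggested above could look roughly like this command file. This is an untested sketch; the file name is mine, and you would attach with `gdb -p "$(pidof mysqld)" -x fsync-trace.gdb`:

```
# fsync-trace.gdb -- hypothetical command file for the suggested gdb session
break fsync
commands
  bt        # record the interesting stack trace
  finish
  continue
end
continue
```

Note that this stops the server at every fsync() call, so it is only suitable on a test instance.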
Also, I would recommend trying a later major version. The InnoDB version in MariaDB 10.3 should scale better thanks to a lock-free hash table for maintaining the set of active transactions. MariaDB 10.5 includes further improvements. But be aware that we are working on a 10.5 performance regression in page flushing, in MDEV-23399.
I only tried 10.3.24, and saw the same deadlocks.

After further analysis, we found that the deadlocks are due to a problem in our application. We have a scheduler that uses MariaDB to store tasks, and the performance profile of the new hardware is so unusual that some kinds of tasks accumulated to a point that the scheduler couldn't handle.

Julien
Hi Julien, On Mon, Sep 21, 2020 at 6:13 PM Julien Muchembled <jm@nexedi.com> wrote:
I hoped to get an answer about how to tell InnoDB to complete its background tasks as soon as possible (even if background activity has no impact on performance, not having to wait so long to make a clean backup of the datadir remains useful), but I guess there's no way. Should I open a task in JIRA?
We already have https://jira.mariadb.org/browse/MDEV-22958 filed for adding a SET GLOBAL statement that would wait for the purge to be finished. (Admittedly, the title of the bug is misleading, referring to a test timeout on our CI system.)
After further analysis, we found that the deadlocks are due to a problem in our application. We have a scheduler that uses MariaDB to store tasks, and the performance profile of the new hardware is so unusual that some kinds of tasks accumulated to a point that the scheduler couldn't handle.
So, is there anything that you would still want to diagnose?

I would recommend reading https://mysqlentomologist.blogspot.com/2020/09/dynamic-tracing-of-pthreadmut... and watching the MariaDB Fest 2020 recording of Valerii's talk. There is also the command "offcputime", which should be similar to "perf" but counting waits. http://www.brendangregg.com/offcpuanalysis.html shows how to combine it with flame graphs. I do not have any hands-on experience with these techniques yet.

Marko

-- 
Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation
participants (3)
- Julien Muchembled
- Julien Muchembled
- Marko Mäkelä