[MariaDB discuss] Re: increasing InnoDB history length and stalls on MariaDB 11.4

4 Dec 2024

      Hi,

On 2024-12-03 12:21, Gordan Bobic wrote:
...
...
To achieve a good performance as before, we now use READ COMMITTED
transaction isolation for DML and use REPEATABLE READ isolation for 
DQL
on MariaDB 11.4. This way guarantees the consistency for our
application.
Have you actually verified that your application needs
repeatable-read? It's possible but in my experience it is extremely
rare for applications to re-read the rows they already read.
Not to say it never happens, but I have seen application
implementations that actually needs this maybe twice out of thousands
of applications I've worked on over the last 25 years.
Our queries are dynamic and we also have queries which depends on the 
result of earlier queries in the same transaction.
On MariaDB 10.4 with innodb-locks-unsafe-for-binlog, we use REPEATABLE 
READ everywhere.
Having a few carefully-checked exceptions to use READ COMMITTED would be 
fine, but would require more work than we can afford.
...
...
Issue 1: InnoDB history length keeps growing while running lots of DML
and DQL queries in parallel.
InnoDB history length is stable around 5k on 10.4, but it keeps 
growing
on 11.4 up to 100k until the end. In our experience, increases in
history list length are roughly
correlated with poor query performance (causing queries to "step over
tombstones" ?), and we do our best to avoid having long-running
transactions in order to allow older snapshots to be freed ASAP. We 
are
satisfied with the status on 10.4, where history list is constant.
What is the reason behind such increases on 11.4 and is there any
configuration to avoid such ?
Have you been able to measure the performance impact of this? 100K
isn't usually high enough to start causing a measurable performance
degradation, that usually starts to happen north of 1M history length.
For the performance impact, I compared pt-query-digest of slow queries 
between :
* 10.4 with READ COMMITTED and REPEATABLE READ
* 11.4 with READ COMMITTED and REPEATABLE READ, with 2GB log file size 
where we had no stalls

On 10.4, all slow queries are DML. On 11.4, we have fewer DML slow 
queries but we also have some DQL slow queries.
'DQL slow queries on 11.4 only' is a sign of performance impact of much 
more history length, I guess.
(and 'fewer DML slow queries on 11.4' should come from various 
improvements on recent MariaDB).

And this reference workload simulates a small part of what we do in 
production. So if the history length keeps increasing like this, it will 
reach 1M or more.

I just tried READ COMMITTED everywhere on 11.4 and I had no such 
consistent increase of the history length (see the attachment). But it 
is not an acceptable solution and we anyway had no such issue on 10.4 
with READ COMMITTED and REPEATABLE READ combination.
...
...
Issue 2: MariaDB stalls when free redo log is too small.
...
This looks like the furious flushing regression that was introduced in
10.5 and doesn't appear to have been fixed since.
It is possible to tune it out to a large extent, though:
https://shatteredsilicon.net/mariadb-furious-flushing-fix-checkpoint-age/
Your article looks quite interesting and the symptom we had seems 
identical. And I tried the same configuration as you wrote there,
innodb_adaptive_flushing = 0
innodb_adaptive_flushing_lwm = 50
innodb_max_dirty_pages_pct_lwm = 50
but the behaviour was still same, i.e. MariaDB stalled when free redo 
log is low.

I can admit 128MB log file size is too small for 16GB buffer pool even 
though we had no issue on MariaDB 10.4, but MariaDB 11.4 still stalls 
with 1GB log file as well.

Best regards,
Kazuhiko

[MariaDB discuss] Re: increasing InnoDB history length and stalls on MariaDB 11.4

kazuhiko＠nexedi.com