[MariaDB discuss] Re: Redo log and tablespace flushing

19 Nov 2023


On Sun, Nov 19, 2023 at 3:42 PM Marko Mäkelä <marko.makela@mariadb.com> wrote:
...
...
Thanks for this. Is there a way to force replay of the entire redo log on an unclean shutdown even if the checkpoint in the redo log says it was flushed to tablespace?
You can overwrite the newer checkpoint block, so that recovery is
forced to use the older one. Before MariaDB 10.8, the two checkpoint
blocks are 512 (0x200) bytes each, starting at ib_logfile0 offsets 0x200
and 0x600. Starting with 10.8, the checkpoint blocks are 64 bytes each,
starting at ib_logfile0 offsets 0x1000 and 0x2000. Obviously, do not try
this on any important data; experiment on a copy of the data instead. It is
possible that the recovery will fail in various ways if the section of
the log between the older checkpoint and the logical end of the log
has been overwritten. The InnoDB WAL file is cyclic: checkpoints
"truncate" the head, and the tail (new log records) is not supposed to
overwrite the head. If you move the head backwards by discarding the
latest checkpoint, there is no guarantee that the tail has not already
overwritten that part of the log.
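
A minimal sketch of that overwrite, for use on a scratch copy only. The
function name zero_checkpoint_block and the LAYOUTS table are hypothetical;
the offsets and sizes are the ones quoted above. Zero-filling is an
assumption (nothing above says what to overwrite the block with), and the
sketch does not try to work out which block is the newer one, so the caller
picks the block index.

```python
import shutil

# (offset, size) of the two checkpoint blocks in ib_logfile0, as quoted above.
LAYOUTS = {
    "pre-10.8": [(0x200, 512), (0x600, 512)],
    "10.8+": [(0x1000, 64), (0x2000, 64)],
}


def zero_checkpoint_block(src, dst, layout, index):
    """Copy src to dst, zero one checkpoint block in dst, return the old bytes."""
    offset, size = LAYOUTS[layout][index]
    shutil.copyfile(src, dst)  # work on a copy; the original file is only read
    with open(dst, "r+b") as f:
        f.seek(offset)
        old = f.read(size)
        if len(old) != size:
            raise ValueError("file is too short for this layout")
        f.seek(offset)
        f.write(bytes(size))  # assumption: overwrite with zero bytes
    return old
```

The returned bytes can be kept so that the block can be restored by hand if
recovery refuses to start from the modified copy.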
Another way to experiment would be to run mariadb-backup --backup
while a server is executing a write heavy workload. When you --prepare
the backup, it will start from the LSN of the checkpoint that was the
latest when the backup started. When the backup finishes, the server’s
log file may already be several checkpoints ahead of the backup.
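
The backup experiment could be scripted roughly as follows. This is only a
sketch: the helper names are hypothetical, the target directory is whatever
scratch path you choose, and connection options (user, password, socket)
are left out and would need to be added for a real server.

```python
import subprocess


def backup_commands(target_dir):
    """Return the two mariadb-backup invocations: take the backup, then prepare it."""
    return [
        ["mariadb-backup", "--backup", f"--target-dir={target_dir}"],
        ["mariadb-backup", "--prepare", f"--target-dir={target_dir}"],
    ]


def run_experiment(target_dir):
    """Run both steps in order; start the write-heavy workload before calling this."""
    for cmd in backup_commands(target_dir):
        subprocess.run(cmd, check=True)  # raises if a step fails
```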
I think what I'm looking for is an option to ignore checkpoints, scan
the entire redo log and replay everything from lowest to highest
available LSN.
From what you are saying, if I zero out bytes 512-1023 and bytes
1536-2047, that will force a full log scan / replay? Did I understand
that correctly?
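
As a check of the arithmetic only (it says nothing about whether recovery
accepts two zeroed blocks), the hexadecimal offsets and sizes quoted above
convert to inclusive decimal byte ranges like this. The helper name
inclusive_range is just for illustration.

```python
def inclusive_range(offset, size):
    """Return (first_byte, last_byte) for a block of `size` bytes at `offset`."""
    return offset, offset + size - 1


# Offsets and sizes as quoted earlier in the thread.
blocks = {
    "pre-10.8": [(0x200, 512), (0x600, 512)],
    "10.8+": [(0x1000, 64), (0x2000, 64)],
}

for layout, entries in blocks.items():
    for offset, size in entries:
        first, last = inclusive_range(offset, size)
        print(f"{layout}: bytes {first}-{last}")
```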
...
...
I'm exploring the idea of running the datadir on storage that preserves write ordering but runs with the equivalent of nobarrier. It will still flush in the background every X seconds, where X is configurable. Because write ordering is preserved, I am hoping the redo log can keep my data crash-safe even though I am lying about tablespace write flushes.
I can't comment much on that. It could be a good idea to execute some
kind of "pull the plug" testing during a write workload. Perhaps that
could be arranged more easily in a virtualized environment.
Yes, obviously this would need some extreme testing; that goes without
saying. I just wanted to make sure my idea wasn't outright misguided
before I went down this particular rabbit hole.

Gordan Bobic