On Mon, Jan 6, 2025 at 2:15 PM Kristian Nielsen via developers <developers@lists.mariadb.org> wrote:
> That code line was a bit sloppy and not correct. What I had in mind is
> something more like this:
>
>   if (page_offset > FIL_PAGE_DATA &&
>       block->page.oldest_modification() <= 1)
>   {
>     // Adding to a page that was already flushed. Redo log all the data
>     // to protect recovery against torn page on subsequent page write.
>     mtr->memcpy(*block, FIL_PAGE_DATA,
>                 (page_offset - FIL_PAGE_DATA) + size + 3);
>   }
>   else
>     mtr->memcpy(*block, page_offset, size + 3);
>
> I wonder if we could do a test case for this.
Yes, something like that could work.

For testing, my preference would be to use DEBUG_SYNC, possibly together with DBUG_EXECUTE_IF to prohibit page writes, and then to use Perl code to corrupt the data file. We have a number of tests that make use of no_checkpoint_start.inc and sometimes $ENV{MTR_SUITE_DIR}/include/crc32.pl to compute valid checksums for intentionally corrupted pages. Here we could just overwrite the last binlog block with NUL bytes.

I think that we could allow the binlog layer to write directly to the 4096-byte blocks that are allocated from the InnoDB buffer pool. The binlog page cleaner thread might even be writing the last (incomplete) block concurrently while we are adding more data to it. If that is an issue for external tools that try to read a consistent copy of the whole binlog, then it could be better to use page latches properly, like we do for InnoDB data pages.

Crash recovery would not have a problem with such racy writes, provided that the log in ib_logfile0 always completely initializes the pages. That is normally signalled by writing an INIT_PAGE record before the WRITE record.
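To make the corruption step concrete, the idea would be roughly the following. (I am writing it out in C++ only for clarity; in the actual test it would be a few lines of Perl in the .test file, gated by a DBUG_EXECUTE_IF keyword that suppresses the server's own page writes. The file name below is just an example, not the real binlog file name.)

// Standalone sketch: zero out the last 4096-byte block of a file,
// simulating a torn or lost page write. Untested, for illustration only.
#include <cstddef>
#include <fstream>
#include <vector>

int main()
{
  const std::streamoff block_size= 4096;
  std::fstream f("binlog-000001.ibb",          // example name only
                 std::ios::in | std::ios::out | std::ios::binary);
  if (!f)
    return 1;
  f.seekg(0, std::ios::end);
  const std::streamoff size= f.tellg();
  // Start of the last (possibly incomplete) block.
  const std::streamoff last= (size - 1) / block_size * block_size;
  std::vector<char> zeros(static_cast<std::size_t>(size - last), 0);
  f.seekp(last);
  f.write(zeros.data(), zeros.size());
  return f.good() ? 0 : 1;
}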
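Regarding the page latches, what I mean is roughly the pattern below, with std::shared_mutex standing in for the page latch of a buffer pool block (all the names here are made up for the sketch):

// Sketch of the race: the binlog writer appends to the last block while
// a concurrent reader (page cleaner, or a tool copying the binlog) looks
// at it. std::shared_mutex stands in for the InnoDB page latch; the
// caller is assumed to ensure that the payload fits in the block.
#include <cstddef>
#include <cstring>
#include <shared_mutex>

struct binlog_block
{
  std::shared_mutex latch;       // stand-in for the page latch
  std::size_t used= 0;           // bytes of valid payload in frame[]
  unsigned char frame[4096];
};

// Writer: append under exclusive latch, so that a reader can never
// observe a half-copied record.
void binlog_append(binlog_block &b, const void *data, std::size_t len)
{
  std::unique_lock<std::shared_mutex> x(b.latch);
  std::memcpy(b.frame + b.used, data, len);
  b.used+= len;
}

// Reader: take a consistent snapshot under shared latch.
std::size_t binlog_read(binlog_block &b, unsigned char *out)
{
  std::shared_lock<std::shared_mutex> s(b.latch);
  std::memcpy(out, b.frame, b.used);
  return b.used;
}

Without the latch, the reader could observe a half-written record in the last block; with it, we pay latch acquisition on every append. That is the trade-off I had in mind.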
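To spell out why crash recovery tolerates the racy writes: whatever garbage a torn write leaves on the page cannot survive recovery, because applying the log starts from INIT_PAGE, which reinitializes the whole page before any WRITE is applied. As a toy model (this is only an illustration, not the actual recovery code):

// Toy model of applying the log to one page. INIT_PAGE zero-fills the
// page, so any garbage that a torn write left on disk dies before the
// WRITE payloads are applied on top of it.
#include <cstddef>
#include <cstring>
#include <vector>

enum class rec_type { INIT_PAGE, WRITE };

struct log_rec
{
  rec_type type;
  std::size_t offset;                   // target offset within the page
  std::vector<unsigned char> data;      // payload for WRITE
};

void apply(unsigned char (&page)[4096], const std::vector<log_rec> &log)
{
  for (const log_rec &r : log)
    if (r.type == rec_type::INIT_PAGE)
      std::memset(page, 0, sizeof page);
    else
      std::memcpy(page + r.offset, r.data.data(), r.data.size());
}

A torn page only becomes a problem when part of its contents is no longer covered by the log, which is exactly the gap that the mtr->memcpy() of everything starting from FIL_PAGE_DATA in your snippet is meant to close.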
Marko

-- 
Marko Mäkelä, Lead Developer InnoDB
MariaDB plc