
Kristian Nielsen via developers <developers@lists.mariadb.org> writes:
What if the binlog code simply keeps track of whenever the current page of the binlog gets partially written to the file system? And when this happens, the next mtr_t write to that page will simply re-write all the data from the start of the page? This way the recovery code can always assume that the page is valid on disk prior to each redo record, and that the rest of the page following the record should be set to zeros.
I think it's literally just replacing this line:
mtr->memcpy(*block, page_offset, size+3);
with this in the rare case after the page was partially written:
mtr->memcpy(*block, 0, size+3);
That code line was a bit sloppy and not correct. What I had in mind is more something like this:

  if (page_offset > FIL_PAGE_DATA &&
      block->page.oldest_modification() <= 1) {
    // Adding to a page that was already flushed. Redo log all the data to
    // protect recovery against torn page on subsequent page write.
    mtr->memcpy(*block, FIL_PAGE_DATA,
                (page_offset - FIL_PAGE_DATA) + size+3);
  } else {
    mtr->memcpy(*block, page_offset, size+3);
  }

I wonder if we could do a test case for this. Some DBUG injection in the code that writes the page to disk, which instead writes garbage to the page and crashes the server, simulating a power outage that corrupts the page write. The test would then need to somehow arrange for the page to be first partially written and then written again with the DBUG injection active.

 - Kristian.
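
To make the torn-page argument concrete, here is a small standalone toy model, not MariaDB code. Everything in it is a hypothetical stand-in: kPageSize, kPageData (playing the role of FIL_PAGE_DATA), RedoRecord, log_append(), recover() and simulate() are made up for illustration, and the extra 3 bytes in size+3 are left out. It assumes that once the page has been partially written to disk, the redo for the earlier data is no longer available to recovery, and that a torn write leaves garbage in the on-disk page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// All names and sizes here are hypothetical stand-ins, not MariaDB code.
constexpr std::size_t kPageSize = 64;  // toy page size
constexpr std::size_t kPageData = 8;   // plays the role of FIL_PAGE_DATA

using Page = std::vector<std::uint8_t>;

struct RedoRecord {
  std::size_t offset;  // where on the page the bytes belong
  Page bytes;          // the logged bytes
};

// Build the redo record for `size` bytes just appended at page_offset.
// With relog_from_start set, a page that was already flushed gets all of
// its data logged again, from kPageData up to the end of the new bytes.
RedoRecord log_append(const Page &page, std::size_t page_offset,
                      std::size_t size, bool already_flushed,
                      bool relog_from_start)
{
  assert(page_offset >= kPageData && page_offset + size <= page.size());
  std::size_t start = page_offset;
  if (relog_from_start && already_flushed && page_offset > kPageData)
    start = kPageData;
  return {start, Page(page.begin() + start, page.begin() + page_offset + size)};
}

// Toy recovery: apply each record, then zero everything after it.
Page recover(Page disk, const std::vector<RedoRecord> &log)
{
  for (const RedoRecord &rec : log) {
    const std::size_t end = rec.offset + rec.bytes.size();
    assert(end <= disk.size());
    std::memcpy(disk.data() + rec.offset, rec.bytes.data(), rec.bytes.size());
    std::fill(disk.begin() + end, disk.end(), std::uint8_t{0});
  }
  return disk;
}

// Returns true if recovery reproduces the page data after a torn write.
bool simulate(bool relog_from_start)
{
  Page page(kPageSize, 0);
  std::memset(page.data() + kPageData, 'A', 10);  // first chunk of data

  // Assumption: the page is now partially written to disk, and the redo
  // for the 'A' bytes is no longer available to recovery.
  const std::size_t page_offset = kPageData + 10;
  std::memset(page.data() + page_offset, 'B', 10);  // second chunk

  std::vector<RedoRecord> log;
  log.push_back(log_append(page, page_offset, 10, true, relog_from_start));

  // Torn write of the second flush: the on-disk page is all garbage.
  Page disk(kPageSize, 0xFF);

  const Page recovered = recover(disk, log);
  return std::equal(page.begin() + kPageData, page.end(),
                    recovered.begin() + kPageData);
}
```

In this model simulate(false) returns false, because the 'A' bytes are only on the garbage page and nothing in the log restores them, while simulate(true) returns true, because the single record covers everything from kPageData onwards. A real test would still need the DBUG injection in the actual page write path; this only illustrates why logging from the start of the data makes recovery independent of the on-disk page contents.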