Marko Mäkelä <marko.makela@mariadb.com> writes:
Hi Kristian,
On Mon, Feb 26, 2024 at 8:31 PM Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
I would tweak the log checkpoint to ensure that all pages of the "previous" binlog tablespace must be written back before we can advance the log checkpoint. The tablespace ID (actually just 1 bit of
Conversely, we would then also need to wait for a log checkpoint before we can rotate to a new binlog tablespace, right?
That is not necessary. We only have to completely write back the changes from the buffer pool to the last-but-one binlog file, whose tablespace ID we are about to reuse for the new file. That can be done
checkpoint from being advanced. If we write the last modification LSN to the first page of the binlog tablespace, recovery can simply skip all log records for the binlog tablespace that are older than the LSN.
Ah! Yes, I see, that seems a good solution. And users will want to have binlog data written to the file as quickly as possible anyway, to be visible with external tools (mariadb-binlog), so that fits well. So this looks good to me; I like the approach of cycling between two reserved tablespace IDs.
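To check my understanding, here is a minimal sketch of the recovery rule and the ID cycling. All names in it (BinlogRedoRecord, next_space_bit, must_apply, records_to_apply) are made up for illustration and are not existing InnoDB code. It also assumes that a record whose LSN equals the LSN stored on the first page still gets applied; the text above only says that older records are skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not InnoDB's real structures.
using lsn_t = std::uint64_t;

struct BinlogRedoRecord {
  unsigned space_bit;  // which of the two reserved tablespace IDs (0 or 1)
  lsn_t lsn;           // LSN of this redo log record
};

// The two reserved IDs alternate, so a new binlog file reuses the ID of
// the last-but-one file.
inline unsigned next_space_bit(unsigned current_bit) {
  assert(current_bit <= 1);
  return current_bit ^ 1U;
}

// Recovery rule: skip a record that is older than the LSN stored on the
// first page of the file currently owning that tablespace ID.
// Assumption: a record with exactly that LSN is applied.
inline bool must_apply(const BinlogRedoRecord &rec,
                       const lsn_t first_page_lsn[2]) {
  assert(rec.space_bit <= 1);
  return rec.lsn >= first_page_lsn[rec.space_bit];
}

// Filter a redo log down to the records that recovery has to apply.
inline std::vector<BinlogRedoRecord> records_to_apply(
    const std::vector<BinlogRedoRecord> &log, const lsn_t first_page_lsn[2]) {
  std::vector<BinlogRedoRecord> out;
  for (const BinlogRedoRecord &rec : log) {
    if (must_apply(rec, first_page_lsn)) out.push_back(rec);
  }
  return out;
}
```

With this rule, stale records that belong to an older file that used the same ID are ignored without recovery needing to know anything else about that older file.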
Created pages are fixed in the buffer pool until the mtr_t::commit() that would release the page latch and the buffer-fix. Simply by invoking buf_page_t::fix() before mtr_t::commit() you can extend the buffer-fix, to reuse the page in a subsequent mini-transaction. For example,
Ok, thanks for the explanation, sounds useful.
If a page is buffer-fixed for an unbounded time, it could interfere with an attempt to shrink the buffer pool or to respond to a memory pressure event. Some interface for releasing those pages would be nice
Right. In this case, the idea would be to fix at most one page at a time, the last partial page that is currently being appended to.

 - Kristian.
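A rough sketch of the "at most one fixed page" idea, to make it concrete. This is a standalone toy model, not InnoDB code: Page, TailPageAppender, the fix_count field and the 16 KiB page size are all assumptions made for illustration. The point is only that the appender holds a fix on the current partial page, drops it when the page fills up, and can also drop it on request (for example on a memory pressure event) and take it again on the next append.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

constexpr std::size_t kPageSize = 16384;  // assumed page size

// Toy stand-in for a buffer pool page; fix_count models the buffer-fix.
struct Page {
  std::uint32_t page_no = 0;
  std::size_t used = 0;
  unsigned fix_count = 0;
  unsigned char data[kPageSize] = {};
};

class TailPageAppender {
 public:
  // Append bytes; only the current partial (tail) page stays fixed.
  void append(const unsigned char *buf, std::size_t len) {
    while (len > 0) {
      if (!tail_) acquire_tail();
      std::size_t n = std::min(len, kPageSize - tail_->used);
      std::memcpy(tail_->data + tail_->used, buf, n);
      tail_->used += n;
      buf += n;
      len -= n;
      if (tail_->used == kPageSize) release_tail();  // full page: unfix
    }
  }

  // Drop the fix on the tail page, e.g. in response to memory pressure.
  void release_tail() {
    if (!tail_) return;
    assert(tail_->fix_count == 1);
    --tail_->fix_count;
    tail_ = nullptr;
  }

  std::size_t page_count() const { return pages_.size(); }

  std::size_t fixed_pages() const {
    std::size_t n = 0;
    for (const auto &p : pages_) n += (p->fix_count > 0);
    return n;
  }

 private:
  // Re-fix the last page if it still has room, else start a new page.
  void acquire_tail() {
    if (pages_.empty() || pages_.back()->used == kPageSize) {
      pages_.push_back(std::make_unique<Page>());
      pages_.back()->page_no = static_cast<std::uint32_t>(pages_.size() - 1);
    }
    tail_ = pages_.back().get();
    ++tail_->fix_count;
  }

  std::vector<std::unique_ptr<Page>> pages_;
  Page *tail_ = nullptr;  // the single page currently fixed, if any
};
```

In this model fixed_pages() never exceeds 1, so the fix held by the appender is bounded regardless of how much binlog data is written.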