Nirbhay Choubey <nirbhay@mariadb.com> writes:

[Cc: maria-developers@, please always keep these discussions on the mailing list]

> In Galera cluster, the state transfer scripts perform FTWRL and copy data
> along with the last of all available binlog files to the joiner node.
> After MDEV-181, I understand that the binlog checkpoint can be in any of
> the binary log files (and not necessarily the last one).
> This seemingly has caused MDEV-9423, in which the joiner node complains of
> the missing binlog file.
> Now the question is: is FTWRL not sufficient to ensure that the checkpoint
> is always the last binlog file?
So if I understand correctly, the issue is related to having binlog files available during XA crash recovery.

When the binlog file is rotated, there is a small window where both the latest and the previous binlog files are needed for crash recovery. The binlog checkpoint is the earliest binlog file that is needed for crash recovery, and it can be seen from the binlog checkpoint event.

So the problem here is that a copy is made just after binlog rotation, and Galera only copies the most recent, mostly-empty binlog file, leaving insufficient information for XA recovery, right?

One option to solve this is to always copy the last two binlog files. While it is theoretically possible to have the binlog checkpoint more than two files back, I think it will not occur in practice.

Another option is to wait for the binlog checkpoint to reach the current binlog file. You can see this done in the test suite:

    mysql-test/include/wait_for_binlog_checkpoint.inc

The binlog checkpointing happens asynchronously. I *think* it can complete even while FTWRL is active, but I am not 100% sure. The checkpoint happens after InnoDB has made its commits durable with fsync() or similar; only after that is it safe to discard the old binlog data and still have correct crash recovery.

 - Kristian.
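
To make the two options concrete, here is a rough sketch of the file-selection logic only. It is not taken from the Galera state transfer scripts or from the server. The function names and the inputs (an ordered list of binlog file names and the file name from the latest binlog checkpoint event) are hypothetical; how a script would obtain them is left out.

```python
from typing import List


def binlogs_needed_for_recovery(binlog_files: List[str],
                                checkpoint_file: str) -> List[str]:
    """Return the binlog files a copy would need for crash recovery.

    binlog_files: file names in creation order, oldest first
        (hypothetical input, e.g. read from the binlog index file).
    checkpoint_file: file name from the most recent binlog checkpoint
        event (hypothetical input; obtaining it is outside this sketch).

    Everything from the checkpoint file through the newest file is kept,
    since the checkpoint is the earliest file still needed.
    """
    if not binlog_files:
        return []
    if checkpoint_file not in binlog_files:
        # Unknown checkpoint: be conservative and keep every file.
        return list(binlog_files)
    start = binlog_files.index(checkpoint_file)
    return binlog_files[start:]


def checkpoint_is_current(binlog_files: List[str],
                          checkpoint_file: str) -> bool:
    """True when the checkpoint has reached the newest binlog file,
    i.e. copying only the last file would be enough."""
    return bool(binlog_files) and binlog_files[-1] == checkpoint_file


if __name__ == "__main__":
    files = ["mysql-bin.000001", "mysql-bin.000002", "mysql-bin.000003"]
    # Just after rotation: checkpoint still points at the previous file.
    print(binlogs_needed_for_recovery(files, "mysql-bin.000002"))
    print(checkpoint_is_current(files, "mysql-bin.000002"))
```

A script following the second option would poll until `checkpoint_is_current` holds before copying only the last file; one following a generalized form of the first option would copy whatever `binlogs_needed_for_recovery` returns.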