"nanyi607rao" <nanyi607rao@gmail.com> writes:
> Currently, a relay log is purged immediately after it has been completely replayed (relay_log_purge=1). And if the IO thread crashes, we have to delete all relay logs and fetch the master binlog again from exec_master_log_pos/file (relay_log_purge=ON). I think that makes for a great waste of disk IO, especially on cloud disks. The critical problem is that there is no way to know the original master binlog filename of an event in the relay log.
> Actually, we can know a relay log event's original binlog offset on the master, but we can't know the original binlog filename. So why not force a rotate event, or some new event type, containing the original master binlog filename at the beginning of each relay log? As far as I know, events from two different master binlogs are never contained in one relay log, so all events in a relay log share a single original binlog filename.
The idea sounds reasonable. In fact, it's already partially there. I just tested with MariaDB 10.0:

MariaDB [test]> show relaylog events in 'frigg-relay-bin.000006';
+------------------------+-----+-------------------+-----------+-------------+------------------------------------------------------+
| Log_name               | Pos | Event_type        | Server_id | End_log_pos | Info                                                 |
+------------------------+-----+-------------------+-----------+-------------+------------------------------------------------------+
| frigg-relay-bin.000006 |   4 | Format_desc       |         2 |         248 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000006 | 248 | Rotate            |         1 |           0 | master-bin.000003;pos=4                              |
| frigg-relay-bin.000006 | 292 | Format_desc       |         1 |         248 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000006 | 536 | Gtid_list         |         1 |         287 | [0-1-6]                                              |
| frigg-relay-bin.000006 | 575 | Binlog_checkpoint |         1 |         327 | master-bin.000002                                    |
| frigg-relay-bin.000006 | 615 | Binlog_checkpoint |         1 |         367 | master-bin.000003                                    |
| frigg-relay-bin.000006 | 655 | Gtid              |         1 |         405 | BEGIN GTID 0-1-7                                     |
| frigg-relay-bin.000006 | 693 | Query             |         1 |         494 | use `test`; insert into t1 values (39)               |
| frigg-relay-bin.000006 | 782 | Xid               |         1 |         521 | COMMIT /* xid=25 */                                  |
+------------------------+-----+-------------------+-----------+-------------+------------------------------------------------------+

Here, the second event is just what you suggested, a rotate event containing the name of the master binlog file. But I suppose the issue is when the slave's relay logs are rotated in the middle of a master binlog file, due to max_relay_log_size < max_binlog_size or FLUSH LOGS on the slave?
When I tested, I got this:

MariaDB [test]> show relaylog events in 'frigg-relay-bin.000007';
+------------------------+-----+-------------+-----------+-------------+------------------------------------------------------+
| Log_name               | Pos | Event_type  | Server_id | End_log_pos | Info                                                 |
+------------------------+-----+-------------+-----------+-------------+------------------------------------------------------+
| frigg-relay-bin.000007 |   4 | Format_desc |         2 |         248 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000007 | 248 | Format_desc |         1 |           0 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000007 | 492 | Gtid        |         1 |         713 | BEGIN GTID 0-1-9                                     |
| frigg-relay-bin.000007 | 530 | Query       |         1 |         802 | use `test`; insert into t1 values (41)               |
| frigg-relay-bin.000007 | 619 | Xid         |         1 |         829 | COMMIT /* xid=27 */                                  |
+------------------------+-----+-------------+-----------+-------------+------------------------------------------------------+

So indeed, no rotate event there. So if there is a master binlog rotate, we get a corresponding slave relay log rotate containing the desired rotate event. But other relay log rotations are missing the event.

One thing to be aware of is that relay log rotates can happen in the middle of a transaction, so you need to check that this is handled correctly. Also, there are some special semantics associated with rotate events: some rotate events are "fake" events generated on the master, and some are synthetic, generated on the slave. It all needs to be handled correctly. But I think it should be possible.
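To make the gap concrete, here is a rough Python sketch that takes rows shaped like the SHOW RELAYLOG EVENTS output above and tries to derive the master binlog coordinates from them. It is only an illustration, not how the server does it. The names `RelayEvent` and `master_coordinates` are made up for this mail, and it assumes that End_log_pos of an event received from the master is the master's offset (as the original poster describes) and that events written by the slave itself can be told apart by their Server_id.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class RelayEvent(NamedTuple):
    """One row of SHOW RELAYLOG EVENTS (hypothetical helper type)."""
    pos: int
    event_type: str
    server_id: int
    end_log_pos: int
    info: str


def master_coordinates(
    events: Sequence[RelayEvent], slave_server_id: int
) -> Tuple[Optional[str], Optional[int]]:
    """Return (master binlog filename, master offset) as far as the
    relay log reveals them; the filename is None without a Rotate event."""
    filename: Optional[str] = None
    position: Optional[int] = None
    for ev in events:
        if ev.server_id == slave_server_id:
            continue  # event written by the slave itself, not from the master
        if ev.event_type == "Rotate":
            # Info looks like "master-bin.000003;pos=4"
            name, _, pos_part = ev.info.partition(";pos=")
            filename = name
            position = int(pos_part) if pos_part else None
        elif ev.end_log_pos > 0:
            # Assumption: End_log_pos is the offset in the master's binlog.
            position = ev.end_log_pos
    return filename, position


# Abbreviated rows from the two relay logs shown above.
RELAY_6 = [
    RelayEvent(4, "Format_desc", 2, 248, "Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4"),
    RelayEvent(248, "Rotate", 1, 0, "master-bin.000003;pos=4"),
    RelayEvent(292, "Format_desc", 1, 248, "Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4"),
    RelayEvent(655, "Gtid", 1, 405, "BEGIN GTID 0-1-7"),
    RelayEvent(693, "Query", 1, 494, "use `test`; insert into t1 values (39)"),
    RelayEvent(782, "Xid", 1, 521, "COMMIT /* xid=25 */"),
]
RELAY_7 = [
    RelayEvent(4, "Format_desc", 2, 248, "Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4"),
    RelayEvent(248, "Format_desc", 1, 0, "Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4"),
    RelayEvent(492, "Gtid", 1, 713, "BEGIN GTID 0-1-9"),
    RelayEvent(619, "Xid", 1, 829, "COMMIT /* xid=27 */"),
]

print(master_coordinates(RELAY_6, slave_server_id=2))
print(master_coordinates(RELAY_7, slave_server_id=2))
```

For the first relay log this yields the filename master-bin.000003 together with offset 521; for the second one the offset 829 is found but the filename comes back as None, which is exactly the missing piece.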
> That would bring many benefits. Firstly, there is no need to delete all relay logs and fetch the master binlogs again when the IO thread crashes, because we can get the exact last read_master_log_filename/position from the last relay log. Secondly, there
I guess you would need to implement some relay log recovery procedure, because in case of a crash, the end of the relay log can be corrupt (partial transaction or event at the end). If we ensure that the SQL thread is also stopped at the time of recovery, that should not be too hard.
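As a rough idea of the event-level part of such a recovery, the Python sketch below walks the event headers of a relay log image and reports where the last complete event ends, so that anything after that offset could be truncated. It relies on the v4 binlog layout (4-byte magic, then events with a 19-byte header whose event length field sits at bytes 9..12). The function name `last_complete_event_end` is made up here, and the sketch deliberately ignores event checksums and transaction boundaries, so a partial transaction at the end would still need separate handling.

```python
import struct

BINLOG_MAGIC = b"\xfebin"   # first 4 bytes of every binlog / relay log file
HEADER_LEN = 19             # v4 common event header size
# timestamp, type code, server id, event length, next position, flags
HEADER_FORMAT = "<IBIIIH"


def last_complete_event_end(data: bytes) -> int:
    """Return the offset just past the last event that is fully present.

    Bytes after that offset belong to a truncated (or garbage) event and
    could be cut off during recovery. Checksums are NOT verified here.
    """
    if data[:4] != BINLOG_MAGIC:
        raise ValueError("not a binlog/relay log file")
    offset = len(BINLOG_MAGIC)
    while offset + HEADER_LEN <= len(data):
        fields = struct.unpack_from(HEADER_FORMAT, data, offset)
        event_len = fields[3]
        if event_len < HEADER_LEN or offset + event_len > len(data):
            break  # implausible length, or the event body is cut short
        offset += event_len
    return offset
```

A caller would read the relay log, compare the returned offset with the file size, and truncate the file if they differ, with the SQL thread stopped as mentioned above.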
> is no need to use GTID in a 1-to-n replication failover scenario (GTID requires log_slave_updates=ON in MySQL 5.6, which increases disk IO load). If the master crashes, the other slaves can get their lost events from the newest slave's relay logs, as long as the relay logs have not been purged, and then the newest slave is promoted to master.
Well, MariaDB GTID works also with log_slave_updates=OFF. But I agree, being able to crash-recover relay logs and use them for various purposes could be useful.

 - Kristian.