[Maria-developers] Adding a rotate event at the beginning of a relay log would be a great help
Hi Kristian,

Currently, a relay log is purged immediately after it has been completely replayed (relay_log_purge=1). And if the IO thread crashes, we have to delete all relay logs and fetch the master binlog again from exec_master_log_pos/file (relay_log_purge=ON). I think that is a great waste of disk IO, especially on cloud disks.

The critical problem is that there is no way to know the original master binlog filename of an event in the relay log. We can know a relay log event's original binlog offset on the master, but not the original binlog filename. So why not force a rotate event, or some new event type, containing the original master binlog filename at the beginning of every relay log? As far as I know, events from two different master binlogs are never stored in the same relay log, so all events in a relay log share a single original binlog filename.

If we want to know a relay log event's original binlog filename/position, we can get the filename from the beginning of the relay log and the offset from the event's "end log pos". Conversely, if we know an event's binlog filename/end position on the master, we can also quickly find it in the slave's relay log.

This brings several benefits. First, there is no need to delete all relay logs and fetch the master binlogs again when the IO thread crashes, because we can get the exact last read_master_log_filename/position from the last relay log. Second, there is no need to use GTID in a 1-master, n-slave replication failover scenario (GTID requires log_slave_updates=ON in MySQL 5.6, which increases disk IO load): if the master crashes, the other slaves can get their missing events from the newest slave's relay logs, as long as those relay logs have not been purged, and the newest slave is then promoted to master.

Thanks.
2014-11-26 nanyi607rao
"nanyi607rao" <nanyi607rao@gmail.com> writes:
Currently, a relay log is purged immediately after it has been completely replayed (relay_log_purge=1). And if the IO thread crashes, we have to delete all relay logs and fetch the master binlog again from exec_master_log_pos/file (relay_log_purge=ON). I think that is a great waste of disk IO, especially on cloud disks. The critical problem is that there is no way to know the original master binlog filename of an event in the relay log.
We can know a relay log event's original binlog offset on the master, but not the original binlog filename. So why not force a rotate event, or some new event type, containing the original master binlog filename at the beginning of every relay log? As far as I know, events from two different master binlogs are never stored in the same relay log, so all events in a relay log share a single original binlog filename.
The idea sounds reasonable. In fact, it's already partially there. I just tested with MariaDB 10.0:

MariaDB [test]> show relaylog events in 'frigg-relay-bin.000006';
+------------------------+-----+-------------------+-----------+-------------+------------------------------------------------------+
| Log_name               | Pos | Event_type        | Server_id | End_log_pos | Info                                                 |
+------------------------+-----+-------------------+-----------+-------------+------------------------------------------------------+
| frigg-relay-bin.000006 |   4 | Format_desc       |         2 |         248 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000006 | 248 | Rotate            |         1 |           0 | master-bin.000003;pos=4                              |
| frigg-relay-bin.000006 | 292 | Format_desc       |         1 |         248 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000006 | 536 | Gtid_list         |         1 |         287 | [0-1-6]                                              |
| frigg-relay-bin.000006 | 575 | Binlog_checkpoint |         1 |         327 | master-bin.000002                                    |
| frigg-relay-bin.000006 | 615 | Binlog_checkpoint |         1 |         367 | master-bin.000003                                    |
| frigg-relay-bin.000006 | 655 | Gtid              |         1 |         405 | BEGIN GTID 0-1-7                                     |
| frigg-relay-bin.000006 | 693 | Query             |         1 |         494 | use `test`; insert into t1 values (39)               |
| frigg-relay-bin.000006 | 782 | Xid               |         1 |         521 | COMMIT /* xid=25 */                                  |
+------------------------+-----+-------------------+-----------+-------------+------------------------------------------------------+

Here, the second event is just what you suggested, a rotate event containing the name of the master binlog file.

But I suppose, the issue is if the slave's relay logs are rotated in the middle of a master binlog file, due to max_relay_log_size < max_binlog_size or FLUSH LOGS on the slave?
When I tested, I got this:

MariaDB [test]> show relaylog events in 'frigg-relay-bin.000007';
+------------------------+-----+-------------+-----------+-------------+------------------------------------------------------+
| Log_name               | Pos | Event_type  | Server_id | End_log_pos | Info                                                 |
+------------------------+-----+-------------+-----------+-------------+------------------------------------------------------+
| frigg-relay-bin.000007 |   4 | Format_desc |         2 |         248 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000007 | 248 | Format_desc |         1 |           0 | Server ver: 10.0.15-MariaDB-debug-log, Binlog ver: 4 |
| frigg-relay-bin.000007 | 492 | Gtid        |         1 |         713 | BEGIN GTID 0-1-9                                     |
| frigg-relay-bin.000007 | 530 | Query       |         1 |         802 | use `test`; insert into t1 values (41)               |
| frigg-relay-bin.000007 | 619 | Xid         |         1 |         829 | COMMIT /* xid=27 */                                  |
+------------------------+-----+-------------+-----------+-------------+------------------------------------------------------+

So indeed, no rotate event there.

So if there is a master binlog rotate, we get a corresponding slave relaylog rotate containing the desired rotate event. But other relaylog rotations are missing the event.

One thing to be aware of is that relay log rotates can happen in the middle of a transaction. So you need to check that this is handled correctly. Also, there are some special semantics associated with rotate events, some rotate events are "fake" events generated on the master, and some are synthetic generated on the slaves, it needs to be all handled correctly. But I think it should be possible.
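To make the lookup under discussion concrete, here is a minimal sketch in Python. It works on rows shaped like SHOW RELAYLOG EVENTS output rather than on the binary relay log. The names RelayEvent, master_file_at_head and master_coordinates are hypothetical and not part of MariaDB; the sketch also assumes that an End_log_pos of 0 marks an event without a usable master offset, which is an inference from the two listings above.

```python
from typing import List, NamedTuple, Optional, Tuple


class RelayEvent(NamedTuple):
    """One row of SHOW RELAYLOG EVENTS (hypothetical container)."""
    pos: int          # offset of the event inside the relay log
    event_type: str
    server_id: int
    end_log_pos: int  # end offset of the event in the master binlog
    info: str


def master_file_at_head(events: List[RelayEvent],
                        master_id: int) -> Optional[str]:
    """Return the master binlog name from a Rotate event at the relay log
    head, or None when the head carries no such event."""
    for ev in events:
        if ev.event_type == "Rotate" and ev.server_id == master_id:
            # Info has the form "master-bin.000003;pos=4".
            return ev.info.split(";", 1)[0]
        if ev.event_type != "Format_desc":
            break  # past the header events: no Rotate at the head
    return None


def master_coordinates(events: List[RelayEvent], master_id: int,
                       relay_pos: int) -> Optional[Tuple[str, int]]:
    """Map the event at relay_pos to (master binlog file, end position)."""
    name = master_file_at_head(events, master_id)
    if name is None:
        return None
    for ev in events:
        # Assumption: End_log_pos 0 means "no usable master offset".
        if ev.pos == relay_pos and ev.end_log_pos > 0:
            return (name, ev.end_log_pos)
    return None
```

Fed with the rows of frigg-relay-bin.000006 and master server id 1, master_coordinates for relay position 693 gives ("master-bin.000003", 494). For frigg-relay-bin.000007 the head has no Rotate event, so master_file_at_head returns None, which is exactly the gap the proposal wants to close.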
This brings several benefits. First, there is no need to delete all relay logs and fetch the master binlogs again when the IO thread crashes, because we can get the exact last read_master_log_filename/position from the last relay log. Second, there
I guess you would need to implement some relay log recovery procedure, because in case of a crash the end of the relay log can be corrupt (partial transaction or event at the end). If we ensure that the SQL thread is also stopped at the time of recovery, that should not be too hard.
is no need to use GTID in a 1-master, n-slave replication failover scenario (GTID requires log_slave_updates=ON in MySQL 5.6, which increases disk IO load): if the master crashes, the other slaves can get their missing events from the newest slave's relay logs, as long as those relay logs have not been purged, and the newest slave is then promoted to master.
Well, MariaDB GTID works also with log_slave_updates=OFF. But I agree, being able to crash-recover relay logs and use them for various purposes could be useful. - Kristian.
Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Here, the second event is just what you suggested, a rotate event containing the name of the master binlog file.
But I suppose, the issue is if the slave's relay logs are rotated in the middle of a master binlog file, due to max_relay_log_size < max_binlog_size or FLUSH LOGS on the slave? When I tested, I got this:
It does indeed put the rotate event I suggested at the head of the relay log, but not always. If we forcibly insert a rotate event at the head of every relay log, there will sometimes be a redundant rotate event, which I don't think is a serious problem.
One thing to be aware of is that relay log rotates can happen in the middle of a transaction. So you need to check that this is handled correctly. Also, there are some special semantics associated with rotate events, some rotate events are "fake" events generated on the master, and some are synthetic generated on the slaves, it needs to be all handled correctly. But I think it should be possible.
A rotate event's server id tells us where it came from (master or slave), so the rotate event I suggested should carry the master's server id. I think it makes no difference to the SQL thread whether it is a fake rotate event, a real rotate event from the master, or the rotate event I suggested. But the SQL thread has to handle a rotate event in the middle of a transaction carefully.
I guess you would need to implement some relay log recovery procedure, because in case of a crash the end of the relay log can be corrupt (partial transaction or event at the end). If we ensure that the SQL thread is also stopped at the time of recovery, that should not be too hard.
I can use the Xid event to detect a partial transaction, but how can a partial event be detected without an event checksum? Event checksums are only available in MariaDB 5.5+ and MySQL 5.6+.
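The Xid-based check could look roughly like the following Python sketch. It assumes, as in the listings earlier in the thread, that every transaction starts with a Gtid event whose Info begins with BEGIN and ends with an Xid event; DDL, non-transactional groups and MySQL-style Query "BEGIN" events are deliberately ignored. The name safe_truncate_pos and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# (relay log position, event type, info), as in SHOW RELAYLOG EVENTS
Event = Tuple[int, str, str]


def safe_truncate_pos(events: List[Event], file_size: int) -> int:
    """Return the relay log offset up to which the events form complete
    transactions. If the log ends inside a transaction, this is the
    position of that transaction's Gtid event; otherwise file_size."""
    trx_start: Optional[int] = None
    for pos, event_type, info in events:
        if event_type == "Gtid" and info.startswith("BEGIN"):
            trx_start = pos      # a new transaction opens here
        elif event_type == "Xid":
            trx_start = None     # the open transaction is committed
    return file_size if trx_start is None else trx_start
```

Because a relay log rotate can happen in the middle of a transaction, a log may legitimately begin with the tail of a transaction. The sketch treats a leading Xid without a preceding Gtid as harmless and only reports a transaction that is still open at the end of the file.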
raolonghui <nanyi607rao@gmail.com> writes:
It does indeed put the rotate event I suggested at the head of the relay log, but not always. If we forcibly insert a rotate event at the head of every relay log, there will sometimes be a redundant rotate event, which I don't think is a serious problem.
Probably not, or if we really want, it should not be hard to only insert the extra rotate event when it is needed.
I can use the Xid event to detect a partial transaction, but how can a partial event be detected without an event checksum? Event checksums are only available in MariaDB 5.5+ and MySQL 5.6+.
A partial event should be easy to detect: you will simply see end-of-file on the relay log file before the event is fully read. - Kristian.
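The end-of-file check can be sketched in Python as below. It assumes the binlog format version 4 layout: a 4-byte magic number followed by events that each start with a 19-byte common header (timestamp, type code, server id, event length, next position, flags, all little-endian). The function name last_complete_event_end is hypothetical, and the sketch only detects truncation; it cannot notice damage inside an event of the right length, which is what checksums are for.

```python
import struct
from typing import BinaryIO

BINLOG_MAGIC = b"\xfebin"   # first four bytes of a binlog or relay log
HEADER_LEN = 19             # common event header in binlog format v4
HEADER_FMT = "<IBIIIH"      # timestamp, type, server_id, length, next_pos, flags


def last_complete_event_end(stream: BinaryIO) -> int:
    """Return the offset just past the last event that is fully present.

    Anything after that offset is a partial event: end-of-file was hit
    before the length announced in the event header had been read."""
    if stream.read(4) != BINLOG_MAGIC:
        raise ValueError("not a binlog/relay log file")
    good_end = 4
    while True:
        header = stream.read(HEADER_LEN)
        if len(header) < HEADER_LEN:
            return good_end          # clean EOF or truncated header
        fields = struct.unpack(HEADER_FMT, header)
        event_len = fields[3]
        if event_len < HEADER_LEN:
            return good_end          # implausible length, stop here
        body = stream.read(event_len - HEADER_LEN)
        if len(body) < event_len - HEADER_LEN:
            return good_end          # truncated event body
        good_end += event_len
```

A recovery step could open the last relay log in binary mode, call this function, and truncate the file at the returned offset before restarting the IO thread, provided the SQL thread is stopped at that time as noted earlier in the thread.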
participants (3)
- Kristian Nielsen
- nanyi607rao
- raolonghui