Re: [Maria-developers] ee538938345: MDEV-21117: refine the server binlog-based recovery for semisync
Hi, Andrei! On Mar 01, Andrei Elkin wrote:
I've reviewed almost everything, see comments below. But not the Recovery_context methods. Please explain how it works and how all these truncate_validated, truncate_reset_done, truncate_set_in_1st, etc all work together.
...specifically to this point. Just in case: I hope you did not miss reading recovery_design.txt from the MDEV, which does not go into the coding details that you're effectively asking about above.
I've read it now. Still I don't understand why you need an extra round of binlog scanning (two rounds if only one file, three rounds if many). Also, this recovery_design.txt is not part of the code, so whoever will look at it later will be just as puzzled as I was. ...
[ G1, G2, ..., G_k, g_{k+1}, ... g_n ]
Here the uppercase `G' stands for a committed trx, the lowercase `g' for a prepared one, and `_k' is the subscript (the position) in the recovery sequence. Since in the single-engine case the capital letters always come first, the first occurrence of the pattern `G_k,g_k+1' identifies the truncate index. The patch reflects this fact by raising the `truncate_validated' flag.
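To make the single-engine rule concrete, here is a minimal sketch of it. This is not code from the patch: the function name `first_truncate_index` and the encoding of the recovery sequence as a list of "G"/"g" letters are assumptions made purely for illustration.

```python
def first_truncate_index(states):
    """Return the 0-based position of the first prepared trx ('g') that
    directly follows a committed one ('G'), or None if there is no such drop.

    `states` is a hypothetical encoding of the recovery sequence:
    'G' = committed, 'g' = prepared.
    """
    for k in range(len(states) - 1):
        if states[k] == "G" and states[k + 1] == "g":
            return k + 1  # position of g_{k+1}, the truncate candidate
    return None


# [ G1, G2, G3, g4, g5 ]: the first `G,g' drop is between G3 and g4.
print(first_truncate_index(["G", "G", "G", "g", "g"]))
```

In the single-engine case this first drop is also the only one, which is why the candidate can be treated as final as soon as it is found.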
But it's more complicated in the multiple binlog files / engines case. The first `Gg' letter-case drop is not guaranteed to be the only drop so `truncate_reset_done' and `truncate_set_in_1st' are introduced to help with truncate index identification.
Why is it not guaranteed?
When `truncate_validated' is set, the truncate index is determined and may not change in the current (1st or 2nd) round or in any future round. `truncate_reset_done' says that an "inverse" `g_k,G_k+1' pair has been found, so any earlier truncation candidate gets reset (to "zero"). Any candidate found later in *this* (1st or 2nd) round of the sequence will obviously have a greater index.
The `truncate_set_in_1st' flag remembers that the truncate candidate was found in the 1st round (in the "hot" binlog file). If the candidate has not been validated (`!truncate_validated'), it may be refined in the 2nd round, and then to an earlier transaction. So the flag helps to handle the exception from the truncate candidate monotonicity rule: e.g. the hot binlog B2 contains `[g5,g6,...g_n]' and a ref to the binlog checkpoint file B1, which contains `[G1,g2,G3,g4]'. The first-round truncate candidate g5 would first be refined to `g2' before being finally settled as `g4' in the 2nd round. (Notice that `g2 -> g4' preserves the truncate index monotonicity.)
Notice that because of a refinement like `g2 -> g4' in the 2nd round of the above example, `g2' has "to be up-cased" into `G2' for committing (feasible in a scenario with two trx:s on two different engines: g2 only got prepared in Innodb, while G4 is in Rocksdb and got committed prior to the crash). That's what the 3rd round is for.
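The B1/B2 example can be traced with a small sketch. It is a simplification under stated assumptions, not the patch's algorithm: it walks the transactions in logical binlog order (B1, then B2) in a single pass, whereas the patch scans the hot file first and uses separate rounds. The function name `refine_truncate` and the (name, state) tuples are hypothetical.

```python
def refine_truncate(seq):
    """Walk (name, state) pairs in binlog order; state is 'G' (committed)
    or 'g' (prepared). Return (truncate_candidate, to_commit).

    Assumption for illustration only: a committed trx found after a
    candidate (the "inverse" g,G pair) resets the candidate, and the
    prepared trx:s passed over that way are the ones to be committed.
    """
    candidate = None   # current truncate candidate, if any
    pending = []       # prepared trx:s seen since the candidate was set
    to_commit = []     # prepared trx:s that must be "up-cased"
    for name, state in seq:
        if state == "g":
            if candidate is None:
                candidate = name
            pending.append(name)
        elif candidate is not None:
            # inverse `g,G' pair: the earlier candidate is reset
            to_commit.extend(pending)
            candidate = None
            pending = []
    return candidate, to_commit


b1 = [("G1", "G"), ("g2", "g"), ("G3", "G"), ("g4", "g")]  # checkpoint file
b2 = [("g5", "g"), ("g6", "g")]                            # hot binlog
print(refine_truncate(b1 + b2))
```

With this input the candidate moves from `g2' to `g4', and `g2' ends up in the list of transactions to commit, matching the example above.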
I hope this will be helpful.
Unfortunately, not very much. It describes how you juggle with variables and scanning rounds. But not *what* you're trying to find. And it's kind of difficult to reverse engineer your "how" back into "what" :(

Regards,
Sergei
VP of MariaDB Server Engineering
and security@mariadb.org