Hi Serg, thanks for review! Sergei Golubchik <serg@askmonty.org> writes:
On Apr 27, knielsen@knielsen-hq.org wrote:
Add that slaves set a variable @mariadb_slave_capability to inform the master in a robust way about which events it can, and cannot, handle.
I'm not sure I like the idea of magic user variables. Why wouldn't you create a session server variable @@mariadb_slave_capability?
I'll admit I did not think much about it, I just did it the same way that other replication features are announced by slaves to the master, eg. @master_binlog_checksum or @master_heartbeat_period. There is some merit in being consistent with current code, but there is also merit in avoiding magic user variables. If you prefer a session server variable @@mariadb_slave_capability, I will change it?
Also: Why would "a later release of a slave change what an earlier release of a master believes about its capabilities"?
Agree this is unclear. The basic idea is just to argue that it is better that the slave announces its capabilities explicitly rather than the master tries to guess based on version. What I meant here was: if the master used the slave version to guess what the slave is and is not capable of, then that guess will be fixed. So whenever we release a new version of the slave, it will not be possible to influence what the old master guesses (unless we change the slave version number, which is not always possible/desirable). For example, we could not backport a slave feature to an older version number and work with old master. I re-worded the description in the Jira task to this to clarify: "The slave will, when connecting, set a user variable {{@mariadb_slave_capability}}. The master will read the value of this variable and use it to determine the capabilities of the slave. This is more robust and flexible than relying on version numbers, since ideally any version master should be able to replicate against any version slave. When we release the code in a master, it is hard to predict what slave capabilities every possible version of future releases will have - especially with multiple forks/branches of MySQL being developed individually. And announcing the slave capabilities explicitly allows the slave code full flexibility in working correctly against older masters."
+--echo # Test slave with no capability gets dummy event, which is ignored. +--source include/stop_slave.inc +SET @@global.debug_dbug='d,simulate_slave_capability_none';
Don't do that please, it'll stop any debug trace for ./mtr --debug. Instead, write
SET @@global.debug_dbug='+d,simulate_slave_capability_old_53'; ... your tests SET @@global.debug_dbug='-d,simulate_slave_capability_old_53';
or SET @debug_old=@@global.debug_dbug; SET @@global.debug_dbug='+d,simulate_slave_capability_old_53'; ... your tests SET @@global.debug_dbug=@debug_old;
Yes, this is much better, thanks for the tip!
+ if (data_len < min_query_event_len) + { + /* + Have to use dummy user_var event for such a short packet. + + This works, but the event will be considered part of an event group with + the following event, so things like START SLAVE UNTIL may behave slightly + unexpected.
Why?
I'll reword to: "This works, but the event will be considered part of an event group with the following event. So for example @@global.sql_slave_skip_counter=1 will skip not only the dummy event, but also the immediately following event." I am not too happy about this behaviour, but I did not find a better way. On the other hand - this is unlikely to occur often in practise (if ever), because most events will be large enough to be replaced by a dummy query event instead. And besides, this is only for replicating new slave against old master, which is not "officially" supported anyway. And in any case, it is surely better than sending an event to the slave that slave does not understand at all and causes replication to break.
+ /* Announce MariaDB slave capabilities. */ + DBUG_EXECUTE_IF("simulate_slave_capability_none", goto after_set_cabability;); + { + int rc= DBUG_EVALUATE_IF("simulate_slave_capability_old_53", + mysql_real_query(mysql, STRING_WITH_LEN("SET @mariadb_slave_capability=" + STRINGIFY_ARG(MARIA_SLAVE_CAPABILITY_ANNOTATE))), + mysql_real_query(mysql, STRING_WITH_LEN("SET @mariadb_slave_capability=" + STRINGIFY_ARG(MARIA_SLAVE_CAPABILITY_MINE))));
Better put DBUG_EVALUATE_IF only around the STRINGIFY_ARG values:
DBUG_EVALUATE_IF("...", STRINGIFY_ARG(...), STRINGIFY_ARG(...))
so that next time I (or somebody) wouldn't need to vdiff two long function calls, trying to count parentheses and commans :)
That is not possible, I'm afraid. DBUG_EVALUATE_IF() expands to an expression (XXX ? YYY : ZZZ) - but preprocessor needs a constant to concatenate two strings "foo" "bar". And STRING_WITH_LEN(...) expands to two expressions with a comma between them UUU, VVV - so I also cannot put DBUG_EVALUATE_IF() around STRING_WITH_LEN(). I rewrote it to use strlen() instead of STRING_WITH_LEN - this code is not performance critical: const char *q= DBUG_EVALUATE_IF("simulate_slave_capability_old_53", "SET @mariadb_slave_capability=" STRINGIFY_ARG(MARIA_SLAVE_CAPABILITY_ANNOTATE), "SET @mariadb_slave_capability=" STRINGIFY_ARG(MARIA_SLAVE_CAPABILITY_MINE)); if (mysql_real_query(mysql, q, strlen(q))) ...
=== modified file 'sql/sql_repl.cc' --- a/sql/sql_repl.cc 2012-03-28 17:26:00 +0000 +++ b/sql/sql_repl.cc 2012-04-27 08:20:38 +0000 @@ -563,14 +584,44 @@ static int send_heartbeat_event(NET* net static const char * send_event_to_slave(THD *thd, NET *net, String* const packet, ushort flags, Log_event_type event_type, char *log_file_name, - IO_CACHE *log) + IO_CACHE *log, int mariadb_slave_capability, + ulong ev_offset, uint8 current_checksum_alg)
current_checksum_alg is part of THD, why do you bother to pass is separately?
No, this is a different value - this is the checksum algorithm used for the binlog file currently being read, which is independent of what is in THD.
+ else + { + /* + The slave does not understand ANNOTATE_ROWS_EVENT. + + Older MariaDB slaves (and MySQL slaves) will break replication if there + are holes in the binlog stream (they will miscompute the binlog offset + and request the wrong position when reconnecting). + + So replace the event with a dummy event of the same size that will be + a no-operation on the slave. + */ + if (Query_log_event::dummy_event(packet, ev_offset, current_checksum_alg)) + return "Failed to replace row annotate event with dummy: too small event.";
No ER_xxx error message for that?
Again, I have to admit I did not consider it, I just followed what the existing replication code was already doing (which is to set ER_UNKNOWN_ERROR :-/). But now that I do consider it - I do not think it is worth it to add a new error code for this. I believe this is an impossible condition - the limit is 6 bytes, and I do not think it is possible for a statement of less than 6 bytes to cause a row-based binlog event to be logged (the shortest I can think of is 6 bytes - "CALL f"). I can add a DBUG_ASSERT(), if you like? - Kristian.