Thanks for reviewing, Kristian! I've pushed a new commit to 11.6-MDEV-21322 which addresses
most of your findings. Anything left out I've responded to below. There were some additional
synchronization changes that went alongside the use of LOCK_thd_data, so it would be good if
you could go over at least the code changes again.
Kristian Nielsen via commits <commits@lists.mariadb.org> writes:
> From: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
>
> This patch extends the command `SHOW REPLICA HOSTS` with three
> columns:
Hi Brandon, please find my review of MDEV-21322 below.
> This patch extends the command `SHOW REPLICA HOSTS` with three
> columns:
>
> 1) Gtid_State_Sent. This represents that latest GTIDs sent to the
> replica in each domain. It will always be populated, regardless of
> the semi-sync status (i.e. asynchronous connections will still
> update this column with the latest GTID state sent to the replica).
>
> 2) Gtid_State_Ack. For semi-synchronous connections (only), this
> column represents the last GTID in each domain that the replica has
> acknowledged.
Use "Pos" instead of "State", eg. Gtid_Pos_Sent and Gtid_Pos_Ack.
To be consistent with eg. gtid_binlog_pos, which has last GTID per
domain_id; vs. gtid_binlog_state, which has GTID per every (domain_id,
server_id) combination.
> @@ -2272,6 +2292,30 @@ send_event_to_slave(binlog_send_info *info, Log_event_type event_type,
> return "Failed to run hook 'after_send_event'";
> }
>
> + if (info->thd->slave_info)
> + {
> + strncpy(info->thd->slave_info->gtid_state_sent.log_file,
> + info->log_file_name + info->dirlen,
> + strlen(info->log_file_name) - info->dirlen);
> + info->thd->slave_info->gtid_state_sent.log_pos= pos;
> + }
Isn't there missing synchronisation here against concurrent access from
another thread?
I guess we need LOCK_thd_data around the strncpy().
Maybe we can avoid taking LOCK_thd_data except in the uncommon case where
the file name changes. It would be best to avoid a mutex lock/unlock for
_every_ event sent to a slave.
The position doesn't need locking I think; you can use an atomic with
std::memory_order_relaxed, mostly for documentation, if you like.
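Roughly something like this, just for illustration (an untested sketch; it keeps
the names from your patch and assumes gtid_state_sent.log_pos is changed to a
std::atomic<my_off_t>):

  if (Slave_info *si= info->thd->slave_info)
  {
    const char *file= info->log_file_name + info->dirlen;
    /*
      Only this dump thread writes the file name, so comparing without the
      mutex is safe; readers (SHOW SLAVE HOSTS) take LOCK_thd_data. Take the
      lock only in the uncommon case that the binlog file changed.
    */
    if (strcmp(si->gtid_state_sent.log_file, file))
    {
      mysql_mutex_lock(&info->thd->LOCK_thd_data);
      strmake_buf(si->gtid_state_sent.log_file, file);
      mysql_mutex_unlock(&info->thd->LOCK_thd_data);
    }
    /* Word-sized store; the relaxed atomic is mostly for documentation. */
    si->gtid_state_sent.log_pos.store(pos, std::memory_order_relaxed);
  }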
> @@ -125,8 +114,11 @@ int THD::register_slave(uchar *packet, size_t packet_length)
> if (check_access(this, PRIV_COM_REGISTER_SLAVE, any_db.str, NULL,NULL,0,0))
> return 1;
> if (!(si= (Slave_info*)my_malloc(key_memory_SLAVE_INFO, sizeof(Slave_info),
> - MYF(MY_WME))))
> + MYF(MY_WME|MY_ZEROFILL))))
> return 1;
> + memset(si->gtid_state_sent.log_file, '\0', FN_REFLEN);
> + memset(si->gtid_state_ack.log_file, '\0', FN_REFLEN);
I think it's good to add the MY_ZEROFILL here, but then the two memset()
calls are redundant and can be removed.
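I.e. the allocation alone already gives zeroed log_file buffers, so the
registration would just be:

  if (!(si= (Slave_info*)my_malloc(key_memory_SLAVE_INFO, sizeof(Slave_info),
                                   MYF(MY_WME|MY_ZEROFILL))))
    return 1;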
> @@ -1235,6 +1251,14 @@ int Repl_semi_sync_master::update_sync_header(THD* thd, unsigned char *packet,
> *need_sync= sync;
>
> l_end:
> + if (is_on())
> + {
> + thd->slave_info->sync_status=
> + sync ? thd->slave_info->sync_status=
> + Slave_info::SYNC_STATE_SEMI_SYNC_ACTIVE
> + : thd->slave_info->sync_status=
> + Slave_info::SYNC_STATE_SEMI_SYNC_STALE;
> + }
There's something weird here, a double assignment of
thd->slave_info->sync_status; probably you just meant:
thd->slave_info->sync_status= sync ?
Slave_info::SYNC_STATE_SEMI_SYNC_ACTIVE : Slave_info::SYNC_STATE_SEMI_SYNC_STALE;
More importantly, update_sync_header() is called just _before_ writing the
event on the socket to the connected slave. Is that really the right place
to do so? Shouldn't the status be updated only _after_ the event has been
sent?
Or does it not matter, because the actual TCP sending is asynchronous
anyway? In that case, we should add a comment explaining why updating the
status here is ok.
I don't think it matters too much, but generally, updating here seems correct to me.
It provides immediate context for the event we are about to send: users will know
whether there will be an ACK, or whether we are just sending events "blindly" until
the slave catches up. Note that I've added a comment to the code in my new patch to
explain this.
> --- a/mysql-test/suite/rpl/r/rpl_mixed_ddl_dml.result
> +++ b/mysql-test/suite/rpl/r/rpl_mixed_ddl_dml.result
> @@ -11,8 +11,8 @@ n
> 2002
> connection master;
> show slave hosts;
> -Server_id Host Port Master_id
> -2 127.0.0.1 9999 1
> +Server_id Host Port Master_id Gtid_State_Sent Gtid_State_Ack Sync_Status
> +2 127.0.0.1 9999 1 0-1-2 Asynchronous
Just a remark that this output could be non-deterministic unless the test
guarantees that the state will be stable at this point. So please check that
the test will produce stable output; or if you already did so, then great!
It is deterministic: the test runs --sync_slave_with_master beforehand, and since the
connection is asynchronous, the transaction is guaranteed to have been sent (and
thereby should have updated Gtid_Pos_Sent :) )
> +--echo # state, and Gtid_State_Sent/Ack should start, and only Gtid_State_Sent
Typo, s/should start/should start empty/ ?
> +--connection server_2
> +--source include/stop_slave.inc
> +set global rpl_semi_sync_slave_enabled = 1;
> +--source include/start_slave.inc
> +
> +--connection server_1
> +let $status_var= Rpl_semi_sync_master_clients;
> +let $status_var_value= 1;
> +source include/wait_for_status_var.inc;
> +
> +--replace_result $SLAVE_MYPORT SLAVE_PORT $SERVER_MYPORT_3 SLAVE2_PORT $DEFAULT_MASTER_PORT DEFAULT_PORT
> +SHOW REPLICA HOSTS;
Hm, won't this output be non-deterministic?
From a quick look in the code, Rpl_semi_sync_master_clients is incremented
at the start of mysql_binlog_send(), but the state is only set from
update_sync_header() when the first event is sent to the slave. So won't
there be a (small) window where the state has not yet switched to ACTIVE
when this SHOW SLAVE HOSTS runs?
So it would be better to wait for the state to become ACTIVE instead, or for
the expected value in Gtid_Pos_Sent where that is non-empty.
Also, better to use SHOW SLAVE HOSTS here, for consistency with the rest of
the test suite.
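For example, something along these lines (untested sketch; it keys off the
"Asynchronous" value visible in the .result file, so the exact strings would need
to match what the patch actually prints, and the same pattern could be reused in
the later parts of the test):

  --echo # Wait for the semi-sync connection to become active
  --let $count= 300
  --let $sync_status= query_get_value(SHOW SLAVE HOSTS, Sync_Status, 1)
  while (`SELECT STRCMP("$sync_status", "Asynchronous") = 0`)
  {
    --sleep 0.1
    --dec $count
    if (!$count)
    {
      --die Timed out waiting for Sync_Status to leave the asynchronous state
    }
    --let $sync_status= query_get_value(SHOW SLAVE HOSTS, Sync_Status, 1)
  }
  --replace_result $SLAVE_MYPORT SLAVE_PORT $SERVER_MYPORT_3 SLAVE2_PORT $DEFAULT_MASTER_PORT DEFAULT_PORT
  SHOW SLAVE HOSTS;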
> +--let $nextval= `SELECT max(a)+1 from t1`
> +--eval insert into t1 values ($nextval)
Is there a reason here to use separate --let and --eval? And not just:
INSERT INTO t1 SELECT max(a)+1 FROM t1;
I like seeing the actual number value for cross-referencing against what ends up
in the table (e.g. when manually debugging and running a SELECT).
> +--echo # Ensuring master gtid_binlog_pos matches Gtid_State_Ack
> +--let $gtid_ack= query_get_value(show slave hosts, Gtid_State_Ack, 1)
> +if (`SELECT strcmp("$master_gtid","$gtid_ack") != 0`)
> +{
> + --echo # Master gtid_binlog_pos: $master_gtid
> + --echo # Gtid_State_Ack: $gtid_ack
> + --die Master's gtid_binlog_pos should match Gtid_State_Ack, but doesn't
Again, sync_with_master_gtid.inc on the slave will not guarantee that the
master has seen the ack yet, even though the race where it hasn't will be
rare.
So you'd need to wait for the expected value here with wait_condition.inc
or similar instead, I think.
Probably the same in subsequent parts of the test case.
This one implicitly guarantees that the master saw the ACK, as the transaction
has returned from commit (which only happens once the ACK is received, or on
timeout, which would invalidate the test anyway).
> +--echo #
> +--echo # 21322.5: When connecting a new slave (server_id 3) which initially has
> +--echo # semi-sync disabled, SHOW SLAVE HOSTS on the master should show its
> +--echo # Sync_Status as asynchronous (while server_id 2 is still semi-sync
> +--echo # active).
Sorry, it's a very long and complex test case, and I didn't have the time today
to carefully read and check every detail in it after this point. Please
carefully check all the cases for possible races like those described above.
Replication tests are notorious for having non-deterministic failures (as
I'm sure you've experienced first-hand), and it's important to try to avoid
that as much as possible.
Right. I tried pretty hard up-front to make the test as deterministic as possible,
though I did fix one more spot based on your earlier catch of
Rpl_semi_sync_master_clients not being in sync with Sync_Status :)
A general remark on the test: the output of Gtid_Pos_Sent and Gtid_Pos_Ack
is a full GTID position (possibly multiple domain_ids), but there is no test of
having more than one domain in the output. I think you need to extend the test
to cover at least one case where there are multiple domains. You also need to
test when the GTID state in the binlog has multiple server_ids (though maybe
the test case already does this, I didn't check myself).
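For example, something like this on the master would put a second domain into
Gtid_Pos_Sent/Ack (untested sketch, reusing the existing t1 table from the test):

  --connection server_1
  SET @old_domain_id= @@gtid_domain_id;
  SET gtid_domain_id= 2;
  INSERT INTO t1 SELECT max(a)+1 FROM t1;
  SET gtid_domain_id= @old_domain_id;
  --source include/save_master_gtid.inc
  --connection server_2
  --source include/sync_with_master_gtid.inc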
Done, thanks for the suggestion!
- Kristian.