andrei.elkin@pp.inet.fi writes:
> Why don't we defer the current eager/optimistic discard of old sub-id
> records in rpl_slave_state::record_gtid()
>
>     mysql_mutex_lock(&LOCK_slave_state);
>     if ((elist= elem->grab_list()) != NULL)
>     {
>       /* Delete any old stuff, but keep around the most recent one. */
>       ...
>     }
>     mysql_mutex_unlock(&LOCK_slave_state);
>
> until we know the transaction will be committed? E.g.:
>
>     Xid_log_event::do_apply_event ->
>       rpl_slave_state::update_state_hash ->
>         rpl_slave_state::update
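
For context, here is a loose stand-alone model of what record_gtid() and
update() do today. This is plain C++ with simplified data structures invented
for illustration, not the actual server code; std::mutex stands in for
LOCK_slave_state, the map stands in for the rpl_slave_state hash, and the
printf calls stand in for the INSERT/DELETE statements against
mysql.gtid_slave_pos:

  #include <algorithm>
  #include <cstdint>
  #include <cstdio>
  #include <map>
  #include <mutex>
  #include <vector>

  static std::mutex lock_slave_state;                  // LOCK_slave_state
  // Per replication domain: sub_ids of rows currently in gtid_slave_pos.
  static std::map<uint32_t, std::vector<uint64_t>> state;

  // Runs while the replicated transaction is being applied: INSERT the new
  // row, and eagerly DELETE obsolete old rows, inside that same transaction.
  static void record_gtid(uint32_t domain_id, uint64_t new_sub_id)
  {
    std::vector<uint64_t> old_rows;
    {
      std::lock_guard<std::mutex> guard(lock_slave_state);
      std::vector<uint64_t> &elist= state[domain_id];
      if (!elist.empty())
      {
        // "Delete any old stuff, but keep around the most recent one":
        // the newest entry stays in the in-memory state, the rest are
        // grabbed for deletion from the table.
        uint64_t best= *std::max_element(elist.begin(), elist.end());
        for (uint64_t s : elist)
          if (s != best)
            old_rows.push_back(s);
        elist.assign(1, best);
      }
    }
    std::printf("INSERT gtid_slave_pos row, sub_id=%llu\n",
                (unsigned long long)new_sub_id);
    for (uint64_t s : old_rows)
      std::printf("DELETE gtid_slave_pos row, sub_id=%llu\n",
                  (unsigned long long)s);
  }

  // Runs after the transaction has committed
  // (Xid_log_event::do_apply_event -> update_state_hash -> update):
  // the new row becomes part of the in-memory state.
  static void update(uint32_t domain_id, uint64_t new_sub_id)
  {
    std::lock_guard<std::mutex> guard(lock_slave_state);
    state[domain_id].push_back(new_sub_id);
  }

  int main()
  {
    for (uint64_t sub_id= 1; sub_id <= 3; sub_id++)
    {
      record_gtid(0, sub_id);                          // while applying
      update(0, sub_id);                               // after commit
    }
    return 0;   // the row for sub_id=1 was deleted while applying sub_id=3
  }

The essential point is that the deletion of obsolete rows happens eagerly, as
part of applying every single replicated transaction.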
We need to have those records removed from the state immediately; otherwise the next record_gtid() (from another parallel replication worker) will try to grab and delete the same records, leading to extra, redundant delete calls and loss of efficiency. We cannot hold the lock on LOCK_slave_state until we have committed, of course, as that would severely reduce parallelism. And we cannot defer the deletion of the rows until rpl_slave_state::update(), as that would require starting another transaction, which is again inefficient.

However, I do like the idea of deferring the handling of deletions from record_gtid() to rpl_slave_state::update(). I have never been entirely happy about the overhead of one row deletion in every replicated transaction.

Maybe the deletion of old rows (and the update of the GTID state) could be done in the replication background thread instead (handle_slave_background())? In rpl_slave_state::update(), we will increment a counter. When this counter reaches 100 (or some configurable value), we will signal the background thread to grab the list and delete all of it in one go. This way we batch up more row deletions and avoid the delete overhead in the time-critical record_gtid(), which is probably more efficient. We do introduce some extra transactions, but that can be minimised by adjusting the batch size (e.g. with a batch size of 100, just 1% more transactions).

I can try to create a patch to do this. Maybe moving the deletions from record_gtid() into the background thread is too big a change for 10.1/10.2? But we could use the current patch for 10.1, and do the more complicated patch only in 10.3 (or 10.4?). What do you think?

 - Kristian.
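
PS. To make the background-thread idea a bit more concrete, here is a rough
stand-alone sketch of the batching scheme, in plain C++ under the same
simplifications as above. It is only an illustration, not server code:
std::mutex, std::condition_variable and std::thread stand in for
LOCK_slave_state, the background thread's condition variable and
handle_slave_background(); cleanup_batch_size stands in for whatever
configurable batch-size variable we would add; and the size of the pending
list serves as the counter:

  #include <condition_variable>
  #include <cstdint>
  #include <cstdio>
  #include <mutex>
  #include <thread>
  #include <vector>

  struct old_row { uint32_t domain_id; uint64_t sub_id; };

  static std::mutex lock_slave_state;            // LOCK_slave_state
  static std::condition_variable background_cond;
  static std::vector<old_row> pending_delete;    // rows no longer needed in gtid_slave_pos
  static const size_t cleanup_batch_size= 100;   // would be a configurable server variable
  static bool shutdown_requested= false;

  // Stand-in for one DELETE against mysql.gtid_slave_pos covering the whole
  // batch, i.e. one transaction instead of one delete per replicated
  // transaction.
  static void delete_rows_in_one_transaction(const std::vector<old_row> &rows)
  {
    std::printf("background: deleting %zu old rows in one go\n", rows.size());
  }

  // Called from rpl_slave_state::update() after a transaction has committed:
  // instead of deleting the old row right away, just queue it and wake the
  // background thread once enough rows have accumulated.
  static void queue_old_row_for_delete(uint32_t domain_id, uint64_t old_sub_id)
  {
    std::lock_guard<std::mutex> guard(lock_slave_state);
    pending_delete.push_back({domain_id, old_sub_id});
    if (pending_delete.size() >= cleanup_batch_size)
      background_cond.notify_one();              // "signal handle_slave_background()"
  }

  // Stand-in for the loop in handle_slave_background().
  static void background_thread_loop()
  {
    for (;;)
    {
      std::vector<old_row> batch;
      {
        std::unique_lock<std::mutex> guard(lock_slave_state);
        background_cond.wait(guard, [] {
          return shutdown_requested ||
                 pending_delete.size() >= cleanup_batch_size;
        });
        if (shutdown_requested && pending_delete.empty())
          return;
        batch.swap(pending_delete);              // grab the whole list under the lock
      }
      // The actual row deletion happens outside the lock, as one transaction.
      delete_rows_in_one_transaction(batch);
    }
  }

  int main()
  {
    std::thread background(background_thread_loop);
    for (uint64_t sub_id= 1; sub_id <= 250; sub_id++)
      queue_old_row_for_delete(0, sub_id);       // simulate 250 committed transactions
    {
      std::lock_guard<std::mutex> guard(lock_slave_state);
      shutdown_requested= true;
    }
    background_cond.notify_one();
    background.join();
    return 0;
  }

The point is that the delete overhead disappears from the time-critical
record_gtid() path; the actual row deletions are batched into one background
transaction per (roughly) every 100 replicated transactions.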