Review of MDEV-34857: Implement --slave-abort-blocking-timeout If a slave replicating an event has waited for more than @@slave_abort_blocking_timeout for a conflicting metadata lock held by a non-replication thread, the blocking query is killed to allow replication to proceed and not be blocked indefinitely by a user query. <cut> diff --git a/sql/mdl.cc b/sql/mdl.cc index faccd1c9476..9845718e165 100644 --- a/sql/mdl.cc +++ b/sql/mdl.cc <cut> @@ -2397,14 +2398,39 @@ MDL_context::acquire_lock(MDL_request *mdl_request, double lock_wait_timeout) find_deadlock(); - struct timespec abs_timeout, abs_shortwait; + struct timespec abs_timeout, abs_shortwait, abs_abort_blocking_timeout; + bool abort_blocking_enabled= false; + double abort_blocking_timeout= slave_abort_blocking_timeout; Add a comment here. Something like /* This code implements aborting of queries on the slaves that stops replication from continuing. */ + if (abort_blocking_timeout < lock_wait_timeout && + m_owner->get_thd()->rgi_slave) + { + set_timespec_nsec(abs_abort_blocking_timeout, + (ulonglong)(abort_blocking_timeout * 1000000000ULL)); + abort_blocking_enabled= true; + } set_timespec_nsec(abs_timeout, (ulonglong)(lock_wait_timeout * 1000000000ULL)); - set_timespec(abs_shortwait, 1); wait_status= MDL_wait::EMPTY; - while (cmp_timespec(abs_shortwait, abs_timeout) <= 0) + for (;;) { + bool abort_blocking= false; + set_timespec(abs_shortwait, 1); + if (abort_blocking_enabled && + cmp_timespec(abs_shortwait, abs_abort_blocking_timeout) >= 0) + { + /* + If a slave DDL has waited for --slave-abort-select-timeout, then notify + any blocking SELECT once before continuing to wait until the full + timeout. + */ + abs_shortwait= abs_abort_blocking_timeout; + abort_blocking= true; + abort_blocking_enabled= false; + } + if (cmp_timespec(abs_shortwait, abs_timeout) > 0) + break; Is the above right last compare right? Assuming that cmp_timespec(abs_shortwait, abs_abort_blocking_timeout) > 0 then the above will not break and we will not signal other threads to break even if the previous code expectrs us to abort blocking. (I think the cmp_timespec() return test should be the same in both cases) Shouldn't the code be: for (;;) { bool abort_blocking= false; set_timespec(abs_shortwait, 1); if (abort_blocking_enabled && cmp_timespec(abs_shortwait, abs_abort_blocking_timeout) >= 0) { /* If a slave DDL has waited for --slave-abort-select-timeout, then notify any blocking SELECT once before continuing to wait until the full timeout. */ abort_blocking= true; abort_blocking_enabled= false; } else if (cmp_timespec(abs_shortwait, abs_timeout) > 0) break; Or have I missed something? /* abs_timeout is far away. Wait a short while and notify locks. */ wait_status= m_wait.timed_wait(m_owner, &abs_shortwait, FALSE, mdl_request->key.get_wait_state_name()); @@ -2425,9 +2451,8 @@ MDL_context::acquire_lock(MDL_request *mdl_request, double lock_wait_timeout) mysql_prlock_wrlock(&lock->m_rwlock); if (lock->needs_notification(ticket)) - lock->notify_conflicting_locks(this); + lock->notify_conflicting_locks(this, abort_blocking); mysql_prlock_unlock(&lock->m_rwlock); - set_timespec(abs_shortwait, 1); I like the new code in that we are not setting abs_shortwait two times anymore. (Using a while loop in the is case was sub optimal. Ok to push after the above change or if you have a good argument why the current code is correct