[Maria-developers] Reviews of parallel replications

10 Dec 2013

Hi!

Now continuing with reviews of the parallel replication code.
Will take one patch at a time with a target to have everything reviewed
this week.

Review of revno 3678: Parallel replication: error handling.

------------------------------------------------------------
revno: 3678
committer: knielsen@knielsen-hq.org
branch nick: work-10.0-mdev4506
timestamp: Mon 2013-10-14 15:28:16 +0200
message:
  MDEV-4506: Parallel replication: error handling.
  
  Add an error code to the wait_for_commit facility.
  
  Now, when a transaction fails, it can signal the error to
  any subsequent transaction that is waiting for it to commit.
  The waiting transactions then receive the error code back from
  wait_for_prior_commit() and can handle the error appropriately.
  
  Also fix one race that could cause crash if @@slave_parallel_threads
  were changed several times quickly in succession.

<cut>

=== modified file 'sql/handler.cc'

--- sql/handler.cc	2013-10-13 21:24:05 +0000
+++ sql/handler.cc	2013-10-14 13:28:16 +0000
@@ -1458,10 +1458,11 @@ int ha_commit_one_phase(THD *thd, bool a
     transaction.all.ha_list, see why in trans_register_ha()).
   */
   bool is_real_trans=all || thd->transaction.all.ha_list == 0;
+  int res;
   DBUG_ENTER("ha_commit_one_phase");
-  if (is_real_trans)
-    thd->wait_for_prior_commit();
-  int res= commit_one_phase_2(thd, all, trans, is_real_trans);
+  if (is_real_trans && (res= thd->wait_for_prior_commit()))
+    DBUG_RETURN(res);
+  res= commit_one_phase_2(thd, all, trans, is_real_trans);
   DBUG_RETURN(res);

Another way to write the above is:

 if (is_real_trans && !(res= thd->wait_for_prior_commit()))
   res= commit_one_phase_2(thd, all, trans, is_real_trans);
 DBUG_RETURN(res);

<cut>

I was checking how ha_commit_one_phase() was used.

In ha_commit_trans() or trans_xa_commit(), if ha_commit_one_phase()
fails, it will not run rollback.  Is this right?

I would have assumed that on failure, ha_commit_one_phase() should run
ha_rollback_trans() before returning.


=== modified file 'sql/log.cc'
@@ -7844,7 +7844,8 @@ int TC_LOG_MMAP::log_and_order(THD *thd,
     mysql_mutex_unlock(&LOCK_prepare_ordered);
   }
 
-  thd->wait_for_prior_commit();
+  if (thd->wait_for_prior_commit())
+    return 0;
 
What does the return 0 mean here?
Is it an error?  Is thd->is_fatal_error assumed to be set if this happens?
A comment here would not be totally wrong...

@@ -5633,7 +5634,7 @@ wait_for_commit::~wait_for_commit()
 
 
 void
-wait_for_commit::wakeup()
+wait_for_commit::wakeup(int wakeup_error)
 {
   /*
     We signal each waiter on their own condition and mutex (rather than using
@@ -5649,6 +5650,7 @@ wait_for_commit::wakeup()
   */
   mysql_mutex_lock(&LOCK_wait_commit);
   waiting_for_commit= false;
+  this->wakeup_error= wakeup_error;

Consider adding an assert that 'this->wakeup_error' is 0.
Good way to ensure we don't accidently call this function twice...
(as things would be bad if we first would call it with an error and
then with 0)

   mysql_mutex_unlock(&LOCK_wait_commit);
   mysql_cond_signal(&COND_wait_commit);
 }

Regards,
Monty

    

[Maria-developers] Reviews of parallel replications

Michael Widenius