Hello Kristian,

On Sun, Aug 14, 2016 at 1:51 PM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Rich Prohaska <prohaska7@gmail.com> writes:

> I suspect that the poor slave replication performance for optimistic
> replication occurs because TokuDB does not implement the kill_query
> handlerton function.  The kill_query handlerton function gets called to
> resolve lock wait-for situations that occur when parallel replicating a
> small sysbench table.
> InnoDB implements kill_query while TokuDB does not implement it.

Possibly, but I'm not sure it's that important. The kill will be effective
as soon as the wait is over.
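To illustrate why the wait itself is the problem: a kill is only effective once the engine's lock wait returns, unless the engine exposes a hook (as InnoDB's kill_query implementation does) that wakes the victim's wait immediately. The sketch below simulates both behaviors with a condition variable; all names are illustrative, not the real server API.

```python
import threading, time

class LockWait:
    def __init__(self):
        self.cond = threading.Condition()
        self.granted = False
        self.killed = False

    def wait_for_lock(self, timeout):
        # Engine-side wait: wakes on grant, on kill, or at the timeout.
        deadline = time.monotonic() + timeout
        with self.cond:
            while not self.granted and not self.killed:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return "timeout"
                self.cond.wait(remaining)
            return "killed" if self.killed else "granted"

    def kill(self):
        # What a kill_query-style hook would do: mark the victim and wake
        # its wait, so the kill takes effect right away instead of after
        # the full lock-wait timeout.
        with self.cond:
            self.killed = True
            self.cond.notify_all()

w = LockWait()
results = []
t = threading.Thread(target=lambda: results.append(w.wait_for_lock(4.0)))
start = time.monotonic()
t.start()
time.sleep(0.1)   # let the waiter block
w.kill()
t.join()
elapsed = time.monotonic() - start
```

Without the kill() wake-up, the same waiter would sit in the wait for the full 4 seconds before noticing anything.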

I'm thinking that it's just because my patch is incomplete - it only handles
the case where transaction T1 goes to wait on T2 and T2 is already holding
the lock. If the lock is later passed to T3 (while T1 is still waiting),
then my patch doesn't handle killing T3. So T1 will need to wait for its
lock wait timeout to trigger, and then it will be re-tried - and _then_ T3
will be killed.

At least it looks a bit like that is what is happening in your processlist
output. But I'll need to do some tests to be sure. And I think I know how to
fix my patch, hopefully I'll have something in a day or two.
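The gap described above can be sketched with a toy lock queue: T1 reports a wait-for edge against the owner at the time it starts waiting (T2), but if the lock is later granted to T3 while T1 is still queued, the edge must be re-reported against T3 or the scheduler never knows to kill T3. Data structures here are purely illustrative.

```python
class Lock:
    def __init__(self, owner):
        self.owner = owner
        self.waiters = []
        self.edges = []   # (waiter, blocker) wait-for edges reported

    def wait(self, trx):
        self.waiters.append(trx)
        self.edges.append((trx, self.owner))  # reported once, at wait start

    def release_to(self, grant_to, rereport):
        self.waiters.remove(grant_to)
        self.owner = grant_to
        if rereport:
            # The fix: remaining waiters re-report against the new holder.
            for trx in self.waiters:
                self.edges.append((trx, grant_to))

# Without re-reporting: T1's recorded blocker stays T2, so the scheduler
# never learns it should kill T3, and T1 sits until its lock-wait timeout.
lock = Lock("T2")
lock.wait("T1")
lock.wait("T3")
lock.release_to("T3", rereport=False)
missing = ("T1", "T3") not in lock.edges   # stale wait-for graph

# With re-reporting, the scheduler sees T1 -> T3 and can kill T3 at once.
lock = Lock("T2")
lock.wait("T1")
lock.wait("T3")
lock.release_to("T3", rereport=True)
fixed = ("T1", "T3") in lock.edges
```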

TokuDB lock timeouts are what is resolving the replication stall.  Unfortunately, the TokuDB lock timeout is 4 seconds, so the throughput is almost zero.
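A back-of-envelope calculation shows why throughput collapses: if every conflict is resolved only by the 4-second lock-wait timeout, a stalled worker completes roughly one transaction per timeout period. Numbers below are illustrative, not measured.

```python
lock_wait_timeout = 4.0    # seconds (the TokuDB timeout mentioned above)
normal_txn_time = 0.001    # assume ~1 ms per transaction when uncontended

stalled_tps = 1.0 / (lock_wait_timeout + normal_txn_time)
normal_tps = 1.0 / normal_txn_time
# ~0.25 transactions/s per stalled worker vs ~1000/s uncontended
```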


>> when slave in conservative mode with 2 threads, the tokudb wait for
>> callback is being called (i put in a "printf"), which implies a parallel
>> lock conflict.  I assumed that conservative mode implies parallel execution
>> of transactions that were group committed together, which I assumed would
>> imply that these transactions were conflict free.  Obviously not the case.

This is interesting. Is there somewhere I can read details of how TokuDB
does lock waits? That would help me understand what is going on.

TokuFT implements pessimistic locking with a two-phase locking (2PL) algorithm.  This wiki page describes locking and concurrency in a little more detail:  https://github.com/percona/tokudb-engine/wiki/Transactions-and-Concurrency.
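As a minimal sketch of what pessimistic 2PL means here: a transaction takes each lock before touching a row (growing phase) and releases nothing until commit (shrinking phase), so a conflicting transaction blocks instead of proceeding optimistically. This is illustrative only; real TokuFT takes range locks in a lock tree.

```python
class LockTable:
    def __init__(self):
        self.locks = {}            # key -> owning transaction id

    def lock(self, trx, key):
        holder = self.locks.get(key)
        if holder is not None and holder != trx:
            return False           # conflict: caller must wait (pessimistic)
        self.locks[key] = trx
        return True

    def commit(self, trx):
        # 2PL: every lock is released at once, and only at commit.
        for key in [k for k, t in self.locks.items() if t == trx]:
            del self.locks[key]

lt = LockTable()
assert lt.lock("T1", "row:1")
assert not lt.lock("T2", "row:1")   # T2 blocks: lock held until commit
lt.commit("T1")
assert lt.lock("T2", "row:1")       # granted only after T1 committed
```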

We actually have the same situation in InnoDB in some cases. For example:

  CREATE TABLE t4 (a INT PRIMARY KEY, b INT, KEY b_idx(b)) ENGINE=InnoDB;
  INSERT INTO t4 VALUES (1,NULL), (2,2), (3,NULL), (4,4), (5, NULL), (6, 6);
  UPDATE t4 SET b=NULL WHERE a=6;
  DELETE FROM t4 WHERE b <= 3;

The UPDATE and DELETE may or may not conflict, depending on the order in
which they run. So it is possible for them to group commit together on the
master, but still conflict on the slave. Maybe something similar is possible
in TokuDB?

Another option is that some of the callbacks are false positives. Lock waits
should only be reported if they are for locks that will be held until
COMMIT. For example in InnoDB, there are shorter-lived locks on the
auto-increment counter, and such locks should _not_ be reported.
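The filtering rule above can be sketched as follows: report a wait to the replication scheduler only when the blocking lock will be held until COMMIT. The report_wait_for callback here stands in for the real thd_report_wait_for API; the duration constants are illustrative.

```python
COMMIT_DURATION, STATEMENT_DURATION = "commit", "statement"

reported = []
def report_wait_for(waiter, blocker):
    reported.append((waiter, blocker))

def on_lock_wait(waiter, blocker, lock_duration):
    if lock_duration == COMMIT_DURATION:
        # Real conflict: the blocker won't release until it commits, so
        # the scheduler may need to kill and retry it.
        report_wait_for(waiter, blocker)
    # else: a short-lived internal lock (e.g. an auto-increment counter)
    # resolves on its own; reporting it would be a false positive that
    # needlessly kills the blocker.

on_lock_wait("T2", "T1", COMMIT_DURATION)
on_lock_wait("T2", "T1", STATEMENT_DURATION)
# reported contains only the commit-duration wait
```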

Yes, I think they are false positives, since the thd_report_wait_for API is called but it does NOT result in a call to the THD::awake function.
 

 - Kristian.