Re: [Maria-developers] [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

15 Aug 2016


Hello Kristian,
The simplest kill_query implementation for tokudb would just signal the
condition variables of all pending lock requests.  Each woken waiter would
then call its killed callback.  A performance refinement, if necessary,
would allow thread A (executing the kill_query function) to identify and
signal only the condition variable of the blocked thread B.
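
A rough sketch of the "signal everything" variant, to make the idea concrete.
It is not TokuFT code: LockRequest, LockManager and their members are
hypothetical stand-ins, and the killed callback is modelled as a plain
std::function that the waiter re-checks every time it wakes up.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>

// Hypothetical stand-in for one pending lock request.
struct LockRequest {
    std::condition_variable cv;    // the blocked thread sleeps on this
    bool granted = false;          // set once the lock has been handed over
    std::function<bool()> killed;  // hypothetical killed callback
};

// Hypothetical lock manager; one mutex protects all pending requests.
class LockManager {
public:
    enum class Result { Granted, Killed, TimedOut };

    // Block until the request is granted, killed, or the timeout expires.
    Result wait(LockRequest &req, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(mu_);
        pending_.push_back(&req);
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        Result r = Result::TimedOut;
        bool timed_out = false;
        while (true) {
            if (req.granted) { r = Result::Granted; break; }
            // Every wakeup re-runs the killed callback.
            if (req.killed && req.killed()) { r = Result::Killed; break; }
            if (timed_out) break;
            timed_out = req.cv.wait_until(lk, deadline) == std::cv_status::timeout;
        }
        pending_.remove(&req);
        return r;
    }

    // Hand the lock to a request and wake its waiter.
    void grant(LockRequest &req) {
        std::lock_guard<std::mutex> lk(mu_);
        req.granted = true;
        req.cv.notify_one();
    }

    // Simplest kill_query: wake every pending request so each waiter
    // re-checks its killed callback. Returns how many were signalled.
    std::size_t kill_query() {
        std::lock_guard<std::mutex> lk(mu_);
        for (LockRequest *r : pending_) r->cv.notify_all();
        return pending_.size();
    }

private:
    std::mutex mu_;
    std::list<LockRequest *> pending_;
};
```

The refinement would replace the loop in kill_query() with a lookup of the
one request that belongs to the killed thread.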

On Mon, Aug 15, 2016 at 5:42 AM, Kristian Nielsen <knielsen@knielsen-hq.org>
wrote:
...
Rich Prohaska <prohaska7@gmail.com> writes:
...
tokudb lock timeouts are resolving the replication stall.  unfortunately,
the tokudb lock timeout is 4 seconds, so the throughput is almost zero.
Yes. Sorry for not making it clear that my proof-of-concept patch was
incomplete...
...
...
...
I suspect that the poor slave replication performance for optimistic
replication occurs because TokuDB does not implement the kill_query
handlerton function.  kill_handlerton gets called to resolve lock wait
...
...
Possibly, but I'm not sure it's that important. The kill will be effective
as soon as the wait is over.
No, you're absolutely right, after testing (and thinking) some more, I
realise that indeed the kill_query functionality is important.
A possible scenario is, given transactions T1, T2, and T3 in that order:
T3 acquires a lock on row R3, T2 similarly acquires R2.
Now T3 tries to acquire R2, but has to wait for T2 to release it.
Later T1 tries to acquire R3, also has to wait.
At this point, we kill T3, since it is holding a lock (R3) needed by an
earlier transaction T1. However, T3 will not notice the kill until its own
wait (on R2 held by T2) times out. T2 cannot release the lock because it is
waiting for T1 to commit first. So we have a deadlock :-/
With InnoDB, the kill causes T3 to wake up immediately and roll back, so
that T1 can proceed without much delay.
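
The scenario above can be written down as a small wait-for graph. This is
only an illustration: the graph type and the helper names (is_deadlocked,
roll_back, scenario) are made up for this sketch and do not exist in MariaDB
or TokuDB. The T2 -> T1 edge is the commit-order wait, which the storage
engine's own lock manager does not see.

```cpp
#include <map>
#include <set>
#include <string>

// waiter -> set of transactions it is waiting for
using WaitForGraph = std::map<std::string, std::set<std::string>>;

// Depth-first search: can `target` be reached from `from` along wait edges?
static bool reaches(const WaitForGraph &g, const std::string &from,
                    const std::string &target, std::set<std::string> &seen) {
    auto it = g.find(from);
    if (it == g.end()) return false;
    for (const std::string &next : it->second) {
        if (next == target) return true;
        if (seen.insert(next).second && reaches(g, next, target, seen)) return true;
    }
    return false;
}

// A transaction is deadlocked if it (transitively) waits for itself.
bool is_deadlocked(const WaitForGraph &g, const std::string &txn) {
    std::set<std::string> seen;
    return reaches(g, txn, txn, seen);
}

// Rolling a transaction back removes its own waits and releases its locks,
// so nobody waits for it any more.
void roll_back(WaitForGraph &g, const std::string &txn) {
    g.erase(txn);
    for (auto &entry : g) entry.second.erase(txn);
}

// The three waits from the scenario.
WaitForGraph scenario() {
    return {
        {"T3", {"T2"}},  // T3 waits for row R2, held by T2
        {"T1", {"T3"}},  // T1 waits for row R3, held by T3
        {"T2", {"T1"}},  // T2 waits for T1 to commit first
    };
}
```

With all three edges in place every transaction is part of the cycle;
roll_back(g, "T3") removes two of the edges and T1 can proceed, which is
what an immediate wakeup of T3 achieves.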
Ok, so something more is needed here. I see there is a killed_callback()
which seems to check for the kill, so I'm hoping that can be used with a
suitable wakeup of the offending lock_request (or all requests,
perhaps). But as I'm completely new to TokuDB, I still need some more time
to read the code and try to understand how everything fits together...
...
TokuFT implements pessimistic locking and 2 phase locking algorithms. This
wiki describes locking and concurrency in a little more detail:
https://github.com/percona/tokudb-engine/wiki/Transactions-and-Concurrency
Thanks, this was quite helpful.
...
Yes, I think they are false positives since the thd_report_wait_for API is
called but it does NOT call the THD::awake function.
Ah. Then it's probably normal, caused by the group-commit optimisation. In
conservative mode, if two transactions T1 and T2 did not group commit on the
master, then they cannot be started in parallel on the slave. But T2 can
start as soon as T1 has reached COMMIT. Thus, if T2 happens to conflict with
T1, there is a small window where T2 may need to wait on T1 until T1 has
completed its commit.
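
As a minimal sketch of that rule (not the actual scheduler code; ReplTxn,
its fields and may_start are hypothetical names used only here):

```cpp
#include <cstdint>

// Hypothetical per-transaction state on the slave.
struct ReplTxn {
    std::uint64_t master_group_id;  // assumed id of the group commit on the master
    bool reached_commit;            // true once the transaction has reached COMMIT
};

// Conservative mode as described above: transactions that group committed
// together on the master may run in parallel; otherwise the later one may
// start only once the earlier one has reached COMMIT.
bool may_start(const ReplTxn &prev, const ReplTxn &next) {
    if (prev.master_group_id == next.master_group_id) return true;
    return prev.reached_commit;
}
```

Note that may_start() becomes true when T1 has reached COMMIT, not when it
has finished committing, which is where the small wait window comes from.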
Thanks,
- Kristian.