Hello Kristian,

I am working on a replacement algorithm for the function that retries all lock requests; it is still in development.  The race is that locks can be released while another thread is already running the lock retry code.  Nothing retries the pending requests after those releases, so a blocked lock request is never handed the lock and eventually times out.
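
To make the race concrete, here is a rough sketch of one way such a window can be closed.  It is only an illustration and not the algorithm I am developing: the class name RetryCoordinator, its members, and the retry_all callback are all made up for this example and do not exist in PerconaFT.  The idea is that every release bumps a counter, and the thread running the retry pass checks the counter again after it finishes, so a release that arrives during a pass triggers another pass instead of being dropped.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical coordinator for illustration only; not PerconaFT code.
class RetryCoordinator {
public:
    // retry_all stands in for "retry every pending lock request".
    explicit RetryCoordinator(std::function<void()> retry_all)
        : retry_all_(std::move(retry_all)) {}

    // Call after one or more locks have been released.
    void on_release() {
        wanted_.fetch_add(1);  // record that a retry pass is needed
        run_retries();
    }

private:
    void run_retries() {
        for (;;) {
            // Only one thread runs the retry pass at a time. A thread that
            // loses here has already bumped wanted_, so the running thread
            // will see the change in its check below.
            if (running_.exchange(true)) return;

            const std::uint64_t target = wanted_.load();
            retry_all_();
            running_.store(false);

            // No release arrived during the pass: nothing was missed.
            if (wanted_.load() == target) return;
            // Otherwise a release raced with the pass; go around again.
        }
    }

    std::function<void()> retry_all_;
    std::atomic<std::uint64_t> wanted_{0};
    std::atomic<bool> running_{false};
};
```

The sketch uses the default sequentially consistent atomics, which is what makes the check after running_.store(false) sufficient: a releasing thread either sees running_ as false and runs the pass itself, or its increment of wanted_ is visible to the running thread's final check.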

Back in the day, we were thinking of attaching the pending lock requests to the conflicting locks so that when the conflicting locks are released, it is easy to find the pending lock requests.  This change is beyond the scope of my current work.
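
For reference, the idea looks roughly like the sketch below.  It is a toy under several assumptions: locks are on single keys rather than key ranges, the caller is assumed to hold a mutex around every call, transaction id 0 is reserved to mean "nobody", and the names LockTable, HeldLock, acquire and release are hypothetical and not taken from the lock tree code.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>

// Hypothetical types for illustration only; not the PerconaFT lock tree.
using TxnId = std::uint64_t;  // 0 is reserved to mean "no transaction"
using Key = int;

struct HeldLock {
    TxnId owner;
    std::deque<TxnId> waiters;  // pending requests blocked by this lock, FIFO
};

class LockTable {
public:
    // Returns true if txn now owns the lock on key. Otherwise the request
    // is attached to the conflicting lock and false is returned.
    bool acquire(Key key, TxnId txn) {
        auto it = locks_.find(key);
        if (it == locks_.end()) {
            locks_.emplace(key, HeldLock{txn, {}});
            return true;
        }
        if (it->second.owner == txn) return true;
        it->second.waiters.push_back(txn);
        return false;
    }

    // Releases the lock on key and hands it straight to the oldest waiter.
    // Returns the transaction that was granted the lock, or 0 if none waited.
    TxnId release(Key key, TxnId txn) {
        auto it = locks_.find(key);
        assert(it != locks_.end() && it->second.owner == txn);
        HeldLock &lock = it->second;
        if (lock.waiters.empty()) {
            locks_.erase(it);
            return 0;
        }
        lock.owner = lock.waiters.front();
        lock.waiters.pop_front();
        return lock.owner;
    }

private:
    std::map<Key, HeldLock> locks_;
};
```

Because the waiters hang off the lock that blocks them, a release finds them directly and no global retry pass is needed, which is why this design would avoid the race described above.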

I have the PerconaFT changes to support MariaDB's optimistic parallel replication on a branch that is suitable for merging into Percona's repo:  https://github.com/prohaska7/tokuft/tree/killwait

I still need to run some replication benchmarks that compare conservative and optimistic parallel replication performance.  This might lead to some interesting results.


On Tue, Aug 30, 2016 at 10:24 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Rich Prohaska <prohaska7@gmail.com> writes:

> I rearranged the tokudb lock request wait function a little bit, and got
> the lock tree unit tests to compile again (since the API changed).  commit
> on this tree:  https://github.com/prohaska7/mariadb-server/tree/toku_opr3

Thanks!

So I took a look at the complete patch we have so far, thinking what remains
to do to get this included in main MariaDB.

Do you have any suggestions for how/whether to upstream the TokuDB parts? I
assume the new kill_query functionality would make sense to upstream. The
callback to report waits could be upstreamed, but would not be used in
non-mariadb builds, probably. And the actual
tokudb_lock_wait_needed_callback() would probably not be upstreamed (or
would go in #ifdef MARIADB_BASE_VERSION perhaps). Any thoughts?

I still need to look into the should_retry_lock_requests, currently I get
hangs if it is enabled. And I want to see if the wait_needed_callback
interface can be cleaned up a bit. Otherwise the patch looks fairly complete
to me, do you agree?

I think this could go in 10.1, to fix optimistic parallel replication with
TokuDB (which does not currently work at all).

 - Kristian.