Hello Kristian,

I am working on a replacement algorithm for the retry all lock_requests function. It is still in development. The race is that some locks are released while another thread is running the lock retry code. Those locks are not retried, so a blocked lock request is never handed the lock and eventually times out.

Back in the day, we were thinking of attaching the pending lock requests to the conflicting locks, so that when the conflicting locks are released it is easy to find the pending lock requests. That change is beyond the scope of my current work.

I have the PerconaFT changes to support MariaDB's optimistic parallel replication on a branch that is suitable for merging into Percona's repo:
https://github.com/prohaska7/tokuft/tree/killwait

I still need to run some replication benchmarks that compare conservative vs. optimistic replication performance. This might lead to some interesting results.

On Tue, Aug 30, 2016 at 10:24 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Rich Prohaska <prohaska7@gmail.com> writes:
I rearranged the tokudb lock request wait function a little bit, and got the lock tree unit tests to compile again (since the API changed). commit on this tree: https://github.com/prohaska7/mariadb-server/tree/toku_opr3
Thanks!
So I took a look at the complete patch we have so far, thinking about what remains to be done to get this included in main MariaDB.
Do you have any suggestions for how/whether to upstream the TokuDB parts? I assume the new kill_query functionality would make sense to upstream. The callback to report waits could be upstreamed, but would probably not be used in non-MariaDB builds. And the actual tokudb_lock_wait_needed_callback() would probably not be upstreamed (or would perhaps go inside #ifdef MARIADB_BASE_VERSION). Any thoughts?
I still need to look into should_retry_lock_requests; currently I get hangs if it is enabled. And I want to see if the wait_needed_callback interface can be cleaned up a bit. Otherwise the patch looks fairly complete to me. Do you agree?
I think this could go in 10.1, to fix optimistic parallel replication with TokuDB (which does not currently work at all).
- Kristian.
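
For reference, the idea mentioned at the top of this message (attaching pending lock requests to the conflicting lock, so that a release can find them directly) could be sketched roughly as below. This is only a toy illustration, not the PerconaFT locktree: ToyLockTable, TxnId, acquire, release and owner are all hypothetical names, it uses point locks on string keys rather than ranges, needs C++17, and is single-threaded. The assumption is that a real version would perform the release and the handoff under the same mutex, so there is no window in which a released lock is missed by a separate retry pass.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <optional>
#include <string>

// Hypothetical names throughout; this is not the PerconaFT locktree API.
using TxnId = int;

class ToyLockTable {
    // One entry per locked key: the current owner plus the requests
    // that are blocked on this particular lock, oldest first.
    struct Entry {
        std::optional<TxnId> owner;
        std::deque<TxnId> waiters;
    };
    std::map<std::string, Entry> table_;

public:
    // Try to lock `key` for `txn`. Returns true if the lock is granted.
    // On conflict the request is queued on the conflicting lock itself.
    bool acquire(const std::string& key, TxnId txn) {
        Entry& e = table_[key];
        if (!e.owner || *e.owner == txn) {
            e.owner = txn;
            return true;
        }
        e.waiters.push_back(txn);
        return false;
    }

    // Release `key`, which must be held by `txn`. The lock is handed to
    // the oldest waiter as part of the release; returns the new owner,
    // or nothing if no request was waiting.
    std::optional<TxnId> release(const std::string& key, TxnId txn) {
        auto it = table_.find(key);
        assert(it != table_.end() && it->second.owner == txn);
        Entry& e = it->second;
        if (e.waiters.empty()) {
            table_.erase(it);
            return std::nullopt;
        }
        e.owner = e.waiters.front();
        e.waiters.pop_front();
        return e.owner;
    }

    // Current owner of `key`, if it is locked.
    std::optional<TxnId> owner(const std::string& key) const {
        auto it = table_.find(key);
        if (it == table_.end()) return std::nullopt;
        return it->second.owner;
    }
};
```

With this layout, if transaction 1 holds a key and transactions 2 and 3 block on it, releasing the key from 1 hands it straight to 2, and releasing it from 2 hands it to 3. No scan over all pending requests is involved, which is the property the retry race lacks.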