Re: [Maria-developers] [Commits] Rev 4281: MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail. in http://bazaar.launchpad.net/~maria-captains/maria/10.0
Jan Lindström <jan.lindstrom@skysql.com> writes:
I have mostly InnoDB style comments, but some questions below:
Thanks for comments! Sorry about style violations, and thanks for pointing then out. I'll try to fix all of it. Answers to other questions below:
Why you need to define these, why #include "mysql/plugin.h" file is not enough ?
+extern "C" void thd_report_wait_for(const MYSQL_THD thd, MYSQL_THD other_thd); +extern "C" int thd_need_wait_for(const MYSQL_THD thd); +extern "C" +int thd_need_ordering_with(const MYSQL_THD thd, const MYSQL_THD other_thd);
See the discussions here: https://lists.launchpad.net/maria-developers/msg07494.html The conclusion with Serg was that doing a proper public API in 10.0 was too big a change. We will do it in 10.1 (MDEV-6429). plugin.h should not be modified in a GA version and with an API that has not been thought through wrt. all engines. Also, the API should be a service, not in plugin.h.
Why you need to introduce a new global structure for this, can't this be part of some already existing system structure e.g. lock_sys and why you do not use ut_list ?
+/* Buffer to collect THDs to report waits for. */ +struct thd_wait_reports { + struct thd_wait_reports *next; + ulint used; + trx_t *waitees[64]; +};
So the idea here is to collect a bunch of trx pointers during the graph traversal, and pass each of them to thd_report_wait_for(). I wanted to pass them after completing the graph traversal to avoid issues if thd_report_wait_for() needs to kill one of them; this goes to lock_cancel_waiting_and_release() eventually which I assume might modify the graph being traversed. This is time-critical code (it runs under global lock_sys->mutex), so it is important to avoid memory allocations. So I use an array allocated on the stack, this is really short-lived data, it only lives for the duration of lock_deadlock_check_and_resolve(). In all normal cases the single array should be sufficient, but the code needs to handle the overflow also, of course. So in case of overflow, I malloc further arrays and link them into the first. I suppose another solution would be to abort in case of overflow and roll back the transaction like in case of ctx->too_deep==true, however I don't like to introduce such arbitrary limits when it's not necessary. I looked into using ib_list, but this seems to always do mallocs, which I want to avoid. I also looked at ut0vec and dyn0dyn, but they also seem to always use malloc and thus do not seem appropriate. So I did not find anything existing I could use, unfortunately, but then I know little of the InnoDB code. Do you have a suggestion for an existing data structure to use that can take an inital buffer on the stack and malloc extra memory only if needed? Thanks for comments! - Kristian.
participants (1)
-
Kristian Nielsen