[Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication?
Hi,

After upgrading from TokuDB Enterprise with MariaDB 5.5 to MariaDB 10.1.14, I tried to enable parallel replication (slave_parallel_mode=optimistic, slave_parallel_threads=4) on a GTID-enabled RFR slave, but it leads to random failures (both master & slave are running 10.1.14). E.g.:

Could not execute Update_rows_v1 event on table sc_2.sc_param_index; Can't find record in 'sc_param_index', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.008418, end_log_pos 217608066

If I issue a START SLAVE (with sql_slave_skip_counter=0):

Could not execute Update_rows_v1 event on table sc_2.sc_log_cron; Can't find record in 'sc_log_cron', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.008418, end_log_pos 218275984

If I put slave_parallel_threads back to 0, the slave keeps replicating happily without any issue.

Is this a known limitation with TokuDB?

TokuDB configuration:
tokudb_rpl_lookup_rows=0
tokudb_rpl_unique_checks=0
tokudb_pk_insert_mode=2
tokudb_directio=1

Thanks and regards,
Jocelyn Fournier
jocelyn fournier <jocelyn.fournier@gmail.com> writes:
After upgrading from TokuDB Enterprise with MariaDB 5.5 to MariaDB 10.1.14, I tried to enable parallel replication (slave_parallel_mode=optimistic, slave_parallel_threads=4) on a GTID enabled
Is this a known limitation with TokuDB?
Yes. TokuDB does not (to my knowledge) implement the thd_report_wait_for() API, which is what makes optimistic parallel replication work. - Kristian.
Hi Kristian,

Thanks for the quick answer! I wonder if it would be possible to automatically disable optimistic parallel replication for an engine that does not implement it? If both the InnoDB and TokuDB engines are used, for example, the current behaviour prevents using parallel replication even for the InnoDB tables.

Thanks,
Jocelyn
jocelyn fournier <jocelyn.fournier@gmail.com> writes:
Thanks for the quick answer! I wonder if it would be possible to automatically disable optimistic parallel replication for an engine that does not implement it?
That would probably be good - though it would be better to just implement the necessary API; it's a very small change (basically TokuDB just needs to inform the upper layer of any lock waits that take place inside).

However, looking more at your description, you got a "key not found" error. Not implementing thd_report_wait_for() could lead to deadlocks, but it shouldn't cause key not found. In fact, in optimistic mode, all errors are treated as "deadlock" errors: the query is rolled back and run again, this time not in parallel.

So I'm wondering if there is something else going on. If transactions T1 and T2 run in parallel, it's possible that they have a row conflict. But if T2 deleted a row expected by T1, I would expect T1 to wait on a row lock held by T2, not get a key-not-found error. And if T1 has not yet inserted a row expected by T2, then T2 would be rolled back and retried after T1 has committed. The first case can cause a deadlock, but neither case seems to cause a key-not-found error.

Maybe TokuDB is doing something special with locks around replication, or something else goes wrong. I guess TokuDB just hasn't been tested much with parallel replication.

Does it work ok when running in conservative parallel mode?
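In rough sketch form, the slave-side handling is something like this (a simplified illustration only, with made-up names; the real logic lives in sql/rpl_parallel.cc):

  // Simplified illustration of optimistic-mode error handling on the
  // slave (not the actual server code; helper names are made up).
  int apply_transaction_optimistically(rpl_group_info *rgi)
  {
    int err= apply_events(rgi);          // run in parallel with later trx
    if (err)
    {
      // Any error - deadlock, kill, key not found, ... - is treated as a
      // temporary conflict: roll back, wait until all prior transactions
      // have committed, then retry; the retry is effectively serial.
      rollback_transaction(rgi);
      wait_for_prior_commit(rgi);
      err= apply_events(rgi);
    }
    return err;
  }

- Kristian.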
On 15/07/2016 at 17:09, Kristian Nielsen wrote:
Does it work ok when running in conservative parallel mode?

So far conservative parallel mode seems to work properly as well. My first thought was that this issue was caused by Read Free Replication not locking rows in the expected way, although it's advertised as 100% compatible with parallel replication. I'll try optimistic mode without Read Free Replication to check whether it's related.
Jocelyn
Well, same issue without RFR:

Could not execute Delete_rows_v1 event on table sc_2.sc_product_genre; Can't find record in 'sc_product_genre', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.008420, end_log_pos 77519956

(and it didn't succeed in recovering this one, I had to skip it).

Jocelyn
Do you have a test case that can be used to repeat the bug? - Kristian.
I'll try to extract the binlog operations applied to a single table, and see if I can easily reproduce the issue. Did you try running the replication tests with the TokuDB engine instead of InnoDB? (just in case it could be easily reproduced with the existing tests)

Jocelyn
Actually it has definitely corrupted the state of the slave; I have to rebuild the replication from a backup.

Jocelyn
Hi Kristian,

Just FYI, I confirm the "Lock wait timeout exceeded; try restarting transaction" behaviour you described. I've duplicated and modified rpl_parallel_optimistic.test and run it as storage/tokudb/mysql-test/tokudb_rpl/t/rpl_parallel_optimistic.test:

./mtr --suite=tokudb_rpl
Logging: ./mtr --suite=tokudb_rpl
vardir: /home/joce/mariadb-10.1.16/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/joce/mariadb-10.1.16/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.16-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Using suites: tokudb_rpl
Collecting tests...
Installing system database...

==============================================================================
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
worker[1] mysql-test-run: WARNING: running this script as _root_ will cause some tests to be skipped
tokudb_rpl.rpl_parallel_optimistic 'innodb_plugin,mix' [ fail ]
Test ended at 2016-08-08 01:26:34

CURRENT_TEST: tokudb_rpl.rpl_parallel_optimistic
mysqltest: In included file "./include/sync_with_master_gtid.inc":
included from /home/joce/mariadb-10.1.16/storage/tokudb/mysql-test/tokudb_rpl/t/rpl_parallel_optimistic.test at line 59:
At line 50: Failed to sync with master

The result from queries just before the failure was:
< snip >
DELETE FROM t1 WHERE a=2;
INSERT INTO t1 VALUES (2,5);
DELETE FROM t1 WHERE a=3;
INSERT INTO t1 VALUES(3,2);
DELETE FROM t1 WHERE a=1;
INSERT INTO t1 VALUES(1,2);
DELETE FROM t1 WHERE a=3;
INSERT INTO t1 VALUES(3,3);
DELETE FROM t1 WHERE a=2;
INSERT INTO t1 VALUES (2,6);
include/save_master_gtid.inc
SELECT * FROM t1 ORDER BY a;
a  b
1  2
2  6
3  3
include/start_slave.inc
include/sync_with_master_gtid.inc
Timeout in master_gtid_wait('0-1-20', 120), current slave GTID position is: 0-1-3.
Slave state: Waiting for master to send event 127.0.0.1 root 16000 1 master-bin.000001 3468 slave-relay-bin.000002 796 master-bin.000001 Yes No 1205 Lock wait timeout exceeded; try restarting transaction 0 772 3790 None 0 No No 0 1205 Lock wait timeout exceeded; try restarting transaction 1 Slave_Pos 0-1-20 optimistic

I've no explanation so far for the key-not-found error I've seen.

Jocelyn
Here is the commit with the test: https://github.com/jocel1/server/commit/e1e1716ec2af981d29239e9e075734080a2a... (I've not updated the result file)

And a small modification to output SHOW SLAVE STATUS in case of sync failure: https://github.com/jocel1/server/commit/e1261396af0282738e8034885949bcc6a6f5...

Jocelyn
Hello All,

I have been running sysbench OLTP with a MariaDB 10.1 master-slave topology. I have not seen any replication errors when slave parallel mode is conservative. However, when I configure slave parallel mode to optimistic and slave parallel threads = 2, I get a lock timeout replication error with TokuDB.

Just before the lock timeout error fires (which requires a TokuDB lock timeout to occur), I see one of the replication threads waiting for a lock held by the other replication thread. gdb shows the first thread waiting on a lock inside TokuDB; the other thread is stalled when committing the transaction, in wait_for_prior_commit_2 <- wait_for_prior_commit <- THD::wait_for_prior_commit <- TC_LOG_MMAP::log_and_order <- ha_commit_trans.

Is TokuDB supposed to call the thd_report_wait_for API just prior to a thread waiting on a TokuDB lock?
Rich Prohaska <prohaska7@gmail.com> writes:
Is TokuDB supposed to call the thd_report_wait_for API just prior to a thread waiting on a TokuDB lock?
Yes, that's basically it.

Optimistic parallel replication runs transactions in parallel, but enforces that they commit in the original order. So suppose we have transactions T1 followed by T2 in the replication stream, and that they try to update the same row. When T2 gets ready to commit, it needs to wait for T1 to commit first (this is what you see in wait_for_prior_commit()). However, if T1 is waiting on a row lock held by T2, we have a deadlock.

thd_report_wait_for() checks for this condition. If a transaction goes to wait on a lock held by a later (in terms of in-order replication) transaction, the later transaction is killed (using the normal thread kill mechanism). Parallel replication then gracefully handles the kill (by rollback and retry). You can see in storage/xtradb/lock/lock0lock.cc how this is done for InnoDB/XtraDB, e.g. lock_report_waiters_to_mysql(). Hopefully it would be easy to hook this into TokuDB. It does require being able to locate the transaction (and in particular the THD) that owns a given lock.

Another potential issue (at least it was for InnoDB/XtraDB) is that thd_report_wait_for() can call back into the handlerton->kill_query method, so the caller of thd_report_wait_for() needs to be prepared for this to happen.

Note that we can modify/extend the thd_report_wait_for() API to work better for TokuDB, if necessary. The current API was deliberately left "internal" (not a service with a public header file etc.) in anticipation that it might need changing to better support other storage engines, such as TokuDB.

Also note that the call to thd_report_wait_for() does not need to happen "just prior" to the lock wait - it can happen later, as long as it happens at some point (though of course the earlier the better, in terms of more quickly resolving the deadlock and allowing replication to proceed).
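In sketch form it is roughly this (illustrative only - the real code in lock0lock.cc handles more locking details, and report_waiters() here is a made-up name):

  // Illustrative sketch of what lock_report_waiters_to_mysql() does:
  // for each transaction holding a conflicting lock, report the wait
  // to the server layer via thd_report_wait_for().
  static void report_waiters(THD *waiter_thd, trx_t **blockers, size_t n)
  {
    for (size_t i= 0; i < n; i++)
    {
      THD *blocker_thd= blockers[i]->mysql_thd;
      if (blocker_thd)
      {
        // If waiter_thd must commit before blocker_thd in the replication
        // order, the server kills blocker_thd; parallel replication then
        // rolls it back and retries it after waiter_thd commits.
        thd_report_wait_for(waiter_thd, blocker_thd);
      }
    }
  }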
I have been running sysbench OLTP with a MariaDB 10.1 master-slave topology. I have not seen any replication errors when slave parallel mode is conservative.
No, it should not happen, because in conservative mode transactions are not run in parallel on a slave unless they ran without lock conflicts on the master (both transactions reached the commit point at the same time).

But in InnoDB/XtraDB, there are some interesting (but very rare) corner cases where two transactions may or may not have lock conflicts depending on the exact order of execution. So for these cases, the thd_report_wait_for() mechanism is also needed.
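The conservative rule itself is roughly this, in sketch form (simplified, names made up; on the master, transactions that group-commit together get the same commit_id in their GTID event):

  // Illustrative sketch: the slave only applies two adjacent
  // transactions in parallel when their master commit ids match.
  static bool can_apply_in_parallel(uint64_t prev_commit_id,
                                    uint64_t next_commit_id)
  {
    return next_commit_id != 0 && next_commit_id == prev_commit_id;
  }

- Kristian.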
Rich Prohaska <prohaska7@gmail.com> writes:
Is TokuDB supposed to call the thd_report_wait_for API just prior to a thread waiting on a TokuDB lock?
If I wanted to look into implementing this, do you have a quick pointer to where in the TokuDB code I could start looking? Like the place where lock waits are done? (I have not worked with the TokuDB source before, though I am somewhat familiar with the concept of how it works.) - Kristian.
Kristian Nielsen <knielsen@knielsen-hq.org> writes:

If I wanted to look into implementing this, do you have a quick pointer to where in the TokuDB code I could start looking? Like the place where lock waits are done?
I took just a quick look at the code, in particular lock_request.cc:

  int lock_request::start(void) {
      txnid_set conflicts;
      ....
      r = m_lt->acquire_write_lock(m_txnid, m_left_key, m_right_key, &conflicts, m_big_txn);
      if (r == DB_LOCK_NOTGRANTED) {

It seems to me that at this point in the code, what is required is to call thd_report_wait_for() on each element in the set conflicts, and that should be about it.

Some mechanism will be needed to get from TXNID to THD, of course. A more subtle problem is how to ensure that those THDs cannot go away while iterating? I'm not familiar with what kind of inter-thread locking is used around TokuDB row locks. But it looks like a proof-of-concept patch for TokuDB optimistic parallel replication might be fairly simple to do.

I also noticed that TokuDB does not support handlerton->kill_query() (so KILL cannot break a TokuDB row lock wait). That should be fine, the KILL will be handled when the wait finishes (or if _all_ transactions are waiting on the row locks of each other, then normal TokuDB deadlock detection will handle things).
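Something like this, perhaps (a very rough sketch - txnid_to_thd() is a made-up helper that does not exist yet, and the locking needed to keep those THDs alive while iterating is glossed over):

  r = m_lt->acquire_write_lock(m_txnid, m_left_key, m_right_key, &conflicts, m_big_txn);
  if (r == DB_LOCK_NOTGRANTED) {
      // Report each conflicting transaction to the server layer so that
      // optimistic parallel replication can resolve ordering deadlocks.
      THD *waiter_thd = txnid_to_thd(m_txnid);               // hypothetical lookup
      for (uint32_t i = 0; i < conflicts.size(); i++) {
          THD *blocker_thd = txnid_to_thd(conflicts.get(i)); // hypothetical
          if (waiter_thd && blocker_thd)
              thd_report_wait_for(waiter_thd, blocker_thd);  // may kill blocker
      }
  }

- Kristian.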
On Wed, Aug 10, 2016 at 8:02 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
It seems to me that at this point in the code, what is required is to call thd_report_wait_for() on each element in the set conflicts, and that should be about it.
I agree. IMO, the implementation could add a lock-wait-for callback to the TokuFT layer (see set_lock_timeout_callback for a similar method); a sketch follows. The TokuFT lock manager would call this function (if it is set). The TokuDB layer would implement the callback and call the appropriate MySQL API that handles the replication logic. There is enough information to map TokuFT transaction ids to MySQL client ids (see below).
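Roughly like this (sketch only - the callback name and signature are made up, modeled on the existing set_lock_timeout_callback; nothing below exists in the tree today):

  // Hypothetical TokuFT-side lock-wait-for callback (illustrative).
  typedef void (*lt_lock_wait_for_callback)(TXNID waiter, TXNID blocker);

  void locktree_manager::set_lock_wait_for_callback(lt_lock_wait_for_callback cb) {
      m_lock_wait_for_callback = cb;
  }

  // Invoked by the lock manager whenever a lock request must wait:
  void locktree_manager::report_wait(TXNID waiter, const txnid_set &conflicts) {
      if (m_lock_wait_for_callback == nullptr)
          return;
      for (uint32_t i = 0; i < conflicts.size(); i++)
          m_lock_wait_for_callback(waiter, conflicts.get(i));
  }

The TokuDB layer would register a callback that maps each TXNID to its THD and calls thd_report_wait_for().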
Some mechanism will be needed to get from TXNID to THD, of course. A more subtle problem is how to ensure that those THDs cannot go away while iterating? I'm not familiar with what kind of inter-thread locking is used around TokuDB row locks.
The TokuDB information schema needs to map a TokuFT transaction id to a MySQL client id, so the code attaches the MySQL client id to the TokuFT transaction object via set_client_id calls, and gets the mapping via get_client_id calls. There should be a function that maps a MySQL client id to a MySQL THD (if there isn't one already). The TokuFT lock manager lock can ensure that the lock tree state is not changed while the wait-for callback is called. This means that granted locks cannot be removed, which means that the transaction that owns the locks cannot be destroyed, which means that the owning THD cannot go away.
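Something like this could do the client-id-to-THD lookup, similar to what KILL processing already does (sketch only - the wrapper name is made up, and MariaDB 10.1 internals are assumed):

  // Hypothetical helper: scan the server's global thread list for the THD
  // with the given thread id. LOCK_thread_count protects the traversal;
  // the TokuFT lock manager lock (as described above) is what keeps the
  // owning THD alive for the duration of the callback.
  static THD *find_thd_by_client_id(my_thread_id id) {
      THD *found = NULL;
      mysql_mutex_lock(&LOCK_thread_count);
      I_List_iterator<THD> it(threads);
      while (THD *tmp = it++) {
          if (tmp->thread_id == id) {
              found = tmp;
              break;
          }
      }
      mysql_mutex_unlock(&LOCK_thread_count);
      return found;
  }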
Rich,

Cool, thanks for the pointers, that looks very helpful. I'll try to see if I can come up with something.

- Kristian.
Could add some methods to TokuFT like set/get_client_extra(void *extra), where the extra is the THD pointer in the case of TokuDB.
Or could change the get/set_client_id APIs to include an extra pointer in addition to the id parameter, as sketched below.
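For example (sketch only - the extended signatures are hypothetical, modeled on the existing TokuFT client-id calls):

  // Extended client-id APIs carrying an opaque pointer (the THD, for TokuDB).
  void toku_txn_set_client_id(TOKUTXN txn, uint64_t client_id, void *client_extra);
  void toku_txn_get_client_id(TOKUTXN txn, uint64_t *client_id, void **client_extra);

  // TokuDB would store the THD when it creates the transaction, e.g.:
  //     toku_txn_set_client_id(txn, thd_get_thread_id(thd), thd);
  // and the lock-wait-for callback could then recover it directly:
  //     uint64_t id; void *extra;
  //     toku_txn_get_client_id(blocker_txn, &id, &extra);
  //     thd_report_wait_for(waiter_thd, static_cast<THD *>(extra));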
participants (3)
- jocelyn fournier
- Kristian Nielsen
- Rich Prohaska