[Maria-discuss] Deadlocks in R/W phase of Sysbench
Hello all,

I have used Sysbench extensively to test MariaDB CPU load factors between various architectures.

I am noticing massive amounts of "[ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction" in the MariaDB error log during the R/W phase. The Sysbench R/W TPS rates are still respectable for my purposes, but how do I correct this condition? Can I ignore it, or must I live with it?

I have tried three different MariaDB releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this message appears thousands of times across all releases in the Sysbench R/W phase.

Here is the Sysbench command that drives the workload against MariaDB:

sysbench --test=lua/oltp.lua --oltp_tables_count=128 --oltp-table-size=85937 --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 --max-requests=0 --mysql-user=root --percentile=99 run

Would anyone have a tip, idea, or pointer?

Regards,
JC

John Cassidy
Obere Bühlstrasse 21
8700 Küsnacht (ZH)
Switzerland / Suisse / Schweiz
Mobile: +49 152 58961601 (Germany)
Mobile: +352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97 (CH)
Landline: +41 44 509 1957
http://www.jdcassidy.eu
"Aut viam inveniam aut faciam" - Hannibal.
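The error text itself names the usual remedy: restart the transaction. As a minimal sketch of that idea (not a description of what Sysbench or MariaDB does internally), the Python below retries a unit of work when it fails with error code 1213, the MySQL/MariaDB code that accompanies this deadlock message. `DatabaseError`, `run_with_retry`, and the backoff values are hypothetical names and numbers chosen for illustration; a real client driver would raise its own exception type.

```python
import random
import time

# MySQL/MariaDB error code for "Deadlock found when trying to get lock".
ER_LOCK_DEADLOCK = 1213


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries an error code."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=5, base_delay=0.01):
    """Call transaction() and retry it only when it fails with a deadlock error.

    Any other error, or a deadlock on the final attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            if exc.errno != ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            # Randomised, growing pause so that the colliding transactions
            # are less likely to collide again in lockstep.
            time.sleep(base_delay * attempt * random.random())
```

Under this pattern a deadlock costs one rolled-back transaction plus a retry, which may be why throughput can stay respectable while the log fills up. Whether the volume of deadlocks is acceptable is a separate question.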
I run sysbench frequently with MySQL (not MariaDB yet) and don't recall ever seeing this error. But I use much larger values for --oltp-table-size: you used 85937 and I use 1M or larger. Maybe there is too much data contention with smaller tables, even if a large number of tables is used (128 in your case). I use a smaller number of tables, between 4 and 20.

In reply to J. Cassidy <sean@jdcassidy.eu>, Thu, Jun 8, 2017 at 12:48 AM.
-- Mark Callaghan mdcallag@gmail.com
Hello Mark,

appreciate the reply.

The OLTP table count and table size give me a DB size of approx. 12 GB. This is what I want. In the meantime I have looked at some older logs and see that whatever number of threads I specify (8, 16, 32, 64 or 128), the deadlock messages still surface in the R/W phase of Sysbench. I even dropped in two different MariaDB release levels (10.0.19, 10.1.24) to see whether it would make a difference; the deadlocks are still there. I am using Sysbench 0.5 for these tests. I am currently using MariaDB 10.1.24, built directly on the machine. S390x by the way, but the problem is also occurring on an x86-64 box.

Regards,
JC
Hi!

I just tried this on x86: 40 threads against a single table with 10k rows. No deadlocks, "tps: 12372.40, reads: 173219.52, writes: 49490.51".

This is with the latest snapshot of 10.1 and sysbench 1.0.

I believe you shouldn't see deadlocks on such a big data set. Probably something is wrong with your configs?

Regards,
Sergey

In reply to J. Cassidy, Thu, Jun 08, 2017 at 04:51:31PM +0200.
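One rough way to put numbers on the contention question raised above is a birthday-style estimate: if every thread holds write locks on a few uniformly chosen rows at the same instant, how likely is it that at least two of those rows coincide? The sketch below is only that, a back-of-the-envelope model. The figure of 3 rows locked per transaction is an assumption, not something stated in the thread, and an overlap is merely a precondition for a lock wait or deadlock, not a deadlock in itself. The thread counts, table counts, and table sizes are taken from the messages above.

```python
import math


def total_rows(tables, rows_per_table):
    """Total number of rows across all sysbench tables."""
    return tables * rows_per_table


def overlap_probability(threads, rows_locked_per_txn, rows):
    """Birthday-style estimate that at least two concurrently locked rows coincide.

    Assumes every thread holds rows_locked_per_txn row locks at the same
    instant and that rows are picked uniformly over the whole data set.
    """
    k = threads * rows_locked_per_txn
    return 1.0 - math.exp(-k * (k - 1) / (2.0 * rows))


ROWS_LOCKED = 3  # assumption for illustration only

scenarios = {
    "JC: 128 tables x 85937 rows, 128 threads": (128, total_rows(128, 85937)),
    "Mark (smallest): 4 tables x 1M rows, 128 threads": (128, total_rows(4, 1_000_000)),
    "Sergey: 1 table x 10k rows, 40 threads": (40, total_rows(1, 10_000)),
}

for name, (threads, rows) in scenarios.items():
    p = overlap_probability(threads, ROWS_LOCKED, rows)
    print(f"{name}: {rows} rows, overlap estimate {p:.4f}")
```

Under these assumptions the 128-table data set holds roughly 11 million rows in total, which is in the same range as 4 to 20 tables of 1M rows each, and the estimated overlap for it is far lower than for the single 10k-row table that showed no deadlocks. That is consistent with the remark that deadlocks are not expected on a data set of this size.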
Privet Sergey,

I do not think I have a problem with my config(s).

I am not on the machine at the moment. From memory, with 16 cores / 32 vCPUs, 64 GB memory, and buffer pools, logs etc. tailored to this configuration, the TPS rates are:

64 threads - R/O 73K, R/W 26K
128 threads - R/O 71K, R/W 23K

I am putting the DB in RAM of course to obviate I/O issues. I am interested in seeing what is happening with the CPU caches and nest.

NMON tells me that for the R/O runs I have approximately 85% User, 12-14% System and the rest Idle. This stays consistent throughout the run (300 seconds R/O, 600 seconds R/W), which is reasonable.

On the R/W runs, NMON's CPU usage display shows that some CPUs are barely firing, or only intermittently, and the granularity I had with R/O is gone. This would seem to tie in with what I see in the MariaDB log: headwinds, violent storms and deadlocks.

Any further information you require, please let me know.

Regards,
JC
Privet JC,

Still no deadlocks on my side, even on ramdisk. Note that we have a few fixes in 10.2 that should improve RW scalability. Alas, there are some reports about scalability regressions that haven't been addressed yet.

It would be nice if you could share your MariaDB configuration. Did you also try reducing the number of tables down to 4-8?

Regards,
Sergey

In reply to J. Cassidy, Thu, Jun 08, 2017 at 06:06:03PM +0200.
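The suggestion to try 4-8 tables can be tested without changing the overall data volume by holding the total row count constant and scaling the per-table size up. The sketch below only builds the argument lists; it reuses the flags from the original command in this thread unchanged. The helper name `sysbench_args` and the choice to keep total rows fixed are assumptions made for illustration, and the tables would presumably need to be re-created with matching options before each run.

```python
# Total rows in the original setup: 128 tables x 85937 rows each.
ORIGINAL_TABLES = 128
ORIGINAL_TABLE_SIZE = 85937
TOTAL_ROWS = ORIGINAL_TABLES * ORIGINAL_TABLE_SIZE


def sysbench_args(tables, threads=128, total_rows=TOTAL_ROWS):
    """Build the run command from the thread with a different table count.

    The per-table size is scaled so the total row count stays (almost)
    the same; integer division drops any remainder.
    """
    rows_per_table = total_rows // tables
    return [
        "sysbench",
        "--test=lua/oltp.lua",
        f"--oltp_tables_count={tables}",
        f"--oltp-table-size={rows_per_table}",
        "--rand-seed=42",
        "--rand-type=uniform",
        f"--num-threads={threads}",
        "--oltp-read-only=off",
        "--report-interval=2",
        "--mysql-socket=/var/lib/mysql/mysql.sock",
        "--max-time=201",
        "--max-requests=0",
        "--mysql-user=root",
        "--percentile=99",
        "run",
    ]


for tables in (4, 8, 128):
    print(" ".join(sysbench_args(tables)))
```

Comparing the deadlock counts across these runs would show whether the table count matters at all, independently of the data set size.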
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp
participants (3)
- J. Cassidy
- MARK CALLAGHAN
- Sergey Vojtovich