Re: [Maria-discuss] Deadlocks in R/W phase of Sysbench
All, as discussed yesterday, here is my setup and configuration.

*************************************************************
* 16 VCPU, 64 GB memory, DB size 12 GB (in RAM), MariaDB 10.1.24
* Single image. No clustering or outside network interaction.
* SuSE SLES 12 SP2 using a 4.4.49-92 kernel.
*************************************************************

### Sysbench R/W - 64 Threads

sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 64
Report intermediate results every 2 second(s)
Initializing random number generator from seed (42).

Threads started!

[   2s] threads: 64, tps: 51141.62, reads: 717023.15, writes: 204757.48, response time: 4.72ms (99%), errors: 54.00, reconnects: 0.00
[   4s] threads: 64, tps: 47938.19, reads: 671195.72, writes: 191759.28, response time: 4.65ms (99%), errors: 5.00, reconnects: 0.00
[   6s] threads: 64, tps: 46908.41, reads: 656790.25, writes: 187640.64, response time: 4.65ms (99%), errors: 2.50, reconnects: 0.00
[   8s] threads: 64, tps: 45940.12, reads: 643191.64, writes: 183781.47, response time: 4.44ms (99%), errors: 3.00, reconnects: 0.00
[  10s] threads: 64, tps: 45789.91, reads: 641083.20, writes: 183161.13, response time: 4.77ms (99%), errors: 2.00, reconnects: 0.00
[  12s] threads: 64, tps: 45408.37, reads: 635825.16, writes: 181672.97, response time: 4.80ms (99%), errors: 1.50, reconnects: 0.00
[  14s] threads: 64, tps: 44319.93, reads: 620440.07, writes: 177262.74, response time: 4.50ms (99%), errors: 0.50, reconnects: 0.00
[  16s] threads: 64, tps: 44634.78, reads: 624843.97, writes: 178523.13, response time: 4.71ms (99%), errors: 0.50, reconnects: 0.00
. . .
[ 572s] threads: 64, tps: 24980.71, reads: 349707.39, writes: 99924.32, response time: 5.92ms (99%), errors: 0.00, reconnects: 0.00
[ 574s] threads: 64, tps: 25337.03, reads: 354796.98, writes: 101355.13, response time: 6.19ms (99%), errors: 0.00, reconnects: 0.00
[ 576s] threads: 64, tps: 25196.31, reads: 352683.33, writes: 100777.24, response time: 6.49ms (99%), errors: 0.00, reconnects: 0.00
[ 578s] threads: 64, tps: 25235.61, reads: 353317.05, writes: 100941.94, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00
[ 580s] threads: 64, tps: 25241.11, reads: 353395.59, writes: 100956.96, response time: 6.09ms (99%), errors: 0.00, reconnects: 0.00
[ 582s] threads: 64, tps: 25146.18, reads: 352056.04, writes: 100599.73, response time: 5.96ms (99%), errors: 0.00, reconnects: 0.00
[ 584s] threads: 64, tps: 25242.59, reads: 353362.72, writes: 100978.85, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 586s] threads: 64, tps: 25297.02, reads: 354187.80, writes: 101166.58, response time: 5.89ms (99%), errors: 0.00, reconnects: 0.00
[ 588s] threads: 64, tps: 25070.60, reads: 350944.86, writes: 100283.39, response time: 6.33ms (99%), errors: 0.00, reconnects: 0.00
[ 590s] threads: 64, tps: 25115.20, reads: 351666.84, writes: 100466.81, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 592s] threads: 64, tps: 25165.11, reads: 352314.52, writes: 100668.94, response time: 5.98ms (99%), errors: 0.00, reconnects: 0.00
[ 594s] threads: 64, tps: 25165.99, reads: 352322.89, writes: 100703.97, response time: 6.30ms (99%), errors: 0.00, reconnects: 0.00
[ 596s] threads: 64, tps: 24854.73, reads: 347945.66, writes: 99370.90, response time: 6.17ms (99%), errors: 0.00, reconnects: 0.00
[ 598s] threads: 64, tps: 24888.16, reads: 348463.80, writes: 99556.16, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 600s] threads: 64, tps: 25158.22, reads: 352195.58, writes: 100633.38, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00

OLTP test statistics:
    queries performed:
        read:                            256244548
        write:                           73212536
        other:                           36606172
        total:                           366063256
    transactions:                        18302990  (30454.11 per sec.)
    read/write requests:                 329457084 (548179.44 per sec.)
    other operations:                    36606172  (60908.54 per sec.)
    ignored errors:                      192       (0.32 per sec.)
    reconnects:                          0         (0.00 per sec.)

General statistics:
    total time:                          601.0023s
    total number of events:              18302990
    total time taken by event execution: 38450.5507s
    response time:
        min:                             0.42ms
        avg:                             2.10ms
        max:                             62.26ms
        approx. 99 percentile:           5.62ms

Threads fairness:
    events (avg/stddev):                 285984.2188/3788.79
    execution time (avg/stddev):         600.7899/0.00

=================================================================================================================

### Excerpt from the MariaDB error log (thousands of similar messages)

. . .
2017-06-08 8:10:44 4395980287248 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353081699600 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4360227998992 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4390705949968 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4360227392784 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
. . .
===================================================================================================================
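The one-line [ERROR] entries above do not name the deadlock participants. A minimal sketch for capturing the full reports, assuming the InnoDB diagnostics normally available in a 10.1 build (innodb_print_all_deadlocks and SHOW ENGINE INNODB STATUS); the socket path is taken from the sysbench command further down the thread:

    # Log every deadlock's full report to the error log (dynamic, no restart needed):
    mysql --socket=/var/lib/mysql/mysql.sock -u root \
          -e "SET GLOBAL innodb_print_all_deadlocks = ON;"

    # Or pull the most recent deadlock on demand; the report shows both
    # transactions, the statements involved, and the locks they waited on:
    mysql --socket=/var/lib/mysql/mysql.sock -u root \
          -e "SHOW ENGINE INNODB STATUS\G" \
        | sed -n '/LATEST DETECTED DEADLOCK/,/TRANSACTIONS/p'

The LATEST DETECTED DEADLOCK section usually identifies the contended index and rows, which is the quickest way to tell whether the contention is in the data or somewhere unexpected.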
The configuration below has been tried and tested and works across different architectures, e.g. S390X and X86-64.

[mysqld_safe]
user=root
password=''

[mysqld]
symbolic-links=0
open_files_limit = 65535        # was 102400
max_allowed_packet = 16M        # was 768M
max-connect-errors = 1000000
connect_timeout = 120           # was 60
performance-schema = false
net_read_timeout = 720
log-warnings=2

##### files and sockets
# innodb_log_group_home_dir = /data-lun22/logs
pid-file=/tmp/mysqld.pid.sysbench
port = 3306
log-error=/var/log/mariadb/mariadb-error-john.log
general_log_file=/var/log/mariadb/mariadb-john.log
long_query_time = 0
slow_query_log = 0
# slow_query_log_file = /var/log/mariadb/mariadb-slowquery.log
# general-log
# general_log_file=/var/log/mariadb/mariadb-general.log

##### MariaDB temporary tables (tmpdir in RAM)
tmpdir = /mnt/tmpdir

##### non-InnoDB options (fixed)
max_connections = 4000
back_log = 150
table_open_cache = 8000         # was 2048 # was 4000 # was 8000
key_buffer_size = 16M
query_cache_type = 0            # was 1 # was 0
join_buffer_size = 32K
sort_buffer_size = 32K
autocommit = 0
thread-cache-size = 50
table-definition-cache = 1024
table-open-cache = 2048

#### fixed InnoDB options
innodb_file_per_table = true
innodb_open_files = 4096        # was 8192 # was 4096
innodb_data_file_path = ibdata1:250M:autoextend
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT_NO_FSYNC
innodb_log_buffer_size = 256M
innodb_log_file_size = 4G       # was 14G # was 4G
innodb_log_files_in_group = 2
innodb_buffer_pool_size = 36G   # was 8G # was 48G
innodb_buffer_pool_instances = 8  # was 16 # was 6 # was 32
innodb_adaptive_hash_index_partitions = 32
innodb_thread_concurrency = 0

#### tuning for RAM disk
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_io_capacity = 8000       # was 20000
innodb_io_capacity_max = 40000
innodb_lru_scan_depth = 4096
innodb_purge_threads = 2
innodb_read_io_threads = 2
innodb_write_io_threads = 2

<*> Note: NMON reports 35% memory usage in the middle of the above-mentioned benchmark, ergo the DB is in cache. No paging.

++ Note: The R/W TPS rates are "respectable" based on observations from countless other runs on other configurations/architectures. The TPS rates would obviously improve if the deadlock situation outlined above were resolved.

So there you have it. Any tips / pointers that would enlighten me are much appreciated.

Regards,

JC
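Since a misapplied config is one suspicion raised later in the thread, a quick cross-check of what the running server actually picked up may help. A sketch using the standard client (socket path and root user assumed as in the sysbench command):

    mysql --socket=/var/lib/mysql/mysql.sock -u root -e "
        SHOW GLOBAL VARIABLES WHERE Variable_name IN
            ('autocommit', 'tx_isolation',
             'innodb_buffer_pool_size', 'innodb_log_file_size',
             'innodb_flush_log_at_trx_commit', 'innodb_thread_concurrency');"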
Privet Sergey,

I do not think I have a problem with my config(s). I am not on the machine at the moment. From memory, with 16 cores / 32 VCPU, 64 GB memory, and buffer pools, logs etc. tailored to that configuration, the TPS rates are:

64 threads  - R/O 73K, R/W 26K
128 threads - R/O 71K, R/W 23K

I am putting the DB in RAM, of course, to obviate I/O issues. I am interested in seeing what is happening with the CPU caches and nest. NMON tells me that for the R/O runs I have approximately 85% User, 12-14% System and the rest Idle - this stays consistent throughout the run (300 seconds R/O, 600 seconds R/W), which is reasonable. On the R/W runs, NMON's CPU usage display shows that some CPUs are barely firing, or only firing intermittently, and the steadiness I had with R/O is gone. This would seem to tie in with what I see in the MariaDB log - headwinds, violent storms and deadlocks.

Any further information you require, please let me know.

Regards,

JC

Hi!

I just tried this on x86: 40 threads against a single table having 10k rows. No deadlocks; "tps: 12372.40, reads: 173219.52, writes: 49490.51". This is with the latest snapshot of 10.1 and sysbench 1.0.

I believe you shouldn't see deadlocks on such a big data set. Probably something is wrong with your configs?

Regards,
Sergey
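For anyone reproducing Sergey's test: sysbench 1.0 renamed most of the 0.5-era options, so a roughly equivalent run of the command quoted further down would look something like the sketch below (option names per sysbench 1.0; socket, user, and sizes assumed unchanged from the original command):

    sysbench oltp_read_write --tables=128 --table-size=85937 \
        --rand-seed=42 --rand-type=uniform --threads=128 \
        --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock \
        --time=201 --events=0 --mysql-user=root --percentile=99 run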
On Thu, Jun 08, 2017 at 04:51:31PM +0200, J. Cassidy wrote:

Hello Mark,

appreciate the reply. The OLTP table size and table count give me a DB size of appx 12GB, which is what I want. I have in the meantime looked at some older logs and see that however many threads I specify (8, 16, 32, 64 or 128), the deadlock messages still surface in the R/W phase of Sysbench. I even dropped in two different MariaDB release levels (10.0.19, 10.1.24) to see whether it would make a difference; the deadlocks are still there. I am using Sysbench 0.5 for these tests. I am currently using MariaDB 10.1.24 - built directly on the machine. S390x, by the way, but the problem is also occurring on an X86-64 box.
Regards,
JC
I run sysbench frequently with MySQL (not MariaDB yet) and don't recall ever seeing this error. But I use much larger values for --oltp-table-size. You used 85937 and I use 1M or larger. Maybe there is too much data contention with smaller tables, even if a large number of tables is used (128 in your case). I use a smaller number of tables - between 4 and 20.
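To test Mark's theory, the data set could be rebuilt with fewer, larger tables. A sketch using the 0.5-era option names from the command below (socket and user assumed as elsewhere in the thread; 8 tables of 1M rows chosen purely as an illustration of Mark's numbers):

    # Drop the existing sysbench tables, then re-create them larger.
    sysbench --test=lua/oltp.lua --oltp_tables_count=8 --oltp-table-size=1000000 \
        --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root cleanup

    sysbench --test=lua/oltp.lua --oltp_tables_count=8 --oltp-table-size=1000000 \
        --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root prepare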
On Thu, Jun 8, 2017 at 12:48 AM, J. Cassidy <sean@jdcassidy.eu> wrote:

Hello all,

I have used Sysbench extensively to test MariaDB CPU load factors between various architectures. I am noticing massive amounts of "[ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction" in the MariaDB error log during the R/W phase. The Sysbench R/W TPS rates are still respectable for my purposes, but how do I correct this condition? Can I ignore it, or must I live with it? I have tried three different MariaDB releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this message (thousands of them) is happening across all releases in the Sysbench R/W phase.
Here is the Sysbench command that drives the workload against MariaDB:

sysbench --test=lua/oltp.lua --oltp_tables_count=128 --oltp-table-size=85937 \
    --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off \
    --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 \
    --max-requests=0 --mysql-user=root --percentile=99 run
Would anyone have a tip / idea / pointer?
Regards,
JC
John Cassidy
Obere Bühlstrasse 21 8700 Küsnacht (ZH) Switzerland / Suisse / Schweiz
Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149 (Luxembourg) Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957
"Aut viam inveniam aut
faciam" - Hannibal.
-- Mark Callaghan
mdcallag@gmail.com
Thanks. I was hoping there would be an easy fix if you had done something like use serializable isolation, but you haven't done that.
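For completeness, the isolation level in effect can be confirmed from the client. A sketch (tx_isolation is the variable name in the 10.1 era; socket and user as before):

    mysql --socket=/var/lib/mysql/mysql.sock -u root \
          -e "SELECT @@GLOBAL.tx_isolation, @@SESSION.tx_isolation;"
    # InnoDB's default is REPEATABLE-READ; SERIALIZABLE would only be in
    # effect if someone had explicitly run e.g.:
    #   SET GLOBAL TRANSACTION ISOLATION LEVEL SERIALIZABLE;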
Mark,

still scratching head...

Regards,

JC
Hello,

What filesystem are you using, and which I/O scheduler? deadline or noop is a must-do. Also, mounting with noatime can help for writes.

Would you try these settings? I have found them to make a difference:

autocommit = 1
innodb_adaptive_hash_index = 0
innodb_max_dirty_pages_pct = 20
innodb_write_io_threads = 64
innodb_log_file_size = 1024M
innodb_log_files_in_group = 4
innodb_thread_concurrency = 0
innodb_purge_threads = 8
innodb_change_buffering = none
innodb_open_files = 16384
innodb_file_per_table = 1
innodb_autoinc_lock_mode = 2

And comment out:

# innodb_lru_scan_depth = 4096

Stéphane Varoqui, Senior Consultant
Phone: +33 695-926-401, skype: svaroqui
http://www.mariadb.com
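Several of these are dynamic, so they could be trialled without a restart. A sketch (the scheduler path assumes the data sits on sda; adjust to the actual block device, and note the log file size/count changes still require a clean shutdown and config edit):

    # Show the active I/O scheduler; the bracketed entry is the current one.
    cat /sys/block/sda/queue/scheduler

    # Apply the dynamic subset of the suggestions on the running server.
    mysql --socket=/var/lib/mysql/mysql.sock -u root -e "
        SET GLOBAL innodb_adaptive_hash_index = 0;
        SET GLOBAL innodb_max_dirty_pages_pct = 20;
        SET GLOBAL innodb_change_buffering = 'none';"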
Salut Stephane, I/O does not enter into the equation, the database is in memory (RAM). Regards, JC Hello, What IO scheduler are you using for FS and witch one ?deadline or loop is a must do. Also mounting with noatime can help for write Would you try such settings i found to make a differenceautocommit = 1 innodb_adaptive_hash_index=0innodb_max_dirty_page_pct=20innodb_write_io_threads =64innodb_log_file_size = 1024Minnodb_log_file_in_group = 4innodb_log_files_in_group = 4 innodb_thread_concurrency = 0innodb_purge_threads = 8 innodb_change_buffering=noneinnodb_open_files = 16384innodb_file_per_table=1innodb_autoinc_lock_mode = 2 And comment# innodb_lru_scan_depth = 4096 Stéphane Varoqui, Senior ConsultantPhone: +33 695-926-401, skype: svaroquihttp://www.mariadb.com Le 9 juin 2017 à 16:22, J. Cassidy sean@jdcassidy.eu> a écrit : Mark, still scratching head... Regards, JC Thanks. I was hoping there was an easy fix if you did something like used serializable isolation, but you haven't done that.On Fri, Jun 9, 2017 at 12:27 AM, J. Cassidy sean@jdcassidy.eu> wrote: All, as discussed yesterday, here is my setup and configuration. ************************************************************* 16 VCPU 64GB Memory, DB size 12 GB (In RAM), MariaDB 10.1.24* Single Image. No clustering or outside network interaction * SuSE SLES 12 SP2 using a 4.4.49-92 kernel * ************************************************************* ### Sysbench R/W - 64 Threads sysbench 0.5: multi-threaded system evaluation benchmark Running the test with following options: Number of threads: 64 Report intermediate results every 2 second(s) Initializing random number generator from seed (42). Threads started! [ 2s] threads: 64, tps: 51141.62, reads: 717023.15, writes: 204757.48, response time: 4.72ms (99%), errors: 54.00, reconnects: 0.00 [ 4s] threads: 64, tps: 47938.19, reads: 671195.72, writes: 191759.28, response time: 4.65ms (99%), errors: 5.00, reconnects: 0.00 [ 6s] threads: 64, tps: 46908.41, reads: 656790.25, writes: 187640.64, response time: 4.65ms (99%), errors: 2.50, reconnects: 0.00 [ 8s] threads: 64, tps: 45940.12, reads: 643191.64, writes: 183781.47, response time: 4.44ms (99%), errors: 3.00, reconnects: 0.00 [ 10s] threads: 64, tps: 45789.91, reads: 641083.20, writes: 183161.13, response time: 4.77ms (99%), errors: 2.00, reconnects: 0.00 [ 12s] threads: 64, tps: 45408.37, reads: 635825.16, writes: 181672.97, response time: 4.80ms (99%), errors: 1.50, reconnects: 0.00 [ 14s] threads: 64, tps: 44319.93, reads: 620440.07, writes: 177262.74, response time: 4.50ms (99%), errors: 0.50, reconnects: 0.00 [ 16s] threads: 64, tps: 44634.78, reads: 624843.97, writes: 178523.13, response time: 4.71ms (99%), errors: 0.50, reconnects: 0.00 . . . 
[ 572s] threads: 64, tps: 24980.71, reads: 349707.39, writes: 99924.32, response time: 5.92ms (99%), errors: 0.00, reconnects: 0.00 [ 574s] threads: 64, tps: 25337.03, reads: 354796.98, writes: 101355.13, response time: 6.19ms (99%), errors: 0.00, reconnects: 0.00 [ 576s] threads: 64, tps: 25196.31, reads: 352683.33, writes: 100777.24, response time: 6.49ms (99%), errors: 0.00, reconnects: 0.00 [ 578s] threads: 64, tps: 25235.61, reads: 353317.05, writes: 100941.94, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00 [ 580s] threads: 64, tps: 25241.11, reads: 353395.59, writes: 100956.96, response time: 6.09ms (99%), errors: 0.00, reconnects: 0.00 [ 582s] threads: 64, tps: 25146.18, reads: 352056.04, writes: 100599.73, response time: 5.96ms (99%), errors: 0.00, reconnects: 0.00 [ 584s] threads: 64, tps: 25242.59, reads: 353362.72, writes: 100978.85, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00 [ 586s] threads: 64, tps: 25297.02, reads: 354187.80, writes: 101166.58, response time: 5.89ms (99%), errors: 0.00, reconnects: 0.00 [ 588s] threads: 64, tps: 25070.60, reads: 350944.86, writes: 100283.39, response time: 6.33ms (99%), errors: 0.00, reconnects: 0.00 [ 590s] threads: 64, tps: 25115.20, reads: 351666.84, writes: 100466.81, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00 [ 592s] threads: 64, tps: 25165.11, reads: 352314.52, writes: 100668.94, response time: 5.98ms (99%), errors: 0.00, reconnects: 0.00 [ 594s] threads: 64, tps: 25165.99, reads: 352322.89, writes: 100703.97, response time: 6.30ms (99%), errors: 0.00, reconnects: 0.00 [ 596s] threads: 64, tps: 24854.73, reads: 347945.66, writes: 99370.90, response time: 6.17ms (99%), errors: 0.00, reconnects: 0.00 [ 598s] threads: 64, tps: 24888.16, reads: 348463.80, writes: 99556.16, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00 [ 600s] threads: 64, tps: 25158.22, reads: 352195.58, writes: 100633.38, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00 OLTP test statistics: queries performed: read: 256244548 write: 73212536 other: 36606172 total: 366063256 transactions: 18302990 (30454.11 per sec.) read/write requests: 329457084 (548179.44 per sec.) other operations: 36606172 (60908.54 per sec.) ignored errors: 192 (0.32 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 601.0023s total number of events: 18302990 total time taken by event execution: 38450.5507s response time: min: 0.42ms avg: 2.10ms max: 62.26ms approx. 99 percentile: 5.62ms Threads fairness: events (avg/stddev): 285984.2188/3788.79 execution time (avg/stddev): 600.7899/0.00 ================================================================================================================= ### Excerpt from MariaDB Error Log (Thousands of similiar messages). . . . 
2017-06-08 8:10:44 4395980287248 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4353081699600 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4360227998992 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4390705949968 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4360227392784 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction 2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction . . . =================================================================================================================== The configuration below has been tried and tested and works across on different architectures e.g. S390X and S390X. [mysqld_safe] user=root password='' [mysqld] symbolic-links=0 open_files_limit = 65535 # was 102400 max_allowed_packet = 16M # was 768M max-connect-errors = 1000000 connect_timeout = 120 # was 60 performance-schema = false net_read_timeout = 720 log-warnings=2 #####files and sockets # innodb_log_group_home_dir= /data-lun22/logs pid-file=/tmp/mysqld.pid.sysbench port = 3306 # log-error=/var/log/mariadb/mariadb-error-john.log log-error=/var/log/mariadb/mariadb-error-john.log general_log_file=/var/log/mariadb/mariadb-john.log long_query_time = 0 slow_query_log = 0 # slow_query_log_file = /var/log/mariadb/mariadb-slowquery.log # general-log # general_log_file=/var/log/mariadb/mariadb-general.log ##### MariaDB temporary tables # tmpdir in RAM tmpdir = /mnt/tmpdir #####non innodb options (fixed) max_connections = 4000 back_log = 150 table_open_cache = 8000 # was 2048 # was 4000 # was 8000 key_buffer_size = 16M query_cache_type = 0 #was 1 # was 0 join_buffer_size = 32K sort_buffer_size = 32K autocommit = 0 thread-cache-size = 50 table-definition-cache = 1024 table-open-cache = 2048 ####fixed innodb options innodb_file_per_table = true innodb_open_files = 4096 # was 8192 # was 4096 innodb_data_file_path = ibdata1:250M:autoextend innodb_flush_log_at_trx_commit = 2 innodb_flush_method = O_DIRECT_NO_FSYNC innodb_log_buffer_size = 256M innodb_log_file_size = 4G # was 14G # was 4G innodb_log_files_in_group = 2 innodb_buffer_pool_size = 36G #was 8G # was 48G innodb_buffer_pool_instances = 8 # 16 # was 6 # was 32 innodb_adaptive_hash_index_partitions = 32 innodb_thread_concurrency = 0 ####tuning for RAM disk innodb_adaptive_flushing = 1 innodb_flush_neighbors = 0 innodb_io_capacity = 8000 # was 20000 innodb_io_capacity_max = 40000 innodb_lru_scan_depth = 4096 innodb_purge_threads = 2 innodb_read_io_threads = 2 innodb_write_io_threads = 2 Note NMON reports 35% memory usage in the middle of the abovementioned Banchmark ergo the DB is in cache. No paging. ++ Note The R/W TPS rate are "respectable" based on observations from countless other runs ran on other configurations/architecture. 
The TPS rates would obviously improve if the deadlock situation outlined above was resolved. So there you have it. Any tips / pointers that would enlighten me are much appreciated. Regards, JC Privet Sergey, do not think I have a problem with my config(s). I am not on the machine at the moment.
From memory, with 16 Cores / 32 VCPU, 64 GB memory, buffer pools, logs etc. tailored to this configuration, the TPS rate is
64 threads - R/O 73K R/W 26K 128 threads - R/O 71K R/W 23K. I am putting the DB in RAM of course to obviate I/O issues. I am interested in seeing what is happening with the CPU caches and nest.. NMON tells me that for the R/O runs I have approximately 85% User, 12-14% System and the rest Idle - this stays consistent throughout the run (300 seconds R/O, 600 seconds R/W) which is reasonable. On the R/W runs, NMONs CPU usage display shows that some CPUs are barely firing, or only intermittently and the granularity I had with R/O is gone. This would seem to tie in with what I see in the MariaDB log - headwinds, violent storms and deadlocks. Any further information you require, please let me know. Regards, JC Hi! I just tried this on x86: 40 threads against single table having 10k rows. No deadlocks, "tps: 12372.40, reads: 173219.52, writes: 49490.51". This is with latest snapshot of 10.1 and sysbench 1.0. I believe you shouldn't see deadlocks on such a big data set. Probably something is wrong with your configs? Regards, Sergey On Thu, Jun 08, 2017 at 04:51:31PM +0200, J. Cassidy wrote:
Hello Mark,
appreciate the reply.
The OLTP table size and OLTP
size give me a DB size of appx 12GB. This is what I want. I have in the
specify (8, 16,32, 64 or 128), the deadlock messages are still surfacing in
meantime, looked at some older logs and see that whatever amount of threads I the
R/W phase of Sysbench. I even dropped in two different MariaDB release levels (10.0.19, 10.1.24) to see whether it would make a difference, deadlock still there. I am using Sysbench 0.5 for these tests. I am currently using MariaDB 10.1.24 - built directly on the machine. S390x by the way, but the problem is also
occuring on an X86-64 box.
Regards,
JC
I run sysbench frequently with
MySQL
(not MariaDB yet) and don't recall ever seeing this error. But I use much larger values for --oltp-table-size. You used 85937 and I use 1M or larger. Maybe there is too much data contention with smaller tables, even if a large number of tables is used (128 in your case). I use a smaller number of tables - between 4 and 20. On Thu, Jun 8, 2017 at 12:48 AM, J. Cassidy sean@jdcassidy.eu> wrote: Hello all,
have used Sysbench extensively to test MariaDB CPU load factors between various
architectures.
I am noticing massive amounts of "[ERROR]
Deadlock found when trying to get lock; try restarting
mysqld: transaction" in the
MariaDB error log during the R/W phase. The Sysbench R/W TPS rates are still respectable for my purposes, but how do I correct this condition?? Can I ignore this / must live with it? I have tried three different MariaDB releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this message (x
1000's) is happening across all releases in the Sysbench R/W phase.
Here is the Sysbench command that drives the workload against
MariaDB -
sysbench --test=lua/oltp.lua --oltp_tables_count=128
--oltp-table-size=85937
--rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 --max-requests=0 --mysql-user=root --percentile=99 run
Would anyone
have a tip / idea /pointer?
Regards,
JC
John Cassidy
Obere Bühlstrasse 21
8700 Küsnacht (ZH)
Switzerland / Suisse / Schweiz
Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149
(Luxembourg)
Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957
"Aut viam inveniam aut faciam" - Hannibal.
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to :
maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp
--Mark Callaghan mdcallag@gmail.com
John Cassidy
Obere Bühlstrasse 21 8700
Küsnacht (ZH)
Switzerland / Suisse / Schweiz
Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149 (Luxembourg) Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957
"Aut viam inveniam aut faciam" - Hannibal.
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp John Cassidy Obere Bühlstrasse 21 8700 Küsnacht (ZH) Switzerland / Suisse / Schweiz Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149 (Luxembourg) Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957 http://www.jdcassidy.eu "Aut viam inveniam aut faciam" - Hannibal. John Cassidy Obere Bühlstrasse 21 8700 Küsnacht (ZH) Switzerland / Suisse / Schweiz Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149 (Luxembourg) Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957 http://www.jdcassidy.eu "Aut viam inveniam aut faciam" - Hannibal. _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp --Mark Callaghan mdcallag@gmail.com John Cassidy Obere Bühlstrasse 21 8700 Küsnacht (ZH) Switzerland / Suisse / Schweiz Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149 (Luxembourg) Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957 http://www.jdcassidy.eu "Aut viam inveniam aut faciam" - Hannibal._______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp John Cassidy Obere Bühlstrasse 21 8700 Küsnacht (ZH) Switzerland / Suisse / Schweiz Mobile: +49 152 58961601 (Germany) Mobile: +352 621 577 149 (Luxembourg) Mobile: +41 78 769 17 97 (CH) Landline: +41 44 509 1957 http://www.jdcassidy.eu "Aut viam inveniam aut faciam" - Hannibal.
Re,
Yep, get this, but the I/O is supposed to be fast and it still has to hit the FS as pages move from the buffer pool to the redo log to the tablespace. In a perfectly fast world all layers should have equal resources, so there is no reason not to give as many threads to write I/O as to the work itself of modifying pages.
Disabling autocommit with sysbench is suspicious and interesting: never questioned myself on this topic before?
/svar
Stéphane Varoqui, Senior Consultant
Phone: +33 695-926-401, skype: svaroqui
http://www.mariadb.com
On 9 June 2017 at 16:47, J. Cassidy <sean@jdcassidy.eu> wrote:
Salut Stephane,
I/O does not enter into the equation, the database is in memory (RAM).
Regards,
JC
Hello,
What I/O scheduler are you using for the FS, and which one? deadline or noop is a must-do. Also, mounting with noatime can help for writes.
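For reference, checking and switching the scheduler might look like this (a sketch; sda and the mount point are placeholders, not values from this thread):

# the active scheduler is the one shown in brackets
cat /sys/block/sda/queue/scheduler

# switch to deadline for this boot (as root)
echo deadline > /sys/block/sda/queue/scheduler

# remount the data filesystem with noatime
mount -o remount,noatime /var/lib/mysql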
Would you try such settings I found to make a difference:
autocommit = 1
innodb_adaptive_hash_index = 0
innodb_max_dirty_page_pct = 20
innodb_write_io_threads = 64
innodb_log_file_size = 1024M
innodb_log_files_in_group = 4
innodb_thread_concurrency = 0
innodb_purge_threads = 8
innodb_change_buffering = none
innodb_open_files = 16384
innodb_file_per_table = 1
innodb_autoinc_lock_mode = 2
And comment out:
# innodb_lru_scan_depth = 4096
Stéphane Varoqui, Senior Consultant
Phone: +33 695-926-401, skype: svaroqui
http://www.mariadb.com
On 9 June 2017 at 16:22, J. Cassidy <sean@jdcassidy.eu> wrote:
Mark,
still scratching head...
Regards,
JC
Thanks. I was hoping there was an easy fix if you did something like used serializable isolation, but you haven't done that.
On Fri, Jun 9, 2017 at 12:27 AM, J. Cassidy <sean@jdcassidy.eu> wrote:
.
.
.
[ 600s] threads: 64, tps: 25158.22, reads: 352195.58, writes: 100633.38, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00

OLTP test statistics:
    queries performed:
        read:            256244548
        write:           73212536
        other:           36606172
        total:           366063256
    transactions:        18302990 (30454.11 per sec.)
    read/write requests: 329457084 (548179.44 per sec.)
    other operations:    36606172 (60908.54 per sec.)
    ignored errors:      192 (0.32 per sec.)
    reconnects:          0 (0.00 per sec.)

General statistics:
    total time:                          601.0023s
    total number of events:              18302990
    total time taken by event execution: 38450.5507s
    response time:
        min:                    0.42ms
        avg:                    2.10ms
        max:                   62.26ms
        approx. 99 percentile:  5.62ms

Threads fairness:
    events (avg/stddev):         285984.2188/3788.79
    execution time (avg/stddev): 600.7899/0.00
=================================================================================================================
### Excerpt from MariaDB Error Log (thousands of similar messages):
.
.
.
2017-06-08 8:10:44 4395980287248 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353081699600 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4360227998992 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4390705949968 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4360227392784 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
.
.
.
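To see which statements are actually colliding, rather than just counting the errors, the deadlock detail can be pulled while the benchmark runs. A sketch (the log path matches the my.cnf below; innodb_print_all_deadlocks is a standard InnoDB/XtraDB option in MariaDB 10.x):

# count deadlock errors logged so far
grep -c "Deadlock found" /var/log/mariadb/mariadb-error-john.log

# show the most recent deadlock: the two statements and the locks they held/waited on
mysql -uroot -e "SHOW ENGINE INNODB STATUS\G" | sed -n '/LATEST DETECTED DEADLOCK/,/^TRANSACTIONS/p'

# log every deadlock to the error log, not just the latest
mysql -uroot -e "SET GLOBAL innodb_print_all_deadlocks=ON;"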
===================================================================================================================
The configuration below has been tried and tested and works across different architectures, e.g. S390X and x86-64.
[mysqld_safe]
user=root
password=''

[mysqld]
symbolic-links=0
open_files_limit = 65535 # was 102400
max_allowed_packet = 16M # was 768M
max-connect-errors = 1000000
connect_timeout = 120 # was 60
performance-schema = false
net_read_timeout = 720
log-warnings=2
##### files and sockets
# innodb_log_group_home_dir= /data-lun22/logs
pid-file=/tmp/mysqld.pid.sysbench
port = 3306
# log-error=/var/log/mariadb/mariadb-error-john.log
log-error=/var/log/mariadb/mariadb-error-john.log
general_log_file=/var/log/mariadb/mariadb-john.log
long_query_time = 0
slow_query_log = 0
# slow_query_log_file = /var/log/mariadb/mariadb-slowquery.log
# general-log
# general_log_file=/var/log/mariadb/mariadb-general.log
##### MariaDB temporary tables
# tmpdir in RAM
tmpdir = /mnt/tmpdir
##### non-innodb options (fixed)
max_connections = 4000
back_log = 150
table_open_cache = 8000 # was 2048 # was 4000 # was 8000
key_buffer_size = 16M
query_cache_type = 0 # was 1 # was 0
join_buffer_size = 32K
sort_buffer_size = 32K
autocommit = 0
thread-cache-size = 50
table-definition-cache = 1024
table-open-cache = 2048
#### fixed innodb options
innodb_file_per_table = true
innodb_open_files = 4096 # was 8192 # was 4096
innodb_data_file_path = ibdata1:250M:autoextend
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT_NO_FSYNC
innodb_log_buffer_size = 256M
innodb_log_file_size = 4G # was 14G # was 4G
innodb_log_files_in_group = 2
innodb_buffer_pool_size = 36G # was 8G # was 48G
innodb_buffer_pool_instances = 8 # 16 # was 6 # was 32
innodb_adaptive_hash_index_partitions = 32
innodb_thread_concurrency = 0
#### tuning for RAM disk
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_io_capacity = 8000 # was 20000
innodb_io_capacity_max = 40000
innodb_lru_scan_depth = 4096
innodb_purge_threads = 2
innodb_read_io_threads = 2
innodb_write_io_threads = 2
Note: NMON reports 35% memory usage in the middle of the above-mentioned benchmark, ergo the DB is in cache. No paging.
++ Note: The R/W TPS rates are "respectable" based on observations from countless other runs on other configurations/architectures. The TPS rates would obviously improve if the deadlock situation outlined above were resolved.
So there you have it. Any tips / pointers that would enlighten me are much appreciated.
Regards,
JC
Privet Sergey,
I do not think I have a problem with my config(s).
I am not on the machine at the moment. From memory, with 16 Cores / 32 VCPU, 64 GB memory, buffer pools, logs etc. tailored to this configuration, the TPS rate is
64 threads  - R/O 73K, R/W 26K
128 threads - R/O 71K, R/W 23K.
I am putting the DB in RAM of course to obviate I/O issues. I am interested in seeing what is happening with the CPU caches and nest.
NMON tells me that for the R/O runs I have approximately 85% User, 12-14% System and the rest Idle - this stays consistent throughout the run (300 seconds R/O, 600 seconds R/W) which is reasonable.
On the R/W runs, NMON's CPU-usage display shows that some CPUs are barely firing, or only firing intermittently, and the granularity I had with R/O is gone. This would seem to tie in with what I see in the MariaDB log - headwinds, violent storms and deadlocks.
Any further information you require, please let me know.
Regards,
JC
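As a cross-check on NMON's per-CPU picture, standard sysstat/procps tooling gives the same view. A sketch (assumes sysstat is installed; neither command comes from the thread itself):

# per-CPU utilization, sampled every 2 seconds during the run
mpstat -P ALL 2

# mysqld threads and the CPU each one last ran on
ps -L -o pid,tid,psr,pcpu,comm -C mysqld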
Hi!
I just tried this on x86: 40 threads against a single table having 10k rows. No deadlocks, "tps: 12372.40, reads: 173219.52, writes: 49490.51".
This is with latest snapshot of 10.1 and sysbench 1.0.
I believe you shouldn't see deadlocks on such a big data set. Probably
something is wrong with your configs?
Regards, Sergey
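For comparison, Sergey's repro in sysbench 1.0 syntax would look roughly like this (a sketch; the connection options are placeholders, not taken from his message):

# prepare one 10k-row table, then run the R/W test with 40 threads
sysbench oltp_read_write --tables=1 --table-size=10000 --mysql-user=root --mysql-socket=/var/lib/mysql/mysql.sock --threads=40 prepare
sysbench oltp_read_write --tables=1 --table-size=10000 --mysql-user=root --mysql-socket=/var/lib/mysql/mysql.sock --threads=40 --report-interval=2 run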
On Thu, Jun 08, 2017 at 04:51:31PM +0200, J. Cassidy wrote:
Hello Mark,
appreciate the reply.
The OLTP table size and OLTP tables count give me a DB size of approx. 12GB. This is what I want. I have, in the meantime, looked at some older logs and see that however many threads I specify (8, 16, 32, 64 or 128), the deadlock messages are still surfacing in the R/W phase of Sysbench. I even dropped in two different MariaDB release levels (10.0.19, 10.1.24) to see whether it would make a difference; the deadlocks are still there. I am using Sysbench 0.5 for these tests. I am currently using MariaDB 10.1.24 - built directly on the machine. S390x by the way, but the problem is also occurring on an x86-64 box.
Regards,
JC
I run sysbench frequently with MySQL (not MariaDB yet) and don't recall ever seeing this error. But I use much larger values for --oltp-table-size. You used 85937 and I use 1M or larger. Maybe there is too much data contention with smaller tables, even if a large number of tables is used (128 in your case). I use a smaller number of tables - between 4 and 20 (see the sketch after the quoted message below).
On Thu, Jun 8, 2017 at 12:48 AM, J. Cassidy <sean@jdcassidy.eu> wrote:
Hello all,
I have used Sysbench extensively to test MariaDB CPU load factors between various architectures.
I am noticing massive amounts of "[ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction" in the MariaDB error log during the R/W phase. The Sysbench R/W TPS rates are still respectable for my purposes, but how do I correct this condition? Can I ignore it, or must I live with it? I have tried three different MariaDB releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this message (x 1000s) is happening across all releases in the Sysbench R/W phase.
Here is the Sysbench command that drives the workload against MariaDB -
sysbench --test=lua/oltp.lua --oltp_tables_count=128 --oltp-table-size=85937 --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 --max-requests=0 --mysql-user=root --percentile=99 run
Would anyone have a tip / idea / pointer?
Regards,
JC
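Along the lines of Mark's suggestion, the same run with fewer, larger tables would look like this (a sketch; only --oltp_tables_count and --oltp-table-size differ from the command quoted above):

sysbench --test=lua/oltp.lua --oltp_tables_count=8 --oltp-table-size=1000000 --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 --max-requests=0 --mysql-user=root --percentile=99 run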
Stephane,
all of my directories, logs, spaces, tables etc. are in RAM. During the course of the benchmark there is ZERO I/O, which is my intention.
Here is a snippet from the MariaDB KB -
autocommit
By default, MariaDB runs with autocommit mode enabled. This means that as soon as you execute a statement that updates (modifies) a table, MariaDB stores the update on disk to make it permanent. To disable autocommit mode, use the following statement:
SET autocommit=0;
So it would seem that I am running with autocommit enabled.
Regards,
JC
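One caveat with that reading: the KB passage describes the server default, while the my.cnf above sets autocommit = 0 under [mysqld], which overrides that default for every new session. The effective value can be checked directly - a sketch (the config paths are the usual locations, not taken from the thread):

# what the server is actually running with
mysql -uroot -e "SELECT @@GLOBAL.autocommit, @@SESSION.autocommit;"

# find where the value is being set
grep -rn "autocommit" /etc/my.cnf /etc/my.cnf.d/ 2>/dev/null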
Hi John,
It seems you have autocommit = 0 in your my.cnf.
Jocelyn
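If the server default is what's intended, removing that line (or setting autocommit = 1, as Stéphane suggested) and re-running is a quick test. A sketch; Innodb_deadlocks is the MariaDB (XtraDB) status counter, so this assumes a MariaDB build:

# deadlock counter before and after a run
mysql -uroot -e "SHOW GLOBAL STATUS LIKE 'Innodb_deadlocks';"

# flip the default live; affects sessions opened after this point
mysql -uroot -e "SET GLOBAL autocommit=1;"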
> No deadlocks, "tps: 12372.40, reads: 173219.52, writes: 49490.51". > > This is with latest snapshot of 10.1 and sysbench 1.0. > > I believe you shouldn't see deadlocks on such a big data set. Probably > > something is wrong with your configs? > > Regards, > Sergey > > On Thu, Jun 08, 2017 at 04:51:31PM +0200, J. Cassidy wrote: > > > > > > Hello Mark, > > > > appreciate the reply. > > > > The OLTP table size and > > OLTP size give me a DB size of appx 12GB. This is what I want. I > have in the > > meantime, looked at some older logs and see that whatever amount of > threads I > > specify (8, 16,32, 64 or 128), the deadlock messages are still > surfacing in the > > R/W phase > > of Sysbench. I even dropped in two different MariaDB release levels > > (10.0.19, 10.1.24) to see whether it would make a difference, > deadlock still > > there. I am using Sysbench 0.5 for these tests. > > I am currently using MariaDB > > 10.1.24 - built directly on the machine. S390x by the way, but the > problem is > > also > > occuring on an X86-64 box. > > > > Regards, > > > > > > JC > > > > > > > > I run sysbench frequently with MySQL > > (not MariaDB yet) and don't recall ever seeing this error. But I > use much > > larger values for --oltp-table-size. You used 85937 and I use 1M or > larger. > > Maybe > > there is too much data contention with smaller tables, even if a > large number of > > tables is used (128 in your case). I use a smaller number of tables > - between 4 > > and 20. On Thu, Jun 8, 2017 at 12:48 AM, J. Cassidy > sean@jdcassidy.eu <mailto:sean@jdcassidy.eu>> wrote: > > Hello all, > > > > > > have used Sysbench extensively to test MariaDB CPU load factors > between various > > architectures. > > > > I am noticing massive amounts of "[ERROR] mysqld: > > Deadlock found when trying to get lock; try restarting transaction" > in the > > MariaDB error log during the R/W phase. The Sysbench R/W TPS rates > are still > > respectable for my purposes, but how do I > > correct this condition?? Can I > > ignore this / must live with it? > > I have tried three different MariaDB > > releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this > message (x > > 1000's) is happening across all releases in the Sysbench R/W phase. > > > > Here is the Sysbench command that drives the workload against MariaDB - > > > > sysbench --test=lua/oltp.lua --oltp_tables_count=128 > --oltp-table-size=85937 > > --rand-seed=42 --rand-type=uniform --num-threads=128 > --oltp-read-only=off > > --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock > --max-time=201 > > --max-requests=0 --mysql-user=root --percentile=99 run > > > > Would anyone > > have a tip / idea /pointer? > > > > > > Regards, > > > > > > JC > > > > > > John Cassidy > > > > Obere Bühlstrasse 21 > > 8700 Küsnacht > > (ZH) > > Switzerland / Suisse / Schweiz > > > > > > Mobile: +49 152 58961601 <../images/blank.png> (Germany) > > Mobile: +352 621 577 149 (Luxembourg) > > Mobile: +41 78 769 17 97 <../images/blank.png> (CH) > > Landline: +41 44 509 1957 > > > > http://www.jdcassidy.eu <http://www.jdcassidy.eu/> > > > > "Aut viam inveniam aut > > faciam" - Hannibal. 
> > _______________________________________________ > > Mailing list: https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > > Post to : > > maria-discuss@lists.launchpad.net > <mailto:maria-discuss@lists.launchpad.net> > > Unsubscribe : https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > > More help : https://help.launchpad.net/ListHelp > > --Mark Callaghan > > mdcallag@gmail.com <mailto:mdcallag@gmail.com> > > > > John Cassidy > > > > Obere Bühlstrasse 21 > > 8700 > > Küsnacht (ZH) > > Switzerland / Suisse / Schweiz > > > > > > Mobile: > > +49 152 58961601 (Germany) > > Mobile: +352 621 577 149 <../images/blank.png> (Luxembourg) > > Mobile: > > +41 78 769 17 97 <../images/blank.png> (CH) > > Landline: +41 44 509 1957 <../images/blank.png> > > > > http://www.jdcassidy.eu <http://www.jdcassidy.eu/> > > > > "Aut viam inveniam aut faciam" - > > Hannibal. > > > _______________________________________________ > > Mailing list: https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > > Post to : maria-discuss@lists.launchpad.net > <mailto:maria-discuss@lists.launchpad.net> > > Unsubscribe : https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > > More help : https://help.launchpad.net/ListHelp > > > > > John Cassidy > > Obere Bühlstrasse 21 > 8700 Küsnacht (ZH) > Switzerland / Suisse / Schweiz > > > Mobile: +49 152 58961601 <../images/blank.png> (Germany) > Mobile: +352 621 577 149 <../images/blank.png> (Luxembourg) > Mobile: +41 78 769 17 97 <../images/blank.png> (CH) > Landline: +41 44 509 1957 <../images/blank.png> > > http://www.jdcassidy.eu <http://www.jdcassidy.eu/> > > "Aut viam inveniam aut faciam" - Hannibal. > > John Cassidy > > Obere Bühlstrasse 21 > 8700 Küsnacht (ZH) > Switzerland / Suisse / Schweiz > > > Mobile: +49 152 58961601 <../images/blank.png> (Germany) > Mobile: +352 621 577 149 <../images/blank.png> (Luxembourg) > Mobile: +41 78 769 17 97 <../images/blank.png> (CH) > Landline: +41 44 509 1957 <../images/blank.png> > > http://www.jdcassidy.eu <http://www.jdcassidy.eu/> > > "Aut viam inveniam aut faciam" - Hannibal. 
> > _______________________________________________ > Mailing list: https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > Post to : maria-discuss@lists.launchpad.net > <mailto:maria-discuss@lists.launchpad.net> > Unsubscribe : https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > More help : https://help.launchpad.net/ListHelp > -- > Mark Callaghan > mdcallag@gmail.com <mailto:mdcallag@gmail.com> > > > John Cassidy > > Obere Bühlstrasse 21 > 8700 Küsnacht (ZH) > Switzerland / Suisse / Schweiz > > > Mobile: +49 152 58961601 (Germany) > Mobile: +352 621 577 149 (Luxembourg) > Mobile: +41 78 769 17 97 (CH) > Landline: +41 44 509 1957 > > http://www.jdcassidy.eu <http://www.jdcassidy.eu/> > > "Aut viam inveniam aut faciam" - > Hannibal._______________________________________________ > Mailing list: https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > Post to : maria-discuss@lists.launchpad.net > <mailto:maria-discuss@lists.launchpad.net> > Unsubscribe : https://launchpad.net/~maria-discuss > <https://launchpad.net/%7Emaria-discuss> > More help : https://help.launchpad.net/ListHelp > > > John Cassidy > > Obere Bühlstrasse 21 > 8700 Küsnacht (ZH) > Switzerland / Suisse / Schweiz > > > Mobile: +49 152 58961601 (Germany) > Mobile: +352 621 577 149 (Luxembourg) > Mobile: +41 78 769 17 97 (CH) > Landline: +41 44 509 1957 > > http://www.jdcassidy.eu > > "Aut viam inveniam aut faciam" - Hannibal. > > > John Cassidy > > Obere Bühlstrasse 21 > 8700 Küsnacht (ZH) > Switzerland / Suisse / Schweiz > > > Mobile: +49 152 58961601 (Germany) > Mobile: +352 621 577 149 (Luxembourg) > Mobile: +41 78 769 17 97 (CH) > Landline: +41 44 509 1957 > > http://www.jdcassidy.eu > > "Aut viam inveniam aut faciam" - Hannibal. > > > _______________________________________________ > Mailing list: https://launchpad.net/~maria-discuss > Post to : maria-discuss@lists.launchpad.net > Unsubscribe : https://launchpad.net/~maria-discuss > More help : https://help.launchpad.net/ListHelp
From memory, with 16 Cores / 32 VCPU, 64 GB memory, buffer
Salut Jocelyn, I have just replied to Stephane. Is there a "best practice" setting guideline for autocommit? In my case - Sysbench R/W, single-instance MariaDB, no clustering, DB in RAM. Regards, JC

Hi John, it seems you have autocommit = 0 in your my.cnf. Jocelyn

On 09/06/2017 at 17:46, J. Cassidy wrote: Stephane, all of my directories, logs, spaces, tables etc. are in RAM. During the course of the benchmark there is ZERO I/O, which is my intention. Here is a snippet from the MariaDB KB on autocommit: "By default, MariaDB runs with autocommit mode enabled. This means that as soon as you execute a statement that updates (modifies) a table, MariaDB stores the update on disk to make it permanent. To disable autocommit mode, use the following statement: SET autocommit=0;" So it would seem that I am running with autocommit enabled. Regards, JC

Re, yep, I get this, but while the I/O is supposed to be fast, writes still have to hit the FS, moving from the buffer pool to the redo log to the tablespace. In a perfectly fast world all layers should have equal resources, so there is no reason not to give as many threads to write I/O as to the work of modifying pages itself. Disabling autocommit with sysbench is suspicious and interesting: I never questioned myself on this topic before. /svar Stéphane Varoqui, Senior Consultant, Phone: +33 695-926-401, skype: svaroqui, http://www.mariadb.com

On 9 June 2017 at 16:47, J. Cassidy <sean@jdcassidy.eu> wrote: Salut Stephane, I/O does not enter into the equation; the database is in memory (RAM). Regards, JC

Hello, which I/O scheduler are you using for the FS? deadline or noop is a must. Also, mounting with noatime can help for writes. Would you try these settings, which I have found to make a difference:

autocommit = 1
innodb_adaptive_hash_index = 0
innodb_max_dirty_pages_pct = 20
innodb_write_io_threads = 64
innodb_log_file_size = 1024M
innodb_log_files_in_group = 4
innodb_thread_concurrency = 0
innodb_purge_threads = 8
innodb_change_buffering = none
innodb_open_files = 16384
innodb_file_per_table = 1
innodb_autoinc_lock_mode = 2

And comment out:

# innodb_lru_scan_depth = 4096

Stéphane Varoqui, Senior Consultant, Phone: +33 695-926-401, skype: svaroqui, http://www.mariadb.com

On 9 June 2017 at 16:22, J. Cassidy <sean@jdcassidy.eu> wrote: Mark, still scratching head... Regards, JC

Thanks. I was hoping there was an easy fix if you did something like used serializable isolation, but you haven't done that.
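An aside on the autocommit question above: rather than inferring the mode from the KB text, it can be confirmed at runtime. A minimal check, using only standard server variables (nothing specific to this thread):

    SELECT @@global.autocommit, @@session.autocommit;
    -- 0 means autocommit is DISABLED: each session keeps a transaction
    -- open, holding its locks, until an explicit COMMIT.
    -- To restore the default for new sessions without editing my.cnf:
    SET GLOBAL autocommit = 1;

With autocommit = 0 under [mysqld], as in the configuration quoted earlier, the server starts every client session with autocommit disabled.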
On Fri, Jun 9, 2017 at 12:27 AM, J. Cassidy wrote: [snip - full setup, benchmark output, error-log excerpt and configuration, quoted in full at the top of this thread]

Privet Sergey, I do not think I have a problem with my config(s). I am not on the machine at the moment. From memory, with 16 Cores / 32 VCPU, 64 GB memory, buffer pools, logs etc. tailored to this configuration, the TPS rate is: 64 threads - R/O 73K, R/W 26K; 128 threads - R/O 71K, R/W 23K. I am putting the DB in RAM of course to obviate I/O issues. I am interested in seeing what is happening with the CPU caches and nest. NMON tells me that for the R/O runs I have approximately 85% User, 12-14% System and the rest Idle - this stays consistent throughout the run (300 seconds R/O, 600 seconds R/W), which is reasonable. On the R/W runs, NMON's CPU usage display shows that some CPUs are barely firing, or only intermittently, and the granularity I had with R/O is gone. This would seem to tie in with what I see in the MariaDB log - headwinds, violent storms and deadlocks. Any further information you require, please let me know. Regards, JC

Hi! I just tried this on x86: 40 threads against a single table having 10k rows. No deadlocks, "tps: 12372.40, reads: 173219.52, writes: 49490.51". This is with the latest snapshot of 10.1 and sysbench 1.0. I believe you shouldn't see deadlocks on such a big data set. Probably something is wrong with your configs? Regards, Sergey

On Thu, Jun 08, 2017 at 04:51:31PM +0200, J. Cassidy wrote:
Hello Mark, appreciate the reply. The OLTP table size and OLTP size give me a DB size of approx. 12GB, which is what I want. I have, in the meantime, looked at some older logs and see that whatever number of threads I specify (8, 16, 32, 64 or 128), the deadlock messages still surface in the R/W phase of Sysbench. I even dropped in two different MariaDB release levels (10.0.19, 10.1.24) to see whether it would make a difference; the deadlocks were still there. I am using Sysbench 0.5 for these tests. I am currently using MariaDB 10.1.24 - built directly on the machine. S390x by the way, but the problem is also occurring on an X86-64 box.

Regards,

JC
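An aside for anyone chasing the same symptom: the thread never shows the actual deadlock participants. InnoDB keeps the most recent deadlock in its status output, and MariaDB can record every deadlock in the error log. A minimal diagnostic pass might look like this (both are standard InnoDB facilities, not something posted in the thread):

    -- Print the LATEST DETECTED DEADLOCK section, showing both
    -- transactions and the locks each was waiting for:
    SHOW ENGINE INNODB STATUS\G

    -- Log every deadlock, not just the latest, to the error log;
    -- the variable is dynamic, so no restart is needed:
    SET GLOBAL innodb_print_all_deadlocks = ON;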
I run sysbench frequently with MySQL (not MariaDB yet) and don't recall ever seeing this error. But I use much larger values for --oltp-table-size. You used 85937 and I use 1M or larger. Maybe there is too much data contention with smaller tables, even if a large number of tables is used (128 in your case). I use a smaller number of tables - between 4 and 20.

On Thu, Jun 8, 2017 at 12:48 AM, J. Cassidy <sean@jdcassidy.eu> wrote: Hello all, I have used Sysbench extensively to test MariaDB CPU load factors between various architectures. I am noticing massive amounts of "[ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction" in the MariaDB error log during the R/W phase. The Sysbench R/W TPS rates are still respectable for my purposes, but how do I correct this condition? Can I ignore it / must I live with it? I have tried three different MariaDB releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this message (x 1000s) is happening across all releases in the Sysbench R/W phase.

Here is the Sysbench command that drives the workload against MariaDB:

sysbench --test=lua/oltp.lua --oltp_tables_count=128 --oltp-table-size=85937 --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 --max-requests=0 --mysql-user=root --percentile=99 run

Would anyone have a tip / idea / pointer?

Regards,

JC
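For concreteness, the shape of run Mark suggests above - fewer, larger tables - would look roughly like this with the same sysbench 0.5 oltp.lua; the 1M rows and 8 tables are illustrative picks from the ranges he quotes, not values he posted:

    sysbench --test=lua/oltp.lua --oltp_tables_count=8 --oltp-table-size=1000000 \
      --mysql-user=root --mysql-socket=/var/lib/mysql/mysql.sock prepare

    sysbench --test=lua/oltp.lua --oltp_tables_count=8 --oltp-table-size=1000000 \
      --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off \
      --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 \
      --max-requests=0 --mysql-user=root --percentile=99 run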
I switched from old sysbench to modern sysbench (1.0) last year. I use sysbench a lot and have never set autocommit=0 in my.cnf. I also have never had this problem. I also know that correlation != causation. I have had problems in the past while modifying the insert bench when autocommit was set wrong. I think you mentioned something about running sysbench on s390. I would love to read more about that. -- Mark Callaghan mdcallag@gmail.com
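For reference, under the sysbench 1.0 that Mark mentions, the equivalent of JC's run would be along these lines; the bundled oltp_read_write script replaces lua/oltp.lua, several options were renamed between 0.5 and 1.0, and the values below simply mirror JC's command:

    sysbench oltp_read_write --tables=128 --table-size=85937 \
      --rand-seed=42 --rand-type=uniform --threads=128 \
      --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock \
      --time=201 --events=0 --mysql-user=root --percentile=99 run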
OK, but from the conf you pasted you get autocommit = 0. So if sysbench does not commit, you may get into deadlocks quite fast. /svar Stéphane Varoqui, Senior Consultant, Phone: +33 695-926-401, skype: svaroqui, http://www.mariadb.com
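To make Stéphane's point concrete: with autocommit disabled, each session holds its row locks until an explicit COMMIT, so two sessions that touch the same rows in opposite order produce exactly the error flooding the log above. A minimal two-session sketch against a sysbench table (the row ids are chosen arbitrarily for illustration):

    -- session 1
    SET autocommit = 0;
    UPDATE sbtest1 SET k = k + 1 WHERE id = 1;  -- locks row 1

    -- session 2
    SET autocommit = 0;
    UPDATE sbtest1 SET k = k + 1 WHERE id = 2;  -- locks row 2

    -- session 1
    UPDATE sbtest1 SET k = k + 1 WHERE id = 2;  -- blocks, waiting on session 2

    -- session 2
    UPDATE sbtest1 SET k = k + 1 WHERE id = 1;
    -- InnoDB detects the cycle and rolls one session back with:
    -- ERROR 1213 (40001): Deadlock found when trying to get lock;
    -- try restarting transaction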
On 9 June 2017 at 09:27, J. Cassidy <sean@jdcassidy.eu> wrote: [snip - the full setup, benchmark output, error-log excerpt, configuration, earlier replies and signatures, quoted in full at the top of this thread]
Stephane, you are, of course, correct. I must test this when I get the chance - most likely tomorrow morning on an X86 box. I shall report back. Regards, JC

[snip - Stéphane's reply and the original benchmark report, quoted in full above]
participants (4)
- J. Cassidy
- Jocelyn Fournier
- Mark Callaghan
- Stephane Varoqui