Hi,

There are a number of things it could be. I've provided data collection instructions below in case these suggestions don't help, but it is most likely a combination of the following three issues.

A) You might be dirtying your buffer pool at a rate at which flushing cannot keep up. The most likely reason is that innodb_io_capacity is too low (and perhaps innodb_lru_scan_depth). You have innodb_io_capacity=4000. Since you have sysbench, try a 16KB random IO write test and see how many IOPS you get; you may have to adjust this value up or down accordingly (there's a fileio sketch in the P.S. below). You should set it to about 60-80% of your actual write IO capacity, because you need IOPS for reads, and this variable only takes writes into account. You could also do a random read/write test (16KB), set innodb_io_capacity to 1/3 of the IOPS you get, and set innodb_io_capacity_max to 1/2 of it.

B) You definitely have too many buffer pool instances. You should use one buffer pool per 8GB, so six or seven instances instead of 32. This should seriously reduce the flush rate and yield better overall results.

C) For smooth transaction rates with many threads writing to disk, 4GB log files are almost certainly too small. Try 16GB of logs.

Try C) first to see if it smooths out your ride. If it just moves the dip to later, try the A) and B) changes, which affect flushing. If it fixes it, try those changes anyway, as you'll then have a baseline to compare them to; but definitely reduce the number of buffer pools and compare.

On a side note: are you using one disk or multiple disks? Is this a SAN? If it is a SAN, then using direct IO is probably a bad idea, though it shouldn't cause periodic drops.

Also, no warranty on advice; if it causes your database to nuclear detonate, sorry. As always, YMMV: tuning is an art, not a science.

To collect diag info if the above suggestions fail:
-----------------
Please run sysbench for six minutes. At 30 second intervals execute:

    SHOW ENGINE INNODB STATUS\G
    SHOW GLOBAL STATUS;

In addition (run these only once, and terminate them after sysbench finishes):

    iostat -kx 15 > disk.txt &
    vmstat 15 > mem.txt &

There's a scripted version of this in the P.S. as well. Or get pt-stalk from the Percona Toolkit, which will collect this info (and a bunch more) automatically.

--Swany
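P.S. For the calibration in A), something along these lines should work (sysbench 0.5 fileio syntax; the file size, file count, and run length here are placeholder choices, so size them to roughly match your data set):

    sysbench --test=fileio --file-total-size=8G --file-num=16 prepare
    sysbench --test=fileio --file-total-size=8G --file-num=16 \
        --file-test-mode=rndwr --file-block-size=16384 \
        --file-extra-flags=direct --max-time=60 --max-requests=0 run
    sysbench --test=fileio --file-total-size=8G --file-num=16 cleanup

The Requests/sec figure from the rndwr run is the write IOPS number to feed the 60-80% rule; swap rndwr for rndrw if you want the mixed read/write variant.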
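For B) and C), the my.cnf deltas would look something like this. Treat them as starting points rather than tested values, and fill in the io_capacity numbers once you've measured:

    innodb_buffer_pool_instances = 6   # down from 32, per B)
    innodb_log_file_size = 8G          # x 2 files in the group = 16GB of redo, per C)
    innodb_log_files_in_group = 2
    innodb_io_capacity = 4000          # replace with ~60-80% of measured write IOPS
    innodb_io_capacity_max = 6000      # or ~1/2 of the mixed r/w IOPS if you use that method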
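And if you'd rather script the 30-second collection, a minimal sketch (the output file names are just examples; add -u/-p to the mysql calls as your login requires):

    ( for i in $(seq 1 12); do    # 12 x 30s covers the six-minute run
        mysql -e "SHOW ENGINE INNODB STATUS\G SHOW GLOBAL STATUS;" >> status.txt
        sleep 30
      done ) &
    iostat -kx 15 > disk.txt &
    vmstat 15 > mem.txt &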
On Oct 7, 2015, at 8:14 AM, J. Cassidy <sean@jdcassidy.eu> wrote:
Hello all,
I am running MariaDB 10.1.5 with Sysbench 0.5 (lua) against an IBM Blade (Haswell) using Hitachi 15K SAS HDDs.
Operating system is RHEL 7. Hyperthreading is turned off. System memory is 48GB.
Up to now, I have run extensive Sysbench tests against MariaDB using RAM as the datastore with no appreciable problems.
Now I have switched the datastore to normal disk (see above) and am seeing the Sysbench R/W TPS rate take a hit approximately 3 minutes (180 seconds) into the run (runtime is 600 seconds), and every three minutes thereafter. I am seeing nothing in the MariaDB logs (error and slow query), nor in Linux itself, nor is Sysbench telling me anything (I have debug turned on and verbosity set to 5).
Here is an excerpt from one of the R/W Sysbench runs:
[ 168s] threads: 128, tps: 26262.00, reads: 367679.02, writes: 105188.51, response time: 50.12ms (99%), errors: 0.00, reconnects: 0.00
[ 170s] threads: 128, tps: 25913.49, reads: 362826.86, writes: 103670.96, response time: 49.08ms (99%), errors: 0.00, reconnects: 0.00
[ 172s] threads: 128, tps: 26271.51, reads: 367736.58, writes: 105009.52, response time: 50.34ms (99%), errors: 0.00, reconnects: 0.00
[ 174s] threads: 128, tps: 26059.52, reads: 364822.73, writes: 104187.07, response time: 44.69ms (99%), errors: 0.00, reconnects: 0.00
[ 176s] threads: 128, tps: 26394.00, reads: 369573.99, writes: 105652.50, response time: 52.28ms (99%), errors: 0.00, reconnects: 0.00
[ 178s] threads: 128, tps: 26248.49, reads: 367422.90, writes: 104927.97, response time: 49.85ms (99%), errors: 0.00, reconnects: 0.00
[ 180s] threads: 128, tps: 26549.00, reads: 371726.52, writes: 106166.01, response time: 60.01ms (99%), errors: 0.00, reconnects: 0.00
[ 182s] threads: 128, tps: 21493.92, reads: 301020.88, writes: 86033.18, response time: 47.29ms (99%), errors: 0.00, reconnects: 0.00
[ 184s] threads: 128, tps: 174.00, reads: 2435.96, writes: 706.99, response time: 2245.86ms (99%), errors: 0.00, reconnects: 0.00
[ 186s] threads: 128, tps: 185.50, reads: 2597.00, writes: 760.50, response time: 2334.28ms (99%), errors: 0.00, reconnects: 0.00
[ 188s] threads: 128, tps: 269.00, reads: 3766.00, writes: 1066.50, response time: 2314.11ms (99%), errors: 0.00, reconnects: 0.00
[ 190s] threads: 128, tps: 153.50, reads: 2149.00, writes: 619.50, response time: 2121.06ms (99%), errors: 0.00, reconnects: 0.00
[ 192s] threads: 128, tps: 552.00, reads: 7728.00, writes: 2199.50, response time: 1440.33ms (99%), errors: 0.00, reconnects: 0.00
[ 194s] threads: 128, tps: 615.00, reads: 8610.00, writes: 2450.00, response time: 1414.69ms (99%), errors: 0.00, reconnects: 0.00
[ 196s] threads: 128, tps: 908.00, reads: 12712.00, writes: 3650.00, response time: 950.93ms (99%), errors: 0.00, reconnects: 0.00
[ 198s] threads: 128, tps: 1579.00, reads: 22052.47, writes: 6283.99, response time: 932.61ms (99%), errors: 0.00, reconnects: 0.00
[ 200s] threads: 128, tps: 1562.00, reads: 21921.53, writes: 6257.01, response time: 748.43ms (99%), errors: 0.00, reconnects: 0.00
[ 202s] threads: 128, tps: 4166.00, reads: 58323.95, writes: 16652.49, response time: 445.78ms (99%), errors: 0.00, reconnects: 0.00
[ 204s] threads: 128, tps: 11574.99, reads: 161931.90, writes: 46264.97, response time: 263.77ms (99%), errors: 0.00, reconnects: 0.00
[ 206s] threads: 128, tps: 17420.50, reads: 244004.98, writes: 69714.50, response time: 103.69ms (99%), errors: 0.00, reconnects: 0.00
[ 208s] threads: 128, tps: 19833.49, reads: 277626.90, writes: 79424.97, response time: 93.88ms (99%), errors: 0.00, reconnects: 0.00
[ 210s] threads: 128, tps: 24994.00, reads: 349910.48, writes: 99975.49, response time: 48.12ms (99%), errors: 0.00, reconnects: 0.00
[ 212s] threads: 128, tps: 25129.50, reads: 351825.03, writes: 100549.51, response time: 52.12ms (99%), errors: 0.00, reconnects: 0.00
[ 214s] threads: 128, tps: 24982.99, reads: 349739.36, writes: 99821.46, response time: 47.66ms (99%), errors: 0.00, reconnects: 0.00
My.cnf:
[mysqld_safe]
user=root
password=''

[mysqld]
open_files_limit = 102400
max_allowed_packet = 768M
connect_timeout = 120 # was 60
performance-schema = false
net_read_timeout = 720
log-warnings=2

#####files and sockets
pid-file=/tmp/mysqld.pid.sysbench
log_warnings=2
log-error=/var/log/mariadb/mariadb-error-john.log
long_query_time = 1
slow_query_log = 1
slow_query_log_file = /var/log/mariadb/mariadb-slowquery.log
log_queries_not_using_indexes = 1
# general-log
# general-log-file=/var/log/mariadb/mariadb-general.log
# general_log_file=/var/log/mariadb/mariadb-general.log

##### MariaDB temporary tables
tmpdir = /mnt/ramdisk

#####non innodb options (fixed)
max_connections = 8000 # was 4000
back_log = 150
table_open_cache = 4000 # was 8000
key_buffer_size = 16M
query_cache_type = 1 # was 0
join_buffer_size = 32K
sort_buffer_size = 32K

#####use InnoDB plugin
#ignore-builtin-innodb
#plugin-load=innodb=ha_innodb.so

#####fixed innodb options
innodb_file_per_table = true
innodb_open_files = 8192 # was 4096
innodb_data_file_path = ibdata1:250M:autoextend
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT_NO_FSYNC
innodb_log_buffer_size = 256M
innodb_log_file_size = 4G
innodb_log_files_in_group = 2
innodb_buffer_pool_size = 40G
innodb_buffer_pool_instances = 32
innodb_adaptive_hash_index_partitions = 32
innodb_thread_concurrency = 0

#####Power8 specific
#innodb_spin_wait_delay = 0

#####tuning for SAN storage
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 1
innodb_io_capacity = 4000
innodb_io_capacity_max = 6000
innodb_lru_scan_depth = 4096
innodb_purge_threads = 2
innodb_read_io_threads = 8 # was 8
innodb_write_io_threads = 16

####tuning for RAM disk
#innodb_adaptive_flushing = 1
#innodb_flush_neighbors = 0
#innodb_io_capacity = 20000
#innodb_io_capacity_max = 40000
#innodb_lru_scan_depth = 4096
#innodb_purge_threads = 2
#innodb_read_io_threads = 2
#innodb_write_io_threads = 2
Would anyone be able to enlighten me as to why I am getting this TPS performance hit after approximately three minutes?
Any pointers much appreciated.
Regards,
Aubrey