[Maria-developers] Parallel replication benchmarks
I've done a set of benchmarks for parallel replication on the same machine I used previously for my group commit benchmarks:

  http://kristiannielsen.livejournal.com/16382.html

The code tested is the newest code in the bzr repository and what will be in 10.0.9 (this is significantly improved from what is in 10.0.8). I plan to write up a blog post about it in a couple of days with nice graphs, but meanwhile Axel asked me to summarise in this mail.

I tested with sysbench 0.5, using oltp.lua (medium-sized transactions) and update_index.lua (minimal transactions with just a single primary-key update per transaction). I used 10M rows, 16GB buffer pool and 2 * 1.9 GB redo logs. This is with a single table.

I tested simply by preparing the binlog on the master, then setting up an already prepared slave and doing START SLAVE UNTIL the end of the log. The error log then shows the time spent for the slave to catch up.

I tested everything in GTID mode, as that is the recommended mode for parallel replication (though my guess is that old-style replication will be much the same, as there isn't much difference in the code that actually executes events).

Note that these tests are for in-order parallel replication. All commits on all slaves happen in the same order as on the master; the use of parallel replication is invisible to applications. This is in contrast to e.g. MySQL 5.6 multi-threaded slave, which requires the application to partition its data into independent schemas.

Here are the preliminary results, in number of seconds for the slave to catch up (lower is better) versus number of threads (--slave-parallel-threads, 0 means not using parallel replication):

For oltp.lua.
48 threads were used to generate the load on the master, with --binlog-commit-wait-count=12 --binlog-commit-wait-usec=10000 to allow the server to delay a commit by up to 10 milliseconds in order to get more group commit and thus more opportunities for parallel apply on the slave:

  A: --log-slave-updates --sync-binlog=1 --innodb-flush-log-at-trx-commit=1
  B: --skip-log-slave-updates --innodb-flush-log-at-trx-commit=1
  C: --skip-log-bin --innodb-flush-log-at-trx-commit=2
  D: --log-bin=master-bin --sync-binlog=0 --innodb-flush-log-at-trx-commit=0

  #thr     A     B     C     D
     0  1065   869   193   202
     2   361   432   147   161
     4   221   264   118   121
     8   135   177   103   107
    12   114   153   104   105
    16   109   140   104   107
    24   111   139   107   105
    32   111   136    99   109
    48   111   126   108   109
    64   111   121    99   111

We see here a 2-10 times speedup from parallel replication. The master has around 12 transactions in every group commit, which provides good opportunities for parallelism on the slave.

Note that parallel replication is especially effective when the binlog is enabled and crash-safe (--sync-binlog=1 --innodb-flush-log-at-trx-commit=1). This is because parallel replication can run the commit of one transaction in parallel with any other transaction, even if the two transactions would otherwise conflict. This makes group commit especially effective. In fact, this manages to more or less completely eliminate any penalty for enabling crash-safe binlog on the slave, which is quite nice.

Note also that disabling the binlog actually tends to make things _slower_, not faster, when using parallel replication. I believe this is due to MDEV-5802, which may be worth fixing for 10.0.

Here are results for update_simple.lua with 48 threads on the master.
This produced around 13 transactions per group commit on the master, with no --binlog-commit-wait-count to delay commits:

  A: --log-slave-updates --sync-binlog=1 --innodb-flush-log-at-trx-commit=1
  B: --skip-log-slave-updates --innodb-flush-log-at-trx-commit=1
  C: --skip-log-bin --innodb-flush-log-at-trx-commit=2

  #thr     A     B     C
     0   931   899   271
     2   546   653   258
     4   365   494   176
     8   261   365   203
    12   247   350   197
    16   233   336   207
    24   242   316   209
    32   237   292   194
    48   235   270   208
    64   228   249   195

Again we get a good speedup from parallel replication, even though such small transactions leave less opportunity for improvement: the actual work per transaction is rather small compared to the overhead of managing the replication of each event. And again, the ability to utilise group commit effectively provides the biggest benefit.

Finally, I tried a test of update_index.lua where I ran the load on the master single-threaded. This creates a binlog with _no_ opportunities for parallelism from group commits - each transaction needs to be executed on its own by the slave, as we do not know for sure that they will not conflict on row locks. However, because the commits can still be run in parallel (and hence get group commit on the slave), we still see some speedup even here when --sync-binlog=1 and --innodb-flush-log-at-trx-commit=1. When binlog and InnoDB sync are disabled, parallel replication makes things slower due to the overhead of thread communication:

  A: --log-slave-updates --sync-binlog=1 --innodb-flush-log-at-trx-commit=1
  B: --skip-log-slave-updates --innodb-flush-log-at-trx-commit=1
  C: --skip-log-bin --innodb-flush-log-at-trx-commit=2

  #thr     A     B     C
     0  1075   949   270
     2   597   673   319
     4   443   623   334
     8   407   588   349
    12   393   544   336
    16   391   536   352
    24     -   492   336
    32   389   472   358
    48   389   419   344
    64   391   399   354

So overall, results look very good, especially for slaves with binlog enabled. (And binlog disabled could turn out better if MDEV-5802 is fixed).
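For anyone who wants to check the "2-10 times" figure against the oltp.lua numbers, here is a minimal Python sketch that recomputes the best speedup per configuration. The catch-up times are copied from the oltp.lua table; the names OLTP_SECONDS, CONFIGS and best_speedup are made up for this illustration and are not part of any tool used in the benchmark.

```python
# Catch-up times in seconds, copied from the oltp.lua table.
# Keys are --slave-parallel-threads values (0 = no parallel replication);
# each tuple holds the times for configurations A, B, C, D in that order.
OLTP_SECONDS = {
    0: (1065, 869, 193, 202),
    2: (361, 432, 147, 161),
    4: (221, 264, 118, 121),
    8: (135, 177, 103, 107),
    12: (114, 153, 104, 105),
    16: (109, 140, 104, 107),
    24: (111, 139, 107, 105),
    32: (111, 136, 99, 109),
    48: (111, 126, 108, 109),
    64: (111, 121, 99, 111),
}
CONFIGS = ("A", "B", "C", "D")  # hypothetical labels matching the table columns


def best_speedup(table, column):
    """Return (threads, factor) for the fastest parallel run in one column.

    The factor is the serial (0 threads) time divided by the best parallel
    time. Ties are resolved in favour of the lowest thread count.
    """
    baseline = table[0][column]
    parallel_runs = [(thr, row[column]) for thr, row in table.items() if thr != 0]
    best_threads, best_time = min(parallel_runs, key=lambda run: run[1])
    return best_threads, baseline / best_time


if __name__ == "__main__":
    for index, name in enumerate(CONFIGS):
        threads, factor = best_speedup(OLTP_SECONDS, index)
        print(f"{name}: best at {threads} threads, {factor:.1f}x faster than serial")
```

The same function can be applied to the other two tables by building a similar dictionary, leaving out the row with the missing measurement.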
Let me know if there are any questions, and I'll be happy to answer them.

 - Kristian.