Re: [Maria-developers] parallel replication performance
Axel Schwenke <axel@askmonty.org> writes:
The poor performance of MySQL with 1 database is expected. However MySQL is much better with 32 databases. Even better than MariaDB :(
I haven't had time to go through all the numbers and details yet, but just a quick check: In this test, are you sure that you are running MariaDB in GTID mode? CHANGE MASTER TO master_use_gtid=slave_pos If not, you won't get any parallelism out of using 32 databases with different domain_ids. So two interesting, but different comparisons to make here: 1. MySQL multi-threaded slave compared to MariaDB GTID with different domain_id per database. If MySQL is significantly better here it could indicate some problem in MariaDB parallel replication that should be investigated; this is possible as this part has not been tested a lot. 2. MySQL multi-threaded slave with load split into multiple databases, compared to MariaDB without configuring different domains. Or just MariaDB compared without and with different domain ids. This is an "unfair" benchmark; by explicitly declaring that different databases/domains can run in parallel, more opportunity for parallellism is available. But it could still be interesting to see how much of that can be obtained from group commit parallelism, without changing applications. Both of these are interesting, but it was not clear to me which of them you were describing? - Kristian.
Hi Kristian, Am 25.03.2014 15:55, schrieb Kristian Nielsen:
Axel Schwenke <axel@askmonty.org> writes:
The poor performance of MySQL with 1 database is expected. However MySQL is much better with 32 databases. Even better than MariaDB :(
I haven't had time to go through all the numbers and details yet, but just a quick check:
In this test, are you sure that you are running MariaDB in GTID mode?
CHANGE MASTER TO master_use_gtid=slave_pos
Yes. This is how the shell code looks: mysql -S ${SLAVE_SOCKET} -u root -e "SET GLOBAL gtid_slave_pos=''" mysql -S ${SLAVE_SOCKET} -u root -e "change master to master_host='127.0.0.1', master_user='rep', master_use_gtid=current_pos" mysql -S ${SLAVE_SOCKET} -u root -e "start slave io_thread" ok?
So two interesting, but different comparisons to make here:
1. MySQL multi-threaded slave compared to MariaDB GTID with different domain_id per database. If MySQL is significantly better here it could indicate some problem in MariaDB parallel replication that should be investigated; this is possible as this part has not been tested a lot.
I see
2. MySQL multi-threaded slave with load split into multiple databases, compared to MariaDB without configuring different domains. Or just MariaDB compared without and with different domain ids. This is an "unfair" benchmark; by explicitly declaring that different databases/domains can run in parallel, more opportunity for parallellism is available. But it could still be interesting to see how much of that can be obtained from group commit parallelism, without changing applications.
Both of these are interesting, but it was not clear to me which of them you were describing?
Actually I haven't tested "MariaDB + multi databases + single GTID domain". But I don't expect this the behave significantly different from "MariaDB + single database + single GTID domain". Or will it? So far the contestants are: MySQL + single db MySQL + 32 db MariaDB + single db MariaDB + 32 db + 32 GTID domains where the last one should compare best to "MySQL + 32 db" Two related questions from me: 1. is there an upper limit for the number of trx per commit group? Any variables influencing that? commit_wait_count? innodb_log_buffer_size? 2. with multiple GTID domains - do START SLAVE ... UNTIL master_gtid_pos=... and MASTER_GTID_WAIT(...) work as expected? What I expect is this: trx in the binlog/relaylog have a strict order. If a position is given per GTID, the terms /before/ and /after/ that GTID should be based on the position in the log. Does that hold with multiple slave threads? BR, XL -- Axel Schwenke - Senior Performance Engineer MariaDB Services AB
Axel Schwenke <axel@askmonty.org> writes:
In this test, are you sure that you are running MariaDB in GTID mode?
CHANGE MASTER TO master_use_gtid=slave_pos
Yes. This is how the shell code looks:
mysql -S ${SLAVE_SOCKET} -u root -e "SET GLOBAL gtid_slave_pos=''" mysql -S ${SLAVE_SOCKET} -u root -e "change master to master_host='127.0.0.1', master_user='rep', master_use_gtid=current_pos" mysql -S ${SLAVE_SOCKET} -u root -e "start slave io_thread"
ok?
Yes, that looks good, thanks for the clarification.
1. MySQL multi-threaded slave compared to MariaDB GTID with different domain_id per database. If MySQL is significantly better here it could indicate some problem in MariaDB parallel replication that should be investigated; this is possible as this part has not been tested a lot.
I see
Right, so this is something I should investigate. A couple of things that can be adjusted are --slave-domain-parallel-threads and --slave-parallel-max-queued. But I wouldn't expect them to have much effect on a sysbench load. If you want, you could try increase --slave-parallel-max-queued and set --slave-domain-parallel-threads to 1/32 of --slave-parallel-threads, to better match what (I think) MySQL multi-threaded slave is doing. But its possible there is some performance limitation or bug that I need to investigate.
Actually I haven't tested "MariaDB + multi databases + single GTID domain". But I don't expect this the behave significantly different from "MariaDB + single database + single GTID domain". Or will it?
No, I would expect it to be much the same.
Two related questions from me:
1. is there an upper limit for the number of trx per commit group? Any variables influencing that? commit_wait_count? innodb_log_buffer_size?
There is no upper limit, except that it cannot be larger than the maximum number of client threads running/
2. with multiple GTID domains - do START SLAVE ... UNTIL master_gtid_pos=... and MASTER_GTID_WAIT(...) work as expected?
Yes. Assuming your expectations are correct, of course :-) START SLAVE UNTIL "0-1-100,1-3-300" will replicate in domain 0 until 0-1-100, and in domain 1 until 1-3-300, and then stop. Note that if there is anything in domain 2, it will stop immediately (effectively at 2-X-0), so nothing will be executed in domain 2. Each domain will stop at the right point independently of the others. MASTER_GTID_WAIT("0-1-100,1-3-300") will wait for domain 0 to reach 0-1-100, and then also wait for domain 1 to reach 1-3-300, and then return. It will not be affected by anything in domain 2. So the only issue I can think of is that you need to specify all domains in your START UNTIL and MASTER_GTID_WAIT() commands. You can just use the value of @@gtid_binlog_pos on the master, I suppose.
What I expect is this: trx in the binlog/relaylog have a strict order. If a position is given per GTID, the terms /before/ and /after/ that GTID should be based on the position in the log. Does that hold with multiple slave threads?
Hm, I do not think I understand what you are asking here. Yes, the transactions are strictly ordered in the binlog. But only within each domain. Transactions/GTIDs in different domains are in principle not ordered with respect to each other, though of course in any given binlog they will be seen in _some_ order. But certainly, if using GTID with parallel replication, it is possible for two GTIDs with different domains, say 0-1-100 and 1-1-300, to appear in this order in the master's binlog, but in the opposite order in the slave's binlog. In contrast, two GTIDs with the same domain id will always occur in the same order in the master and slave binlogs. Hope this helps, - Kristian.
Kristian Nielsen wrote:
A couple of things that can be adjusted are --slave-domain-parallel-threads and --slave-parallel-max-queued. But I wouldn't expect them to have much effect on a sysbench load. If you want, you could try increase --slave-parallel-max-queued and set --slave-domain-parallel-threads to 1/32 of --slave-parallel-threads, to better match what (I think) MySQL multi-threaded slave is doing.
I see. I haven't touched those at all yet. But since sysbench distributes the load quite evenly on the databases I don't expect any change from them. I use the following Lua code "set session gtid_domain_id=" .. (thread_id % oltp_databases) + 1 to set the GTID domain per sysbench worker thread. Sysbench sets the thread_id internally, counting from 0 to the specified concurrency. So with i.e. only 4 workers, updates will go to 4 databases and use 4 different GTID domains.
1. is there an upper limit for the number of trx per commit group? Any variables influencing that? commit_wait_count? innodb_log_buffer_size?
There is no upper limit, except that it cannot be larger than the maximum number of client threads running/
I see.
2. with multiple GTID domains - do START SLAVE ... UNTIL master_gtid_pos=... and MASTER_GTID_WAIT(...) work as expected?
Yes. Assuming your expectations are correct, of course :-)
START SLAVE UNTIL "0-1-100,1-3-300" will replicate in domain 0 until 0-1-100, and in domain 1 until 1-3-300, and then stop. Note that if there is anything in domain 2, it will stop immediately (effectively at 2-X-0), so nothing will be executed in domain 2. Each domain will stop at the right point independently of the others.
Ok. My fault. I didn't realize that a complete position needs GTIDs for each domain used. I was somehow expecting that a GTID from a single domain would act just like a traditional binlog name + position.
So the only issue I can think of is that you need to specify all domains in your START UNTIL and MASTER_GTID_WAIT() commands. You can just use the value of @@gtid_binlog_pos on the master, I suppose.
That's exactly what I am doing and it looks perfect. XL
participants (2)
-
Axel Schwenke
-
Kristian Nielsen