[Maria-discuss] MariaDB master-slave chained replication and parallelism
Hi, We have a MariaDB 10.3 replication setup with one master and a few chained slaves (each has log_slave_updates switched on). Master uses mainly MyISAM tables, slaves have about 10 or 40 threads for parallel replication. Interesting is, that the first slave in the chain counts replicated statements into Non-Transactional Groups and the following ones count them into Transactional Groups. Also, when checking process lists it seems that just one statement is being processed at the time (of the many threads) on the first slave, while there are multiple slave replication statements being executed on the 2nd and following slaves. Please, does anyone know the reason why the replicated statements are counted into different groups? Or, more importantly, how to increase the parallelism on the first slave in the chain? Thank you for any help. Jan
Jan Křístek
We have a MariaDB 10.3 replication setup with one master and a few chained slaves (each has log_slave_updates switched on). Master uses mainly MyISAM tables, slaves have about 10 or 40 threads for parallel replication.
Interesting is, that the first slave in the chain counts replicated statements into Non-Transactional Groups and the following ones count them into Transactional Groups.
Interesting. Where do you see these counts? My guess is that these are counting the "transactional" status flag on each GTID event in the binlog. You can see these yourself in a mysqlbinlog output from a binlog on the master respectively the slaves. #190606 19:42:35 server id 1 end_log_pos 514 GTID 0-1-2 trans If these show non-transactional on the master but transactional on the first slave, it sounds like you are replicating from MyISAM tables on the master to InnoDB tables on the slave. Try SHOW CREATE TABLE t on a relevant table on the master and the slave and see which storage engine they are using.
Also, when checking process lists it seems that just one statement is being processed at the time (of the many threads) on the first slave, while there are multiple slave replication statements being executed on the 2nd and following slaves.
This observation matches the theory that the tables are MyISAM on the master but InnoDB on the slaves. MariaDB parallel replication has limited capabilities in parallelising MyISAM changes. The main algorithms are based on optimistic apply, where transactions are run in parallel by default, and any conflicts are handled by rollback and retry. This is possible in InnoDB but not MyISAM. And the transactional status is checked on the table engine used on the master, not the slave. Thus, the first slave sees MyISAM changes, and does not do parallel operation, but writes InnoDB transactions. These InnoDB transactions are then seen by following slaves which enables the parallel replication algorithms.
Please, does anyone know the reason why the replicated statements are counted into different groups? Or, more importantly, how to increase the parallelism on the first slave in the chain?
The obvious answer is to change the tables to be InnoDB on the master. Which may or may not be possible in your setup. A possibly crazy/theoretical idea would be to setup the first slave with the blackhole engine for all tables. This requires statement-based replication and doesn't store _any_ data on the slave, just passes statements through to the next slave in line. There's an old idea to use the blackhole engine in this way as a "replication relay", and IIRC the blackhole engine is transactional. Not sure if this would actually work though, would require careful testing and is definitely not a supported configuration, I would say (but fun to think about). - Kristian.
Jan, Kristian, howdy.
Jan Křístek
writes: We have a MariaDB 10.3 replication setup with one master and a few chained slaves (each has log_slave_updates switched on). Master uses mainly MyISAM tables, slaves have about 10 or 40 threads for parallel replication.
Interesting is, that the first slave in the chain counts replicated statements into Non-Transactional Groups
So the first slave does not change the group type which remain non-transactional, I assume the input (master binlog) group is such.
and the following ones count them into Transactional Groups.
And the above transition can be explained by MDEV-24654 GTID event falsely marked transactional, its patch is under review.
Interesting. Where do you see these counts? My guess is that these are counting the "transactional" status flag on each GTID event in the binlog. You can see these yourself in a mysqlbinlog output from a binlog on the master respectively the slaves.
#190606 19:42:35 server id 1 end_log_pos 514 GTID 0-1-2 trans
If these show non-transactional on the master but transactional on the first slave, it sounds like you are replicating from MyISAM tables on the master to InnoDB tables on the slave. Try SHOW CREATE TABLE t on a relevant table on the master and the slave and see which storage engine they are using.
This remains to be a possibility too.
Also, when checking process lists it seems that just one statement is being processed at the time (of the many threads) on the first slave, while there are multiple slave replication statements being executed on the 2nd and following slaves.
This observation matches the theory that the tables are MyISAM on the master but InnoDB on the slaves. MariaDB parallel replication has limited capabilities in parallelising MyISAM changes. The main algorithms are based on optimistic apply, where transactions are run in parallel by default, and any conflicts are handled by rollback and retry. This is possible in InnoDB but not MyISAM. And the transactional status is checked on the table engine used on the master, not the slave.
Thus, the first slave sees MyISAM changes, and does not do parallel operation, but writes InnoDB transactions. These InnoDB transactions are then seen by following slaves which enables the parallel replication algorithms.
This fits to MDEV-24654 scenario.
Please, does anyone know the reason why the replicated statements are counted into different groups? Or, more importantly, how to increase the parallelism on the first slave in the chain?
The obvious answer is to change the tables to be InnoDB on the master. Which may or may not be possible in your setup.
I'd also recommend that.
A possibly crazy/theoretical idea would be to setup the first slave with the blackhole engine for all tables. This requires statement-based replication and doesn't store _any_ data on the slave, just passes statements through to the next slave in line.
Actually ROW format is fine so the 1st slave would re-log replicated group in ROW format to its binlog.
There's an old idea to use the blackhole engine in this way as a "replication relay", and IIRC the blackhole engine is transactional.
Right.
Not sure if this would actually work though, would require careful testing and is definitely not a supported configuration, I would say (but fun to think about).
I am also not sure how practical it could be in this specifica case, but it's fast as well :-))).
- Kristian.
Cheers, Andrei
andrei.elkin@pp.inet.fi writes:
And the above transition can be explained by MDEV-24654 GTID event falsely marked transactional, its patch is under review.
Oh, yes, this bug sounds like it could result in what Jan described. It was not clear to me from the bug description exactly under what conditions the bug occurs, but if the first slave marks the replicated transactions as "transactional" in its binlog, then the observed behaviour could occur. The question then is how the chained slaves manage to run MyISAM transactions in parallel without getting conflicts and hanging. One possibility is that these are mostly insert-only queries (as Jan mentioned in another mail), and I believe that MyISAM has the feature that MyISAM can handle insert-only queries in parallel without locks and conflicts. Would require a bit more research to be sure this is the explanation, but it seems a possibility. - Kristian.
participants (3)
-
andrei.elkin@pp.inet.fi
-
Jan Křístek
-
Kristian Nielsen