Hi Serg,

Can you help me with suggestions/comments for the following proposal for how to do configuration for MDEV-6676, speculative parallel replication?

I am not too happy about how things are in 10.0. There, there is a single option --slave-parallel-threads=N. If N>0, then parallel replication is enabled, else not. One problem is that this makes it not configurable per multi-source master connection. Another problem is that there are two possible mechanisms for parallelisation, group-commit based and domain_id based; currently, one can enable none of them or both, but not just one of them. MDEV-6676 will introduce at least one other mechanism, which seems to make it essential to make a better way to configure this.

Now, ideally, there would not be any configuration at all. The server would just run things in parallel when possible and to the degree desirable. However, I think there are some reasons that we need to allow the user this configurability:

- Parallel replication is still a somewhat experimental feature, so it seems too risky to enable it by default. Also, it doesn't really seem possible for the server to automatically set the best number of threads to use, with the current implementation (or possibly any implementation).

- When replicating with non-transactional updates, or in non-GTID mode, slave state is not crash safe. This is true in non-parallel replication also, but in parallel replication, the problem seems amplified, as there may be multiple transactions in progress at the time of a crash, complicating possible manual recovery. This also suggests that parallel replication must be configurable.

- When using domain-based parallel replication, the user is responsible for ensuring that independent domains are non-conflicting and can be replicated out-of-order wrt. each other.
So if replication domains are used, but this property is not guaranteed, then domain-based parallel replication needs to be configurable, or parallel replication cannot be used at all.

- The new speculative replication feature in MDEV-6676 is not always guaranteed to be a win. In some workloads, where there are many conflicts between successive transactions, excessive rollback could cause it to be less efficient than not using it. Again, this suggests it needs to be configurable.

So given this, I came up with the following idea for syntax:

  CHANGE MASTER TO PARALLEL_MODE=(domain,groupcommit,transactional,waiting)

Each of the four keywords in the parentheses is optional. "domain" enables domain-based parallelisation, where each replication domain is treated independently. "groupcommit" enables the non-speculative mode, where only transactions that group-committed together on the master are applied in parallel on the slave. "transactional" enables the speculative mode, where all transactional DML is optimistically tried in parallel, and then in case of conflict a rollback and retry is done. "groupcommit" and "transactional" are mutually exclusive; at most one of them can be specified.

The default would be (domain,groupcommit), to be backwards compatible with 10.0. If slave_parallel_threads=0, then no parallel apply will happen even if PARALLEL_MODE is non-empty. If slave_parallel_threads>0 but PARALLEL_MODE is empty (PARALLEL_MODE=()), then again no parallel apply will be done.

The "waiting" option is not essential to add; we could remove it. I put it in because there were already a number of options, so it seemed to cause no harm. The idea is that on the master, we detect if transaction T2 had to do a row lock wait on transaction T1. If so, it seems likely that a similar conflict could occur on the slave, so we will not run T2 in parallel with T1; instead we will let T2 wait for T1 to commit before T2 is started.
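To make the proposed rules concrete, here is a minimal sketch in Python of how the keyword list, the interaction with the thread count, and the lock-wait check could fit together. It is only an illustration of the semantics described above, not server code: the function names (parse_parallel_mode, parallel_apply_enabled, must_wait_for_prior) and the waited_on_master flag are hypothetical, and the error handling is an assumption.

```python
# Hypothetical sketch of the proposed PARALLEL_MODE semantics.
# None of these names exist in the server; they are for illustration only.

VALID_KEYWORDS = {"domain", "groupcommit", "transactional", "waiting"}

# Proposed default, chosen to match the behaviour of 10.0.
DEFAULT_MODE = frozenset({"domain", "groupcommit"})


def parse_parallel_mode(text):
    """Parse a string such as "(domain,transactional)" into a set of keywords."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("PARALLEL_MODE must be a parenthesised list")
    inner = text[1:-1].strip()
    if inner:
        keywords = frozenset(k.strip().lower() for k in inner.split(","))
    else:
        keywords = frozenset()  # PARALLEL_MODE=() is allowed and means "none"
    unknown = keywords - VALID_KEYWORDS
    if unknown:
        raise ValueError("unknown keyword(s): " + ", ".join(sorted(unknown)))
    # "groupcommit" and "transactional" are mutually exclusive.
    if {"groupcommit", "transactional"} <= keywords:
        raise ValueError("groupcommit and transactional are mutually exclusive")
    return keywords


def parallel_apply_enabled(slave_parallel_threads, mode):
    """Parallel apply needs both worker threads and a non-empty mode."""
    return slave_parallel_threads > 0 and bool(mode)


def must_wait_for_prior(mode, waited_on_master):
    """Return True if T2 should start only after T1 has committed.

    waited_on_master is a hypothetical flag saying that T2 had a row lock
    wait on T1 on the master. The "waiting" keyword disables the check.
    """
    return waited_on_master and "waiting" not in mode
```

With these definitions, parse_parallel_mode("(domain,groupcommit)") gives the proposed default set, and parallel_apply_enabled(0, DEFAULT_MODE) is false, matching the rule that a zero thread count disables parallel apply regardless of the mode.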
The "waiting" option could be enabled by a user to disable this check, enabling even more aggressive parallelisation. I am not sure if it is worth it to have this configurable though; comments welcome.

I have not checked how hard it will be to implement the new syntax. We do not have any similar multi-option CHANGE MASTER elements, as far as I know, but it is similar to ENUM system variables. And we already have an IGNORE_SERVER_IDS syntax with a comma-separated list within parentheses. So hopefully it is not too hard to do, and somewhat consistent with existing syntax.

So what do you think? Is it a reasonable syntax? Any comments, or suggestions for a better way to do it?

Thanks,

 - Kristian.