Pavel Ivanov <pivanof@google.com> writes:
Does this mean that group commit will be possible if slave is able to execute several transactions consecutively while previous transaction commits/fsyncs?
Yes. Because the following transaction is in all cases allowed to start as soon as the prior transaction reaches its COMMIT event, see previous mail for details.
I'd suggest to name this option differently because looking just at the list of available values it's not quite clear what could be the difference between follow_master_commits and only_commits. I don't know yet what is the best name for this. Maybe overlap_commits?
I think overlap_commits sounds good, thanks for the suggestion.
--slave-parallel-domains=on|off (default on)
This replaces the "domain" option of old --slave-parallel-mode. When enabled, parallel replication will apply in parallel transactions whose GTID has different domain ids (GTID mode only).
I don't understand what would be the meaning of combining this flag with --slave-parallel-mode. Does it mean that when this flag is on, transactions from different domains are executed at the "all_transactions" level of parallelism no matter what value --slave-parallel-mode has? What will happen if this flag is off but --slave-parallel-mode=all_transactions?
These apply on two different levels. With --slave-parallel-domains=on, each replication domain is replicated as a completely independent stream, similar to different multi-source replication slaves. The position in each stream is tracked with its own GTID in gtid_slave_pos, and one stream can be arbitrarily far ahead of another. The --slave-parallel-mode applies within each stream. Within one stream, commits are strictly ordered, and --slave-parallel-mode specifies how much parallelism is attempted. The --slave-parallel-mode can be set to any value, and the server is responsible for ensuring that replication works correctly. In contrast, when using --slave-parallel-domains, it is the user's/DBA's responsibility to ensure that replication domains are set up correctly so that no conflict can occur between them.
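The two levels described above can be illustrated with a small Python sketch. It is only a toy model, not server code: the function names are made up for illustration, the only fact assumed about the server is the MariaDB GTID text format domain-server-sequence (for example 0-1-100), and the parallel_domains argument merely mirrors the proposed --slave-parallel-domains option.

```python
from collections import defaultdict


def parse_gtid(gtid):
    """Split a GTID written as 'domain-server-seq' into three integers."""
    domain_id, server_id, seq_no = (int(part) for part in gtid.split("-"))
    return domain_id, server_id, seq_no


def split_into_streams(gtids, parallel_domains=True):
    """Group GTIDs into independent streams, keeping arrival order in each.

    With parallel_domains=False (hypothetical stand-in for
    --slave-parallel-domains=off) everything stays in one stream, keyed None.
    """
    streams = defaultdict(list)
    for gtid in gtids:
        domain_id, _, _ = parse_gtid(gtid)
        key = domain_id if parallel_domains else None
        streams[key].append(gtid)
    return dict(streams)


def position_per_domain(applied):
    """Track one GTID per domain: the last one applied in that domain."""
    pos = {}
    for gtid in applied:
        domain_id, _, _ = parse_gtid(gtid)
        pos[domain_id] = gtid
    return pos


events = ["0-1-100", "1-1-7", "0-1-101", "1-1-8", "0-1-102"]
streams = split_into_streams(events)
# Domain 1 may be fully applied while domain 0 has applied only one event.
applied_so_far = streams[1] + streams[0][:1]
print(streams)
print(position_per_domain(applied_so_far))
```

Within each list the order is fixed, which corresponds to the strict commit order inside one stream; how much of that list is executed concurrently is what --slave-parallel-mode would decide.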
I feel like you are up to something here, but implementing it using this flag is not quite right.
Can you elaborate? --slave-parallel-domains controls whether we have one stream or many. --slave-parallel-mode controls what happens inside each stream. Any suggestion how to clarify?
Hm... The fact that a transaction did a lock wait on master doesn't mean that the conflicting transaction was committed on master, or that both of these transactions were committed close enough to even make it possible to be executed in parallel on slaves, right? Are you sure that this flag will be useful?
Right, and no, I'm not sure. Testing will be needed to have a better idea. If two short transactions T1 and T2 conflict on a row, T2 is quite likely to commit just after T1, and thus likely to conflict on the slave. So there is some rationale behind this.
@@SESSION.replicate_expect_conflicts=0|1 (default 0)
I think this variable will be completely useless and is not worth implementing. How will the user understand that the transaction he is about to execute is likely to conflict with other transactions committed at about the same time? I think it will be completely impossible to make that judgement, and at the same time it will put too much influence over the slave's behavior into users' hands. Am I missing something? What kind of scenario are you envisioning this variable to be used in?
My main worry with optimistic parallel replication is whether too many conflicts and retries on the slave will outweigh the performance gained from parallelism. If this does not happen, I feel it will be awesome. So I was very focused on what to do if we _do_ get a lot of conflicts. I wanted to give advanced users the possibility to work around hotspot rows if necessary, like a single row that is updated very frequently.

I did not think that this was allowing users much impact on the slave's behaviour. This option is only a heuristic: it controls how aggressively the slave will try to parallelise, but it cannot affect correctness. And the user already has a lot of ways to affect parallelism in optimistic parallel replication. For example, imagine lots of transactions like this executed serially on the master:

  UPDATE t1 SET a=a+1 WHERE id=0;
  UPDATE t1 SET a=a+1 WHERE id=0;
  UPDATE t1 SET a=a+1 WHERE id=0;
  ...

All of these would conflict on a slave. It seems likely to cause O(N**2) transaction retries on the slave for --slave-parallel-mode=all_transactions --slave-parallel-threads=N.

So the idea was that the user can already cause trouble for parallelism on the slave; @@replicate_expect_conflicts is intended for the power user to be able to hint the slave at how to get less trouble. But I'm open to changing it, if you think it's important. Your perspective is rather different from my usual point of view, which is useful input.

Jonas Oreland <jonaso@google.com> writes:
I still prefer "auto" as default,
Right... I really want "normal" users to be able to just enable "auto" and have things work reasonably. And I really need fine-grained control to enable testing various combinations in real life, to better understand how to implement "auto". I need to find a good way to combine these...
if you want the fine grained control, I think an optimizer_switch approach is better than adding X new config options, i.e. --parallel_mode=option1=true,option3=4
don't you think that there will be new options/variants ? i do
don't you think you will want to remove options/variants ? i do
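A minimal Python sketch of how such a switch-style value could be parsed is shown here. It is purely illustrative: option1 and option3 are the placeholder names from the example above, and the function names and the known-names check are assumptions, not an existing server interface.

```python
def parse_switch(value):
    """Parse 'name=value,name=value' into a dict of strings."""
    options = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        name, sep, setting = item.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected name=value, got {item!r}")
        options[name.strip()] = setting.strip()
    return options


def apply_switch(current, value, known):
    """Return a copy of current updated from value; reject unknown names."""
    updated = dict(current)
    for name, setting in parse_switch(value).items():
        if name not in known:
            raise ValueError(f"unknown option {name!r}")
        updated[name] = setting
    return updated


defaults = {"option1": "false", "option3": "1"}
print(apply_switch(defaults, "option1=true,option3=4", known=defaults.keys()))
```

The appeal of this shape is that adding or retiring a variant changes only the set of known names, not the list of top-level server options.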
Good point. I think --slave-parallel-domains is reasonable. This is a question of semantics: the user must explicitly request it, as it can break replication if not used correctly by applications. But the other options are less clear. They should all behave equally correctly. Maybe something like this:

1. --slave-parallel-domains is off by default, can be enabled if application is written to guarantee no conflicts.
2. --parallel-slave=on|off. This defines if parallelism will be done within each replication stream.
3. --slave-parallel-threads=N. This could default to 3*#CPUs or whatever if --parallel-slave is enabled on at least one multi-master slave.
4. --slave-parallel-mode=auto | <lots of fine-grained options>.

Then the "normal" user will only need to say --parallel-slave=on. --slave-parallel-mode defaults to "auto". I then need to consider how this affects backwards compatibility with 10.0...
I don't like having the word "transaction" in the names of some modes; ALL variants will maintain transaction semantics. Having "transaction" in the name of only a few sort of implies that the others are not transactional...
i think "optimistic" is better than "transactional" or "all_transactions". in the same spirit "conservative" could be "follow_master_commits"
Nice, I really like "conservative" and "optimistic". And maybe "only_commits" isn't really useful. There seems little reason not to at least use the "conservative" mode, it shouldn't cause any problems over "only_commits". So slave-parallel-mode= none | conservative | optimistic, perhaps...
i don't understand what "only_commits" is, how can it prepare transaction "queue up" for group commit if only 1 is prepared in parallel ?
The following transaction T2 is started as soon as T1 sees its COMMIT event. So if T2 can reach its own commit while T1 is still waiting for LOCK_log (or --binlog-commit-wait-usec), this is possible.
though, i think all suggestions might work if they have good defaults and a solid implementation
Right, I'd like to get the interface right from the start, but too much bikeshedding is also counterproductive. Thanks, Pavel and Jonas, for your comments!

- Kristian.
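As an aside on the hotspot-row example earlier in this mail (the repeated UPDATE t1 SET a=a+1 WHERE id=0), the O(N**2) retry estimate can be illustrated with a deliberately pessimistic toy model in Python. The model is an assumption made for illustration, not a description of the actual slave implementation: N conflicting transactions are started together, and in each round only the earliest one in commit order succeeds while every later one is rolled back and retried.

```python
def retries_for_batch(n_threads):
    """Count retries when n_threads mutually conflicting transactions
    are started together and must commit in a fixed order (toy model)."""
    pending = n_threads
    retries = 0
    while pending > 0:
        pending -= 1        # the earliest pending transaction commits
        retries += pending  # every later one is rolled back and retried
    return retries


for n in (1, 4, 8):
    # The loop matches the closed form n * (n - 1) / 2, i.e. O(N**2).
    print(n, retries_for_batch(n), n * (n - 1) // 2)
```

In practice a later transaction may simply wait on the row lock instead of being rolled back, so the real number of retries could be much lower; the sketch only shows where the quadratic worst case comes from.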