Pavel Ivanov <pivanof@google.com> writes:
> This is not entirely true, right? Let's say the master binlog has transactions T1.1, T1.2, T1.3, T1.4, T2.1, T1.5, T2.2 (where T1.* have domain_id = 1 and T2.* have domain_id = 2) and the slave has 3 parallel threads. Then, as I understand it, the threads will be assigned to execute T1.1, T1.2 and T1.3. T2.1 won't be scheduled to execute until these 3 transactions (or at least 2 of them, T1.1 and T1.2) have been committed. So streams from different domains are not completely independent, right?
One can use --slave-domain-parallel-threads to limit the number of threads that one domain in one multi-source connection can reserve. By default, things work as in your example. With e.g. --slave-parallel-threads=3 --slave-domain-parallel-threads=2, two threads will be assigned to run T1.1, T1.2, T1.3, and T1.4, and one free thread will remain to run T2.1 in parallel with them.
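For concreteness, here is a minimal sketch of setting that up in SQL (both variables can also go in my.cnf; the slave must be stopped before they can be changed):

    STOP SLAVE;
    -- Global pool of parallel applier threads on the slave:
    SET GLOBAL slave_parallel_threads = 3;
    -- Cap on how many of those one domain in one master connection
    -- may reserve:
    SET GLOBAL slave_domain_parallel_threads = 2;
    START SLAVE;

With these values, T1.* can hold at most two worker threads, leaving the third free for T2.1.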
> As I pointed out above, the streams from multiple domains are completely independent only when they are coming from multiple masters. When they come from a single master they are not completely independent, and that creates confusion (at least for me) about how these options work together in that case.
It's the same for multiple masters as for a single master. There is a global pool of --slave-parallel-threads=N threads. One domain in one master connection can allocate up to --slave-domain-parallel-threads=M of those to apply transactions in-order. If one master connection is using all available threads, other connections will stall :-/ So the thread management is pretty basic in the current version of parallel replication, and domain-based parallel replication is not as easy to use as I would like. Hopefully this can be improved in a later version.
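To make the sharing concrete, a sketch of a slave with two named master connections (connection and host names are made up); both connections draw their workers from the same global slave_parallel_threads pool, and slave_domain_parallel_threads limits what one domain in one connection can take from it:

    -- Two independent master connections on one slave:
    CHANGE MASTER 'm1' TO MASTER_HOST='m1.example.com', MASTER_USE_GTID=slave_pos;
    CHANGE MASTER 'm2' TO MASTER_HOST='m2.example.com', MASTER_USE_GTID=slave_pos;
    START ALL SLAVES;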
> I guess the big question I want to ask is: why would someone want to use multiple domains together with slave-parallel-domains = off? If it's a kind of kill-switch to turn off the multi-domain feature completely in case it causes trouble for some reason, then I don't think it is baked in deeply enough to actually work like that. But I don't understand what else it could be used for.
The original motivation for replication domains is multi-source replication. Suppose we have M1->S1, M2->S1, S1->S2, S1->S3:

    M1 --\         /---S2
          +-- S1 -+
    M2 --/         \---S3

Configuring different domains for M1 and M2 is necessary to be able to reconfigure the replication hierarchy, for example to M1->S2, M2->S2:

    M1 --\    /---S2
          +--+
    M2 --/    \---S1 ---S3

or to S2->S3:

    M1 --\         /---S2 ---S3
          +-- S1 -+
    M2 --/

This requires a way to track the position in the binlog streams of M1 and M2 independently, hence the need for domain_id.

The domains can also be used for parallel replication; this is needed to allow S2 and S3 to have the same parallelism as S1. However, this kind of parallel replication requires support from the application to avoid conflicts. Now concurrent changes on M1 and M2 have to be conflict-free not just on S1, but on _all_ slaves in the hierarchy. I think that such a feature, which can break replication unless the user carefully designs the application to avoid it, requires a switch to turn it on or off.
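For reference, assigning the domains is just a matter of giving each master its own gtid_domain_id; a sketch (the value would normally also go in my.cnf so it survives restarts):

    -- On M1:
    SET GLOBAL gtid_domain_id = 1;
    -- On M2:
    SET GLOBAL gtid_domain_id = 2;

After this, each master's transactions form their own independently ordered GTID stream, which is what lets a slave track (and apply) the two streams separately.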
> Both seem to be too narrow a use case to make it worth adding a flag that can significantly hurt the majority of other use cases. I think
I see your point. Another thing that makes the use case even narrower is that it is rather random whether we actually get the lock wait in T2 on the master. So even if delaying T2 would significantly improve performance on the slave, it is not a reliable mechanism.
> this feature will be useful only if the master somehow leaves information about which transaction T2 was in conflict with, and the slave then makes sure that T2 is not started until T1 has finished. Though this sounds over-complicated already.
Yeah, it does. What I really need is to get some results from testing optimistic parallel replication, to understand how many retries will be needed in various scenarios, and whether those retries become a bottleneck for performance.
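The kind of experiment I have in mind is roughly this sketch (slave_parallel_mode is the new option discussed in this thread; Slave_retried_transactions is the existing retry counter):

    STOP SLAVE;
    SET GLOBAL slave_parallel_mode = 'optimistic';
    START SLAVE;
    -- ... run the workload, then check how often transactions had to
    -- be rolled back and retried because of conflicts:
    SHOW GLOBAL STATUS LIKE 'Slave_retried_transactions';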
> I understand everything that you say, but I think the difference between our views is that you consider DBAs and database users to be mostly the same people, or two small groups sitting in the same room and easily communicating with each other. For me that's not true. For
(I meant that getting your input has made me think more about use cases that are different from something like, e.g., Facebook, with a single carefully controlled application and a team of highly skilled database developers. That is where I come from originally, though we were a _lot_ smaller than Facebook, obviously :-) If I understand correctly, your use case is that of a team of highly skilled DBAs (for lack of a better name) managing a database service used by a lot of users, each with their own applications and each without deep database skills. Another use case is users who run a single application on their own hosting of MariaDB, but use the database as a commodity without wanting to invest many resources in acquiring detailed MariaDB skills. These use cases are probably a lot more common than "facebook-like" applications.)
> So when you say "the user already has a lot of ways to affect parallel replication", it translates to me as "there are certain workloads where parallel replication will behave slower than sequential". Yes, I agree with that. If I meet such a workload I will have to turn off parallel replication, or I (with your help) will have to find some generic improvement to make parallel replication work better with such
Right, point taken.
> So overall I don't think this variable will be useful for large installations.
I agree it will not be useful for you, nor for the majority of other users. The question is whether --slave-parallel-wait-if-conflict-on-master and @@replicate_expect_conflicts are sufficiently useful to a minority of users to be worth including in _some_ capacity. From the input I have gotten so far, I think I will remove them, unless someone chimes in with a different point of view (we can always add them back later in some form, if they actually turn out to be needed in real-life testing).

That would leave the following options (in one form or another):

    --master_connection.slave-parallel-mode = conservative | optimistic
    --slave-parallel-threads = N
    --master_connection.parallel-slave = on|off
    --master_connection.slave-parallel-domains = on|off

I am not yet confident enough in the code to not provide a way to disable the new optimistic mode. And the fixed-size thread pool, while limited, is what we have for now. And being able to enable/disable in-order and out-of-order parallel replication independently, on a per-master-connection basis, seems useful. Or can we do better?

Once again, thanks for taking the time to help me improve this important aspect of parallel replication.

Thanks,

 - Kristian.