Hi Jonas and Kristian, 

The idea of a hybrid approach seems very good.  My experience implementing parallel apply on Tungsten leads me to believe that masters can supply useful metadata for replication but cannot supply a definitive plan. There are a number of reasons for this.

1. Slaves do not always apply all data. This is particularly true if you are replicating heterogeneously, which we do quite a bit on Tungsten. It's quite common to drop some fraction of the changes. 

2. Slave resources are not fixed, and their workload may differ substantially from the master's.  For instance, both CPU and I/O capacity are variable, especially if there is any asymmetry in host resources.  Workloads are also asymmetric: you may need to trade off resources devoted to replication against read-only queries.  In Tungsten we tune both the number of parallel apply threads and load-balancing decisions based on these considerations. 

3. Slave-side optimizations come into play. Tungsten can permit causally independent replication streams to diverge substantially; for example, you could allow the slowest and fastest parallel threads to diverge by up to 5 minutes.  Doing so ensures that you continue to get good parallelization even when workloads have a mix of very large and very small transactions. The choice of interval depends on factors like how long you are willing to wait for replication to serialize fully when going offline, or how much memory you have in the OS page cache. (A rough sketch of how such a divergence bound might be enforced follows below.)
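
To make the divergence idea concrete, here is a minimal sketch of how a slave-side dispatcher might enforce such a bound. This is hypothetical code, not Tungsten's (or MariaDB's) actual implementation; the class name, the interface, and the use of master commit timestamps as the divergence measure are all assumptions for illustration.

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <set>

using time_point = std::chrono::system_clock::time_point;

// Hypothetical sketch: bound how far the newest dispatched transaction may
// run ahead of the oldest one still being applied, measured by master
// commit timestamps.
class DivergenceWindow {
public:
  explicit DivergenceWindow(std::chrono::seconds max_div)
      : max_divergence_(max_div) {}

  // Dispatcher side: block until scheduling this transaction keeps the
  // pipeline within the allowed window relative to the oldest in-flight
  // transaction.
  void begin(time_point commit_time) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [&] {
      return in_flight_.empty() ||
             commit_time - *in_flight_.begin() <= max_divergence_;
    });
    in_flight_.insert(commit_time);
  }

  // Worker side: drop the transaction once it has committed and wake the
  // dispatcher if the window has opened up again.
  void end(time_point commit_time) {
    std::lock_guard<std::mutex> lock(mu_);
    in_flight_.erase(in_flight_.find(commit_time));
    cv_.notify_all();
  }

private:
  const std::chrono::seconds max_divergence_;
  std::multiset<time_point> in_flight_;
  std::mutex mu_;
  std::condition_variable cv_;
};

The point of the sketch is that the bound is purely a slave-side policy: the master contributes nothing beyond the commit timestamps already present in the binlog.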

MariaDB parallel apply works differently from Tungsten, of course, and you may permit a different set of trade-offs.  In general, though, it seems that the most valuable contributions from the master side are the following: 

1.) Provide a fully serialized binlog. I cannot begin to say how helpful it is that MySQL did this a long time ago. 

2.) Provide as much metadata as possible about whether successive transactions are causally independent.  

3.) Where feasible, limit transactions that would require full serialization of replication.  For instance, it's very helpful to forbid transactions from spanning schema boundaries, so you get a series of guaranteed causally independent streams at the master. (A sketch of how such annotations might be consumed on the slave follows after this list.)

Beyond that it's up to the slave to decide how to use the information when applying transactions. 
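
To illustrate points 2 and 3, here is a rough sketch of the kind of annotation a master could attach to an event group and how a slave might combine it with its own policy. This is purely illustrative; GroupAnnotation, schedule(), and the field names are made up and do not reflect the actual MariaDB GTID event format.

#include <cstddef>
#include <functional>
#include <string>

// Hypothetical annotation a master could attach to each binlog event group.
// Field names are illustrative only.
struct GroupAnnotation {
  std::string schema;            // schema the transaction is confined to
  bool crosses_schemas = false;  // transaction touched more than one schema
  bool is_ddl = false;           // DDL or anything else needing a barrier
};

// The slave's scheduling decision, combining the annotation with local
// policy: cross-schema transactions and DDL force full serialization;
// everything else is routed to a worker by hashing the schema name, so
// transactions on the same schema stay ordered while independent schemas
// apply in parallel.
enum class Decision { kSerialize, kParallel };

struct Plan {
  Decision decision;
  std::size_t worker;  // meaningful only when decision == kParallel
};

Plan schedule(const GroupAnnotation& a, std::size_t n_workers) {
  if (a.crosses_schemas || a.is_ddl)
    return {Decision::kSerialize, 0};
  return {Decision::kParallel,
          std::hash<std::string>{}(a.schema) % n_workers};
}

The idea is that the master only asserts facts it can determine cheaply at binlog time (which schema was touched, whether the statement is DDL), while the mapping onto worker threads and the serialization policy remain entirely the slave's decision.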

Cheers, Robert


On Fri, Jul 4, 2014 at 5:05 AM, Jonas Oreland <jonaso@google.com> wrote:



On Fri, Jul 4, 2014 at 10:26 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Jonas Oreland <jonaso@google.com> writes:

> <quick thoughts on implementation>
> for row-based replication this seems quite "easy".
>
> for statement-based replication i imagine that you would have to add hooks
> into the "real" code
> after parsing has been performed, but before the actual execution is
> started (and yes, i know that there is sometimes a blurry line here)
> </thoughts>

A different approach could be to do this on the master.
When a transaction is binlogged, we have easy access to most/all of this
information. And there is room in the GTID event at the start of every binlog
event group to save this information for the slave. Then the slave has the
information immediately when it starts scheduling events for parallel
execution. So this does not sound too hard. Though the amount of information
that can be provided is then somewhat limited for space and other reasons, of
course.

or perhaps a hybrid approach.
master does "interesting" annotations
slave takes decision based on annotations *and* own analysis

/Jonas

