Hi Jonas and Kristian,
The idea of a hybrid approach seems very good. My experience implementing parallel apply on Tungsten leads me to believe that masters can supply useful metadata for replication but cannot supply a definitive plan. There are a number of reasons for this.
1. Slaves do not always apply all data. This is particularly true if you are replicating heterogeneously, which we do quite a bit on Tungsten. It's quite common to drop some fraction of the changes.
2. Slave resources are not fixed and their workload may differ substantially from the master. For instance, both CPU and I/O capacity are variable, especially if there is any asymmetry related to host resources. Workloads are also asymmetric. You may need to trade off resources devoted to replication against read-only queries. In Tungsten we tune the number of threads for parallel apply as well as load balancing decisions based on these considerations.
3. Slave-side optimizations come into play. Tungsten can permit causally independent replication streams to diverge substantially. For example, you could allow the slowest and fastest parallel threads to diverge by up to 5 minutes. Doing so ensures that you continue to get good parallelization even when workloads have a mix of very large and very small transactions. The choice of interval depends on factors like how long you are willing to wait for replication to serialize fully when going offline or how much memory you have in the OS page cache.
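To make the divergence limit in point 3 concrete, here is a minimal Python sketch. It is not Tungsten's actual implementation; the class `DivergenceGate`, its methods, and the use of master commit timestamps as the progress measure are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DivergenceGate:
    """Hypothetical gate bounding how far parallel apply channels may drift apart.

    Progress is measured by the master commit timestamp (in seconds) of the
    last transaction each channel has applied. This is an assumption of the
    sketch, not a description of any real replicator.
    """
    max_divergence_s: float
    applied: Dict[int, float] = field(default_factory=dict)

    def mark_applied(self, channel: int, commit_ts: float) -> None:
        # Record that `channel` has applied everything up to `commit_ts`.
        self.applied[channel] = commit_ts

    def can_apply(self, channel: int, commit_ts: float) -> bool:
        # The slowest *other* channel defines the low-water mark.
        others = [ts for ch, ts in self.applied.items() if ch != channel]
        if not others:
            return True  # nothing to diverge from yet
        low_water = min(others)
        # Allow the transaction only if it would not put this channel more
        # than the configured interval ahead of the slowest channel.
        return commit_ts - low_water <= self.max_divergence_s


gate = DivergenceGate(max_divergence_s=300.0)  # 5 minutes
gate.mark_applied(0, 1000.0)
gate.mark_applied(1, 1000.0)
fast_ok = gate.can_apply(1, 1200.0)       # 200 s ahead of channel 0
fast_blocked = gate.can_apply(1, 1301.0)  # 301 s ahead of channel 0
```

A real implementation would also need to handle idle channels, which would otherwise hold the low-water mark back indefinitely; the sketch leaves that out.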
MariaDB parallel apply works differently from Tungsten, of course, and you may make a different set of trade-offs. In general, though, it seems that the most valuable contributions from the master side are the following:
1.) Provide a fully serialized binlog. I cannot begin to say how helpful it is that MySQL did this a long time ago.
2.) Provide as much metadata as possible about whether succeeding transactions are causally independent.
3.) Where feasible, limit transactions that would require full serialization of replication. For instance, it's very helpful to forbid transactions from spanning schema boundaries, so you get a series of guaranteed causally independent streams at the master.
Beyond that it's up to the slave to decide how to use the information when applying transactions.
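As one illustration of a slave-side decision built on such metadata, the sketch below routes each transaction to an apply channel by schema and flags anything that spans schemas for full serialization. It assumes the master tags each transaction with the set of schemas it touched; the function name `assign_channel` and that tagging are hypothetical, and this is not MariaDB's or Tungsten's code.

```python
import zlib
from typing import FrozenSet, Optional


def assign_channel(schemas: FrozenSet[str], num_channels: int) -> Optional[int]:
    """Pick an apply channel for a transaction, or None if it must serialize.

    `schemas` is the (assumed) master-supplied set of schemas the transaction
    touched. A transaction confined to one schema is causally independent of
    other schemas, so it can go to that schema's channel. Anything else,
    including a transaction with no schema information, returns None to
    signal that all channels must drain before it is applied.
    """
    if len(schemas) != 1:
        return None
    (schema,) = schemas
    # A stable hash keeps every transaction for a schema on the same channel,
    # which preserves commit order within that schema.
    return zlib.crc32(schema.encode("utf-8")) % num_channels


orders_channel = assign_channel(frozenset({"orders"}), 4)
cross_schema = assign_channel(frozenset({"orders", "billing"}), 4)  # None
```

The number of channels stays a slave-side setting, so each slave can size it to its own CPU and I/O capacity.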
Cheers, Robert