[Maria-developers] comments on parallel applying
hi there,

Disclaimer: I haven't followed all the turns and details on this task, so sorry if my comments are outdated (or generally uninteresting).

---

I've been glancing at the review comments on parallel applying, and have some comments to make.

I think it would be time well invested to add infrastructure to be able to restrict when parallel applying will be performed. Typically I think it would be nice to:

- only allow parallel transactions in the same engine
- decrease parallelism when performing DDL

(more of these can probably be added)

These are restrictions that will make it easier to reason about correctness, and that will not affect performance for 99% of the cases (I think :-). If adding such restrictions were an option, resolving the review comments on weird corner cases would be much easier.

---

From what I can tell, there are currently no such restrictions, and no infrastructure to add them either. I realize that it might not be trivial to introduce such restrictions, but personally I think it will be worth the effort to get a solid solution in place.

<quick thoughts on implementation>
For row-based replication this seems quite "easy". For statement-based replication I imagine you would have to add hooks into the "real" code after parsing has been performed, but before the actual execution is started (and yes, I know that there is sometimes a blurry line here).
</thoughts>

---

/Jonas
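The two restrictions proposed above could be expressed as a small scheduling gate that is consulted before a queued transaction is handed to a parallel worker. The following is a minimal sketch with hypothetical names and types, not MariaDB's actual scheduler API:

```python
# Hypothetical sketch of a parallel-apply gate: decide whether a queued
# transaction may run alongside the transactions already in flight, under
# the two restrictions proposed above. Names/types are illustrative.

from dataclasses import dataclass

@dataclass
class TxnInfo:
    gtid: str
    engines: frozenset  # storage engines touched, e.g. {"innodb"}
    is_ddl: bool

def may_run_in_parallel(candidate: TxnInfo, in_flight: list) -> bool:
    """Return True if `candidate` may be applied alongside `in_flight`."""
    # Restriction 2: DDL runs alone. Wait until nothing is in flight,
    # and never start anything else while a DDL is executing.
    if candidate.is_ddl:
        return len(in_flight) == 0
    if any(t.is_ddl for t in in_flight):
        return False
    # Restriction 1: only parallelize transactions in the same engine(s).
    for t in in_flight:
        if t.engines != candidate.engines:
            return False
    return True
```

A gate like this keeps the correctness argument local: every state the parallel workers can reach is one this single predicate has admitted.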
Jonas Oreland <jonaso@google.com> writes:
> <quick thoughts on implementation>
> For row-based replication this seems quite "easy". For statement-based
> replication I imagine you would have to add hooks into the "real" code after
> parsing has been performed, but before the actual execution is started (and
> yes, I know that there is sometimes a blurry line here).
> </thoughts>
A different approach could be to do this on the master. When a transaction is binlogged, we have easy access to most/all of this information, and there is room in the GTID event at the start of every binlog event group to save this information for the slave. Then the slave has the information immediately when it starts scheduling events for parallel execution. So this does not sound too hard. Though the amount of information that can be provided is of course somewhat limited, for space and other reasons.
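To make the space constraint concrete, a handful of annotation bits would already cover the restrictions discussed in this thread. The sketch below packs them into a single spare byte; the bit names and layout are hypothetical, not MariaDB's actual GTID event format:

```python
# Illustrative sketch: pack a few master-side annotation bits into one
# spare byte of the GTID event that starts each binlog event group.
# Bit names and layout are hypothetical, not the real on-disk format.

import struct

FL_SINGLE_ENGINE = 1 << 0  # transaction touches exactly one storage engine
FL_DDL           = 1 << 1  # event group contains DDL
FL_SAFE_ROLLBACK = 1 << 2  # transaction can be rolled back and retried

def encode_annotations(single_engine, ddl, safe_rollback):
    flags = 0
    if single_engine:
        flags |= FL_SINGLE_ENGINE
    if ddl:
        flags |= FL_DDL
    if safe_rollback:
        flags |= FL_SAFE_ROLLBACK
    return struct.pack("<B", flags)  # one byte on the wire

def decode_annotations(data):
    (flags,) = struct.unpack("<B", data)
    return {
        "single_engine": bool(flags & FL_SINGLE_ENGINE),
        "ddl":           bool(flags & FL_DDL),
        "safe_rollback": bool(flags & FL_SAFE_ROLLBACK),
    }
```

Because the byte travels at the start of the event group, the slave can schedule the transaction before the rest of it has even been received.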
> I think it would be time well invested to add infrastructure to be able to
> restrict when parallel applying will be performed.
Yes, I think it is an interesting idea. For example, we could only run transactions in parallel that are safe to roll back (marked by a bit in the GTID). Then in case of a deadlock, we know it's safe to roll back all of them and try again.

This would only need to be done within each replication domain, which is where we need to enforce commit order. It would still be possible to e.g. run a long-running DDL in parallel in a separate domain, with the user/DBA taking care of ensuring that no conflicts will occur.

I'll need to think more about it, but it's an interesting idea. Thanks for the suggestion!

 - Kristian.
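The rollback-and-retry policy Kristian describes can be sketched in a few lines. This is a self-contained illustration with invented names, not MariaDB internals: because every transaction in the batch is marked rollback-safe, a deadlock among the parallel workers can be resolved by rolling the whole batch back and re-applying in commit order.

```python
# Sketch of deadlock recovery for a batch of rollback-safe transactions
# within one replication domain. All names are illustrative.

class Deadlock(Exception):
    pass

def apply_batch(batch, apply_fn, rollback_fn):
    """Try parallel apply; on deadlock, roll back and serialize."""
    done = []
    try:
        for txn in batch:            # stand-in for parallel worker threads
            apply_fn(txn)
            done.append(txn)
        return "parallel"
    except Deadlock:
        # Safe precisely because every txn in the batch is rollback-safe.
        for txn in reversed(done):
            rollback_fn(txn)
        # Fall back to strict commit order, which cannot deadlock.
        for txn in sorted(batch, key=lambda t: t["commit_seq"]):
            apply_fn(txn, force_serial=True)
        return "serialized"
```

The key property is that the fallback path always makes progress, so a deadlock costs only wasted work, never correctness.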
On Fri, Jul 4, 2014 at 10:26 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
> Jonas Oreland <jonaso@google.com> writes:
>
>> <quick thoughts on implementation>
>> For row-based replication this seems quite "easy". For statement-based
>> replication I imagine you would have to add hooks into the "real" code
>> after parsing has been performed, but before the actual execution is
>> started (and yes, I know that there is sometimes a blurry line here).
>> </thoughts>
>
> A different approach could be to do this on the master. When a transaction
> is binlogged, we have easy access to most/all of this information. And
> there is room in the GTID event at the start of every binlog event group
> to save this information for the slave. Then the slave has the information
> immediately when it starts scheduling events for parallel execution. So
> this does not sound too hard. Though the amount of information that can be
> provided is then somewhat limited for space and other reasons, of course.
Or perhaps a hybrid approach: the master does "interesting" annotations, and the slave takes its decision based on the annotations *and* its own analysis.

/Jonas
Jonas Oreland <jonaso@google.com> writes:
> Or perhaps a hybrid approach: the master does "interesting" annotations,
> and the slave takes its decision based on the annotations *and* its own
> analysis.
Agree. There should be interesting possibilities to investigate.

One complication for the slave's own analysis is that for large transactions, it may be necessary/desirable for the slave to start executing the first part of a transaction before the last part has been received. A hybrid approach seems well suited to providing sufficient information on the slave to make a good decision in most cases.

 - Kristian.
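The hybrid decision could look roughly like the sketch below, where the master's annotation bits give a cheap first answer and the slave refines it using only the events received so far, so a large transaction can be scheduled before its last event arrives. All names and thresholds are illustrative assumptions, not actual MariaDB behavior:

```python
# Hypothetical hybrid decision: combine master annotations with the
# slave's own analysis of the transaction prefix seen so far.

def decide_parallel(master_flags, events_seen_so_far, max_tables=8):
    # Trust a negative master verdict outright; no local analysis needed.
    if not master_flags.get("safe_rollback", False):
        return False
    # Slave-side veto based on the prefix of the transaction seen so far;
    # the decision can be taken before the full transaction has arrived.
    tables = set()
    for ev in events_seen_so_far:
        if ev.get("ddl"):
            return False
        tables.add(ev["table"])
    # Illustrative local heuristic: refuse very wide transactions.
    return len(tables) <= max_tables
```

The master never has to predict the slave's resources or workload; it only supplies facts about the transaction, and the slave applies local policy on top.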
Hi Jonas and Kristian,

The idea of a hybrid approach seems very good. My experience implementing parallel apply on Tungsten leads me to believe that masters can supply useful metadata for replication but cannot supply a definitive plan. There are a number of reasons for this.

1. Slaves do not always apply all data. This is particularly true if you are replicating heterogeneously, which we do quite a bit on Tungsten. It's quite common to drop some fraction of the changes.

2. Slave resources are not fixed, and their workload may differ substantially from the master's. For instance, both CPU and I/O capacity are variable, especially if there is any asymmetry related to host resources. Workloads are also asymmetric: you may need to trade off resources devoted to replication against read-only queries. In Tungsten we tune the number of threads for parallel apply, as well as load-balancing decisions, based on these considerations.

3. Slave-side optimizations come into play. Tungsten can permit causally independent replication streams to diverge substantially; for example, you could allow the slowest and fastest parallel threads to diverge by up to 5 minutes. Doing so ensures that you continue to get good parallelization even when workloads have a mix of very large and very small transactions. The choice of interval depends on factors like how long you are willing to wait for replication to serialize fully when going offline, or how much memory you have in the OS page cache.

MariaDB parallel apply works differently from Tungsten, of course, and you may permit a different set of trade-offs. In general, though, it seems that the most valuable contributions from the master side are the following:

1. Provide a fully serialized binlog. I cannot begin to say how helpful it is that MySQL did this a long time ago.

2. Provide as much metadata as possible about whether succeeding transactions are causally independent.

3. Where feasible, limit transactions that would require full serialization of replication. For instance, it's very helpful to forbid transactions from spanning schema boundaries, so you get a series of guaranteed causally independent streams at the master.

Beyond that it's up to the slave to decide how to use the information when applying transactions.

Cheers,
Robert

On Fri, Jul 4, 2014 at 5:05 AM, Jonas Oreland <jonaso@google.com> wrote:
> On Fri, Jul 4, 2014 at 10:26 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
>
>> Jonas Oreland <jonaso@google.com> writes:
>>
>>> <quick thoughts on implementation>
>>> For row-based replication this seems quite "easy". For statement-based
>>> replication I imagine you would have to add hooks into the "real" code
>>> after parsing has been performed, but before the actual execution is
>>> started (and yes, I know that there is sometimes a blurry line here).
>>> </thoughts>
>>
>> A different approach could be to do this on the master. When a transaction
>> is binlogged, we have easy access to most/all of this information. And
>> there is room in the GTID event at the start of every binlog event group
>> to save this information for the slave. Then the slave has the information
>> immediately when it starts scheduling events for parallel execution. So
>> this does not sound too hard. Though the amount of information that can be
>> provided is then somewhat limited for space and other reasons, of course.
>
> Or perhaps a hybrid approach: the master does "interesting" annotations,
> and the slave takes its decision based on the annotations *and* its own
> analysis.
>
> /Jonas
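Robert's third point (forbidding transactions that span schema boundaries) is what makes schema-based stream partitioning possible on the slave. A minimal sketch of such a partitioner, with an invented function name:

```python
# Sketch: if transactions never cross schema boundaries, the slave can
# deterministically hash each transaction's schema to an apply worker,
# giving guaranteed causally independent streams. Names are illustrative.

import zlib

def worker_for(schema: str, n_workers: int) -> int:
    """Map a schema to one of n_workers apply threads, deterministically."""
    return zlib.crc32(schema.encode()) % n_workers
```

Determinism matters here: every transaction of a given schema must land on the same worker, so per-schema apply order is preserved without any cross-worker synchronization.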
_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp
participants (3)
- Jonas Oreland
- Kristian Nielsen
- Robert Hodges