Kristian Nielsen via developers <developers@lists.mariadb.org> writes:
> The XA COMMIT will normally be scheduled on the same worker as the XA
> PREPARE, unless the two events are far apart in the replication stream.
> This is mostly unavoidable in the current XA replication design, since
> the XA PREPARE and XA COMMIT of a single transaction cannot
> group-commit together.
After some discussions with Andrei, things are a bit more complex than
this.

On a (parallel) slave where transactions are generally small and fsync
is costly, the ability to group-commit multiple transactions together is
important for performance. And if XA PREPARE and the matching XA COMMIT
cannot group-commit together, this becomes a problem in a simple serial
stream of prepares-and-commits:

  XA PREPARE 't1'
  XA COMMIT 't1'
  XA PREPARE 't2'
  XA COMMIT 't2'
  XA PREPARE 't3'
  XA COMMIT 't3'
  ...

If XA PREPARE 't1' cannot group-commit together with XA COMMIT 't1',
then the maximum group commit size will be 2. This is because
transactions must binlog in the same order on the slave as on the
master, so each group is a contiguous run of the stream, and any run of
three events contains a prepare together with its own commit.

This limitation exists in my suggested "minimal" patch.

In the more complex MDEV-31949 patches, this limitation is partially
lifted, from what I understood from Andrei. The XA PREPARE 't1' can
group-commit _to the binlog_ together with XA COMMIT 't1'. This allows
large group commits to the binlog. Inside the engine (InnoDB), the XA
PREPARE 't1' and XA COMMIT 't1' cannot group-commit together.

Much of the extra complexity in the proposed MDEV-31949 patches comes
from this: being able to group-commit the XA PREPARE and XA COMMIT
together in the binlog, but having the XA COMMIT wait for the XA PREPARE
before committing in the engine. I think the point is that XA PREPAREs
do not commit in-order inside the engine, so InnoDB can still have large
group commits (the groups will just be a different mix of transactions
than in the binlog).

Based on all this, I expect my minimal patch to have reduced performance
(compared to the more complex patches) in benchmarks where the XA
PREPARE and XA COMMIT appear close together in the binlog, and the cost
of fsync is significant compared to the cost of running the transaction.

 - Kristian.
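
For illustration, the size-2 limit on the serial prepare/commit stream
can be reproduced with a toy Python model. This is only a sketch and not
MariaDB code: the function name and the boolean flag are made up for the
example, and it assumes groups are formed greedily in binlog order with
a single constraint, that an XA COMMIT may not share a group with its
own XA PREPARE.

```python
def group_commits(events, prepare_and_commit_may_share_group):
    """Greedily split an in-order event stream into group commits.

    events: list of (kind, xid) tuples, kind is "PREPARE" or "COMMIT".
    The flag is a hypothetical switch: False models the restriction
    described above, True models a stream without that restriction.
    """
    groups = []
    current = []
    prepared_in_current = set()  # xids whose PREPARE is in the open group
    for kind, xid in events:
        conflict = (
            kind == "COMMIT"
            and xid in prepared_in_current
            and not prepare_and_commit_may_share_group
        )
        if conflict:
            # Binlog order must be preserved, so the only option is to
            # close the current group and start a new one.
            groups.append(current)
            current = []
            prepared_in_current = set()
        current.append((kind, xid))
        if kind == "PREPARE":
            prepared_in_current.add(xid)
    if current:
        groups.append(current)
    return groups


# The serial stream: PREPARE t1, COMMIT t1, PREPARE t2, COMMIT t2, ...
stream = [(kind, f"t{i}") for i in range(1, 4) for kind in ("PREPARE", "COMMIT")]

restricted = [len(g) for g in group_commits(stream, False)]
unrestricted = [len(g) for g in group_commits(stream, True)]
print(restricted)    # [1, 2, 2, 1]
print(unrestricted)  # [6]
```

With the restriction, no group grows beyond two events however long the
stream is; without it, the whole stream fits in one group. The model
ignores timing, worker scheduling and the engine-side ordering, so it
shows only the ordering argument and says nothing about real throughput.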