[MariaDB developers] Re: Suggestions for the problems around replication of XA in 10.5

24 Jan 2024

      andrei.elkin@pp.inet.fi wrote on 2024-01-23 21:01:
...
Howdy Kristian, Monty!
...
[KN wrote]>> Here is my idea for a design that solves most of these problems.
...
Not to disregard your text, Kristian, and also to logically split the two
subjects, let me process it in another reply tomorrow.
...
At XA PREPARE, we can still write the events to the binlog, but without a
GTID, and we do not replicate it to slaves by default. Then at XA COMMIT we
binlog a commit transaction that can be replicated normally to the slaves
without problems.
This is a less simplistic version of the mentioned patch, which reduces an
XA to a normal transaction in the binlog.
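
To make the quoted proposal concrete, here is a minimal toy model in Python. It is only a sketch of the idea as described above, not MariaDB code: the names ToyBinlog, Event, replicated_stream and pending_prepares are hypothetical, and it assumes that "not replicated by default" can be modelled as "the event has no GTID".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    kind: str             # "XA_PREPARE" or "COMMIT" (toy event kinds)
    xid: str
    changes: List[str]
    gtid: Optional[int]   # None models "written without a GTID"


@dataclass
class ToyBinlog:
    """Hypothetical in-memory stand-in for the master's binlog."""
    events: List[Event] = field(default_factory=list)
    next_gtid: int = 1

    def xa_prepare(self, xid: str, changes: List[str]) -> None:
        # Keep the changes in the binlog for recovery, but assign no GTID.
        self.events.append(Event("XA_PREPARE", xid, list(changes), None))

    def xa_commit(self, xid: str) -> None:
        # Read the changes back from the earlier PREPARE record and log
        # them as one ordinary transaction that does carry a GTID.
        prepared = next(e for e in reversed(self.events)
                        if e.kind == "XA_PREPARE" and e.xid == xid)
        self.events.append(
            Event("COMMIT", xid, list(prepared.changes), self.next_gtid))
        self.next_gtid += 1

    def replicated_stream(self) -> List[Event]:
        # Assumption: slaves only ever see events that have a GTID.
        return [e for e in self.events if e.gtid is not None]

    def pending_prepares(self) -> List[str]:
        # XIDs prepared but not yet committed; these are the ones the
        # original server would have to recover after a restart.
        committed = {e.xid for e in self.events if e.kind == "COMMIT"}
        return [e.xid for e in self.events
                if e.kind == "XA_PREPARE" and e.xid not in committed]
```

In this model nothing reaches the slaves between XA PREPARE and XA COMMIT, which is exactly the property the rest of this reply is concerned with.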
...
...
If necessary, the events for XA COMMIT can be read from
the PREPARE earlier in the binlog, eg. after server crash/restart. We
already have the binlog checkpoint mechanism to ensure that required binlog
files are preserved until no longer needed for transaction recovery.
Indeed, the XA becomes recoverable on its original host server.
...
...
This way we make external XA preserved across server restart, and all normal
replication features continue to work - mysqldump, mixed-mode, etc. Nice and
simple.
True, yet this solution, apart from losing failover
(I am backing this up below specifically), slows things down.
Big prepared XA transactions would be idling around, apparently creating
hot spots for overall execution when their commits finally arrive.
This method is just the antithesis of what I believe our development needs
to strive for, that is, to replicate everything sooner, down to an individual
statement of a trx, or a sub-statement of a big long-running one.
I always thought of doing it in connection with the optimistic parallel
execution :-).
...
...
Then optionally we can support the specific usecase of being able to recover
external XA PREPAREd transactions on a slave after failover. When enabled,
the slave can receive the XA PREPARE events and binlog them itself, without
applying. Then as part of failover, those XA PREPARE in the binlog that are
still pending can be applied, leaving them in PREPAREd state on the new
master. This way, _only_ the few transactions that need to be failed-over
need special handling, the majority can still just replicate normally.
Notice that, as an initialization part of failover, 'the few transactions'
would have to be "officially" prepared at that point. However, MDEV-32020
shows that just two are enough for a hang.
Therefore this may not be a solution for the failover case.
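
For reference, the optional failover handling quoted above can be sketched as a toy model in Python. This is an illustration only: ToySlave, keep_xa_prepare, promote and xa_recover are hypothetical names, not server options or APIs, and the model deliberately ignores the locking conflicts between prepared transactions that MDEV-32020 is about.

```python
from typing import List, Set, Tuple


class ToySlave:
    """Hypothetical slave that may log XA PREPARE events without applying."""

    def __init__(self, keep_xa_prepare: bool) -> None:
        self.keep_xa_prepare = keep_xa_prepare
        self.logged: List[Tuple[str, str]] = []  # (kind, xid), not applied
        self.committed: Set[str] = set()
        self.prepared: Set[str] = set()

    def receive(self, kind: str, xid: str) -> None:
        if kind == "XA_PREPARE":
            # Only binlog the event locally when the option is enabled;
            # it is never applied at this point.
            if self.keep_xa_prepare:
                self.logged.append((kind, xid))
        elif kind == "COMMIT":
            # Committed transactions replicate and apply normally.
            self.committed.add(xid)

    def promote(self) -> None:
        # Failover: apply every logged PREPARE that has no matching
        # COMMIT, leaving it in the prepared state on the new master.
        for _kind, xid in self.logged:
            if xid not in self.committed:
                self.prepared.add(xid)

    def xa_recover(self) -> List[str]:
        return sorted(self.prepared)
```

With keep_xa_prepare disabled, xa_recover() on the promoted slave returns an empty list, which is the default behaviour of the simpler design.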
...
...
There are different refinements and optimizations that can be added on top
of this. But the point is that this is a simple implementation that is
robust, correct, and crash-safe from the start, without needing to add
complexity and fixes on top.
Failover, with or without XA, is never a specific use case. With XA, this
sequence
...
XA  PREPARE 'x';  #=> OK to the user
*crash of master*

*failover to slave*
XA RECOVER;      #=> 'x' is in the prepared list

does not work in your simpler design.

Cheers,

Andrei
