Hi Sergei,
It would be better to be able to commit through the Spider node. Currently this is impossible, but I think it becomes possible if xid_cache_delete is skipped when xa commit gets an error from a storage engine. Could you please tell me your opinion?
I don't understand how you can rely on in-memory xid_cache_delete. It's not persistent; if the Spider node is restarted, it will be lost anyway.
When the Spider node is restarted, Spider can register the xid in xid_cache, because hton->recover is called at server startup, and xa commit can then be run for the registered xid. But in this failed xa commit case, the xid is deleted from xid_cache. What kind of problem would there be if xid_cache_delete were skipped when xa commit gets an error from a storage engine?
I think Spider can, probably, perform an xa recovery of the data node automatically - when a node is reconnected after a crash, Spider node looks in the mysql.spider_xa table and commits/aborts transactions on the node accordingly. But it's a bit tricky, if you consider that the Spider node itself can crash. One needs to analyze carefully all cases where the data and the Spider node crash at any point during the commit sequence. I have not done that.
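A minimal sketch of that lookup, to make the idea concrete. It assumes the reconnected data node reports its prepared xids (as xa recover would) and that the status recorded in mysql.spider_xa can be read as a simple xid-to-status mapping. The function name, the status strings, and the action labels are hypothetical, and the crash cases of the Spider node itself are not covered.

```python
def plan_recovery(prepared_xids, spider_xa):
    """Decide what to do with each xid a reconnected data node still holds.

    prepared_xids: xids the data node reports as prepared (hypothetical input).
    spider_xa:     dict modelling the recorded status per xid (assumed shape).
    """
    actions = []
    for xid in prepared_xids:
        status = spider_xa.get(xid)
        if status == "COMMIT":
            # The commit decision was recorded, so finish the commit on this node.
            actions.append(("xa commit", xid))
        elif status == "ROLLBACK":
            actions.append(("xa rollback", xid))
        else:
            # No recorded decision: do not guess, leave it to a human or a tool.
            actions.append(("leave for operator", xid))
    return actions


plan = plan_recovery(["a", "b", "c"], {"a": "COMMIT", "b": "ROLLBACK"})
```

The third branch is the conservative choice: an xid without a recorded decision is left alone rather than committed or rolled back automatically.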
If the crashed Spider node can recover, there is no problem for xa recovery. If the crashed Spider node cannot recover (it is gone forever), the xids that were in use have to be obtained from an application log or something similar for recovery. An automatic xa recovery feature is planned for the future. Thank you for suggesting it to me!
With the "error during the commit", I checked what MariaDB does, and it's actually better than I thought. After a successful prepare it won't roll back the transaction in any engine. And with your node crash the transaction was, from the user's point of view, committed - it was neither rolled back, nor corrupted or partially applied. It was "virtually committed" and will be fully committed and available after the node recovery. So, it looks like it's ok to return an error in this specific case.
Thank you for reviewing!

Thanks,
Kentoku

2013/10/5 Sergei Golubchik <serg@mariadb.org>
Hi, kentoku!
On Oct 05, kentoku wrote:
Hi Sergei,
Just one question, before I could answer. What does it mean "data node is committed manually after recovery"? What exactly should the user do?
Thank you for asking about it! The xa commit sequence with crash recovery is as follows. (In this case, I am talking about 1 Spider node and 3 data nodes.) Sorry for the long explanation; the answer to "What does it mean "data node is committed manually after recovery"?" is step 3.
1. An application sends xa prepare to the Spider node.
   application -> xa prepare -> Spider node -|-> xa prepare -> data node1
                                             |-> xa prepare -> data node2
                                             |-> xa prepare -> data node3
   Success is returned to the application.
2. The application sends xa commit to the Spider node after data node2 has crashed.
   application -> xa commit -> Spider node -|-> xa commit -> data node1
                                            |-> xa commit xx data node2
                                            |-> xa commit -> data node3
   An error is returned to the application.
3. Send xa recover and xa commit manually to data node2 after it has recovered. The status of the xa transaction is recorded in the mysql.spider_xa table, so you can tell from this table whether you should commit or roll back the xa transaction. This is an operation for a human or a monitoring tool.
   -> xa commit -> data node2
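The three steps can be traced with a small simulation. This is only a sketch under stated assumptions: DataNode and SpiderNode are toy classes, not Spider or MariaDB code, xa_log is a hypothetical in-memory stand-in for the mysql.spider_xa table, and the status strings are assumed.

```python
class DataNode:
    """Toy stand-in for a data node (hypothetical, not real server code)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.prepared = set()   # prepared xids, assumed to survive a crash
        self.committed = set()

    def xa_prepare(self, xid):
        if not self.up:
            raise ConnectionError(self.name)
        self.prepared.add(xid)

    def xa_commit(self, xid):
        if not self.up:
            raise ConnectionError(self.name)
        self.prepared.remove(xid)
        self.committed.add(xid)

    def xa_recover(self):
        # Lists xids that are still prepared on this node.
        return sorted(self.prepared)


class SpiderNode:
    """Toy coordinator that forwards xa statements to every data node."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.xa_log = {}  # hypothetical model of mysql.spider_xa: xid -> status

    def xa_prepare(self, xid):
        for node in self.nodes:
            node.xa_prepare(xid)
        self.xa_log[xid] = "PREPARED"

    def xa_commit(self, xid):
        self.xa_log[xid] = "COMMIT"  # record the decision before sending it out
        failed = []
        for node in self.nodes:
            try:
                node.xa_commit(xid)
            except ConnectionError:
                failed.append(node.name)
        return failed  # a non-empty list means an error for the application


nodes = [DataNode("node1"), DataNode("node2"), DataNode("node3")]
spider = SpiderNode(nodes)

spider.xa_prepare("x1")            # step 1: prepare succeeds on all three nodes
nodes[1].up = False                # data node2 crashes
failed = spider.xa_commit("x1")    # step 2: commit fails on node2 only

nodes[1].up = True                 # data node2 recovers
pending = nodes[1].xa_recover()    # step 3: the manual operation
for xid in pending:
    if spider.xa_log[xid] == "COMMIT":
        nodes[1].xa_commit(xid)
```

After step 3, all three toy nodes hold x1 as committed, which matches the point that the transaction was neither rolled back nor partially applied.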
Regards, Sergei