Hi Sergey,

>> It is better to be able to commit through Spider node. Currently it is
>> impossible, but I think it is possible if xid_cache_delete is skipped when
>> xa commit gets an error from a storage engine.
>> Could you please tell me your opinion?

> I don't understand how you can rely on in-memory xid_cache_delete. It's
> not persistent, if the Spider node is restarted, it will be lost anyway.

When the Spider node is restarted, Spider can register the xid in xid_cache again, because hton->recover is called at server startup, and xa commit can then be run for the registered xid. But in the case where xa commit fails, the xid is deleted from xid_cache. What kind of problem would there be if xid_cache_delete were skipped when xa commit gets an error from a storage engine?
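
To illustrate what I mean, here is a minimal sketch in Python. It is only a toy model, not MariaDB code: the XidCache class, the EngineError exception and the keep_on_error flag are hypothetical names made up for this example. It only shows why a later xa commit can succeed if the xid stays registered after an engine error.

```python
class EngineError(Exception):
    """Raised by the hypothetical engine when a data node is unreachable."""


class XidCache:
    """Toy stand-in for the in-memory xid cache (not the real server code)."""

    def __init__(self):
        self.prepared = set()

    def xa_commit(self, xid, engine_commit, keep_on_error):
        if xid not in self.prepared:
            raise KeyError(f"unknown xid: {xid}")
        try:
            engine_commit(xid)
        except EngineError:
            if not keep_on_error:
                # Behaviour described above: the xid is dropped on error.
                self.prepared.discard(xid)
            raise
        # Successful commit: the xid is no longer needed.
        self.prepared.discard(xid)


def make_engine(failures):
    """Return a commit function that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def commit(xid):
        if state["left"] > 0:
            state["left"] -= 1
            raise EngineError("data node is down")

    return commit


def retry_succeeds(keep_on_error):
    """Commit once while the node is down, then retry after it is back."""
    cache = XidCache()
    cache.prepared.add("trx1")
    engine = make_engine(failures=1)
    try:
        cache.xa_commit("trx1", engine, keep_on_error)
    except EngineError:
        pass  # first attempt fails because the node is down
    try:
        cache.xa_commit("trx1", engine, keep_on_error)
        return True
    except KeyError:
        return False  # the xid was already deleted, so no retry is possible
```

With keep_on_error=False the second xa commit is rejected because the xid is already gone; with keep_on_error=True the retry goes through.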

> I think Spider can, probably, perform an xa recovery of the data node
> automatically - when a node is reconnected after a crash, Spider node
> looks in the mysql.spider_xa table and commits/aborts transactions on
> the node accordingly. But it's a bit tricky, if you consider that the
> Spider node itself can crash. One needs to analyze carefully all cases
> where the data and the Spider node crash at any point during the
> commit sequence. I have not done that.

If the crashed Spider node can recover, xa recovery is no problem. If the crashed Spider node cannot recover (it is gone forever), the used xids have to be obtained from the application log or something similar in order to recover. An automatic xa recovery feature is planned for the future. Thank you for suggesting it to me!
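
As a rough sketch of how such a recovery could look (whether done by hand today or automatically later), here is a small Python simulation of 1 Spider node and 3 data nodes. Everything in it is an assumption for illustration: DataNode, spider_commit and recover_node are hypothetical names, and the xa_log dict is only a stand-in for the mysql.spider_xa table, whose real layout is not modelled. Only the commit case is shown; rollback is omitted.

```python
class DataNode:
    """Toy data node; prepared xids are assumed to survive a crash."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.prepared = set()
        self.committed = set()

    def xa_prepare(self, xid):
        self.prepared.add(xid)

    def xa_commit(self, xid):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.prepared.discard(xid)
        self.committed.add(xid)

    def xa_recover(self):
        # Lists xids that are prepared but not yet committed.
        return sorted(self.prepared)


def spider_commit(xid, nodes, xa_log):
    """Record the decision, then send xa commit to every data node."""
    xa_log[xid] = "COMMIT"  # stand-in for a row in mysql.spider_xa
    failed = []
    for node in nodes:
        try:
            node.xa_commit(xid)
        except ConnectionError:
            failed.append(node.name)
    return failed  # non-empty means an error goes back to the application


def recover_node(node, xa_log):
    """After the node is back, commit what the log says was committed."""
    resolved = []
    for xid in node.xa_recover():
        if xa_log.get(xid) == "COMMIT":
            node.xa_commit(xid)
            resolved.append(xid)
    return resolved


def demo():
    nodes = [DataNode(f"node{i}") for i in (1, 2, 3)]
    xa_log = {}
    for node in nodes:
        node.xa_prepare("trx1")
    nodes[1].up = False  # data node2 crashes after prepare
    failed = spider_commit("trx1", nodes, xa_log)
    nodes[1].up = True  # data node2 comes back
    resolved = recover_node(nodes[1], xa_log)
    return failed, resolved, all("trx1" in n.committed for n in nodes)
```

The sketch assumes the Spider node and its log survive. If they do not, the xids would have to come from somewhere else, such as the application log mentioned above.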

> With the "error during the commit", I checked what MariaDB does, it's
> actually better than I thought. After successful prepare it won't rollback
> the transaction in any engine. And with your node crash the transaction
> was, from user point of view, committed - it was neither rolled back,
> nor corrupted or partially applied. It was "virtually committed" and
> will be fully committed and available after the node recovery.
> So, it looks like it's ok to return an error in this specific case.

Thank you for reviewing!

Thanks,
Kentoku



2013/10/5 Sergei Golubchik <serg@mariadb.org>
Hi, kentoku!

On Oct 05, kentoku wrote:
> Hi Sergei,
>
> > Just one question, before I could answer.
> > What does it mean "data node is committed manually after recovery"?
> > What exactly should the user do?
>
> Thank you for asking!
> The xa commit sequence with crash recovery is as follows. (In this case
> I am talking about 1 Spider node and 3 data nodes.) Sorry for the long
> explanation; the answer to "What does it mean "data node is committed
> manually after recovery"?" is step 3.
>
> 1. An application sends xa prepare to the Spider node.
> application -> xa prepare -> Spider node -|-> xa prepare -> data node1
>                                           |-> xa prepare -> data node2
>                                           |-> xa prepare -> data node3
> Return success to the application.
>
> 2. An application sends xa commit to the Spider node after data node2 has crashed.
> application -> xa commit -> Spider node -|-> xa commit -> data node1
>                                          |-> xa commit xx data node2
>                                          |-> xa commit -> data node3
> Return an error to the application.
>
> 3. Send xa recover and xa commit manually to data node2 after it recovers.
>     The status of the xa transaction is recorded in the mysql.spider_xa
> table, so you can tell from this table whether you should commit or roll
> back the xa transaction.
>     This is an operation for a human or a monitoring tool.
>                                            -> xa commit -> data node2
>
> It is better to be able to commit through Spider node. Currently it is
> impossible, but I think it is possible if xid_cache_delete is skipped when
> xa commit gets an error from a storage engine.
> Could you please tell me your opinion?

I don't understand how you can rely on in-memory xid_cache_delete. It's
not persistent, if the Spider node is restarted, it will be lost anyway.

I think Spider can, probably, perform an xa recovery of the data node
automatically - when a node is reconnected after a crash, Spider node
looks in the mysql.spider_xa table and commits/aborts transactions on
the node accordingly. But it's a bit tricky, if you consider that the
Spider node itself can crash. One needs to analyze carefully all cases
where the data and the Spider node crash at any point during the
commit sequence. I have not done that.

With the "error during the commit", I checked what MariaDB does, it's
actually better than I thought. After successful prepare it won't rollback
the transaction in any engine. And with your node crash the transaction
was, from user point of view, committed - it was neither rolled back,
nor corrupted or partially applied. It was "virtually committed" and
will be fully committed and available after the node recovery.
So, it looks like it's ok to return an error in this specific case.

Regards,
Sergei