Hi Sergei, I have tested this, and it all works as you specified, but now I have found a problem with PBXT. The problem stems from code in PBXT (in functions ha_pbxt::write_row, ha_pbxt::update_row, ha_pbxt::delete_row) which I have pasted below. The code is a hack! According to the comment I had a problem with calling: trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton), to register the start of a transaction statement. When I did this unconditionally at the start of a statement, ha_commit_trans() was not always called (and MySQL became confused, thinking a transaction is still running). On reading the MySQL code (long time ago) I noticed that ha_commit_trans() was only always called when an update was performed. So I delayed the trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton) call until write_row(), etc was called. The problem now is that MySQL now fails to recognize PBXT transactions as XA (see ha_write_row below), because mark_trx_read_write() is called before write_row() (i.e. before trans_register_ha() call). I noticed that InnoDB and NDB call trans_register_ha() unconditionally at the start of a statement. Could it be that the problem that lead to my "hack" has been fixed? Best regards, Paul PBXT code in ha_pbxt::write_row, etc. ------------------------------------------------------------ /* GOTCHA: I have a huge problem with the transaction statement. * It is not ALWAYS committed (I mean ha_commit_trans() is * not always called - for example in SELECT). * * If I call trans_register_ha() but ha_commit_trans() is not called * then MySQL thinks a transaction is still running (while * I have committed the auto-transaction in ha_pbxt::external_lock()). * * This causes all kinds of problems, like transactions * are killed when they should not be. * * To prevent this, I only inform MySQL that a transaction * has beens started when an update is performed. I have determined that * ha_commit_trans() is only guarenteed to be called if an update is done. */ if (!pb_open_tab->ot_thread->st_stat_trans) { trans_register_ha(pb_mysql_thd, FALSE, pbxt_hton); XT_PRINT0(pb_open_tab->ot_thread, "ha_pbxt::write_row trans_register_ha all=FALSE\n"); pb_open_tab->ot_thread->st_stat_trans = TRUE; } ------------------------------- int handler::ha_write_row(uchar *buf) { int error; Log_func *log_func= Write_rows_log_event::binlog_row_logging_function; DBUG_ENTER("handler::ha_write_row"); mark_trx_read_write(); if (unlikely(error= write_row(buf))) DBUG_RETURN(error); if (unlikely(error= binlog_log_row(table, 0, buf, log_func))) DBUG_RETURN(error); /* purecov: inspected */ DBUG_RETURN(0); } On Dec 2, 2009, at 12:26 PM, Sergei Golubchik wrote:
Hi, Paul!
On Dec 02, Paul McCullagh wrote:
Now, if a binlog doesn't indicate a crash (last binlog file was closed properly, showing a normal shutdown procedure) or, for example, there're no binlog files at all, the server cannot perform the recovery - it doesn't know what transactions should be committed and what transactions should be rolled back. A transaction may be prepared in one engine and already committed (or rolled back) in another. In this case the server requests the user to make the decision, the user has to request an explicit commit or rollback with --tc-heuristic-recover command-line switch.
So if I understand this correctly, tc-heuristic-recover handles a situation that should actually never occur.
Yes, that's mainly to account for a user mistake - deleting binlog after a crash or copying the live datadir from master to a slave without binlog. Or something similar.
But if there's only one XA-capable storage engine, it's always safe to rollback all prepared transactions, there can be no other engine that has them committed. That's what the ifdef-ed code does - if InnoDB is the only XA-capable storage engine on recovery without binlog it forces a rollback of all not committed transactions, preserving the pre-XA InnoDB behavior.
So, in fact, we could change this code:
to:
if (total_ha_2pc == (ulong) opt_bin_log+1) { tc_heuristic_recover= TC_HEURISTIC_RECOVER_ROLLBACK; // forcing ROLLBACK info.dry_run=FALSE; }
I intentionally did it with ifdef - the idea was that if there could *possibly* be another XA-engine, we need to go the safe way. Otherwise one could, for example, restart the server with disabled pbxt and unknowingly enable auto-rollback mode, causing inconsistent data.
Regards / Mit vielen Grüßen, Sergei
-- __ ___ ___ ____ __ / |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg@sun.com> / /|_/ / // /\ \/ /_/ / /__ Principal Software Engineer/Server Architect /_/ /_/\_, /___/\___\_\___/ Sun Microsystems GmbH, HRB München 161028 <___/ Sonnenallee 1, 85551 Kirchheim- Heimstetten Geschäftsführer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel Vorsitzender des Aufsichtsrates: Martin Häring
-- Paul McCullagh PrimeBase Technologies www.primebase.org www.blobstreaming.org pbxt.blogspot.com