Thanks, Sergei and everyone else, for your feedback/ideas! Well, I finished a working prototype that does everything except the locking and unlocking of databases. Someone mentioned that it's now possible to lock individual tables. Would you--anyone--be willing to point me toward places in the 5.5 code that use that locking/unlocking? I'm interested in sharing the changes with the community if there is any interest. (Any suggestions about whether I should prefer launchpad versus forge.mysql.com or some other place to share? [I once owned a project on SourceForge--I think it's still there, but not very active ...]. :-) A few comments in response to your comments: On Thu, Sep 16, 2010 at 6:34 AM, Sergei Golubchik <serg@askmonty.org> wrote:
Hi, Weldon!
On Sep 15, Weldon Whipple wrote:
See where mysql_bin_log.write is called.
Excellent information! (I was hoping it might be that easy ???). If I succeed in making a DB_BIN_LOG class derive from MYSQL_BIN_LOG (with an instance of DB_BIN_LOG for each database that is currently binlogging), I could--in theory, at least--just call something like db_binlog->write () at those places.
Better make this write method virtual and implement it in your derived class.
I did implement it as a derived class, but it was a bit more work than just reimplementing the ::write() method. (I use separate instances of DB_BINLOG--the derived class--for each database that is binlogging to its separate binlog file [sequence].) I had to have a few additional methods (extremely small) to set up for opening files named <database>-bin.<seq#>, <database>-bin.index, etc. I also maintain a db_binlog.info file that remembers the databases being binlogged across restarts. One major change was in the way the global MYSQL_BIN_LOG class is referenced. In the original code, it comes into existence at the top of binlog.cc. Before I started making my modifications, I changed MYSQL_BIN_LOG to be a pointer to MYSQL_BIN_LOG, declared in mysqld.cc (and extern'ed in mysqld.h). At the point in mysqld.cc where mysql_bin_log is first referenced I "new" the global instance of MYSQL_BIN_LOG*. I went through all the files that reference mysql_bin_log and made the necessary changes so that mysql_bin_log works as a pointer. (Example: Change "mysql_bin_log.is_active()" to "mysql_bin_log->is_active().) Also, before, referencing mysql_bin_log*, I always make sure it is non-NULL ... Then I made my changes for 1 db-per-file binlogging, using #ifdef HAVE_DB_BINLOG/#endif blocks to allow building with or without 1-db-per-file binlogging. As I'm typing this post, I'm running mysql-test-run.pl on a build without HAVE_DB_BINLOG defined and it seems to be passing all the tests. I need to write additional test cases for when HAVE_DB_BINLOG is defined ... Thanks in advance for ideas/suggestions (as well as for the ideas in the past). Weldon Whipple weldon@whipple.org (I'll leave the conversations below to help refresh your memory on this thread's past.)
Questions I Have. ----------------
1. I notice that binlog is in the list of engines and plugins (displayed by SHOW ...). The bottom of log.cc has "mysql_declare_plugin(binlog)". Do I need one for db_plugin? (probably not?)
depends. binlog needs to be declated a plugin, to be able to pretent being a storage engine, to be able to force two phase commit (2PC).
2PC is used to commit or rollback a transaction atomically in more than one engine. And MySQL only uses 2PC when a transaction spans more than one engine. But we want a transaction to be committed in the engine and written to a binlog - and it should be done atomically too. So, binlog pretends to be an engine, pretends to participate in a transaction, which forces MySQL to use 2PC which can guarantee that a transaction is either committed and written to a binlog - or not committed and not written. Reliably, even if MySQL crashes in the middle.
So, if you don't use innodb, or don't use innodb_flush_log_at_trx_commit=1 or don't use sync_binlog=1 you probably don't need to support 2PC as your setup is not crash proof anyway.
If you care about innodb<->binlog consistency in the presence of crashes, you should support 2PC in your binlog, as explained above.
We DO have a small percentage of innodb databases. If 2PC commits (rolls back, etc.) atomically to more than one engine, does that mean TWO (only)? ... or could it happen for three (specifically, innodb, MYSQL_BIN_LOG's binlog, and DB_BIN_LOG's binlog)?
See http://en.wikipedia.org/wiki/Two-phase_commit_protocol
It's two phases, not two engines, any number of engines can participate in a transaction.
But! All of them but one have to support two-phase commit. Binlog does not support it. Apparently, DB_BIN_LOG will not support it either - adding two-phase commit support to binlog will surely be beyond the scope of your work.
It means, you cannot have InnoDB, binlog, and DB_BIN_LOG to participate in a two-phase commit. But you can disable MYSQL_BIN_LOG altogether and only use DB_BIN_LOG - in that case you can have a two-phase commit between DB_BIN_LOG and InnoDB.
The two kinds of binlog files would have the same format (except that one is for only one database). Are they considered two storage engines/plugins?
If you want to write to both - yes. A transaction will need to be written to both independently, so you want them to be in sync, and MySQL uses 2PC to synchronize different data storages.
You can solve that by
1) not using MYSQL_BIN_LOG. In this case there's only InnoDB and DB_BIN_LOG to keep in sync, and InnoDB can do 2PC.
2) not making DB_BIN_LOG independent from MYSQL_BIN_LOG. For example, everything is written to binlog only. Then you have a background thread (or even a separate process) that reads binlog and writes your per-db binlogs. In that case you have only InnoDB and MYSQL_BIN_LOG to keep in sync, and MySQL already does that. Your background thread is only catching up, it is allowed to fall behind.
(I notice that the "mysql_declare_plugin(binlog)" declaration is at the bottom of log.cc. Most of the other storage engines' corresponding declarations are in files in the storage subdirectories. If I were to need a "mysql_declare_plugin(db_binlog)" declaration, could it also go at the bottom of log.cc? ... Or (since a DB_BIN_LOG is AKO [a kind of] MYSQL_BIN_LOG [in the OO sense]), are MYSQL_BIN_LOGs and DB_BIN_LOGs both kinds of binlog plugins ...?
if you want to have two plugins in one file you still need to use one mysql_declare_plugin(), not two. See example at the end of http://www.packtpub.com/article/look-inside-mysql-daemon-plugin
Regards, Sergei