+maria-discuss as they are working in replication too +robert.hodges as tungsten replicator has features that could be used here I am very interested in a subset of this. I want one SQL thread per database on a slave (stream per database on the slave). However I don't need to split the binlog on the master to get that and writing a binlog file per database on the master might complicate the sync_binlog option that keeps the binlog and InnoDB in sync during crash recovery. How will this work in that case? Note that a stream (SQL slave thread) per database on the slave will break rpl_transaction_enabled. Some of us rely on that to keep slaves consistent during crash recovery. I assume you don't use that feature. MySQL has worklogs to use db tables in place of relay-log.info and when that code is production ready, then this is not a problem. On Mon, Sep 13, 2010 at 1:52 PM, Weldon Whipple <weldon@whipple.org> wrote:
Greetings!
My employer has asked me to implement per-DB binlogging for MySQL. I've been working on prototypes for 3-4 weeks now, and decided it's time to post to this list to ask for comments, suggestions, condolences, etc. (I've been a "lurker" for quite awhile. I hope this post isn't too far off the mark) The assignment is *only* for master-side binlogging. (It doesn't require a slave/replication-side implementation.)
Why I Want to Do It. -------------------
I work for an ISP that hosts (last I heard--I HOPE I'm not lying) over 2 million domains. Customers are allowed to have (almost) unlimited MySQL databases. Migrating an account from one physical server to another is a daunting task. (A typical server has easily 2000+ [sometimes far more] databases.) Global mysqld binlogging isn't feasible when we want to migrate (for example) a single account to another server.
Our proposed scenario goes something like this:
1. START USER BINLOG FOR DATABASE <db> 2. FLUSH USER TABLES WITH READ LOCK FOR DATABASE <db> 3. SHOW USER BINLOG STATUS [WHERE DATABASE LIKE <pattern>]
SHOW output might look something like: ================================= mysql> show user binlog status; +----------------------+----------+---------------------------------+------------+ | Database | User | Binlog_file | Binlog_pos | +----------------------+----------+---------------------------------+------------+ | booklibs | wwhipple | booklibs-bin.000003 | 248 | | charmins_gravestones | charmins | charmins_gravestones-bin.000001 | 4 | +----------------------+----------+---------------------------------+------------+ 2 rows in set (0.01 sec)
The "where" clause also works:
mysql> show user binlog status where user like 'charmins'; +----------------------+----------+---------------------------------+------------+ | Database | User | Binlog_file | Binlog_pos | +----------------------+----------+---------------------------------+------------+ | charmins_gravestones | charmins | charmins_gravestones-bin.000001 | 4 | +----------------------+----------+---------------------------------+------------+ 1 row in set (0.01 sec) =================================
At this point the migration tools note the binlog coordinates and start copying the databases to the new server. (For myisam tables, the tools writers want to just copy [scp?] the files. For innodb, they will use something like mysqldump.)
(Note: We might want to "lose" the "User" column in the above table] in our recent designs.)
Then the sequence of MySQL commands resumes:
4. UNLOCK USER BINLOG FOR DATABASE <db>
Finally, when migration is complete:
5. STOP USER BINLOG FOR DATABASE <db>
A high percentage of users (infrequent bloggers, etc.) go several days (sometimes even weeks or months) without updating their databases. For them--after migrating the files to the new server, a final check of the binlog coordinates on the old box can verify that nothing has changed.
If it HAS changed, migration can copy the binlog file(s) to the new box and use the mysqlbinlog command (specifying the starting coordinates) to bring the new database in sync with the old one.
(This scenario is probably oversimplified, ignoring things like DNS caching, TTL, etc., etc. Hopefully it doesn't include too many lies.)
How I Plan to Do It. -------------------
I've noticed that:
1. The class MYSQL_BIN_LOG seems to be involved in most binlogging 2. MYSQL_BIN_LOG inherits from TC_LOG and MYSQL_LOG. It is created in mysqld.cc at startup.
My current partial implementation has:
1. Class DB_BIN_LOG inheriting from MYSQL_BIN_LOG, adding members like db_name db_binlog_name db_index_name and methods to access them, etc.
2. Singleton class USER_BIN_LOG_MGR. (Should probably be DB_BIN_LOG_MGR), which maintains a db_binlog_hash of information about each DB being binlogged. A DB_BIN_LOG_ENTRY (in the hash) contains: dbname locked (bool)--has it been locked by FLUSH ... TABLES WITH READ LOCK? binlog file binlog_pos is_active a pointer to a DB_BIN_LOG instance
USER_BINLOG_MGR has a rw lock to protect access to its hash of currently binlogging db's. (Most accesses will probably be read-only?)
Is it plausible to identify all places where a binlog addition might occur? In those places, if the singleton USER_BIN_LOG_MRG exists, ask it if the current DB is being binlogged. If so, send the request to the DB_BIN_LOG instance.
Questions I Have. ----------------
1. I notice that binlog is in the list of engines and plugins (displayed by SHOW ...). The bottom of log.cc has "mysql_declare_plugin(binlog)". Do I need one for db_plugin? (probably not?) 2. I've diagrammed the binary log event taxonomy--the 24 classes that inherit from Log_event (in log_event.cc/h). That was my main motivation for wanting to inherit from MYSQL_BIN_LOG. 3. On the other hand, I wonder if inheriting from MYSQL_BIN_LOG might be overkill for "master only" binlogging--without the slave/replication part of it. 4. What version should I be working on? (We're currently running 5.1.47.) What should I check out of bzr/launchpad? (Is that still what we're doing with the changes in ownership?) 5. I see that THD is involved in binlogging. (Several THD methods are defined in log.cc.) Are there any caveats here? 6. What files should I use? A new h/cc pair? ... or just add to the hierarchy in log.h/cc?
In an ideal world, it would be nice to inherit polymorphically from MYSQL_BIN_LOG, overriding methods that set the names of the binlog file and index file (etc.) then just call the methods and have it do the "right thing." On the other hand, I get the impression that I might be oversimplifying.
I welcome your advice!!
What I've Implemented So Far (about 13 Prototypes). --------------------------------------------------
(PRELIMINARY NOTE:)
Our initial plan was to call it "Per-User Binlogging". The user here is a cPanel [control panel] user. cPanel's database naming convention is to have all of a user's database names start with the username, followed by an underscore, followed by a distinctive database name. Thus all the commands above start with
<verb> USER ...
Originally most of them could end either "DATABASE <db>" or "USER <user>" (with user meaning cPanel user). Thus, in the original plan the first command above was:
START USER BINLOG FOR (USER <user> | DATABASE <db>)
We have realized that per-database binlogging is all we really need. We imagine that others in our situation might want something similar. Since not everyone uses cPanel, we should forget per[cPanel]-user binlogging, focusing on per-database binlogging.
I have several versions of lex and bison/yacc files (sql_yacc.yy, lex.h) to implement the above (and earlier variants). The recent versions avoid increasing the %expect'ed 169 shift/reduce conflicts specified in sql_yacc.yy.
(END OF PRELIMINARY NOTE)
I currently have 13 different Subversion branches with different (partial) implementations.
1. All have some lex/bison grammer defined. (I'm not completely satisfied with any of them.)
2. Per-db binlogging is enabled by two my.cnf (or command-line) options:
user_binlog_dir=/full/directory/path user_binlog_persist_file=user_binlog.info
Our servers have fast solid-state drives that we will use for per-db binlogging. We want to specify a separate directory to store all the per-db binlogs
The persist_file stores information about the current state of db-binlogging--to survive a restart.
Just about every instance of "user_binlog" in this note should probably be changed to "db_binlog" ... (???)
3. I have added cases in sql_parse.cc for
SQLCOM_START_USER_BINLOG_DB: SQLCOM_FLUSH_AND_LOCK_DB: SQLCOM_SHOW_USER_BINLOG_STATUS: SQLCOM_UNLOCK_DB: SQLCOM_STOP_USER_BINLOG_DB:
(as well as some "USER" variants like SQLCOM_START_USER_BINLOG_USER:, which should probably abandoned.)
They should probably be changed to something like:
SQLCOM_START_DB_BINLOG: SQLCOM_FLUSH_AND_LOCK_DB: SQLCOM_SHOW_DB_BINLOG_STATUS: SQLCOM_UNLOCK_DB: SQLCOM_STOP_DB_BINLOG:
-----------------
Now that you've read it, please feel free to throw darts, etc. At this point I'm willing to discard everytning and start over. (In the worst case--if everyone on this list thinks per-db binlogging is a terrible idea--I guess I'll be forced to create our own patch [sigh].)
On the other hand, I would love to help contribute to this effort. (If per-db binlogging is already underway, let me know, and I'll likely help.) Thanks in advance for feedback/suggestions/help. Feel free to contact me privately if you want to.
Weldon Whipple weldon@whipple.org
-- MySQL Internals Mailing List For list archives: http://lists.mysql.com/internals To unsubscribe: http://lists.mysql.com/internals?unsub=mdcallag@gmail.com
-- Mark Callaghan mdcallag@gmail.com