Re: [Maria-discuss] Per-db binlogging

14 Sep 2010

      +maria-discuss as they are working in replication too
+robert.hodges as tungsten replicator has features that could be used here

I am very interested in a subset of this. I want one SQL thread per
database on a slave (stream per database on the slave). However I
don't need to split the binlog on the master to get that and writing a
binlog file per database on the master might complicate the
sync_binlog option that keeps the binlog and InnoDB in sync during
crash recovery. How will this work in that case?

Note that a stream (SQL slave thread) per database on the slave will
break rpl_transaction_enabled. Some of us rely on that to keep slaves
consistent during crash recovery. I assume you don't use that feature.
MySQL has worklogs to use db tables in place of relay-log.info and
when that code is production ready, then this is not a problem.

On Mon, Sep 13, 2010 at 1:52 PM, Weldon Whipple <weldon@whipple.org> wrote:
...
Greetings!
My employer has asked me to implement per-DB binlogging for
MySQL. I've been working on prototypes for 3-4 weeks now, and decided
it's time to post to this list to ask for comments, suggestions,
condolences, etc. (I've been a "lurker" for quite awhile. I hope this
post isn't too far off the mark) The assignment is *only* for master-side
binlogging. (It doesn't require a slave/replication-side implementation.)
Why I Want to Do It.
-------------------
I work for an ISP that hosts (last I heard--I HOPE I'm not lying) over
2 million domains. Customers are allowed to have (almost) unlimited
MySQL databases. Migrating an account from one physical server to
another is a daunting task. (A typical server has easily 2000+
[sometimes far more] databases.) Global mysqld binlogging isn't
feasible when we want to migrate (for example) a single account to
another server.
Our proposed scenario goes something like this:
1. START USER BINLOG FOR DATABASE <db>
2. FLUSH USER TABLES WITH READ LOCK FOR DATABASE <db>
3. SHOW USER BINLOG STATUS [WHERE DATABASE LIKE <pattern>]
SHOW output might look something like:
=================================
mysql> show user binlog status;
+----------------------+----------+---------------------------------+------------+
| Database             | User     | Binlog_file                     |
Binlog_pos |
+----------------------+----------+---------------------------------+------------+
| booklibs             | wwhipple | booklibs-bin.000003             |
    248 |
| charmins_gravestones | charmins | charmins_gravestones-bin.000001 |
      4 |
+----------------------+----------+---------------------------------+------------+
2 rows in set (0.01 sec)
The "where" clause also works:
mysql> show user binlog status where user like 'charmins';
+----------------------+----------+---------------------------------+------------+
| Database             | User     | Binlog_file                     |
Binlog_pos |
+----------------------+----------+---------------------------------+------------+
| charmins_gravestones | charmins | charmins_gravestones-bin.000001 |
      4 |
+----------------------+----------+---------------------------------+------------+
1 row in set (0.01 sec)
=================================
At this point the migration tools note the binlog coordinates and
start copying the databases to the new server. (For myisam tables, the
tools writers want to just copy [scp?] the files. For innodb, they
will use something like mysqldump.)
(Note: We might want to "lose" the "User" column in the above table]
in our recent designs.)
Then the sequence of MySQL commands resumes:
4. UNLOCK USER BINLOG FOR DATABASE <db>
Finally, when migration is complete:
5. STOP USER BINLOG FOR DATABASE <db>
A high percentage of users (infrequent bloggers, etc.) go several days
(sometimes even weeks or months) without updating their databases. For
them--after migrating the files to the new server, a final check of
the binlog coordinates on the old box can verify that nothing has
changed.
If it HAS changed, migration can copy the binlog file(s) to the new
box and use the mysqlbinlog command (specifying the starting
coordinates) to bring the new database in sync with the old one.
(This scenario is probably oversimplified, ignoring things like DNS
caching, TTL, etc., etc. Hopefully it doesn't include too many lies.)
How I Plan to Do It.
-------------------
I've noticed that:
1. The class MYSQL_BIN_LOG seems to be involved in most binlogging
2. MYSQL_BIN_LOG inherits from TC_LOG and MYSQL_LOG. It is created in
mysqld.cc at startup.
My current partial implementation has:
1. Class DB_BIN_LOG inheriting from MYSQL_BIN_LOG, adding members like
 db_name
 db_binlog_name
 db_index_name
 and methods to access them, etc.
2. Singleton class USER_BIN_LOG_MGR. (Should probably be DB_BIN_LOG_MGR),
 which maintains a db_binlog_hash of information about each DB being
 binlogged. A DB_BIN_LOG_ENTRY (in the hash) contains:
   dbname
   locked (bool)--has it been locked by FLUSH ... TABLES WITH READ LOCK?
   binlog file
   binlog_pos
   is_active
   a pointer to a DB_BIN_LOG instance
 USER_BINLOG_MGR has a rw lock to protect access to its hash of
 currently binlogging db's. (Most accesses will probably be
 read-only?)
Is it plausible to identify all places where a binlog addition might
occur? In those places, if the singleton USER_BIN_LOG_MRG exists, ask
it if the current DB is being binlogged. If so, send the request to
the DB_BIN_LOG instance.
Questions I Have.
----------------
1. I notice that binlog is in the list of engines and plugins
 (displayed by SHOW ...). The bottom of log.cc has
 "mysql_declare_plugin(binlog)". Do I need one for db_plugin?
 (probably not?)
2. I've diagrammed the binary log event taxonomy--the 24 classes that
 inherit from Log_event (in log_event.cc/h). That was my main
 motivation for wanting to inherit from MYSQL_BIN_LOG.
3. On the other hand, I wonder if inheriting from MYSQL_BIN_LOG might
 be overkill for "master only" binlogging--without the
 slave/replication part of it.
4. What version should I be working on? (We're currently running
 5.1.47.) What should I check out of bzr/launchpad? (Is that still
 what we're doing with the changes in ownership?)
5. I see that THD is involved in binlogging. (Several THD methods are
 defined in log.cc.) Are there any caveats here?
6. What files should I use? A new h/cc pair? ... or just add to the
 hierarchy in log.h/cc?
In an ideal world, it would be nice to inherit polymorphically from
MYSQL_BIN_LOG, overriding methods that set the names of the binlog
file and index file (etc.) then just call the methods and have it do
the "right thing." On the other hand, I get the impression that I
might be oversimplifying.
I welcome your advice!!
What I've Implemented So Far (about 13 Prototypes).
--------------------------------------------------
(PRELIMINARY NOTE:)
Our initial plan was to call it "Per-User Binlogging". The user here
is a cPanel [control panel] user. cPanel's database naming convention
is to have all of a user's database names start with the username,
followed by an underscore, followed by a distinctive database
name. Thus all the commands above start with
<verb> USER ...
Originally most of them could end either "DATABASE <db>" or "USER
<user>" (with user meaning cPanel user). Thus, in the original plan
the first command above was:
START USER BINLOG FOR (USER <user> | DATABASE <db>)
We have realized that per-database binlogging is all we really need.
We imagine that others in our situation might want something similar.
Since not everyone uses cPanel, we should forget per[cPanel]-user
binlogging, focusing on per-database binlogging.
I have several versions of lex and bison/yacc files (sql_yacc.yy,
lex.h) to implement the above (and earlier variants). The recent
versions avoid increasing the %expect'ed 169 shift/reduce conflicts
specified in sql_yacc.yy.
(END OF PRELIMINARY NOTE)
I currently have 13 different Subversion branches with different
(partial) implementations.
1. All have some lex/bison grammer defined. (I'm not completely
satisfied with any of them.)
2. Per-db binlogging is enabled by two my.cnf (or command-line)
options:
user_binlog_dir=/full/directory/path
user_binlog_persist_file=user_binlog.info
Our servers have fast solid-state drives that we will use for per-db
binlogging. We want to specify a separate directory to store all the
per-db binlogs
The persist_file stores information about the current state of
db-binlogging--to survive a restart.
Just about every instance of "user_binlog" in this note should
probably be changed to "db_binlog" ... (???)
3. I have added cases in sql_parse.cc for
SQLCOM_START_USER_BINLOG_DB:
SQLCOM_FLUSH_AND_LOCK_DB:
SQLCOM_SHOW_USER_BINLOG_STATUS:
SQLCOM_UNLOCK_DB:
SQLCOM_STOP_USER_BINLOG_DB:
(as well as some "USER" variants like SQLCOM_START_USER_BINLOG_USER:,
which should probably abandoned.)
They should probably be changed to something like:
SQLCOM_START_DB_BINLOG:
SQLCOM_FLUSH_AND_LOCK_DB:
SQLCOM_SHOW_DB_BINLOG_STATUS:
SQLCOM_UNLOCK_DB:
SQLCOM_STOP_DB_BINLOG:
-----------------
Now that you've read it, please feel free to throw darts, etc. At this
point I'm willing to discard everytning and start over. (In the worst
case--if everyone on this list thinks per-db binlogging is a terrible
idea--I guess I'll be forced to create our own patch [sigh].)
On the other hand, I would love to help contribute to this effort. (If
per-db binlogging is already underway, let me know, and I'll likely help.)
Thanks in advance for feedback/suggestions/help. Feel free to contact
me privately if you want to.
Weldon Whipple
weldon@whipple.org
--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe:    http://lists.mysql.com/internals?unsub=mdcallag@gmail.com
-- 
Mark Callaghan
mdcallag@gmail.com

Re: [Maria-discuss] Per-db binlogging

MARK CALLAGHAN