Henrik, Thanks for your response! (See my replies below.) On Wed, Sep 15, 2010 at 2:33 PM, Henrik Ingo <henrik.ingo@avoinelama.fi> wrote:
On Tue, Sep 14, 2010 at 3:02 AM, MARK CALLAGHAN <mdcallag@gmail.com> wrote:
+maria-discuss as they are working in replication too
To be exact, maria-developers@ is the counterpart to internals@mysql, although everyone should be reading maria-discuss@ too. I'm now leaving both in CC.
I'm replying leaving all your addressees in my reply to this note. (I hope there aren't too many duplicates!) <snip/>
Do I understand correctly, that in the normal case you do not run with binlog on at all?
Yes. (We have tried it a few times in the past for load balancing, etc., but aren't using it present.)
You would only turn on this kind of per-database binlog when needed?
Yes--when we want to migrate a user's databases from one server to another.
If that would be the case, then I don't see why --binlog-do-db wouldn't already do exactly what you want, except that you'd require a restart to change it. So all you need to implement would be the ability to set it as a system variable and not just in the conf file. (http://dev.mysql.com/doc/refman/5.1/en/replication-options-binary-log.html#o...)
That is almost (precisely) what we want! I have actually written some local hacks where my.cnf and command-line options are also settable as system variables. I could probably do that easily. The only difference is that we want multiple output binlog files--one per database. It is common for our users (who are serviced on 100s or 1000s of physical servers in our data centers) to want/need to migrate to a different physical server--to any one of the other servers in our data centers. In that case, it seems to make the most sense to have separate binlog files, which we can copy to the destination servers if needed. If--in the period between dumping the databases on the source server and importing it onto the new server (and just before going online on the new server), we see that the binlog coordinates for a given database have changed on the source server, we can copy the binlog file(s) from the source to the destination server and run mysqlbinlog (specifying the same coordinates displayed before the initial dump) to bring the database up-to-date, then bring the customer online on the new server (and offline on the old server. On a given day, different customers on one server will often migrate to many different servers. It would be impractical to have many slave servers listening for changes on that one "source" server. Also, any one destination server might need to listen to several source masters at any given time. (This could lead to a combinatoric explosion--or at least a difficult situation for administrators/migration tools.) I'm jumping to a later line in your reply--the one about the mysqlbinlog command. Because of the large number of combinations of migration sources and destinations, we are looking at using the mysqlbinlog command on the destination servers: If a second check of the source server's single-database-binlog coordinates (filename/offset) shows a change, we will copy what binlog file(s) are necessary from the old to the new server and run mysqlbinlog--just before "flipping the switch" to bring the customer online on the new server/offline at the old server. Perhaps a different approach might be to modify (if enabled by some option) the binlog-do-db behavior (if controlled by a global variable) to open a database-specific binlog file. I've noticed in a later post by Sergei that "FLUSH TABLES WITH READ LOCK FOR DATABASE dbname" would be difficult to implement. If we decide to skip that command (in favor of a global version?) in our first iteration, we would be left with: START, SHOW coordinates, STOP--all for a single database--to a single binlog file (set) and index. I'm guessing that "UNLOCK DATABASE dbname for BINLOG" would be as difficult as the FLUSH/LOCK command ... If that is the case: What is the overhead/time involved for a mysqld with thousands of databases to run the global "FLUSH TABLES WITH READ LOCK" and "UNLOCK TABLES" whenever we want to dump a database on the source server before copying (scp) the data to the new server ...?
Alternatively, if you are also running a "global" binlog, then as Mark suggested, don't write additional per database binlogs, rather get your information from the "global" binlog and filter out what should be replicated instead by using --replicate-do-db. Again, it is not currently possible to set this on/off without restart. (http://dev.mysql.com/doc/refman/5.1/en/replication-options-slave.html#option...)
Finally, for anyone wishing to implement their own binlogging and replication, I should advertise our work in MariaDB on a generic "replication plugin API". Using this you could implement your own replication plugin that would do precisely what you want, but you could still leave the original replication code untouched. (You could of course "fork" it to make your own plugin if you wanted.) Kristian Nielsen in CC is working on that, and you can read about it here: http://askmonty.org/wiki/ReplicationProject http://askmonty.org/worklog/?tid=107 http://askmonty.org/worklog/?tid=120
I will definitely look at the above links.
henrik PS: It is not exactly the same use case, but also see mine and Igor's presentation from last years MySQL user conference. http://assets.en.oreilly.com/1/event/36/Valuable%20MariaDB%20Features%20That... ...from slide 10 onwards. Summary: You can use mysqlbinlog instead of replication and this allows all kinds of scriptable hacks to occur between master and slave.
I will read about mysqlbinlog from your slides!
henrik
If I'm overlooking some brain-dead easy solution to this migration requirement, feel free to enlighten me.!! Thanks again! Weldon