Martin Kaluznik <martin.kaluznik@gmail.com> writes:
different places is not acceptable. I have decided to use the regular replication IO thread, as Kristian Nielsen suggested before, and add special handling where necessary.
Right. Being able to re-use the code for talking on the network with the master will be a huge benefit, I think.
Now, when the IO thread requests a binlog dump from the master, it sends a packet with the following structure: master position | binlog flags | server id | log name
This is the COM_BINLOG_DUMP client packet, right?
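For reference, the layout being discussed can be sketched roughly as below. This is only an illustration of the COM_BINLOG_DUMP payload (command byte, 4-byte position, 2-byte flags, 4-byte server id, then the log name running to the end of the packet, all little-endian); the surrounding protocol packet header is left out, and the helper names `build_binlog_dump` and `parse_binlog_dump` are made up for this sketch, not functions from the server source.

```python
import struct

COM_BINLOG_DUMP = 0x12  # command byte of the binlog dump request

# Fixed-size part: command (1) | position (4) | flags (2) | server id (4)
_HEADER = struct.Struct("<BIHI")


def build_binlog_dump(pos, flags, server_id, log_name):
    """Pack a COM_BINLOG_DUMP payload (hypothetical helper, no packet header)."""
    return _HEADER.pack(COM_BINLOG_DUMP, pos, flags, server_id) + log_name.encode("ascii")


def parse_binlog_dump(payload):
    """Unpack the payload the way a master would read it (hypothetical helper)."""
    cmd, pos, flags, server_id = _HEADER.unpack_from(payload, 0)
    if cmd != COM_BINLOG_DUMP:
        raise ValueError("not a COM_BINLOG_DUMP payload")
    # The log name is not length-prefixed; it is whatever remains in the packet.
    log_name = payload[_HEADER.size:].decode("ascii")
    return pos, flags, server_id, log_name
```

Because the log name simply runs to the end of the packet, appending another byte after it is not something an older master could skip cleanly, which is one reason reusing the existing flags field looks like the smaller change.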
The master needs to be told that the slave's IO thread is expecting provisioning data along with the current binlog. I have created a new BINLOG_PROVISIONING_MODE flag, but I believe that the name is incorrect and that it doesn't belong in the binlog flags field. My question is which is better for compatibility: changing "binlog flags" into a general flags field and using it to send the provisioning request to the master (as in the current implementation), or adding another byte to the packet after the log name, containing the provisioning mode flag and possibly more flags later.
It's not really a flag for the binlog, but for the COM_BINLOG_DUMP request. It seems fine to me to use that flag field. Maybe a better name could be BINLOG_DUMP_PROVISIONING_MODE.

This stuff is part of the client-server protocol, which is tricky to change due to the requirement of both forward and backwards compatibility with clients. So a minimal change of adding a flag bit seems safest. You might want to consider adding the flag at the end (0x8000) to reduce the risk of clashing with any future flag added by MySQL@Oracle (those people don't care about compatibility with the rest of the world).

The alternative is to add a new packet type (COM_SLAVE_PROVISION or whatever). MySQL 5.6 similarly added a new command for their GTID implementation (COM_BINLOG_DUMP_GTID). But I think using the existing command with a flag is fine. The provisioning replicates events the same way as a normal slave, while also interleaving new provisioning data. So it seems a natural extension of the existing protocol.

One thing that occurs to me is how to handle the case where the user tries to LOAD DATA FROM MASTER from a master that does not support the new feature. This should give an error, but the old master will just ignore the new flag and attempt to process the COM_BINLOG_DUMP in the normal way. But we can detect this error in another way, I think, for example by adding an extra query to get_master_version_and_clock().
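To make the flag-bit idea and the old-master problem concrete, here is a small sketch. BINLOG_DUMP_NON_BLOCK (0x01) is an existing flag; the value 0x8000 for BINLOG_DUMP_PROVISIONING_MODE is only the suggestion from this mail, and `old_master_view`, `wants_provisioning` and `check_master_supports_provisioning` (including the capability name it looks for) are hypothetical names for illustration, not existing server functions.

```python
BINLOG_DUMP_NON_BLOCK = 0x0001          # existing flag in the 16-bit flags field
BINLOG_DUMP_PROVISIONING_MODE = 0x8000  # assumption: proposed name and bit, not in any release

# Bits an old master understands in this sketch (a simplification).
OLD_MASTER_KNOWN_FLAGS = BINLOG_DUMP_NON_BLOCK


def old_master_view(flags):
    """An old master only acts on the bits it knows; unknown bits are silently ignored."""
    return flags & OLD_MASTER_KNOWN_FLAGS


def wants_provisioning(flags):
    """What a new master would test before interleaving provisioning data."""
    return bool(flags & BINLOG_DUMP_PROVISIONING_MODE)


def check_master_supports_provisioning(master_capabilities):
    """Hypothetical slave-side check done before sending the dump request.

    master_capabilities stands in for whatever an extra query during the
    version/clock handshake would return; the real mechanism is undecided.
    """
    if "provisioning" not in master_capabilities:
        raise RuntimeError("master does not support provisioning")
```

The point of the separate check is that the flag alone cannot produce the error: an old master sees the same request with or without the top bit set, so the slave has to find out by some other means before it starts.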
My current implementation progress can be seen at https://github.com/f4rnham/server/commits/MDEV-7502 and a diff of all commits at once at https://github.com/MariaDB/server/compare/10.1...f4rnham:MDEV-7502
I'm happy to see that you have already started to write test cases!
There is one more decision I am not sure about. What should the result of provisioning on a non-empty slave be? For example, the slave contains a table with the same name in the same database as on the master, but with different columns. Should the slave detect already existing tables and fail with an error, or silently drop everything that could cause conflicts (perhaps with an optional FORCE part in the provisioning start command)? Which of these solutions should be the default? Or maybe there is a third, better one.
Good point. It is actually a valid use case to provision into a non-empty slave, e.g. for multi-source replication, provisioning from two separate masters and then replicating from both. But for most users, provisioning into a non-empty slave will be a mistake, ending up with a useless mix of old and new data.

I think the simplest solution is to just give an error whenever a conflict is found (provisioning an object like a table which already exists on the slave). The user will be expected to DROP DATABASE/TABLE/... before running the command. Automatically dropping conflicting tables may not be that useful, as it can still leave stray tables that happened to not exist on the master.

And this point can be refined in a later version. For example, it could be nice if the slave would get the list of objects to be created at the start, so that an error about conflicts could be given before making any changes to the slave. But that is not a requirement of a first version I think, just giving the error whenever the conflict is detected is fine.

The tricky part seems to be what to do about the system tables in the `mysql` schema. Like user accounts. I am not sure about what to do there, but I suggest looking at what happens when a user uses mysqldump to dump a server and provision a new slave from it. Probably doing the same for the new provisioning feature would be reasonable. Or maybe just ignore the `mysql` schema completely - the user will in any case need to have set up accounts on the new slave-to-be, to be able to connect the slave to the master and run the LOAD DATA FROM MASTER command.

 - Kristian.
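As a rough illustration of the "error on first conflict" behaviour suggested above, the slave-side check could look something like this. It is a sketch only: `ProvisioningConflict`, `should_create` and the choice to skip the `mysql` schema entirely are assumptions taken from the options discussed in this mail, not a description of any existing code.

```python
IGNORED_SCHEMAS = {"mysql"}  # assumption: the "ignore the mysql schema" option


class ProvisioningConflict(Exception):
    """Raised when a provisioned object already exists on the slave (hypothetical)."""


def should_create(schema, name, existing_on_slave):
    """Decide what to do with one object arriving from the master.

    existing_on_slave is a set of (schema, name) pairs already present.
    Returns False for objects that are skipped, True for objects to create,
    and raises as soon as a conflict is seen, without dropping anything.
    """
    if schema in IGNORED_SCHEMAS:
        return False
    if (schema, name) in existing_on_slave:
        raise ProvisioningConflict(
            "%s.%s already exists on the slave; drop it before provisioning" % (schema, name)
        )
    return True
```

A later refinement could run the same check over the full object list up front, so that the error is reported before any change is made to the slave.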