[Maria-developers] Replicating same server_id problem
Kristian,

Currently MariaDB (as well as all previous versions of MySQL) has a very big problem related to replicating the same server_id. There is a --replicate-same-server-id flag which, as I understand it, controls two things when set to 0:

1) It doesn't allow a slave to connect to a master with the same server_id.
2) The slave ignores all binlog events in the replication stream that have the same server_id as the slave.

And this flag cannot be set to 1 when --log-slave-updates is used. And that is a big problem.

Consider the following scenario: let's say we have two servers, S1 (master) and S2 (slave). Let's say at some moment in time they are completely in sync and you bring down S2 to take a cold backup (you can even include binlogs in it). Then you bring it back up; S1 is still master. Now you execute some transactions, then you do a failover, make S2 master and execute some more transactions. Then you bring down S1, restore it from the backup taken earlier and connect it to replicate from S2 again. At this point S1 will have to replay some transactions that have the same server_id as S1, but S1 doesn't have those transactions yet, so it shouldn't skip them, but it will.

What do you think about how this should be fixed? As I understand it, you explicitly wanted to support replication cycles, so you still want the skipping of transactions with the same server_id to exist. But the situation above is a valid production use case. Maybe in the GTID world it can be solved better? E.g. if a transaction has the same server_id, but the GTID wasn't applied yet, then it shouldn't be skipped?

Thank you,
Pavel
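The failover-and-restore scenario above can be sketched as a small Python simulation. This is only an illustration, not MariaDB code: the `Event` class, the `apply_stream` function and the transaction names t1..t3 are made up for the sketch, and it assumes S2 runs with --log-slave-updates so that S1's transactions appear in S2's binlog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    server_id: int  # server that originally executed the transaction
    name: str       # label used only in this sketch


def apply_stream(own_server_id, stream, replicate_same_server_id=False):
    """Return (applied, skipped) under plain server_id filtering."""
    applied, skipped = [], []
    for ev in stream:
        if ev.server_id == own_server_id and not replicate_same_server_id:
            skipped.append(ev.name)  # treated as "already mine", so ignored
        else:
            applied.append(ev.name)
    return applied, skipped


S1, S2 = 1, 2
# After the backup: S1 (still master) runs t1 and t2, then S2 becomes
# master and runs t3. All three end up in S2's binlog.
s2_binlog_after_backup = [Event(S1, "t1"), Event(S1, "t2"), Event(S2, "t3")]

# S1 is restored from the backup, so it has none of t1..t3,
# and then replicates from S2.
applied, skipped = apply_stream(S1, s2_binlog_after_backup)
print("applied:", applied)
print("skipped:", skipped)
```

In this sketch the restored S1 applies only t3 and silently drops t1 and t2, which is the data loss described in the message.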
Pavel Ivanov <pivanof@google.com> writes:
--replicate-same-server-id flag which as I understand (when set to 0) controls two things: 1) It doesn't allow slave to connect to a master with the same server_id. 2) Slave ignores all binlog events in the replication stream that have the same server_id as slave. And this flag cannot be set to 1 when --log-slave-updates is used. And that is a big problem.
Hm, I was not aware of this. It seems wrong. For (1), I don't think slave should ever be allowed to connect to server with the same server_id. And for the reason you mentioned, it seems wrong that --log-slave-updates and --replicate-same-server-id can not be used together. After all, --replicate-same-server-id is only a problem in ring topologies.

It does not really seem related to GTID though, the exact same problems would occur when using old-style replication.

Of course an easy work-around is to change the server id on the restored server S1, but the problem is if one is not aware of this ahead of time...

On the other hand, in GTID strict mode, the problem of creating a loop does not exist. Any attempt to binlog an event that is already in the binlog will cause an error. So it would make sense to allow --replicate-same-server-id together with --log-slave-updates when GTID strict mode is enabled.

On the other hand, I would be tempted to just allow the two to be used together freely - users that want to do ring topologies must in any case be very aware of all the possible pitfalls.
What do you think about how this should be fixed? As I understand you explicitly wanted to support replication cycles, so you still want the skipping of transactions with the same server_id to exist. But the situation above is a valid production use case. Maybe in GTID world it can be solved better? E.g. if transaction has the same server_id, but the GTID wasn't applied yet then it shouldn't be skipped?
The main problem I see is what should be the default? I suppose we cannot safely change the default for --replicate-same-server-id. On the other hand, if users explicitly set --replicate-same-server-id=0, then it really does not seem correct that some events with same server id are nevertheless replicated depending on some complicated GTID semantics. So the curse of backwards compatibility seems to hit here...

Maybe in GTID strict mode we could make it an error if we are about to skip an event with our own server_id that has a higher seq_no than what we have in our binlog. Then we at least get safe behaviour in strict mode in non-ring topologies.

With respect to ring topologies, I frankly find them quite dangerous to rely on, and for now I am mainly concerned with making sure that anything that worked in 5.5 will continue to work in 10.0.

- Kristian.
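The strict-mode idea above (turn a would-be skip into an error when the event is newer than anything in the local binlog) might look roughly like this in Python. It is a sketch of the suggestion only: `check_skip` and `SkippedUnseenEventError` are hypothetical names, not MariaDB functions or error codes, and the per-domain handling of seq_no is left out for brevity.

```python
class SkippedUnseenEventError(Exception):
    """Hypothetical error for this sketch; not a real MariaDB error."""


def check_skip(own_server_id, highest_own_seq_no, event_server_id, event_seq_no):
    """Decide what to do with an event while --replicate-same-server-id=0.

    Returns "apply" or "skip", or raises if the event would be skipped
    even though the local binlog has never seen it.
    """
    if event_server_id != own_server_id:
        return "apply"
    if event_seq_no > highest_own_seq_no:
        # Our own server_id, but newer than our binlog: skipping it
        # would silently lose a transaction, so stop instead.
        raise SkippedUnseenEventError(
            f"event {event_server_id}-{event_seq_no} is not in the local binlog"
        )
    return "skip"


print(check_skip(1, 10, 2, 11))  # event from another server
print(check_skip(1, 10, 1, 10))  # own event that is already in the binlog
```

With these inputs, an own-server event with seq_no 11 would raise instead of being skipped, so a restored server stops replication rather than quietly diverging.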
Kristian Nielsen <knielsen@knielsen-hq.org> writes:
The main problem I see is what should be the default? I suppose we cannot safely change the default for --replicate-same-server-id. On the other hand, if users explicitly set --replicate-same-server-id=0, then it really does not seem correct that some events with same server id are nevertheless replicated depending on some complicated GTID semantics. So the curse of backwards compatibility seems to hit here...
Hm, on second thoughts...

CHANGE MASTER TO ... master_use_gtid is a new feature, so backwards compatibility does not really apply. And in GTID mode, we have a better way to prevent loops - the seq_no. Even in non-strict mode this can be relied on for our own server_id, at least.

So perhaps it is better to just say that --replicate-same-server-id does not apply to GTID mode at all. Instead, in GTID mode, if we receive an event with our own server_id and smaller seq_no than what we already have, we skip it. Otherwise we apply it. That should be correct in the important use-case that you mentioned (restoring old backup), as well as in ring topologies.

I need to think on this a bit more to be sure, but it seems like a possible solution.

- Kristian.
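The rule proposed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the eventual implementation: `GtidSameServerFilter` is a hypothetical helper, events are reduced to (server_id, seq_no) pairs, an event with a seq_no equal to the highest one already seen is assumed to be skipped as well, and the GTID domain is ignored.

```python
class GtidSameServerFilter:
    """Hypothetical helper for this sketch; not part of MariaDB."""

    def __init__(self, own_server_id, highest_own_seq_no):
        self.own_server_id = own_server_id
        self.highest_own_seq_no = highest_own_seq_no

    def should_apply(self, server_id, seq_no):
        if server_id != self.own_server_id:
            return True  # events from other servers are not filtered here
        if seq_no <= self.highest_own_seq_no:
            return False  # already have it, e.g. it came back around a ring
        # Our own server_id but newer than anything we have: apply it.
        self.highest_own_seq_no = seq_no
        return True


# Restored S1 (server_id 1) has its own events only up to seq_no 100.
s1 = GtidSameServerFilter(own_server_id=1, highest_own_seq_no=100)
stream_from_s2 = [(1, 100), (1, 101), (1, 102), (2, 103)]
decisions = [(sid, seq, s1.should_apply(sid, seq)) for sid, seq in stream_from_s2]
for decision in decisions:
    print(decision)
```

In the restored-backup case the events 1-101 and 1-102 are applied because S1 has not seen them, while 1-100 is skipped. If the same events later came back around a ring they would be skipped, since by then their seq_no is no longer higher than what S1 has.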
Kristian,

Did you figure out what would be the best solution here?

Thank you,
Pavel

On Tue, Sep 3, 2013 at 2:05 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Kristian Nielsen <knielsen@knielsen-hq.org> writes:
The main problem I see is what should be the default? I suppose we cannot safely change the default for --replicate-same-server-id. On the other hand, if users explicitly set --replicate-same-server-id=0, then it really does not seem correct that some events with same server id are nevertheless replicated depending on some complicated GTID semantics. So the curse of backwards compatibility seems to hit here...
Hm, on second thoughts...
CHANGE MASTER TO ... master_use_gtid is a new feature, so backwards compatibility does not really apply. And in GTID mode, we have a better way to prevent loops - the seq_no. Even in non-strict mode this can be relied on for our own server_id, at least.
So perhaps it is better to just say that --replicate-same-server-id does not apply to GTID mode at all. Instead, in GTID mode, if we receive an event with our own server_id and smaller seq_no than what we already have, we skip it. Otherwise we apply it. That should be correct in the important use-case that you mentioned (restoring old backup), as well as in ring topologies.
I need to think on this a bit more to be sure, but it seems like a possible solution.
- Kristian.
See: http://developers.slashdot.org/story/13/09/09/2259206/a-tale-of-two-mysql-bu...

Great work!

--
Jan Lindström
Principal Engineer
MariaDB | MaxScale
Pavel Ivanov <pivanof@google.com> writes:
Kristian Nielsen <knielsen@knielsen-hq.org> writes:
So perhaps it is better to just say that --replicate-same-server-id does not apply to GTID mode at all. Instead, in GTID mode, if we receive an event with our own server_id and smaller seq_no than what we already have, we skip it. Otherwise we apply it. That should be correct in the important use-case that you mentioned (restoring old backup), as well as in ring topologies.
Did you figure out what would be the best solution here?
I'm working on other stuff at the moment, but it still seems the best idea I could think of ... - Kristian.
Hi!
"Pavel" == Pavel Ivanov <pivanof@google.com> writes:
Pavel> Kristian,
Pavel> Currently MariaDB (as well as MySQL of all previous versions) has a
Pavel> very big problem related to replicating same server_id. There is
Pavel> --replicate-same-server-id flag which as I understand (when set to 0)
Pavel> controls two things:
Pavel> 1) It doesn't allow slave to connect to a master with the same server_id.
Pavel> 2) Slave ignores all binlog events in the replication stream that have
Pavel> the same server_id as slave.
Pavel> And this flag cannot be set to 1 when --log-slave-updates is used. And
Pavel> that is a big problem.
Pavel> Consider the following scenario: let's say we have two servers S1
Pavel> (master) and S2 (slave). Let's say at some moment in time they are
Pavel> completely in sync and you bring down S2 to take cold backup (you can
Pavel> even include binlogs in it). Then you bring it back up, S1 is still
Pavel> master. Now you execute some transactions, then you do a failover,
Pavel> make S2 master and execute some more transactions.

The above is all ok.

Pavel> Then you bring down
Pavel> S1, restore it from the backup taken earlier and connect to replicate
Pavel> from S2 again.

The above is not ok and has never been supported before in MySQL/MariaDB.

What one should do is to use S2 to set up a new S1 or change the server id on S1.

The reason is that you can't logically get the above to work safely with server ids in all scenarios.

An example:

Assume you have a ring-replication setup between S1 and S2.

If you now restore S1 to an older state, you can't know which of the events that S1 gets from S2 have already been applied.

Here is an example:

A) S1 sends one event, S1.1, to S2
B) backup
C) S1 sends one event, S1.2, to S2
D) S2 sends events S2.1, S1.1 and S1.2 to S1

If you restore S1 to state B and start replication, the data from D) will be sent to S1, but based on the server_id alone it's not possible to know that S1.1 has to be skipped and S1.2 has to be executed.

With GTID we can do things better.
knielsen> Maybe in GTID strict mode we could make it an error if we are about to skip an
knielsen> event with our own server_id that has a higher seq_no than what we have in our
knielsen> binlog. Then we at least get safe behaviour in strict mode in non-ring
knielsen> topologies.

Wouldn't it be safe to just give a warning that we have found already applied events and then skip them?

Regards,
Monty
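Monty's A) to D) example from the message above can be replayed in a few lines of Python to show why neither setting of --replicate-same-server-id gives the right result on its own. The `replay` function and the (origin, name) event tuples are made up for this sketch.

```python
def replay(applied, stream, own_server, replicate_same_server_id):
    """Replay `stream` on a server that has already applied `applied`."""
    result = list(applied)
    for origin, name in stream:
        if origin == own_server and not replicate_same_server_id:
            continue  # filtered purely on the originating server
        result.append(name)
    return result


# State B: the backup of S1 contains S1.1 but not S1.2.
state_b = ["S1.1"]
# Step D: what S2 sends back to S1.
stream_d = [("S2", "S2.1"), ("S1", "S1.1"), ("S1", "S1.2")]

with_filter = replay(state_b, stream_d, "S1", replicate_same_server_id=False)
without_filter = replay(state_b, stream_d, "S1", replicate_same_server_id=True)

print("filter on: ", with_filter)
print("filter off:", without_filter)
```

With the filter on, S1.2 is lost; with it off, S1.1 is applied a second time. The desired outcome (S1.1, S2.1 and S1.2, each exactly once) needs more information than the server id, which is what the GTID seq_no provides.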
On Mon, Sep 30, 2013 at 11:47 PM, Michael Widenius <monty@askmonty.org> wrote:
Pavel> Kristian, Pavel> Currently MariaDB (as well as MySQL of all previous versions) has a Pavel> very big problem related to replicating same server_id. There is Pavel> --replicate-same-server-id flag which as I understand (when set to 0) Pavel> controls two things: Pavel> 1) It doesn't allow slave to connect to a master with the same server_id. Pavel> 2) Slave ignores all binlog events in the replication stream that have Pavel> the same server_id as slave. Pavel> And this flag cannot be set to 1 when --log-slave-updates is used. And Pavel> that is a big problem.
Pavel> Consider the following scenario: let's say we have two servers S1 Pavel> (master) and S2 (slave). Let's say at some moment in time they are Pavel> completely in sync and you bring down S2 to take cold backup (you can Pavel> even include binlogs in it). Then you bring it back up, S1 is still Pavel> master. Now you execute some transactions, then you do a failover, Pavel> make S2 master and execute some more transactions.
The above is all ok.
Pavel> Then you bring down Pavel> S1, restore it from the backup taken earlier and connect to replicate Pavel> from S2 again.
The above is not ok and has never been supported before in MySQL/MariaDB.
What one should do is to use S2 to set up a new S1 or change the server id on S1.
Unfortunately both pieces of advice are unacceptable in highly available production environments.
- Using S2 to set up a new S1 means we have to bring the database down completely for a prolonged period of time, which doesn't line up with high availability at all.
- Changing the server_id for S1 means we have to remember all server ids that ever were a master for the database. When every master failover and server restart is a manual process this could be feasible, but in automated environments this is virtually impossible.
The reason is that you can't logically get the above to work safely with server ids in all scenarios.
An example:
Assume you have a ring-replication setup between S1 and S2.
I believe circular replication is ill-advised and that it's impossible to build any sane production system based on it (and I would be glad to hear about any examples to the contrary). So I would love to see some flag that disables any possibility of circular replication, along with removing any features that exist only to facilitate such a configuration...
If you now restore S1 to an older state, you can't know which of the events that S1 gets from S2 have already been applied.
Here is an example:
A) S1 sends one event, S1.1, to S2
B) backup
C) S1 sends one event, S1.2, to S2
D) S2 sends events S2.1, S1.1 and S1.2 to S1
If you restore S1 to state B and start replication, the data from D) will be sent to S1, but based on the server_id alone it's not possible to know that S1.1 has to be skipped and S1.2 has to be executed.
With GTID we can do things better.
Are you suggesting that currently, if slaves always connect to the master using GTID, something can be implemented that will allow replaying binlog events with the same server id without turning on the --replicate-same-server-id flag?
knielsen> Maybe in GTID strict mode we could make it an error if we are about to skip an knielsen> event with our own server_id that has a higher seq_no than what we have in our knielsen> binlog. Then we at least get safe behaviour in strict mode in non-ring knielsen> topologies.
Wouldn't it be safe to just give a warning that we have found already applied events and then skip them?
Thank you, Pavel
Hi!
"Pavel" == Pavel Ivanov <pivanof@google.com> writes:
<cut>
What one should do is to use S2 to set up a new S1 or change the server id on S1.
Pavel> Unfortunately both advices are unacceptable in highly available
Pavel> production environments.
Pavel> - Using S2 to setup a new S1 means we have to bring down database
Pavel> completely for a prolonged period of time which doesn't line up with
Pavel> high availability at all.

The way people are doing it now:

- Taking a snapshot of the file system of S2 and using that as a base. This of course works only for some file systems and setups.
- Having an S3 replica, either after S1 or S2. Taking this down and using it as the backup works for most.

Pavel> - Changing server_id for S1 means we have to remember all server ids
Pavel> that ever were a master for the database. When any master failover and
Pavel> server restart is a manual process this could be feasible, but in
Pavel> automated environments this is virtually impossible.

You only have to avoid those server_ids that are 'active' (i.e., in a binary log file that you will read). If you rotate your binary log file once a week, there should be many easy ways to assign and reuse server_ids.

But I agree that this is not a long-term solution that works for everyone.
The reason is that you can't logically get the above to work safely with server ids in all scenarios.
An example:
Assume you have a ring-replication setup between S1 and S2.
Pavel> I believe the circular replication is ill-advised and it's impossible
Pavel> to build any sane production system based on it (and I would be glad
Pavel> to hear about any examples to the contrary). So I would love to see
Pavel> some flag that disables any possibility of circular replication along
Pavel> with removing any features that exist only to facilitate such
Pavel> configuration...

There are a LOT of MySQL and MariaDB users that are using circular replication. As far as I know, Yahoo is using this to replicate coast to coast.
If you now restore S1 to an older state, you can't know which of the events that S1 gets from S2 have already been applied.
Here is an example:
A) S1 sends one event, S1.1, to S2
B) backup
C) S1 sends one event, S1.2, to S2
D) S2 sends events S2.1, S1.1 and S1.2 to S1
If you restore S1 to state B and start replication, the data from D) will be sent to S1, but based on the server_id alone it's not possible to know that S1.1 has to be skipped and S1.2 has to be executed.
With GTID we can do things better.
Pavel> Are you suggesting that currently if slaves always connect to master
Pavel> using GTID something can be implemented that will allow to re-play
Pavel> binlog events with the same server id without turning on
Pavel> --replicate-same-server-id flag?

What I was saying is that in MariaDB/MySQL 5.5 this never worked, and one could never get this to work safely in your setup.

With GTID we know the state of the master better, and we should be able to add a bit of code to ignore events that we know we have already executed.

Regards,
Monty
participants (4)
- Jan Lindström
- Kristian Nielsen
- Michael Widenius
- Pavel Ivanov