[Maria-discuss] Replication and auto-increment
I was reading over the replication discussion on maria-developers I thought I'd share some thoughts that are not so technical as I do not understand the back end that well. Our system is currently a master-master WAN replication setup, each writing to a different DB with an offline process to do copies. Our databases extensively uses auto-increments on tables (more than necessary, but that's another story). Would future developments allow synchronous replication to occur on a single database when the WAN link has a single point of failure(SPoF)? How might an auto-increment system handle such a failure? This SPoF is our network link that can be a transient problem out of our control. Both sides may need to keep inserting data during this transient issue. This is usually not an issue 99% of the time, but it's better to prepare for downtime than trying to fix data errors in that 1%.
Brian, Allow me to share how we are handling this issue. We have a customer that develop and sell a system that control employees in & out. They have many customers with multiple plants & sites. To avoid the related issue, we did a simple workaround, we add a new column at each table. The column name ;-) server_id :-) Now each insert is running like this: INSERT INTO foo SET server_id = @server_id, auto_increment_column = NULL, bla bla bla BinLog format chosen = ROW. The worse and complex scenario is a customer with 17 sites. All of them linked by ADSL Internet ;-) Yes, believe me. The problem of INSERT / DELETE / SELEC was successfuly solved. The only issue to take care is the possibility of race condition on UPDATE ops. Alexandre Em 27/01/2010, às 18:51, Brian Evans escreveu:
I was reading over the replication discussion on maria-developers I thought I'd share some thoughts that are not so technical as I do not understand the back end that well.
Our system is currently a master-master WAN replication setup, each writing to a different DB with an offline process to do copies. Our databases extensively uses auto-increments on tables (more than necessary, but that's another story).
Would future developments allow synchronous replication to occur on a single database when the WAN link has a single point of failure(SPoF)? How might an auto-increment system handle such a failure?
This SPoF is our network link that can be a transient problem out of our control. Both sides may need to keep inserting data during this transient issue. This is usually not an issue 99% of the time, but it's better to prepare for downtime than trying to fix data errors in that 1%.
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hi Alexandre, Brian On 28/01/2010, at 7:18 AM, Alexandre Almeida wrote:
Allow me to share how we are handling this issue. We have a customer that develop and sell a system that control employees in & out. They have many customers with multiple plants & sites. To avoid the related issue, we did a simple workaround, we add a new column at each table. The column name ;-) server_id :-) Now each insert is running like this: INSERT INTO foo SET server_id = @server_id, auto_increment_column = NULL, bla bla bla BinLog format chosen = ROW. The worse and complex scenario is a customer with 17 sites. All of them linked by ADSL Internet ;-) Yes, believe me. The problem of INSERT / DELETE / SELEC was successfuly solved. The only issue to take care is the possibility of race condition on UPDATE ops.
If you set the auto-increment-increment and auto-increment-offset options, you achieve the same, without the extra column. The possible race with UPDATEs remains either way, depends on the app logic.
Em 27/01/2010, às 18:51, Brian Evans escreveu: I was reading over the replication discussion on maria-developers
I thought I'd share some thoughts that are not so technical as I do not understand the back end that well.
Our system is currently a master-master WAN replication setup, each writing to a different DB with an offline process to do copies. Our databases extensively uses auto-increments on tables (more than necessary, but that's another story).
Would future developments allow synchronous replication to occur on a single database when the WAN link has a single point of failure(SPoF)? How might an auto-increment system handle such a failure?
Basically I think the problem is that you're asking about auto- increment in this context; I don't see why auto-inc would be relevant to the question? Naturally, having sync replication when a single secondary server fails is an oxymoron, since then a) you wouldn't have a copy and b) no replication active. If you'd wish to continue operations anyway in this scenario, you'd want a setup that allows the master to continue even if sync replication were to fail, relying on monitoring to alert of this situation.
This SPoF is our network link that can be a transient problem out of our control. Both sides may need to keep inserting data during this transient issue. This is usually not an issue 99% of the time, but it's better to prepare for downtime than trying to fix data errors in that 1%.
With this requirement, all you're asking for is asynchronous replication, handling auto-inc appropriately. That already exists today. Under normal circumstances, the extra thing sync repl provides is an immediate secondary copy of the changes. But with your question above you've essentially released that requirement, making it asynchronous. Regards, Arjen. -- Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) Exceptional Services for MySQL at a fixed budget. Follow our blog at http://openquery.com/blog/ OurDelta: packages for MySQL and MariaDB @ http://ourdelta.org
Hi Arjen, The idea behind the server_id column is to avoid inc_inc = 10, for example. Inc_Inc/Offset is excellent for 1, 2, or even 3 MM servers. Where things go wild, I mean, more then 3 servers, connected via ADSL we must be more creative. And if you are using MyIsam you can make server_id + auto_increment column as PK. Unfortunately, innoDB doesn't allow us to do that. The scenario I have described is a very INSERT app with helps a lot. Alexandre Em 27/01/2010, às 23:19, Arjen Lentz escreveu:
Hi Alexandre, Brian
On 28/01/2010, at 7:18 AM, Alexandre Almeida wrote:
Allow me to share how we are handling this issue. We have a customer that develop and sell a system that control employees in & out. They have many customers with multiple plants & sites. To avoid the related issue, we did a simple workaround, we add a new column at each table. The column name ;-) server_id :-) Now each insert is running like this: INSERT INTO foo SET server_id = @server_id, auto_increment_column = NULL, bla bla bla BinLog format chosen = ROW. The worse and complex scenario is a customer with 17 sites. All of them linked by ADSL Internet ;-) Yes, believe me. The problem of INSERT / DELETE / SELEC was successfuly solved. The only issue to take care is the possibility of race condition on UPDATE ops.
If you set the auto-increment-increment and auto-increment-offset options, you achieve the same, without the extra column. The possible race with UPDATEs remains either way, depends on the app logic.
Em 27/01/2010, às 18:51, Brian Evans escreveu: I was reading over the replication discussion on maria-developers
I thought I'd share some thoughts that are not so technical as I do not understand the back end that well.
Our system is currently a master-master WAN replication setup, each writing to a different DB with an offline process to do copies. Our databases extensively uses auto-increments on tables (more than necessary, but that's another story).
Would future developments allow synchronous replication to occur on a single database when the WAN link has a single point of failure(SPoF)? How might an auto-increment system handle such a failure?
Basically I think the problem is that you're asking about auto- increment in this context; I don't see why auto-inc would be relevant to the question?
Naturally, having sync replication when a single secondary server fails is an oxymoron, since then a) you wouldn't have a copy and b) no replication active. If you'd wish to continue operations anyway in this scenario, you'd want a setup that allows the master to continue even if sync replication were to fail, relying on monitoring to alert of this situation.
This SPoF is our network link that can be a transient problem out of our control. Both sides may need to keep inserting data during this transient issue. This is usually not an issue 99% of the time, but it's better to prepare for downtime than trying to fix data errors in that 1%.
With this requirement, all you're asking for is asynchronous replication, handling auto-inc appropriately. That already exists today.
Under normal circumstances, the extra thing sync repl provides is an immediate secondary copy of the changes. But with your question above you've essentially released that requirement, making it asynchronous.
Regards, Arjen. -- Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) Exceptional Services for MySQL at a fixed budget.
Follow our blog at http://openquery.com/blog/ OurDelta: packages for MySQL and MariaDB @ http://ourdelta.org
Hi Alexandre, On 28/01/2010, at 11:39 AM, Alexandre Almeida wrote:
The idea behind the server_id column is to avoid inc_inc = 10, for example. Inc_Inc/Offset is excellent for 1, 2, or even 3 MM servers. Where things go wild, I mean, more then 3 servers, connected via ADSL we must be more creative.
What's the problem with auto-increment-increment=10 ? Even if you use BIGINT UNSIGNED instead of INT UNSIGNED, it's probably more economical than an extra column. Regards, Arjen.
Em 27/01/2010, às 23:19, Arjen Lentz escreveu:
Hi Alexandre, Brian
On 28/01/2010, at 7:18 AM, Alexandre Almeida wrote:
Allow me to share how we are handling this issue. We have a customer that develop and sell a system that control employees in & out. They have many customers with multiple plants & sites. To avoid the related issue, we did a simple workaround, we add a new column at each table. The column name ;-) server_id :-) Now each insert is running like this: INSERT INTO foo SET server_id = @server_id, auto_increment_column = NULL, bla bla bla BinLog format chosen = ROW. The worse and complex scenario is a customer with 17 sites. All of them linked by ADSL Internet ;-) Yes, believe me. The problem of INSERT / DELETE / SELEC was successfuly solved. The only issue to take care is the possibility of race condition on UPDATE ops.
If you set the auto-increment-increment and auto-increment-offset options, you achieve the same, without the extra column. The possible race with UPDATEs remains either way, depends on the app logic.
Em 27/01/2010, às 18:51, Brian Evans escreveu: I was reading over the replication discussion on maria-developers
I thought I'd share some thoughts that are not so technical as I do not understand the back end that well.
Our system is currently a master-master WAN replication setup, each writing to a different DB with an offline process to do copies. Our databases extensively uses auto-increments on tables (more than necessary, but that's another story).
Would future developments allow synchronous replication to occur on a single database when the WAN link has a single point of failure(SPoF)? How might an auto-increment system handle such a failure?
Basically I think the problem is that you're asking about auto- increment in this context; I don't see why auto-inc would be relevant to the question?
Naturally, having sync replication when a single secondary server fails is an oxymoron, since then a) you wouldn't have a copy and b) no replication active. If you'd wish to continue operations anyway in this scenario, you'd want a setup that allows the master to continue even if sync replication were to fail, relying on monitoring to alert of this situation.
This SPoF is our network link that can be a transient problem out of our control. Both sides may need to keep inserting data during this transient issue. This is usually not an issue 99% of the time, but it's better to prepare for downtime than trying to fix data errors in that 1%.
With this requirement, all you're asking for is asynchronous replication, handling auto-inc appropriately. That already exists today.
Under normal circumstances, the extra thing sync repl provides is an immediate secondary copy of the changes. But with your question above you've essentially released that requirement, making it asynchronous.
Regards, Arjen. -- Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) Exceptional Services for MySQL at a fixed budget.
Follow our blog at http://openquery.com/blog/ OurDelta: packages for MySQL and MariaDB @ http://ourdelta.org
-- Arjen Lentz, Exec.Director @ Open Query (http://openquery.com) Exceptional Services for MySQL at a fixed budget. Follow our blog at http://openquery.com/blog/ OurDelta: packages for MySQL and MariaDB @ http://ourdelta.org
Hi Brian, On 1/27/10 12:51 PM PST, "Brian Evans" <grknight@scent-team.com> wrote:
I was reading over the replication discussion on maria-developers I thought I'd share some thoughts that are not so technical as I do not understand the back end that well.
Our system is currently a master-master WAN replication setup, each writing to a different DB with an offline process to do copies. Our databases extensively uses auto-increments on tables (more than necessary, but that's another story).
Would future developments allow synchronous replication to occur on a single database when the WAN link has a single point of failure(SPoF)? How might an auto-increment system handle such a failure?
This SPoF is our network link that can be a transient problem out of our control. Both sides may need to keep inserting data during this transient issue. This is usually not an issue 99% of the time, but it's better to prepare for downtime than trying to fix data errors in that 1%.
Given the requirement you just stated synchronous replication is not an option because your database will become unavailable if the WAN connection goes down. This is not a limitation of current implementations. It is a consequence of the problem of distributed consensus, which is like the halting problem for distributed systems and means that distributed databases cannot be guaranteed to have the same contents if one of them is allowed to fail. Instead there are workarounds. 1.) Have masters in both locations as you seem to describe and replicate to backup copies on the other site. 2.) Use auto-increment-offset as described by Arjen up-thread. However, this does not protect from other types of conflicts such as adding a child row with a particular FK on one site while deleting its parent row and hence PK on the other site. It just handles inserts. 3.) Rewrite your data model to eliminate conflicts. In that case you are better off dispensing with auto-increment entirely. 4.) Route updates to a single master for processing and then replicate them back to local slaves. In this case you must enqueue updates on remote sites (i.e., accept them provisionally) during an outage. They don't commit fully until the link is back up. This usually involves not just a data model rewrite but re-architecting the application. What's wrong with the current approach you are using? Unless I'm mistaken you don't have a problem with outages now because you can continue to update the masters and replicate once the link resumes. If you are looking for full data consistency across sites that problem is not solvable but others may be. Cheers, Robert
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hi!
"Brian" == Brian Evans <grknight@scent-team.com> writes:
Brian> I was reading over the replication discussion on maria-developers Brian> I thought I'd share some thoughts that are not so technical as I do not Brian> understand the back end that well. Brian> Our system is currently a master-master WAN replication setup, each Brian> writing to a different DB with an offline process to do copies. Brian> Our databases extensively uses auto-increments on tables (more than Brian> necessary, but that's another story). Brian> Would future developments allow synchronous replication to occur on a Brian> single database when the WAN link has a single point of failure(SPoF)? Brian> How might an auto-increment system handle such a failure? Brian> This SPoF is our network link that can be a transient problem out of our Brian> control. Brian> Both sides may need to keep inserting data during this transient issue. Brian> This is usually not an issue 99% of the time, but it's better to prepare Brian> for downtime than trying to fix data errors in that 1%. The easyest solution for this is to have the auto-increment as the second part of your key: CREATE TABLE (A int, B int auto_increment, key(A,B)); As long you keep A unique per system, you never have to worry about duplicate keys. Another solution is to use UUID:s instead of auto-increment. Regards, Monty
participants (5)
-
Alexandre Almeida
-
Arjen Lentz
-
Brian Evans
-
Michael Widenius
-
Robert Hodges