Re: [Maria-developers] [Commits] Rev 4484: MDEV-4255 MIXED replication: Unique key updates are not marked unsafe, replication fails in lp:~maria-captains/maria/10.0
Sergei Golubchik <serg@mariadb.org> writes:
Kristian Nielsen, could you please review this patch?
I guess my main question here is, did you think about why you want to) fix this 1) in a post-GA release, 2) at all? One problem with adding new statements as unsafe for statement-based binlogging is that it introduces new warnings in existing applications. This is especially annoying on a slave with binlogging enabled, as every unsafe statement now causes a warning in the slave error log. Thus, it can happen that a security update for a user leaves them with an error log that constantly fills up on a busy slave. It is my understanding that statement-based binlogging (as opposed to mixed or row) is still quite popular, and there have been complaints about these warnings before. Another point is that this patch seems to fix only a corner case of a bigger problem? This particular patch is about non-transactional updates. But updating a unique key can break replication just as well on transactional tables: CREATE TABLE t1 (a INT PRIMARY KEY); INSERT INTO t1 VALUES (2), (3), (4); UPDATE t1 SET a=a-1; Since we do not have deferred constraints, this UPDATE will fail or succeed depending on the order in which rows are updated. So a statement that succeeds andgets binlogged on the master can break on the slave if the rows are updated in a different order. So any reason that you fix one part of UPDATE of unique key but not the other? (I suppose the same problem can also happen with foreign key constraints, though it might take some creativity to come up with an example where an UPDATE will conflict or not with foreign keys depending on row update order). The bug has fix version 5.5, your patch seems to be against 10.0, where do you intend to push it? I would probably have chosen to fix in 10.1, if at all, but I'm not necessarily against 5.5 or 10.0, I'm mostly just asking what the reason is for whichever version was chosen?
I don't particularly like that it explicitly invokes thd->decide_logging_format() . But there's no way around it, first time logging format is decided in lock_tables(), basically when tables are just opened. At that point we don't know what fields will be updated yet.
I am not really familiar with how binlogging is decided on the master, have not worked with that code so far. So unfortunately I am not able to say much about whether this is a problem or not. At least the call is still before the actual update of rows starts, so I suppose it should be ok...
At lp:~maria-captains/maria/10.0
------------------------------------------------------------ revno: 4484 revision-id: sergii@pisem.net-20141116234900-747qdhm9ti9xh070 parent: sergii@pisem.net-20141113161325-mdz5ox7e86iyazss fixes bug: https://mariadb.atlassian.net/browse/MDEV-4255 committer: Sergei Golubchik <sergii@pisem.net> branch nick: 10.0 timestamp: Mon 2014-11-17 00:49:00 +0100 message: MDEV-4255 MIXED replication: Unique key updates are not marked unsafe, replication fails
I think this needs a better description, especially since this can cause new warnings in slave error logs. Here is a suggestion, but feel free to use something else if you prefer: "MDEV-4255 MIXED replication: Unique key updates are not marked unsafe, replication fails Statement-based replication of UPDATE statements that modify (parts of) unique keys is not safe. If the order of row updates differ between master and slave, the slave can end up modifying a different set of rows before a unique key violation error than what was updated on the master. If the table is non-transactional, then the statement will be partially executed on the master and on the slave, but with different result, causing replication to diverge. This patch marks as unsafe for statement-based replication any UPDATE that modifies a unique key on a non-transactinal table. In mixed-mode replication, this will cause the query to be logged in row mode. In statement-mode replication, this will produce a warning about binlogging of unsafe statement."
=== modified file 'sql/share/errmsg-utf8.txt' --- a/sql/share/errmsg-utf8.txt 2014-11-13 16:13:25 +0000 +++ b/sql/share/errmsg-utf8.txt 2014-11-16 23:49:00 +0000 @@ -7111,3 +7111,5 @@ ER_SLAVE_SKIP_NOT_IN_GTID eng "When using GTID, @@sql_slave_skip_counter can not be used. Instead, setting @@gtid_slave_pos explicitly can be used to skip to after a given GTID position." ER_TABLE_DEFINITION_TOO_BIG eng "The definition for table %`s is too big" +ER_BINLOG_UNSAFE_UPDATE_UNIQUE_NONTRANS + eng "The statement is unsafe because it updates a UNIQUE key colums in a non-transactional table. This is unsafe because in the case of a conflict a non-predictable part of the table will stay updated."
(Adding a new error code in a GA release requires some care, especially if fix-version:5.5 is intended, but I'm sure you're better aware of this than most).
=== modified file 'sql/sql_lex.h' --- a/sql/sql_lex.h 2014-08-07 16:06:56 +0000 +++ b/sql/sql_lex.h 2014-11-16 23:49:00 +0000
@@ -1319,6 +1319,12 @@ class Query_tables_list */ BINLOG_STMT_UNSAFE_AUTOINC_NOT_FIRST,
+ /** + INSERT into auto-inc field which is not the first part of composed + primary key. + */ + BINLOG_STMT_UPDATE_UNIQUE_NONTRANS, +
Copy-paste error. Here, you copied the comment of the earlier entry, you surely intended to re-write the comment to describe the newly added entry. Hope this helps, - Kristian.
By the way, it would be a lot better if we could do it so that the statement is only marked unsafe if it modifies more than one row. I would guess that UPDATE of a unique key are by far most common with single-row updates, and such updates should be safe. So it would be great if they did not get affected and cause warnings in the error log. But I am not sure if that can be achieved easily. It is probably not ok to delay the decide_logging_format() call until we actually see a second row being updated - at that point it is too late to switch to row-based binlogging, we would lose the update to the first row, I think. Perhaps the optimiser can in some cases see that a query will touch only one row, eg. ... WHERE unique_key=XXX condition? If this information is available in mysql_update() somehow, it could prevent needlessly marking statements unsafe in some cases, at least. Maybe the optimiser people know if there is such information available. Well, just an idea, may or may not be feasible. - Kristian.
Hi, Kristian! (this is just a reply to this your email, I'm going to reply to the review separately) On Nov 18, Kristian Nielsen wrote:
By the way, it would be a lot better if we could do it so that the statement is only marked unsafe if it modifies more than one row. I would guess that UPDATE of a unique key are by far most common with single-row updates, and such updates should be safe. So it would be great if they did not get affected and cause warnings in the error log.
I had a different idea - only issue this warning when an actual unique conflict occurs. It can be easily combined with yours - only issue a warning if a unique conflict occurs and some rows were already updated.
But I am not sure if that can be achieved easily. It is probably not ok to delay the decide_logging_format() call until we actually see a second row being updated - at that point it is too late to switch to row-based binlogging, we would lose the update to the first row, I think.
Yes, in this case I though it'll need to switch to RBR (from MIXED) at the beginning - like in my current patch - but only issue a warning if there was a conflict. But this behavior is not consistent with existing implementations. For example, INSERT IGNORE and UPDATE IGNORE always issue an unsafe warning, not only when an actual conflict was ignored. For consistency all these cases would need to be fixed to follow the new semantics. Regards, Sergei
Hi, Kristian! On Nov 18, Kristian Nielsen wrote:
Sergei Golubchik <serg@mariadb.org> writes:
Kristian Nielsen, could you please review this patch?
I guess my main question here is, did you think about why you want to) fix this 1) in a post-GA release, 2) at all?
I wasn't sure, actually. There was a bug report. I wanted to see what it takes to fix it. But it happened out that this fix had introduced quite a few warnings in existing tests. So, I don't think this should be pushed in 10.0 in its current form.
Another point is that this patch seems to fix only a corner case of a bigger problem? This particular patch is about non-transactional updates. But updating a unique key can break replication just as well on transactional tables:
CREATE TABLE t1 (a INT PRIMARY KEY); INSERT INTO t1 VALUES (2), (3), (4); UPDATE t1 SET a=a-1;
Since we do not have deferred constraints, this UPDATE will fail or succeed depending on the order in which rows are updated. So a statement that succeeds and gets binlogged on the master can break on the slave if the rows are updated in a different order.
Indeed.
So any reason that you fix one part of UPDATE of unique key but not the other?
Because only the first one was mentioned in the bug report. I didn't think of the second one :( Anyway, it seems that we agree not to push it into 10.0. What do you think - shall it be pushed into 10.1 or not at all? Regards, Sergei
Sergei Golubchik <serg@mariadb.org> writes:
Anyway, it seems that we agree not to push it into 10.0. What do you think - shall it be pushed into 10.1 or not at all?
Well, as I said, I wouldn't have prioritised fixing that bug. However, now that you've made the patch, I think it's ok to push it to 10.1 if you want. This only concerns UPDATE of unique key on non-transactional table. That's already a fairly edge-case scenario. So pushing this to 10.1 should not make much difference one way or the other. And the patch as such looks fine. (Just fix the incorrect comment and possibly elaborate the commit message if you decide to push). - Kristian.
A note from a peanut gallery: I think issuing a warning only when an actual conflict occurs is a bad idea, because due to a specific row order conflict may not occur on the master, but occur on slaves. And that can lead to a very bad replication breakage. On Tue, Nov 18, 2014 at 2:30 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Sergei Golubchik <serg@mariadb.org> writes:
Anyway, it seems that we agree not to push it into 10.0. What do you think - shall it be pushed into 10.1 or not at all?
Well, as I said, I wouldn't have prioritised fixing that bug. However, now that you've made the patch, I think it's ok to push it to 10.1 if you want.
This only concerns UPDATE of unique key on non-transactional table. That's already a fairly edge-case scenario. So pushing this to 10.1 should not make much difference one way or the other. And the patch as such looks fine.
(Just fix the incorrect comment and possibly elaborate the commit message if you decide to push).
- Kristian.
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi, Pavel! On Nov 29, Pavel Ivanov wrote:
A note from a peanut gallery: I think issuing a warning only when an actual conflict occurs is a bad idea, because due to a specific row order conflict may not occur on the master, but occur on slaves. And that can lead to a very bad replication breakage.
Is that possible? I mean, a unique key conflict either occurs or it doesn't. I don't see yet how a row order might affect that. But anyway, that was just an idea how to reduce the amount of new warnings that this patch introduces. If this idea does more bad than good, I won't do it, it's not essential for the bug fix. Regards, Sergei
On Sat, Nov 29, 2014 at 10:23 AM, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Pavel!
On Nov 29, Pavel Ivanov wrote:
A note from a peanut gallery: I think issuing a warning only when an actual conflict occurs is a bad idea, because due to a specific row order conflict may not occur on the master, but occur on slaves. And that can lead to a very bad replication breakage.
Is that possible? I mean, a unique key conflict either occurs or it doesn't. I don't see yet how a row order might affect that.
A simple example could be like this: create table t (id int primary key, n int unique); insert into t values (1, 3); insert into t values (2, 2); update t set n = n + 1 where n >= 2; Here if master chooses to use PRIMARY index to scan the table it will update both rows without any unique key conflicts. If after that slave chooses to use the unique index on n to scan the table it will get a conflict and the statement will fail. Pavel
participants (3)
-
Kristian Nielsen
-
Pavel Ivanov
-
Sergei Golubchik