Re: [Maria-developers] MariaDB 10.0.14 got rid of replication safety warning without warning in release notes?

18 Oct 2014


      As I have reports of people not seeing this email on maria-developers@
list (probably because Monty sent it from the wrong address) I'm
forwarding it to the list. I'll respond to it in detail later.

On Fri, Oct 17, 2014 at 1:43 PM, Michael Widenius
<michael.widenius@gmail.com> wrote:
...
Hi!
First sorry for the delayed reply;  I am at a conference in Oxford and
didn't have an reliable internet connection until now.
...
...
...
...
...
"Pavel" == Pavel Ivanov <pivanof@google.com> writes:
Pavel> Hi,
Pavel> I see that in MariaDB 10.0.14 Michael disabled replication safety
Pavel> warning for the case of INSERT ON DUPLICATE KEY UPDATE with several
Pavel> unique keys with a comment "We should assume that the store engine
Pavel> will report the first duplicate key for this case."
Pavel> http://bazaar.launchpad.net/~maria-captains/maria/10.0/revision/4400.1.3
Pavel> What does that mean?
That we don't anymore give a warning for things that can't happen in
the distributed MariaDB server.
Pabel> Which key is "the first duplicate key".
The first key that has a duplicate value in the order of keys in the KEY
structure that MariaDB provides for the storage engine.
This is the same order that keys are presented in SHOW CREATE TABLE
Pavel> and why
Pavel> MariaDB relies on an engine behavior which AFAIK is not guaranteed?
This is guaranteed for all storage engines shipped with MariaDB and
should also be guaranteed for any storage engine that wants to follow
the MariaDB and even the original MySQL storage engine interface.
The change is also in line with how I designed the part of the storage
engine interface that gives the SQL layer the information about which
key that was conflicted.  As we use this for things like INSERT ... ON
DUPLICATE KEY, it's critical that all storage engine provides the same
information in case of conflicts.
To help engine writes we are constantly adding new test in the storage
engine test suite, to make it easier for storage engine vendors to
test that their storage engine follows the interface.
We have now added a test to check for the correct behaviour for the
case of INSERT ... ON DUPLICATE KEY.
What is important when it comes to the storage engine interface is
that in all possible cases, a query should work identically against 2
different storage engines.
The only 'allowed difference's that I can think of for the moment are:
- Different storage engines may retrieve rows in different order.
- In case of errors, changes to non transactional tables will not roll
  back.
- Automatically generated key values may be different between storage engines.
  (but preferably they should not be that). However when things are
  replicated the slave, it should get the same values as the master.
In the case of INSERT ... ON DUPLICATE KEY, it was clear that someone
had broke the storage engine interface, for no good reason.  Instead
of users having to depend on 'unsafe, different and undocumented
behavior' for which row as updated, it's better to fix this issue once
and for all, especially as fixing this is trivial.
Do you have or use a storage engine that doesn't provide the expected value
for the conflicting key ?
If yes, there is a couple of ways to fix this. I am happy to talk with
you to find a solution that will satisfy your needs and ours.
I underatand that in many cases it's not critical which unique key the
storage engine reports as duplicate. However in some cases, it's
critical to get the first key.
I could add a new extra call to inform the engine that we now require
the first duplicate key and not just any conflicting key.  This would
be used for example for INSERT ... ON DUPLICATE KEY ...
I would prefer to have the check for which key was duplicated in the
storage engine.  However, if critically needed, and assuming there is more
than one storage engine that needs this, we could add a check on the
handler level that will do a lookup on all unique keys to find the
first duplicate one.
What solution would you prefer?
Pavel> And why this serious change in behavior wasn't mentioned in the
Pavel> Release Notes?
I didn't see it as a a serious change as it's only about removing a
warning that has no value for any normal MariaDB user (who only use
storage engines supplied with MariadB) and which actually caused
serious problems for some MariaDB users.
That said, I agree it was a mistake from my part to not notice that
this was missing in the release notes.  I will try to do better in the
future and check that the release notes correctly reflects any changes
that I have done.
I appologize for missing this in this case.
Pavel> On the related note Release Notes mention the error log flood
Pavel> protection. Why it's not behind flag? How should I disable the feature
Pavel> and get back to the old behavior?
We already had in MariaDB flood protection for some error
messages. What I did was to make it more general and extend it to also
include other error messages, like the one we are discussing.  As
there wasn't a flag for the old suppression, I didn't see a need for
adding a flag for the new suppression.
I am happy to look into adding a flag that is usable for any kind of
suppressed messages.
A couple of ways to do this:
a) Not suppress anything if you give --log-warnings a value > 2
b) make LIMIT_UNSAFE_WARNING_ACTIVATION_TIMEOUT a configurable
   variable, where 0 would mean no suppression.
Which of the above, or both, would you prefer?
Or do you have any other suggestion of how to do this?
For 10.1 we will probably add a mechanism where you can permanently
suppress any specific error message to be written to the error log.
(They would still be given to the client).
Any suggestion or wishes regarding this?
Regards,
Monty