On Mon, Nov 11, 2013 at 2:29 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
We've noticed recently that semisync_master plugin in MariaDB (which apparently was fully inherited from MySQL) is seriously incompatible with our understanding of the purpose of semi-sync replication. This incompatibility was apparently introduced as a fix for http://bugs.mysql.com/bug.php?id=45672. The "major no-no" that bug
So as I understand it, this bug is about what should happen when semisync is enabled, but no slaves are connected.
Apparently before the fix of Bug#45672, an error was thrown late during COMMIT. So the transaction was committed (locally on the master), but the client still got an error back.
And if I understand correctly, after the fix of Bug#45672, no error is thrown in the case where no slave is connected.
No error is thrown and semi_sync_master is turned off completely.
talks about is in our opinion the whole purpose of semi-sync replication -- if a transaction is not replicated to at least one slave, the client shouldn't get OK even if the transaction is committed locally on the master. Also, the master shouldn't just turn off semi-sync replication whenever it wants.
So with "just turn off semi-sync replication whenever it wants" - what are you referring to here? I seem to remember that semisync has a timeout, and it gets disabled if that timeout triggers? My guess is that this is what you have in mind, but I wanted to ask to make sure ...
Yes, that's what I was referring to.
We will fix this problem for us, but first I wanted to understand what's your view of the purpose of semi-sync replication and how you think it should work? I need to know your opinion to understand how I should fix this issue...
Well, personally, I never was much interested in semi-sync. But it is my understanding that there is some interest, so I will answer with what small opinion I have.
I suppose the general idea is that when client sees its COMMIT complete, it can know that its transaction exists in at least two places (master binlog + at least one slave relay log). So there is no longer any single point of failure that can cause loss of the transaction.
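As a toy illustration of that guarantee (not MariaDB code; the `Slave` and `Master` classes and their methods are hypothetical names invented for this sketch), the master below reports success only once the transaction is in its own binlog and in the relay log of at least one slave:

```python
class Slave:
    """Hypothetical slave that stores received events in a relay log."""

    def __init__(self, reachable=True):
        self.reachable = reachable
        self.relay_log = []

    def receive(self, txn):
        # The return value plays the role of the semi-sync ack.
        if not self.reachable:
            return False
        self.relay_log.append(txn)
        return True


class Master:
    """Hypothetical master; commit() returns True only with two copies."""

    def __init__(self, slaves):
        self.binlog = []
        self.slaves = slaves

    def commit(self, txn):
        self.binlog.append(txn)  # copy 1: master binlog
        # copy 2: stop at the first slave that acks
        return any(slave.receive(txn) for slave in self.slaves)
```

Note that a `False` result here does not undo the append to `binlog`, which is the root of the consistency question discussed further down.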
Another point of view is that semi-sync provides some sort of throttle on how fast the master can generate events compared to how fast the slaves can receive them:
http://www.mysqlperformanceblog.com/2012/01/19/how-does-semisynchronous-mysq...
There was also a suggestion (and a patch is floating around somewhere) for "enhanced semisync replication":
https://mariadb.atlassian.net/browse/MDEV-162
This delays not only client acknowledge but also InnoDB commit until the ack from at least one slave, which means that transactions are not visible to other clients until they exist on at least one slave in addition to on the master.
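The difference in ordering can be sketched as follows. This is only a model of the description above, not of the actual MDEV-162 patch; the step names and helper functions are made up for illustration:

```python
def standard_semisync():
    # Engine commit happens first, so other clients can already see
    # the transaction while the master is still waiting for the ack.
    return ["write binlog", "engine commit", "wait for slave ack", "reply to client"]


def enhanced_semisync():
    # Engine commit is delayed until after the ack has arrived.
    return ["write binlog", "wait for slave ack", "engine commit", "reply to client"]


def visible_before_ack(steps):
    """True if other clients can see the transaction before any slave has it."""
    return steps.index("engine commit") < steps.index("wait for slave ack")
```

With these definitions, `visible_before_ack(standard_semisync())` is true and `visible_before_ack(enhanced_semisync())` is false.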
Since this is _semi_-sync, not real two-phase commit synchronous replication, the main problem is that there is no way to ensure consistency in the general error case. The transaction is already fully committed on the master; it cannot be rolled back. So we are left with the choice of one of two evils:
1. Report an error to the client. Most clients would then probably wrongly assume that the transaction was _not_ committed. There also does not seem to be much the client can do about the error except perhaps log an incident to the monitoring system. On the other hand, then at least the problem is not silently ignored.
Well, I'd say "wrongly assume" is not quite good wording here. When the client sees an error it must assume that the transaction is not committed, and if by the time it reconnects a new master is already elected, the client indeed will see that the transaction is not committed. Of course I understand that this design is somewhat brittle, because with a very small semi_sync_master_timeout the client will basically see an error on each transaction it makes. And it will be able to check with SELECT that the transaction is committed, even without re-connecting to the server. So the general assumption is that semi_sync_master_timeout is very big and the client will see a client-side timeout and loss of connection much earlier than that.
2. Report success to the client but complain loudly in the error log (I assume this is what happens in current code). This leaves the client unaware that there is a problem (but presumably the monitoring system will catch the message in the error log).
This not only leaves the client unaware of the problem, but also allows the server to accept transactions from clients at a very high rate when no slaves are present. And if the machine hosting the master then fails, all those accepted transactions will be permanently lost. So in the situation when the master doesn't have slaves we want to slow down clients as much as possible, even though their transactions will be committed locally and they will be able to check with SELECTs that the transactions are actually committed.
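To make the two options concrete, here is a small simulation. It is a sketch under stated assumptions, not the plugin's logic: `ToyMaster`, `OnTimeout`, and the `slave_acked` flag (standing in for "an ack arrived before the timeout") are all hypothetical.

```python
from enum import Enum


class OnTimeout(Enum):
    ERROR = 1     # option 1: report an error to the client
    FALLBACK = 2  # option 2: report success, log once, switch semi-sync off


class ToyMaster:
    def __init__(self, policy):
        self.policy = policy
        self.committed = []     # transactions committed locally
        self.unreplicated = []  # committed locally but never acked
        self.error_log = []
        self.semisync_on = True

    def commit(self, txn, slave_acked):
        self.committed.append(txn)  # the local commit cannot be rolled back
        if slave_acked:
            return "OK"
        self.unreplicated.append(txn)
        if self.policy is OnTimeout.ERROR:
            return "ERROR"
        if self.semisync_on:
            # Only the first timeout leaves a trace in the log.
            self.semisync_on = False
            self.error_log.append("semi-sync switched off")
        return "OK"
```

Running three unacknowledged commits through the `FALLBACK` policy returns "OK" three times, leaves three entries in `unreplicated`, and writes a single log line. Under `ERROR` the client is told about each one, although the transaction is still present in `committed`.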
From this summary, I think I can see the logic of the current behaviour:
- It preserves protection against single-point-of-failure. If all slaves are gone, then we already have one failure, and unless we experience a double failure (master also failing before slave recovers), the transaction will eventually be sent to a slave and no overall failure happens.
- If the client cannot do anything about the problem anyway, except notify the monitoring system, the server may as well do the notification itself.
But the opposite point of view also has merit. The client asked for semi-sync behavior, but did not get it, and it does not even have a way to know about the problem. That is not good.
Does the client currently at least get a warning for the COMMIT? I think it should (eg. the fix for Bug#45672 should at least have been to turn the error into a warning, not remove the error completely).
No, there's no warning. And on the server side there's only one line in the logs showing that semi-sync replication has turned off, and nothing else after that for a long period of time when transactions were accepted but no slaves replicated them.
What I think could make sense is if the client got an error during the prepare phase if no slaves are connected. In this case we _can_ roll back the transaction and give an error to the client without any issue of consistency. But it still leaves a small window where the last slave can disappear between the prepare and the commit phase and leave us with the original problem.
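A minimal sketch of that idea, assuming a hypothetical `run_transaction` function that is simply told how many slaves are connected at each phase (nothing like this exists in the plugin as far as this thread establishes):

```python
def run_transaction(slaves_at_prepare, slaves_at_commit):
    """Return what the client would observe in this toy model."""
    if slaves_at_prepare == 0:
        # Nothing is committed yet, so rolling back is still consistent.
        return "rolled back, error to client"
    # The transaction is now prepared and will be committed locally.
    if slaves_at_commit == 0:
        # The last slave vanished between prepare and commit:
        # this is the remaining window with the original problem.
        return "committed locally, no ack possible"
    return "committed and acked"
```

The first branch is the case that becomes safe; the second is the small window that the check cannot close.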
I hope this helps you ... Maybe you can describe your use-case, and how you need things to work for that case? Personally I have nothing against changing this behaviour to something more logical, I am just not sure what the most logical behaviour is ...
For our use case we want clients to always see an error when slaves didn't ack the transaction. This basically allows us to have a general rule: "Clients can rely on the durability of only those transactions for which they received a "success" result". I.e. all transactions that were committed locally but didn't receive a semi-sync ack are ok to lose later, and that won't be a serious offense on the MySQL side. Of course "enhanced semi-sync replication" will help with this a lot and we'll be really happy to have it. But without it we at least don't want semi_sync_master to turn itself off ever.

So basically my question is: if I prepare a patch that will restore the original behavior of semi-sync replication (and remove the tests added for Bug#45672), will that be acceptable for MariaDB?

Thank you,
Pavel