On Mon, 25 Jan 2010 13:55:44 +0100, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
> I think it would be useful if you explained what the problems are with that
> interface, in your opinion.
Let me start by saying that I'm not that familiar with the current MySQL replication code and may not be qualified to judge how much this new MySQL Replication Interface improves on what we have in 5.1.x. Perhaps it does. But as a replication developer whose goal is to create a generic redundancy service API that will handle a broad range of tasks, I can say that it is a step in a dead-end direction. This interface does not seem to improve anything about how redundancy is achieved in MySQL. Moreover, it seems to cement all the bad decisions that were made over the years into an explicit interface:

- It makes an explicit distinction between binlogging and replication.
- It does not care to introduce a concept of a global transaction ID.
- It exposes what should be internal implementation details of the redundancy service to the SQL server.

I do understand that there are perfectly good reasons why the MySQL replication API ended up being such a mess. But if we want to move further, we must recognize that it is a mess beyond repair. It cannot be inherited.

I think the main problem here is that many people, by force of habit, regard things that should be internal implementation details of the redundancy service as integral parts of an SQL server, such as binlog storage or the relay service. We won't get a clean, flexible, generic API before we clearly sort out what belongs where. And for that we'll need to look at the redundancy service unencumbered by existing code.

This is not a call for revolution. It is a suggestion to create a completely new, parallel redundancy service API and _gradually_ reimplement the required functionality under that API.

Please understand that I'm not questioning the current replication implementation. It may well be reusable. I'm questioning where and how the redundancy API line is drawn. Exposing a concrete binlog storage implementation to the SQL server is not only pointless, it is harmful.

One more reason to design the redundancy API from scratch rather than start from this one is that whenever you want to change anything inside, you'll inevitably have to change this API as well, simply because it exposes so much of the internals.

To illustrate this, on page 18 of the replication slides from UC 2009 (http://forge.mysql.com/wiki/MySQL_Replication:_Walk-through_of_the_new_5.1_a...) we can see the unification of logging and replication functionality behind something called the "Logging Kernel", but it does not seem to be reflected in any way either in the aforementioned Replication Interface spec or on page 23 of the slides. Apparently, the intended plugin points are the various observer interfaces shown as diamonds below the delegate boxes.

Well, we can do much better than that and raise the redundancy plugin boundary much higher. Specifically, everything but "SQL execution" and "Slave IO thread" on that slide must be moved behind the redundancy service plugin interface and become an implementation detail. (This is not to say that there can't be plugins to the redundancy service plugin.)
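
As a rough illustration of what such a higher boundary could look like, here is a minimal Python sketch. Everything in it is made up for the purpose of illustration: `GlobalTxnId`, `RedundancyService`, `InMemoryService`, `commit` and `subscribe` are hypothetical names, not part of the wsrep API, the MySQL Replication Interface spec, or any existing code. The point is only that the SQL server hands over a writeset and gets back a global transaction ID, while log storage stays private to the service.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True, order=True)
class GlobalTxnId:
    """Hypothetical global transaction ID, identical on every node."""
    cluster_uuid: str
    seqno: int


class RedundancyService(ABC):
    """Hypothetical boundary: the only interface the SQL server would see."""

    @abstractmethod
    def commit(self, writeset: bytes) -> GlobalTxnId:
        """Make a writeset redundant and return its global ID."""

    @abstractmethod
    def subscribe(self, apply: Callable[[GlobalTxnId, bytes], None]) -> None:
        """Register a callback that applies committed writesets."""


class InMemoryService(RedundancyService):
    """Toy implementation; the list below stands in for binlog storage."""

    def __init__(self, cluster_uuid: str) -> None:
        self._cluster_uuid = cluster_uuid
        self._log: List[bytes] = []  # internal detail, never exposed
        self._appliers: List[Callable[[GlobalTxnId, bytes], None]] = []

    def commit(self, writeset: bytes) -> GlobalTxnId:
        self._log.append(writeset)
        gtid = GlobalTxnId(self._cluster_uuid, len(self._log))
        for apply in self._appliers:
            apply(gtid, writeset)
        return gtid

    def subscribe(self, apply: Callable[[GlobalTxnId, bytes], None]) -> None:
        self._appliers.append(apply)


# Usage: the "SQL server" side deals only in writesets and global IDs.
service = InMemoryService("cluster-a")
applied = []
service.subscribe(lambda gtid, ws: applied.append((gtid.seqno, ws)))
first = service.commit(b"row-change-1")
second = service.commit(b"row-change-2")
```

In a sketch like this, swapping the in-memory list for a file-based log or a group communication layer would not change anything the SQL server calls, which is the property the current interface lacks.
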
> I am thinking that this is mainly a refactoring to expose mostly already
> present functionality in a clean way to new plugins (semisync. replication
> in particular), but I will have to look deeper to know for sure.
It sure looks so. But notice that more than half of the APIs there are not even used by semi-sync. In fact, the existence of semi-sync in no way justifies this interface. It can be implemented much more easily with the wsrep API.

Thanks,
Alex

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011