Alex Yurchenko <alexey.yurchenko@codership.com> writes:
On Mon, 25 Jan 2010 13:55:44 +0100, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
I think it would be useful if you explained what the problems are with that interface, in your opinion.
This interface does not seem to improve anything about how redundancy is achieved in MySQL. Moreover, it seems to cement all the bad decisions made over the years into an explicit interface:
- It exposes what should be internal implementation details of the redundancy service to the SQL server.
We won't get a clean, flexible, generic API before we clearly sort out what belongs where. And for that we'll need to look at the redundancy service unencumbered by existing code. This is not a call for revolution. It is a suggestion to create a completely new, parallel redundancy service API and _gradually_ reimplement the required functionality under that API.
Right. So if I understand you correctly, by "internal implementation details" we do not mean just that the APIs expose internals of the SQL server which we want to shield plugins from. Rather, the interface as designed makes assumptions about how the plugin that uses it will be implemented, which makes it unsuitable for other plugins that have other ideas about what to do.
So more concretely, what we want is an API that does not make assumptions about the format of the binlog file, or even that there is a binlog stored in a file. And an API that does not assume that events to be applied will be read from a specific mysql connection to a master server, returning data in a particular binlog format. Like you wrote:
I think the main problem here is that many people by force of habit regard things that should be internal implementation details of redundancy service as integral parts of an SQL server. Like binlog storage or relay service.
I had a brief but inspiring discussion with Serg about this at our meeting two weeks ago. So basically, what we could aim for is to turn the entire current MySQL replication into a set of plugins. These plugins would be written against a new plugin interface that would support not only the existing MySQL replication for backwards compatibility, but also things like Galera and Tungsten, and other ideas. So while the compatibility *plugins* would contain the legacy MySQL binlog storage and relay service, the plugin *interface* would not. I think this is what you had in mind?
So the basis for such an interface would be the ability to install hooks to be called with row data for every handler::write_row(), handler::update_row(), and handler::delete_row() invocation, just like the current row-based binlogging does. And similarly for SQL statement execution, like statement-based logging does now. That should be clear enough.
Then comes the need to hook into transaction start and commit and the opening of tables. At this point, more of the internals of the MySQL server start to appear, and some careful thought will be needed to get an interface that exposes enough that plugins can do what they need, without exposing too many internal details of how MySQL query execution is implemented.
(But note that these are two different issues regarding "internal implementations". One is how the *query execution* is implemented. The other is how the *plugins* are implemented. If I understood you correctly, the interface used for semisync in MySQL fails on the latter point).
One example of how a lot of details from query execution pop up is mixed-mode binlogging. This is where queries are logged as statements when this is safe, and as row events when it is not (nondeterministic queries). The concept of "mixed mode binlogging" certainly seems like something that should be an implementation detail of the plugin, not part of the interface. On the other hand, determining whether a query is safe for statement-based logging is highly complex, and exposing enough of the server for the plugin to be able to determine this by itself may be too much. (Maybe just exposing an is_safe_for_statement() function to plugins would be enough).
Another example of hairy details is all the extra information that can go with an SQL statement into the binary log. Things like current timestamp, random seed, user-set @variables, etc. To support a statement-based replication plugin, we probably have to expose all of this on the interface in a clean fashion.
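To make the shape of such an interface a bit more concrete, here is a rough sketch in Python (chosen only for brevity, this is not server code). Every name in it is made up for illustration: ReplicationPlugin, StatementContext, MixedModePlugin and the on_*() hooks are assumptions, not an existing MySQL API. The `safe` argument stands in for whatever the server would answer from something like is_safe_for_statement(). The sketch also assumes that row hooks fire while a statement executes and that the statement hook fires once at the end of it. The point is only that the statement-or-rows decision can live entirely inside the plugin.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class StatementContext:
    """Hypothetical bundle of the extra state that travels with a statement."""
    timestamp: int
    rand_seed: Optional[Tuple[int, int]] = None
    user_vars: Dict[str, Any] = field(default_factory=dict)


class ReplicationPlugin:
    """Hypothetical hook interface. The default hooks do nothing."""

    def on_begin(self, trx_id: int) -> None: pass
    def on_write_row(self, trx_id: int, table: str, row: dict) -> None: pass
    def on_update_row(self, trx_id: int, table: str, before: dict, after: dict) -> None: pass
    def on_delete_row(self, trx_id: int, table: str, row: dict) -> None: pass
    def on_statement(self, trx_id: int, sql: str, ctx: StatementContext, safe: bool) -> None: pass
    def on_commit(self, trx_id: int) -> None: pass


class MixedModePlugin(ReplicationPlugin):
    """Keeps the statement when the server says it is safe, else the row events."""

    def __init__(self) -> None:
        self.log: List[tuple] = []                 # committed events, in commit order
        self._trx: Dict[int, List[tuple]] = {}     # events of open transactions
        self._rows: Dict[int, List[tuple]] = {}    # row events of the running statement

    def on_begin(self, trx_id):
        self._trx[trx_id] = []
        self._rows[trx_id] = []

    def on_write_row(self, trx_id, table, row):
        self._rows[trx_id].append(("write", table, row))

    def on_update_row(self, trx_id, table, before, after):
        self._rows[trx_id].append(("update", table, before, after))

    def on_delete_row(self, trx_id, table, row):
        self._rows[trx_id].append(("delete", table, row))

    def on_statement(self, trx_id, sql, ctx, safe):
        if safe:
            # Statement is replayable: keep it and drop the collected rows.
            self._trx[trx_id].append(("stmt", sql, ctx))
        else:
            # Not replayable: keep the row images instead.
            self._trx[trx_id].extend(self._rows[trx_id])
        self._rows[trx_id] = []

    def on_commit(self, trx_id):
        # Events become visible in the log only at commit.
        self.log.extend(self._trx.pop(trx_id))
        self._rows.pop(trx_id, None)


def demo() -> List[tuple]:
    """Drive the plugin the way a server might (values are arbitrary)."""
    plugin = MixedModePlugin()
    ctx = StatementContext(timestamp=1264424144)
    plugin.on_begin(1)
    plugin.on_write_row(1, "t1", {"id": 1, "v": 10})
    plugin.on_statement(1, "INSERT INTO t1 VALUES (1, 10)", ctx, safe=True)
    plugin.on_write_row(1, "t1", {"id": 2, "v": "example-uuid"})
    # Here the server is assumed to report the statement as unsafe.
    plugin.on_statement(1, "INSERT INTO t1 VALUES (2, UUID())", ctx, safe=False)
    plugin.on_commit(1)
    return plugin.log


if __name__ == "__main__":
    for event in demo():
        print(event)
```

In this sketch the interface itself knows nothing about "mixed mode". A pure row-based or pure statement-based plugin would simply ignore the hooks it does not care about.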
- It does not care to introduce a concept of global transaction ID.
Right. As I wrote earlier, this seems to be central to many of the ideas involved in this project.
What I am wondering at the moment is whether the concept of a global transaction ID should be part of the new API, or whether it is really an implementation detail of the redundancy service.
On the one hand, if we make it part of the API, can we make it general enough to support everything we want? For example, some plugins will need the ID to be generated at the start of a transaction. Some will need it to be generated at the end of the transaction.
On the other hand, if we make it _not_ part of the API, we run the risk of making the API overly general and just pushing the problem down for each plugin to try to solve individually.
I'll start digging deeper into these issues of the new API and the global transaction ID.
 - Kristian.