Hi Mark,

TokuMX is quite a different beast from TokuDB. First of all, we had already integrated our engine into one database product before we started, so many kinks in the TokuKV layer had already been worked out.

But more importantly, TokuMX/MongoDB doesn't have a storage engine API. I think some people thought we were going to add a storage engine API to MongoDB and then plug ourselves into it. That wasn't the goal of TokuMX. The goal was simply to get our engine inside MongoDB as fast as possible, and the way to do that was to avoid thinking about what would be a good interface and instead just do it. As I'm sure everyone here knows, making a good storage engine API is /really/ hard.

Probably the hardest things in the TokuMX integration were learning how to deal with DDL (everything in MongoDB seems to use "lazy initialization", for DDL operations at least), finding the right model within the MongoDB code to represent transactions, and reorganizing the locking. All of these were tightly coupled with the way the MongoDB storage system works (except transactions, because they didn't exist), but now in TokuMX they're pretty tightly coupled with the way TokuKV does things.

In a way, we've created a storage API, but that API is defined by our version of db.h, and nothing else implements it with the same assumptions we have, so it's probably not useful to compare the "TokuMX storage engine API" with the one in MySQL.

In short, I'd say yes, it was easier, not because MongoDB has a better API (it doesn't have one) but because we had a bit of experience and because we didn't try to create or conform to a generic API.

On Mon, Aug 19, 2013 at 11:36 AM, MARK CALLAGHAN <mdcallag@gmail.com> wrote:

Thanks for your response.

On Fri, Aug 16, 2013 at 11:23 AM, Zardosht Kasheff <zardosht@gmail.com> wrote:

I've worked on the TokuDB storage engine for quite a while now. I have
had many experiences over the years, so I guess it's hard to know
where to begin. I guess I will start small, and if the conversation
evolves, I can contribute more thoughts. I think the current API is
really good, as evidenced by the fact that many storage engines have
used it to plug into MySQL. The two areas that I see we can really
benefit from are the following:

Many were written long ago. Besides TokuDB, how many new storage engines have reached GA in the past decade? I worked on a custom storage engine and I am sure others have done the same, but there hasn't been much innovation in public. Aria is also GA, but it was written by people who know the API and wrote parts of it, so it isn't a sign that the API is something people want to use.

Was TokuMX easier to implement than TokuDB?

--
Mark Callaghan
mdcallag@gmail.com

_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help : https://help.launchpad.net/ListHelp

--
Cheers,
Leif