
Rong,
On Tue, Jun 3, 2025 at 9:35 PM Rong Kang via discuss <discuss@lists.mariadb.org> wrote:
Dear Sergey Vojtovich:
Thank you for your detailed and actionable guidance. I agree with the steps you've outlined and would like to confirm a few points:
1. Regarding roles and collaboration: Yes, we (ByteDance) can drive this effort. I'll be working with a student selected for OSPP to contribute to VIDEX for MariaDB. The final student selection will be confirmed on June 25, and he/she will begin participating then. (Of course, we're also happy to have you drive the process if preferred. We're flexible and eager to collaborate either way.)
It was my expectation, just wanted to confirm. Let's stick to your plan.
2. Regarding milestones: Should we maintain one large PR until all requirements are met and then merge everything at once? Or would you prefer splitting it into multiple milestones, completing functionality incrementally?
It would be good to have some functional and useful minimal implementation in the beginning. Then we can switch to an incremental approach. Otherwise it is up to you to set milestones.
3. Regarding API adaptation: For steps 6-7, which we consider the most challenging parts (adapting the API, compiling and running VIDEX in MariaDB), would MariaDB Foundation collaborate on PRs and contribute code, or should ByteDance handle all code contributions while MariaDB Foundation only provides guidance? (I noticed Petrunia has already reviewed the VIDEX code in depth.)
If ByteDance drives this effort, we take a secondary role, which is mostly about providing guidance. But I don't see anything preventing us from contributing code for specific issues. This can be negotiated on a case-by-case basis.
4. Regarding code quality: We'd appreciate your advice on whether the current VIDEX implementation needs refactoring to meet MariaDB standards. Additionally, VIDEX currently lacks C tests for the storage engine itself, and we would appreciate your guidance here, such as test case frameworks we should fill in.
MariaDB server and plugins can have different maturity levels. E.g. the server is stable, while the ha_example storage engine is marked experimental. First versions of VIDEX can be either experimental or alpha. Grep for MariaDB_PLUGIN_MATURITY_EXPERIMENTAL.
I can't foresee any specific refactorings at the moment. But the code will have to evolve for sure.
What kind of C tests are you referring to? I believe our mysql-test framework should be capable of testing VIDEX. E.g. see tests for storage engines like storage/oqgraph, storage/sphinx or storage/test_sql_discovery. It should be possible to install the videx plugin, run the external indexer and run/query the VIDEX daemon. We can help you to develop a test suite; for detailed information see https://mariadb.com/kb/en/mariadb-test-overview/
5. Regarding communication: This mailing list has grown quite long, making it difficult to track new issues or mention multiple people. GitHub issues may be better, or shall we join a Slack channel for real-time discussions?
Agree. GitHub issues are definitely fine. Slack can be an option, though I will need to discuss it with my colleagues. We also have the MariaDB Zulip publicly available: https://mariadb.zulipchat.com
6. Feature detail #1: If the VIDEX-stats-server operates as a submodule, it requires a Python environment and launches an HTTP service. This design is intended to be AI-model integration friendly, but is this suitable for MariaDB?
Yes, it should be alright. We'll definitely have problems with packaging rpms/debs, but it must be manageable. VIDEX should be in a separate package.
7. Feature detail #2: VIDEX currently has a limitation regarding index_read. As Petrunia mentioned, MySQL/MariaDB query optimization phase may call index_init/index_read, but VIDEX doesn't currently implement them and throws a 1031 error. Can we leave this as a future implementation issue?
Yes, it is up to you to decide which functionality is available and which is not.
Thank you, and I look forward to collaborating on this interesting project!
Likewise!
I had a discussion regarding git layout with my colleagues and it appears to be slightly more complex. In a nutshell, storage/videx should go under MariaDB git history, and storage/videx/videx should be a submodule. In this case, if we change the API, build process, etc., we can fix VIDEX right away on our side. If storage/videx itself were a submodule, we would have no control over it, and the only option we would have in such a case is disabling it.
So storage/videx contents should look like:
    CMakeLists.txt
    ha_videx.cc
    videx/      -- submodule, common stuff
    mysql-test/ -- some tests can come from videx/ submodule if needed (so that they work both in MariaDB and MySQL)
Regards,
Sergey