Inquiry Regarding Parameters of Modified HNSW Algorithm in MariaDB Vector
![](https://secure.gravatar.com/avatar/177fcd78280c1c9e53c67389f0bac79d.jpg?s=120&d=mm&r=g)
Dear Sir, I hope this message finds you well. I am currently working with the MariaDB vector functionality and am particularly interested in the modified HNSW (MHNSW) algorithm. Specifically, I would like to inquire about the meaning and usage of the parameter "M" in MHNSW, as well as how it compares to other HNSW implementations, such as pgvector. ( https://mariadb.com/kb/en/vector-system-variables/) Typically, HNSW has two key construction parameters: "M" and "efconstruction." However, I noticed that in MHNSW, there doesn't seem to be an "efc" parameter. This has led me to wonder if the "M" parameter in MHNSW has a different meaning or purpose compared to the traditional HNSW "M." Could you kindly clarify the role of the "M" parameter in the modified HNSW algorithm, and how it differs from or is similar to other implementations like pgvector? Thank you for your time and assistance. I look forward to your response. Best regards!
![](https://secure.gravatar.com/avatar/39b623a1559cf9c69ac3d9d4fb44e7fe.jpg?s=120&d=mm&r=g)
Hi, kase, On Jan 13, kase jojo via discuss wrote:
Dear Sir,
I hope this message finds you well.
I am currently working with the MariaDB vector functionality and am particularly interested in the modified HNSW (MHNSW) algorithm. Specifically, I would like to inquire about the meaning and usage of the parameter "M" in MHNSW, as well as how it compares to other HNSW implementations, such as pgvector. https://mariadb.com/kb/en/vector-system-variables/
Typically, HNSW has two key construction parameters: "M" and "efconstruction." However, I noticed that in MHNSW, there doesn't seem to be an "efc" parameter. This has led me to wonder if the "M" parameter in MHNSW has a different meaning or purpose compared to the traditional HNSW "M."
Could you kindly clarify the role of the "M" parameter in the modified HNSW algorithm, and how it differs from or is similar to other implementations like pgvector?
MariaDB has efConstruction, but it's hard-coded and cannot be tuned. There are about 10 hidden parameters that cannot be tuned. Conceptually, there are two degrees of freedom here — one allows to choose between "high qps low recall" and "low qps high recall", the other allows to choose between "fast insert slow search" and "slow insert fast search". It means there must be at least two tunable parameters to be able to tune these two degrees of freedom independently. But more than two parameters would be redundant. Thus, MariaDB does not have a configurable efConstruction. Indeed, M in MariaDB tunes more than "max number of graph edges on a non-zero level". Most importantly it is also used to tune the greediness of the search. That's why the recall in MariaDB will be higher for the same M than in other implementations of HNSW, and the difference will grow as M grows. Naturally, it comes at the cost of the speed, so with M growing qps will drop faster in MariaDB than in other databases. Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
participants (2)
-
kase jojo
-
Sergei Golubchik