Thanks for the replies and the clarification!
May I suggest to enhance https://mariadb.com/kb/en/information-schema-innodb_ft_index_table-table/ slightly: it currently explains for “INNODB_FT_INDEX_TABLE.doc_id”:
* Document ID of the newly added row, either an appropriate ID column or an internal InnoDB value.
I now learned that the rules for “appropriate id” are more complex than I thought 😉. But maybe there exists already a documentation that I did not find.
Also, one question remains: if an invisible FTS_DOC_ID column is automatically added to my table, how can I view the values? I ask this, because in the process of analyzing the word split results, I wanted to match entries of “INNODB_FT_INDEX_TABLE “ to my real table or vice versa.
This does not work:
select FTS_DOC_ID from t1;
I already found that FTS_DOC_ID is an invisible column, but the FTS_DOC_ID cannot be selected, so it must be something special. I found this JIRA issue about adding several types of invisibility: https://jira.mariadb.org/browse/MDEV-10177 - but I found no further information about this.
My MariaDB version is slightly old – 10.3.39. There might have been changes to the behavior since then.
Best regards
Wolfgang
Von: Thirunarayanan Balathandayuthapani
Gesendet: Samstag, 14. September 2024 05:11
An: Wolfgang Knauf
Cc: discuss@lists.mariadb.org
Betreff: Re: [MariaDB discuss] Re: Fulltext index tokenization rules / DOC_ID
ACHTUNG: Diese E-Mail stammt von einem externen Absender. Bitte achten Sie auf Anhänge oder externe Links.
On Thu, Sep 12, 2024 at 12:47 PM wknauf--- via discuss mailto:discuss@lists.mariadb.org> wrote:
Question 1: this was changed by https://jira.mariadb.org/browse/MDEV-20797
Question 2: still open. I asked the same here: https://stackoverflow.com/questions/78969640/mariadb-fulltext-index-tokeniza...
To answer your question 2:
If an FTS_DOC_ID column is not defined, InnoDB automatically adds a hidden FTS_DOC_ID column when the full-text index is created.
If you create a full-text index at CREATE TABLE time and do not specify an FTS_DOC_ID column, InnoDB adds a hidden FTS_DOC_ID column.
Example:
CREATE TABLE t1 (
ID INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
FULLTEXT(title)
) ENGINE = InnoDB;
INSERT INTO t1(title) VALUES('mysql');
INSERT INTO t1(title) VALUES('database');
....
SELECT * FROM t1;
ID title
1 mysql
2 database
SET GLOBAL innodb_ft_aux_table="test/t1";
SELECT WORD, DOC_ID FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
WORD DOC_ID
database 3
mysql 2
Here the hidden FTS_DOC_ID value is 2 and 3 for mysql and database. It doesn't have to match with ID column in the table.
Another way to create your own FTS_DOC_ID column, the column must be defined as BIGINT UNSIGNED NOT NULL and named FTS_DOC_ID (all uppercase)
Example:
CREATE TABLE t1 (
FTS_DOC_ID BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
FULLTEXT(title)
) ENGINE = InnoDB;
INSERT INTO t1(title) VALUES('mysql');
INSERT INTO t1(title) VALUES('database');
.....
SELECT * FROM t1;
FTS_DOC_ID title
1 mysql
2 database
SET GLOBAL innodb_ft_aux_table="test/t1";
SELECT WORD, DOC_ID FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
WORD DOC_ID
database 2
mysql 1
SET GLOBAL innodb_ft_aux_table=default;
Here FTS_DOC_ID for mysql and database is "1" and "2"
FTS_DOC_ID doesn't have to be PRIMARY KEY. It can be part of UNIQUE INDEX as well
Example:
CREATE TABLE t1 (
ID INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
FTS_DOC_ID BIGINT UNSIGNED NOT NULL,
UNIQUE INDEX FTS_DOC_ID_INDEX (FTS_DOC_ID),
title VARCHAR(200),
FULLTEXT(title)
) ENGINE = InnoDB;
INSERT INTO t1(FTS_DOC_ID, title) VALUES(1, 'mysql');
INSERT INTO t1(FTS_DOC_ID, title) VALUES(2, 'database');
SELECT * FROM t1;
ID FTS_DOC_ID title
1 1 mysql
2 2 database
SET GLOBAL innodb_ft_aux_table="test/t1";
SELECT WORD, DOC_ID FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
WORD DOC_ID
database 2
mysql 1
--
Thirunarayanan B
Software Engineer
MariaDB