Thanks for the replies and the clarification!

 

May I suggest to enhance https://mariadb.com/kb/en/information-schema-innodb_ft_index_table-table/ slightly: it currently explains for “INNODB_FT_INDEX_TABLE.doc_id”:

I now learned that the rules for “appropriate id” are more complex than I thought 😉. But maybe there exists already a documentation that I did not find.

 

Also, one question remains: if an invisible FTS_DOC_ID column is automatically added to my table, how can I view the values? I ask this, because in the process of analyzing the word split results, I wanted to match entries of “INNODB_FT_INDEX_TABLE “ to my real table or vice versa.

 

This does not work:

select FTS_DOC_ID from t1;

 

I already found that FTS_DOC_ID is an invisible column, but the FTS_DOC_ID cannot be selected, so it must be something special. I found this JIRA issue about adding several types of invisibility: https://jira.mariadb.org/browse/MDEV-10177 - but I found no further information about this.

 

My MariaDB version is slightly old – 10.3.39. There might have been changes to the behavior since then.

 

Best regards

 

Wolfgang

 

 

 

Von: Thirunarayanan Balathandayuthapani <thiru@mariadb.com>
Gesendet: Samstag, 14. September 2024 05:11
An: Wolfgang Knauf <WKnauf@hg-online.de>
Cc: discuss@lists.mariadb.org
Betreff: Re: [MariaDB discuss] Re: Fulltext index tokenization rules / DOC_ID

 

ACHTUNG: Diese E-Mail stammt von einem externen Absender. Bitte achten Sie auf Anhänge oder externe Links.

 

 

On Thu, Sep 12, 2024 at 12:47PM wknauf--- via discuss <discuss@lists.mariadb.org> wrote:

Question 1: this was changed by https://jira.mariadb.org/browse/MDEV-20797
Question 2: still open. I asked the same here: https://stackoverflow.com/questions/78969640/mariadb-fulltext-index-tokenization-rules-doc-id-mapping/78976703

To answer your question 2: 

 

If an FTS_DOC_ID column is not defined, InnoDB automatically adds a hidden FTS_DOC_ID column when the full-text index is created.
If you create a full-text index at CREATE TABLE time and do not specify an FTS_DOC_ID column, InnoDB adds a hidden FTS_DOC_ID column.

 

Example:

CREATE TABLE t1 (
ID INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
FULLTEXT(title)
) ENGINE = InnoDB;

INSERT INTO t1(title) VALUES('mysql');
INSERT INTO t1(title) VALUES('database');
....
SELECT * FROM t1;
ID title
1 mysql
2 database
SET GLOBAL innodb_ft_aux_table="test/t1";

SELECT WORD, DOC_ID FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
WORD   DOC_ID

database  3
mysql       2

Here the hidden FTS_DOC_ID value is 2 and 3 for mysql and database. It doesn't have to match with ID column in the table.

 

Another way to create your own FTS_DOC_ID column, the column must be defined as BIGINT UNSIGNED NOT NULL and named FTS_DOC_ID (all uppercase)

 

Example:

 

CREATE TABLE t1 (

        FTS_DOC_ID BIGINT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
        title VARCHAR(200),
        FULLTEXT(title)
) ENGINE = InnoDB;
INSERT INTO t1(title) VALUES('mysql');
INSERT INTO t1(title) VALUES('database');
.....
SELECT * FROM t1;
FTS_DOC_ID title
1 mysql
2 database

SET GLOBAL innodb_ft_aux_table="test/t1";
SELECT WORD, DOC_ID FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
WORD DOC_ID
database 2 
mysql      1
SET GLOBAL innodb_ft_aux_table=default;

Here FTS_DOC_ID for mysql and database is "1" and "2"

 

FTS_DOC_ID doesn't have to be PRIMARY KEY. It can be part of UNIQUE INDEX as well

 

Example:


CREATE TABLE t1 (
ID INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
FTS_DOC_ID BIGINT UNSIGNED NOT NULL,
UNIQUE INDEX FTS_DOC_ID_INDEX (FTS_DOC_ID),
title VARCHAR(200),
FULLTEXT(title)
) ENGINE = InnoDB;

INSERT INTO t1(FTS_DOC_ID, title) VALUES(1, 'mysql');
INSERT INTO t1(FTS_DOC_ID, title) VALUES(2, 'database');

SELECT * FROM t1;
ID FTS_DOC_ID title
1 1 mysql
2 2 database

SET GLOBAL innodb_ft_aux_table="test/t1";
SELECT WORD, DOC_ID FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE;
WORD DOC_ID

database 2
mysql      1

 

 

--

Thirunarayanan B

Software Engineer

MariaDB