[Maria-developers] better execution plan possible with INNER JOIN of mroonga (or whatever storage engine) ?
Hello ! I am investigating a slow query and have a question about execution plan with mroonga (possibly any storage engine?). (with MariaDB 10.0.15.) Here are tables : CREATE TABLE catalog ( uid bigint(20) unsigned NOT NULL, path varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '', PRIMARY KEY (uid) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; CREATE TABLE catalog_full_text ( uid bigint(20) unsigned NOT NULL, title varchar(255) COLLATE utf8_unicode_ci DEFAULT '', PRIMARY KEY (uid), FULLTEXT KEY title (title) COMMENT 'parser "TokenBigramSplitSymbolAlphaDigit"' ) ENGINE=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; CREATE TABLE category ( uid bigint(20) unsigned NOT NULL, category_uid bigint(20) unsigned NOT NULL DEFAULT '0', base_category_uid bigint(20) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (uid,category_uid,base_category_uid) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; Here is the query : SELECT DISTINCT catalog.path, catalog.uid FROM ( ( catalog AS catalog INNER JOIN ( category AS category INNER JOIN catalog_full_text AS source_title_full_text ON category.base_category_uid = 18482 AND source_title_full_text.uid = category.category_uid ) ON category.uid = catalog.uid ) INNER JOIN catalog_full_text AS title_full_text ON title_full_text.uid = catalog.uid ) WHERE MATCH (source_title_full_text.title) AGAINST ('bar' IN BOOLEAN MODE) AND MATCH (title_full_text.title) AGAINST ('foo' IN BOOLEAN MODE) Here is the output of EXPLAIN : *************************** 1. row *************************** id: 1 select_type: SIMPLE table: source_title_full_text type: fulltext possible_keys: PRIMARY,title key: title key_len: 0 ref: rows: 1 Extra: Using where with pushed condition; Using temporary *************************** 2. row *************************** id: 1 select_type: SIMPLE table: title_full_text type: fulltext possible_keys: PRIMARY,title key: title key_len: 0 ref: rows: 1 Extra: Using where with pushed condition *************************** 3. row *************************** id: 1 select_type: SIMPLE table: catalog type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 8 ref: test.title_full_text.uid rows: 1 Extra: *************************** 4. row *************************** id: 1 select_type: SIMPLE table: category type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 24 ref: test.title_full_text.uid,test.source_title_full_text.uid,const rows: 1 Extra: Using index; Distinct As you see, we have two full-text queries at the beginning and both have 'rows = 1'. In a slow query case, 25k records mathces with the first full-text query and 1.6k records matches with the second full-text query. When I manually modify the order of INNER JOIN so that the 'lighter' full-text query appears first, EXPLAIN output is different as well and the performance is much improved. Accoding to Kohei, a Mroonga developer, we always have 'rows = 1' for mroonga full-text query because MariaDB does not request such value to mroonga (JOIN::get_examined_rows() in sql/sql_select.cc), and it is possible to modify mroonga to return an estimation of rows if MariaDB has such an API. So here are my questions : * can MariaDB request an 'estimation of rows' to mroonga (or whatever storage engine) ? * can such information help to decide a better execution plan with INNER JOIN like above ? Thanks in advance ! -- Kazuhiko Shiozaki, Nexedi SA Senior Consultant Nexedi: Consulting and Development of Free / Open Source Software http://www.nexedi.com/ ERP5: Full Featured High End Open Source ERP http://www.erp5.com/
Hi, Kazuhiko! On Dec 02, Kazuhiko Shiozaki wrote:
Hello !
I am investigating a slow query and have a question about execution plan with mroonga (possibly any storage engine?). (with MariaDB 10.0.15.)
CREATE TABLE catalog_full_text ( uid bigint(20) unsigned NOT NULL, title varchar(255) COLLATE utf8_unicode_ci DEFAULT '', PRIMARY KEY (uid), FULLTEXT KEY title (title) COMMENT 'parser "TokenBigramSplitSymbolAlphaDigit"'
unrelated comment: in MariaDB it's *much* better to use engine-defined table attributes for this. https://mariadb.com/kb/en/engine-defined-new-tablefieldindex-attributes/
) ENGINE=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; ... Accoding to Kohei, a Mroonga developer, we always have 'rows = 1' for mroonga full-text query because MariaDB does not request such value to mroonga (JOIN::get_examined_rows() in sql/sql_select.cc)
Correct.
So here are my questions :
* can MariaDB request an 'estimation of rows' to mroonga (or whatever storage engine) ? * can such information help to decide a better execution plan with INNER JOIN like above ?
Yes, I believe it is possible. I've reported this issue as a bug now: https://mariadb.atlassian.net/browse/MDEV-7250 Regards, Sergei
Hi Sergei ! On 02/12/2014 16:41, Sergei Golubchik wrote:
So here are my questions :
* can MariaDB request an 'estimation of rows' to mroonga (or whatever storage engine) ? * can such information help to decide a better execution plan with INNER JOIN like above ?
Yes, I believe it is possible.
I've reported this issue as a bug now: https://mariadb.atlassian.net/browse/MDEV-7250
Thanks for summarising the topic into this issue. It is quite clear and I could understand the background of the current implementation. -- Kazuhiko Shiozaki, Nexedi SA Senior Consultant Nexedi: Consulting and Development of Free / Open Source Software http://www.nexedi.com/ ERP5: Full Featured High End Open Source ERP http://www.erp5.com/
Hi Kazuhiko Shiozaki-san. My permute-indexes tool is handy when doingperformance tuning/optimizer analysis: https://github.com/jamesbriggs/mysql-permute-indexes Thanks, James Briggs. -- Cassandra/MySQL DBA. Available in San Jose area or remote. cass_top: https://github.com/jamesbriggs/cassandra-top From: Kazuhiko Shiozaki <kazuhiko@nexedi.com> To: maria-developers@lists.launchpad.net Sent: Tuesday, December 2, 2014 6:22 AM Subject: [Maria-developers] better execution plan possible with INNER JOIN of mroonga (or whatever storage engine) ? Hello ! I am investigating a slow query and have a question about execution plan with mroonga (possibly any storage engine?). (with MariaDB 10.0.15.) Here are tables : CREATE TABLE catalog ( uid bigint(20) unsigned NOT NULL, path varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '', PRIMARY KEY (uid) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; CREATE TABLE catalog_full_text ( uid bigint(20) unsigned NOT NULL, title varchar(255) COLLATE utf8_unicode_ci DEFAULT '', PRIMARY KEY (uid), FULLTEXT KEY title (title) COMMENT 'parser "TokenBigramSplitSymbolAlphaDigit"' ) ENGINE=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; CREATE TABLE category ( uid bigint(20) unsigned NOT NULL, category_uid bigint(20) unsigned NOT NULL DEFAULT '0', base_category_uid bigint(20) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (uid,category_uid,base_category_uid) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; Here is the query : SELECT DISTINCT catalog.path, catalog.uid FROM ( ( catalog AS catalog INNER JOIN ( category AS category INNER JOIN catalog_full_text AS source_title_full_text ON category.base_category_uid = 18482 AND source_title_full_text.uid = category.category_uid ) ON category.uid = catalog.uid ) INNER JOIN catalog_full_text AS title_full_text ON title_full_text.uid = catalog.uid ) WHERE MATCH (source_title_full_text.title) AGAINST ('bar' IN BOOLEAN MODE) AND MATCH (title_full_text.title) AGAINST ('foo' IN BOOLEAN MODE) Here is the output of EXPLAIN : *************************** 1. row *************************** id: 1 select_type: SIMPLE table: source_title_full_text type: fulltext possible_keys: PRIMARY,title key: title key_len: 0 ref: rows: 1 Extra: Using where with pushed condition; Using temporary *************************** 2. row *************************** id: 1 select_type: SIMPLE table: title_full_text type: fulltext possible_keys: PRIMARY,title key: title key_len: 0 ref: rows: 1 Extra: Using where with pushed condition *************************** 3. row *************************** id: 1 select_type: SIMPLE table: catalog type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 8 ref: test.title_full_text.uid rows: 1 Extra: *************************** 4. row *************************** id: 1 select_type: SIMPLE table: category type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 24 ref: test.title_full_text.uid,test.source_title_full_text.uid,const rows: 1 Extra: Using index; Distinct As you see, we have two full-text queries at the beginning and both have 'rows = 1'. In a slow query case, 25k records mathces with the first full-text query and 1.6k records matches with the second full-text query. When I manually modify the order of INNER JOIN so that the 'lighter' full-text query appears first, EXPLAIN output is different as well and the performance is much improved. Accoding to Kohei, a Mroonga developer, we always have 'rows = 1' for mroonga full-text query because MariaDB does not request such value to mroonga (JOIN::get_examined_rows() in sql/sql_select.cc), and it is possible to modify mroonga to return an estimation of rows if MariaDB has such an API. So here are my questions : * can MariaDB request an 'estimation of rows' to mroonga (or whatever storage engine) ? * can such information help to decide a better execution plan with INNER JOIN like above ? Thanks in advance ! -- Kazuhiko Shiozaki, Nexedi SA Senior Consultant Nexedi: Consulting and Development of Free / Open Source Software http://www.nexedi.com/ ERP5: Full Featured High End Open Source ERP http://www.erp5.com/ _______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
participants (3)
-
James Briggs
-
Kazuhiko Shiozaki
-
Sergei Golubchik