Hi Kazuhiko Shiozaki-san.

My permute-indexes tool is handy when doing
performance tuning/optimizer analysis:

https://github.com/jamesbriggs/mysql-permute-indexes
Thanks, James Briggs.
--
Cassandra/MySQL DBA. Available in San Jose area or remote.
cass_top: https://github.com/jamesbriggs/cassandra-top


From: Kazuhiko Shiozaki <kazuhiko@nexedi.com>
To: maria-developers@lists.launchpad.net
Sent: Tuesday, December 2, 2014 6:22 AM
Subject: [Maria-developers] better execution plan possible with INNER JOIN of mroonga (or whatever storage engine) ?

Hello !

I am investigating a slow query and have a question about execution plan
with mroonga (possibly any storage engine?).

(with MariaDB 10.0.15.)

Here are tables :

CREATE TABLE catalog (
  uid bigint(20) unsigned NOT NULL,
  path varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (uid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE catalog_full_text (
  uid bigint(20) unsigned NOT NULL,
  title varchar(255) COLLATE utf8_unicode_ci DEFAULT '',
  PRIMARY KEY (uid),
  FULLTEXT KEY title (title) COMMENT 'parser
"TokenBigramSplitSymbolAlphaDigit"'
) ENGINE=Mroonga DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE category (
  uid bigint(20) unsigned NOT NULL,
  category_uid bigint(20) unsigned NOT NULL DEFAULT '0',
  base_category_uid bigint(20) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (uid,category_uid,base_category_uid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Here is the query :

SELECT DISTINCT
  catalog.path,
  catalog.uid
FROM
  (
    (
      catalog AS catalog
      INNER JOIN
      (
        category AS category
        INNER JOIN
        catalog_full_text AS source_title_full_text
        ON
        category.base_category_uid = 18482
        AND source_title_full_text.uid = category.category_uid
      )
      ON
      category.uid = catalog.uid
    )
    INNER JOIN
    catalog_full_text AS title_full_text
    ON
    title_full_text.uid = catalog.uid
  )
WHERE
  MATCH (source_title_full_text.title) AGAINST ('bar' IN BOOLEAN MODE)
  AND MATCH (title_full_text.title) AGAINST ('foo' IN BOOLEAN MODE)

Here is the output of EXPLAIN :

*************************** 1. row ***************************
          id: 1
  select_type: SIMPLE
        table: source_title_full_text
        type: fulltext
possible_keys: PRIMARY,title
          key: title
      key_len: 0
          ref:
        rows: 1
        Extra: Using where with pushed condition; Using temporary
*************************** 2. row ***************************
          id: 1
  select_type: SIMPLE
        table: title_full_text
        type: fulltext
possible_keys: PRIMARY,title
          key: title
      key_len: 0
          ref:
        rows: 1
        Extra: Using where with pushed condition
*************************** 3. row ***************************
          id: 1
  select_type: SIMPLE
        table: catalog
        type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: test.title_full_text.uid
        rows: 1
        Extra:
*************************** 4. row ***************************
          id: 1
  select_type: SIMPLE
        table: category
        type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 24
          ref:
test.title_full_text.uid,test.source_title_full_text.uid,const
        rows: 1
        Extra: Using index; Distinct

As you see, we have two full-text queries at the beginning and both have
'rows = 1'.

In a slow query case, 25k records mathces with the first full-text query
and 1.6k records matches with the second full-text query.

When I manually modify the order of INNER JOIN so that the 'lighter'
full-text query appears first, EXPLAIN output is different as well and
the performance is much improved.

Accoding to Kohei, a Mroonga developer, we always have 'rows = 1' for
mroonga full-text query because MariaDB does not request such value to
mroonga (JOIN::get_examined_rows() in sql/sql_select.cc), and it is
possible to modify mroonga to return an estimation of rows if MariaDB
has such an API.

So here are my questions :

* can MariaDB request an 'estimation of rows' to mroonga (or whatever
storage engine) ?
* can such information help to decide a better execution plan with INNER
JOIN like above ?

Thanks in advance !
--
Kazuhiko Shiozaki, Nexedi SA Senior Consultant
Nexedi: Consulting and Development of Free / Open Source Software
http://www.nexedi.com/
ERP5: Full Featured High End Open Source ERP
http://www.erp5.com/

_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to    : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help  : https://help.launchpad.net/ListHelp