developers
Threads by month
- ----- 2025 -----
- April
- March
- February
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
January 2010
- 22 participants
- 177 discussions

[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
by worklog-noreply@askmonty.org 20 Jan '10
20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 14:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.26873 2010-01-20 14:50:41.000000000 +0200
+++ /tmp/wklog.66.new.26873 2010-01-20 14:50:41.000000000 +0200
@@ -4,7 +4,6 @@
To check/discuss:
-----------------
-* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
@@ -41,7 +40,12 @@
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
-We dont support cases when outer_expr or correlation_references are blobs.
+We don't support cases when outer_expr or correlation_references are blobs.
+
+All subquery predicates are cached. That is, if one subquery predicate is
+located within another, both of them will have caches (one option to reduce
+cache memory usage was to use cache only for the upper-most select. we decided
+against it).
2. Data structure used for the cache
------------------------------------
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
------------------------------------------------------------
-=-=(View All Progress Notes, 12 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
-----------------
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We don't support cases when outer_expr or correlation_references are blobs.
All subquery predicates are cached. That is, if one subquery predicate is
located within another, both of them will have caches (one option to reduce
cache memory usage was to use cache only for the upper-most select. we decided
against it).
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of subquery predicate.
This is done by creating an Item_field $I which refers to appropriate
temporary table's field and then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
by worklog-noreply@askmonty.org 20 Jan '10
20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 14:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.26873 2010-01-20 14:50:41.000000000 +0200
+++ /tmp/wklog.66.new.26873 2010-01-20 14:50:41.000000000 +0200
@@ -4,7 +4,6 @@
To check/discuss:
-----------------
-* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
@@ -41,7 +40,12 @@
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
-We dont support cases when outer_expr or correlation_references are blobs.
+We don't support cases when outer_expr or correlation_references are blobs.
+
+All subquery predicates are cached. That is, if one subquery predicate is
+located within another, both of them will have caches (one option to reduce
+cache memory usage was to use cache only for the upper-most select. we decided
+against it).
2. Data structure used for the cache
------------------------------------
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
------------------------------------------------------------
-=-=(View All Progress Notes, 12 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
-----------------
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We don't support cases when outer_expr or correlation_references are blobs.
All subquery predicates are cached. That is, if one subquery predicate is
located within another, both of them will have caches (one option to reduce
cache memory usage was to use cache only for the upper-most select. we decided
against it).
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of subquery predicate.
This is done by creating an Item_field $I which refers to appropriate
temporary table's field and then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
by worklog-noreply@askmonty.org 20 Jan '10
20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
------------------------------------------------------------
-=-=(View All Progress Notes, 11 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
-----------------
* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of subquery predicate.
This is done by creating an Item_field $I which refers to appropriate
temporary table's field and then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 20 Jan '10
by worklog-noreply@askmonty.org 20 Jan '10
20 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Wed, 20 Jan 2010, 13:07)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.17649 2010-01-20 13:07:07.000000000 +0200
+++ /tmp/wklog.66.new.17649 2010-01-20 13:07:07.000000000 +0200
@@ -3,7 +3,13 @@
To check/discuss:
- To put subquery cache on all levels of subqueries or on highest level only.
+-----------------
+* Do we put subquery cache on all levels of subqueries or on highest level only
+* Will there be any means to measure subquery cache hit rate?
+* MySQL-6.0 has a one-element predicate result cache. It is called "left
+ expression cache", grep for left_expr_cache in sql/item_subselect.*
+ When this WL is merged with 6.0's optimizations, these two caches will
+ need to be unified somehow.
<contents>
-=-=(Psergey - Mon, 18 Jan 2010, 16:40)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24899 2010-01-18 16:40:16.000000000 +0200
+++ /tmp/wklog.66.new.24899 2010-01-18 16:40:16.000000000 +0200
@@ -1,3 +1,5 @@
+* Target version: base on mysql-5.2 code
+
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
-=-=(Psergey - Mon, 18 Jan 2010, 16:37)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24586 2010-01-18 16:37:07.000000000 +0200
+++ /tmp/wklog.66.new.24586 2010-01-18 16:37:07.000000000 +0200
@@ -4,6 +4,11 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to fill the temptable
+-------------------------
+Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
+and is supposed to be quite trivial.
+
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
------------------------------------------------------------
-=-=(View All Progress Notes, 11 total)=-=-
http://askmonty.org/worklog/index.pl?tid=66&nolimit=1
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
-----------------
* Do we put subquery cache on all levels of subqueries or on highest level only
* Will there be any means to measure subquery cache hit rate?
* MySQL-6.0 has a one-element predicate result cache. It is called "left
expression cache", grep for left_expr_cache in sql/item_subselect.*
When this WL is merged with 6.0's optimizations, these two caches will
need to be unified somehow.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
* Target version: base on mysql-5.2 code
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to fill the temptable
-------------------------
Can reuse approach from SJ-Materialization. Its code is in end_sj_materialize()
and is supposed to be quite trivial.
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of subquery predicate.
This is done by creating an Item_field $I which refers to appropriate
temporary table's field and then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Rev 2769: Added create options for table^ fields and keys (MWL#43). in file:///Users/bell/maria/bzr/work-maria-5.1-create-options/
by sanja@askmonty.org 19 Jan '10
by sanja@askmonty.org 19 Jan '10
19 Jan '10
At file:///Users/bell/maria/bzr/work-maria-5.1-create-options/
------------------------------------------------------------
revno: 2769
revision-id: sanja(a)askmonty.org-20091201120747-cvp0uni2rd4iy5xi
parent: bo.thorsen(a)canonical.com-20091126153249-0xfszhcynbym2lvr
committer: sanja(a)askmonty.org
branch nick: work-maria-5.1-create-options
timestamp: Tue 2009-12-01 14:07:47 +0200
message:
Added create options for table^ fields and keys (MWL#43).
Diff too large for email (1985 lines, the limit is 1000).
3
6

[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply@askmonty.org 19 Jan '10
by worklog-noreply@askmonty.org 19 Jan '10
19 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs
CREATION DATE..: Fri, 27 Nov 2009, 13:22
SUPERVISOR.....: Monty
IMPLEMENTOR....: Timour
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 68 (http://askmonty.org/worklog/?tid=68)
VERSION........: Server-9.x
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Timour - Tue, 19 Jan 2010, 18:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.22569 2010-01-19 18:44:01.000000000 +0200
+++ /tmp/wklog.68.new.22569 2010-01-19 18:44:01.000000000 +0200
@@ -132,11 +132,10 @@
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
- if (nonull_key)
- pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
+ if (vkey[i] != nonull_key)
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
@@ -167,7 +166,7 @@
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
- else if (cur_min_key == nonull_key)
+ else if (vkey[cur_min_key] == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
@@ -183,8 +182,10 @@
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
- procedure determines an optimal order and a mapping
- idx_no -> idx_order (encoded as an array).
+ procedure determines an optimal order and a mapping idx_no -> idx_order
+ (encoded as an array).
+
+ This procedure makes sure not to match the non-NULL column.
*/
if (test_null_row(null_keys, min_row))
return TRUE
@@ -198,6 +199,14 @@
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
+ else if (vkey[cur_min_key] == nonull_key)
+ {
+ /*
+ If there can't be more matches for the nonull_key, we know for sure
+ there is no match, since there is no possible NULL match.
+ */
+ return FALSE
+ }
if (pq.is_empty())
{
@@ -216,7 +225,6 @@
}
-
3. Directions for improvement
========================================================================
-=-=(Timour - Tue, 19 Jan 2010, 18:29)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.21045 2010-01-19 18:29:12.000000000 +0200
+++ /tmp/wklog.68.new.21045 2010-01-19 18:29:12.000000000 +0200
@@ -132,6 +132,8 @@
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
+ if (nonull_key)
+ pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
-=-=(Guest - Tue, 19 Jan 2010, 18:15)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.19825 2010-01-19 18:15:30.000000000 +0200
+++ /tmp/wklog.68.new.19825 2010-01-19 18:15:30.000000000 +0200
@@ -1,8 +1,16 @@
-This a copy of the initial algorithm proposed by Igor:
-======================================================
+Contents
+========================================================================
-For each left side tuple (v_1,...,v_n) we have to find the following set
-of rowids for the temp table containing N rows as the result of
+1. Initial idea as proposed by Igor
+2. Algorithm for IN execution with partial matching
+3. Directions for improvement
+
+
+1. Initial idea as proposed by Igor
+========================================================================
+
+For each left side tuple (v_1,...,v_n) we have to find the following
+set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
@@ -18,38 +26,198 @@
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
-Taken all above into account I could suggest the following algorithm to
-build R:
+Taken all above into account I could suggest the following algorithm
+to build R:
- Using indexes (read about them below) for each column participating in the
- intersection,
- merge ordered sets rowid{a_i=v_i} in the following manner.
+ Using indexes (read about them below) for each column participating
+ in the intersection, merge ordered sets rowid{a_i=v_i} in the
+ following manner.
If a rowid r has been encountered maximum in k sets
-rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
-not in {i1,...,ik}.
+ not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
-Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
-is null} is either
+Here we use the property (1):
+any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
-infer that for any r from R
-indexes a_i can be uniquely divided into two groups: one contains
-indexes a_i where r belongs to
-the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
-belongs to rowid{a_j is null}.
-
-Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
-needed for the merge procedure. We could use BTREE indexes for temp
-table. But they are rather expensive and
-take a lot of memory as the are implemented with RB trees.
+infer that for any r from R indexes a_i can be uniquely divided into
+two groups:
+- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
+- the other contains indexes a_j such that r belongs to
+ rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
+order needed for the merge procedure. We could use BTREE indexes for
+temp table. But they are rather expensive and take a lot of memory as
+the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
+2. Algorithm for IN execution with partial matching
+========================================================================
+
+2.1 Below is shown the top-level algorithm to execute an IN predicate
+with partial matching. This algorithm is essentially the implementation
+of Item_subselect:exec().
+
+int lookup_with_null_semantics(outer_ref[], mat_subquery)
+{
+ if (index_lookup(outer_ref, mat_subquery)
+ return TRUE
+ else
+ {
+ /*
+ Check if there is a partial match (UNKNOWN) or no match (NULL).
+ */
+ if (this is the first partial match)
+ {
+ vkey[] = build array of value keys for each NULL-able column
+ of mat_subquery.
+ nkey[] = build a bitmap NULL index for each column of mat_subquery
+ that contains NULLs
+ nonull_key = build a key over all non-NULL columns of mat_subquery
+ }
+ if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
+ return UNKNOWN
+ else
+ return FALSE
+ }
+}
+
+2.2 The implementation of partial matching is as follows
+
+/*
+ Assumptions:
+ - It has already been checked if there is a complete match by a
+ regular index lookup, and the test failed.
+ - It has already been checked if there is a complete NULL row,
+ and if there was we wouldn't call this function. Thus we assume
+ that there is no complete NULL row.
+ - Not all vidx_i are empty, but some can be empty. If all were empty,
+ then the only possibility for a match is a complete NULL row, which
+ we already checked.
+
+ @param outer_ref - the uter (left) IN argument.
+ @param vidx[] - array of value keys
+ Ordered sequences of rowids of the corresponding columns a_i, such
+ that all rowids in idx_i are the ones where column a_i contains some
+ value or NULL. Each idx_i is derived dynamically, for each different
+ left argument of an IN predicate.
+ @param nidx[] - array of NULL keys
+ Bitmpas, one per each column, where a bit is set if the corresponding
+ row has a NULL value for the corresponding column.
+ @nonull_key - the only key over all columns of the materialized subquery
+ that do not contain NULLs
+
+ @returns
+ @retval FALSE if there is no match
+ @retval TRUE if there is a partial match
+*/
+
+Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
+{
+ /* Set of the keys (columns) that form a partial match. */
+ Set matching_keys = {}
+ /* A subset of all keys that need to be checked for NULL matches. */
+ Set null_keys = {}
+ Int min_key /* Key that contains the current minimum position. */
+ Int min_row /* Current row number of min_key. */
+ Int cur_min_key, cur_min_row
+ PriorityQueue pq
+
+ if (nonull_key && ! nonull_key->lookup(outer_ref))
+ return FALSE
+
+ for (i = 1; i <= n; i++)
+ {
+ vkey[i].lookup(outer_ref)
+ if (! vkey[i].is_eof())
+ pq.insert(i)
+ }
+ /*
+ Not all value keys are empty, thus we don't have only NULL
+ keys. If we had, the only possible match is a NULL row, and
+ we cheked there is no such row, therefore the result is known
+ to be FALSE.
+ In fact this algorithm makes sense for at least two non-NULL
+ columns.
+ */
+ assert(pq.elements > 1)
+
+ (min_key, min_row) = pq.pop()
+ matching_keys.add(min_key)
+ vkey[min_key].next()
+ if (! vkey[min_key].is_eof())
+ pq.insert(min_key)
+
+ while (TRUE)
+ {
+ (cur_min_key, cur_min_row) = pq.pop()
+
+ if (cur_min_row == min_row)
+ {
+ matching_keys.add(cur_min_key)
+ /* There cannot be a complete match, as we already checked for one. */
+ assert(matching_keys.elements < n)
+ }
+ else if (cur_min_key == nonull_key)
+ {
+ /*
+ The non-NULL key has no corresponding NULL index, so we know for
+ sure that the row 'min_row' is not a match.
+ */
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ else
+ {
+ assert(cur_min_row > min_row) /* Follows from the use of PQ. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ /*
+ Check if all null_keys contain a NULL at row 'min_row'. The procedure
+ internally checks all keys in a special precomputed order. A prior
+ procedure determines an optimal order and a mapping
+ idx_no -> idx_order (encoded as an array).
+ */
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ {
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ }
+
+ vkey[cur_min_key].next()
+ if (! vkey[cur_min_key].is_eof())
+ pq.insert(cur_min_key)
+
+ if (pq.is_empty())
+ {
+ /* Check the last row of the last column in PQ for NULL matches. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ return FALSE
+ }
+ }
+
+ /* We should never get here. */
+ assert(FALSE)
+ return FALSE
+}
+
+
+
+3. Directions for improvement
+========================================================================
+
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-=-=(Timour - Sun, 06 Dec 2009, 14:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.12919 2009-12-06 14:36:18.000000000 +0200
+++ /tmp/wklog.68.new.12919 2009-12-06 14:36:18.000000000 +0200
@@ -87,3 +87,8 @@
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
+8. [timour]
+ Consider that due to materialization, we already have a unique index
+on all columns <a_1,..., a_n>. We can use the first key part of this index
+over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
+creating the index rowid{a_i=v_i}.
-=-=(Timour - Fri, 04 Dec 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.16724 2009-12-04 14:04:28.000000000 +0200
+++ /tmp/wklog.68.new.16724 2009-12-04 14:04:28.000000000 +0200
@@ -10,7 +10,8 @@
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
- (2) For each i: rowid{a_i is null} is the same for each tuple
+ (2) For each i: rowid{a_i is null} is the same for each tuple,
+ that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Version updated.
--- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200
+++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Category updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Status updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Assigned
-=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200
+++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200
@@ -50,23 +50,39 @@
The array can be created on demand.
Other consideration that may be taken into account:
+
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-table at all, create for them only one index array (and of course do not
-create any bitmaps for them).
-2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
-where a_i is not null and V(a_i) is the number of distinct values for
-a_i excluding nulls.
- If d(a_i) is close to 1 then do not create any index array: check
+table at all and v_j1,...,v_jm cannot be null, create for these columns
+only one index array (and of course do not create any bitmaps for them).
+
+2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
+of rows, where a_i is not null and V(a_i) is the number of distinct
+values for a_i excluding nulls.
+If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
-filtered in. Anyway if d(a_i) is close to 1 then a intersection with
-rowid{a_i=v_i} would not reduce the number of remaining rowids
+filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
+ with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
- If additionally N-N' is small do not create a bitmap for this column
-either.
-3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
-sorted array of rowids from the set rowid{a_i is null} can be used
-instead of a bitmap.
+In other words is V(a_i) exceeds some threshold there is no sense to
+create an index for a_i.
+If additionally N-N'(a_i) is small do not create a bitmap for this
+column either.
+
+3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
+small a sorted array of rowids from the set rowid{a_i is null} can be
+used instead of a bitmap.
+
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
+
+5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
+created only for rows with nulls.
+
+6. If v1,...,vn never can be a null and number of rows with nulls is
+small do not create indexes and do not create bitmaps.
+
+7. If you get a row with nulls in all columns stop filling the temporary
+table and return UNKNOWN for any tuple <v1,...,vn>.
+
-=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000
+++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000
@@ -1 +1,72 @@
+This a copy of the initial algorithm proposed by Igor:
+======================================================
+For each left side tuple (v_1,...,v_n) we have to find the following set
+of rowids for the temp table containing N rows as the result of
+materialization of the subquery:
+
+ R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
+trough all indexes from [1..n] such that v_i is not null.
+
+Bear in mind the following specifics of this intersection:
+ (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
+ (2) For each i: rowid{a_i is null} is the same for each tuple
+
+Due to (2) it makes sense to build rowid{a_i is null} only once.
+A good representation for such sets would be bitmaps:
+- it requires minimum memory: not more than N*n bits in total
+- search of an element in a set is extremely cheap
+
+Taken all above into account I could suggest the following algorithm to
+build R:
+
+ Using indexes (read about them below) for each column participating in the
+ intersection,
+ merge ordered sets rowid{a_i=v_i} in the following manner.
+ If a rowid r has been encountered maximum in k sets
+rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ then it has to be checked against all rowid{a_i=v_i} such that i is
+not in {i1,...,ik}.
+ As soon as we fail to find r in one of these sets we discard it.
+ If r has been found in all of them then r belongs to the set R.
+
+Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
+is null} is either
+belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
+infer that for any r from R
+indexes a_i can be uniquely divided into two groups: one contains
+indexes a_i where r belongs to
+the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
+belongs to rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.
DESCRIPTION:
The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
<oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subqury result column contains NULLs.
The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs.
If some argument contains a NULL, then due to NULL semantics, it
plays the role of a wildcard. If we were to use regular index lookups,
then we would get 'no match' for some outer tuple (thus the predicate
evaluates to FALSE), while the SQL semantics means 'partial match', and
the predicate should evaluate to NULL.
This task implements an efficient algorithm to compute such 'parial
matches', where a NULL matches any value.
HIGH-LEVEL SPECIFICATION:
Contents
========================================================================
1. Initial idea as proposed by Igor
2. Algorithm for IN execution with partial matching
3. Directions for improvement
1. Initial idea as proposed by Igor
========================================================================
For each left side tuple (v_1,...,v_n) we have to find the following
set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
trough all indexes from [1..n] such that v_i is not null.
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
(2) For each i: rowid{a_i is null} is the same for each tuple,
that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
Taken all above into account I could suggest the following algorithm
to build R:
Using indexes (read about them below) for each column participating
in the intersection, merge ordered sets rowid{a_i=v_i} in the
following manner.
If a rowid r has been encountered maximum in k sets
rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
Here we use the property (1):
any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
infer that for any r from R indexes a_i can be uniquely divided into
two groups:
- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
- the other contains indexes a_j such that r belongs to
rowid{a_j is null}.
Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
order needed for the merge procedure. We could use BTREE indexes for
temp table. But they are rather expensive and take a lot of memory as
the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
2. Algorithm for IN execution with partial matching
========================================================================
2.1 Below is shown the top-level algorithm to execute an IN predicate
with partial matching. This algorithm is essentially the implementation
of Item_subselect:exec().
int lookup_with_null_semantics(outer_ref[], mat_subquery)
{
if (index_lookup(outer_ref, mat_subquery)
return TRUE
else
{
/*
Check if there is a partial match (UNKNOWN) or no match (NULL).
*/
if (this is the first partial match)
{
vkey[] = build array of value keys for each NULL-able column
of mat_subquery.
nkey[] = build a bitmap NULL index for each column of mat_subquery
that contains NULLs
nonull_key = build a key over all non-NULL columns of mat_subquery
}
if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
return UNKNOWN
else
return FALSE
}
}
2.2 The implementation of partial matching is as follows
/*
Assumptions:
- It has already been checked if there is a complete match by a
regular index lookup, and the test failed.
- It has already been checked if there is a complete NULL row,
and if there was we wouldn't call this function. Thus we assume
that there is no complete NULL row.
- Not all vidx_i are empty, but some can be empty. If all were empty,
then the only possibility for a match is a complete NULL row, which
we already checked.
@param outer_ref - the uter (left) IN argument.
@param vidx[] - array of value keys
Ordered sequences of rowids of the corresponding columns a_i, such
that all rowids in idx_i are the ones where column a_i contains some
value or NULL. Each idx_i is derived dynamically, for each different
left argument of an IN predicate.
@param nidx[] - array of NULL keys
Bitmpas, one per each column, where a bit is set if the corresponding
row has a NULL value for the corresponding column.
@nonull_key - the only key over all columns of the materialized subquery
that do not contain NULLs
@returns
@retval FALSE if there is no match
@retval TRUE if there is a partial match
*/
Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
{
/* Set of the keys (columns) that form a partial match. */
Set matching_keys = {}
/* A subset of all keys that need to be checked for NULL matches. */
Set null_keys = {}
Int min_key /* Key that contains the current minimum position. */
Int min_row /* Current row number of min_key. */
Int cur_min_key, cur_min_row
PriorityQueue pq
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
for (i = 1; i <= n; i++)
{
if (vkey[i] != nonull_key)
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
}
/*
Not all value keys are empty, thus we don't have only NULL
keys. If we had, the only possible match is a NULL row, and
we cheked there is no such row, therefore the result is known
to be FALSE.
In fact this algorithm makes sense for at least two non-NULL
columns.
*/
assert(pq.elements > 1)
(min_key, min_row) = pq.pop()
matching_keys.add(min_key)
vkey[min_key].next()
if (! vkey[min_key].is_eof())
pq.insert(min_key)
while (TRUE)
{
(cur_min_key, cur_min_row) = pq.pop()
if (cur_min_row == min_row)
{
matching_keys.add(cur_min_key)
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
else if (vkey[cur_min_key] == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
sure that the row 'min_row' is not a match.
*/
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
else
{
assert(cur_min_row > min_row) /* Follows from the use of PQ. */
null_keys = set_difference(all keys vkey[], matching_keys)
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
procedure determines an optimal order and a mapping idx_no -> idx_order
(encoded as an array).
This procedure makes sure not to match the non-NULL column.
*/
if (test_null_row(null_keys, min_row))
return TRUE
else
{
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
}
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
else if (vkey[cur_min_key] == nonull_key)
{
/*
If there can't be more matches for the nonull_key, we know for sure
there is no match, since there is no possible NULL match.
*/
return FALSE
}
if (pq.is_empty())
{
/* Check the last row of the last column in PQ for NULL matches. */
null_keys = set_difference(all keys vkey[], matching_keys)
if (test_null_row(null_keys, min_row))
return TRUE
else
return FALSE
}
}
/* We should never get here. */
assert(FALSE)
return FALSE
}
3. Directions for improvement
========================================================================
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).
2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows, where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
In other words is V(a_i) exceeds some threshold there is no sense to
create an index for a_i.
If additionally N-N'(a_i) is small do not create a bitmap for this
column either.
3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
created only for rows with nulls.
6. If v1,...,vn never can be a null and number of rows with nulls is
small do not create indexes and do not create bitmaps.
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
8. [timour]
Consider that due to materialization, we already have a unique index
on all columns <a_1,..., a_n>. We can use the first key part of this index
over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
creating the index rowid{a_i=v_i}.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply@askmonty.org 19 Jan '10
by worklog-noreply@askmonty.org 19 Jan '10
19 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs
CREATION DATE..: Fri, 27 Nov 2009, 13:22
SUPERVISOR.....: Monty
IMPLEMENTOR....: Timour
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 68 (http://askmonty.org/worklog/?tid=68)
VERSION........: Server-9.x
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Timour - Tue, 19 Jan 2010, 18:44)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.22569 2010-01-19 18:44:01.000000000 +0200
+++ /tmp/wklog.68.new.22569 2010-01-19 18:44:01.000000000 +0200
@@ -132,11 +132,10 @@
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
- if (nonull_key)
- pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
+ if (vkey[i] != nonull_key)
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
@@ -167,7 +166,7 @@
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
- else if (cur_min_key == nonull_key)
+ else if (vkey[cur_min_key] == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
@@ -183,8 +182,10 @@
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
- procedure determines an optimal order and a mapping
- idx_no -> idx_order (encoded as an array).
+ procedure determines an optimal order and a mapping idx_no -> idx_order
+ (encoded as an array).
+
+ This procedure makes sure not to match the non-NULL column.
*/
if (test_null_row(null_keys, min_row))
return TRUE
@@ -198,6 +199,14 @@
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
+ else if (vkey[cur_min_key] == nonull_key)
+ {
+ /*
+ If there can't be more matches for the nonull_key, we know for sure
+ there is no match, since there is no possible NULL match.
+ */
+ return FALSE
+ }
if (pq.is_empty())
{
@@ -216,7 +225,6 @@
}
-
3. Directions for improvement
========================================================================
-=-=(Timour - Tue, 19 Jan 2010, 18:29)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.21045 2010-01-19 18:29:12.000000000 +0200
+++ /tmp/wklog.68.new.21045 2010-01-19 18:29:12.000000000 +0200
@@ -132,6 +132,8 @@
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
+ if (nonull_key)
+ pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
-=-=(Guest - Tue, 19 Jan 2010, 18:15)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.19825 2010-01-19 18:15:30.000000000 +0200
+++ /tmp/wklog.68.new.19825 2010-01-19 18:15:30.000000000 +0200
@@ -1,8 +1,16 @@
-This a copy of the initial algorithm proposed by Igor:
-======================================================
+Contents
+========================================================================
-For each left side tuple (v_1,...,v_n) we have to find the following set
-of rowids for the temp table containing N rows as the result of
+1. Initial idea as proposed by Igor
+2. Algorithm for IN execution with partial matching
+3. Directions for improvement
+
+
+1. Initial idea as proposed by Igor
+========================================================================
+
+For each left side tuple (v_1,...,v_n) we have to find the following
+set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
@@ -18,38 +26,198 @@
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
-Taken all above into account I could suggest the following algorithm to
-build R:
+Taken all above into account I could suggest the following algorithm
+to build R:
- Using indexes (read about them below) for each column participating in the
- intersection,
- merge ordered sets rowid{a_i=v_i} in the following manner.
+ Using indexes (read about them below) for each column participating
+ in the intersection, merge ordered sets rowid{a_i=v_i} in the
+ following manner.
If a rowid r has been encountered maximum in k sets
-rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
-not in {i1,...,ik}.
+ not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
-Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
-is null} is either
+Here we use the property (1):
+any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
-infer that for any r from R
-indexes a_i can be uniquely divided into two groups: one contains
-indexes a_i where r belongs to
-the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
-belongs to rowid{a_j is null}.
-
-Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
-needed for the merge procedure. We could use BTREE indexes for temp
-table. But they are rather expensive and
-take a lot of memory as the are implemented with RB trees.
+infer that for any r from R indexes a_i can be uniquely divided into
+two groups:
+- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
+- the other contains indexes a_j such that r belongs to
+ rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
+order needed for the merge procedure. We could use BTREE indexes for
+temp table. But they are rather expensive and take a lot of memory as
+the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
+2. Algorithm for IN execution with partial matching
+========================================================================
+
+2.1 Below is shown the top-level algorithm to execute an IN predicate
+with partial matching. This algorithm is essentially the implementation
+of Item_subselect:exec().
+
+int lookup_with_null_semantics(outer_ref[], mat_subquery)
+{
+ if (index_lookup(outer_ref, mat_subquery)
+ return TRUE
+ else
+ {
+ /*
+ Check if there is a partial match (UNKNOWN) or no match (NULL).
+ */
+ if (this is the first partial match)
+ {
+ vkey[] = build array of value keys for each NULL-able column
+ of mat_subquery.
+ nkey[] = build a bitmap NULL index for each column of mat_subquery
+ that contains NULLs
+ nonull_key = build a key over all non-NULL columns of mat_subquery
+ }
+ if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
+ return UNKNOWN
+ else
+ return FALSE
+ }
+}
+
+2.2 The implementation of partial matching is as follows
+
+/*
+ Assumptions:
+ - It has already been checked if there is a complete match by a
+ regular index lookup, and the test failed.
+ - It has already been checked if there is a complete NULL row,
+ and if there was we wouldn't call this function. Thus we assume
+ that there is no complete NULL row.
+ - Not all vidx_i are empty, but some can be empty. If all were empty,
+ then the only possibility for a match is a complete NULL row, which
+ we already checked.
+
+ @param outer_ref - the uter (left) IN argument.
+ @param vidx[] - array of value keys
+ Ordered sequences of rowids of the corresponding columns a_i, such
+ that all rowids in idx_i are the ones where column a_i contains some
+ value or NULL. Each idx_i is derived dynamically, for each different
+ left argument of an IN predicate.
+ @param nidx[] - array of NULL keys
+ Bitmpas, one per each column, where a bit is set if the corresponding
+ row has a NULL value for the corresponding column.
+ @nonull_key - the only key over all columns of the materialized subquery
+ that do not contain NULLs
+
+ @returns
+ @retval FALSE if there is no match
+ @retval TRUE if there is a partial match
+*/
+
+Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
+{
+ /* Set of the keys (columns) that form a partial match. */
+ Set matching_keys = {}
+ /* A subset of all keys that need to be checked for NULL matches. */
+ Set null_keys = {}
+ Int min_key /* Key that contains the current minimum position. */
+ Int min_row /* Current row number of min_key. */
+ Int cur_min_key, cur_min_row
+ PriorityQueue pq
+
+ if (nonull_key && ! nonull_key->lookup(outer_ref))
+ return FALSE
+
+ for (i = 1; i <= n; i++)
+ {
+ vkey[i].lookup(outer_ref)
+ if (! vkey[i].is_eof())
+ pq.insert(i)
+ }
+ /*
+ Not all value keys are empty, thus we don't have only NULL
+ keys. If we had, the only possible match is a NULL row, and
+ we cheked there is no such row, therefore the result is known
+ to be FALSE.
+ In fact this algorithm makes sense for at least two non-NULL
+ columns.
+ */
+ assert(pq.elements > 1)
+
+ (min_key, min_row) = pq.pop()
+ matching_keys.add(min_key)
+ vkey[min_key].next()
+ if (! vkey[min_key].is_eof())
+ pq.insert(min_key)
+
+ while (TRUE)
+ {
+ (cur_min_key, cur_min_row) = pq.pop()
+
+ if (cur_min_row == min_row)
+ {
+ matching_keys.add(cur_min_key)
+ /* There cannot be a complete match, as we already checked for one. */
+ assert(matching_keys.elements < n)
+ }
+ else if (cur_min_key == nonull_key)
+ {
+ /*
+ The non-NULL key has no corresponding NULL index, so we know for
+ sure that the row 'min_row' is not a match.
+ */
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ else
+ {
+ assert(cur_min_row > min_row) /* Follows from the use of PQ. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ /*
+ Check if all null_keys contain a NULL at row 'min_row'. The procedure
+ internally checks all keys in a special precomputed order. A prior
+ procedure determines an optimal order and a mapping
+ idx_no -> idx_order (encoded as an array).
+ */
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ {
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ }
+
+ vkey[cur_min_key].next()
+ if (! vkey[cur_min_key].is_eof())
+ pq.insert(cur_min_key)
+
+ if (pq.is_empty())
+ {
+ /* Check the last row of the last column in PQ for NULL matches. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ return FALSE
+ }
+ }
+
+ /* We should never get here. */
+ assert(FALSE)
+ return FALSE
+}
+
+
+
+3. Directions for improvement
+========================================================================
+
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-=-=(Timour - Sun, 06 Dec 2009, 14:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.12919 2009-12-06 14:36:18.000000000 +0200
+++ /tmp/wklog.68.new.12919 2009-12-06 14:36:18.000000000 +0200
@@ -87,3 +87,8 @@
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
+8. [timour]
+ Consider that due to materialization, we already have a unique index
+on all columns <a_1,..., a_n>. We can use the first key part of this index
+over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
+creating the index rowid{a_i=v_i}.
-=-=(Timour - Fri, 04 Dec 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.16724 2009-12-04 14:04:28.000000000 +0200
+++ /tmp/wklog.68.new.16724 2009-12-04 14:04:28.000000000 +0200
@@ -10,7 +10,8 @@
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
- (2) For each i: rowid{a_i is null} is the same for each tuple
+ (2) For each i: rowid{a_i is null} is the same for each tuple,
+ that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Version updated.
--- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200
+++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Category updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Status updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Assigned
-=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200
+++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200
@@ -50,23 +50,39 @@
The array can be created on demand.
Other consideration that may be taken into account:
+
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-table at all, create for them only one index array (and of course do not
-create any bitmaps for them).
-2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
-where a_i is not null and V(a_i) is the number of distinct values for
-a_i excluding nulls.
- If d(a_i) is close to 1 then do not create any index array: check
+table at all and v_j1,...,v_jm cannot be null, create for these columns
+only one index array (and of course do not create any bitmaps for them).
+
+2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
+of rows, where a_i is not null and V(a_i) is the number of distinct
+values for a_i excluding nulls.
+If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
-filtered in. Anyway if d(a_i) is close to 1 then a intersection with
-rowid{a_i=v_i} would not reduce the number of remaining rowids
+filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
+ with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
- If additionally N-N' is small do not create a bitmap for this column
-either.
-3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
-sorted array of rowids from the set rowid{a_i is null} can be used
-instead of a bitmap.
+In other words is V(a_i) exceeds some threshold there is no sense to
+create an index for a_i.
+If additionally N-N'(a_i) is small do not create a bitmap for this
+column either.
+
+3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
+small a sorted array of rowids from the set rowid{a_i is null} can be
+used instead of a bitmap.
+
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
+
+5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
+created only for rows with nulls.
+
+6. If v1,...,vn never can be a null and number of rows with nulls is
+small do not create indexes and do not create bitmaps.
+
+7. If you get a row with nulls in all columns stop filling the temporary
+table and return UNKNOWN for any tuple <v1,...,vn>.
+
-=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000
+++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000
@@ -1 +1,72 @@
+This a copy of the initial algorithm proposed by Igor:
+======================================================
+For each left side tuple (v_1,...,v_n) we have to find the following set
+of rowids for the temp table containing N rows as the result of
+materialization of the subquery:
+
+ R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
+trough all indexes from [1..n] such that v_i is not null.
+
+Bear in mind the following specifics of this intersection:
+ (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
+ (2) For each i: rowid{a_i is null} is the same for each tuple
+
+Due to (2) it makes sense to build rowid{a_i is null} only once.
+A good representation for such sets would be bitmaps:
+- it requires minimum memory: not more than N*n bits in total
+- search of an element in a set is extremely cheap
+
+Taken all above into account I could suggest the following algorithm to
+build R:
+
+ Using indexes (read about them below) for each column participating in the
+ intersection,
+ merge ordered sets rowid{a_i=v_i} in the following manner.
+ If a rowid r has been encountered maximum in k sets
+rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ then it has to be checked against all rowid{a_i=v_i} such that i is
+not in {i1,...,ik}.
+ As soon as we fail to find r in one of these sets we discard it.
+ If r has been found in all of them then r belongs to the set R.
+
+Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
+is null} is either
+belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
+infer that for any r from R
+indexes a_i can be uniquely divided into two groups: one contains
+indexes a_i where r belongs to
+the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
+belongs to rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.
DESCRIPTION:
The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
<oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subqury result column contains NULLs.
The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs.
If some argument contains a NULL, then due to NULL semantics, it
plays the role of a wildcard. If we were to use regular index lookups,
then we would get 'no match' for some outer tuple (thus the predicate
evaluates to FALSE), while the SQL semantics means 'partial match', and
the predicate should evaluate to NULL.
This task implements an efficient algorithm to compute such 'parial
matches', where a NULL matches any value.
HIGH-LEVEL SPECIFICATION:
Contents
========================================================================
1. Initial idea as proposed by Igor
2. Algorithm for IN execution with partial matching
3. Directions for improvement
1. Initial idea as proposed by Igor
========================================================================
For each left side tuple (v_1,...,v_n) we have to find the following
set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
trough all indexes from [1..n] such that v_i is not null.
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
(2) For each i: rowid{a_i is null} is the same for each tuple,
that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
Taken all above into account I could suggest the following algorithm
to build R:
Using indexes (read about them below) for each column participating
in the intersection, merge ordered sets rowid{a_i=v_i} in the
following manner.
If a rowid r has been encountered maximum in k sets
rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
Here we use the property (1):
any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
infer that for any r from R indexes a_i can be uniquely divided into
two groups:
- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
- the other contains indexes a_j such that r belongs to
rowid{a_j is null}.
Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
order needed for the merge procedure. We could use BTREE indexes for
temp table. But they are rather expensive and take a lot of memory as
the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
2. Algorithm for IN execution with partial matching
========================================================================
2.1 Below is shown the top-level algorithm to execute an IN predicate
with partial matching. This algorithm is essentially the implementation
of Item_subselect:exec().
int lookup_with_null_semantics(outer_ref[], mat_subquery)
{
if (index_lookup(outer_ref, mat_subquery)
return TRUE
else
{
/*
Check if there is a partial match (UNKNOWN) or no match (NULL).
*/
if (this is the first partial match)
{
vkey[] = build array of value keys for each NULL-able column
of mat_subquery.
nkey[] = build a bitmap NULL index for each column of mat_subquery
that contains NULLs
nonull_key = build a key over all non-NULL columns of mat_subquery
}
if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
return UNKNOWN
else
return FALSE
}
}
2.2 The implementation of partial matching is as follows
/*
Assumptions:
- It has already been checked if there is a complete match by a
regular index lookup, and the test failed.
- It has already been checked if there is a complete NULL row,
and if there was we wouldn't call this function. Thus we assume
that there is no complete NULL row.
- Not all vidx_i are empty, but some can be empty. If all were empty,
then the only possibility for a match is a complete NULL row, which
we already checked.
@param outer_ref - the uter (left) IN argument.
@param vidx[] - array of value keys
Ordered sequences of rowids of the corresponding columns a_i, such
that all rowids in idx_i are the ones where column a_i contains some
value or NULL. Each idx_i is derived dynamically, for each different
left argument of an IN predicate.
@param nidx[] - array of NULL keys
Bitmpas, one per each column, where a bit is set if the corresponding
row has a NULL value for the corresponding column.
@nonull_key - the only key over all columns of the materialized subquery
that do not contain NULLs
@returns
@retval FALSE if there is no match
@retval TRUE if there is a partial match
*/
Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
{
/* Set of the keys (columns) that form a partial match. */
Set matching_keys = {}
/* A subset of all keys that need to be checked for NULL matches. */
Set null_keys = {}
Int min_key /* Key that contains the current minimum position. */
Int min_row /* Current row number of min_key. */
Int cur_min_key, cur_min_row
PriorityQueue pq
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
for (i = 1; i <= n; i++)
{
if (vkey[i] != nonull_key)
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
}
/*
Not all value keys are empty, thus we don't have only NULL
keys. If we had, the only possible match is a NULL row, and
we cheked there is no such row, therefore the result is known
to be FALSE.
In fact this algorithm makes sense for at least two non-NULL
columns.
*/
assert(pq.elements > 1)
(min_key, min_row) = pq.pop()
matching_keys.add(min_key)
vkey[min_key].next()
if (! vkey[min_key].is_eof())
pq.insert(min_key)
while (TRUE)
{
(cur_min_key, cur_min_row) = pq.pop()
if (cur_min_row == min_row)
{
matching_keys.add(cur_min_key)
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
else if (vkey[cur_min_key] == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
sure that the row 'min_row' is not a match.
*/
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
else
{
assert(cur_min_row > min_row) /* Follows from the use of PQ. */
null_keys = set_difference(all keys vkey[], matching_keys)
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
procedure determines an optimal order and a mapping idx_no -> idx_order
(encoded as an array).
This procedure makes sure not to match the non-NULL column.
*/
if (test_null_row(null_keys, min_row))
return TRUE
else
{
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
}
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
else if (vkey[cur_min_key] == nonull_key)
{
/*
If there can't be more matches for the nonull_key, we know for sure
there is no match, since there is no possible NULL match.
*/
return FALSE
}
if (pq.is_empty())
{
/* Check the last row of the last column in PQ for NULL matches. */
null_keys = set_difference(all keys vkey[], matching_keys)
if (test_null_row(null_keys, min_row))
return TRUE
else
return FALSE
}
}
/* We should never get here. */
assert(FALSE)
return FALSE
}
3. Directions for improvement
========================================================================
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).
2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows, where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
In other words is V(a_i) exceeds some threshold there is no sense to
create an index for a_i.
If additionally N-N'(a_i) is small do not create a bitmap for this
column either.
3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
created only for rows with nulls.
6. If v1,...,vn never can be a null and number of rows with nulls is
small do not create indexes and do not create bitmaps.
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
8. [timour]
Consider that due to materialization, we already have a unique index
on all columns <a_1,..., a_n>. We can use the first key part of this index
over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
creating the index rowid{a_i=v_i}.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply@askmonty.org 19 Jan '10
by worklog-noreply@askmonty.org 19 Jan '10
19 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs
CREATION DATE..: Fri, 27 Nov 2009, 13:22
SUPERVISOR.....: Monty
IMPLEMENTOR....: Timour
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 68 (http://askmonty.org/worklog/?tid=68)
VERSION........: Server-9.x
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Timour - Tue, 19 Jan 2010, 18:29)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.21045 2010-01-19 18:29:12.000000000 +0200
+++ /tmp/wklog.68.new.21045 2010-01-19 18:29:12.000000000 +0200
@@ -132,6 +132,8 @@
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
+ if (nonull_key)
+ pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
-=-=(Guest - Tue, 19 Jan 2010, 18:15)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.19825 2010-01-19 18:15:30.000000000 +0200
+++ /tmp/wklog.68.new.19825 2010-01-19 18:15:30.000000000 +0200
@@ -1,8 +1,16 @@
-This a copy of the initial algorithm proposed by Igor:
-======================================================
+Contents
+========================================================================
-For each left side tuple (v_1,...,v_n) we have to find the following set
-of rowids for the temp table containing N rows as the result of
+1. Initial idea as proposed by Igor
+2. Algorithm for IN execution with partial matching
+3. Directions for improvement
+
+
+1. Initial idea as proposed by Igor
+========================================================================
+
+For each left side tuple (v_1,...,v_n) we have to find the following
+set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
@@ -18,38 +26,198 @@
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
-Taken all above into account I could suggest the following algorithm to
-build R:
+Taken all above into account I could suggest the following algorithm
+to build R:
- Using indexes (read about them below) for each column participating in the
- intersection,
- merge ordered sets rowid{a_i=v_i} in the following manner.
+ Using indexes (read about them below) for each column participating
+ in the intersection, merge ordered sets rowid{a_i=v_i} in the
+ following manner.
If a rowid r has been encountered maximum in k sets
-rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
-not in {i1,...,ik}.
+ not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
-Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
-is null} is either
+Here we use the property (1):
+any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
-infer that for any r from R
-indexes a_i can be uniquely divided into two groups: one contains
-indexes a_i where r belongs to
-the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
-belongs to rowid{a_j is null}.
-
-Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
-needed for the merge procedure. We could use BTREE indexes for temp
-table. But they are rather expensive and
-take a lot of memory as the are implemented with RB trees.
+infer that for any r from R indexes a_i can be uniquely divided into
+two groups:
+- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
+- the other contains indexes a_j such that r belongs to
+ rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
+order needed for the merge procedure. We could use BTREE indexes for
+temp table. But they are rather expensive and take a lot of memory as
+the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
+2. Algorithm for IN execution with partial matching
+========================================================================
+
+2.1 Below is shown the top-level algorithm to execute an IN predicate
+with partial matching. This algorithm is essentially the implementation
+of Item_subselect:exec().
+
+int lookup_with_null_semantics(outer_ref[], mat_subquery)
+{
+ if (index_lookup(outer_ref, mat_subquery)
+ return TRUE
+ else
+ {
+ /*
+ Check if there is a partial match (UNKNOWN) or no match (NULL).
+ */
+ if (this is the first partial match)
+ {
+ vkey[] = build array of value keys for each NULL-able column
+ of mat_subquery.
+ nkey[] = build a bitmap NULL index for each column of mat_subquery
+ that contains NULLs
+ nonull_key = build a key over all non-NULL columns of mat_subquery
+ }
+ if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
+ return UNKNOWN
+ else
+ return FALSE
+ }
+}
+
+2.2 The implementation of partial matching is as follows
+
+/*
+ Assumptions:
+ - It has already been checked if there is a complete match by a
+ regular index lookup, and the test failed.
+ - It has already been checked if there is a complete NULL row,
+ and if there was we wouldn't call this function. Thus we assume
+ that there is no complete NULL row.
+ - Not all vidx_i are empty, but some can be empty. If all were empty,
+ then the only possibility for a match is a complete NULL row, which
+ we already checked.
+
+ @param outer_ref - the uter (left) IN argument.
+ @param vidx[] - array of value keys
+ Ordered sequences of rowids of the corresponding columns a_i, such
+ that all rowids in idx_i are the ones where column a_i contains some
+ value or NULL. Each idx_i is derived dynamically, for each different
+ left argument of an IN predicate.
+ @param nidx[] - array of NULL keys
+ Bitmpas, one per each column, where a bit is set if the corresponding
+ row has a NULL value for the corresponding column.
+ @nonull_key - the only key over all columns of the materialized subquery
+ that do not contain NULLs
+
+ @returns
+ @retval FALSE if there is no match
+ @retval TRUE if there is a partial match
+*/
+
+Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
+{
+ /* Set of the keys (columns) that form a partial match. */
+ Set matching_keys = {}
+ /* A subset of all keys that need to be checked for NULL matches. */
+ Set null_keys = {}
+ Int min_key /* Key that contains the current minimum position. */
+ Int min_row /* Current row number of min_key. */
+ Int cur_min_key, cur_min_row
+ PriorityQueue pq
+
+ if (nonull_key && ! nonull_key->lookup(outer_ref))
+ return FALSE
+
+ for (i = 1; i <= n; i++)
+ {
+ vkey[i].lookup(outer_ref)
+ if (! vkey[i].is_eof())
+ pq.insert(i)
+ }
+ /*
+ Not all value keys are empty, thus we don't have only NULL
+ keys. If we had, the only possible match is a NULL row, and
+ we cheked there is no such row, therefore the result is known
+ to be FALSE.
+ In fact this algorithm makes sense for at least two non-NULL
+ columns.
+ */
+ assert(pq.elements > 1)
+
+ (min_key, min_row) = pq.pop()
+ matching_keys.add(min_key)
+ vkey[min_key].next()
+ if (! vkey[min_key].is_eof())
+ pq.insert(min_key)
+
+ while (TRUE)
+ {
+ (cur_min_key, cur_min_row) = pq.pop()
+
+ if (cur_min_row == min_row)
+ {
+ matching_keys.add(cur_min_key)
+ /* There cannot be a complete match, as we already checked for one. */
+ assert(matching_keys.elements < n)
+ }
+ else if (cur_min_key == nonull_key)
+ {
+ /*
+ The non-NULL key has no corresponding NULL index, so we know for
+ sure that the row 'min_row' is not a match.
+ */
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ else
+ {
+ assert(cur_min_row > min_row) /* Follows from the use of PQ. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ /*
+ Check if all null_keys contain a NULL at row 'min_row'. The procedure
+ internally checks all keys in a special precomputed order. A prior
+ procedure determines an optimal order and a mapping
+ idx_no -> idx_order (encoded as an array).
+ */
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ {
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ }
+
+ vkey[cur_min_key].next()
+ if (! vkey[cur_min_key].is_eof())
+ pq.insert(cur_min_key)
+
+ if (pq.is_empty())
+ {
+ /* Check the last row of the last column in PQ for NULL matches. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ return FALSE
+ }
+ }
+
+ /* We should never get here. */
+ assert(FALSE)
+ return FALSE
+}
+
+
+
+3. Directions for improvement
+========================================================================
+
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-=-=(Timour - Sun, 06 Dec 2009, 14:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.12919 2009-12-06 14:36:18.000000000 +0200
+++ /tmp/wklog.68.new.12919 2009-12-06 14:36:18.000000000 +0200
@@ -87,3 +87,8 @@
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
+8. [timour]
+ Consider that due to materialization, we already have a unique index
+on all columns <a_1,..., a_n>. We can use the first key part of this index
+over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
+creating the index rowid{a_i=v_i}.
-=-=(Timour - Fri, 04 Dec 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.16724 2009-12-04 14:04:28.000000000 +0200
+++ /tmp/wklog.68.new.16724 2009-12-04 14:04:28.000000000 +0200
@@ -10,7 +10,8 @@
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
- (2) For each i: rowid{a_i is null} is the same for each tuple
+ (2) For each i: rowid{a_i is null} is the same for each tuple,
+ that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Version updated.
--- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200
+++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Category updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Status updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Assigned
-=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200
+++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200
@@ -50,23 +50,39 @@
The array can be created on demand.
Other consideration that may be taken into account:
+
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-table at all, create for them only one index array (and of course do not
-create any bitmaps for them).
-2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
-where a_i is not null and V(a_i) is the number of distinct values for
-a_i excluding nulls.
- If d(a_i) is close to 1 then do not create any index array: check
+table at all and v_j1,...,v_jm cannot be null, create for these columns
+only one index array (and of course do not create any bitmaps for them).
+
+2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
+of rows, where a_i is not null and V(a_i) is the number of distinct
+values for a_i excluding nulls.
+If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
-filtered in. Anyway if d(a_i) is close to 1 then a intersection with
-rowid{a_i=v_i} would not reduce the number of remaining rowids
+filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
+ with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
- If additionally N-N' is small do not create a bitmap for this column
-either.
-3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
-sorted array of rowids from the set rowid{a_i is null} can be used
-instead of a bitmap.
+In other words is V(a_i) exceeds some threshold there is no sense to
+create an index for a_i.
+If additionally N-N'(a_i) is small do not create a bitmap for this
+column either.
+
+3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
+small a sorted array of rowids from the set rowid{a_i is null} can be
+used instead of a bitmap.
+
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
+
+5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
+created only for rows with nulls.
+
+6. If v1,...,vn never can be a null and number of rows with nulls is
+small do not create indexes and do not create bitmaps.
+
+7. If you get a row with nulls in all columns stop filling the temporary
+table and return UNKNOWN for any tuple <v1,...,vn>.
+
-=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000
+++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000
@@ -1 +1,72 @@
+This a copy of the initial algorithm proposed by Igor:
+======================================================
+For each left side tuple (v_1,...,v_n) we have to find the following set
+of rowids for the temp table containing N rows as the result of
+materialization of the subquery:
+
+ R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
+trough all indexes from [1..n] such that v_i is not null.
+
+Bear in mind the following specifics of this intersection:
+ (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
+ (2) For each i: rowid{a_i is null} is the same for each tuple
+
+Due to (2) it makes sense to build rowid{a_i is null} only once.
+A good representation for such sets would be bitmaps:
+- it requires minimum memory: not more than N*n bits in total
+- search of an element in a set is extremely cheap
+
+Taken all above into account I could suggest the following algorithm to
+build R:
+
+ Using indexes (read about them below) for each column participating in the
+ intersection,
+ merge ordered sets rowid{a_i=v_i} in the following manner.
+ If a rowid r has been encountered maximum in k sets
+rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ then it has to be checked against all rowid{a_i=v_i} such that i is
+not in {i1,...,ik}.
+ As soon as we fail to find r in one of these sets we discard it.
+ If r has been found in all of them then r belongs to the set R.
+
+Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
+is null} is either
+belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
+infer that for any r from R
+indexes a_i can be uniquely divided into two groups: one contains
+indexes a_i where r belongs to
+the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
+belongs to rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.
DESCRIPTION:
The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
<oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subqury result column contains NULLs.
The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs.
If some argument contains a NULL, then due to NULL semantics, it
plays the role of a wildcard. If we were to use regular index lookups,
then we would get 'no match' for some outer tuple (thus the predicate
evaluates to FALSE), while the SQL semantics means 'partial match', and
the predicate should evaluate to NULL.
This task implements an efficient algorithm to compute such 'parial
matches', where a NULL matches any value.
HIGH-LEVEL SPECIFICATION:
Contents
========================================================================
1. Initial idea as proposed by Igor
2. Algorithm for IN execution with partial matching
3. Directions for improvement
1. Initial idea as proposed by Igor
========================================================================
For each left side tuple (v_1,...,v_n) we have to find the following
set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
trough all indexes from [1..n] such that v_i is not null.
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
(2) For each i: rowid{a_i is null} is the same for each tuple,
that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
Taken all above into account I could suggest the following algorithm
to build R:
Using indexes (read about them below) for each column participating
in the intersection, merge ordered sets rowid{a_i=v_i} in the
following manner.
If a rowid r has been encountered maximum in k sets
rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
Here we use the property (1):
any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
infer that for any r from R indexes a_i can be uniquely divided into
two groups:
- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
- the other contains indexes a_j such that r belongs to
rowid{a_j is null}.
Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
order needed for the merge procedure. We could use BTREE indexes for
temp table. But they are rather expensive and take a lot of memory as
the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
2. Algorithm for IN execution with partial matching
========================================================================
2.1 Below is shown the top-level algorithm to execute an IN predicate
with partial matching. This algorithm is essentially the implementation
of Item_subselect:exec().
int lookup_with_null_semantics(outer_ref[], mat_subquery)
{
if (index_lookup(outer_ref, mat_subquery)
return TRUE
else
{
/*
Check if there is a partial match (UNKNOWN) or no match (NULL).
*/
if (this is the first partial match)
{
vkey[] = build array of value keys for each NULL-able column
of mat_subquery.
nkey[] = build a bitmap NULL index for each column of mat_subquery
that contains NULLs
nonull_key = build a key over all non-NULL columns of mat_subquery
}
if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
return UNKNOWN
else
return FALSE
}
}
2.2 The implementation of partial matching is as follows
/*
Assumptions:
- It has already been checked if there is a complete match by a
regular index lookup, and the test failed.
- It has already been checked if there is a complete NULL row,
and if there was we wouldn't call this function. Thus we assume
that there is no complete NULL row.
- Not all vidx_i are empty, but some can be empty. If all were empty,
then the only possibility for a match is a complete NULL row, which
we already checked.
@param outer_ref - the uter (left) IN argument.
@param vidx[] - array of value keys
Ordered sequences of rowids of the corresponding columns a_i, such
that all rowids in idx_i are the ones where column a_i contains some
value or NULL. Each idx_i is derived dynamically, for each different
left argument of an IN predicate.
@param nidx[] - array of NULL keys
Bitmpas, one per each column, where a bit is set if the corresponding
row has a NULL value for the corresponding column.
@nonull_key - the only key over all columns of the materialized subquery
that do not contain NULLs
@returns
@retval FALSE if there is no match
@retval TRUE if there is a partial match
*/
Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
{
/* Set of the keys (columns) that form a partial match. */
Set matching_keys = {}
/* A subset of all keys that need to be checked for NULL matches. */
Set null_keys = {}
Int min_key /* Key that contains the current minimum position. */
Int min_row /* Current row number of min_key. */
Int cur_min_key, cur_min_row
PriorityQueue pq
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
if (nonull_key)
pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
}
/*
Not all value keys are empty, thus we don't have only NULL
keys. If we had, the only possible match is a NULL row, and
we cheked there is no such row, therefore the result is known
to be FALSE.
In fact this algorithm makes sense for at least two non-NULL
columns.
*/
assert(pq.elements > 1)
(min_key, min_row) = pq.pop()
matching_keys.add(min_key)
vkey[min_key].next()
if (! vkey[min_key].is_eof())
pq.insert(min_key)
while (TRUE)
{
(cur_min_key, cur_min_row) = pq.pop()
if (cur_min_row == min_row)
{
matching_keys.add(cur_min_key)
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
else if (cur_min_key == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
sure that the row 'min_row' is not a match.
*/
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
else
{
assert(cur_min_row > min_row) /* Follows from the use of PQ. */
null_keys = set_difference(all keys vkey[], matching_keys)
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
procedure determines an optimal order and a mapping
idx_no -> idx_order (encoded as an array).
*/
if (test_null_row(null_keys, min_row))
return TRUE
else
{
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
}
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
if (pq.is_empty())
{
/* Check the last row of the last column in PQ for NULL matches. */
null_keys = set_difference(all keys vkey[], matching_keys)
if (test_null_row(null_keys, min_row))
return TRUE
else
return FALSE
}
}
/* We should never get here. */
assert(FALSE)
return FALSE
}
3. Directions for improvement
========================================================================
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).
2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows, where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
In other words is V(a_i) exceeds some threshold there is no sense to
create an index for a_i.
If additionally N-N'(a_i) is small do not create a bitmap for this
column either.
3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
created only for rows with nulls.
6. If v1,...,vn never can be a null and number of rows with nulls is
small do not create indexes and do not create bitmaps.
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
8. [timour]
Consider that due to materialization, we already have a unique index
on all columns <a_1,..., a_n>. We can use the first key part of this index
over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
creating the index rowid{a_i=v_i}.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Timour): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply@askmonty.org 19 Jan '10
by worklog-noreply@askmonty.org 19 Jan '10
19 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs
CREATION DATE..: Fri, 27 Nov 2009, 13:22
SUPERVISOR.....: Monty
IMPLEMENTOR....: Timour
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 68 (http://askmonty.org/worklog/?tid=68)
VERSION........: Server-9.x
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Timour - Tue, 19 Jan 2010, 18:29)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.21045 2010-01-19 18:29:12.000000000 +0200
+++ /tmp/wklog.68.new.21045 2010-01-19 18:29:12.000000000 +0200
@@ -132,6 +132,8 @@
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
+ if (nonull_key)
+ pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
-=-=(Guest - Tue, 19 Jan 2010, 18:15)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.19825 2010-01-19 18:15:30.000000000 +0200
+++ /tmp/wklog.68.new.19825 2010-01-19 18:15:30.000000000 +0200
@@ -1,8 +1,16 @@
-This a copy of the initial algorithm proposed by Igor:
-======================================================
+Contents
+========================================================================
-For each left side tuple (v_1,...,v_n) we have to find the following set
-of rowids for the temp table containing N rows as the result of
+1. Initial idea as proposed by Igor
+2. Algorithm for IN execution with partial matching
+3. Directions for improvement
+
+
+1. Initial idea as proposed by Igor
+========================================================================
+
+For each left side tuple (v_1,...,v_n) we have to find the following
+set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
@@ -18,38 +26,198 @@
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
-Taken all above into account I could suggest the following algorithm to
-build R:
+Taken all above into account I could suggest the following algorithm
+to build R:
- Using indexes (read about them below) for each column participating in the
- intersection,
- merge ordered sets rowid{a_i=v_i} in the following manner.
+ Using indexes (read about them below) for each column participating
+ in the intersection, merge ordered sets rowid{a_i=v_i} in the
+ following manner.
If a rowid r has been encountered maximum in k sets
-rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
-not in {i1,...,ik}.
+ not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
-Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
-is null} is either
+Here we use the property (1):
+any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
-infer that for any r from R
-indexes a_i can be uniquely divided into two groups: one contains
-indexes a_i where r belongs to
-the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
-belongs to rowid{a_j is null}.
-
-Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
-needed for the merge procedure. We could use BTREE indexes for temp
-table. But they are rather expensive and
-take a lot of memory as the are implemented with RB trees.
+infer that for any r from R indexes a_i can be uniquely divided into
+two groups:
+- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
+- the other contains indexes a_j such that r belongs to
+ rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
+order needed for the merge procedure. We could use BTREE indexes for
+temp table. But they are rather expensive and take a lot of memory as
+the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
+2. Algorithm for IN execution with partial matching
+========================================================================
+
+2.1 Below is shown the top-level algorithm to execute an IN predicate
+with partial matching. This algorithm is essentially the implementation
+of Item_subselect:exec().
+
+int lookup_with_null_semantics(outer_ref[], mat_subquery)
+{
+ if (index_lookup(outer_ref, mat_subquery)
+ return TRUE
+ else
+ {
+ /*
+ Check if there is a partial match (UNKNOWN) or no match (NULL).
+ */
+ if (this is the first partial match)
+ {
+ vkey[] = build array of value keys for each NULL-able column
+ of mat_subquery.
+ nkey[] = build a bitmap NULL index for each column of mat_subquery
+ that contains NULLs
+ nonull_key = build a key over all non-NULL columns of mat_subquery
+ }
+ if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
+ return UNKNOWN
+ else
+ return FALSE
+ }
+}
+
+2.2 The implementation of partial matching is as follows
+
+/*
+ Assumptions:
+ - It has already been checked if there is a complete match by a
+ regular index lookup, and the test failed.
+ - It has already been checked if there is a complete NULL row,
+ and if there was we wouldn't call this function. Thus we assume
+ that there is no complete NULL row.
+ - Not all vidx_i are empty, but some can be empty. If all were empty,
+ then the only possibility for a match is a complete NULL row, which
+ we already checked.
+
+ @param outer_ref - the uter (left) IN argument.
+ @param vidx[] - array of value keys
+ Ordered sequences of rowids of the corresponding columns a_i, such
+ that all rowids in idx_i are the ones where column a_i contains some
+ value or NULL. Each idx_i is derived dynamically, for each different
+ left argument of an IN predicate.
+ @param nidx[] - array of NULL keys
+ Bitmpas, one per each column, where a bit is set if the corresponding
+ row has a NULL value for the corresponding column.
+ @nonull_key - the only key over all columns of the materialized subquery
+ that do not contain NULLs
+
+ @returns
+ @retval FALSE if there is no match
+ @retval TRUE if there is a partial match
+*/
+
+Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
+{
+ /* Set of the keys (columns) that form a partial match. */
+ Set matching_keys = {}
+ /* A subset of all keys that need to be checked for NULL matches. */
+ Set null_keys = {}
+ Int min_key /* Key that contains the current minimum position. */
+ Int min_row /* Current row number of min_key. */
+ Int cur_min_key, cur_min_row
+ PriorityQueue pq
+
+ if (nonull_key && ! nonull_key->lookup(outer_ref))
+ return FALSE
+
+ for (i = 1; i <= n; i++)
+ {
+ vkey[i].lookup(outer_ref)
+ if (! vkey[i].is_eof())
+ pq.insert(i)
+ }
+ /*
+ Not all value keys are empty, thus we don't have only NULL
+ keys. If we had, the only possible match is a NULL row, and
+ we cheked there is no such row, therefore the result is known
+ to be FALSE.
+ In fact this algorithm makes sense for at least two non-NULL
+ columns.
+ */
+ assert(pq.elements > 1)
+
+ (min_key, min_row) = pq.pop()
+ matching_keys.add(min_key)
+ vkey[min_key].next()
+ if (! vkey[min_key].is_eof())
+ pq.insert(min_key)
+
+ while (TRUE)
+ {
+ (cur_min_key, cur_min_row) = pq.pop()
+
+ if (cur_min_row == min_row)
+ {
+ matching_keys.add(cur_min_key)
+ /* There cannot be a complete match, as we already checked for one. */
+ assert(matching_keys.elements < n)
+ }
+ else if (cur_min_key == nonull_key)
+ {
+ /*
+ The non-NULL key has no corresponding NULL index, so we know for
+ sure that the row 'min_row' is not a match.
+ */
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ else
+ {
+ assert(cur_min_row > min_row) /* Follows from the use of PQ. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ /*
+ Check if all null_keys contain a NULL at row 'min_row'. The procedure
+ internally checks all keys in a special precomputed order. A prior
+ procedure determines an optimal order and a mapping
+ idx_no -> idx_order (encoded as an array).
+ */
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ {
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ }
+
+ vkey[cur_min_key].next()
+ if (! vkey[cur_min_key].is_eof())
+ pq.insert(cur_min_key)
+
+ if (pq.is_empty())
+ {
+ /* Check the last row of the last column in PQ for NULL matches. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ return FALSE
+ }
+ }
+
+ /* We should never get here. */
+ assert(FALSE)
+ return FALSE
+}
+
+
+
+3. Directions for improvement
+========================================================================
+
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-=-=(Timour - Sun, 06 Dec 2009, 14:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.12919 2009-12-06 14:36:18.000000000 +0200
+++ /tmp/wklog.68.new.12919 2009-12-06 14:36:18.000000000 +0200
@@ -87,3 +87,8 @@
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
+8. [timour]
+ Consider that due to materialization, we already have a unique index
+on all columns <a_1,..., a_n>. We can use the first key part of this index
+over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
+creating the index rowid{a_i=v_i}.
-=-=(Timour - Fri, 04 Dec 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.16724 2009-12-04 14:04:28.000000000 +0200
+++ /tmp/wklog.68.new.16724 2009-12-04 14:04:28.000000000 +0200
@@ -10,7 +10,8 @@
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
- (2) For each i: rowid{a_i is null} is the same for each tuple
+ (2) For each i: rowid{a_i is null} is the same for each tuple,
+ that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Version updated.
--- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200
+++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Category updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Status updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Assigned
-=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200
+++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200
@@ -50,23 +50,39 @@
The array can be created on demand.
Other consideration that may be taken into account:
+
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-table at all, create for them only one index array (and of course do not
-create any bitmaps for them).
-2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
-where a_i is not null and V(a_i) is the number of distinct values for
-a_i excluding nulls.
- If d(a_i) is close to 1 then do not create any index array: check
+table at all and v_j1,...,v_jm cannot be null, create for these columns
+only one index array (and of course do not create any bitmaps for them).
+
+2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
+of rows, where a_i is not null and V(a_i) is the number of distinct
+values for a_i excluding nulls.
+If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
-filtered in. Anyway if d(a_i) is close to 1 then a intersection with
-rowid{a_i=v_i} would not reduce the number of remaining rowids
+filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
+ with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
- If additionally N-N' is small do not create a bitmap for this column
-either.
-3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
-sorted array of rowids from the set rowid{a_i is null} can be used
-instead of a bitmap.
+In other words is V(a_i) exceeds some threshold there is no sense to
+create an index for a_i.
+If additionally N-N'(a_i) is small do not create a bitmap for this
+column either.
+
+3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
+small a sorted array of rowids from the set rowid{a_i is null} can be
+used instead of a bitmap.
+
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
+
+5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
+created only for rows with nulls.
+
+6. If v1,...,vn never can be a null and number of rows with nulls is
+small do not create indexes and do not create bitmaps.
+
+7. If you get a row with nulls in all columns stop filling the temporary
+table and return UNKNOWN for any tuple <v1,...,vn>.
+
-=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000
+++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000
@@ -1 +1,72 @@
+This a copy of the initial algorithm proposed by Igor:
+======================================================
+For each left side tuple (v_1,...,v_n) we have to find the following set
+of rowids for the temp table containing N rows as the result of
+materialization of the subquery:
+
+ R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
+trough all indexes from [1..n] such that v_i is not null.
+
+Bear in mind the following specifics of this intersection:
+ (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
+ (2) For each i: rowid{a_i is null} is the same for each tuple
+
+Due to (2) it makes sense to build rowid{a_i is null} only once.
+A good representation for such sets would be bitmaps:
+- it requires minimum memory: not more than N*n bits in total
+- search of an element in a set is extremely cheap
+
+Taken all above into account I could suggest the following algorithm to
+build R:
+
+ Using indexes (read about them below) for each column participating in the
+ intersection,
+ merge ordered sets rowid{a_i=v_i} in the following manner.
+ If a rowid r has been encountered maximum in k sets
+rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ then it has to be checked against all rowid{a_i=v_i} such that i is
+not in {i1,...,ik}.
+ As soon as we fail to find r in one of these sets we discard it.
+ If r has been found in all of them then r belongs to the set R.
+
+Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
+is null} is either
+belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
+infer that for any r from R
+indexes a_i can be uniquely divided into two groups: one contains
+indexes a_i where r belongs to
+the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
+belongs to rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.
DESCRIPTION:
The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
<oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subqury result column contains NULLs.
The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs.
If some argument contains a NULL, then due to NULL semantics, it
plays the role of a wildcard. If we were to use regular index lookups,
then we would get 'no match' for some outer tuple (thus the predicate
evaluates to FALSE), while the SQL semantics means 'partial match', and
the predicate should evaluate to NULL.
This task implements an efficient algorithm to compute such 'parial
matches', where a NULL matches any value.
HIGH-LEVEL SPECIFICATION:
Contents
========================================================================
1. Initial idea as proposed by Igor
2. Algorithm for IN execution with partial matching
3. Directions for improvement
1. Initial idea as proposed by Igor
========================================================================
For each left side tuple (v_1,...,v_n) we have to find the following
set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
trough all indexes from [1..n] such that v_i is not null.
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
(2) For each i: rowid{a_i is null} is the same for each tuple,
that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
Taken all above into account I could suggest the following algorithm
to build R:
Using indexes (read about them below) for each column participating
in the intersection, merge ordered sets rowid{a_i=v_i} in the
following manner.
If a rowid r has been encountered maximum in k sets
rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
Here we use the property (1):
any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
infer that for any r from R indexes a_i can be uniquely divided into
two groups:
- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
- the other contains indexes a_j such that r belongs to
rowid{a_j is null}.
Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
order needed for the merge procedure. We could use BTREE indexes for
temp table. But they are rather expensive and take a lot of memory as
the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
2. Algorithm for IN execution with partial matching
========================================================================
2.1 Below is shown the top-level algorithm to execute an IN predicate
with partial matching. This algorithm is essentially the implementation
of Item_subselect:exec().
int lookup_with_null_semantics(outer_ref[], mat_subquery)
{
if (index_lookup(outer_ref, mat_subquery)
return TRUE
else
{
/*
Check if there is a partial match (UNKNOWN) or no match (NULL).
*/
if (this is the first partial match)
{
vkey[] = build array of value keys for each NULL-able column
of mat_subquery.
nkey[] = build a bitmap NULL index for each column of mat_subquery
that contains NULLs
nonull_key = build a key over all non-NULL columns of mat_subquery
}
if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
return UNKNOWN
else
return FALSE
}
}
2.2 The implementation of partial matching is as follows
/*
Assumptions:
- It has already been checked if there is a complete match by a
regular index lookup, and the test failed.
- It has already been checked if there is a complete NULL row,
and if there was we wouldn't call this function. Thus we assume
that there is no complete NULL row.
- Not all vidx_i are empty, but some can be empty. If all were empty,
then the only possibility for a match is a complete NULL row, which
we already checked.
@param outer_ref - the uter (left) IN argument.
@param vidx[] - array of value keys
Ordered sequences of rowids of the corresponding columns a_i, such
that all rowids in idx_i are the ones where column a_i contains some
value or NULL. Each idx_i is derived dynamically, for each different
left argument of an IN predicate.
@param nidx[] - array of NULL keys
Bitmpas, one per each column, where a bit is set if the corresponding
row has a NULL value for the corresponding column.
@nonull_key - the only key over all columns of the materialized subquery
that do not contain NULLs
@returns
@retval FALSE if there is no match
@retval TRUE if there is a partial match
*/
Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
{
/* Set of the keys (columns) that form a partial match. */
Set matching_keys = {}
/* A subset of all keys that need to be checked for NULL matches. */
Set null_keys = {}
Int min_key /* Key that contains the current minimum position. */
Int min_row /* Current row number of min_key. */
Int cur_min_key, cur_min_row
PriorityQueue pq
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
if (nonull_key)
pq.insert(nonull_key)
for (i = 1; i <= n; i++)
{
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
}
/*
Not all value keys are empty, thus we don't have only NULL
keys. If we had, the only possible match is a NULL row, and
we cheked there is no such row, therefore the result is known
to be FALSE.
In fact this algorithm makes sense for at least two non-NULL
columns.
*/
assert(pq.elements > 1)
(min_key, min_row) = pq.pop()
matching_keys.add(min_key)
vkey[min_key].next()
if (! vkey[min_key].is_eof())
pq.insert(min_key)
while (TRUE)
{
(cur_min_key, cur_min_row) = pq.pop()
if (cur_min_row == min_row)
{
matching_keys.add(cur_min_key)
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
else if (cur_min_key == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
sure that the row 'min_row' is not a match.
*/
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
else
{
assert(cur_min_row > min_row) /* Follows from the use of PQ. */
null_keys = set_difference(all keys vkey[], matching_keys)
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
procedure determines an optimal order and a mapping
idx_no -> idx_order (encoded as an array).
*/
if (test_null_row(null_keys, min_row))
return TRUE
else
{
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
}
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
if (pq.is_empty())
{
/* Check the last row of the last column in PQ for NULL matches. */
null_keys = set_difference(all keys vkey[], matching_keys)
if (test_null_row(null_keys, min_row))
return TRUE
else
return FALSE
}
}
/* We should never get here. */
assert(FALSE)
return FALSE
}
3. Directions for improvement
========================================================================
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).
2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows, where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
In other words is V(a_i) exceeds some threshold there is no sense to
create an index for a_i.
If additionally N-N'(a_i) is small do not create a bitmap for this
column either.
3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
created only for rows with nulls.
6. If v1,...,vn never can be a null and number of rows with nulls is
small do not create indexes and do not create bitmaps.
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
8. [timour]
Consider that due to materialization, we already have a unique index
on all columns <a_1,..., a_n>. We can use the first key part of this index
over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
creating the index rowid{a_i=v_i}.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Guest): Subquery optimization: Efficient NOT IN execution with NULLs (68)
by worklog-noreply@askmonty.org 19 Jan '10
by worklog-noreply@askmonty.org 19 Jan '10
19 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Efficient NOT IN execution with NULLs
CREATION DATE..: Fri, 27 Nov 2009, 13:22
SUPERVISOR.....: Monty
IMPLEMENTOR....: Timour
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 68 (http://askmonty.org/worklog/?tid=68)
VERSION........: Server-9.x
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Tue, 19 Jan 2010, 18:15)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.19825 2010-01-19 18:15:30.000000000 +0200
+++ /tmp/wklog.68.new.19825 2010-01-19 18:15:30.000000000 +0200
@@ -1,8 +1,16 @@
-This a copy of the initial algorithm proposed by Igor:
-======================================================
+Contents
+========================================================================
-For each left side tuple (v_1,...,v_n) we have to find the following set
-of rowids for the temp table containing N rows as the result of
+1. Initial idea as proposed by Igor
+2. Algorithm for IN execution with partial matching
+3. Directions for improvement
+
+
+1. Initial idea as proposed by Igor
+========================================================================
+
+For each left side tuple (v_1,...,v_n) we have to find the following
+set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
@@ -18,38 +26,198 @@
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
-Taken all above into account I could suggest the following algorithm to
-build R:
+Taken all above into account I could suggest the following algorithm
+to build R:
- Using indexes (read about them below) for each column participating in the
- intersection,
- merge ordered sets rowid{a_i=v_i} in the following manner.
+ Using indexes (read about them below) for each column participating
+ in the intersection, merge ordered sets rowid{a_i=v_i} in the
+ following manner.
If a rowid r has been encountered maximum in k sets
-rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
-not in {i1,...,ik}.
+ not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
-Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
-is null} is either
+Here we use the property (1):
+any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
-infer that for any r from R
-indexes a_i can be uniquely divided into two groups: one contains
-indexes a_i where r belongs to
-the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
-belongs to rowid{a_j is null}.
-
-Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
-needed for the merge procedure. We could use BTREE indexes for temp
-table. But they are rather expensive and
-take a lot of memory as the are implemented with RB trees.
+infer that for any r from R indexes a_i can be uniquely divided into
+two groups:
+- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
+- the other contains indexes a_j such that r belongs to
+ rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
+order needed for the merge procedure. We could use BTREE indexes for
+temp table. But they are rather expensive and take a lot of memory as
+the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
+2. Algorithm for IN execution with partial matching
+========================================================================
+
+2.1 Below is shown the top-level algorithm to execute an IN predicate
+with partial matching. This algorithm is essentially the implementation
+of Item_subselect:exec().
+
+int lookup_with_null_semantics(outer_ref[], mat_subquery)
+{
+ if (index_lookup(outer_ref, mat_subquery)
+ return TRUE
+ else
+ {
+ /*
+ Check if there is a partial match (UNKNOWN) or no match (NULL).
+ */
+ if (this is the first partial match)
+ {
+ vkey[] = build array of value keys for each NULL-able column
+ of mat_subquery.
+ nkey[] = build a bitmap NULL index for each column of mat_subquery
+ that contains NULLs
+ nonull_key = build a key over all non-NULL columns of mat_subquery
+ }
+ if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
+ return UNKNOWN
+ else
+ return FALSE
+ }
+}
+
+2.2 The implementation of partial matching is as follows
+
+/*
+ Assumptions:
+ - It has already been checked if there is a complete match by a
+ regular index lookup, and the test failed.
+ - It has already been checked if there is a complete NULL row,
+ and if there was we wouldn't call this function. Thus we assume
+ that there is no complete NULL row.
+ - Not all vidx_i are empty, but some can be empty. If all were empty,
+ then the only possibility for a match is a complete NULL row, which
+ we already checked.
+
+ @param outer_ref - the uter (left) IN argument.
+ @param vidx[] - array of value keys
+ Ordered sequences of rowids of the corresponding columns a_i, such
+ that all rowids in idx_i are the ones where column a_i contains some
+ value or NULL. Each idx_i is derived dynamically, for each different
+ left argument of an IN predicate.
+ @param nidx[] - array of NULL keys
+ Bitmpas, one per each column, where a bit is set if the corresponding
+ row has a NULL value for the corresponding column.
+ @nonull_key - the only key over all columns of the materialized subquery
+ that do not contain NULLs
+
+ @returns
+ @retval FALSE if there is no match
+ @retval TRUE if there is a partial match
+*/
+
+Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
+{
+ /* Set of the keys (columns) that form a partial match. */
+ Set matching_keys = {}
+ /* A subset of all keys that need to be checked for NULL matches. */
+ Set null_keys = {}
+ Int min_key /* Key that contains the current minimum position. */
+ Int min_row /* Current row number of min_key. */
+ Int cur_min_key, cur_min_row
+ PriorityQueue pq
+
+ if (nonull_key && ! nonull_key->lookup(outer_ref))
+ return FALSE
+
+ for (i = 1; i <= n; i++)
+ {
+ vkey[i].lookup(outer_ref)
+ if (! vkey[i].is_eof())
+ pq.insert(i)
+ }
+ /*
+ Not all value keys are empty, thus we don't have only NULL
+ keys. If we had, the only possible match is a NULL row, and
+ we cheked there is no such row, therefore the result is known
+ to be FALSE.
+ In fact this algorithm makes sense for at least two non-NULL
+ columns.
+ */
+ assert(pq.elements > 1)
+
+ (min_key, min_row) = pq.pop()
+ matching_keys.add(min_key)
+ vkey[min_key].next()
+ if (! vkey[min_key].is_eof())
+ pq.insert(min_key)
+
+ while (TRUE)
+ {
+ (cur_min_key, cur_min_row) = pq.pop()
+
+ if (cur_min_row == min_row)
+ {
+ matching_keys.add(cur_min_key)
+ /* There cannot be a complete match, as we already checked for one. */
+ assert(matching_keys.elements < n)
+ }
+ else if (cur_min_key == nonull_key)
+ {
+ /*
+ The non-NULL key has no corresponding NULL index, so we know for
+ sure that the row 'min_row' is not a match.
+ */
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ else
+ {
+ assert(cur_min_row > min_row) /* Follows from the use of PQ. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ /*
+ Check if all null_keys contain a NULL at row 'min_row'. The procedure
+ internally checks all keys in a special precomputed order. A prior
+ procedure determines an optimal order and a mapping
+ idx_no -> idx_order (encoded as an array).
+ */
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ {
+ (min_key, min_row) = (cur_min_key, cur_min_row)
+ matching_keys = {min_key}
+ }
+ }
+
+ vkey[cur_min_key].next()
+ if (! vkey[cur_min_key].is_eof())
+ pq.insert(cur_min_key)
+
+ if (pq.is_empty())
+ {
+ /* Check the last row of the last column in PQ for NULL matches. */
+ null_keys = set_difference(all keys vkey[], matching_keys)
+ if (test_null_row(null_keys, min_row))
+ return TRUE
+ else
+ return FALSE
+ }
+ }
+
+ /* We should never get here. */
+ assert(FALSE)
+ return FALSE
+}
+
+
+
+3. Directions for improvement
+========================================================================
+
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-=-=(Timour - Sun, 06 Dec 2009, 14:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.12919 2009-12-06 14:36:18.000000000 +0200
+++ /tmp/wklog.68.new.12919 2009-12-06 14:36:18.000000000 +0200
@@ -87,3 +87,8 @@
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
+8. [timour]
+ Consider that due to materialization, we already have a unique index
+on all columns <a_1,..., a_n>. We can use the first key part of this index
+over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
+creating the index rowid{a_i=v_i}.
-=-=(Timour - Fri, 04 Dec 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.16724 2009-12-04 14:04:28.000000000 +0200
+++ /tmp/wklog.68.new.16724 2009-12-04 14:04:28.000000000 +0200
@@ -10,7 +10,8 @@
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
- (2) For each i: rowid{a_i is null} is the same for each tuple
+ (2) For each i: rowid{a_i is null} is the same for each tuple,
+ that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Version updated.
--- /tmp/wklog.68.old.5257 2009-12-04 11:27:11.000000000 +0200
+++ /tmp/wklog.68.new.5257 2009-12-04 11:27:11.000000000 +0200
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Category updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-Sprint
-=-=(Timour - Fri, 04 Dec 2009, 11:27)=-=-
Status updated.
--- /tmp/wklog.68.old.5242 2009-12-04 11:27:02.000000000 +0200
+++ /tmp/wklog.68.new.5242 2009-12-04 11:27:02.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Assigned
-=-=(Timour - Fri, 04 Dec 2009, 11:26)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.5182 2009-12-04 11:26:25.000000000 +0200
+++ /tmp/wklog.68.new.5182 2009-12-04 11:26:25.000000000 +0200
@@ -50,23 +50,39 @@
The array can be created on demand.
Other consideration that may be taken into account:
+
1. If columns a_j1,...,a_jm do not contain null values in the temporary
-table at all, create for them only one index array (and of course do not
-create any bitmaps for them).
-2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
-where a_i is not null and V(a_i) is the number of distinct values for
-a_i excluding nulls.
- If d(a_i) is close to 1 then do not create any index array: check
+table at all and v_j1,...,v_jm cannot be null, create for these columns
+only one index array (and of course do not create any bitmaps for them).
+
+2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
+of rows, where a_i is not null and V(a_i) is the number of distinct
+values for a_i excluding nulls.
+If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
-filtered in. Anyway if d(a_i) is close to 1 then a intersection with
-rowid{a_i=v_i} would not reduce the number of remaining rowids
+filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
+ with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
- If additionally N-N' is small do not create a bitmap for this column
-either.
-3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
-sorted array of rowids from the set rowid{a_i is null} can be used
-instead of a bitmap.
+In other words is V(a_i) exceeds some threshold there is no sense to
+create an index for a_i.
+If additionally N-N'(a_i) is small do not create a bitmap for this
+column either.
+
+3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
+small a sorted array of rowids from the set rowid{a_i is null} can be
+used instead of a bitmap.
+
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
+
+5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
+created only for rows with nulls.
+
+6. If v1,...,vn never can be a null and number of rows with nulls is
+small do not create indexes and do not create bitmaps.
+
+7. If you get a row with nulls in all columns stop filling the temporary
+table and return UNKNOWN for any tuple <v1,...,vn>.
+
-=-=(Timour - Fri, 27 Nov 2009, 13:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.68.old.17140 2009-11-27 11:23:17.000000000 +0000
+++ /tmp/wklog.68.new.17140 2009-11-27 11:23:17.000000000 +0000
@@ -1 +1,72 @@
+This a copy of the initial algorithm proposed by Igor:
+======================================================
+For each left side tuple (v_1,...,v_n) we have to find the following set
+of rowids for the temp table containing N rows as the result of
+materialization of the subquery:
+
+ R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
+trough all indexes from [1..n] such that v_i is not null.
+
+Bear in mind the following specifics of this intersection:
+ (1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
+ (2) For each i: rowid{a_i is null} is the same for each tuple
+
+Due to (2) it makes sense to build rowid{a_i is null} only once.
+A good representation for such sets would be bitmaps:
+- it requires minimum memory: not more than N*n bits in total
+- search of an element in a set is extremely cheap
+
+Taken all above into account I could suggest the following algorithm to
+build R:
+
+ Using indexes (read about them below) for each column participating in the
+ intersection,
+ merge ordered sets rowid{a_i=v_i} in the following manner.
+ If a rowid r has been encountered maximum in k sets
+rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
+ then it has to be checked against all rowid{a_i=v_i} such that i is
+not in {i1,...,ik}.
+ As soon as we fail to find r in one of these sets we discard it.
+ If r has been found in all of them then r belongs to the set R.
+
+Here we use the property (1): any r from rowid{a_i=v_i} UNION rowid{a_i
+is null} is either
+belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
+infer that for any r from R
+indexes a_i can be uniquely divided into two groups: one contains
+indexes a_i where r belongs to
+the sets rowid{a_i=v_i}, the other contains indexes a_j such that r
+belongs to rowid{a_j is null}.
+
+Now let's talk how to get elements from rowid{a_i=v_i} in a sorted order
+needed for the merge procedure. We could use BTREE indexes for temp
+table. But they are rather expensive and
+take a lot of memory as the are implemented with RB trees.
+I would suggest creating for each column from the temporary table just
+an array of rowids sorted by the value from column a.
+Index lookup in such an array is cheap. It's also rather cheap to check
+that the next rowid refers to a row with a different value in column a.
+The array can be created on demand.
+
+Other consideration that may be taken into account:
+1. If columns a_j1,...,a_jm do not contain null values in the temporary
+table at all, create for them only one index array (and of course do not
+create any bitmaps for them).
+2. Consider the ratio d(a_i)=N'/ V(a_i), where N' is the number of rows
+where a_i is not null and V(a_i) is the number of distinct values for
+a_i excluding nulls.
+ If d(a_i) is close to 1 then do not create any index array: check
+whether there is a match running through the records that have been
+filtered in. Anyway if d(a_i) is close to 1 then a intersection with
+rowid{a_i=v_i} would not reduce the number of remaining rowids
+significantly.
+ If additionally N-N' is small do not create a bitmap for this column
+either.
+3. If for a column a_i d(a_i) is not close to 1, but N-N' is small a
+sorted array of rowids from the set rowid{a_i is null} can be used
+instead of a bitmap.
+4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
+empty. Here i runs through all indexes from [1..n] such that v_i is not
+null. For a given subset of columns this fact has to be checked only
+once. It can be easily done with bitmap intersection.
DESCRIPTION:
The goal of this task is to implement efficient execution of NOT IN
subquery predicates of the form:
<oe_1,...,oe_n> NOT IN <non_correlated subquery>
when either some oe_i, or some subqury result column contains NULLs.
The problem with such predicates is that it is possible to use index
lookups only when neither argument of the predicate contains NULLs.
If some argument contains a NULL, then due to NULL semantics, it
plays the role of a wildcard. If we were to use regular index lookups,
then we would get 'no match' for some outer tuple (thus the predicate
evaluates to FALSE), while the SQL semantics means 'partial match', and
the predicate should evaluate to NULL.
This task implements an efficient algorithm to compute such 'parial
matches', where a NULL matches any value.
HIGH-LEVEL SPECIFICATION:
Contents
========================================================================
1. Initial idea as proposed by Igor
2. Algorithm for IN execution with partial matching
3. Directions for improvement
1. Initial idea as proposed by Igor
========================================================================
For each left side tuple (v_1,...,v_n) we have to find the following
set of rowids for the temp table containing N rows as the result of
materialization of the subquery:
R= INTERSECT (rowid{a_i=v_i} UNION rowid{a_i is null} where i runs
trough all indexes from [1..n] such that v_i is not null.
Bear in mind the following specifics of this intersection:
(1) For each i: rowid{a_i=v_i} and rowid{a_i is null} are disjoint
(2) For each i: rowid{a_i is null} is the same for each tuple,
that is, this set is independent of the left-side tuples.
Due to (2) it makes sense to build rowid{a_i is null} only once.
A good representation for such sets would be bitmaps:
- it requires minimum memory: not more than N*n bits in total
- search of an element in a set is extremely cheap
Taken all above into account I could suggest the following algorithm
to build R:
Using indexes (read about them below) for each column participating
in the intersection, merge ordered sets rowid{a_i=v_i} in the
following manner.
If a rowid r has been encountered maximum in k sets
rowid{a_i1=v_i1},...,rowid(a_ik=v_ik),
then it has to be checked against all rowid{a_i=v_i} such that i is
not in {i1,...,ik}.
As soon as we fail to find r in one of these sets we discard it.
If r has been found in all of them then r belongs to the set R.
Here we use the property (1):
any r from rowid{a_i=v_i} UNION rowid{a_i is null} is either
belongs to rowid{a_i=v_i} or to rowid{a_i is null}. From this we can
infer that for any r from R indexes a_i can be uniquely divided into
two groups:
- one contains indexes a_i where r belongs to the sets rowid{a_i=v_i},
- the other contains indexes a_j such that r belongs to
rowid{a_j is null}.
Now let's talk how to get elements from rowid{a_i=v_i} in a sorted
order needed for the merge procedure. We could use BTREE indexes for
temp table. But they are rather expensive and take a lot of memory as
the are implemented with RB trees.
I would suggest creating for each column from the temporary table just
an array of rowids sorted by the value from column a.
Index lookup in such an array is cheap. It's also rather cheap to check
that the next rowid refers to a row with a different value in column a.
The array can be created on demand.
2. Algorithm for IN execution with partial matching
========================================================================
2.1 Below is shown the top-level algorithm to execute an IN predicate
with partial matching. This algorithm is essentially the implementation
of Item_subselect:exec().
int lookup_with_null_semantics(outer_ref[], mat_subquery)
{
if (index_lookup(outer_ref, mat_subquery)
return TRUE
else
{
/*
Check if there is a partial match (UNKNOWN) or no match (NULL).
*/
if (this is the first partial match)
{
vkey[] = build array of value keys for each NULL-able column
of mat_subquery.
nkey[] = build a bitmap NULL index for each column of mat_subquery
that contains NULLs
nonull_key = build a key over all non-NULL columns of mat_subquery
}
if (partial_match(outer_ref, vkey[], nkey[], nonull_key)
return UNKNOWN
else
return FALSE
}
}
2.2 The implementation of partial matching is as follows
/*
Assumptions:
- It has already been checked if there is a complete match by a
regular index lookup, and the test failed.
- It has already been checked if there is a complete NULL row,
and if there was we wouldn't call this function. Thus we assume
that there is no complete NULL row.
- Not all vidx_i are empty, but some can be empty. If all were empty,
then the only possibility for a match is a complete NULL row, which
we already checked.
@param outer_ref - the uter (left) IN argument.
@param vidx[] - array of value keys
Ordered sequences of rowids of the corresponding columns a_i, such
that all rowids in idx_i are the ones where column a_i contains some
value or NULL. Each idx_i is derived dynamically, for each different
left argument of an IN predicate.
@param nidx[] - array of NULL keys
Bitmpas, one per each column, where a bit is set if the corresponding
row has a NULL value for the corresponding column.
@nonull_key - the only key over all columns of the materialized subquery
that do not contain NULLs
@returns
@retval FALSE if there is no match
@retval TRUE if there is a partial match
*/
Boolean partial_match(outer_ref, vkey[], nkey[], nonull_key)
{
/* Set of the keys (columns) that form a partial match. */
Set matching_keys = {}
/* A subset of all keys that need to be checked for NULL matches. */
Set null_keys = {}
Int min_key /* Key that contains the current minimum position. */
Int min_row /* Current row number of min_key. */
Int cur_min_key, cur_min_row
PriorityQueue pq
if (nonull_key && ! nonull_key->lookup(outer_ref))
return FALSE
for (i = 1; i <= n; i++)
{
vkey[i].lookup(outer_ref)
if (! vkey[i].is_eof())
pq.insert(i)
}
/*
Not all value keys are empty, thus we don't have only NULL
keys. If we had, the only possible match is a NULL row, and
we cheked there is no such row, therefore the result is known
to be FALSE.
In fact this algorithm makes sense for at least two non-NULL
columns.
*/
assert(pq.elements > 1)
(min_key, min_row) = pq.pop()
matching_keys.add(min_key)
vkey[min_key].next()
if (! vkey[min_key].is_eof())
pq.insert(min_key)
while (TRUE)
{
(cur_min_key, cur_min_row) = pq.pop()
if (cur_min_row == min_row)
{
matching_keys.add(cur_min_key)
/* There cannot be a complete match, as we already checked for one. */
assert(matching_keys.elements < n)
}
else if (cur_min_key == nonull_key)
{
/*
The non-NULL key has no corresponding NULL index, so we know for
sure that the row 'min_row' is not a match.
*/
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
else
{
assert(cur_min_row > min_row) /* Follows from the use of PQ. */
null_keys = set_difference(all keys vkey[], matching_keys)
/*
Check if all null_keys contain a NULL at row 'min_row'. The procedure
internally checks all keys in a special precomputed order. A prior
procedure determines an optimal order and a mapping
idx_no -> idx_order (encoded as an array).
*/
if (test_null_row(null_keys, min_row))
return TRUE
else
{
(min_key, min_row) = (cur_min_key, cur_min_row)
matching_keys = {min_key}
}
}
vkey[cur_min_key].next()
if (! vkey[cur_min_key].is_eof())
pq.insert(cur_min_key)
if (pq.is_empty())
{
/* Check the last row of the last column in PQ for NULL matches. */
null_keys = set_difference(all keys vkey[], matching_keys)
if (test_null_row(null_keys, min_row))
return TRUE
else
return FALSE
}
}
/* We should never get here. */
assert(FALSE)
return FALSE
}
3. Directions for improvement
========================================================================
Other consideration that may be taken into account:
1. If columns a_j1,...,a_jm do not contain null values in the temporary
table at all and v_j1,...,v_jm cannot be null, create for these columns
only one index array (and of course do not create any bitmaps for them).
2. Consider the ratio d(a_i)=N'(a_i)/V(a_i), where N'(a_i) is the number
of rows, where a_i is not null and V(a_i) is the number of distinct
values for a_i excluding nulls.
If d(a_i) is close to N'(a_i) then do not create any index array: check
whether there is a match running through the records that have been
filtered in. Anyway if d(a_i) is close to N'(a_i) then the intersection
with rowid{a_i=v_i} will not reduce the number of remaining rowids
significantly.
In other words is V(a_i) exceeds some threshold there is no sense to
create an index for a_i.
If additionally N-N'(a_i) is small do not create a bitmap for this
column either.
3. If for a column a_i d(a_i) is not close to N'(a_i), but N-N'(a_i) is
small a sorted array of rowids from the set rowid{a_i is null} can be
used instead of a bitmap.
4. We always have a match if R0= INTERSECT rowid{a_i is null} is not
empty. Here i runs through all indexes from [1..n] such that v_i is not
null. For a given subset of columns this fact has to be checked only
once. It can be easily done with bitmap intersection.
5. If v1,...,vn never can be a null, then indexes (sorted arrays) can be
created only for rows with nulls.
6. If v1,...,vn never can be a null and number of rows with nulls is
small do not create indexes and do not create bitmaps.
7. If you get a row with nulls in all columns stop filling the temporary
table and return UNKNOWN for any tuple <v1,...,vn>.
8. [timour]
Consider that due to materialization, we already have a unique index
on all columns <a_1,..., a_n>. We can use the first key part of this index
over column a_1, instead of the index rowid{a_i=v_i}. Thus we can avoid
creating the index rowid{a_i=v_i}.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0