developers
Threads by month
- ----- 2025 -----
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- 9 participants
- 6826 discussions
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 18 Jan '10
by worklog-noreply@askmonty.org 18 Jan '10
18 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
-=-=(Sanja - Fri, 11 Dec 2009, 15:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.27795 2009-12-11 15:05:04.000000000 +0200
+++ /tmp/wklog.66.new.27795 2009-12-11 15:05:04.000000000 +0200
@@ -1 +1,6 @@
+Attach subquery cache to each Item_subquery. Interface should allow to use hash
+or temporary table inside.
+
+To check/discuss:
+Are there sens to put subquery cache on all levels of subqueries of on highest.
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of subquery predicate.
This is done by creating an Item_field $I which refers to appropriate
temporary table's field and then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 18 Jan '10
by worklog-noreply@askmonty.org 18 Jan '10
18 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 18 Jan 2010, 16:34)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.24328 2010-01-18 16:34:19.000000000 +0200
+++ /tmp/wklog.66.new.24328 2010-01-18 16:34:19.000000000 +0200
@@ -32,8 +32,8 @@
Question: or perhaps that is not necessarry?
</questionable>
-Execution process
-~~~~~~~~~~~~~~~~~
+Doing the lookup
+~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
@@ -42,4 +42,12 @@
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+The code in this WL will use the same approach
+Extracting the value of the subquery predicate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The goal of making the lookup is to get the value of subquery predicate.
+This is done by creating an Item_field $I which refers to appropriate
+temporary table's field and then subquery_predicate->val_int() will invoke
+$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
+forth.
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
-=-=(Sanja - Fri, 11 Dec 2009, 15:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.27795 2009-12-11 15:05:04.000000000 +0200
+++ /tmp/wklog.66.new.27795 2009-12-11 15:05:04.000000000 +0200
@@ -1 +1,6 @@
+Attach subquery cache to each Item_subquery. Interface should allow to use hash
+or temporary table inside.
+
+To check/discuss:
+Are there sens to put subquery cache on all levels of subqueries of on highest.
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Doing the lookup
~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
The code in this WL will use the same approach
Extracting the value of the subquery predicate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of making the lookup is to get the value of subquery predicate.
This is done by creating an Item_field $I which refers to appropriate
temporary table's field and then subquery_predicate->val_int() will invoke
$I->val_int(), subquery_predicate->val_str() will invoke $I->val_str() and so
forth.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 18 Jan '10
by worklog-noreply@askmonty.org 18 Jan '10
18 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
-=-=(Sanja - Fri, 11 Dec 2009, 15:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.27795 2009-12-11 15:05:04.000000000 +0200
+++ /tmp/wklog.66.new.27795 2009-12-11 15:05:04.000000000 +0200
@@ -1 +1,6 @@
+Attach subquery cache to each Item_subquery. Interface should allow to use hash
+or temporary table inside.
+
+To check/discuss:
+Are there sens to put subquery cache on all levels of subqueries of on highest.
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Execution process
~~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 18 Jan '10
by worklog-noreply@askmonty.org 18 Jan '10
18 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 18 Jan 2010, 16:23)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23203 2010-01-18 16:23:18.000000000 +0200
+++ /tmp/wklog.66.new.23203 2010-01-18 16:23:18.000000000 +0200
@@ -31,3 +31,15 @@
Question: or perhaps that is not necessarry?
</questionable>
+
+Execution process
+~~~~~~~~~~~~~~~~~
+SJ-Materialization does lookup in sub_select_sjm(), with this code:
+
+ /* Do index lookup in the materialized table */
+ if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
+ DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
+ if (res || !sjm->in_equality->val_int())
+ DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
+
+
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
-=-=(Sanja - Fri, 11 Dec 2009, 15:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.27795 2009-12-11 15:05:04.000000000 +0200
+++ /tmp/wklog.66.new.27795 2009-12-11 15:05:04.000000000 +0200
@@ -1 +1,6 @@
+Attach subquery cache to each Item_subquery. Interface should allow to use hash
+or temporary table inside.
+
+To check/discuss:
+Are there sens to put subquery cache on all levels of subqueries of on highest.
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
Execution process
~~~~~~~~~~~~~~~~~
SJ-Materialization does lookup in sub_select_sjm(), with this code:
/* Do index lookup in the materialized table */
if ((res= join_read_key2(join_tab, sjm->table, sjm->tab_ref)) == 1)
DBUG_RETURN(NESTED_LOOP_ERROR); /* purecov: inspected */
if (res || !sjm->in_equality->val_int())
DBUG_RETURN(NESTED_LOOP_NO_MORE_ROWS);
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 18 Jan '10
by worklog-noreply@askmonty.org 18 Jan '10
18 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
-=-=(Sanja - Fri, 11 Dec 2009, 15:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.27795 2009-12-11 15:05:04.000000000 +0200
+++ /tmp/wklog.66.new.27795 2009-12-11 15:05:04.000000000 +0200
@@ -1 +1,6 @@
+Attach subquery cache to each Item_subquery. Interface should allow to use hash
+or temporary table inside.
+
+To check/discuss:
+Are there sens to put subquery cache on all levels of subqueries of on highest.
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Psergey): Subquery optimization: Avoid recalculating subquery if external fields values found in subquery cache (66)
by worklog-noreply@askmonty.org 18 Jan '10
by worklog-noreply@askmonty.org 18 Jan '10
18 Jan '10
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Subquery optimization: Avoid recalculating subquery if external fields
values found in subquery cache
CREATION DATE..: Wed, 25 Nov 2009, 22:25
SUPERVISOR.....: Monty
IMPLEMENTOR....: Sanja
COPIES TO......:
CATEGORY.......: Server-BackLog
TASK ID........: 66 (http://askmonty.org/worklog/?tid=66)
VERSION........: Server-5.2
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 18 Jan 2010, 16:22)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.23076 2010-01-18 16:22:07.000000000 +0200
+++ /tmp/wklog.66.new.23076 2010-01-18 16:22:07.000000000 +0200
@@ -4,3 +4,30 @@
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
+How to make lookups into temptable
+----------------------------------
+We'll reuse approach used by SJ-Materialization in 6.0.
+
+Setup process
+~~~~~~~~~~~~~
+Setup is performed in the same way as in setup_sj_materialization(),
+see the code that starts these lines:
+
+ /*
+ Create/initialize everything we will need to index lookups into the
+ temptable.
+ */
+
+and ends at this line:
+
+ Remove the injected semi-join IN-equalities from join_tab conds. This
+
+<questionable>
+We'll also need to check equalities, i.e. do an equivalent of this:
+
+ if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
+ emb_sj_nest->sj_subq_pred)))
+ DBUG_RETURN(TRUE); /* purecov: inspected */
+
+Question: or perhaps that is not necessarry?
+</questionable>
-=-=(Psergey - Tue, 12 Jan 2010, 18:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.31666 2010-01-12 18:39:43.000000000 +0200
+++ /tmp/wklog.66.new.31666 2010-01-12 18:39:43.000000000 +0200
@@ -4,3 +4,99 @@
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
+
+
+<contents>
+1. Scope of the task
+2. Data structure used for the cache
+3. Cache size
+4. Interplay with other subquery optimizations
+5. User interface
+</contents>
+
+1. Scope of the task
+--------------------
+This WL should handle all subquery predicates, i.e. it should handle these
+cases:
+
+ outer_expr IN (SELECT correlated_select)
+ outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
+ EXISTS (SELECT correlated_select)
+ scalar-context subquery: (SELECT correlated_select)
+
+The cache will maintain
+
+ (outer_expr, correlation_references)-> subquery_item_result
+
+mapping, where
+- correlation_references is a list of tablename.column_name that are referred
+ from the correlated_select but tablename is a table that is ouside the
+ subquery.
+- subquery_item_result is 'bool' for subquery predicates, and is of
+some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
+
+We dont support cases when outer_expr or correlation_references are blobs.
+
+2. Data structure used for the cache
+------------------------------------
+There are two data structures available in the codebase that will allow fast
+equality lookups:
+
+1. HASH (mysys/hash.c) tables
+2. Temporary tables (the ones that are used for e.g. GROUP BY)
+
+None of them has any support for element eviction on overflow (using LRU or
+some other policy).
+
+Query cache and MyISAM/Maria's key/page cache ought to support some eviction
+mechanism, but code-wise it is not readily reusable, one will need to factor
+it out (or copy it).
+
+We choose to use #2, and not to have any eviction policy. See subsequent
+sections for details and reasoning behind the decision.
+
+3. Cache size
+-------------
+Typically, a cache has some maximum size and a policy which is used to
+select a cache entry for removal when the cache becomes full (e.g. find
+and remove the least [recently] used entry)
+
+For this WL entry we will use a cache of infinite size. The reasoning behind
+this is that:
+- is is easy to do: we have temporary tables that can grow to arbitrarily
+ large size while still providing the same insert/lookup interface.
+- it suits us: unless the subquery is resolved with one index lookup,
+ hitting the cache would be many times cheaper than re-running the
+ subquery, so cache is worth having.
+
+4. Interplay with other subquery optimizations
+----------------------------------------------
+* This WL entry should not care about IN->EXISTS transformation: caching for
+ IN subquery and result of its conversion to EXISTS would work in the same
+ way.
+
+* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
+ work/be useful irrespectively of whether the rewrite has been performed or
+ not)
+
+* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
+ this basically the same?
+ A: no, it is not:
+ - IN-Materialization has to perform full materialization before it can
+ do the first subquery evaluation. This WL's code has almost no startup
+ costs.
+ - This optimization has temp.table of (corr_reference, predicate_value),
+ while IN-materialization will have (corr_reference) only.
+
+5. User interface
+-----------------
+* There will be an @@optimizer_switch flag to turn this optimization on and
+ off (TODO: name of the flag?)
+
+* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
+ print something in the warning text of EXPLAIN EXTEDED that would indicate
+ use of cache.
+
+* temporary table sizing (max size for heap table, whether to use MyISAM or
+ Maria) will be controlled with common temp.table control variables.
+
-=-=(Psergey - Mon, 11 Jan 2010, 13:25)=-=-
As of today, there is code that
- collects outside references
- creates a temporary table with index that would allow for fast lookups.
there is no code to
- fill the temporary table
- make lookups into it
Reported zero hours worked. Estimate unchanged.
-=-=(Sanja - Fri, 11 Dec 2009, 15:09)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.28164 2009-12-11 15:09:15.000000000 +0200
+++ /tmp/wklog.66.new.28164 2009-12-11 15:09:15.000000000 +0200
@@ -3,4 +3,4 @@
To check/discuss:
-Are there sens to put subquery cache on all levels of subqueries of on highest.
+ To put subquery cache on all levels of subqueries or on highest level only.
-=-=(Sanja - Fri, 11 Dec 2009, 15:08)=-=-
Low Level Design modified.
--- /tmp/wklog.66.old.28072 2009-12-11 15:08:16.000000000 +0200
+++ /tmp/wklog.66.new.28072 2009-12-11 15:08:16.000000000 +0200
@@ -1 +1,6 @@
+All items on which subquery depend could be collected in
+st_select_lex::mark_as_dependent (direct of indirect reference?)
+
+Temporary table index should be created by all fields except result field
+(TMP_TABLE_PARAM::keyinfo).
-=-=(Sanja - Fri, 11 Dec 2009, 15:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.66.old.27795 2009-12-11 15:05:04.000000000 +0200
+++ /tmp/wklog.66.new.27795 2009-12-11 15:05:04.000000000 +0200
@@ -1 +1,6 @@
+Attach subquery cache to each Item_subquery. Interface should allow to use hash
+or temporary table inside.
+
+To check/discuss:
+Are there sens to put subquery cache on all levels of subqueries of on highest.
DESCRIPTION:
Collect all outer items/references (left part of the subquiery and outer
references inside the subquery) in key string. Compare the string (which
represents certain value set of the references) against values in hash table and
return cached result of subquery if the reference values combination has already
been used.
For example in the following subquery:
(L1, L2) IN (SELECT A, B FROM T WHERE T.F1>OTER_FIELD)
set of references to look into the subquery cache is (L1, L2, OTER_FIELD).
The subquery cache should be implemented as simple LRU connected to the subquery.
Size of the subquery cache (in number of results (but maybe in used memory
amount)) is limited by session variable (query parameter?).
HIGH-LEVEL SPECIFICATION:
Attach subquery cache to each Item_subquery. Interface should allow to use hash
or temporary table inside.
To check/discuss:
To put subquery cache on all levels of subqueries or on highest level only.
<contents>
1. Scope of the task
2. Data structure used for the cache
3. Cache size
4. Interplay with other subquery optimizations
5. User interface
</contents>
1. Scope of the task
--------------------
This WL should handle all subquery predicates, i.e. it should handle these
cases:
outer_expr IN (SELECT correlated_select)
outer_expr $CMP$ ALL/ANY (SELECT correlated_select)
EXISTS (SELECT correlated_select)
scalar-context subquery: (SELECT correlated_select)
The cache will maintain
(outer_expr, correlation_references)-> subquery_item_result
mapping, where
- correlation_references is a list of tablename.column_name that are referred
from the correlated_select but tablename is a table that is ouside the
subquery.
- subquery_item_result is 'bool' for subquery predicates, and is of
some scalar or ROW(scalar1,...scalarN) type for scalar-context subquery.
We dont support cases when outer_expr or correlation_references are blobs.
2. Data structure used for the cache
------------------------------------
There are two data structures available in the codebase that will allow fast
equality lookups:
1. HASH (mysys/hash.c) tables
2. Temporary tables (the ones that are used for e.g. GROUP BY)
None of them has any support for element eviction on overflow (using LRU or
some other policy).
Query cache and MyISAM/Maria's key/page cache ought to support some eviction
mechanism, but code-wise it is not readily reusable, one will need to factor
it out (or copy it).
We choose to use #2, and not to have any eviction policy. See subsequent
sections for details and reasoning behind the decision.
3. Cache size
-------------
Typically, a cache has some maximum size and a policy which is used to
select a cache entry for removal when the cache becomes full (e.g. find
and remove the least [recently] used entry)
For this WL entry we will use a cache of infinite size. The reasoning behind
this is that:
- is is easy to do: we have temporary tables that can grow to arbitrarily
large size while still providing the same insert/lookup interface.
- it suits us: unless the subquery is resolved with one index lookup,
hitting the cache would be many times cheaper than re-running the
subquery, so cache is worth having.
4. Interplay with other subquery optimizations
----------------------------------------------
* This WL entry should not care about IN->EXISTS transformation: caching for
IN subquery and result of its conversion to EXISTS would work in the same
way.
* This optimization is orthogonal to <=>ANY -> MIN/MAX rewrite (it will
work/be useful irrespectively of whether the rewrite has been performed or
not)
* TODO: compare this with materialization for uncorrelated IN-subqueries. Is
this basically the same?
A: no, it is not:
- IN-Materialization has to perform full materialization before it can
do the first subquery evaluation. This WL's code has almost no startup
costs.
- This optimization has temp.table of (corr_reference, predicate_value),
while IN-materialization will have (corr_reference) only.
5. User interface
-----------------
* There will be an @@optimizer_switch flag to turn this optimization on and
off (TODO: name of the flag?)
* TODO: how do we show this in EXPLAIN [EXTENDED]? The most easiest is to
print something in the warning text of EXPLAIN EXTEDED that would indicate
use of cache.
* temporary table sizing (max size for heap table, whether to use MyISAM or
Maria) will be controlled with common temp.table control variables.
LOW-LEVEL DESIGN:
All items on which subquery depend could be collected in
st_select_lex::mark_as_dependent (direct of indirect reference?)
Temporary table index should be created by all fields except result field
(TMP_TABLE_PARAM::keyinfo).
How to make lookups into temptable
----------------------------------
We'll reuse approach used by SJ-Materialization in 6.0.
Setup process
~~~~~~~~~~~~~
Setup is performed in the same way as in setup_sj_materialization(),
see the code that starts these lines:
/*
Create/initialize everything we will need to index lookups into the
temptable.
*/
and ends at this line:
Remove the injected semi-join IN-equalities from join_tab conds. This
<questionable>
We'll also need to check equalities, i.e. do an equivalent of this:
if (!(sjm->in_equality= create_subq_in_equalities(thd, sjm,
emb_sj_nest->sj_subq_pred)))
DBUG_RETURN(TRUE); /* purecov: inspected */
Question: or perhaps that is not necessarry?
</questionable>
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] [Branch ~maria-captains/maria/5.1] Rev 2801: Fixes for some randomly occuring test failures in Buildbot.
by noreply@launchpad.net 18 Jan '10
by noreply@launchpad.net 18 Jan '10
18 Jan '10
------------------------------------------------------------
revno: 2801
committer: knielsen(a)knielsen-hq.org
branch nick: work-5.1
timestamp: Mon 2010-01-18 13:56:10 +0100
message:
Fixes for some randomly occuring test failures in Buildbot.
modified:
mysql-test/r/innodb-timeout.result
mysql-test/suite/rpl/r/rpl_relayspace.result
mysql-test/suite/rpl/t/rpl_relayspace.test
mysql-test/t/innodb-timeout.test
mysql-test/valgrind.supp
--
lp:maria
https://code.launchpad.net/~maria-captains/maria/5.1
Your team Maria developers is subscribed to branch lp:maria.
To unsubscribe from this branch go to https://code.launchpad.net/~maria-captains/maria/5.1/+edit-subscription.
1
0
[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2801)
by knielsen@knielsen-hq.org 18 Jan '10
by knielsen@knielsen-hq.org 18 Jan '10
18 Jan '10
#At lp:maria
2801 knielsen(a)knielsen-hq.org 2010-01-18
Fixes for some randomly occuring test failures in Buildbot.
modified:
mysql-test/r/innodb-timeout.result
mysql-test/suite/rpl/r/rpl_relayspace.result
mysql-test/suite/rpl/t/rpl_relayspace.test
mysql-test/t/innodb-timeout.test
mysql-test/valgrind.supp
per-file messages:
mysql-test/r/innodb-timeout.result
Make test more robust to scheduling delays on the host running the test suite.
mysql-test/suite/rpl/r/rpl_relayspace.result
Apply patch from BUG#25228 and tweak timeout value in an attempt to fix random
failure of this test in Buildbot (could not repeat locally).
mysql-test/suite/rpl/t/rpl_relayspace.test
Apply patch from BUG#25228 and tweak timeout value in an attempt to fix random
failure of this test in Buildbot (could not repeat locally).
mysql-test/t/innodb-timeout.test
Make test more robust to scheduling delays on the host running the test suite.
mysql-test/valgrind.supp
Add suppression for Glibc bug.
=== modified file 'mysql-test/r/innodb-timeout.result'
--- a/mysql-test/r/innodb-timeout.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb-timeout.result 2010-01-18 12:56:10 +0000
@@ -23,6 +23,7 @@ select @@innodb_lock_wait_timeout;
create table t1(a int primary key)engine=innodb;
begin;
insert into t1 values(1),(2),(3);
+set innodb_lock_wait_timeout=5;
select * from t1 for update;
commit;
a
@@ -31,8 +32,15 @@ a
3
begin;
insert into t1 values(4);
+set innodb_lock_wait_timeout=2;
+set @a= current_timestamp();
select * from t1 for update;
-commit;
ERROR HY000: Lock wait timeout exceeded; try restarting transaction
+set @b= current_timestamp();
+set @c= timestampdiff(SECOND, @a, @b);
+select if(@c >= 1 and @c <= 10, 'OK', concat("NOT OK, time passed=", @c));
+if(@c >= 1 and @c <= 10, 'OK', concat("NOT OK, time passed=", @c))
+OK
+commit;
drop table t1;
set global innodb_lock_wait_timeout=50;
=== modified file 'mysql-test/suite/rpl/r/rpl_relayspace.result'
--- a/mysql-test/suite/rpl/r/rpl_relayspace.result 2007-06-27 12:28:02 +0000
+++ b/mysql-test/suite/rpl/r/rpl_relayspace.result 2010-01-18 12:56:10 +0000
@@ -14,6 +14,6 @@ start slave io_thread;
stop slave io_thread;
reset slave;
start slave;
-select master_pos_wait('master-bin.001',200,6)=-1;
-master_pos_wait('master-bin.001',200,6)=-1
+select master_pos_wait('MASTER_LOG_FILE',200,30)=-1;
+master_pos_wait('MASTER_LOG_FILE',200,30)=-1
0
=== modified file 'mysql-test/suite/rpl/t/rpl_relayspace.test'
--- a/mysql-test/suite/rpl/t/rpl_relayspace.test 2008-02-28 11:36:14 +0000
+++ b/mysql-test/suite/rpl/t/rpl_relayspace.test 2010-01-18 12:56:10 +0000
@@ -2,8 +2,10 @@
# to force the deadlock after one event.
source include/master-slave.inc;
+let $master_log_file= query_get_value(SHOW MASTER STATUS, File, 1);
connection slave;
stop slave;
+source include/wait_for_slave_to_stop.inc;
connection master;
# This will generate a master's binlog > 10 bytes
create table t1 (a int);
@@ -20,6 +22,7 @@ source include/wait_for_slave_param.inc;
# A bug caused the I/O thread to refuse stopping.
stop slave io_thread;
+source include/wait_for_slave_io_to_stop.inc;
reset slave;
start slave;
# The I/O thread stops filling the relay log when
@@ -29,9 +32,11 @@ start slave;
# So we should have a deadlock.
# if it is not resolved automatically we'll detect
# it with master_pos_wait that waits for farther than 1Ob;
-# it will timeout after 10 seconds;
+# it will timeout;
# also the slave will probably not cooperate to shutdown
# (as 2 threads are locked)
-select master_pos_wait('master-bin.001',200,6)=-1;
+--replace_result $master_log_file MASTER_LOG_FILE
+eval select master_pos_wait('$master_log_file',200,30)=-1;
+
# End of 4.1 tests
=== modified file 'mysql-test/t/innodb-timeout.test'
--- a/mysql-test/t/innodb-timeout.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-timeout.test 2010-01-18 12:56:10 +0000
@@ -30,6 +30,7 @@ begin;
insert into t1 values(1),(2),(3);
connection b;
+set innodb_lock_wait_timeout=5;
--send
select * from t1 for update;
@@ -44,16 +45,20 @@ begin;
insert into t1 values(4);
connection b;
---send
+# Test that we get a lock timeout.
+# We cannot reliably test that the timeout is exactly 1 seconds due to
+# process scheduling differences on the host running the test suite. But we
+# can test that we are within reasonable range.
+set innodb_lock_wait_timeout=2;
+set @a= current_timestamp();
+--error ER_LOCK_WAIT_TIMEOUT
select * from t1 for update;
+set @b= current_timestamp();
+set @c= timestampdiff(SECOND, @a, @b);
+select if(@c >= 1 and @c <= 10, 'OK', concat("NOT OK, time passed=", @c));
connection a;
-sleep 2;
commit;
-
-connection b;
---error ER_LOCK_WAIT_TIMEOUT
-reap;
drop table t1;
connection default;
=== modified file 'mysql-test/valgrind.supp'
--- a/mysql-test/valgrind.supp 2009-12-03 11:19:05 +0000
+++ b/mysql-test/valgrind.supp 2010-01-18 12:56:10 +0000
@@ -991,3 +991,15 @@
fun:_dl_allocate_tls
fun:pthread_create*
}
+
+#
+# Bug in Glibc 2.9: http://sourceware.org/bugzilla/show_bug.cgi?id=10391
+# Fixed in latest Glibc, but suppressed here for running tests on hosts
+# with older Glibc version.
+#
+{
+ Glibc bug in __libc_res_nsend
+ Memcheck:Cond
+ fun:__libc_res_nsend
+ fun:__libc_res_nquery
+}
1
0
[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2801)
by knielsen@knielsen-hq.org 18 Jan '10
by knielsen@knielsen-hq.org 18 Jan '10
18 Jan '10
#At lp:maria
2801 knielsen(a)knielsen-hq.org 2010-01-18
Fixes for some randomly occuring test failures in Buildbot.
modified:
mysql-test/r/innodb-timeout.result
mysql-test/suite/rpl/r/rpl_relayspace.result
mysql-test/suite/rpl/t/rpl_relayspace.test
mysql-test/t/innodb-timeout.test
mysql-test/valgrind.supp
per-file messages:
mysql-test/r/innodb-timeout.result
Make test more robust to scheduling delays on the host running the test suite.
mysql-test/suite/rpl/r/rpl_relayspace.result
Apply patch from BUG#25228 and tweak timeout value in an attempt to fix random
failure of this test in Buildbot (could not repeat locally).
mysql-test/suite/rpl/t/rpl_relayspace.test
Apply patch from BUG#25228 and tweak timeout value in an attempt to fix random
failure of this test in Buildbot (could not repeat locally).
mysql-test/t/innodb-timeout.test
Make test more robust to scheduling delays on the host running the test suite.
mysql-test/valgrind.supp
Add suppression for Glibc bug.
=== modified file 'mysql-test/r/innodb-timeout.result'
--- a/mysql-test/r/innodb-timeout.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb-timeout.result 2010-01-18 11:29:53 +0000
@@ -23,6 +23,7 @@ select @@innodb_lock_wait_timeout;
create table t1(a int primary key)engine=innodb;
begin;
insert into t1 values(1),(2),(3);
+set innodb_lock_wait_timeout=5;
select * from t1 for update;
commit;
a
@@ -31,8 +32,15 @@ a
3
begin;
insert into t1 values(4);
+set innodb_lock_wait_timeout=1;
+set @a= current_timestamp();
select * from t1 for update;
-commit;
ERROR HY000: Lock wait timeout exceeded; try restarting transaction
+set @b= current_timestamp();
+set @c= timestampdiff(SECOND, @a, @b);
+select if(@c >= 1 and @c <= 10, 'OK', concat("NOT OK, time passed=", @c));
+if(@c >= 1 and @c <= 10, 'OK', concat("NOT OK, time passed=", @c))
+OK
+commit;
drop table t1;
set global innodb_lock_wait_timeout=50;
=== modified file 'mysql-test/suite/rpl/r/rpl_relayspace.result'
--- a/mysql-test/suite/rpl/r/rpl_relayspace.result 2007-06-27 12:28:02 +0000
+++ b/mysql-test/suite/rpl/r/rpl_relayspace.result 2010-01-18 11:29:53 +0000
@@ -14,6 +14,6 @@ start slave io_thread;
stop slave io_thread;
reset slave;
start slave;
-select master_pos_wait('master-bin.001',200,6)=-1;
-master_pos_wait('master-bin.001',200,6)=-1
+select master_pos_wait('MASTER_LOG_FILE',200,30)=-1;
+master_pos_wait('MASTER_LOG_FILE',200,30)=-1
0
=== modified file 'mysql-test/suite/rpl/t/rpl_relayspace.test'
--- a/mysql-test/suite/rpl/t/rpl_relayspace.test 2008-02-28 11:36:14 +0000
+++ b/mysql-test/suite/rpl/t/rpl_relayspace.test 2010-01-18 11:29:53 +0000
@@ -2,8 +2,10 @@
# to force the deadlock after one event.
source include/master-slave.inc;
+let $master_log_file= query_get_value(SHOW MASTER STATUS, File, 1);
connection slave;
stop slave;
+source include/wait_for_slave_to_stop.inc;
connection master;
# This will generate a master's binlog > 10 bytes
create table t1 (a int);
@@ -20,6 +22,7 @@ source include/wait_for_slave_param.inc;
# A bug caused the I/O thread to refuse stopping.
stop slave io_thread;
+source include/wait_for_slave_io_to_stop.inc;
reset slave;
start slave;
# The I/O thread stops filling the relay log when
@@ -29,9 +32,11 @@ start slave;
# So we should have a deadlock.
# if it is not resolved automatically we'll detect
# it with master_pos_wait that waits for farther than 1Ob;
-# it will timeout after 10 seconds;
+# it will timeout;
# also the slave will probably not cooperate to shutdown
# (as 2 threads are locked)
-select master_pos_wait('master-bin.001',200,6)=-1;
+--replace_result $master_log_file MASTER_LOG_FILE
+eval select master_pos_wait('$master_log_file',200,30)=-1;
+
# End of 4.1 tests
=== modified file 'mysql-test/t/innodb-timeout.test'
--- a/mysql-test/t/innodb-timeout.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-timeout.test 2010-01-18 11:29:53 +0000
@@ -30,6 +30,7 @@ begin;
insert into t1 values(1),(2),(3);
connection b;
+set innodb_lock_wait_timeout=5;
--send
select * from t1 for update;
@@ -44,16 +45,20 @@ begin;
insert into t1 values(4);
connection b;
---send
+# Test that we get a lock timeout.
+# We cannot reliably test that the timeout is exactly 1 seconds due to
+# process scheduling differences on the host running the test suite. But we
+# can test that we are within reasonable range.
+set innodb_lock_wait_timeout=1;
+set @a= current_timestamp();
+--error ER_LOCK_WAIT_TIMEOUT
select * from t1 for update;
+set @b= current_timestamp();
+set @c= timestampdiff(SECOND, @a, @b);
+select if(@c >= 1 and @c <= 10, 'OK', concat("NOT OK, time passed=", @c));
connection a;
-sleep 2;
commit;
-
-connection b;
---error ER_LOCK_WAIT_TIMEOUT
-reap;
drop table t1;
connection default;
=== modified file 'mysql-test/valgrind.supp'
--- a/mysql-test/valgrind.supp 2009-12-03 11:19:05 +0000
+++ b/mysql-test/valgrind.supp 2010-01-18 11:29:53 +0000
@@ -991,3 +991,15 @@
fun:_dl_allocate_tls
fun:pthread_create*
}
+
+#
+# Bug in Glibc 2.9: http://sourceware.org/bugzilla/show_bug.cgi?id=10391
+# Fixed in latest Glibc, but suppressed here for running tests on hosts
+# with older Glibc version.
+#
+{
+ Glibc bug in __libc_res_nsend
+ Memcheck:Cond
+ fun:__libc_res_nsend
+ fun:__libc_res_nquery
+}
1
0
Hi Daniel,
I just pushed the merge of MySQL and XtraDB into main MariaDB.
This means that the next MariaDB release, MariaDB-5.1.42, will contain MySQL
5.1.42 and XtraDB 9.
I hope you can use this information to obtain appropriate information from the
respective projects to prepare release notes.
- Kristian.
2
1