[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
-=-=(Guest - Mon, 01 Jun 2009, 12:49)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.32202 2009-06-01 12:49:15.000000000 +0300
+++ /tmp/wklog.17.new.32202 2009-06-01 12:49:15.000000000 +0300
@@ -8,7 +8,7 @@
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
-
+7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -116,3 +116,9 @@
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+* Aggregate functions report they depend on all tables, that is,
+
+ item_agg_func->used_tables() == (1ULL << join->tables) - 1
+
+ always. If we want table elimination to work in presence of grouping, need
+ to devise some other way of analyzing aggregate functions.
-=-=(Guest - Fri, 29 May 2009, 00:45)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1348 2009-05-29 00:45:21.000000000 +0300
+++ /tmp/wklog.17.new.1348 2009-05-29 00:45:21.000000000 +0300
@@ -111,3 +111,8 @@
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
+
-=-=(Guest - Tue, 26 May 2009, 21:52)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.14120 2009-05-26 21:52:06.000000000 +0300
+++ /tmp/wklog.17.new.14120 2009-05-26 21:52:06.000000000 +0300
@@ -1,11 +1,14 @@
<contents>
1. Conditions for removal
+1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
-5. Todo, issues to resolve
-5.1 To resolve
-5.2 Resolved
+5. Tests and benchmarks
+6. Todo, issues to resolve
+6.1 To resolve
+6.2 Resolved
+
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -29,6 +32,18 @@
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
+1.1 Quick check if there are candidates
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Before we start to enumerate join nests, here is a quick way to check if
+there *can be* something to be removed:
+
+ if ((tables used in select_list |
+ tables used in group/order by UNION |
+ tables used in where) != bitmap_of_all_tables)
+ {
+ attempt table elimination;
+ }
+
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
@@ -56,22 +71,24 @@
-----------------
* We'll add an @@optimizer switch flag for table elimination. Tentative
name: 'table_elimination'.
+ (Note ^^ utility of the above questioned ^, as table elimination can never
+ be worse than no elimination. We're leaning towards not adding the flag)
-* With EXPLAIN, there are two options:
- - Show removed tables in a way similar to const tables, with some
- indication that they are removed.
- - Do not show them altogether.
-(the second one seems to be better? We're targeting a situation with VIEWs,
-where the user would not care about what tables were added into his query
-and then discarded from it?)
+* EXPLAIN will not show the removed tables at all. This will allow to check
+ if tables were removed, and also will behave nicely with anchor model and
+ VIEWs: stuff that user doesn't care about just won't be there.
+
+5. Tests and benchmarks
+-----------------------
+Should create a benchmark in sql-bench which checks if the dbms has table
+elimination.
+TODO elaborate
-5. Todo, issues to resolve
+6. Todo, issues to resolve
--------------------------
-5.1 To resolve
+6.1 To resolve
~~~~~~~~~~~~~~
-- See EXPLAIN question in section #4.
-
- Re-check how this works with equality propagation.
- Relationship with prepared statements.
@@ -87,7 +104,7 @@
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
-5.2 Resolved
+6.2 Resolved
~~~~~~~~~~~~
- outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
-=-=(Guest - Fri, 22 May 2009, 17:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.17.old.30851 2009-05-22 17:23:38.000000000 +0300
+++ /tmp/wklog.17.new.30851 2009-05-22 17:23:38.000000000 +0300
@@ -6,7 +6,7 @@
elimination but not to the same extent.
Basically, what table elimination does, is to remove tables from the
-execution plan when it is unneccessary to include them. This can, of
+execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
@@ -22,30 +22,26 @@
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met the selected column (A.colA) will
-contain a NULL value.
+still contain it's original value. The not seen B.* row would contain all NULL:s.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
-A
-contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
+A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
-what
-the result will look like is to actually touch both tables during
+what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example and often the
case a primary key. In this situation we know that we will get exactly
-as
-many rows as there are in tableA, since joining with tableB cannot
+as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
-unneccessary. We can remove the whole join operation from the execution
+unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
-in
-the case described above. Let us look at a more advanced query, where
+in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
-=-=(Guest - Fri, 22 May 2009, 17:00)=-=-
Version updated.
--- /tmp/wklog.17.old.30176 2009-05-22 17:00:35.000000000 +0300
+++ /tmp/wklog.17.new.30176 2009-05-22 17:00:35.000000000 +0300
@@ -1 +1 @@
-Maria-2.0
+Server-5.1
-=-=(Guest - Fri, 22 May 2009, 17:00)=-=-
Category updated.
--- /tmp/wklog.17.old.30162 2009-05-22 17:00:28.000000000 +0300
+++ /tmp/wklog.17.new.30162 2009-05-22 17:00:28.000000000 +0300
@@ -1 +1 @@
-Maria-Sprint
+Server-Sprint
------------------------------------------------------------
-=-=(View All Progress Notes, 20 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate unneeded tables from SELECT queries.
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE tableB (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does, is to remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met the selected column (A.colA) will
still contain its original value. The unseen B.* columns would all be NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example (as is often
the case) a primary key. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
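The row-count argument above can be checked with a small simulation. This is only an illustrative model, not server code: the function `left_outer_join` and the sample rows are hypothetical, and tables are plain lists of (id, value) tuples.

```python
def left_outer_join(table_a, table_b):
    """Return A.colA for each row of `A LEFT OUTER JOIN B ON B.id = A.id`."""
    result = []
    for a_id, col_a in table_a:
        matches = [b for b in table_b if b[0] == a_id]
        # A row of A without a match is kept once (B would be NULL-padded).
        for _ in matches or [None]:
            result.append(col_a)
    return result


table_a = [(1, "val1"), (2, "val2")]

# B.id is not unique: the row of A with id = 1 appears twice in the result.
b_duplicates = [(1, "other1a"), (1, "other1b")]
with_duplicates = left_outer_join(table_a, b_duplicates)

# B.id is unique (e.g. a primary key): every row of A appears exactly once,
# so the result equals a plain scan of A and B need not be read at all.
b_unique = [(1, "other1a"), (3, "other3")]
with_unique = left_outer_join(table_a, b_unique)
scan_of_a = [col_a for _, col_a in table_a]
```

With the duplicate rows the join yields three values, while with the unique column it yields exactly the two values of A.colA, which is what makes removing the join safe.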
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
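The "at most one row" claim for versioned data can be sketched the same way. The helper `matching_rows` and the sample data below are assumptions made for illustration; tableB is modelled as a list of (id, fromDate) pairs that respects the primary key (id, fromDate).

```python
from datetime import date


def matching_rows(a_id, table_b):
    """Rows of B with B.id = a_id and B.fromDate = max(fromDate) for a_id."""
    dates = [from_date for b_id, from_date in table_b if b_id == a_id]
    if not dates:
        return []  # the subselect yields NULL, so no row can match
    latest = max(dates)
    return [(b_id, d) for b_id, d in table_b if b_id == a_id and d == latest]


# ids may repeat, but each (id, fromDate) pair occurs only once.
table_b = [
    (1, date(2009, 1, 1)),
    (1, date(2009, 5, 1)),
    (2, date(2009, 3, 1)),
]

# Number of matching rows of B for three rows of A with ids 1, 2 and 3.
match_counts = {a_id: len(matching_rows(a_id, table_b)) for a_id in (1, 2, 3)}
```

Every id has zero or one matching row, so the join can neither add nor remove rows of tableA.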
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations: either when you have
a denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest either:
 - is a constant table:
    = because it can be accessed via eq_ref(const) access, or
    = it is a zero-rows or one-row MyISAM-like table [MARK1]
 - or has an eq_ref access method candidate.
#2 means that WHERE clause, ON clauses of embedding outer joins, ORDER BY,
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
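As a rough sketch of how the two conditions combine, assuming a much simplified model: `InnerTable`, `has_eq_ref_candidate` and `can_eliminate` are hypothetical names, the constant-table case is left out, and "eq_ref candidate" is approximated as "some unique key is fully bound by equalities in the ON clause".

```python
from dataclasses import dataclass, field


@dataclass
class InnerTable:
    name: str
    unique_keys: list  # each key is a tuple of column names
    # Columns that the ON clause equates to outer columns or constants.
    bound_columns: set = field(default_factory=set)


def has_eq_ref_candidate(table):
    """Condition #1 (simplified): a unique key is fully bound by the ON clause."""
    return any(set(key) <= table.bound_columns for key in table.unique_keys)


def can_eliminate(inner_tables, tables_referenced_elsewhere):
    """Check both conditions for the inner side of one outer join."""
    one_row_each = all(has_eq_ref_candidate(t) for t in inner_tables)
    unreferenced = not any(
        t.name in tables_referenced_elsewhere for t in inner_tables
    )
    return one_row_each and unreferenced


simple = InnerTable("B", unique_keys=[("id",)], bound_columns={"id"})
versioned = InnerTable("B", unique_keys=[("id", "fromDate")], bound_columns={"id"})

ok = can_eliminate([simple], {"A"})              # key bound, B unused elsewhere
partial_key = can_eliminate([versioned], {"A"})  # fromDate is not bound
referenced = can_eliminate([simple], {"A", "B"}) # B is used outside the ON clause
```

Only the first case passes; a partially bound key or a reference to B outside its own ON clause blocks the removal.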
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
if ((tables used in select_list |
     tables used in group/order by |
     tables used in where) != bitmap_of_all_tables)
{
attempt table elimination;
}
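The quick check can be written out with integer bitmaps. This is a sketch only: `table_bit` and `may_have_candidates` are hypothetical names, and each table is assumed to own one bit, in the style of the server's table maps.

```python
def table_bit(table_no):
    """Bitmap with only the bit of table number `table_no` set."""
    return 1 << table_no


def may_have_candidates(select_list_tables, group_order_tables,
                        where_tables, table_count):
    """True if some table is used in none of the three places checked."""
    used = select_list_tables | group_order_tables | where_tables
    all_tables = (1 << table_count) - 1
    return used != all_tables


A, B = table_bit(0), table_bit(1)

# select A.colA from A left outer join B on ...: B appears only in its ON clause.
only_a_used = may_have_candidates(A, 0, 0, table_count=2)

# If B is also referenced in the WHERE clause, nothing can be eliminated.
both_used = may_have_candidates(A, 0, B, table_count=2)
```

A True result only means elimination is worth attempting; the conditions of section 1 still have to hold for each outer join nest.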
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to front like it is done with const
tables, with exception that if eliminated outer join nest was within
another outer join nest, that shouldn't prevent us from moving away the
eliminated tables.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of this flag has been questioned, as table elimination can
never be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This makes it possible to
check if tables were removed, and also behaves nicely with the anchor model and
VIEWs: stuff that the user doesn't care about just won't be there.
5. Tests and benchmarks
-----------------------
Should create a benchmark in sql-bench which checks if the dbms has table
elimination.
TODO elaborate
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Re-check how this works with equality propagation.
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1] as that can change during
lifetime of the statement.
The other option is to do table elimination every time. This will require
reworking operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
6.2 Resolved
~~~~~~~~~~~~
- outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
7. Additional issues
--------------------
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
* Aggregate functions report they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
* Should eliminated tables be shown in EXPLAIN EXTENDED?
- If we just ignore the question, they will be shown
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between a constant table and an eliminated one? If there is,
should we also mark const tables as eliminated?
* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
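The aggregate-function item in the list above can be illustrated with the same kind of bitmap arithmetic. The helper names below are hypothetical; the only thing taken from the text is the mask (1ULL << join->tables) - 1.

```python
def all_tables_mask(table_count):
    """Python equivalent of (1ULL << join->tables) - 1."""
    return (1 << table_count) - 1


def eliminable_tables(used_tables_maps, table_count):
    """Bits of the tables that none of the given expressions claims to use."""
    used = 0
    for table_map in used_tables_maps:
        used |= table_map
    return all_tables_mask(table_count) & ~used


A, B = 0b01, 0b10

# select A.colA ...: the select list uses only A, so B stays a candidate.
plain_select = eliminable_tables([A], table_count=2)

# select max(A.colA) ...: the aggregate reports the all-tables mask,
# so no table looks unused and elimination is never attempted.
with_aggregate = eliminable_tables([all_tables_mask(2)], table_count=2)
```

This is why an analysis based only on used_tables() cannot find candidates once an aggregate function is present.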
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0
[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 03 Jun '09
by worklog-noreply@askmonty.org 03 Jun '09
03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
-=-=(Guest - Mon, 01 Jun 2009, 12:49)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.32202 2009-06-01 12:49:15.000000000 +0300
+++ /tmp/wklog.17.new.32202 2009-06-01 12:49:15.000000000 +0300
@@ -8,7 +8,7 @@
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
-
+7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -116,3 +116,9 @@
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+* Aggregate functions report they depend on all tables, that is,
+
+ item_agg_func->used_tables() == (1ULL << join->tables) - 1
+
+ always. If we want table elimination to work in presence of grouping, need
+ to devise some other way of analyzing aggregate functions.
-=-=(Guest - Fri, 29 May 2009, 00:45)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1348 2009-05-29 00:45:21.000000000 +0300
+++ /tmp/wklog.17.new.1348 2009-05-29 00:45:21.000000000 +0300
@@ -111,3 +111,8 @@
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
+
-=-=(Guest - Tue, 26 May 2009, 21:52)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.14120 2009-05-26 21:52:06.000000000 +0300
+++ /tmp/wklog.17.new.14120 2009-05-26 21:52:06.000000000 +0300
@@ -1,11 +1,14 @@
<contents>
1. Conditions for removal
+1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
-5. Todo, issues to resolve
-5.1 To resolve
-5.2 Resolved
+5. Tests and benchmarks
+6. Todo, issues to resolve
+6.1 To resolve
+6.2 Resolved
+
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -29,6 +32,18 @@
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
+1.1 Quick check if there are candidates
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Before we start to enumerate join nests, here is a quick way to check if
+there *can be* something to be removed:
+
+ if ((tables used in select_list |
+ tables used in group/order by UNION |
+ tables used in where) != bitmap_of_all_tables)
+ {
+ attempt table elimination;
+ }
+
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
@@ -56,22 +71,24 @@
-----------------
* We'll add an @@optimizer switch flag for table elimination. Tentative
name: 'table_elimination'.
+ (Note ^^ utility of the above questioned ^, as table elimination can never
+ be worse than no elimination. We're leaning towards not adding the flag)
-* With EXPLAIN, there are two options:
- - Show removed tables in a way similar to const tables, with some
- indication that they are removed.
- - Do not show them altogether.
-(the second one seems to be better? We're targeting a situation with VIEWs,
-where the user would not care about what tables were added into his query
-and then discarded from it?)
+* EXPLAIN will not show the removed tables at all. This will allow to check
+ if tables were removed, and also will behave nicely with anchor model and
+ VIEWs: stuff that user doesn't care about just won't be there.
+
+5. Tests and benchmarks
+-----------------------
+Should create a benchmark in sql-bench which checks if the dbms has table
+elimination.
+TODO elaborate
-5. Todo, issues to resolve
+6. Todo, issues to resolve
--------------------------
-5.1 To resolve
+6.1 To resolve
~~~~~~~~~~~~~~
-- See EXPLAIN question in section #4.
-
- Re-check how this works with equality propagation.
- Relationship with prepared statements.
@@ -87,7 +104,7 @@
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
-5.2 Resolved
+6.2 Resolved
~~~~~~~~~~~~
- outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
-=-=(Guest - Fri, 22 May 2009, 17:23)=-=-
High-Level Specification modified.
--- /tmp/wklog.17.old.30851 2009-05-22 17:23:38.000000000 +0300
+++ /tmp/wklog.17.new.30851 2009-05-22 17:23:38.000000000 +0300
@@ -6,7 +6,7 @@
elimination but not to the same extent.
Basically, what table elimination does, is to remove tables from the
-execution plan when it is unneccessary to include them. This can, of
+execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
@@ -22,30 +22,26 @@
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met the selected column (A.colA) will
-contain a NULL value.
+still contain it's original value. The not seen B.* row would contain all NULL:s.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
-A
-contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
+A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
-what
-the result will look like is to actually touch both tables during
+what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example and often the
case a primary key. In this situation we know that we will get exactly
-as
-many rows as there are in tableA, since joining with tableB cannot
+as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
-unneccessary. We can remove the whole join operation from the execution
+unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
-in
-the case described above. Let us look at a more advanced query, where
+in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
-=-=(Guest - Fri, 22 May 2009, 17:00)=-=-
Version updated.
--- /tmp/wklog.17.old.30176 2009-05-22 17:00:35.000000000 +0300
+++ /tmp/wklog.17.new.30176 2009-05-22 17:00:35.000000000 +0300
@@ -1 +1 @@
-Maria-2.0
+Server-5.1
-=-=(Guest - Fri, 22 May 2009, 17:00)=-=-
Category updated.
--- /tmp/wklog.17.old.30162 2009-05-22 17:00:28.000000000 +0300
+++ /tmp/wklog.17.new.30162 2009-05-22 17:00:28.000000000 +0300
@@ -1 +1 @@
-Maria-Sprint
+Server-Sprint
------------------------------------------------------------
-=-=(View All Progress Notes, 20 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate not needed tables from SELECT queries..
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE B (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does, is to remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met, the selected column (A.colA) will
still contain its original value. The unseen B.* columns would all be
NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example (as is often
the case) a primary key. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
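The argument above can be checked on a toy data set. The following is a minimal sketch using Python's built-in sqlite3 module, not the server code; the table contents and the extra table name tableB_dup are made up for illustration. It only compares result sets and says nothing about which plan any particular engine picks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER, colA TEXT);
    -- Unique key on id: at most one match per row of tableA.
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, colB TEXT);
    -- Hypothetical variant without a unique key, holding duplicates of id.
    CREATE TABLE tableB_dup (id INTEGER, colB TEXT);
    INSERT INTO tableA VALUES (1, 'val1'), (2, 'val2'), (3, 'val3');
    INSERT INTO tableB VALUES (1, 'other1'), (3, 'other3');
    INSERT INTO tableB_dup VALUES (1, 'other1a'), (1, 'other1b');
""")

# Join against the unique-keyed table, selecting only columns of A.
with_join = conn.execute("""
    SELECT A.colA FROM tableA A
    LEFT OUTER JOIN tableB B ON B.id = A.id
    ORDER BY A.id
""").fetchall()

# The same query with the join removed by hand.
without_join = conn.execute(
    "SELECT A.colA FROM tableA A ORDER BY A.id"
).fetchall()

# Without the unique key the join can add rows, so it cannot be removed.
with_dups = conn.execute("""
    SELECT A.colA FROM tableA A
    LEFT OUTER JOIN tableB_dup B ON B.id = A.id
    ORDER BY A.id
""").fetchall()

print(with_join == without_join)         # same rows with and without the join
print(len(with_dups), len(without_join)) # the duplicate case returns more rows
```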
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
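The historized case can be checked the same way. This is again only a sketch with sqlite3 and invented sample rows (the colB column and the date values are assumptions); it shows that with primary key (id, fromDate) and the max(fromDate) subselect, the join leaves the result of the A-only query unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, colA TEXT);
    CREATE TABLE tableB (
        id INTEGER,
        fromDate TEXT,
        colB TEXT,
        PRIMARY KEY (id, fromDate)
    );
    INSERT INTO tableA VALUES (1, 'val1'), (2, 'val2');
    -- Two versions for id 1, none for id 2.
    INSERT INTO tableB VALUES
        (1, '2009-01-01', 'old'),
        (1, '2009-05-01', 'new');
""")

# The subselect picks the latest fromDate for each id, so together with
# the primary key at most one row of tableB matches each row of tableA.
historized = conn.execute("""
    SELECT A.colA
    FROM tableA A
    LEFT OUTER JOIN tableB B
      ON B.id = A.id
     AND B.fromDate = (SELECT max(sub.fromDate)
                       FROM tableB sub
                       WHERE sub.id = A.id)
    ORDER BY A.id
""").fetchall()

plain = conn.execute(
    "SELECT A.colA FROM tableA A ORDER BY A.id"
).fetchall()

print(historized == plain)
```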
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations: either when you have
a denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest either:
- is a constant table:
= because it can be accessed via eq_ref(const) access, or
= because it is a zero-rows or one-row MyISAM-like table [MARK1]
- or has an eq_ref access method candidate.
#2 means that WHERE clause, ON clauses of embedding outer joins, ORDER BY,
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
if ((tables used in select_list |
tables used in group/order by |
tables used in where) != bitmap_of_all_tables)
{
attempt table elimination;
}
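The quick check is a plain bitmap comparison. Here is a minimal sketch of it, assuming each table is assigned one bit; the function name and the TABLE_A/TABLE_B constants are hypothetical and do not correspond to server identifiers.

```python
def may_have_eliminable_tables(select_list, group_order_by, where, all_tables):
    """Return True if some table is not referenced by the select list,
    GROUP BY/ORDER BY or WHERE, i.e. elimination is worth attempting.

    All arguments are integer bitmaps with one bit per table.
    """
    used = select_list | group_order_by | where
    return used != all_tables


# Hypothetical bit assignment for the two tables of the example query.
TABLE_A = 1 << 0
TABLE_B = 1 << 1
ALL_TABLES = TABLE_A | TABLE_B

# select A.colA from tableA A left outer join tableB B on B.id = A.id
print(may_have_eliminable_tables(TABLE_A, 0, 0, ALL_TABLES))

# Same query with a WHERE clause that refers to B: nothing to remove.
print(may_have_eliminable_tables(TABLE_A, 0, TABLE_B, ALL_TABLES))
```

The check is only a filter: a True result means there may be a candidate, and the conditions from section 1 still have to be verified for each outer join nest.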
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to front like it is done with const
tables, with exception that if eliminated outer join nest was within
another outer join nest, that shouldn't prevent us from moving away the
eliminated tables.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of this flag has been questioned, as table elimination can
never be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This makes it possible to
check whether tables were removed, and also behaves nicely with anchor models
and VIEWs: stuff that the user doesn't care about just won't be there.
5. Tests and benchmarks
-----------------------
Should create a benchmark in sql-bench which checks if the dbms has table
elimination.
TODO elaborate
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Re-check how this works with equality propagation.
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1] as that can change during
lifetime of the statement.
The other option is to do table elimination every time. This will require
reworking operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
6.2 Resolved
~~~~~~~~~~~~
- outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
7. Additional issues
--------------------
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
* Aggregate functions report they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. If we want table elimination to work in the presence of grouping, we
need to devise some other way of analyzing aggregate functions.
* Should eliminated tables be shown in EXPLAIN EXTENDED?
- If we just ignore the question, they will be shown
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] New (by Psergey): index_merge optimization tasks (30)
by worklog-noreply@askmonty.org 03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge optimization tasks
CREATION DATE..: Wed, 03 Jun 2009, 12:08
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 30 (http://askmonty.org/worklog/?tid=30)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
This WL entry groups all index_merge optimization improvement tasks
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] New (by Psergey): Table elimination: all tasks (29)
by worklog-noreply@askmonty.org 03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination: all tasks
CREATION DATE..: Wed, 03 Jun 2009, 12:07
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 29 (http://askmonty.org/worklog/?tid=29)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
This WL entry groups all table elimination tasks.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
Hi!
I cc:d this to the maria-developers list as there are probably others
that are interested in this discussion.
(I removed all references to the customer who is interested in getting
this solved)
>>>>> "Sergey" == Sergey Petrunya <psergey(a)askmonty.org> writes:
Sergey> Hi Monty,
Sergey> Please find below a summary of index_merge problems and suggestions on how
Sergey> we could address them, together with concerns:
Sergey> 1. index_merge/intersect is not used for range access scans.
Sergey> Described as WL#21. Seems to be a big task.
I extended the high level definition a bit to make the text more clear.
Any preliminary estimate you could do (in weeks)?
Sergey> One big issue is whether we'll be able to make the optimizer make
Sergey> good choices. There are two concerns:
Sergey> - Bad estimates/poor cost model
Sergey> - Correlation(s). When we consider index_merge/intersect for
Sergey> t.key1<c1 AND t.key2<c2
Sergey> we have no way to know whether the conditions are
Sergey> = always satisfied together (and thus there is no point to use index_merge)
Sergey> = never satisfied together
Sergey> = have no correlation
Sergey> this will cause our estimates to be inherently poor. Will they be
Sergey> satisfactory? In case they won't, should we provision for adding hints
Sergey> or something else?
I think that we should assume that there are always notably fewer rows to
retrieve when we can do an intersection than when we can't, if the
sizes of the sets are of the same magnitude.
We should always do index merge (except if disabled by a hint) if the
number of rows matching the conditions is relatively small for all
involved keys.
After all, for normal size keys, we will get 100+ keys for each key
block we read. If we can eliminate a couple of rows for each block we
read, we will probably gain speed.
I talked with Igor about having live statistics in memory for the
result of index merge. If we could keep for each index combination
the number of rows found for each index and the number of rows
eliminated, we could quite soon know which index merge makes sense.
(This is an idea for the future).
I had some ideas to improve the temporary table solution; we can
discuss these separately.
Sergey> 2. Possible range access disables index_merge/[sort]union scans
Sergey> Described as WL#24. I've posted a fix suggestion there. I'm not sure if
Sergey> it will work to the customer's complete satisfaction:
Sergey> - whether the fix will handle all kinds of WHERE clauses he needs
Sergey> - what will happen when we enable cost-based choice between range access
Sergey> and index_merge/intersect (will there be poor plan choices due to
Sergey> wrong cost calculations?)
If you can add to the worklog what kind of WHERE clauses we will be able to
optimize with your suggestions, we can then ask the customer to verify
if that is enough for him as a first step.
Sergey> 3. index_merge/intersect optimization is poor.
Sergey> Problems described in WL#26. At the moment I have only a vague idea of which
Sergey> direction we need to move in to improve it.
Sergey> We could try grabbing low-hanging fruits, like
Sergey> = make sure index_merge/intersect is not picked when the range is
Sergey> available.
Wouldn't this cause more problems like described in #2 ?
Please add some example WHERE clauses to the worklog to make it clear
exactly what you mean.
Sergey> = change the process of choosing the best index_merge/intersect plan so
Sergey> that it doesn't construct apparently useless (e.g. redundant) plans.
Do you mean disregarding indexes in the index_merge plan that
are covered by other indexes?
I think this is an obvious fix to make early.
Sergey> Or we could try re-working cost calculations so that the above is
Sergey> automatically taken care of and doesn't happen. The problem with this is
Sergey> that it's hard to estimate when we'll get an acceptable result then.
Let's start with the obvious and only when needed start thinking about
recalculating costs, as this can easily lead to some old working
queries suddenly becoming slower than before...
Sergey> It seems the first logical thing to do is #2.
Agree.
Sergey> Then we could pick between #1 and #3. #1 and #3 are related in some way as
Sergey> index_merge/intersect can make more assumptions when it optimizes for ROR
Sergey> scans only. On the other hand, Customer mentioned that if we fix #1, #3 will
Sergey> have easier job as he'll be able to remove all the multi-column indexes he
Sergey> had to create to be able to get ROR scans for every WHERE clause he has.
Agree that fixing #1 is the next logical step.
Let's discuss all these worklogs on IRC on Thursday and then decide in
which order and how we should do things.
Regards,
Monty
[Maria-developers] Updated (by Guest): index_merge: non-ROR intersection (21)
by worklog-noreply@askmonty.org 03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: non-ROR intersection
CREATION DATE..: Thu, 21 May 2009, 21:32
SUPERVISOR.....: Knielsen
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 21 (http://askmonty.org/worklog/?tid=21)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Wed, 03 Jun 2009, 01:17)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.30002 2009-06-03 01:17:32.000000000 +0300
+++ /tmp/wklog.21.new.30002 2009-06-03 01:17:32.000000000 +0300
@@ -7,13 +7,13 @@
The current optimization works with:
-WHERE key1_part1=1 AND key1_part2=2 OR key2_part1=3
+WHERE key1_part1=1 AND key1_part2=2 AND key2_part1=3
but not with:
-WHERE key1_part1=1 OR key2_part1=3
+WHERE key1_part1=1 AND key2_part1=3
or
-WHERE key_part1<10 or key2_part1<100
+WHERE key_part1<10 AND key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Monty - Wed, 03 Jun 2009, 01:06)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.29694 2009-06-03 01:06:50.000000000 +0300
+++ /tmp/wklog.21.new.29694 2009-06-03 01:06:50.000000000 +0300
@@ -12,6 +12,8 @@
but not with:
WHERE key1_part1=1 OR key2_part1=3
+or
+WHERE key_part1<10 or key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Monty - Wed, 03 Jun 2009, 01:05)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.29638 2009-06-03 01:05:01.000000000 +0300
+++ /tmp/wklog.21.new.29638 2009-06-03 01:05:01.000000000 +0300
@@ -3,5 +3,15 @@
constructed for equality conditions (t.keypart1=const1 AND t.keypart2=const2
AND ... ) and the equalities should cover all index components.
+For example, assuming that key1 has 2 parts and key2 has 1 part.
+
+The current optimization works with:
+
+WHERE key1_part1=1 AND key1_part2=2 OR key2_part1=3
+
+but not with:
+
+WHERE key1_part1=1 OR key2_part1=3
+
This WL entry is to lift this limitation by developing algorithms that do
-intersection on non-ROR scans.
+intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Guest - Tue, 26 May 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.21.old.1802 2009-05-26 14:04:57.000000000 +0300
+++ /tmp/wklog.21.new.1802 2009-05-26 14:04:57.000000000 +0300
@@ -1,4 +1,3 @@
-
<contents>
1. Execution
1.1 Temptable
@@ -30,6 +29,8 @@
1.1 Temptable
-------------
+[ This is our strategy of choice at the moment]
+
Use a temporary heap-grow-out-to-myisam table with a primary key:
create table temp_table (
@@ -168,3 +169,8 @@
a subset of columns covered by all other indexes.
= (TODO any other rules?)
+- Correlation across selectivities. If there is a condition
+
+ "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
+
+ can we consider satisfaction of AND-parts to be independent?
-=-=(Psergey - Thu, 21 May 2009, 21:33)=-=-
High-Level Specification modified.
--- /tmp/wklog.21.old.25705 2009-05-21 21:33:02.000000000 +0300
+++ /tmp/wklog.21.new.25705 2009-05-21 21:33:02.000000000 +0300
@@ -1 +1,170 @@
+<contents>
+1. Execution
+1.1 Temptable
+1.1.1 Improvement
+1.2 Produce/merge sorted streams
+1.3 Extend Unique class to handle intersection
+1.4 Strategies that do not seem to be useful
+1.4.1 Remove matches after having produced an ordered stream
+1.4.2 Sparse rowid bitmaps
+2. Optimization
+
+</contents>
+
+1. Execution
+============
+
+The primary task is to find means to compute an intersection of N unordered
+streams. Besides general memory/cpu cost of computation, we consider:
+
+- whether the produced rowid stream is ordered. If it is, it can be piped
+ into index_merge/intersect (as opposed to sort-intersect)
+
+- whether the strategy can take advantage of the fact that some input streams
+ are already rowid-ordered
+
+- startup cost (cost of producing the first output record)
+
+We see the following possible strategies:
+
+1.1 Temptable
+-------------
+Use a temporary heap-grow-out-to-myisam table with a primary key:
+
+create table temp_table (
+ rowid binary($rowid_size),
+ count n,
+ primary key(rowid);
+);
+
+Then use this algorithm:
+
+ i1= {index with the least E(#records)};
+
+ for each record R in range_scan(i1)
+ temp_table.insert(R.rowid, count=1);
+
+ for each index idx except i1
+ {
+ for each R record in scan(idx) // (INNER-LOOP)
+ {
+ if (temp_table has R)
+ temptable[R].count++;
+ }
+ }
+
+ // The following loop can do ordered or unordered scan
+ // if we want it to be ordered scan, we probably better arrange so that
+ // 'count' column is part of the index.
+ for each record R in temp_table
+ {
+ if (R.count == number_of_streams)
+ emit(R.rowid);
+ }
+
+The algorithm has an option to emit an ordered rowid stream.
+
+In the above form, the cost to produce the first record is high. It's easy to
+adjust the algorithm to make it low - we'll need to just start scanning all
+indexes at once, and finish as soon as we got a full match, i.e. the
+
+ temptable[R].count++
+
+operation resulted in the counter being equal to the number of merged scans.
+
+1.1.1 Improvement
+~~~~~~~~~~~~~~~~~
+When running INNER-LOOP, we could count how many times we've done the
+"count++" operation. If it has been done #records-in-temptable times, that
+means that all further records will not have matches and we can finish the
+scan, i.e. break out of the INNER-LOOP.
+
+1.2 Produce/merge sorted streams
+--------------------------------
+For each of the merged scan, use filesort-like action to end up with an
+ordered stream of rowids. Then merge the ordered streams.
+
+By filesort-like action we mean
+ - Run over index, collect rowids in a buffer.
+ - When the buffer is full, sort it and dump into a temporary file.
+After the above we'll end up with a number of sorted buffers on disk. We can
+use mergebuff() function (it is part of filesort's functions) to produce one
+ordered sequence (i.e. array, which may be partially on disk) of rowids.
+
+Merging of ordered streams with help of priority queue is already implemented
+in QUICK_ROR_INTERSECT_SELECT. We'll need to substitute the
+
+ child_quick->get_next()
+
+call with a call to read rowid from an ordered sequence.
+
+1.3 Extend Unique class to handle intersection
+----------------------------------------------
+There is no point to use Unique object as a device that accumulates rowids of
+a single scan then produces them in sorted order. One could do the same faster
+with accumulating an array of rowids and then sorting it.
+
+It's possible to use Unique object to collect/merge data from all scans though.
+The idea is as follows:
+
+- Unique should store <rowid, n_scans> pairs
+- Duplicates are pairs with the same rowid
+- Unique should try to avoid creating duplicates:
+ - don't add a duplicate into the in-memory part, instead combine two elements
+ together by adding their n_scans elements.
+ - combine duplicates when it sees them in Unique.get() call
+- The data we get from Unique.get() should be filtered, all records that have
+ n_scans != number_of_scans_being_merged should be discarded.
+
+If we're lucky to have started and finished a scan on some index (denote it
+as S) without flushing the Unique in the process, then:
+- there is no point in adding any new records into the Unique because their
+ absence in the Unique means that they don't have match in S and hence will
+ not get into the result of intersection.
+- we need to only update the counters to be able to tell if the elements that
+ are already in the Unique will have matches in all scans.
+
+1.4 Strategies that do not seem to be useful
+--------------------------------------------
+
+keeping them here so we don't consider them over and over
+
+1.4.1 Remove matches after having produced an ordered stream
+------------------------------------------------------------
+We can dump everything into a rowid stream and get it sorted. Then we read it,
+and if we see a rowid repeated $n_merged_scans times, it belongs to the
+intersection (pass to output), otherwise it doesn't (skip).
+This doesn't have any advantages over the produce/merge sorted streams
+approach.
+
+1.4.2 Sparse rowid bitmaps
+--------------------------
+Use Falcon-style rowid bitmaps. The problem with that is that Falcon's
+bitmaps assume there will always be enough memory to accommodate them.
+
+PostgreSQL makes bitmaps "loose" when they exceed certain size by remembering
+disk pages, not ids of individual records. It's hard for us to do something
+similar because our rowids are opaque entities whose meaning depends on the
+storage engines.
+
+This seems to require too much change to be worth it.
+
+2. Optimization
+===============
+
+SEL_TREE objects already represent intersections. The problems with
+optimizations are:
+
+- Cost formula(s)
+- When N keys/conditions are present:
+
+ "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
+
+ somehow avoid considering (2^n - n) possible options.
+
+- Avoid producing (or even considering) apparently suboptimal plans:
+ = Don't generate a merge of indexes (I_1, ... I_n) where columns of I_n are
+ a subset of columns covered by all other indexes.
+ = (TODO any other rules?)
+
DESCRIPTION:
At the moment index_merge supports intersection only for rowid-ordered streams.
This translates into a limitation that index_merge/intersect can only be
constructed for equality conditions (t.keypart1=const1 AND t.keypart2=const2
AND ... ) and the equalities should cover all index components.
For example, assume that key1 has 2 parts and key2 has 1 part.
The current optimization works with:
WHERE key1_part1=1 AND key1_part2=2 AND key2_part1=3
but not with:
WHERE key1_part1=1 AND key2_part1=3
or
WHERE key1_part1<10 AND key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Execution
1.1 Temptable
1.1.1 Improvement
1.2 Produce/merge sorted streams
1.3 Extend Unique class to handle intersection
1.4 Strategies that do not seem to be useful
1.4.1 Remove matches after having produced an ordered stream
1.4.2 Sparse rowid bitmaps
2. Optimization
</contents>
1. Execution
============
The primary task is to find means to compute an intersection of N unordered
streams. Besides general memory/cpu cost of computation, we consider:
- whether the produced rowid stream is ordered. If it is, it can be piped
into index_merge/intersect (as opposed to sort-intersect)
- whether the strategy can take advantage of the fact that some input streams
are already rowid-ordered
- startup cost (cost of producing the first output record)
We see the following possible strategies:
1.1 Temptable
-------------
[This is our strategy of choice at the moment]
Use a temporary heap-grow-out-to-myisam table with a primary key:
create table temp_table (
rowid binary($rowid_size),
count int,
primary key(rowid)
);
Then use this algorithm:
i1= {index with the least E(#records)};
for each record R in range_scan(i1)
temp_table.insert(R.rowid, count=1);
for each index idx except i1
{
for each record R in scan(idx) // (INNER-LOOP)
{
if (temp_table has R)
temp_table[R].count++;
}
}
// The following loop can do ordered or unordered scan
// if we want it to be ordered scan, we probably better arrange so that
// 'count' column is part of the index.
for each record R in temp_table
{
if (R.count == number_of_streams)
emit(R.rowid);
}
The algorithm has an option to emit an ordered rowid stream.
In the above form, the cost to produce the first record is high. It's easy to
adjust the algorithm to make it low - we'll need to just start scanning all
indexes at once, and finish as soon as we got a full match, i.e. the
temp_table[R].count++
operation resulted in the counter being equal to the number of merged scans.
1.1.1 Improvement
~~~~~~~~~~~~~~~~~
When running INNER-LOOP, we could count how many times we've done the
"count++" operation. If it has been done #records-in-temptable times, that
means that all further records will not have matches and we can finish the
scan, i.e. break out of the INNER-LOOP.
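The temptable strategy, including the early exit from 1.1.1, can be sketched as follows. This is only an illustration in Python, not server code: a dict stands in for temp_table, each scan is a plain list of rowids, len() of a scan stands in for E(#records), and the function name is made up. It assumes a single scan never returns the same rowid twice.

```python
def intersect_by_counting(scans):
    """Intersect N unordered rowid scans by counting matches per rowid."""
    if not scans:
        return []

    # i1 = the scan with the fewest (expected) records; it seeds the table.
    order = sorted(range(len(scans)), key=lambda i: len(scans[i]))
    first, rest = order[0], order[1:]

    # Stand-in for temp_table: rowid -> count, with rowid as the key.
    counts = {rowid: 1 for rowid in scans[first]}

    for idx in rest:
        increments = 0
        for rowid in scans[idx]:              # INNER-LOOP
            if rowid in counts:
                counts[rowid] += 1
                increments += 1
                # 1.1.1: every record in the table has been matched by
                # this scan, so further records cannot have matches.
                if increments == len(counts):
                    break

    # Ordered emit; an unordered pass over the table would work as well.
    return sorted(r for r, c in counts.items() if c == len(scans))


print(intersect_by_counting([[5, 3, 9, 1], [9, 7, 3], [3, 9, 2, 8]]))
```

Seeding the table from the smallest scan keeps the table small, since only rowids from that scan can ever be part of the intersection.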
1.2 Produce/merge sorted streams
--------------------------------
For each of the merged scans, use filesort-like action to end up with an
ordered stream of rowids. Then merge the ordered streams.
By filesort-like action we mean
- Run over index, collect rowids in a buffer.
- When the buffer is full, sort it and dump into a temporary file.
After the above we'll end up with a number of sorted buffers on disk. We can
use mergebuff() function (it is part of filesort's functions) to produce one
ordered sequence (i.e. array, which may be partially on disk) of rowids.
Merging of ordered streams with help of priority queue is already implemented
in QUICK_ROR_INTERSECT_SELECT. We'll need to substitute the
child_quick->get_next()
call with a call to read rowid from an ordered sequence.
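A rough sketch of this strategy in Python follows. In-memory lists stand in for the temporary files, heapq.merge stands in for both mergebuff() and the priority-queue merge, and all function names and the buffer_size parameter are invented for the example. It assumes a single scan never returns the same rowid twice.

```python
import heapq
from itertools import groupby


def sorted_runs(scan, buffer_size):
    """Filesort-like step: fill a bounded buffer, sort it, start a new run."""
    runs, buf = [], []
    for rowid in scan:
        buf.append(rowid)
        if len(buf) == buffer_size:
            runs.append(sorted(buf))   # would be dumped to a temporary file
            buf = []
    if buf:
        runs.append(sorted(buf))
    return runs


def ordered_stream(scan, buffer_size):
    """Merge the sorted runs of one scan into one ordered rowid sequence."""
    return heapq.merge(*sorted_runs(scan, buffer_size))


def intersect_sorted(scans, buffer_size=4):
    """Merge the ordered streams; keep rowids present in every stream."""
    streams = [ordered_stream(s, buffer_size) for s in scans]
    merged = heapq.merge(*streams)     # priority-queue merge of the streams
    return [rowid for rowid, group in groupby(merged)
            if sum(1 for _ in group) == len(scans)]


print(sorted_runs([5, 3, 9, 1, 7], 2))
print(intersect_sorted([[5, 3, 9, 1, 7, 2], [9, 7, 3], [3, 9, 2, 8, 7]],
                       buffer_size=2))
```

The output is ordered by rowid, which is the property this strategy is considered for.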
1.3 Extend Unique class to handle intersection
----------------------------------------------
There is no point in using a Unique object as a device that accumulates rowids
of a single scan and then produces them in sorted order. One could do the same
faster by accumulating an array of rowids and then sorting it.
It's possible to use Unique object to collect/merge data from all scans though.
The idea is as follows:
- Unique should store <rowid, n_scans> pairs
- Duplicates are pairs with the same rowid
- Unique should try to avoid creating duplicates:
- don't add a duplicate into the in-memory part, instead combine two elements
together by adding their n_scans elements.
- combine duplicates when it sees them in Unique.get() call
- The data we get from Unique.get() should be filtered, all records that have
n_scans != number_of_scans_being_merged should be discarded.
If we're lucky to have started and finished a scan on some index (denote it
as S) without flushing the Unique in the process, then:
- there is no point in adding any new records into the Unique because their
absence in the Unique means that they don't have match in S and hence will
not get into the result of intersection.
- we need to only update the counters to be able to tell if the elements that
are already in the Unique will have matches in all scans.
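To make the <rowid, n_scans> idea concrete, here is a toy model in Python. The class CountingUnique and its methods are hypothetical and do not mirror the API of the server's Unique class; a dict plays the in-memory part and a list of sorted runs plays the flushed part. The "lucky" case described above (a full scan without a flush) is not modelled.

```python
import heapq
from itertools import groupby


class CountingUnique:
    """Toy model of a Unique-like object storing <rowid, n_scans> pairs."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.in_memory = {}   # rowid -> n_scans
        self.flushed = []     # sorted runs of (rowid, n_scans) pairs

    def add(self, rowid):
        if rowid in self.in_memory:
            # Do not store a duplicate: combine by adding the counters.
            self.in_memory[rowid] += 1
            return
        if len(self.in_memory) >= self.max_in_memory:
            self.flush()
        self.in_memory[rowid] = 1

    def flush(self):
        if self.in_memory:
            self.flushed.append(sorted(self.in_memory.items()))
            self.in_memory = {}

    def get(self, n_scans_merged):
        """Combine duplicates across runs, keep rowids seen in all scans."""
        self.flush()
        result = []
        merged = heapq.merge(*self.flushed)
        for rowid, group in groupby(merged, key=lambda pair: pair[0]):
            if sum(n for _, n in group) == n_scans_merged:
                result.append(rowid)
        return result


unique = CountingUnique(max_in_memory=2)
for scan in ([5, 3, 9], [9, 3], [3, 9, 8]):
    for rowid in scan:
        unique.add(rowid)
print(unique.get(n_scans_merged=3))
```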
1.4 Strategies that do not seem to be useful
--------------------------------------------
We keep them here so that we don't consider them over and over.
1.4.1 Remove matches after having produced an ordered stream
------------------------------------------------------------
We can dump everything into a rowid stream and get it sorted. Then we read it,
and if we see a rowid repeated $n_merged_scans times, it belongs to the
intersection (pass to output), otherwise it doesn't (skip).
This doesn't have any advantages over the produce/merge sorted streams
approach.
1.4.2 Sparse rowid bitmaps
--------------------------
Use Falcon-style rowid bitmaps. The problem with that is that Falcon's
bitmaps assume there will always be enough memory to accommodate them.
PostgreSQL makes bitmaps "lossy" when they exceed a certain size by remembering
disk pages, not ids of individual records. It's hard for us to do something
similar because our rowids are opaque entities whose meaning depends on the
storage engine.
This seems to require too much change to be worth it.
2. Optimization
===============
SEL_TREE objects already represent intersections. The problems with
optimizations are:
- Cost formula(s)
- When N keys/conditions are present:
    "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
  somehow avoid considering (2^n - n) possible options.
- Avoid producing (or even considering) apparently suboptimal plans:
  = Don't generate a merge of indexes (I_1, ... I_n) where columns of I_n are
    a subset of columns covered by all other indexes.
  = (TODO any other rules?)
- Correlation across selectivities. If there is a condition
    "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
  can we consider satisfaction of AND-parts to be independent?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): index_merge: non-ROR intersection (21)
by worklog-noreply@askmonty.org 03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: non-ROR intersection
CREATION DATE..: Thu, 21 May 2009, 21:32
SUPERVISOR.....: Knielsen
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 21 (http://askmonty.org/worklog/?tid=21)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Wed, 03 Jun 2009, 01:17)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.30002 2009-06-03 01:17:32.000000000 +0300
+++ /tmp/wklog.21.new.30002 2009-06-03 01:17:32.000000000 +0300
@@ -7,13 +7,13 @@
The current optimization works with:
-WHERE key1_part1=1 AND key1_part2=2 OR key2_part1=3
+WHERE key1_part1=1 AND key1_part2=2 AND key2_part1=3
but not with:
-WHERE key1_part1=1 OR key2_part1=3
+WHERE key1_part1=1 AND key2_part1=3
or
-WHERE key_part1<10 or key2_part1<100
+WHERE key_part1<10 AND key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Monty - Wed, 03 Jun 2009, 01:06)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.29694 2009-06-03 01:06:50.000000000 +0300
+++ /tmp/wklog.21.new.29694 2009-06-03 01:06:50.000000000 +0300
@@ -12,6 +12,8 @@
but not with:
WHERE key1_part1=1 OR key2_part1=3
+or
+WHERE key_part1<10 or key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Monty - Wed, 03 Jun 2009, 01:05)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.29638 2009-06-03 01:05:01.000000000 +0300
+++ /tmp/wklog.21.new.29638 2009-06-03 01:05:01.000000000 +0300
@@ -3,5 +3,15 @@
constructed for equality conditions (t.keypart1=const1 AND t.keypart2=const2
AND ... ) and the equalities should cover all index components.
+For example, assuming that key1 has 2 parts and key2 has 1 part.
+
+The current optimization works with:
+
+WHERE key1_part1=1 AND key1_part2=2 OR key2_part1=3
+
+but not with:
+
+WHERE key1_part1=1 OR key2_part1=3
+
This WL entry is to lift this limitation by developing algorithms that do
-intersection on non-ROR scans.
+intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Guest - Tue, 26 May 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.21.old.1802 2009-05-26 14:04:57.000000000 +0300
+++ /tmp/wklog.21.new.1802 2009-05-26 14:04:57.000000000 +0300
@@ -1,4 +1,3 @@
-
<contents>
1. Execution
1.1 Temptable
@@ -30,6 +29,8 @@
1.1 Temptable
-------------
+[ This is our strategy of choice at the moment]
+
Use a temporary heap-grow-out-to-myisam table with a primary key:
create table temp_table (
@@ -168,3 +169,8 @@
a subset of columns covered by all other indexes.
= (TODO any other rules?)
+- Correlation across selectivities. If there is a condition
+
+ "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
+
+ can we consider satisfaction of AND-parts to be independent?
-=-=(Psergey - Thu, 21 May 2009, 21:33)=-=-
High-Level Specification modified.
--- /tmp/wklog.21.old.25705 2009-05-21 21:33:02.000000000 +0300
+++ /tmp/wklog.21.new.25705 2009-05-21 21:33:02.000000000 +0300
@@ -1 +1,170 @@
+<contents>
+1. Execution
+1.1 Temptable
+1.1.1 Improvement
+1.2 Produce/merge sorted streams
+1.3 Extend Unique class to handle intersection
+1.4 Strategies that do not seem to be useful
+1.4.1 Remove matches after having produced an ordered stream
+1.4.2 Sparse rowid bitmaps
+2. Optimization
+
+</contents>
+
+1. Execution
+============
+
+The primary task is to find means to compute an intersection of N unordered
+streams. Besides general memory/cpu cost of computation, we consider:
+
+- whether the produced rowid stream is ordered. If it is, it can be piped
+ into index_merge/intersect (as opposed to sort-intersect)
+
+- whether the strategy can take advantage of the fact that some input streams
+ are already rowid-ordered
+
+- startup cost (cost of producing the first output record)
+
+We see the following possible strategies:
+
+1.1 Temptable
+-------------
+Use a temporary heap-grow-out-to-myisam table with a primary key:
+
+create table temp_table (
+ rowid binary($rowid_size),
+ count n,
+ primary key(rowid);
+);
+
+Then use this algorithm:
+
+ i1= {index with the least E(#records)};
+
+ for each record R in range_scan(i1)
+ temp_table.insert(R.rowid, count=1);
+
+ for each index idx except i1
+ {
+ for each R record in scan(idx) // (INNER-LOOP)
+ {
+ if (temp_table has R)
+ temptable[R].count++;
+ }
+ }
+
+ // The following loop can do ordered or unordered scan
+ // if we want it to be ordered scan, we probably better arrange so that
+ // 'count' column is part of the index.
+ for each record R in temp_table
+ {
+ if (R.count == number_of_streams)
+ emit(R.rowid);
+ }
+
+The algorithm has an option to emit an ordered rowid stream.
+
+In the above form, the cost to produce the first record is high. It's easy to
+adjust the algorithm to make it low - we'll need to just start scanning all
+indexes at once, and finish as soon as we got a full match, i.e. the
+
+ temptable[R].count++
+
+operation resulted in the counter being equal to the number of merged scans.
+
+1.1.1 Improvement
+~~~~~~~~~~~~~~~~~
+When running INNER-LOOP, we could count how many times we've done the
+"count++" operation. If it has been done #records-in-temptable times, that
+means that all further records will not have matches and we can finish the
+scan, i.e. break out of the INNER-LOOP.
+
+1.2 Produce/merge sorted streams
+--------------------------------
+For each of the merged scan, use filesort-like action to end up with an
+ordered stream of rowids. Then merge the ordered streams.
+
+By filesort-like action we mean
+ - Run over index, collect rowids in a buffer.
+ - When the buffer is full, sort it and dump into a temporary file.
+After the above we'll end up with a number of sorted buffers on disk. We can
+use mergebuff() function (it is part of filesort's functions) to produce one
+ordered sequence (i.e. array, which may be partially on disk) of rowids.
+
+Merging of ordered streams with help of priority queue is already implemented
+in QUICK_ROR_INTERSECT_SELECT. We'll need to substitute the
+
+ child_quick->get_next()
+
+call with a call to read rowid from an ordered sequence.
+
+1.3 Extend Unique class to handle intersection
+----------------------------------------------
+There is no point to use Unique object as a device that accumulates rowids of
+a single scan then produces them in sorted order. One could do the same faster
+with accumulating an array of rowids and then sorting it.
+
+It's possible to use Unique object to collect/merge data from all scans though.
+The idea is as follows:
+
+- Unique should store <rowid, n_scans> pairs
+- Duplicates are pairs with the same rowid
+- Unique should try to avoid creating duplicates:
+ - don't add a duplicate into the in-memory part, instead combine two elements
+ together by adding their n_scans elements.
+ - combine duplicates when it sees them in Unique.get() call
+- The data we get from Unique.get() should be filtered, all records that have
+ n_scans != number_of_scans_being_merged should be discarded.
+
+If we're lucky to have started and finished a scan on some index (denote it
+as S) without flushing the Unique in the process, then:
+- there is no point in adding any new records into the Unique because their
+ absence in the Unique means that they don't have match in S and hence will
+ not get into the result of intersection.
+- we need to only update the counters to be able to tell if the elements that
+ are already in the Unique will have matches in all scans.
+
+1.4 Strategies that do not seem to be useful
+--------------------------------------------
+
+keeping them here so we don't consider them over and over
+
+1.4.1 Remove matches after having produced an ordered stream
+------------------------------------------------------------
+We can dump everything into a rowid stream and get it sorted. Then we read it,
+and if we see a rowid repeated $n_merged_scans times, it belongs to the
+intersection (pass to output), otherwise it doesn't (skip).
+This doesn't have any advantages over the produce/merge sorted streams
+approach.
+
+1.4.2 Sparse rowid bitmaps
+--------------------------
+Use Falcon-style rowid bitmaps. The problem with that is that Falcon's
+bitmaps assume there will always be enough memory to accommodate them.
+
+PostgreSQL makes bitmaps "loose" when they exceed certain size by remembering
+disk pages, not ids of individual records. It's hard for us to do something
+similar because our rowids are opaque entities whose meaning depends on the
+storage engines.
+
+This seems to require too much change to be worth it.
+
+2. Optimization
+===============
+
+SEL_TREE objects already represent intersections. The problems with
+optimizations are:
+
+- Cost formula(s)
+- When N keys/conditions are present:
+
+ "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
+
+ somehow avoid considering (2^n - n) possible options.
+
+- Avoid producing (or even considering) apparently suboptimal plans:
+ = Don't generate a merge of indexes (I_1, ... I_n) where columns of I_n are
+ a subset of columns covered by all other indexes.
+ = (TODO any other rules?)
+
DESCRIPTION:
At the moment index_merge supports intersection only for rowid-ordered streams.
This translates into a limitation that index_merge/intersect can only be
constructed for equality conditions (t.keypart1=const1 AND t.keypart2=const2
AND ... ) and the equalities should cover all index components.
For example, assume that key1 has 2 parts and key2 has 1 part.
The current optimization works with:
WHERE key1_part1=1 AND key1_part2=2 AND key2_part1=3
but not with:
WHERE key1_part1=1 AND key2_part1=3
or
WHERE key1_part1<10 AND key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Execution
1.1 Temptable
1.1.1 Improvement
1.2 Produce/merge sorted streams
1.3 Extend Unique class to handle intersection
1.4 Strategies that do not seem to be useful
1.4.1 Remove matches after having produced an ordered stream
1.4.2 Sparse rowid bitmaps
2. Optimization
</contents>
1. Execution
============
The primary task is to find means to compute an intersection of N unordered
streams. Besides general memory/cpu cost of computation, we consider:
- whether the produced rowid stream is ordered. If it is, it can be piped
into index_merge/intersect (as opposed to sort-intersect)
- whether the strategy can take advantage of the fact that some input streams
are already rowid-ordered
- startup cost (cost of producing the first output record)
We see the following possible strategies:
1.1 Temptable
-------------
[ This is our strategy of choice at the moment]
Use a temporary heap-grow-out-to-myisam table with a primary key:

  create table temp_table (
    rowid binary($rowid_size),
    count int,
    primary key(rowid)
  );

Then use this algorithm:

  i1= {index with the least E(#records)};

  for each record R in range_scan(i1)
    temp_table.insert(R.rowid, count=1);

  for each index idx except i1
  {
    for each record R in scan(idx)  // (INNER-LOOP)
    {
      if (temp_table has R)
        temptable[R].count++;
    }
  }

  // The following loop can do an ordered or an unordered scan.
  // If we want an ordered scan, we should probably arrange for the
  // 'count' column to be part of the index.
  for each record R in temp_table
  {
    if (R.count == number_of_streams)
      emit(R.rowid);
  }

The algorithm has an option to emit an ordered rowid stream.
In the above form, the cost to produce the first record is high. It's easy to
adjust the algorithm to make it low: we just need to start scanning all
indexes at once and emit a rowid as soon as we get a full match, i.e. as soon
as the
  temptable[R].count++
operation leaves the counter equal to the number of merged scans.
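
The low-startup-cost variant could be sketched roughly as follows. This is an
illustration only, not server code: interleaved_intersect is a hypothetical
name, a Python dict stands in for the temporary table, scans are plain
iterables of rowids, and each rowid is assumed to occur at most once per scan.
Note that in this variant every scan has to insert into the table, so the
table can grow up to the size of the union of all scans.

```python
from itertools import zip_longest

_DONE = object()  # marks a scan that has already been exhausted


def interleaved_intersect(scans):
    """Yield rowids present in every scan, as early as possible.

    All scans are advanced round-robin; a rowid is emitted the moment its
    counter reaches the number of merged scans.
    """
    n = len(scans)
    counts = {}  # stand-in for temp_table: rowid -> count
    for batch in zip_longest(*scans, fillvalue=_DONE):
        for rowid in batch:
            if rowid is _DONE:
                continue
            counts[rowid] = counts.get(rowid, 0) + 1
            if counts[rowid] == n:
                yield rowid  # full match: emit without waiting for scan end
```

The output of this sketch is not rowid-ordered; it trades ordering for a low
cost of producing the first record.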
1.1.1 Improvement
~~~~~~~~~~~~~~~~~
When running INNER-LOOP, we could count how many times we've done the
"count++" operation. If it has been done #records-in-temptable times, then
no further records can have matches and we can finish the scan, i.e. break
out of the INNER-LOOP.
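
A minimal sketch of the temptable algorithm with the early exit described
above. The assumptions are ours, not part of the specification:
temptable_intersect is a hypothetical name, a Python dict stands in for the
temporary table, scans[0] plays the role of i1 (the scan with the fewest
expected records), and each rowid occurs at most once per scan.

```python
def temptable_intersect(scans):
    """Return the sorted rowids that occur in every scan."""
    first, rest = scans[0], scans[1:]
    # Populate the table from i1; every rowid starts with count=1.
    counts = {rowid: 1 for rowid in first}
    if not counts:
        return []

    for scan in rest:
        increments = 0
        for rowid in scan:  # INNER-LOOP
            if rowid in counts:
                counts[rowid] += 1
                increments += 1
                # 1.1.1: every table row has been matched by this scan,
                # so the rest of the scan cannot change anything.
                if increments == len(counts):
                    break

    n = len(scans)
    # Ordered final pass: emit only rowids seen in all scans.
    return [rowid for rowid in sorted(counts) if counts[rowid] == n]
```

For example, temptable_intersect([[5, 1, 9], [9, 7, 5, 3], [2, 5, 9, 1]])
returns [5, 9], and the third scan is cut short once all three table rows
have been matched.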
1.2 Produce/merge sorted streams
--------------------------------
For each of the merged scans, use a filesort-like action to end up with an
ordered stream of rowids. Then merge the ordered streams.
By filesort-like action we mean:
  - Run over the index, collecting rowids in a buffer.
  - When the buffer is full, sort it and dump it into a temporary file.
After the above we'll end up with a number of sorted buffers on disk. We can
use the mergebuff() function (it is part of filesort's functions) to produce one
ordered sequence (i.e. an array, which may be partially on disk) of rowids.
Merging of ordered streams with the help of a priority queue is already
implemented in QUICK_ROR_INTERSECT_SELECT. We'll need to substitute the
  child_quick->get_next()
call with a call that reads a rowid from an ordered sequence.
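
The two phases above could be sketched as follows. This is a rough
illustration under stated assumptions: sorted_stream and intersect_sorted are
hypothetical names, in-memory lists stand in for the temporary files, and
Python's heapq.merge stands in for both the filesort merge step and the
priority queue of QUICK_ROR_INTERSECT_SELECT. Each rowid is assumed to occur
at most once per scan.

```python
import heapq


def sorted_stream(scan, buffer_size):
    """Turn an unordered scan into one ordered rowid stream."""
    runs, buf = [], []
    for rowid in scan:
        buf.append(rowid)
        if len(buf) == buffer_size:
            runs.append(sorted(buf))  # stands in for a run dumped to disk
            buf = []
    if buf:
        runs.append(sorted(buf))
    return heapq.merge(*runs)  # merge the sorted runs into one sequence


def intersect_sorted(streams):
    """Yield, in rowid order, the rowids found in every ordered stream."""
    n = len(streams)
    current, count = None, 0
    for rowid in heapq.merge(*streams):
        if rowid == current:
            count += 1
        else:
            current, count = rowid, 1
        if count == n:
            yield rowid
```

Unlike the temptable sketch, the output here is always rowid-ordered, at the
price of sorting every input scan first.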
1.3 Extend Unique class to handle intersection
----------------------------------------------
There is no point in using a Unique object as a device that accumulates the
rowids of a single scan and then produces them in sorted order. One could do
the same faster by accumulating an array of rowids and then sorting it.
It's possible to use a Unique object to collect/merge data from all scans though.
The idea is as follows:
- Unique should store <rowid, n_scans> pairs
- Duplicates are pairs with the same rowid
- Unique should try to avoid creating duplicates:
  - don't add a duplicate into the in-memory part; instead combine the two
    elements by adding their n_scans values.
  - combine duplicates when it sees them in the Unique.get() call
- The data we get from Unique.get() should be filtered: all records that have
  n_scans != number_of_scans_being_merged should be discarded.
If we're lucky enough to have started and finished a scan on some index (denote
it as S) without flushing the Unique in the process, then:
- there is no point in adding any new records into the Unique, because their
  absence from the Unique means that they have no match in S and hence will
  not get into the result of the intersection.
- we only need to update the counters to be able to tell whether the elements
  that are already in the Unique have matches in all scans.
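
A toy model of the extended Unique, to make the first list above concrete.
It is not the server's Unique class: CountingUnique, put() and
max_in_memory are hypothetical, a dict stands in for the in-memory tree, and
sorted lists stand in for the flushed on-disk parts. The "scan finished
without a flush" shortcut is left out of this sketch.

```python
import heapq


class CountingUnique:
    """Collects <rowid, n_scans> pairs and yields full matches in order."""

    def __init__(self, n_scans, max_in_memory):
        self.n_scans = n_scans
        self.max_in_memory = max_in_memory
        self.mem = {}   # in-memory part: rowid -> n_scans counter
        self.runs = []  # flushed parts, each sorted by rowid

    def put(self, rowid):
        if rowid in self.mem:
            self.mem[rowid] += 1  # combine instead of storing a duplicate
            return
        if len(self.mem) == self.max_in_memory:
            self._flush()
        self.mem[rowid] = 1

    def _flush(self):
        self.runs.append(sorted(self.mem.items()))
        self.mem = {}

    def get(self):
        """Merge all parts, combine duplicates, keep only full matches."""
        if self.mem:
            self._flush()
        current, total = None, 0
        for rowid, n in heapq.merge(*self.runs):
            if rowid != current:
                if total == self.n_scans:
                    yield current
                current, total = rowid, 0
            total += n  # duplicates across flushed parts are combined here
        if total == self.n_scans:
            yield current
```

With a small max_in_memory the same rowid ends up in several flushed parts,
which is exactly the case where get() has to add the n_scans values together
before filtering.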
1.4 Strategies that do not seem to be useful
--------------------------------------------
Keeping them here so we don't consider them over and over.
1.4.1 Remove matches after having produced an ordered stream
------------------------------------------------------------
We can dump everything into a rowid stream and get it sorted. Then we read it,
and if we see a rowid repeated $n_merged_scans times, it belongs to the
intersection (pass to output), otherwise it doesn't (skip).
This doesn't have any advantages over the produce/merge sorted streams
approach.
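
For the record, the rejected strategy amounts to something like the sketch
below (sort_then_filter is a hypothetical name; each rowid is assumed to
occur at most once per scan). It sorts the concatenation of all scans in one
go, so it cannot exploit inputs that are already rowid-ordered.

```python
from itertools import chain, groupby


def sort_then_filter(scans):
    """Sort all rowids together, keep those repeated len(scans) times."""
    n = len(scans)
    ordered = sorted(chain.from_iterable(scans))
    return [rowid for rowid, group in groupby(ordered)
            if sum(1 for _ in group) == n]
```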
1.4.2 Sparse rowid bitmaps
--------------------------
Use Falcon-style rowid bitmaps. The problem with that is that Falcon's
bitmaps assume there will always be enough memory to accommodate them.
PostgreSQL makes bitmaps "lossy" when they exceed a certain size by remembering
disk pages, not ids of individual records. It's hard for us to do something
similar because our rowids are opaque entities whose meaning depends on the
storage engine.
This seems to require too much change to be worth it.
2. Optimization
===============
SEL_TREE objects already represent intersections. The problems with
optimizations are:
- Cost formula(s)
- When N keys/conditions are present:
    "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
  somehow avoid considering (2^n - n) possible options.
- Avoid producing (or even considering) apparently suboptimal plans:
  = Don't generate a merge of indexes (I_1, ... I_n) where columns of I_n are
    a subset of columns covered by all other indexes.
  = (TODO any other rules?)
- Correlation across selectivities. If there is a condition
    "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
  can we consider satisfaction of AND-parts to be independent?
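
To make two of the points above concrete, here is a small sketch. Both
functions are hypothetical and rest on our own reading of the text:
candidate_merges interprets the pruning rule as "skip a merge if some index
in it has all of its columns covered by the union of the other indexes in
that merge", and estimated_rows shows what the output cardinality would be if
the AND-parts were treated as independent, which is precisely the open
question.

```python
from itertools import combinations
from math import prod


def candidate_merges(index_columns):
    """Yield index combinations of size >= 2 that survive the pruning rule.

    index_columns maps an index name to the set of columns it covers.
    """
    names = sorted(index_columns)
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            redundant = any(
                index_columns[i] <= set().union(
                    *(index_columns[j] for j in combo if j != i))
                for i in combo
            )
            if not redundant:
                yield combo


def estimated_rows(table_rows, selectivities):
    """Expected intersection size under the independence assumption."""
    return table_rows * prod(selectivities)
```

Even with pruning, the enumeration is still exponential in the number of
indexes, so this only illustrates the rule; it does not solve the search
space problem.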
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Monty): index_merge: non-ROR intersection (21)
by worklog-noreply@askmonty.org 03 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: non-ROR intersection
CREATION DATE..: Thu, 21 May 2009, 21:32
SUPERVISOR.....: Knielsen
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 21 (http://askmonty.org/worklog/?tid=21)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Monty - Wed, 03 Jun 2009, 01:06)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.29694 2009-06-03 01:06:50.000000000 +0300
+++ /tmp/wklog.21.new.29694 2009-06-03 01:06:50.000000000 +0300
@@ -12,6 +12,8 @@
but not with:
WHERE key1_part1=1 OR key2_part1=3
+or
+WHERE key_part1<10 or key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Monty - Wed, 03 Jun 2009, 01:05)=-=-
High Level Description modified.
--- /tmp/wklog.21.old.29638 2009-06-03 01:05:01.000000000 +0300
+++ /tmp/wklog.21.new.29638 2009-06-03 01:05:01.000000000 +0300
@@ -3,5 +3,15 @@
constructed for equality conditions (t.keypart1=const1 AND t.keypart2=const2
AND ... ) and the equalities should cover all index components.
+For example, assuming that key1 has 2 parts and key2 has 1 part.
+
+The current optimization works with:
+
+WHERE key1_part1=1 AND key1_part2=2 OR key2_part1=3
+
+but not with:
+
+WHERE key1_part1=1 OR key2_part1=3
+
This WL entry is to lift this limitation by developing algorithms that do
-intersection on non-ROR scans.
+intersection on non-ROR (rowid ordered retrieval) scans.
-=-=(Guest - Tue, 26 May 2009, 14:04)=-=-
High-Level Specification modified.
--- /tmp/wklog.21.old.1802 2009-05-26 14:04:57.000000000 +0300
+++ /tmp/wklog.21.new.1802 2009-05-26 14:04:57.000000000 +0300
@@ -1,4 +1,3 @@
-
<contents>
1. Execution
1.1 Temptable
@@ -30,6 +29,8 @@
1.1 Temptable
-------------
+[ This is our strategy of choice at the moment]
+
Use a temporary heap-grow-out-to-myisam table with a primary key:
create table temp_table (
@@ -168,3 +169,8 @@
a subset of columns covered by all other indexes.
= (TODO any other rules?)
+- Correlation across selectivities. If there is a condition
+
+ "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
+
+ can we consider satisfaction of AND-parts to be independent?
-=-=(Psergey - Thu, 21 May 2009, 21:33)=-=-
High-Level Specification modified.
--- /tmp/wklog.21.old.25705 2009-05-21 21:33:02.000000000 +0300
+++ /tmp/wklog.21.new.25705 2009-05-21 21:33:02.000000000 +0300
@@ -1 +1,170 @@
+<contents>
+1. Execution
+1.1 Temptable
+1.1.1 Improvement
+1.2 Produce/merge sorted streams
+1.3 Extend Unique class to handle intersection
+1.4 Strategies that do not seem to be useful
+1.4.1 Remove matches after having produced an ordered stream
+1.4.2 Sparse rowid bitmaps
+2. Optimization
+
+</contents>
+
+1. Execution
+============
+
+The primary task is to find means to compute an intersection of N unordered
+streams. Besides general memory/cpu cost of computation, we consider:
+
+- whether the produced rowid stream is ordered. If it is, it can be piped
+ into index_merge/intersect (as opposed to sort-intersect)
+
+- whether the strategy can take advantage of the fact that some input streams
+ are already rowid-ordered
+
+- startup cost (cost of producing the first output record)
+
+We see the following possible strategies:
+
+1.1 Temptable
+-------------
+Use a temporary heap-grow-out-to-myisam table with a primary key:
+
+create table temp_table (
+ rowid binary($rowid_size),
+ count n,
+ primary key(rowid);
+);
+
+Then use this algorithm:
+
+ i1= {index with the least E(#records)};
+
+ for each record R in range_scan(i1)
+ temp_table.insert(R.rowid, count=1);
+
+ for each index idx except i1
+ {
+ for each R record in scan(idx) // (INNER-LOOP)
+ {
+ if (temp_table has R)
+ temptable[R].count++;
+ }
+ }
+
+ // The following loop can do ordered or unordered scan
+ // if we want it to be ordered scan, we probably better arrange so that
+ // 'count' column is part of the index.
+ for each record R in temp_table
+ {
+ if (R.count == number_of_streams)
+ emit(R.rowid);
+ }
+
+The algorithm has an option to emit an ordered rowid stream.
+
+In the above form, the cost to produce the first record is high. It's easy to
+adjust the algorithm to make it low - we'll need to just start scanning all
+indexes at once, and finish as soon as we got a full match, i.e. the
+
+ temptable[R].count++
+
+operation resulted in the counter being equal to the number of merged scans.
+
+1.1.1 Improvement
+~~~~~~~~~~~~~~~~~
+When running INNER-LOOP, we could count how many times we've done the
+"count++" operation. If it has been done #records-in-temptable times, that
+means that all further records will not have matches and we can finish the
+scan, i.e. break out of the INNER-LOOP.
+
+1.2 Produce/merge sorted streams
+--------------------------------
+For each of the merged scan, use filesort-like action to end up with an
+ordered stream of rowids. Then merge the ordered streams.
+
+By filesort-like action we mean
+ - Run over index, collect rowids in a buffer.
+ - When the buffer is full, sort it and dump into a temporary file.
+After the above we'll end up with a number of sorted buffers on disk. We can
+use mergebuff() function (it is part of filesort's functions) to produce one
+ordered sequence (i.e. array, which may be partially on disk) of rowids.
+
+Merging of ordered streams with help of priority queue is already implemented
+in QUICK_ROR_INTERSECT_SELECT. We'll need to substitute the
+
+ child_quick->get_next()
+
+call with a call to read rowid from an ordered sequence.
+
+1.3 Extend Unique class to handle intersection
+----------------------------------------------
+There is no point to use Unique object as a device that accumulates rowids of
+a single scan then produces them in sorted order. One could do the same faster
+with accumulating an array of rowids and then sorting it.
+
+It's possible to use Unique object to collect/merge data from all scans though.
+The idea is as follows:
+
+- Unique should store <rowid, n_scans> pairs
+- Duplicates are pairs with the same rowid
+- Unique should try to avoid creating duplicates:
+ - don't add a duplicate into the in-memory part, instead combine two elements
+ together by adding their n_scans elements.
+ - combine duplicates when it sees them in Unique.get() call
+- The data we get from Unique.get() should be filtered, all records that have
+ n_scans != number_of_scans_being_merged should be discarded.
+
+If we're lucky to have started and finished a scan on some index (denote it
+as S) without flushing the Unique in the process, then:
+- there is no point in adding any new records into the Unique because their
+ absence in the Unique means that they don't have match in S and hence will
+ not get into the result of intersection.
+- we need to only update the counters to be able to tell if the elements that
+ are already in the Unique will have matches in all scans.
+
+1.4 Strategies that do not seem to be useful
+--------------------------------------------
+
+keeping them here so we don't consider them over and over
+
+1.4.1 Remove matches after having produced an ordered stream
+------------------------------------------------------------
+We can dump everything into a rowid stream and get it sorted. Then we read it,
+and if we see a rowid repeated $n_merged_scans times, it belongs to the
+intersection (pass to output), otherwise it doesn't (skip).
+This doesn't have any advantages over the produce/merge sorted streams
+approach.
+
+1.4.2 Sparse rowid bitmaps
+--------------------------
+Use Falcon-style rowid bitmaps. The problem with that is that Falcon's
+bitmaps assume there will always be enough memory to accommodate them.
+
+PostgreSQL makes bitmaps "loose" when they exceed certain size by remembering
+disk pages, not ids of individual records. It's hard for us to do something
+similar because our rowids are opaque entities whose meaning depends on the
+storage engines.
+
+This seems to require too much change to be worth it.
+
+2. Optimization
+===============
+
+SEL_TREE objects already represent intersections. The problems with
+optimizations are:
+
+- Cost formula(s)
+- When N keys/conditions are present:
+
+ "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
+
+ somehow avoid considering (2^n - n) possible options.
+
+- Avoid producing (or even considering) apparently suboptimal plans:
+ = Don't generate a merge of indexes (I_1, ... I_n) where columns of I_n are
+ a subset of columns covered by all other indexes.
+ = (TODO any other rules?)
+
DESCRIPTION:
At the moment index_merge supports intersection only for rowid-ordered streams.
This translates into a limitation that index_merge/intersect can only be
constructed for equality conditions (t.keypart1=const1 AND t.keypart2=const2
AND ... ) and the equalities should cover all index components.
For example, assume that key1 has 2 parts and key2 has 1 part.
The current optimization works with:
WHERE key1_part1=1 AND key1_part2=2 OR key2_part1=3
but not with:
WHERE key1_part1=1 OR key2_part1=3
or
WHERE key_part1<10 or key2_part1<100
This WL entry is to lift this limitation by developing algorithms that do
intersection on non-ROR (rowid ordered retrieval) scans.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Execution
1.1 Temptable
1.1.1 Improvement
1.2 Produce/merge sorted streams
1.3 Extend Unique class to handle intersection
1.4 Strategies that do not seem to be useful
1.4.1 Remove matches after having produced an ordered stream
1.4.2 Sparse rowid bitmaps
2. Optimization
</contents>
1. Execution
============
The primary task is to find means to compute an intersection of N unordered
streams. Besides general memory/cpu cost of computation, we consider:
- whether the produced rowid stream is ordered. If it is, it can be piped
into index_merge/intersect (as opposed to sort-intersect)
- whether the strategy can take advantage of the fact that some input streams
are already rowid-ordered
- startup cost (cost of producing the first output record)
We see the following possible strategies:
1.1 Temptable
-------------
[ This is our strategy of choice at the moment]
Use a temporary heap-grow-out-to-myisam table with a primary key:

  create table temp_table (
    rowid binary($rowid_size),
    count int,
    primary key(rowid)
  );

Then use this algorithm:

  i1= {index with the least E(#records)};

  for each record R in range_scan(i1)
    temp_table.insert(R.rowid, count=1);

  for each index idx except i1
  {
    for each record R in scan(idx)  // (INNER-LOOP)
    {
      if (temp_table has R)
        temptable[R].count++;
    }
  }

  // The following loop can do an ordered or an unordered scan.
  // If we want an ordered scan, we should probably arrange for the
  // 'count' column to be part of the index.
  for each record R in temp_table
  {
    if (R.count == number_of_streams)
      emit(R.rowid);
  }

The algorithm has an option to emit an ordered rowid stream.
In the above form, the cost to produce the first record is high. It's easy to
adjust the algorithm to make it low: we just need to start scanning all
indexes at once and emit a rowid as soon as we get a full match, i.e. as soon
as the
  temptable[R].count++
operation leaves the counter equal to the number of merged scans.
1.1.1 Improvement
~~~~~~~~~~~~~~~~~
When running INNER-LOOP, we could count how many times we've done the
"count++" operation. If it has been done #records-in-temptable times, then
no further records can have matches and we can finish the scan, i.e. break
out of the INNER-LOOP.
1.2 Produce/merge sorted streams
--------------------------------
For each of the merged scans, use a filesort-like action to end up with an
ordered stream of rowids. Then merge the ordered streams.
By filesort-like action we mean:
  - Run over the index, collecting rowids in a buffer.
  - When the buffer is full, sort it and dump it into a temporary file.
After the above we'll end up with a number of sorted buffers on disk. We can
use the mergebuff() function (it is part of filesort's functions) to produce one
ordered sequence (i.e. an array, which may be partially on disk) of rowids.
Merging of ordered streams with the help of a priority queue is already
implemented in QUICK_ROR_INTERSECT_SELECT. We'll need to substitute the
  child_quick->get_next()
call with a call that reads a rowid from an ordered sequence.
1.3 Extend Unique class to handle intersection
----------------------------------------------
There is no point in using a Unique object as a device that accumulates the
rowids of a single scan and then produces them in sorted order. One could do
the same faster by accumulating an array of rowids and then sorting it.
It's possible to use a Unique object to collect/merge data from all scans though.
The idea is as follows:
- Unique should store <rowid, n_scans> pairs
- Duplicates are pairs with the same rowid
- Unique should try to avoid creating duplicates:
  - don't add a duplicate into the in-memory part; instead combine the two
    elements by adding their n_scans values.
  - combine duplicates when it sees them in the Unique.get() call
- The data we get from Unique.get() should be filtered: all records that have
  n_scans != number_of_scans_being_merged should be discarded.
If we're lucky enough to have started and finished a scan on some index (denote
it as S) without flushing the Unique in the process, then:
- there is no point in adding any new records into the Unique, because their
  absence from the Unique means that they have no match in S and hence will
  not get into the result of the intersection.
- we only need to update the counters to be able to tell whether the elements
  that are already in the Unique have matches in all scans.
1.4 Strategies that do not seem to be useful
--------------------------------------------
Keeping them here so we don't consider them over and over.
1.4.1 Remove matches after having produced an ordered stream
------------------------------------------------------------
We can dump everything into a rowid stream and get it sorted. Then we read it,
and if we see a rowid repeated $n_merged_scans times, it belongs to the
intersection (pass to output), otherwise it doesn't (skip).
This doesn't have any advantages over the produce/merge sorted streams
approach.
1.4.2 Sparse rowid bitmaps
--------------------------
Use Falcon-style rowid bitmaps. The problem with that is that Falcon's
bitmaps assume there will always be enough memory to accommodate them.
PostgreSQL makes bitmaps "lossy" when they exceed a certain size by remembering
disk pages, not ids of individual records. It's hard for us to do something
similar because our rowids are opaque entities whose meaning depends on the
storage engine.
This seems to require too much change to be worth it.
2. Optimization
===============
SEL_TREE objects already represent intersections. The problems with
optimizations are:
- Cost formula(s)
- When N keys/conditions are present:
    "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
  somehow avoid considering (2^n - n) possible options.
- Avoid producing (or even considering) apparently suboptimal plans:
  = Don't generate a merge of indexes (I_1, ... I_n) where columns of I_n are
    a subset of columns covered by all other indexes.
  = (TODO any other rules?)
- Correlation across selectivities. If there is a condition
    "cond(key1) AND cond(key2) AND ... AND cond(keyN)",
  can we consider satisfaction of AND-parts to be independent?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)