
[Maria-developers] Rev 2726: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r10/
by Sergey Petrunya 16 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10/
------------------------------------------------------------
revno: 2726
revision-id: psergey(a)askmonty.org-20090816091549-da84w3nlmx8prmvm
parent: psergey(a)askmonty.org-20090816072524-w9fu2hy23pjwlr8z
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10
timestamp: Sun 2009-08-16 12:15:49 +0300
message:
MWL#17: Table elimination
- code cleanup
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-16 07:25:24 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-16 09:15:49 +0000
@@ -125,7 +125,7 @@
make elements that depend on them bound, too.
*/
Module_dep *next;
- uint unknown_args; /* TRUE<=> The entity is considered bound */
+ uint unknown_args;
Module_dep() : next(NULL), unknown_args(0) {}
};
@@ -249,11 +249,9 @@
void eliminate_tables(JOIN *join);
static void mark_as_eliminated(JOIN *join, TABLE_LIST *tbl);
-#if 0
#ifndef DBUG_OFF
static void dbug_print_deps(Table_elimination *te);
#endif
-#endif
/*******************************************************************************************/
/*
@@ -854,7 +852,7 @@
/*
- This is used to analyse expressions in "tbl.col=expr" dependencies so
+ This is used to analyze expressions in "tbl.col=expr" dependencies so
that we can figure out which fields the expression depends on.
*/
@@ -965,7 +963,7 @@
}
*bound_deps_list= bound_dep;
- //DBUG_EXECUTE("test", dbug_print_deps(te); );
+ DBUG_EXECUTE("test", dbug_print_deps(te); );
DBUG_RETURN(FALSE);
}
@@ -1089,7 +1087,6 @@
void signal_from_field_to_exprs(Table_elimination* te, Field_value *field_dep,
Module_dep **bound_modules)
{
- /* Now, expressions */
for (uint i=0; i < te->n_equality_deps; i++)
{
if (bitmap_is_set(&te->expr_deps, field_dep->bitmap_offset + i) &&
@@ -1213,7 +1210,6 @@
for (Outer_join_module *outer_join_dep= table_dep->outer_join_dep;
outer_join_dep; outer_join_dep= outer_join_dep->parent)
{
- //if (!(outer_join_dep->missing_tables &= ~table_dep->table->map))
if (outer_join_dep->unknown_args &&
!--outer_join_dep->unknown_args)
{
@@ -1268,7 +1264,6 @@
}
-#if 0
#ifndef DBUG_OFF
static
void dbug_print_deps(Table_elimination *te)
@@ -1324,7 +1319,6 @@
}
#endif
-#endif
/**
@} (end of group Table_Elimination)
*/
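
The unknown_args counter touched by this diff can be illustrated in isolation. The following is only a minimal sketch, assuming a stripped-down, hypothetical Module type and a hypothetical signal_input_bound() helper; it is not the code in sql/opt_table_elimination.cc, it just mirrors the "if (x->unknown_args && !--x->unknown_args)" check seen above.

```cpp
#include <cassert>

// Hypothetical, stripped-down analogue of Module_dep: a module becomes bound
// once all of the values it depends on are bound.
struct Module {
  unsigned unknown_args = 0;  // number of inputs not yet known to be bound
  Module *next = nullptr;     // link in the list of newly bound modules
};

// Called when one input of `mod` becomes bound. Returns the (possibly
// extended) head of the list of modules that have just become bound.
Module *signal_input_bound(Module *mod, Module *bound_list) {
  assert(mod != nullptr);
  // Same shape as the check in the diff: act only while inputs are still
  // outstanding, and react only when the last one gets resolved.
  if (mod->unknown_args && !--mod->unknown_args) {
    mod->next = bound_list;
    bound_list = mod;
  }
  return bound_list;
}
```

With unknown_args set to 2, the first call only decrements the counter, the second call puts the module on the bound list, and any further call is a no-op because the counter is already zero.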

[Maria-developers] Updated (by Psergey): Make --replicate-(do, ignore)-(db, table) behaviour for RBR identical to that of SBR (49)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Make --replicate-(do,ignore)-(db,table) behaviour for RBR identical to
that of SBR
CREATION DATE..: Sun, 16 Aug 2009, 12:01
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 49 (http://askmonty.org/worklog/?tid=49)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 16 Aug 2009, 12:06)=-=-
High-Level Specification modified.
--- /tmp/wklog.49.old.14928 2009-08-16 12:06:27.000000000 +0300
+++ /tmp/wklog.49.new.14928 2009-08-16 12:06:27.000000000 +0300
@@ -1 +1,12 @@
+Some notes:
+
+The only required changes are on the slave. The slave can see bounds between
+statements (they are delimited by Query_event and Table_map_event entries),
+Table_map_event lists all tables that are going to be updated => it is possible
+to make a decision whether we should skip the statement, and if yes, skip all
+RBR events that belong to the statement.
+
+Possible syntax for options
+--replicate-wild-ignore-stmt-with-table=%.tmptbl%
+--replicate-ignore-stmt-with-table=tmptbl
-=-=(Psergey - Sun, 16 Aug 2009, 12:02)=-=-
High Level Description modified.
--- /tmp/wklog.49.old.14739 2009-08-16 12:02:05.000000000 +0300
+++ /tmp/wklog.49.new.14739 2009-08-16 12:02:05.000000000 +0300
@@ -6,8 +6,8 @@
This can be inconvenient, and also the semantics gets really complicated when
--binlog_format=mixed is used.
-This WL entry is about making processing of RBR events to work the same as SBR
-events did.
+This WL entry is about adding an option to make processing of RBR events to work
+the same as SBR events did.
DESCRIPTION:
At the moment, the semantics of the --replicate-(do,ignore)-(db,table) rules
differ between RBR and SBR:
http://dev.mysql.com/doc/refman/5.1/en/replication-rules-table-options.html
This can be inconvenient, and the semantics get really complicated when
--binlog_format=mixed is used.
This WL entry is about adding an option that makes processing of RBR events work
the same way as it does for SBR events.
HIGH-LEVEL SPECIFICATION:
Some notes:
The only required changes are on the slave. The slave can see the boundaries
between statements (they are delimited by Query_event and Table_map_event
entries). Table_map_event lists all tables that are going to be updated, so it
is possible to decide whether the statement should be skipped and, if so, to
skip all RBR events that belong to the statement.
Possible syntax for options:
--replicate-wild-ignore-stmt-with-table=%.tmptbl%
--replicate-ignore-stmt-with-table=tmptbl
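
The per-statement decision described in these notes could look roughly like the sketch below. Everything in it is an assumption made for illustration: TableRef, wild_match() and should_skip_statement() are hypothetical names, the table list stands in for what the Table_map events of one statement would provide, and the pattern is assumed to be matched against "db.table" with LIKE-style wildcards.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the tables listed by the
// Table_map events of one row-based statement.
struct TableRef {
  std::string db;
  std::string table;
};

// LIKE-style wildcard match: '%' matches any run of characters,
// '_' matches exactly one character. Escape characters are not handled.
bool wild_match(const std::string &pattern, const std::string &text,
                std::size_t p = 0, std::size_t t = 0) {
  assert(p <= pattern.size() && t <= text.size());
  while (p < pattern.size()) {
    if (pattern[p] == '%') {
      // Try every possible length for the run matched by '%'.
      for (std::size_t skip = t; skip <= text.size(); ++skip)
        if (wild_match(pattern, text, p + 1, skip)) return true;
      return false;
    }
    if (t >= text.size()) return false;
    if (pattern[p] != '_' && pattern[p] != text[t]) return false;
    ++p;
    ++t;
  }
  return t == text.size();
}

// Decide once per statement: if any table touched by the statement matches
// an ignore pattern, all row events of that statement are to be skipped.
bool should_skip_statement(const std::vector<TableRef> &tables,
                           const std::vector<std::string> &ignore_patterns) {
  for (const TableRef &tbl : tables) {
    const std::string full_name = tbl.db + "." + tbl.table;
    for (const std::string &pattern : ignore_patterns)
      if (wild_match(pattern, full_name)) return true;
  }
  return false;
}
```

The point of deciding per statement rather than per table is that a statement touching both db1.orders and db1.tmptbl_2 is skipped as a whole, which is how an ignored SBR statement would behave.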
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Make --replicate-(do, ignore)-(db, table) behaviour for RBR identical to that of SBR (49)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Make --replicate-(do,ignore)-(db,table) behaviour for RBR identical to
that of SBR
CREATION DATE..: Sun, 16 Aug 2009, 12:01
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 49 (http://askmonty.org/worklog/?tid=49)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 16 Aug 2009, 12:02)=-=-
High Level Description modified.
--- /tmp/wklog.49.old.14739 2009-08-16 12:02:05.000000000 +0300
+++ /tmp/wklog.49.new.14739 2009-08-16 12:02:05.000000000 +0300
@@ -6,8 +6,8 @@
This can be inconvenient, and also the semantics gets really complicated when
--binlog_format=mixed is used.
-This WL entry is about making processing of RBR events to work the same as SBR
-events did.
+This WL entry is about adding an option to make processing of RBR events to work
+the same as SBR events did.
DESCRIPTION:
At the moment, the semantics of the --replicate-(do,ignore)-(db,table) rules
differ between RBR and SBR:
http://dev.mysql.com/doc/refman/5.1/en/replication-rules-table-options.html
This can be inconvenient, and the semantics get really complicated when
--binlog_format=mixed is used.
This WL entry is about adding an option that makes processing of RBR events work
the same way as it does for SBR events.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Psergey): Make --replicate-(do, ignore)-(db, table) behaviour for RBR identical to that of SBR (49)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Make --replicate-(do,ignore)-(db,table) behaviour for RBR identical to
that of SBR
CREATION DATE..: Sun, 16 Aug 2009, 12:01
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 49 (http://askmonty.org/worklog/?tid=49)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
At the moment, the semantics of the --replicate-(do,ignore)-(db,table) rules
differ between RBR and SBR:
http://dev.mysql.com/doc/refman/5.1/en/replication-rules-table-options.html
This can be inconvenient, and the semantics get really complicated when
--binlog_format=mixed is used.
This WL entry is about making processing of RBR events work the same way as it
does for SBR events.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Change BINLOG statement syntax to be human-readable (46)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Change BINLOG statement syntax to be human-readable
CREATION DATE..: Sat, 15 Aug 2009, 23:42
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 46 (http://askmonty.org/worklog/?tid=46)
VERSION........: WorkLog-3.4
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 16 Aug 2009, 11:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.46.old.13453 2009-08-16 11:30:06.000000000 +0300
+++ /tmp/wklog.46.new.13453 2009-08-16 11:30:06.000000000 +0300
@@ -26,3 +26,7 @@
(due to locking, table open/close, etc). (TODO: is it really slower? we
haven't checked).
+* When SBR replication is used and the statements refer to the current database
+ (a common scenario), one can use awk to filter out updates made in certain
+ databases. The proposed syntax doesn't allow to perform equivalent filtering?
+
-=-=(Psergey - Sun, 16 Aug 2009, 11:13)=-=-
High Level Description modified.
--- /tmp/wklog.46.old.12747 2009-08-16 11:13:54.000000000 +0300
+++ /tmp/wklog.46.new.12747 2009-08-16 11:13:54.000000000 +0300
@@ -6,4 +6,4 @@
This WL task is about making BINLOG statements to be human-readable (either as
an option or by default
-The approach of this WL is to some extent an alternative to WL#38, WL#40, WL41.
+The approach of this WL is to some extent an alternative to WL#38, WL#40, WL#41.
-=-=(Psergey - Sun, 16 Aug 2009, 11:13)=-=-
High Level Description modified.
--- /tmp/wklog.46.old.12717 2009-08-16 11:13:40.000000000 +0300
+++ /tmp/wklog.46.new.12717 2009-08-16 11:13:40.000000000 +0300
@@ -5,3 +5,5 @@
This WL task is about making BINLOG statements to be human-readable (either as
an option or by default
+
+The approach of this WL is to some extent an alternative to WL#38, WL#40, WL41.
-=-=(Psergey - Sun, 16 Aug 2009, 11:07)=-=-
Dependency created: 39 now depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 11:07)=-=-
Dependency deleted: 48 no longer depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 10:59)=-=-
Dependency created: 48 now depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 10:59)=-=-
Dependency deleted: 39 no longer depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=-
Dependency created: 39 now depends on 46
-=-=(Psergey - Sat, 15 Aug 2009, 23:43)=-=-
High-Level Specification modified.
--- /tmp/wklog.46.old.17742 2009-08-15 23:43:09.000000000 +0300
+++ /tmp/wklog.46.new.17742 2009-08-15 23:43:09.000000000 +0300
@@ -1 +1,28 @@
+Suggestion 1
+------------
+Original syntax suggestion by Kristian:
+
+ BINLOG
+ WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
+ TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
+ TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
+ WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
+ UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
+ FROM ('a') TO ('b') FLAGS 0x0
+ DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;
+
+ This is basically a dump of what is stored in the events, and would be an
+ alternative to BINLOG 'gwWEShMBAA...'.
+
+Feedback and other suggestions
+------------------------------
+* What is the need for WITH TIMESTAMP part? Can't one use a separate
+ SET TIMESTAMP statement?
+
+* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
+ that's close to readable SQL. Can we make it to be regular parseable SQL?
+ + This will be syntax that's familiar to our parser and to the users
+ - A stream of SQL statements will be slower to run than BINLOG statements
+ (due to locking, table open/close, etc). (TODO: is it really slower? we
+ haven't checked).
DESCRIPTION:
One of the great things about mysqlbinlog was that its output was human-readable
SQL, so it was possible to edit it manually or with the help of scripts. With RBR
events and BINLOG 'DpiGShMBAAAALQAAADcBAA...' statements this is no longer the
case.
This WL task is about making BINLOG statements human-readable (either as
an option or by default).
The approach of this WL is to some extent an alternative to WL#38, WL#40, WL#41.
HIGH-LEVEL SPECIFICATION:
Suggestion 1
------------
Original syntax suggestion by Kristian:

  BINLOG
    WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
    TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
    TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
    WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
    UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
      FROM ('a') TO ('b') FLAGS 0x0
    DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;

This is basically a dump of what is stored in the events, and would be an
alternative to BINLOG 'gwWEShMBAA...'.

Feedback and other suggestions
------------------------------
* What is the need for the WITH TIMESTAMP part? Can't one use a separate
  SET TIMESTAMP statement?

* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
  that's close to readable SQL. Can we make it regular, parseable SQL?
  + This would be syntax that's familiar to our parser and to the users.
  - A stream of SQL statements will be slower to run than BINLOG statements
    (due to locking, table open/close, etc). (TODO: is it really slower? We
    haven't checked.)

* When SBR replication is used and the statements refer to the current database
  (a common scenario), one can use awk to filter out updates made in certain
  databases. The proposed syntax doesn't seem to allow equivalent filtering?
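
For reference, the kind of current-database filtering mentioned in the last point can be sketched as follows. This is an illustration only, written in C++ rather than awk, and it assumes a simplified dump format (one statement per line, database switches as plain "use <db>;" lines); real mysqlbinlog output carries more decoration. The function name filter_by_current_db is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Assumption: each line of `dump` holds one statement, and database switches
// appear as "use <db>;" lines. Statements are attributed to the database
// selected by the most recent "use" line.
std::string filter_by_current_db(const std::string &dump,
                                 const std::set<std::string> &ignored_dbs) {
  std::istringstream in(dump);
  std::string out, line, current_db;
  while (std::getline(in, line)) {
    // "use <db>;" updates the current database before the line is judged,
    // so the switch into an ignored database is dropped as well.
    if (line.rfind("use ", 0) == 0 && line.back() == ';')
      current_db = line.substr(4, line.size() - 5);
    if (ignored_dbs.count(current_db) == 0) out += line + "\n";
  }
  return out;
}
```

This works for SBR only because every statement inherits its database from the preceding "use" line; a BINLOG block that names several tables from several databases offers no such single switch point, which is the concern raised above.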
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Change BINLOG statement syntax to be human-readable (46)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Change BINLOG statement syntax to be human-readable
CREATION DATE..: Sat, 15 Aug 2009, 23:42
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 46 (http://askmonty.org/worklog/?tid=46)
VERSION........: WorkLog-3.4
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 16 Aug 2009, 11:13)=-=-
High Level Description modified.
--- /tmp/wklog.46.old.12747 2009-08-16 11:13:54.000000000 +0300
+++ /tmp/wklog.46.new.12747 2009-08-16 11:13:54.000000000 +0300
@@ -6,4 +6,4 @@
This WL task is about making BINLOG statements to be human-readable (either as
an option or by default
-The approach of this WL is to some extent an alternative to WL#38, WL#40, WL41.
+The approach of this WL is to some extent an alternative to WL#38, WL#40, WL#41.
-=-=(Psergey - Sun, 16 Aug 2009, 11:13)=-=-
High Level Description modified.
--- /tmp/wklog.46.old.12717 2009-08-16 11:13:40.000000000 +0300
+++ /tmp/wklog.46.new.12717 2009-08-16 11:13:40.000000000 +0300
@@ -5,3 +5,5 @@
This WL task is about making BINLOG statements to be human-readable (either as
an option or by default
+
+The approach of this WL is to some extent an alternative to WL#38, WL#40, WL41.
-=-=(Psergey - Sun, 16 Aug 2009, 11:07)=-=-
Dependency created: 39 now depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 11:07)=-=-
Dependency deleted: 48 no longer depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 10:59)=-=-
Dependency created: 48 now depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 10:59)=-=-
Dependency deleted: 39 no longer depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=-
Dependency created: 39 now depends on 46
-=-=(Psergey - Sat, 15 Aug 2009, 23:43)=-=-
High-Level Specification modified.
--- /tmp/wklog.46.old.17742 2009-08-15 23:43:09.000000000 +0300
+++ /tmp/wklog.46.new.17742 2009-08-15 23:43:09.000000000 +0300
@@ -1 +1,28 @@
+Suggestion 1
+------------
+Original syntax suggestion by Kristian:
+
+ BINLOG
+ WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
+ TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
+ TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
+ WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
+ UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
+ FROM ('a') TO ('b') FLAGS 0x0
+ DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;
+
+ This is basically a dump of what is stored in the events, and would be an
+ alternative to BINLOG 'gwWEShMBAA...'.
+
+Feedback and other suggestions
+------------------------------
+* What is the need for WITH TIMESTAMP part? Can't one use a separate
+ SET TIMESTAMP statement?
+
+* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
+ that's close to readable SQL. Can we make it to be regular parseable SQL?
+ + This will be syntax that's familiar to our parser and to the users
+ - A stream of SQL statements will be slower to run than BINLOG statements
+ (due to locking, table open/close, etc). (TODO: is it really slower? we
+ haven't checked).
DESCRIPTION:
One of the great things about mysqlbinlog was that its output was human-readable
SQL, so it was possible to edit it manually or with the help of scripts. With RBR
events and BINLOG 'DpiGShMBAAAALQAAADcBAA...' statements this is no longer the
case.
This WL task is about making BINLOG statements human-readable (either as
an option or by default).
The approach of this WL is to some extent an alternative to WL#38, WL#40, WL#41.
HIGH-LEVEL SPECIFICATION:
Suggestion 1
------------
Original syntax suggestion by Kristian:
  BINLOG
    WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
    TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
    TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
    WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
    UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
      FROM ('a') TO ('b') FLAGS 0x0
    DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;

This is basically a dump of what is stored in the events, and would be an
alternative to BINLOG 'gwWEShMBAA...'.
Feedback and other suggestions
------------------------------
* What is the need for the WITH TIMESTAMP part? Can't one use a separate
  SET TIMESTAMP statement?
* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
  that's close to readable SQL. Can we make it regular, parseable SQL?
  + This would be syntax that's familiar to our parser and to the users.
  - A stream of SQL statements would be slower to run than BINLOG statements
    (due to locking, table open/close, etc.). (TODO: is it really slower? We
    haven't checked.)
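
To make Suggestion 1 a bit more concrete, here is a minimal sketch of how one
WRITE_ROW clause of the proposed syntax could be produced from already-decoded
row data. The function names and the in-memory representation (a table name,
a list of column numbers, a list of row tuples) are assumptions made for
illustration only; they are not an existing server or mysqlbinlog API.

```python
# Hypothetical sketch: format a decoded write-rows event in the syntax
# proposed in Suggestion 1. None of these names exist in the server.


def sql_literal(value):
    """Render a Python value as a SQL literal (NULL, number, or quoted string)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Double embedded single quotes; backslash escaping is ignored here.
    return "'" + str(value).replace("'", "''") + "'"


def format_write_row(table, column_numbers, rows, flags=0):
    """Build one WRITE_ROW clause of the proposed BINLOG statement."""
    cols = ",".join(str(n) for n in column_numbers)
    tuples = ", ".join(
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return f"WRITE_ROW INTO {table}({cols}) VALUES {tuples} FLAGS 0x{flags:x}"


clause = format_write_row(
    "db1.table1", [1, 3], [(42, "foobar"), (10, None)], flags=2
)
print(clause)
```

With the values taken from the example above, this reproduces the WRITE_ROW
line of the suggestion. A real implementation would also have to deal with
binary data and character sets, which the sketch leaves out.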
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Change BINLOG statement syntax to be human-readable (46)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Change BINLOG statement syntax to be human-readable
CREATION DATE..: Sat, 15 Aug 2009, 23:42
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 46 (http://askmonty.org/worklog/?tid=46)
VERSION........: WorkLog-3.4
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 16 Aug 2009, 11:13)=-=-
High Level Description modified.
--- /tmp/wklog.46.old.12717 2009-08-16 11:13:40.000000000 +0300
+++ /tmp/wklog.46.new.12717 2009-08-16 11:13:40.000000000 +0300
@@ -5,3 +5,5 @@
This WL task is about making BINLOG statements to be human-readable (either as
an option or by default
+
+The approach of this WL is to some extent an alternative to WL#38, WL#40, WL41.
-=-=(Psergey - Sun, 16 Aug 2009, 11:07)=-=-
Dependency created: 39 now depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 11:07)=-=-
Dependency deleted: 48 no longer depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 10:59)=-=-
Dependency created: 48 now depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 10:59)=-=-
Dependency deleted: 39 no longer depends on 46
-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=-
Dependency created: 39 now depends on 46
-=-=(Psergey - Sat, 15 Aug 2009, 23:43)=-=-
High-Level Specification modified.
--- /tmp/wklog.46.old.17742 2009-08-15 23:43:09.000000000 +0300
+++ /tmp/wklog.46.new.17742 2009-08-15 23:43:09.000000000 +0300
@@ -1 +1,28 @@
+Suggestion 1
+------------
+Original syntax suggestion by Kristian:
+
+ BINLOG
+ WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
+ TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
+ TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
+ WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
+ UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
+ FROM ('a') TO ('b') FLAGS 0x0
+ DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;
+
+ This is basically a dump of what is stored in the events, and would be an
+ alternative to BINLOG 'gwWEShMBAA...'.
+
+Feedback and other suggestions
+------------------------------
+* What is the need for WITH TIMESTAMP part? Can't one use a separate
+ SET TIMESTAMP statement?
+
+* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
+ that's close to readable SQL. Can we make it to be regular parseable SQL?
+ + This will be syntax that's familiar to our parser and to the users
+ - A stream of SQL statements will be slower to run than BINLOG statements
+ (due to locking, table open/close, etc). (TODO: is it really slower? we
+ haven't checked).
DESCRIPTION:
One of the great things about mysqlbinlog was that its output was human-readable
SQL, so it was possible to edit it manually or with the help of scripts. With RBR
events and BINLOG 'DpiGShMBAAAALQAAADcBAA...' statements this is no longer the
case.
This WL task is about making BINLOG statements human-readable (either as
an option or by default).
The approach of this WL is to some extent an alternative to WL#38, WL#40, WL#41.
HIGH-LEVEL SPECIFICATION:
Suggestion 1
------------
Original syntax suggestion by Kristian:
  BINLOG
    WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
    TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
    TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
    WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
    UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
      FROM ('a') TO ('b') FLAGS 0x0
    DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;

This is basically a dump of what is stored in the events, and would be an
alternative to BINLOG 'gwWEShMBAA...'.
Feedback and other suggestions
------------------------------
* What is the need for the WITH TIMESTAMP part? Can't one use a separate
  SET TIMESTAMP statement?
* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
  that's close to readable SQL. Can we make it regular, parseable SQL?
  + This would be syntax that's familiar to our parser and to the users.
  - A stream of SQL statements would be slower to run than BINLOG statements
    (due to locking, table open/close, etc.). (TODO: is it really slower? We
    haven't checked.)
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 16 Aug 2009, 11:08)=-=-
High-Level Specification modified.
--- /tmp/wklog.47.old.12485 2009-08-16 11:08:33.000000000 +0300
+++ /tmp/wklog.47.new.12485 2009-08-16 11:08:33.000000000 +0300
@@ -1 +1,6 @@
+First suggestion:
+
+> I think for this we would actually need a new binlog event type
+> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
+> containing only a comment (a bit of a hack).
-=-=(Psergey - Sun, 16 Aug 2009, 00:02)=-=-
Dependency created: 39 now depends on 47
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) the texts of statements that
caused RBR events.
This is needed for (list from Monty):
- Easier to understand why updates happened.
- Would make it easier to find out where in the application things went
  wrong (as you can search for exact strings).
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(especially inserts of big blobs would be a bit painful), so this should
be an optional feature.
HIGH-LEVEL SPECIFICATION:
First suggestion:
> I think for this we would actually need a new binlog event type
> (Comment_log_event?). Unless we want to log an empty statement Query_log_event
> containing only a comment (a bit of a hack).
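
As a rough illustration of the second option (a Query_log_event whose text is
only a comment), the sketch below shows how the original statement text could
be wrapped so that it stays a single comment. The function name is
hypothetical, and whether the server and the slave would accept such a
comment-only query is not settled by this worklog entry; the sketch only
covers building the text.

```python
# Hypothetical helper: wrap statement text in one /* ... */ comment.
# It does not touch any binlog API; it only builds the comment string.


def as_comment_only_query(statement_text):
    """Return the statement text enclosed in a single SQL block comment."""
    # A literal "*/" inside the text would end the comment early,
    # so break it up by inserting a space.
    safe = statement_text.replace("*/", "* /")
    return "/* " + safe + " */"


wrapped = as_comment_only_query("UPDATE t SET a=1 /* hint */ WHERE b=2")
print(wrapped)
```

The space inserted into an embedded "*/" changes the logged text slightly,
which matters if the text is later searched for exact strings; a dedicated
event type would not have this problem.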
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Psergey): Extra replication tasks (48)
by worklog-noreply@askmonty.org 16 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Extra replication tasks
CREATION DATE..: Sun, 16 Aug 2009, 10:58
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 48 (http://askmonty.org/worklog/?tid=48)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
An umbrella task for replication tasks that are nice to do but are not direct
responses to customer requests.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Rev 2725: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r10-vg/
by Sergey Petrunya 16 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10-vg/
------------------------------------------------------------
revno: 2725
revision-id: psergey(a)askmonty.org-20090816072524-w9fu2hy23pjwlr8z
parent: psergey(a)askmonty.org-20090815153912-q47vfp1j22ilmup2
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10-vg
timestamp: Sun 2009-08-16 10:25:24 +0300
message:
MWL#17: Table elimination
- Fix trivial valgrind failures that showed up after review
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-15 15:39:12 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-16 07:25:24 +0000
@@ -40,13 +40,16 @@
Table elimination is redone on every PS re-execution.
*/
-class Value_dep
+class Value_dep : public Sql_alloc
{
public:
enum {
VALUE_FIELD,
VALUE_TABLE,
} type; /* Type of the object */
+
+ Value_dep(): bound(FALSE), next(NULL)
+ {}
bool bound;
Value_dep *next;

[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sun, 16 Aug 2009, 02:13)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.23383 2009-08-16 02:13:54.000000000 +0300
+++ /tmp/wklog.24.new.23383 2009-08-16 02:13:54.000000000 +0300
@@ -125,7 +125,7 @@
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
-(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
+(Here no imerge for col2=c2 OR col3=c3 will be built since neither col2=c2 nor
col3=c3 represent index ranges.)
@@ -199,7 +199,7 @@
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
- non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ non-index merge access (see DISCARD-IMERGE-2 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create an index_merge from the index scans that have no
match in the other sel_tree.
-=-=(Guest - Sun, 16 Aug 2009, 01:03)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.20767 2009-08-16 01:03:11.000000000 +0300
+++ /tmp/wklog.24.new.20767 2009-08-16 01:03:11.000000000 +0300
@@ -18,6 +18,8 @@
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+ (here range(keyi) may represent ranges not for initial keyi prefixes,
+ but ranges for any infixes for keyi)
# merge tree represents several ways to index_merge
imerge_tree = imerge1 AND imerge2 AND ...
@@ -47,13 +49,13 @@
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
- return R;
+ return R; // DISCARD-IMERGE-2
else
{
/* could not build any range accesses. construct index_merge */
- remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from A;
remove non-ranges from B;
- return new index_merge(A, B);
+ return new index_merge(A, B); // DISCARD-IMERGE-3
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
@@ -65,12 +67,12 @@
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
- (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN) AND
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
- (range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN) AND
+ (range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)
Now each line represents an index_merge..
}
@@ -82,18 +84,18 @@
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
- -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-4
imergeA1
OR
- imergeB1 AND imergeB2 AND ... AND imergeBN =
+ imergeB1 =
- = (combine imergeA1 with each of the imergeB{i} ) =
+ = (combine imergeA1 with each of the range_treeB_1{i} ) =
- combine(imergeA1 OR imergeB1) AND
- combine(imergeA1 OR imergeB2) AND
+ combine(imergeA1 OR range_treeB_11) AND
+ combine(imergeA1 OR range_treeB_12) AND
... AND
- combine(imergeA1 OR imergeBN)
+ combine(imergeA1 OR range_treeB_1N)
}
}
@@ -109,7 +111,7 @@
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions on t.badkey may have arbitrary form):
- (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+ (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
@@ -123,6 +125,8 @@
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
+(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
+col3=c3 represent index ranges.)
2. New implementation
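
The DISCARD-IMERGE-2 effect described in this note can be reproduced with a
toy model of the current tree_or() for two range trees. This is only a
sketch: a range tree is modelled as a dict from index name to condition text,
every entry is assumed to be a usable range (so the "remove non-ranges" step
is omitted), and the function name tree_or_current is made up for the
example.

```python
# Toy model of the current tree_or() for two range trees (simplified).
# A range tree is a dict: index name -> condition text. All entries are
# assumed to be usable ranges, so "remove non-ranges" is not modelled.


def tree_or_current(a, b):
    """OR two range trees the way the pseudocode in this note does."""
    common = [key for key in a if key in b]
    if common:
        # At least one range access can be built: return it and silently
        # drop the index_merge option. This is the DISCARD-IMERGE-2 step.
        return ("range", {k: f"({a[k]}) OR ({b[k]})" for k in common})
    # No common index: the only option left is an index_merge of A and B.
    return ("index_merge", [a, b])


# (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey<c2)
tree_a = {"badkey": "badkey < c1", "key1": "key1 = c1"}
tree_b = {"key2": "key2 = c2", "badkey": "badkey < c2"}
print(tree_or_current(tree_a, tree_b))

# Without the shared index an index_merge over key1 and key2 is produced.
print(tree_or_current({"key1": "key1 = c1"}, {"key2": "key2 = c2"}))
```

In the first call only the range on badkey survives, although an index_merge
over key1 and key2 would also have been a valid plan for that condition.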
-=-=(Guest - Mon, 20 Jul 2009, 17:13)=-=-
Dependency deleted: 30 no longer depends on 24
-=-=(Guest - Sat, 20 Jun 2009, 09:34)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.21663 2009-06-20 09:34:48.000000000 +0300
+++ /tmp/wklog.24.new.21663 2009-06-20 09:34:48.000000000 +0300
@@ -4,6 +4,7 @@
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
+3. Testing and required coverage
</contents>
1. Current implementation overview
@@ -240,3 +241,14 @@
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
+
+3. Testing and required coverage
+================================
+So far we could find the following user cases:
+
+* BUG#17259: Query optimizer chooses wrong index
+* BUG#17673: Optimizer does not use Index Merge optimization in some cases
+* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
+* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
+
+
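
The MAX_IMERGE_OPTS rule mentioned in this note can be sketched as a cap on a
lazily generated cross product. The value 4 and the assumption that options
arise as a plain cross product of per-imerge alternatives are illustrative
only; this excerpt of the design does not state either.

```python
from itertools import islice, product

# Assumed value; the worklog only says the limit is #defined.
MAX_IMERGE_OPTS = 4


def capped_imerge_options(alternatives_per_imerge, limit=MAX_IMERGE_OPTS):
    """Enumerate combinations lazily and stop after `limit` of them."""
    return list(islice(product(*alternatives_per_imerge), limit))


# 3 * 2 * 2 = 12 combinations exist, but only the first 4 are generated.
opts = capped_imerge_options([["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2"]])
print(opts)
```

Because product() is lazy, the combinations past the limit are never built,
which is the point of such a cap.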
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305 2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305 2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point
+where it is really necessary.
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to
+take it into consideration together with range access plans.
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND
+ range_cond(key2_1) AND
+ other_cond AND
+ index_merge_OR_cond1(key3_1, key3_2, ...)
+ index_merge_OR_cond2(key4_1, key4_2, ...)
+
+where
+
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR
+ range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+ and doesn't allow to construct range/index_merge accesses
+ for any keys of the table in question.
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+ SEL_TREE(
+ range(key1_1),
+ ...
+ range(key2_1),
+ SEL_IMERGE( (1)
+ SEL_TREE(key3_1})
+ SEL_TREE(key3_2})
+ ...
+ )
+ ...
+ )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+ index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok
+ it did but that was done for accesses to different tables). Will there be any
+ possible gotchas here?
DESCRIPTION:
The current range optimizer discards possible index_merge/[sort]union
strategies when there is a possible range plan. This is one of the
measures we take to avoid a combinatorial explosion of possible
range/index_merge strategies.

A bad side effect is that for WHERE clauses of the form

  t.key1='very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')

the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider the cost of using range(key1) and discard that plan as well

The overall effect is that a potentially poor range access causes a
potentially good index_merge access not to be considered.

This WL is about lifting this limitation for at least some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.

This way, we won't have to make many changes in the range analyzer, but will
be able to keep a potential index_merge plan just long enough to consider it
together with the range access plans.

Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:

  WHERE := range_cond(key1_1) AND
           range_cond(key2_1) AND
           other_cond AND
           index_merge_OR_cond1(key3_1, key3_2, ...) AND
           index_merge_OR_cond2(key4_1, key4_2, ...)

where

  index_merge_OR_cond{N} := (range_cond(keyN_1) OR
                             range_cond(keyN_2) OR ...)

  range_cond(keyX) := condition that allows constructing range access on keyX
                      and doesn't allow constructing range/index_merge
                      accesses for any other keys of the table in question.

For such WHERE clauses, the range analyzer will produce a SEL_TREE of this form:

  SEL_TREE(
    range(key1_1),
    ...
    range(key2_1),
    SEL_IMERGE(           (1)
      SEL_TREE(key3_1)
      SEL_TREE(key3_2)
      ...
    )
    ...
  )

which can be used to make a cost-based choice between range and index_merge.
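
A minimal sketch of what a cost-based choice could look like once both kinds
of candidates survive in the same SEL_TREE. The Candidate type, the cheapest()
helper and the cost numbers are all hypothetical illustrations; none of them
come from the server code, and the real cost model is not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    kind: str          # "range" or "index_merge"
    keys: List[str]    # indexes the plan would read
    cost: float        # made-up cost estimate; lower is better


def cheapest(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the lowest-cost candidate, or None if there are none."""
    return min(candidates, key=lambda c: c.cost, default=None)


# t.key1='very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
candidates = [
    Candidate("range", ["key1"], cost=900.0),               # assumed expensive
    Candidate("index_merge", ["key2", "key3"], cost=12.0),  # assumed cheap
]
best = cheapest(candidates)
print(best.kind, best.keys)
```

The point of the sketch is only that the index_merge candidate has to still
be in the list when the comparison happens.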
Limitations
-----------
This will not be a full solution, in the sense that the range analyzer will
not be able to produce sel_tree (1) if the WHERE clause is specified in
another form (e.g. if the brackets were expanded).
TODO
----
* Is it a problem if there are keys that are referred to both from
  index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be in
  (google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
  it did but that was done for accesses to different tables). Will there be any
  possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
3. Testing and required coverage
</contents>
1. Current implementation overview
==================================
At the moment, the range analyzer works as follows.

The SEL_TREE structure represents:

  # There are sel_trees; a sel_tree is either a range tree or a merge tree
  sel_tree = range_tree | imerge_tree

  # a range tree has range access options, possibly for several keys
  range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
  (here range(keyi) need not be a range over an initial prefix of keyi;
  it may be a range over any infix of keyi)

  # a merge tree represents several ways to do index_merge
  imerge_tree = imerge1 AND imerge2 AND ...

  # a way to do index_merge == a set of different indexes to use
  imergeX = range_tree1 OR range_tree2 OR ..
    where no pair of range_treeX have ranges over the same index.
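
The grammar above can be sketched as plain data types. This is a toy model
under stated assumptions: the class names RangeTree, IMerge and IMergeTree
are made up for illustration, and a range is reduced to a text label per
index rather than a real range list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RangeTree:
    # index name -> range condition text (a stand-in for the real range lists)
    ranges: Dict[str, str] = field(default_factory=dict)


@dataclass
class IMerge:
    # one way to do index_merge: range_tree1 OR range_tree2 OR ...
    trees: List[RangeTree] = field(default_factory=list)

    def is_valid(self) -> bool:
        """True if no two member trees have ranges over the same index."""
        seen = set()
        for tree in self.trees:
            keys = set(tree.ranges)
            if seen & keys:
                return False
            seen |= keys
        return True


@dataclass
class IMergeTree:
    # imerge1 AND imerge2 AND ...
    imerges: List[IMerge] = field(default_factory=list)


ok = IMerge([RangeTree({"key1": "key1=c1"}), RangeTree({"key2": "key2=c2"})])
bad = IMerge([RangeTree({"key1": "key1=c1"}), RangeTree({"key1": "key1=c2"})])
print(ok.is_valid(), bad.is_valid())
```

is_valid() encodes the "no pair of range_treeX have ranges over the same
index" constraint from the last line of the grammar.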
tree_and(A, B)
{
  if (both A and B are range trees)
    return a range_tree with computed intersection for each range;
  if (only one of A and B is a range tree)
    return that tree;                       // DISCARD-IMERGE-1
  // at this point both trees are index_merge trees
  return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
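
A runnable toy version of the current tree_and() makes the DISCARD-IMERGE-1
step visible. The encoding is an assumption made for this sketch only: a
range tree is a dict mapping an index name to a set of matching values, and
an index_merge tree is a list of imerges, each a list of such dicts.

```python
def is_range_tree(tree):
    # Toy encoding (an assumption): dict = range tree, list = index_merge tree.
    return isinstance(tree, dict)


def tree_and(a, b):
    if is_range_tree(a) and is_range_tree(b):
        # intersect the ranges index by index
        out = dict(a)
        for index, values in b.items():
            out[index] = out[index] & values if index in out else values
        return out
    if is_range_tree(a) != is_range_tree(b):
        return a if is_range_tree(a) else b   # DISCARD-IMERGE-1
    return a + b                              # concatenate the imerge lists


# (t.key1=c1 OR t.key2=c2) AND t.badkey < c3, taking c1=1, c2=2, c3=3
imerge_tree = [[{"key1": {1}}, {"key2": {2}}]]
range_tree = {"badkey": {0, 1, 2}}
result = tree_and(imerge_tree, range_tree)
print(result)
```

In this model the result is just the range tree on badkey; the index_merge
over key1 and key2 is gone before any cost comparison can take place.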
tree_or(A, B)
{
  if (A and B are range trees)
  {
    R = new range_tree;
    for each index i
      R.add(range_union(A.range(i), B.range(i)));
    if (R has at least one range access)
      return R;                             // DISCARD-IMERGE-2
    else
    {
      /* could not build any range accesses. construct index_merge */
      remove non-ranges from A;
      remove non-ranges from B;
      return new index_merge(A, B);         // DISCARD-IMERGE-3
    }
  }
  else if (A is range tree and B is index_merge tree (or vice versa))
  {
    Perform this transformation:

      range_treeA                 // this is A
      OR
      (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
      (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
      ...
      (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
      =
      (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
      (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
      ...
      (range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)

    Now each line represents an index_merge.
  }
  else if (both A and B are index_merge trees)
  {
    Perform this transformation:

      imergeA1 AND imergeA2 AND ... AND imergeAN
      OR
      imergeB1 AND imergeB2 AND ... AND imergeBN

      -> (discard all imergeA{i=2,3,...}) ->   // DISCARD-IMERGE-4

      imergeA1
      OR
      imergeB1 =

      = (combine imergeA1 with each of the range_treeB_1{i}) =

      combine(imergeA1 OR range_treeB_11) AND
      combine(imergeA1 OR range_treeB_12) AND
      ... AND
      combine(imergeA1 OR range_treeB_1N)
  }
}
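
The range/range branch of the current tree_or() can be sketched the same
way. This is a toy model, not the server code: ranges are sets of values per
index, a union over an index is treated as usable only when both sides
constrain that index, and the "remove non-ranges" step is left out.

```python
def tree_or_ranges(a, b):
    """Range/range branch of the current tree_or(), on toy value-set ranges."""
    # A union over index i is only a usable range if both sides constrain i.
    r = {i: a[i] | b[i] for i in a if i in b}
    if r:
        return ("range", r)                 # DISCARD-IMERGE-2
    return ("index_merge", [a, b])


# (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey<c2), taking c1=3, c2=5
a = {"badkey": {0, 1, 2}, "key1": {3}}
b = {"key2": {5}, "badkey": {0, 1, 2, 3, 4}}
kind, plan = tree_or_ranges(a, b)
print(kind, plan)
```

Because badkey appears on both sides, a range over badkey is returned and
the possible index_merge over key1 and key2 is never built. Without the
badkey conditions the same function would return the index_merge.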
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:

The DISCARD-IMERGE-1 step will cause the index_merge option to be discarded
when the WHERE clause has this form:

  (t.key1=c1 OR t.key2=c2) AND t.badkey < c3

The DISCARD-IMERGE-2 step will cause the index_merge option to be discarded
when the WHERE clause has this form (the conditions on t.badkey may have
arbitrary form):

  (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)

DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:

  INDEX i1(col1, col2),
  INDEX i2(col1, col3)

and this WHERE clause:

  col1=c1 AND (col2=c2 OR col3=c3)

The optimizer will generate only plans that use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
(Here no imerge for col2=c2 OR col3=c3 will be built, since neither col2=c2
nor col3=c3 represents an index range.)
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
The SEL_TREE structure will now be able to hold both index_merge and range
scan candidates at the same time. That is,

  sel_tree2 = range_tree AND imerge_tree

where both parts are optional (i.e. can be empty).

Operations on SEL_ARG trees will be modified to produce/process trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove the index_merge part of the tree (this takes care of the
    DISCARD-IMERGE-1 problem).

A2. Push range conditions down into index_merge trees that may support them:
    if one tree has range(key1) and the other tree has imerge(key1 OR key2),
    then perform an equivalent of this operation:

      rangeA(key1) AND (rangeB(key1) OR rangeB(key2)) =

      (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))

A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
    concatenate them together.
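
Rules A1 to A3 can be sketched on a toy sel_tree2. Everything here is an
assumption made for illustration: the SelTree2 class, the helper names, and
the encoding of a range as a frozenset of ANDed condition strings per index.
In this model the range is pushed only into imerge members that already use
the same index.

```python
from dataclasses import dataclass, field


@dataclass
class SelTree2:
    # index -> frozenset of ANDed conditions on that index (toy encoding)
    ranges: dict = field(default_factory=dict)
    # AND-list of imerges; each imerge is an OR-list of range dicts
    imerges: list = field(default_factory=list)


def and_ranges(a, b):
    # AND of two range parts: collect the conditions per index
    out = dict(a)
    for index, conds in b.items():
        out[index] = out.get(index, frozenset()) | conds
    return out


def push_down(ranges, imerge):
    # A2: AND the outer range into every member that already uses that index.
    return [
        {i: (c | ranges[i] if i in ranges else c) for i, c in member.items()}
        for member in imerge
    ]


def tree_and(a, b):
    ranges = and_ranges(a.ranges, b.ranges)
    # A1: keep the index_merge part; A3: concatenate both imerge lists.
    imerges = [push_down(ranges, im) for im in a.imerges + b.imerges]
    return SelTree2(ranges, imerges)


# (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
left = SelTree2(imerges=[[{"key1": frozenset({"key1part2=const"})},
                          {"key2": frozenset({"key2part1=const"})}]])
right = SelTree2(ranges={"key1": frozenset({"key1part1=const"})})
result = tree_and(left, right)
print(result)
```

The result keeps range(key1part1=const) and an imerge whose key1 member now
carries both key1 conditions, so neither candidate is lost.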
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
    The current tree_or() code will refuse to produce index_merge plans for
    conditions like

      "t.key1part2=const OR t.key2part1=const"

    (this is marked as DISCARD-IMERGE-3). This was justified because the left
    part of the OR condition is not usable for range access, and the operation
    of tree_and() guaranteed that there was no way it could be changed to make
    a usable range plan. With the new tree_and() and rule A2, this is no longer
    the case. For example, for this query:

      (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const

    it will construct

      imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)

    then tree_and() will apply rule A2 to push the range down into the index
    merge, and after that we'll have:

      range(t.key1part1=const)
      imerge(
        t.key1part2=const AND t.key1part1=const,
        t.key2part1=const
      )

    Note that imerge(...) describes a usable index_merge plan, and it's
    possible that it will be the best access path.
O2. "Create index_merge accesses when possible"
    The current tree_or() will not create index_merge access when it could
    create non-index_merge access (see DISCARD-IMERGE-2 and its example in the
    "Problems in the current implementation" section). This will be changed to
    work as follows: we will create an index_merge from the index scans that
    didn't have a match in the other sel_tree.
    Illustrating it with an example:

           | sel_tree_A | sel_tree_B | A or B | include in index_merge?
     ------+------------+------------+--------+------------------------
      key1 | cond1      | cond2      | condM  | no
      key2 | cond3      | cond4      | NULL   | no
      key3 | cond5      |            |        | yes, A-side
      key4 | cond6      |            |        | yes, A-side
      key5 |            | cond7      |        | yes, B-side
      key6 |            | cond8      |        | yes, B-side

    Here we assume that
    - (cond1 OR cond2) did produce a combined range, so they are not included
      in the index_merge.
    - (cond3 OR cond4) didn't produce a usable range (e.g. they were
      t.key2part1=c1 and t.key2part2=c1, respectively, and combining them
      didn't yield any range list).
    - All other scans didn't have counterparts, so we'll end up with a
      SEL_TREE of:

      range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8)).
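
Rule O2 and the table above can be sketched as follows. The function name
tree_or_o2, the dict encoding (index name to condition label) and the
toy_union() stand-in for the range union are all hypothetical; they only
mirror the table, not the real data structures.

```python
def tree_or_o2(a, b, range_union):
    """Toy O2: keep merged ranges and build an index_merge from unmatched scans."""
    merged = {}
    for index in a.keys() & b.keys():
        union = range_union(a[index], b[index])
        if union is not None:          # None models "no usable range"
            merged[index] = union
    a_only = {i: c for i, c in a.items() if i not in b}
    b_only = {i: c for i, c in b.items() if i not in a}
    imerge = [a_only, b_only] if a_only and b_only else None
    return merged, imerge


def toy_union(x, y):
    # Only cond1 OR cond2 is assumed to combine into a usable range, condM.
    return "condM" if {x, y} == {"cond1", "cond2"} else None


sel_tree_a = {"key1": "cond1", "key2": "cond3", "key3": "cond5", "key4": "cond6"}
sel_tree_b = {"key1": "cond2", "key2": "cond4", "key5": "cond7", "key6": "cond8"}
ranges, imerge = tree_or_o2(sel_tree_a, sel_tree_b, toy_union)
print(ranges, imerge)
```

With the table's inputs this yields the range condM on key1 plus an
index_merge whose A-side holds cond5 and cond6 and whose B-side holds cond7
and cond8.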
O4. There is no O4. DISCARD-IMERGE-4 will remain there. The idea is
that although DISCARD-IMERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-IMERGE-4, our answer will be to
lift it and produce a cross-product:

  ((key1p OR key2p) AND (key3p OR key4p))
  OR
  ((key5p OR key6p) AND (key7p OR key8p))

  = (key1p OR key2p OR key5p OR key6p) AND  // this part is currently
    (key3p OR key4p OR key5p OR key6p) AND  // produced

    (key1p OR key2p OR key7p OR key8p) AND  // this part will be added
    (key3p OR key4p OR key7p OR key8p)      //.

In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
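
A small sketch of a capped cross-product. The value 3 for MAX_IMERGE_OPTS
and the function name or_imerge_trees are assumptions for illustration; the
text above only says the limit will be #defined, not what it is or how the
options to keep are chosen.

```python
from itertools import product

MAX_IMERGE_OPTS = 3  # hypothetical value; the actual limit is not given


def or_imerge_trees(a, b, limit=MAX_IMERGE_OPTS):
    """OR two AND-lists of imerges by cross-product, keeping at most `limit`."""
    out = []
    for imerge_a, imerge_b in product(a, b):
        if len(out) >= limit:
            break                      # stop before the explosion grows further
        out.append(imerge_a + imerge_b)
    return out


tree_a = [["key1p", "key2p"], ["key3p", "key4p"]]
tree_b = [["key5p", "key6p"], ["key7p", "key8p"]]
full = or_imerge_trees(tree_a, tree_b, limit=4)
capped = or_imerge_trees(tree_a, tree_b)
print(len(full), len(capped))
```

Two imerges on each side already give four combined options, so the number
of options is the product of the list lengths; the cap simply stops
generating once the limit is reached.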
3. Testing and required coverage
================================
So far we could find the following user cases:
* BUG#17259: Query optimizer chooses wrong index
* BUG#17673: Optimizer does not use Index Merge optimization in some cases
* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sun, 16 Aug 2009, 02:13)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.23383 2009-08-16 02:13:54.000000000 +0300
+++ /tmp/wklog.24.new.23383 2009-08-16 02:13:54.000000000 +0300
@@ -125,7 +125,7 @@
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
-(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
+(Here no imerge for col2=c2 OR col3=c3 will be built since neither col2=c2 nor
col3=c3 represent index ranges.)
@@ -199,7 +199,7 @@
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
- non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ non-index merge access (see DISCARD-IMERGE-2 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create index_merge made for index scans that didn't have
their match in the other sel_tree.
-=-=(Guest - Sun, 16 Aug 2009, 01:03)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.20767 2009-08-16 01:03:11.000000000 +0300
+++ /tmp/wklog.24.new.20767 2009-08-16 01:03:11.000000000 +0300
@@ -18,6 +18,8 @@
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+ (here range(keyi) may represent ranges not for initial keyi prefixes,
+ but ranges for any infixes for keyi)
# merge tree represents several way to index_merge
imerge_tree = imerge1 AND imerge2 AND ...
@@ -47,13 +49,13 @@
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
- return R;
+ return R; // DISCARD-IMERGE-2
else
{
/* could not build any range accesses. construct index_merge */
- remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from A;
remove non-ranges from B;
- return new index_merge(A, B);
+ return new index_merge(A, B); // DISCARD-IMERGE-3
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
@@ -65,12 +67,12 @@
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
- (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN)
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
- (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N)
Now each line represents an index_merge..
}
@@ -82,18 +84,18 @@
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
- -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-4
imergeA1
OR
- imergeB1 AND imergeB2 AND ... AND imergeBN =
+ imergeB1 =
- = (combine imergeA1 with each of the imergeB{i} ) =
+ = (combine imergeA1 with each of the range_treeB_1{i} ) =
- combine(imergeA1 OR imergeB1) AND
- combine(imergeA1 OR imergeB2) AND
+ combine(imergeA1 OR range_treeB_11) AND
+ combine(imergeA1 OR range_treeB_12) AND
... AND
- combine(imergeA1 OR imergeBN)
+ combine(imergeA1 OR range_treeB_1N)
}
}
@@ -109,7 +111,7 @@
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions t.badkey may have abritrary form):
- (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+ (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
@@ -123,6 +125,8 @@
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
+(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
+col3=c3 represent index ranges.)
2. New implementation
-=-=(Guest - Mon, 20 Jul 2009, 17:13)=-=-
Dependency deleted: 30 no longer depends on 24
-=-=(Guest - Sat, 20 Jun 2009, 09:34)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.21663 2009-06-20 09:34:48.000000000 +0300
+++ /tmp/wklog.24.new.21663 2009-06-20 09:34:48.000000000 +0300
@@ -4,6 +4,7 @@
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
+3. Testing and required coverage
</contents>
1. Current implementation overview
@@ -240,3 +241,14 @@
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
+
+3. Testing and required coverage
+================================
+So far could find the following user cases:
+
+* BUG#17259: Query optimizer chooses wrong index
+* BUG#17673: Optimizer does not use Index Merge optimization in some cases
+* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
+* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
+
+
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305 2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305 2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point
+where it is really necessary.
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to
+take it into consideration together with range access plans.
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND
+ range_cond(key2_1) AND
+ other_cond AND
+ index_merge_OR_cond1(key3_1, key3_2, ...)
+ index_merge_OR_cond2(key4_1, key4_2, ...)
+
+where
+
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR
+ range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+ and doesn't allow to construct range/index_merge accesses
+ for any keys of the table in question.
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+ SEL_TREE(
+ range(key1_1),
+ ...
+ range(key2_1),
+ SEL_IMERGE( (1)
+ SEL_TREE(key3_1})
+ SEL_TREE(key3_2})
+ ...
+ )
+ ...
+ )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+ index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok
+ it did but that was done for accesses to different tables). Will there be any
+ possible gotchas here?
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses of the form
  t.key1='very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider the cost of using range(key1) and discard that plan as well
so the overall effect is that a possibly poor range access prevents a possibly
good index_merge access from being considered at all.
This WL is about lifting this limitation for at least some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to make many changes in the range analyzer, but will be
able to keep a potential index_merge plan just long enough that it can be
considered together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
  WHERE := range_cond(key1_1) AND
           range_cond(key2_1) AND
           other_cond AND
           index_merge_OR_cond1(key3_1, key3_2, ...) AND
           index_merge_OR_cond2(key4_1, key4_2, ...)
where
  index_merge_OR_cond{N} := (range_cond(keyN_1) OR
                             range_cond(keyN_2) OR ...)
  range_cond(keyX) := a condition that allows constructing range access on keyX
                      and doesn't allow constructing range/index_merge accesses
                      for any other keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
  SEL_TREE(
    range(key1_1),
    ...
    range(key2_1),
    SEL_IMERGE(              (1)
      SEL_TREE(key3_1)
      SEL_TREE(key3_2)
      ...
    )
    ...
  )
which can be used to make a cost-based choice between range and index_merge.
Limitations
-----------
This will not be a full solution in the sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is written in another form
(e.g. with the parentheses expanded).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be in
  (google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
3. Testing and required coverage
</contents>
1. Current implementation overview
==================================
At the moment, range analyzer works as follows:
The SEL_TREE structure represents the following:
  # There are sel_trees; a sel_tree is either a range tree or a merge tree
  sel_tree = range_tree | imerge_tree
  # a range tree has range access options, possibly for several keys
  range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
  (here range(keyi) need not be a range over an initial prefix of keyi;
   it may be a range over any infix of keyi)
  # a merge tree represents several ways to do index_merge
  imerge_tree = imerge1 AND imerge2 AND ...
  # a way to do index merge == a set of different indexes to use.
  imergeX = range_tree1 OR range_tree2 OR ..
    where no pair of range_treeX have ranges over the same index.
  tree_and(A, B)
  {
    if (both A and B are range trees)
      return a range_tree with computed intersection for each range;
    if (only one of A and B is a range tree)
      return that tree; // DISCARD-IMERGE-1
    // at this point both trees are index_merge trees
    return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
  }

  tree_or(A, B)
  {
    if (A and B are range trees)
    {
      R = new range_tree;
      for each index i
        R.add(range_union(A.range(i), B.range(i)));
      if (R has at least one range access)
        return R; // DISCARD-IMERGE-2
      else
      {
        /* could not build any range accesses. construct index_merge */
        remove non-ranges from A;
        remove non-ranges from B;
        return new index_merge(A, B); // DISCARD-IMERGE-3
      }
    }
    else if (A is range tree and B is index_merge tree (or vice versa))
    {
      Perform this transformation:
        range_treeA // this is A
        OR
        (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
        (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
        ...
        (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
        =
        (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
        (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
        ...
        (range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)
      Now each line represents an index_merge.
    }
    else if (both A and B are index_merge trees)
    {
      Perform this transformation:
        imergeA1 AND imergeA2 AND ... AND imergeAN
        OR
        imergeB1 AND imergeB2 AND ... AND imergeBN
        -> (discard all imergeA{i=2,3,...}) -> // DISCARD-IMERGE-4
        imergeA1
        OR
        imergeB1 =
        = (combine imergeA1 with each of the range_treeB_1{i} ) =
        combine(imergeA1 OR range_treeB_11) AND
        combine(imergeA1 OR range_treeB_12) AND
        ... AND
        combine(imergeA1 OR range_treeB_1N)
    }
  }
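As a rough illustration of the current tree_and() and its DISCARD-IMERGE-1 step, here is a minimal Python sketch. The RangeTree and ImergeTree classes and the (low, high) interval encoding are assumptions made for this example only; they are not the server's SEL_TREE/SEL_ARG structures, and empty intersections are not handled.

```python
from dataclasses import dataclass


@dataclass
class RangeTree:
    # Hypothetical model: index name -> inclusive (low, high) interval.
    ranges: dict


@dataclass
class ImergeTree:
    # Hypothetical model: list of imerges, AND-ed together.
    # Each imerge is a list of RangeTree alternatives, OR-ed together.
    imerges: list


def tree_and(a, b):
    """Toy version of the current tree_and() described in the pseudocode."""
    a_is_range = isinstance(a, RangeTree)
    b_is_range = isinstance(b, RangeTree)
    if a_is_range and b_is_range:
        # Intersect the intervals index by index.
        merged = dict(a.ranges)
        for key, (low, high) in b.ranges.items():
            if key in merged:
                old_low, old_high = merged[key]
                merged[key] = (max(old_low, low), min(old_high, high))
            else:
                merged[key] = (low, high)
        return RangeTree(merged)
    if a_is_range != b_is_range:
        # DISCARD-IMERGE-1: only the range tree survives.
        return a if a_is_range else b
    # Both are index_merge trees: concatenate the imerge lists.
    return ImergeTree(a.imerges + b.imerges)


# (t.key1=1 OR t.key2=2) AND t.badkey < 3, with badkey modelled as [0, 2]
imerge = ImergeTree([[RangeTree({"key1": (1, 1)}), RangeTree({"key2": (2, 2)})]])
bad = RangeTree({"badkey": (0, 2)})
result = tree_and(imerge, bad)
```

In this toy run, result is the range tree over badkey alone: the index_merge over key1 and key2 is gone, which is the loss described as DISCARD-IMERGE-1.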
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (the conditions on t.badkey may have arbitrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
(Here no imerge for col2=c2 OR col3=c3 will be built, since neither col2=c2 nor
col3=c3 represents an index range.)
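The DISCARD-IMERGE-2 case can be reproduced with a small Python sketch of the range/range branch of the current tree_or(). This is only a model under stated assumptions: a range tree is a plain dict from index name to a condition string, range_union is a caller-supplied hypothetical helper that returns None when no combined range exists, and the "remove non-ranges" step is omitted.

```python
def tree_or_ranges(a, b, range_union):
    """Toy model of the current tree_or() when both inputs are range trees.

    a, b: dict mapping index name -> condition (hypothetical encoding).
    range_union: function(cond_a, cond_b) -> combined condition or None.
    """
    combined = {}
    for key in a.keys() & b.keys():
        merged = range_union(a[key], b[key])
        if merged is not None:
            combined[key] = merged
    if combined:
        # DISCARD-IMERGE-2: a range tree exists, so no index_merge is built.
        return ("range", combined)
    # No range could be built: fall back to index_merge(A, B).
    return ("index_merge", [a, b])


def or_strings(x, y):
    # Hypothetical union that always succeeds, for illustration only.
    return f"({x}) OR ({y})"


# (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey<c2)
tree_a = {"badkey": "badkey<c1", "key1": "key1=c1"}
tree_b = {"key2": "key2=c2", "badkey": "badkey<c2"}
kind, plan = tree_or_ranges(tree_a, tree_b, or_strings)
```

Here only the union over badkey is returned, and the candidate index_merge over key1 and key2 is never constructed.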
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
The SEL_TREE structure will now be able to hold both index_merge and range scan
candidates at the same time. That is,
  sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty).
Operations on SEL_ARG trees will be modified to produce/process trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove the index_merge part of the tree (this takes care of the
    DISCARD-IMERGE-1 problem).
A2. Push range conditions down into index_merge trees that may support them:
    if one tree has range(key1) and the other tree has imerge(key1 OR key2),
    then perform an equivalent of this operation:
      rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
      (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
    concatenate them together.
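Rules A1 to A3 could be sketched in Python roughly as follows. The SelTree2 class, the use of predicate sets per index, and the decision to push a range only into imerge alternatives that already use the same index are assumptions of this sketch, not the actual server code.

```python
from dataclasses import dataclass, field


@dataclass
class SelTree2:
    # Hypothetical model of "sel_tree2 = range_tree AND imerge_tree".
    # ranges: index name -> frozenset of predicates AND-ed on that index.
    ranges: dict = field(default_factory=dict)
    # imerges: list of imerges (AND-ed); each imerge is a list of
    # alternatives (OR-ed); each alternative maps index -> predicates.
    imerges: list = field(default_factory=list)


def and_ranges(x, y):
    """AND two range parts by uniting the predicate sets per index."""
    out = dict(x)
    for key, preds in y.items():
        out[key] = out.get(key, frozenset()) | preds
    return out


def tree_and2(a, b):
    ranges = and_ranges(a.ranges, b.ranges)
    imerges = []
    # A1 + A3: keep every imerge from both sides.
    for imerge in a.imerges + b.imerges:
        pushed = []
        for alt in imerge:
            # A2: push range predicates into alternatives on the same index.
            pushed.append(
                {key: preds | ranges.get(key, frozenset())
                 for key, preds in alt.items()}
            )
        imerges.append(pushed)
    return SelTree2(ranges, imerges)


# (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
left = SelTree2(imerges=[[
    {"key1": frozenset({"key1part2=const"})},
    {"key2": frozenset({"key2part1=const"})},
]])
right = SelTree2(ranges={"key1": frozenset({"key1part1=const"})})
result = tree_and2(left, right)
```

In this model the result keeps range(key1part1=const) and also an imerge whose key1 alternative now carries both key1part1=const and key1part2=const, so neither candidate is thrown away.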
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
    Current tree_or() code will refuse to produce index_merge plans for
    conditions like
      "t.key1part2=const OR t.key2part1=const"
    (this is marked as DISCARD-IMERGE-3). This was justified because the left part
    of the OR condition is not usable for range access, and the operation of
    tree_and() guaranteed that there was no way it could be changed to make a
    usable range plan. With the new tree_and() and rule A2, this is no longer the
    case. For example, for this query:
      (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
    it will construct
      imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
    then tree_and() will apply rule A2 to push the range down into the index merge,
    and after that we'll have:
      range(t.key1part1=const)
      imerge(
        t.key1part2=const AND t.key1part1=const,
        t.key2part1=const
      )
    Note that imerge(...) describes a usable index_merge plan, and it's possible
    that it will be the best access path.
O2. "Create index_merge accesses when possible"
    Current tree_or() will not create index_merge access when it could create
    non-index_merge access (see DISCARD-IMERGE-2 and its example in the "Problems
    in the current implementation" section). This will be changed to work as
    follows: we will create an index_merge from the index scans that didn't have
    a match in the other sel_tree.
    Illustrating it with an example:
           | sel_tree_A | sel_tree_B | A or B | include in index_merge?
      -----+------------+------------+--------+------------------------
      key1 | cond1      | cond2      | condM  | no
      key2 | cond3      | cond4      | NULL   | no
      key3 | cond5      |            |        | yes, A-side
      key4 | cond6      |            |        | yes, A-side
      key5 |            | cond7      |        | yes, B-side
      key6 |            | cond8      |        | yes, B-side
    Here we assume that
    - (cond1 OR cond2) did produce a combined range, so they are not included in
      index_merge.
    - (cond3 OR cond4) didn't produce a usable range (e.g. they were
      t.key1part1=c1 and t.key1part2=c1, respectively, and combining them
      didn't yield any range list)
    - All other scans didn't have counterparts, so we'll end up with a
      SEL_TREE of:
        range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
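The table above can be turned into a small Python sketch of rule O2. It is a model under assumptions: range trees are plain dicts from index name to a condition label, range_union is a hypothetical lookup that returns None when no combined range exists, and the sketch builds an index_merge only when both sides have unmatched scans (that last restriction is a choice of this example, not something the text states).

```python
def tree_or_o2(a, b, range_union):
    """Toy model of rule O2 for two range trees.

    a, b: dict mapping index name -> condition label (hypothetical encoding).
    Returns (ranges, imerge): the combined range part and either None or a
    two-element list [A-side scans, B-side scans].
    """
    ranges = {}
    for key in a.keys() & b.keys():
        merged = range_union(a[key], b[key])
        if merged is not None:
            ranges[key] = merged      # e.g. key1 -> condM
        # else: no usable range (e.g. key2), the pair is dropped
    a_only = {k: v for k, v in a.items() if k not in b}
    b_only = {k: v for k, v in b.items() if k not in a}
    imerge = [a_only, b_only] if a_only and b_only else None
    return ranges, imerge


# Hypothetical union table: only cond1 OR cond2 yields a range (condM).
UNIONS = {("cond1", "cond2"): "condM"}


def lookup_union(x, y):
    return UNIONS.get((x, y))


sel_tree_a = {"key1": "cond1", "key2": "cond3", "key3": "cond5", "key4": "cond6"}
sel_tree_b = {"key1": "cond2", "key2": "cond4", "key5": "cond7", "key6": "cond8"}
ranges, imerge = tree_or_o2(sel_tree_a, sel_tree_b, lookup_union)
```

With these inputs the range part holds condM on key1, and the index_merge pairs the A-side scans (key3, key4) with the B-side scans (key5, key6), matching the SEL_TREE shown in the example.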
O4. There is no O4. DISCARD-IMERGE-4 will remain there. The idea is
that although DISCARD-IMERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-IMERGE-4, our answer will be to
lift it and produce a cross-product:
  ((key1p OR key2p) AND (key3p OR key4p))
  OR
  ((key5p OR key6p) AND (key7p OR key8p))
  = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
    (key3p OR key4p OR key5p OR key6p) AND // produced
    (key1p OR key2p OR key7p OR key8p) AND // this part will be added
    (key3p OR key4p OR key7p OR key8p) //.
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
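A capped cross-product of this kind might look like the following Python sketch. The value of MAX_IMERGE_OPTS, the representation of an imerge as a frozenset of key names, and the choice to simply stop generating once the cap is reached are all assumptions of this example; the text does not specify them.

```python
from itertools import islice, product

MAX_IMERGE_OPTS = 3  # hypothetical value; the text does not give one


def imerge_tree_or(a, b, limit=MAX_IMERGE_OPTS):
    """OR two imerge trees by distributing OR over AND, capped at `limit`.

    a, b: lists of imerges (AND-ed); each imerge is a frozenset of key names
    (OR-ed). Every generated imerge is implied by (a OR b), so stopping early
    only loses options and does not make the result incorrect.
    """
    # b is the outer loop so the order matches the formula in the text:
    # A1|B1, A2|B1, A1|B2, A2|B2.
    combos = (ai | bj for bj, ai in product(b, a))
    return list(islice(combos, limit))


tree_a = [frozenset({"key1p", "key2p"}), frozenset({"key3p", "key4p"})]
tree_b = [frozenset({"key5p", "key6p"}), frozenset({"key7p", "key8p"})]

full = imerge_tree_or(tree_a, tree_b, limit=4)   # all four conjuncts
capped = imerge_tree_or(tree_a, tree_b)          # stops at MAX_IMERGE_OPTS
```

With the cap at 4 the sketch yields the four conjuncts of the formula above; with the default cap it stops after the first three.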
3. Testing and required coverage
================================
So far we could find the following user cases:
* BUG#17259: Query optimizer chooses wrong index
* BUG#17673: Optimizer does not use Index Merge optimization in some cases
* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sun, 16 Aug 2009, 01:03)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.20767 2009-08-16 01:03:11.000000000 +0300
+++ /tmp/wklog.24.new.20767 2009-08-16 01:03:11.000000000 +0300
@@ -18,6 +18,8 @@
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+ (here range(keyi) may represent ranges not for initial keyi prefixes,
+ but ranges for any infixes for keyi)
# merge tree represents several way to index_merge
imerge_tree = imerge1 AND imerge2 AND ...
@@ -47,13 +49,13 @@
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
- return R;
+ return R; // DISCARD-IMERGE-2
else
{
/* could not build any range accesses. construct index_merge */
- remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from A;
remove non-ranges from B;
- return new index_merge(A, B);
+ return new index_merge(A, B); // DISCARD-IMERGE-3
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
@@ -65,12 +67,12 @@
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
- (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN)
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
- (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N)
Now each line represents an index_merge..
}
@@ -82,18 +84,18 @@
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
- -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-4
imergeA1
OR
- imergeB1 AND imergeB2 AND ... AND imergeBN =
+ imergeB1 =
- = (combine imergeA1 with each of the imergeB{i} ) =
+ = (combine imergeA1 with each of the range_treeB_1{i} ) =
- combine(imergeA1 OR imergeB1) AND
- combine(imergeA1 OR imergeB2) AND
+ combine(imergeA1 OR range_treeB_11) AND
+ combine(imergeA1 OR range_treeB_12) AND
... AND
- combine(imergeA1 OR imergeBN)
+ combine(imergeA1 OR range_treeB_1N)
}
}
@@ -109,7 +111,7 @@
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions t.badkey may have abritrary form):
- (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+ (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
@@ -123,6 +125,8 @@
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
+(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
+col3=c3 represent index ranges.)
2. New implementation
-=-=(Guest - Mon, 20 Jul 2009, 17:13)=-=-
Dependency deleted: 30 no longer depends on 24
-=-=(Guest - Sat, 20 Jun 2009, 09:34)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.21663 2009-06-20 09:34:48.000000000 +0300
+++ /tmp/wklog.24.new.21663 2009-06-20 09:34:48.000000000 +0300
@@ -4,6 +4,7 @@
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
+3. Testing and required coverage
</contents>
1. Current implementation overview
@@ -240,3 +241,14 @@
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
+
+3. Testing and required coverage
+================================
+So far could find the following user cases:
+
+* BUG#17259: Query optimizer chooses wrong index
+* BUG#17673: Optimizer does not use Index Merge optimization in some cases
+* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
+* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
+
+
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305 2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305 2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point
+where it is really necessary.
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to
+take it into consideration together with range access plans.
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND
+ range_cond(key2_1) AND
+ other_cond AND
+ index_merge_OR_cond1(key3_1, key3_2, ...)
+ index_merge_OR_cond2(key4_1, key4_2, ...)
+
+where
+
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR
+ range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+ and doesn't allow to construct range/index_merge accesses
+ for any keys of the table in question.
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+ SEL_TREE(
+ range(key1_1),
+ ...
+ range(key2_1),
+ SEL_IMERGE( (1)
+ SEL_TREE(key3_1})
+ SEL_TREE(key3_2})
+ ...
+ )
+ ...
+ )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+ index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok
+ it did but that was done for accesses to different tables). Will there be any
+ possible gotchas here?
DESCRIPTION:
The current range optimizer discards possible index_merge/[sort]union
strategies whenever a range plan is possible. This is one of the
measures we take to avoid a combinatorial explosion of possible
range/index_merge strategies.
A bad side effect of this is that for WHERE clauses of the form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that a possibly poor range access causes a possibly
good index_merge access not to be considered.
This WL is about lifting this limitation at least for some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to make many changes in the range analyzer, but will be
able to keep a potential index_merge plan around just long enough to take it
into consideration together with the range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...) AND
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := a condition that allows constructing range access on keyX
and doesn't allow constructing range/index_merge access
on any other key of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1)
SEL_TREE(key3_2)
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
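A minimal Python sketch of how a SEL_TREE of this shape could feed a cost-based choice. The dictionary layout, the name pick_access and the cost numbers are assumptions made up for illustration; they are not the server's data structures or its cost model.

```python
def pick_access(sel_tree, cost):
    """Return the cheapest candidate among range scans and index_merge unions."""
    # One candidate per range(keyX) entry in the tree.
    candidates = [("range", key) for key in sel_tree["ranges"]]
    # One candidate per SEL_IMERGE; each is a union over several keys.
    candidates += [("index_merge", tuple(keys)) for keys in sel_tree["imerges"]]
    return min(candidates, key=cost, default=None)


# Hypothetical tree: two range options plus one index_merge option.
sel_tree = {"ranges": ["key1_1", "key2_1"], "imerges": [["key3_1", "key3_2"]]}

# Invented cost figures, only to exercise the comparison.
made_up_costs = {
    ("range", "key1_1"): 900.0,
    ("range", "key2_1"): 400.0,
    ("index_merge", ("key3_1", "key3_2")): 35.0,
}

best = pick_access(sel_tree, made_up_costs.get)
```

The point of the sketch is only that both kinds of candidates sit in one list, so neither is dropped before costs are compared.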
Limitations
-----------
This will not be a full solution in the sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in another form
(e.g. with the brackets expanded).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be in
(google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
3. Testing and required coverage
</contents>
1. Current implementation overview
==================================
At the moment, the range analyzer works as follows:
The SEL_TREE structure represents the following:
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
(here range(keyi) may represent ranges not only for initial prefixes of keyi,
but ranges for any infixes of keyi)
# merge tree represents several ways to do index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set of different indexes to use.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
if (both A and B are range trees)
return a range_tree with computed intersection for each range;
if (only one of A and B is a range tree)
return that tree; // DISCARD-IMERGE-1
// at this point both trees are index_merge trees
return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
tree_or(A, B)
{
if (A and B are range trees)
{
R = new range_tree;
for each index i
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
return R; // DISCARD-IMERGE-2
else
{
/* could not build any range accesses. construct index_merge */
remove non-ranges from A;
remove non-ranges from B;
return new index_merge(A, B); // DISCARD-IMERGE-3
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
{
Perform this transformation:
range_treeA // this is A
OR
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
(range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
(range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)
Now each line represents an index_merge.
}
else if (both A and B are index_merge trees)
{
Perform this transformation:
imergeA1 AND imergeA2 AND ... AND imergeAN
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
-> (discard all imergeA{i=2,3,...} and imergeB{i=2,3,...}) -> // DISCARD-IMERGE-4
imergeA1
OR
imergeB1 =
= (combine imergeA1 with each of the range_treeB_1{i} ) =
combine(imergeA1 OR range_treeB_11) AND
combine(imergeA1 OR range_treeB_12) AND
... AND
combine(imergeA1 OR range_treeB_1N)
}
}
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
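The effect can be reproduced with a small Python model of the current tree_and(). The representation is an assumption made for illustration: a range tree is a dict mapping an index name to a condition string, and an index_merge tree is a list of imerges, each imerge being a list of range trees.

```python
def is_range_tree(tree):
    # In this toy model, dicts are range trees and lists are index_merge trees.
    return isinstance(tree, dict)


def tree_and_current(a, b):
    """Toy model of the current tree_and(), following the pseudocode above."""
    if is_range_tree(a) and is_range_tree(b):
        merged = dict(a)
        for key, cond in b.items():
            # Intersect ranges over the same index, keep the rest as they are.
            merged[key] = f"({merged[key]}) AND ({cond})" if key in merged else cond
        return merged
    if is_range_tree(a):
        return a  # DISCARD-IMERGE-1: the index_merge tree b is dropped
    if is_range_tree(b):
        return b  # DISCARD-IMERGE-1: the index_merge tree a is dropped
    return a + b  # both are index_merge trees: concatenate the lists


# (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
imerge_tree = [[{"key1": "key1=c1"}, {"key2": "key2=c2"}]]
range_tree = {"badkey": "badkey<c3"}
result = tree_and_current(imerge_tree, range_tree)
```

Here result holds only the range over badkey; the union over key1 and key2 is gone before any cost has been looked at.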
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions on t.badkey may have arbitrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
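A toy Python model of the range-tree branch of the current tree_or() shows where the plan is lost. The dict representation and the range_union callback are assumptions for illustration; the callback returns None when two ranges cannot be combined into a usable range.

```python
def tree_or_current(a, b, range_union):
    """Toy model of current tree_or() for two range trees (dict: index -> cond)."""
    result = {}
    for key in a.keys() & b.keys():
        merged = range_union(a[key], b[key])
        if merged is not None:
            result[key] = merged
    if result:
        # DISCARD-IMERGE-2: a range exists, so no index_merge is built.
        return ("range", result)
    # No common range could be built: fall back to an index_merge of A and B.
    return ("index_merge", [a, b])


def union_as_text(x, y):
    # Hypothetical union that always succeeds and just records both sides.
    return f"({x}) OR ({y})"


# (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey<c2)
left = {"badkey": "badkey<c1", "key1": "key1=c1"}
right = {"key2": "key2=c2", "badkey": "badkey<c2"}
kind, tree = tree_or_current(left, right, union_as_text)
```

In this model only the range over badkey survives, although a union of key1=c1 and key2=c2 would also have been a valid access method.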
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
col3=c3 represents an index range.)
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
The SEL_TREE structure will now be able to hold both index_merge and range scan
candidates at the same time. That is,
sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty).
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove index_merge part of the tree (this will take care of
DISCARD-IMERGE-1 problem)
A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform the equivalent of this operation:
rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
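Rules A1-A3 can be sketched in Python as follows. This is a sketch under stated assumptions, not the planned implementation: a sel_tree2 is modelled as a dict with a "ranges" part (index name -> condition string) and an "imerges" part (list of imerges, each a list of such range dicts), and a range is pushed only into imerge branches that already scan the same index, which is one possible reading of "may support them".

```python
def and_ranges(r1, r2):
    """Intersect two range parts key by key."""
    out = dict(r1)
    for key, cond in r2.items():
        out[key] = f"({out[key]}) AND ({cond})" if key in out else cond
    return out


def push_down(imerges, ranges):
    """Rule A2: AND each range into imerge branches over the same index."""
    return [
        [
            {
                key: f"({cond}) AND ({ranges[key]})" if key in ranges else cond
                for key, cond in branch.items()
            }
            for branch in imerge
        ]
        for imerge in imerges
    ]


def tree_and_new(a, b):
    return {
        # The range parts are intersected as before.
        "ranges": and_ranges(a["ranges"], b["ranges"]),
        # A1: imerges are kept; A3: the two lists are concatenated;
        # A2: each side's ranges are pushed into the other side's imerges.
        "imerges": push_down(a["imerges"], b["ranges"])
        + push_down(b["imerges"], a["ranges"]),
    }


# rangeA(key1) AND (rangeB(key1) OR rangeB(key2))
tree_a = {"ranges": {"key1": "rangeA"}, "imerges": []}
tree_b = {"ranges": {}, "imerges": [[{"key1": "rangeB1"}, {"key2": "rangeB2"}]]}
combined = tree_and_new(tree_a, tree_b)
```

The result keeps range(key1) as a plain range option and also carries an imerge whose key1 branch has been narrowed by rangeA.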
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
Current tree_or() code will refuse to produce index_merge plans for
conditions like
"t.key1part2=const OR t.key2part1=const"
(this is marked as DISCARD-IMERGE-3). This was justified because the left part
of the OR condition is not usable for range access, and the operation of
tree_and() guaranteed that there was no way it could be changed to make a
usable range plan. With the new tree_and() and rule A2, this is no longer the
case. For example, for this query:
(t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
it will construct a
imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
then tree_and() will apply rule A2 to push the range down into index merge
and after that we'll have:
range(t.key1part1=const)
imerge(
t.key1part2=const AND t.key1part1=const,
t.key2part1=const
)
Note that imerge(...) describes a usable index_merge plan and it's possible
that it will be the best access path.
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create an index_merge from the index scans that have no
match in the other sel_tree.
Illustrating it with an example:
| sel_tree_A | sel_tree_B | A or B | include in index_merge?
------+------------+------------+--------+------------------------
key1 | cond1 | cond2 | condM | no
key2 | cond3 | cond4 | NULL | no
key3 | cond5 | | | yes, A-side
key4 | cond6 | | | yes, A-side
key5 | | cond7 | | yes, B-side
key6 | | cond8 | | yes, B-side
here we assume that
- (cond1 OR cond2) did produce a combined range. Not including them in
index_merge.
- (cond3 OR cond4) didn't produce a usable range (e.g. they were
t.key1part1=c1 and t.key1part2=c1, respectively, and combining them
didn't yield any range list)
- All other scans didn't have counterparts, so we'll end up with a
SEL_TREE of:
range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
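The table above can be replayed with a short Python sketch of rule O2. The dict-based trees, the name tree_or_new and the range_union callback (returning None when no usable range results) are assumptions for illustration, not the planned server code.

```python
def tree_or_new(a, b, range_union):
    """Sketch of rule O2 for two range trees (dict: index name -> condition)."""
    ranges, a_side, b_side = {}, {}, {}
    for key in sorted(set(a) | set(b)):
        if key in a and key in b:
            merged = range_union(a[key], b[key])
            if merged is not None:
                ranges[key] = merged  # combined range, kept out of index_merge
            # otherwise the pair yields nothing usable and is dropped
        elif key in a:
            a_side[key] = a[key]  # no counterpart in B: goes to the A-side
        else:
            b_side[key] = b[key]  # no counterpart in A: goes to the B-side
    # Assumption: a union needs a usable scan on both sides of the OR.
    imerges = [[a_side, b_side]] if a_side and b_side else []
    return {"ranges": ranges, "imerges": imerges}


def table_union(x, y):
    # Mirrors the table: only cond1 OR cond2 yields a combined range.
    return "condM" if (x, y) == ("cond1", "cond2") else None


sel_tree_a = {"key1": "cond1", "key2": "cond3", "key3": "cond5", "key4": "cond6"}
sel_tree_b = {"key1": "cond2", "key2": "cond4", "key5": "cond7", "key6": "cond8"}
result = tree_or_new(sel_tree_a, sel_tree_b, table_union)
```

With the table's inputs, the range part holds condM for key1 and the single imerge pairs the A-side scans (cond5, cond6) with the B-side scans (cond7, cond8).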
O4. There is no O4. DISCARD-IMERGE-4 will remain there. The idea is
that although DISCARD-IMERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-IMERGE-4, our answer will be to
lift it and produce a cross-product:
((key1p OR key2p) AND (key3p OR key4p))
OR
((key5p OR key6p) AND (key7p OR key8p))
= (key1p OR key2p OR key5p OR key6p) AND // this part is currently
(key3p OR key4p OR key5p OR key6p) AND // produced
(key1p OR key2p OR key7p OR key8p) AND // this part will be added
(key3p OR key4p OR key7p OR key8p) //.
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
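A possible shape for the capped cross-product, sketched in Python. The list-of-labels representation, the function name and the value 3 for MAX_IMERGE_OPTS are all hypothetical; the text above only says that such a #define would exist.

```python
from itertools import product

MAX_IMERGE_OPTS = 3  # hypothetical value, the real limit is not specified


def imerge_lists_or(a, b, limit=MAX_IMERGE_OPTS):
    """OR two imerge lists (each imerge is a list of condition labels).

    (A1 AND A2) OR (B1 AND B2) distributes into one imerge per (Ai, Bj)
    pair; generation stops once `limit` imerges have been produced.
    """
    out = []
    for imerge_a, imerge_b in product(a, b):
        if len(out) >= limit:
            break  # cap the combinatorial growth
        out.append(imerge_a + [cond for cond in imerge_b if cond not in imerge_a])
    return out


list_a = [["key1p", "key2p"], ["key3p", "key4p"]]
list_b = [["key5p", "key6p"], ["key7p", "key8p"]]
capped = imerge_lists_or(list_a, list_b)
full = imerge_lists_or(list_a, list_b, limit=4)
```

Without the cap the two-by-two input yields four imerges; with the assumed limit of three, the last pair is simply not generated.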
3. Testing and required coverage
================================
So far we could find the following user cases:
* BUG#17259: Query optimizer chooses wrong index
* BUG#17673: Optimizer does not use Index Merge optimization in some cases
* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sun, 16 Aug 2009, 01:03)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.20767 2009-08-16 01:03:11.000000000 +0300
+++ /tmp/wklog.24.new.20767 2009-08-16 01:03:11.000000000 +0300
@@ -18,6 +18,8 @@
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+ (here range(keyi) may represent ranges not for initial keyi prefixes,
+ but ranges for any infixes for keyi)
# merge tree represents several way to index_merge
imerge_tree = imerge1 AND imerge2 AND ...
@@ -47,13 +49,13 @@
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
- return R;
+ return R; // DISCARD-IMERGE-2
else
{
/* could not build any range accesses. construct index_merge */
- remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from A;
remove non-ranges from B;
- return new index_merge(A, B);
+ return new index_merge(A, B); // DISCARD-IMERGE-3
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
@@ -65,12 +67,12 @@
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
- (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN)
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
- (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N)
Now each line represents an index_merge..
}
@@ -82,18 +84,18 @@
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
- -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-4
imergeA1
OR
- imergeB1 AND imergeB2 AND ... AND imergeBN =
+ imergeB1 =
- = (combine imergeA1 with each of the imergeB{i} ) =
+ = (combine imergeA1 with each of the range_treeB_1{i} ) =
- combine(imergeA1 OR imergeB1) AND
- combine(imergeA1 OR imergeB2) AND
+ combine(imergeA1 OR range_treeB_11) AND
+ combine(imergeA1 OR range_treeB_12) AND
... AND
- combine(imergeA1 OR imergeBN)
+ combine(imergeA1 OR range_treeB_1N)
}
}
@@ -109,7 +111,7 @@
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions t.badkey may have abritrary form):
- (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+ (t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
@@ -123,6 +125,8 @@
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
+(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
+col3=c3 represent index ranges.)
2. New implementation
-=-=(Guest - Mon, 20 Jul 2009, 17:13)=-=-
Dependency deleted: 30 no longer depends on 24
-=-=(Guest - Sat, 20 Jun 2009, 09:34)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.21663 2009-06-20 09:34:48.000000000 +0300
+++ /tmp/wklog.24.new.21663 2009-06-20 09:34:48.000000000 +0300
@@ -4,6 +4,7 @@
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
+3. Testing and required coverage
</contents>
1. Current implementation overview
@@ -240,3 +241,14 @@
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
+
+3. Testing and required coverage
+================================
+So far could find the following user cases:
+
+* BUG#17259: Query optimizer chooses wrong index
+* BUG#17673: Optimizer does not use Index Merge optimization in some cases
+* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
+* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
+
+
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305 2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305 2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point
+where it is really necessary.
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to
+take it into consideration together with range access plans.
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND
+ range_cond(key2_1) AND
+ other_cond AND
+ index_merge_OR_cond1(key3_1, key3_2, ...)
+ index_merge_OR_cond2(key4_1, key4_2, ...)
+
+where
+
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR
+ range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+ and doesn't allow to construct range/index_merge accesses
+ for any keys of the table in question.
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+ SEL_TREE(
+ range(key1_1),
+ ...
+ range(key2_1),
+ SEL_IMERGE( (1)
+ SEL_TREE(key3_1})
+ SEL_TREE(key3_2})
+ ...
+ )
+ ...
+ )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+ index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok
+ it did but that was done for accesses to different tables). Will there be any
+ possible gotchas here?
DESCRIPTION:
The current range optimizer discards possible index_merge/[sort]union
strategies whenever a range plan is possible. This is one of the
measures we take to avoid a combinatorial explosion of possible
range/index_merge strategies.
A bad side effect of this is that for WHERE clauses of the form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that a possibly poor range access causes a possibly
good index_merge access not to be considered.
This WL is about lifting this limitation at least for some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to make many changes in the range analyzer, but will be
able to keep potential index_merge plans just long enough that they can be
considered together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
  WHERE := range_cond(key1_1) AND
           range_cond(key2_1) AND
           other_cond AND
           index_merge_OR_cond1(key3_1, key3_2, ...) AND
           index_merge_OR_cond2(key4_1, key4_2, ...)
where
  index_merge_OR_cond{N} := (range_cond(keyN_1) OR
                             range_cond(keyN_2) OR ...)
  range_cond(keyX) := condition that allows constructing range access on keyX
                      and doesn't allow constructing range/index_merge accesses
                      for any other keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
  SEL_TREE(
    range(key1_1),
    ...
    range(key2_1),
    SEL_IMERGE(              (1)
      SEL_TREE(key3_1)
      SEL_TREE(key3_2)
      ...
    )
    ...
  )
which can be used to make a cost-based choice between range and index_merge.
Limitations
-----------
This will not be a full solution, in the sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in another form
(e.g. if the brackets were expanded).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be in
(google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
3. Testing and required coverage
</contents>
1. Current implementation overview
==================================
At the moment, the range analyzer works as follows.
The SEL_TREE structure represents one of the following:
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
(here range(keyi) may represent ranges not only for initial keyi prefixes,
but for any infixes of keyi)
# a merge tree represents several ways to do index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set of different indexes to use.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
  if (both A and B are range trees)
    return a range_tree with computed intersection for each range;
  if (only one of A and B is a range tree)
    return that tree; // DISCARD-IMERGE-1
  // at this point both trees are index_merge trees
  return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
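
The three branches of the current tree_and() can be mimicked with a small toy
model. This is only a sketch under stated assumptions: SelTree, RangeTree and
Imerge below are illustrative types, not the server's classes, and a range tree
is reduced to the list of keys that have a usable range (the real code
intersects intervals per key).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy model; every name below is illustrative, not a server type.
using RangeTree = std::vector<std::string>;  // keys that have a usable range
using Imerge = std::vector<RangeTree>;       // range_tree1 OR range_tree2 OR ...

struct SelTree {
  bool is_range;                // current design: range tree or imerge tree
  RangeTree ranges;             // used when is_range is true
  std::vector<Imerge> imerges;  // used when is_range is false (AND-ed list)
};

// Follows the three branches of the current tree_and() pseudocode.
SelTree tree_and(const SelTree &a, const SelTree &b) {
  if (a.is_range && b.is_range) {
    // The server intersects the intervals key by key; this sketch only
    // records which keys end up with a range.
    SelTree r{true, a.ranges, {}};
    for (const std::string &key : b.ranges)
      if (std::find(r.ranges.begin(), r.ranges.end(), key) == r.ranges.end())
        r.ranges.push_back(key);
    return r;
  }
  if (a.is_range != b.is_range)
    return a.is_range ? a : b;  // DISCARD-IMERGE-1: the imerge side is dropped
  assert(!a.is_range && !b.is_range);  // both are index_merge trees here
  SelTree r{false, {}, a.imerges};
  r.imerges.insert(r.imerges.end(), b.imerges.begin(), b.imerges.end());
  return r;
}
```

Passing an imerge tree for (key1 OR key2) together with a range tree on badkey
returns only the range tree, which is the plan loss marked DISCARD-IMERGE-1.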
tree_or(A, B)
{
  if (A and B are range trees)
  {
    R = new range_tree;
    for each index i
      R.add(range_union(A.range(i), B.range(i)));
    if (R has at least one range access)
      return R; // DISCARD-IMERGE-2
    else
    {
      /* could not build any range accesses. construct index_merge */
      remove non-ranges from A;
      remove non-ranges from B;
      return new index_merge(A, B); // DISCARD-IMERGE-3
    }
  }
  else if (A is range tree and B is index_merge tree (or vice versa))
  {
    Perform this transformation:
      range_treeA // this is A
      OR
      (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
      (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
      ...
      (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
    =
      (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
      (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
      ...
      (range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)
    Now each line represents an index_merge.
  }
  else if (both A and B are index_merge trees)
  {
    Perform this transformation:
      imergeA1 AND imergeA2 AND ... AND imergeAN
      OR
      imergeB1 AND imergeB2 AND ... AND imergeBN
    -> (discard all imergeA{i=2,3,...}) -> // DISCARD-IMERGE-4
      imergeA1
      OR
      imergeB1
    = (combine imergeA1 with each of the range_treeB_1{i}) =
      combine(imergeA1 OR range_treeB_11) AND
      combine(imergeA1 OR range_treeB_12) AND
      ... AND
      combine(imergeA1 OR range_treeB_1N)
  }
}
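
The "range tree OR index_merge tree" branch is plain distribution of OR over
AND, which the following sketch spells out. It assumes a simplified model in
which each range tree is just a label; Imerge, ImergeTree and
or_range_with_imerges are hypothetical names. It also ignores the merging of
range_treeA with an alternative that uses the same index, which the real code
would have to handle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative names only. An Imerge is an OR-list of range-tree labels;
// an ImergeTree is an AND-list of Imerges.
using Imerge = std::vector<std::string>;
using ImergeTree = std::vector<Imerge>;

// A OR (I1 AND I2 AND ... AND IK)  ==  (A OR I1) AND ... AND (A OR IK)
ImergeTree or_range_with_imerges(const std::string &range_a,
                                 const ImergeTree &b) {
  assert(!b.empty());  // B must really be an index_merge tree
  ImergeTree result;
  for (const Imerge &row : b) {
    Imerge widened;
    widened.push_back(range_a);  // range_treeA joins every OR-list
    widened.insert(widened.end(), row.begin(), row.end());
    result.push_back(widened);
  }
  return result;
}
```

Each element of the returned list corresponds to one line of the transformed
expression, i.e. one candidate index_merge.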
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions on t.badkey may have arbitrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key2=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will generate plans that use only the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
(Here an imerge for col2=c2 OR col3=c3 won't be built since neither col2=c2 nor
col3=c3 represents an index range.)
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
The SEL_TREE structure will now be able to hold both index_merge and range scan
candidates at the same time. That is,
sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty)
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove index_merge part of the tree (this will take care of
DISCARD-IMERGE-1 problem)
A2. Push range conditions down into index_merge trees that may support them.
If one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equivalent of this operation:
rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
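
Rules A1-A3 can be sketched as follows. This is a minimal illustration under
stated assumptions, not the planned implementation: KeyCond, SelTree2,
and_into and tree_and_new are hypothetical names, predicates are plain
strings, and every index_merge alternative is simplified to a scan over a
single index.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative types. One KeyCond is the AND of predicates usable on one index.
struct KeyCond {
  std::string index;
  std::vector<std::string> preds;
};
using Imerge = std::vector<KeyCond>;  // OR of scans, one index each (simplified)

struct SelTree2 {
  std::vector<KeyCond> ranges;  // range part
  std::vector<Imerge> imerges;  // imerge part, AND-ed list
};

// AND `extra` into `conds` where the index matches; optionally add new indexes.
static void and_into(std::vector<KeyCond> &conds,
                     const std::vector<KeyCond> &extra, bool add_missing) {
  for (const KeyCond &e : extra) {
    assert(!e.preds.empty());  // a KeyCond carries at least one predicate
    bool found = false;
    for (KeyCond &c : conds) {
      if (c.index != e.index) continue;
      c.preds.insert(c.preds.end(), e.preds.begin(), e.preds.end());
      found = true;
    }
    if (!found && add_missing) conds.push_back(e);
  }
}

SelTree2 tree_and_new(const SelTree2 &a, const SelTree2 &b) {
  SelTree2 r;
  r.ranges = a.ranges;
  and_into(r.ranges, b.ranges, true);  // range AND range, per index
  // A2: push each side's ranges into the other side's imerges.
  std::vector<Imerge> from_a = a.imerges, from_b = b.imerges;
  for (Imerge &im : from_a) and_into(im, b.ranges, false);
  for (Imerge &im : from_b) and_into(im, a.ranges, false);
  // A1 + A3: keep every imerge and concatenate the two lists.
  r.imerges = from_a;
  r.imerges.insert(r.imerges.end(), from_b.begin(), from_b.end());
  return r;
}
```

With an imerge over key1 and key2 on one side and a range on key1 on the other,
the result keeps the range and extends only the key1 alternative of the imerge;
the key2 alternative is left unchanged because the key1 range cannot be used by
a key2 scan.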
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
Current tree_or() code will refuse to produce index_merge plans for
conditions like
"t.key1part2=const OR t.key2part1=const"
(this is marked as DISCARD-IMERGE-3). This was justified as the left part of
the OR condition is not usable for range access, and the operation of
tree_and() guaranteed that there was no way it could be changed to make a
usable range plan. With the new tree_and() and rule A2, this is no longer the
case. For example, for this query:
(t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
it will construct a
imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
then tree_and() will apply rule A2 to push the range down into index merge
and after that we'll have:
range(t.key1part1=const)
imerge(
t.key1part2=const AND t.key1part1=const,
t.key2part1=const
)
Note that imerge(...) describes a usable index_merge plan and it's possible
that it will be the best access path.
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
non-index merge access (see DISCARD-IMERGE-2 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create an index_merge from the index scans that have no
match in the other sel_tree.
Illustrating it with an example:
| sel_tree_A | sel_tree_B | A or B | include in index_merge?
------+------------+------------+--------+------------------------
key1 | cond1 | cond2 | condM | no
key2 | cond3 | cond4 | NULL | no
key3 | cond5 | | | yes, A-side
key4 | cond6 | | | yes, A-side
key5 | | cond7 | | yes, B-side
key6 | | cond8 | | yes, B-side
Here we assume that
- (cond1 OR cond2) did produce a combined range, so they are not included in
index_merge.
- (cond3 OR cond4) didn't produce a usable range (e.g. they were
t.key1part1=c1 and t.key1part2=c1, respectively, and combining them
didn't yield any range list)
- All other scans didn't have counterparts, so we'll end up with a
SEL_TREE of:
range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
.
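
The per-key decision in the table above can be sketched as below. All names
are assumptions made for illustration (KeyConds, OrResult, tree_or_new), and
`range_union` is a stand-in callback that returns an empty string when the two
conditions do not combine into a usable range.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Illustrative names. Maps go from key name to a condition label.
using KeyConds = std::map<std::string, std::string>;

struct OrResult {
  KeyConds ranges;                    // keys whose two sides combined into a range
  std::vector<std::string> imerge_a;  // scans taken from the A side
  std::vector<std::string> imerge_b;  // scans taken from the B side
};

// `range_union` is a stand-in: it returns "" when no usable range results.
OrResult tree_or_new(
    const KeyConds &a, const KeyConds &b,
    const std::function<std::string(const std::string &, const std::string &)>
        &range_union) {
  assert(range_union);  // a callable must be supplied
  OrResult r;
  for (const auto &entry : a) {
    auto other = b.find(entry.first);
    if (other == b.end()) {
      r.imerge_a.push_back(entry.second);  // no counterpart: A-side scan
      continue;
    }
    std::string combined = range_union(entry.second, other->second);
    if (!combined.empty()) r.ranges[entry.first] = combined;
    // else: both sides had the key but no range came out; the key is dropped
  }
  for (const auto &entry : b)
    if (a.find(entry.first) == a.end())
      r.imerge_b.push_back(entry.second);  // no counterpart: B-side scan
  return r;
}
```

Fed with the six keys of the table, the sketch yields a range on key1 plus an
index_merge whose A side holds cond5 and cond6 and whose B side holds cond7 and
cond8.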
O4. There is no O4. DISCARD-IMERGE-4 will remain there. The idea is
that although DISCARD-IMERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-IMERGE-4, our answer will be to
lift it and produce a cross-product:
((key1p OR key2p) AND (key3p OR key4p))
OR
((key5p OR key6p) AND (key7p OR key8p))
= (key1p OR key2p OR key5p OR key6p) AND // this part is currently
(key3p OR key4p OR key5p OR key6p) AND // produced
(key1p OR key2p OR key7p OR key8p) AND // this part will be added
(key3p OR key4p OR key7p OR key8p) //.
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
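
A capped cross-product could look roughly like this. It is a sketch only: the
text does not give a value for MAX_IMERGE_OPTS or say which options to drop
once the cap is hit, so `max_opts` is a parameter here and the sketch simply
keeps the first options generated (both choices are assumptions). Dropping
AND-ed options only makes the resulting condition weaker, so the remaining
options still cover every matching row.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Imerge = std::vector<std::string>;  // OR-list of key predicates
using ImergeList = std::vector<Imerge>;   // AND-list of imerges

// (A1 AND A2 ...) OR (B1 AND B2 ...) == AND over all pairs of (Ai OR Bj).
// `max_opts` stands in for MAX_IMERGE_OPTS, whose value the text leaves open.
ImergeList or_imerge_lists(const ImergeList &a, const ImergeList &b,
                           std::size_t max_opts) {
  assert(max_opts > 0);
  ImergeList result;
  for (const Imerge &bj : b) {
    for (const Imerge &ai : a) {
      if (result.size() == max_opts) return result;  // cap reached: stop here
      Imerge merged = ai;
      merged.insert(merged.end(), bj.begin(), bj.end());
      result.push_back(merged);
    }
  }
  return result;
}
```

For two lists of two imerges each, an uncapped call produces four options, and
a cap of 2 stops after the two options that pair every A imerge with the first
B imerge.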
3. Testing and required coverage
================================
So far we could find the following user cases:
* BUG#17259: Query optimizer chooses wrong index
* BUG#17673: Optimizer does not use Index Merge optimization in some cases
* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Psergey): Store in binlog text of statements that caused RBR events (47)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Store in binlog text of statements that caused RBR events
CREATION DATE..: Sat, 15 Aug 2009, 23:48
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 47 (http://askmonty.org/worklog/?tid=47)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
Store in binlog (and show in mysqlbinlog output) texts of statements that
caused RBR events
This is needed for (list from Monty):
- Easier to understand why updates happened
- Would make it easier to find out where in application things went
wrong (as you can search for exact strings)
- Allow one to filter things based on comments in the statement.
The cost of this can be that the binlog will be approximately 2x in size
(inserts of big blobs, especially, would be a bit painful), so this should
be an optional feature.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Change BINLOG statement syntax to be human-readable (46)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Change BINLOG statement syntax to be human-readable
CREATION DATE..: Sat, 15 Aug 2009, 23:42
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 46 (http://askmonty.org/worklog/?tid=46)
VERSION........: WorkLog-3.4
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sat, 15 Aug 2009, 23:43)=-=-
High-Level Specification modified.
--- /tmp/wklog.46.old.17742 2009-08-15 23:43:09.000000000 +0300
+++ /tmp/wklog.46.new.17742 2009-08-15 23:43:09.000000000 +0300
@@ -1 +1,28 @@
+Suggestion 1
+------------
+Original syntax suggestion by Kristian:
+
+ BINLOG
+ WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
+ TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
+ TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
+ WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
+ UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
+ FROM ('a') TO ('b') FLAGS 0x0
+ DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;
+
+ This is basically a dump of what is stored in the events, and would be an
+ alternative to BINLOG 'gwWEShMBAA...'.
+
+Feedback and other suggestions
+------------------------------
+* What is the need for WITH TIMESTAMP part? Can't one use a separate
+ SET TIMESTAMP statement?
+
+* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
+ that's close to readable SQL. Can we make it to be regular parseable SQL?
+ + This will be syntax that's familiar to our parser and to the users
+ - A stream of SQL statements will be slower to run than BINLOG statements
+ (due to locking, table open/close, etc). (TODO: is it really slower? we
+ haven't checked).
DESCRIPTION:
One of the great things about mysqlbinlog was that its output was human-readable
SQL, so it was possible to edit it manually or with the help of scripts. With RBR
events and BINLOG 'DpiGShMBAAAALQAAADcBAA...' statements this is no longer the
case.
This WL task is about making BINLOG statements human-readable (either as
an option or by default).
HIGH-LEVEL SPECIFICATION:
Suggestion 1
------------
Original syntax suggestion by Kristian:
BINLOG
WITH TIMESTAMP xxx SERVER_ID 1 MASTER_POS 415 FLAGS 0x0
TABLE db1.table1 AS 1 COLUMNS (INT NOT NULL, BLOB, VARCHAR(100)) FLAGS 0x0
TABLE db2.table2 AS 2 COLUMNS (CHAR(10)) FLAGS 0x0
WRITE_ROW INTO db1.table1(1,3) VALUES (42, 'foobar'), (10, NULL) FLAGS 0x2
UPDATE_ROW INTO db2.table2 (1) (1) VALUES FROM ('beforeval') TO ('toval'),
FROM ('a') TO ('b') FLAGS 0x0
DELETE_ROW INTO db2.table2 (1) VALUES ('row_to_delete') FLAGS 0x0;
This is basically a dump of what is stored in the events, and would be an
alternative to BINLOG 'gwWEShMBAA...'.
Feedback and other suggestions
------------------------------
* What is the need for WITH TIMESTAMP part? Can't one use a separate
SET TIMESTAMP statement?
* mysqlbinlog --base64-output=DECODE-ROWS --verbose already produces something
that's close to readable SQL. Can we make it to be regular parseable SQL?
+ This will be syntax that's familiar to our parser and to the users
- A stream of SQL statements will be slower to run than BINLOG statements
(due to locking, table open/close, etc). (TODO: is it really slower? we
haven't checked).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Psergey): Change BINLOG statement syntax to be human-readable (46)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Change BINLOG statement syntax to be human-readable
CREATION DATE..: Sat, 15 Aug 2009, 23:42
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 46 (http://askmonty.org/worklog/?tid=46)
VERSION........: WorkLog-3.4
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
One of the great things about mysqlbinlog was that its output was human-readable
SQL, so it was possible to edit it manually or with the help of scripts. With RBR
events and BINLOG 'DpiGShMBAAAALQAAADcBAA...' statements this is no longer the
case.
This WL task is about making BINLOG statements human-readable (either as
an option or by default).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Add a mysqlbinlog option to produce succint output (45)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to produce succint output
CREATION DATE..: Sat, 15 Aug 2009, 23:40
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 45 (http://askmonty.org/worklog/?tid=45)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sat, 15 Aug 2009, 23:40)=-=-
Title modified.
--- /tmp/wklog.45.old.17603 2009-08-15 23:40:38.000000000 +0300
+++ /tmp/wklog.45.new.17603 2009-08-15 23:40:38.000000000 +0300
@@ -1 +1 @@
-Add a mysqlbinlog option to produce siccint output
+Add a mysqlbinlog option to produce succint output
DESCRIPTION:
Add a mysqlbinlog option to produce the most succinct output, without any
comments or other statements that are not needed to apply the binlog correctly.
This will be different from the --short-form option. That option causes mysqlbinlog
not to print RBR events, i.e. the output is not supposed to be applied.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Psergey): Add a mysqlbinlog option to produce siccint output (45)
by worklog-noreply@askmonty.org 15 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to produce siccint output
CREATION DATE..: Sat, 15 Aug 2009, 23:40
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 45 (http://askmonty.org/worklog/?tid=45)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
Add a mysqlbinlog option to produce the most succinct output, without any
comments or other statements that are not needed to apply the binlog correctly.
This will be different from the --short-form option. That option causes mysqlbinlog
not to print RBR events, i.e. the output is not supposed to be applied.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Rev 2724: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r10/
by Sergey Petrunya 15 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10/
------------------------------------------------------------
revno: 2724
revision-id: psergey(a)askmonty.org-20090815153912-q47vfp1j22ilmup2
parent: psergey(a)askmonty.org-20090815121442-706m9ujn8km4u4y1
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10
timestamp: Sat 2009-08-15 18:39:12 +0300
message:
MWL#17: Table elimination
- Review feedback, more variable renames
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-15 12:14:42 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-15 15:39:12 +0000
@@ -114,7 +114,6 @@
MODULE_EXPRESSION,
MODULE_MULTI_EQUALITY,
MODULE_UNIQUE_KEY,
- MODULE_TABLE,
MODULE_OUTER_JOIN
} type; /* Type of the object */
@@ -138,7 +137,7 @@
{
public:
Field_value *field;
- Item *val;
+ Item *expression;
/* Used during condition analysis only, similar to KEYUSE::level */
uint level;
@@ -510,18 +509,18 @@
*/
if (old->field == new_fields->field)
{
- if (!new_fields->val->const_item())
+ if (!new_fields->expression->const_item())
{
/*
If the value matches, we can use the key reference.
If not, we keep it until we have examined all new values
*/
- if (old->val->eq(new_fields->val, old->field->field->binary()))
+ if (old->expression->eq(new_fields->expression, old->field->field->binary()))
{
old->level= and_level;
}
}
- else if (old->val->eq_by_collation(new_fields->val,
+ else if (old->expression->eq_by_collation(new_fields->expression,
old->field->field->binary(),
old->field->field->charset()))
{
@@ -633,7 +632,7 @@
/* Store possible eq field */
(*eq_dep)->type= Module_dep::MODULE_EXPRESSION; //psergey-todo;
(*eq_dep)->field= get_field_value(te, field);
- (*eq_dep)->val= *value;
+ (*eq_dep)->expression= *value;
(*eq_dep)->level= and_level;
(*eq_dep)++;
}
@@ -953,7 +952,7 @@
{
deps_setter.expr_offset= eq_dep - te->equality_deps;
eq_dep->unknown_args= 0;
- eq_dep->val->walk(&Item::check_column_usage_processor, FALSE,
+ eq_dep->expression->walk(&Item::check_column_usage_processor, FALSE,
(uchar*)&deps_setter);
if (!eq_dep->unknown_args)
{
@@ -1283,7 +1282,7 @@
char buf[128];
String str(buf, sizeof(buf), &my_charset_bin);
str.length(0);
- eq_dep->val->print(&str, QT_ORDINARY);
+ eq_dep->expression->print(&str, QT_ORDINARY);
fprintf(DBUG_FILE, " equality%d: %s -> %s.%s\n",
eq_dep - te->equality_deps,
str.c_ptr(),

[Maria-developers] Rev 2723: Fix trivial typo in file:///home/psergey/dev/maria-5.1-table-elim-r10/
by Sergey Petrunya 15 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10/
------------------------------------------------------------
revno: 2723
revision-id: psergey(a)askmonty.org-20090815121442-706m9ujn8km4u4y1
parent: psergey(a)askmonty.org-20090815102953-7s0jb470ibwq58qz
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10
timestamp: Sat 2009-08-15 16:14:42 +0400
message:
Fix trivial typo
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-15 10:29:53 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-15 12:14:42 +0000
@@ -841,7 +841,7 @@
*eliminable_tables);
}
- if (eliminable && get_outer_join_dep(te, tbl, cur_map))
+ if (eliminable && !get_outer_join_dep(te, tbl, cur_map))
return TRUE;
tables_used_on_left |= tbl->on_expr->used_tables();

[Maria-developers] Rev 2722: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r10/
by Sergey Petrunya 15 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10/
------------------------------------------------------------
revno: 2722
revision-id: psergey(a)askmonty.org-20090815102953-7s0jb470ibwq58qz
parent: psergey(a)askmonty.org-20090815060803-0yvp5mmgo87emykp
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10
timestamp: Sat 2009-08-15 14:29:53 +0400
message:
MWL#17: Table elimination
Continue with addressing review feedback part two:
- rename enum members
- add checking for out of memory errors on allocation
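The out-of-memory handling described above follows a common pattern: an allocating helper returns NULL on failure, and each caller turns that into a TRUE ("error") return that is passed up the call chain. The following is a minimal standalone sketch of that pattern, not the patch itself. The names Table_value_sketch, get_table_value_sketch and collect_sketch are hypothetical, and std::nothrow stands in for an allocator that reports failure by returning NULL, which is what the checks in the diff appear to assume.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for Table_value: one object per table number.
struct Table_value_sketch {
  int tablenr;
  explicit Table_value_sketch(int n) : tablenr(n) {}
};

// Create the object for a table and store it in its slot.
// Returns nullptr if the allocation fails, so the caller can bail out.
Table_value_sketch *get_table_value_sketch(Table_value_sketch **slots,
                                           int tablenr) {
  assert(tablenr >= 0);
  Table_value_sketch *v = new (std::nothrow) Table_value_sketch(tablenr);
  if (!v)
    return nullptr;
  return slots[tablenr] = v;
}

// Returns true on error and false on success, matching the TRUE/FALSE
// convention used by collect_funcdeps_for_join_list() in the diff.
bool collect_sketch(Table_value_sketch **slots, int n_tables) {
  for (int i = 0; i < n_tables; i++) {
    // A NULL result means "out of memory": stop and report the error.
    if (!slots[i] && !get_table_value_sketch(slots, i))
      return true;
  }
  return false;
}
```

Because the helper returns a pointer (non-NULL on success) while the collector returns a bool (TRUE on failure), the caller has to negate the helper's result before treating it as an error.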
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-15 06:08:03 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-15 10:29:53 +0000
@@ -111,11 +111,11 @@
{
public:
enum {
- FD_EXPRESSION,
- FD_MULTI_EQUALITY,
- FD_UNIQUE_KEY,
- FD_TABLE,
- FD_OUTER_JOIN
+ MODULE_EXPRESSION,
+ MODULE_MULTI_EQUALITY,
+ MODULE_UNIQUE_KEY,
+ MODULE_TABLE,
+ MODULE_OUTER_JOIN
} type; /* Type of the object */
/*
@@ -156,7 +156,7 @@
Key_module(Table_value *table_arg, uint keyno_arg, uint n_parts_arg) :
table(table_arg), keyno(keyno_arg), next_table_key(NULL)
{
- type= Module_dep::FD_UNIQUE_KEY;
+ type= Module_dep::MODULE_UNIQUE_KEY;
unknown_args= n_parts_arg;
}
Table_value *table; /* Table this key is from */
@@ -178,7 +178,7 @@
Outer_join_module(TABLE_LIST *table_list_arg, uint n_children) :
table_list(table_list_arg), parent(NULL)
{
- type= Module_dep::FD_OUTER_JOIN;
+ type= Module_dep::MODULE_OUTER_JOIN;
unknown_args= n_children;
}
/*
@@ -205,7 +205,7 @@
class Table_elimination
{
public:
- Table_elimination(JOIN *join_arg) : join(join_arg)
+ Table_elimination(JOIN *join_arg) : join(join_arg), n_outer_joins(0)
{
bzero(table_deps, sizeof(table_deps));
}
@@ -220,6 +220,7 @@
/* Outer joins that are candidates for elimination */
List<Outer_join_module> oj_deps;
+ uint n_outer_joins;
/* Bitmap of how expressions depend on bits */
MY_BITMAP expr_deps;
@@ -630,22 +631,25 @@
DBUG_ASSERT(eq_func);
/* Store possible eq field */
- (*eq_dep)->type= Module_dep::FD_EXPRESSION; //psergey-todo;
+ (*eq_dep)->type= Module_dep::MODULE_EXPRESSION; //psergey-todo;
(*eq_dep)->field= get_field_value(te, field);
(*eq_dep)->val= *value;
(*eq_dep)->level= and_level;
(*eq_dep)++;
}
+
/*
Get a Table_value object for the given table, creating it if necessary.
*/
static Table_value *get_table_value(Table_elimination *te, TABLE *table)
{
- Table_value *tbl_dep= new Table_value(table);
+ Table_value *tbl_dep;
+ if (!(tbl_dep= new Table_value(table)))
+ return NULL;
+
Key_module **key_list= &(tbl_dep->keys);
-
/* Add dependencies for unique keys */
for (uint i=0; i < table->s->keys; i++)
{
@@ -657,7 +661,7 @@
key_list= &(key_dep->next_table_key);
}
}
- return te->table_deps[table->tablenr] = tbl_dep;
+ return te->table_deps[table->tablenr]= tbl_dep;
}
@@ -672,7 +676,10 @@
/* First, get the table*/
if (!(tbl_dep= te->table_deps[table->tablenr]))
- tbl_dep= get_table_value(te, table);
+ {
+ if (!(tbl_dep= get_table_value(te, table)))
+ return NULL;
+ }
/* Try finding the field in field list */
Field_value **pfield= &(tbl_dep->fields);
@@ -702,10 +709,12 @@
static
Outer_join_module *get_outer_join_dep(Table_elimination *te,
- TABLE_LIST *outer_join, table_map deps_map)
+ TABLE_LIST *outer_join,
+ table_map deps_map)
{
Outer_join_module *oj_dep;
oj_dep= new Outer_join_module(outer_join, my_count_bits(deps_map));
+ te->n_outer_joins++;
/*
Collect a bitmap fo tables that we depend on, and also set parent pointer
@@ -734,7 +743,8 @@
}
}
DBUG_ASSERT(table);
- table_dep= get_table_value(te, table);
+ if (!(table_dep= get_table_value(te, table)))
+ return NULL;
}
/*
@@ -781,7 +791,7 @@
.
*/
-static void
+static bool
collect_funcdeps_for_join_list(Table_elimination *te,
List<TABLE_LIST> *join_list,
bool build_eq_deps,
@@ -808,11 +818,12 @@
eliminable= !(cur_map & outside_used_tables);
if (eliminable)
*eliminable_tables |= cur_map;
- collect_funcdeps_for_join_list(te, &tbl->nested_join->join_list,
- eliminable || build_eq_deps,
- outside_used_tables,
- eliminable_tables,
- eq_dep);
+ if (collect_funcdeps_for_join_list(te, &tbl->nested_join->join_list,
+ eliminable || build_eq_deps,
+ outside_used_tables,
+ eliminable_tables,
+ eq_dep))
+ return TRUE;
}
else
{
@@ -830,13 +841,13 @@
*eliminable_tables);
}
- if (eliminable)
- te->oj_deps.push_back(get_outer_join_dep(te, tbl, cur_map));
+ if (eliminable && get_outer_join_dep(te, tbl, cur_map))
+ return TRUE;
tables_used_on_left |= tbl->on_expr->used_tables();
}
}
- return;
+ return FALSE;
}
@@ -1053,16 +1064,18 @@
DBUG_VOID_RETURN;
Equality_module *eq_deps_end= te.equality_deps;
table_map eliminable_tables= 0;
- collect_funcdeps_for_join_list(&te, join->join_list,
- FALSE,
- used_tables,
- &eliminable_tables,
- &eq_deps_end);
+ if (collect_funcdeps_for_join_list(&te, join->join_list,
+ FALSE,
+ used_tables,
+ &eliminable_tables,
+ &eq_deps_end))
+ DBUG_VOID_RETURN;
te.n_equality_deps= eq_deps_end - te.equality_deps;
Module_dep *bound_modules;
//Value_dep *bound_values;
- setup_equality_deps(&te, &bound_modules);
+ if (setup_equality_deps(&te, &bound_modules))
+ DBUG_VOID_RETURN;
run_elimination_wave(&te, bound_modules);
}
@@ -1108,7 +1121,7 @@
{
switch (bound_modules->type)
{
- case Module_dep::FD_EXPRESSION:
+ case Module_dep::MODULE_EXPRESSION:
{
/* It's a field=expr and we got to know the expr, so we know the field */
Equality_module *eq_dep= (Equality_module*)bound_modules;
@@ -1121,7 +1134,7 @@
}
break;
}
- case Module_dep::FD_UNIQUE_KEY:
+ case Module_dep::MODULE_UNIQUE_KEY:
{
/* Unique key is known means the table is known */
Table_value *table_dep=((Key_module*)bound_modules)->table;
@@ -1134,13 +1147,13 @@
}
break;
}
- case Module_dep::FD_OUTER_JOIN:
+ case Module_dep::MODULE_OUTER_JOIN:
{
Outer_join_module *outer_join_dep= (Outer_join_module*)bound_modules;
mark_as_eliminated(te->join, outer_join_dep->table_list);
break;
}
- case Module_dep::FD_MULTI_EQUALITY:
+ case Module_dep::MODULE_MULTI_EQUALITY:
default:
DBUG_ASSERT(0);
}

[Maria-developers] Rev 2721: MWL#17: Address 2nd post-review feedback in file:///home/psergey/dev/maria-5.1-table-elim-r10/
by Sergey Petrunya 15 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10/
------------------------------------------------------------
revno: 2721
revision-id: psergey(a)askmonty.org-20090815060803-0yvp5mmgo87emykp
parent: psergey(a)askmonty.org-20090813211212-jghejwxsl6adtopl
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10
timestamp: Sat 2009-08-15 10:08:03 +0400
message:
MWL#17: Address 2nd post-review feedback
- Switch from uniform graph to bipartite graph with two kinds of nodes:
"values" (tables and fields) and "modules" (t.col=func(...) equalities,
multi-equalities, unique keys, inner sides of outer joins).
- Rename functions, classes, etc.
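The bipartite structure described above can be illustrated with a small standalone sketch. It assumes a simplified model: the Value and Module structs and the run_wave() function are hypothetical stand-ins rather than the classes in the patch, and each module produces at most one value. Modules whose arguments are all known bind the value they produce, and each newly bound value decrements the unknown-argument counter of every module that uses it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Module;

// A "value" node (a table or a field in the real code).
struct Value {
  bool bound = false;
  std::vector<Module *> dependents;  // modules that use this value as input
};

// A "module" node (an equality, a unique key, an outer join nest).
struct Module {
  unsigned unknown_args = 0;  // inputs that are not bound yet
  Value *produces = nullptr;  // value that becomes bound once inputs are known
};

// Propagate "boundness" in alternating steps: modules -> values -> modules.
// Takes the modules that are bound from the start (unknown_args == 0) and
// returns the number of values that became bound.
std::size_t run_wave(std::vector<Module *> bound_modules) {
  std::size_t n_bound = 0;
  std::vector<Value *> bound_values;
  while (!bound_modules.empty()) {
    // Step 1: every bound module binds the value it produces.
    for (Module *m : bound_modules) {
      assert(m != nullptr);
      Value *v = m->produces;
      if (v && !v->bound) {
        v->bound = true;
        bound_values.push_back(v);
        ++n_bound;
      }
    }
    bound_modules.clear();
    // Step 2: every newly bound value signals the modules that use it.
    for (Value *v : bound_values) {
      for (Module *m : v->dependents) {
        // The guard keeps a module from being queued more than once.
        if (m->unknown_args && !--m->unknown_args)
          bound_modules.push_back(m);
      }
    }
    bound_values.clear();
  }
  return n_bound;
}
```

In the patch itself the edges are richer than a single "produces" pointer: a bound unique key binds its table, a bound table binds all of its fields and decrements the counters of the enclosing outer joins, and a bound outer join is marked as eliminated. The sketch collapses these into one edge type to show only the alternation and the counter logic.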
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-13 20:44:52 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-15 06:08:03 +0000
@@ -40,19 +40,78 @@
Table elimination is redone on every PS re-execution.
*/
-
-/*
- An abstract structure that represents some entity that's being dependent on
- some other entity.
-*/
-
-class Func_dep : public Sql_alloc
-{
-public:
- enum {
- FD_INVALID,
+class Value_dep
+{
+public:
+ enum {
+ VALUE_FIELD,
+ VALUE_TABLE,
+ } type; /* Type of the object */
+
+ bool bound;
+ Value_dep *next;
+};
+
+class Field_value;
+class Table_value;
+class Outer_join_module;
+class Key_module;
+
+/*
+ A table field. There is only one such object for any tblX.fieldY
+ - the field epends on its table and equalities
+ - expressions that use the field are its dependencies
+*/
+class Field_value : public Value_dep
+{
+public:
+ Field_value(Table_value *table_arg, Field *field_arg) :
+ table(table_arg), field(field_arg)
+ {
+ type= Value_dep::VALUE_FIELD;
+ }
+
+ Table_value *table; /* Table this field is from */
+ Field *field;
+
+ /*
+ Field_deps that belong to one table form a linked list. list members are
+ ordered by field_index
+ */
+ Field_value *next_table_field;
+ uint bitmap_offset; /* Offset of our part of the bitmap */
+};
+
+
+/*
+ A table.
+ - table depends on any of its unique keys
+ - has its fields and embedding outer join as dependency.
+*/
+class Table_value : public Value_dep
+{
+public:
+ Table_value(TABLE *table_arg) :
+ table(table_arg), fields(NULL), keys(NULL), outer_join_dep(NULL)
+ {
+ type= Value_dep::VALUE_TABLE;
+ }
+ TABLE *table;
+ Field_value *fields; /* Ordered list of fields that belong to this table */
+ Key_module *keys; /* Ordered list of Unique keys in this table */
+ Outer_join_module *outer_join_dep; /* Innermost eliminable outer join we're in */
+};
+
+
+/*
+ A 'module'
+*/
+
+class Module_dep : public Sql_alloc
+{
+public:
+ enum {
FD_EXPRESSION,
- FD_FIELD,
FD_MULTI_EQUALITY,
FD_UNIQUE_KEY,
FD_TABLE,
@@ -63,58 +122,26 @@
Used to make a linked list of elements that became bound and thus can
make elements that depend on them bound, too.
*/
- Func_dep *next;
- bool bound; /* TRUE<=> The entity is considered bound */
- Func_dep() : next(NULL), bound(FALSE) {}
+ Module_dep *next;
+ uint unknown_args; /* TRUE<=> The entity is considered bound */
+
+ Module_dep() : next(NULL), unknown_args(0) {}
};
-class Field_dep;
-class Table_dep;
-class Outer_join_dep;
-
/*
A "tbl.column= expr" equality dependency. tbl.column depends on fields
used in expr.
*/
-class Equality_dep : public Func_dep
+class Equality_module : public Module_dep
{
public:
- Field_dep *field;
+ Field_value *field;
Item *val;
/* Used during condition analysis only, similar to KEYUSE::level */
uint level;
-
- /* Number of fields referenced from *val that are not yet 'bound' */
- uint unknown_args;
-};
-
-
-/*
- A table field. There is only one such object for any tblX.fieldY
- - the field epends on its table and equalities
- - expressions that use the field are its dependencies
-*/
-class Field_dep : public Func_dep
-{
-public:
- Field_dep(Table_dep *table_arg, Field *field_arg) :
- table(table_arg), field(field_arg)
- {
- type= Func_dep::FD_FIELD;
- }
-
- Table_dep *table; /* Table this field is from */
- Field *field;
-
- /*
- Field_deps that belong to one table form a linked list. list members are
- ordered by field_index
- */
- Field_dep *next_table_field;
- uint bitmap_offset; /* Offset of our part of the bitmap */
};
@@ -123,41 +150,21 @@
- Unique key depends on all of its components
- Key's table is its dependency
*/
-class Key_dep: public Func_dep
+class Key_module: public Module_dep
{
public:
- Key_dep(Table_dep *table_arg, uint keyno_arg, uint n_parts_arg) :
- table(table_arg), keyno(keyno_arg), n_missing_keyparts(n_parts_arg),
- next_table_key(NULL)
+ Key_module(Table_value *table_arg, uint keyno_arg, uint n_parts_arg) :
+ table(table_arg), keyno(keyno_arg), next_table_key(NULL)
{
- type= Func_dep::FD_UNIQUE_KEY;
+ type= Module_dep::FD_UNIQUE_KEY;
+ unknown_args= n_parts_arg;
}
- Table_dep *table; /* Table this key is from */
+ Table_value *table; /* Table this key is from */
uint keyno;
- uint n_missing_keyparts;
/* Unique keys form a linked list, ordered by keyno */
- Key_dep *next_table_key;
-};
-
-
-/*
- A table.
- - table depends on any of its unique keys
- - has its fields and embedding outer join as dependency.
-*/
-class Table_dep : public Func_dep
-{
-public:
- Table_dep(TABLE *table_arg) :
- table(table_arg), fields(NULL), keys(NULL), outer_join_dep(NULL)
- {
- type= Func_dep::FD_TABLE;
- }
- TABLE *table;
- Field_dep *fields; /* Ordered list of fields that belong to this table */
- Key_dep *keys; /* Ordered list of Unique keys in this table */
- Outer_join_dep *outer_join_dep; /* Innermost eliminable outer join we're in */
-};
+ Key_module *next_table_key;
+};
+
/*
@@ -165,14 +172,14 @@
- it depends on all tables inside it
- has its parent outer join as dependency
*/
-class Outer_join_dep: public Func_dep
+class Outer_join_module: public Module_dep
{
public:
- Outer_join_dep(TABLE_LIST *table_list_arg, table_map missing_tables_arg) :
- table_list(table_list_arg), missing_tables(missing_tables_arg),
- all_tables(missing_tables_arg), parent(NULL)
+ Outer_join_module(TABLE_LIST *table_list_arg, uint n_children) :
+ table_list(table_list_arg), parent(NULL)
{
- type= Func_dep::FD_OUTER_JOIN;
+ type= Module_dep::FD_OUTER_JOIN;
+ unknown_args= n_children;
}
/*
Outer join we're representing. This can be a join nest or a one table that
@@ -184,11 +191,11 @@
Tables within this outer join (and its descendants) that are not yet known
to be functionally dependent.
*/
- table_map missing_tables;
+ table_map missing_tables; //psergey-todo: remove
/* All tables within this outer join and its descendants */
- table_map all_tables;
+ table_map all_tables; //psergey-todo: remove
/* Parent eliminable outer join, if any */
- Outer_join_dep *parent;
+ Outer_join_module *parent;
};
@@ -205,44 +212,45 @@
JOIN *join;
/* Array of equality dependencies */
- Equality_dep *equality_deps;
+ Equality_module *equality_deps;
uint n_equality_deps; /* Number of elements in the array */
- /* tablenr -> Table_dep* mapping. */
- Table_dep *table_deps[MAX_KEY];
+ /* tablenr -> Table_value* mapping. */
+ Table_value *table_deps[MAX_KEY];
/* Outer joins that are candidates for elimination */
- List<Outer_join_dep> oj_deps;
+ List<Outer_join_module> oj_deps;
/* Bitmap of how expressions depend on bits */
MY_BITMAP expr_deps;
};
-
static
-void build_eq_deps_for_cond(Table_elimination *te, Equality_dep **fdeps,
+void build_eq_deps_for_cond(Table_elimination *te, Equality_module **fdeps,
uint *and_level, Item *cond,
table_map usable_tables);
static
void add_eq_dep(Table_elimination *te,
- Equality_dep **eq_dep, uint and_level,
+ Equality_module **eq_dep, uint and_level,
Item_func *cond, Field *field,
bool eq_func, Item **value,
uint num_values, table_map usable_tables);
static
-Equality_dep *merge_func_deps(Equality_dep *start, Equality_dep *new_fields,
- Equality_dep *end, uint and_level);
-
-static Table_dep *get_table_dep(Table_elimination *te, TABLE *table);
-static Field_dep *get_field_dep(Table_elimination *te, Field *field);
-
+Equality_module *merge_func_deps(Equality_module *start, Equality_module *new_fields,
+ Equality_module *end, uint and_level);
+
+static Table_value *get_table_value(Table_elimination *te, TABLE *table);
+static Field_value *get_field_value(Table_elimination *te, Field *field);
+static
+void run_elimination_wave(Table_elimination *te, Module_dep *bound_modules);
void eliminate_tables(JOIN *join);
static void mark_as_eliminated(JOIN *join, TABLE_LIST *tbl);
+#if 0
#ifndef DBUG_OFF
static void dbug_print_deps(Table_elimination *te);
#endif
-
+#endif
/*******************************************************************************************/
/*
@@ -262,14 +270,14 @@
*/
static
-void build_eq_deps_for_cond(Table_elimination *te, Equality_dep **fdeps,
+void build_eq_deps_for_cond(Table_elimination *te, Equality_module **fdeps,
uint *and_level, Item *cond,
table_map usable_tables)
{
if (cond->type() == Item_func::COND_ITEM)
{
List_iterator_fast<Item> li(*((Item_cond*) cond)->argument_list());
- Equality_dep *org_key_fields= *fdeps;
+ Equality_module *org_key_fields= *fdeps;
/* AND/OR */
if (((Item_cond*) cond)->functype() == Item_func::COND_AND_FUNC)
@@ -293,7 +301,7 @@
Item *item;
while ((item=li++))
{
- Equality_dep *start_key_fields= *fdeps;
+ Equality_module *start_key_fields= *fdeps;
(*and_level)++;
build_eq_deps_for_cond(te, fdeps, and_level, item, usable_tables);
*fdeps= merge_func_deps(org_key_fields, start_key_fields, *fdeps,
@@ -432,7 +440,7 @@
/*
- Perform an OR operation on two (adjacent) Equality_dep arrays.
+ Perform an OR operation on two (adjacent) Equality_module arrays.
SYNOPSIS
merge_func_deps()
@@ -442,7 +450,7 @@
and_level AND-level.
DESCRIPTION
- This function is invoked for two adjacent arrays of Equality_dep elements:
+ This function is invoked for two adjacent arrays of Equality_module elements:
$LEFT_PART $RIGHT_PART
+-----------------------+-----------------------+
@@ -477,19 +485,19 @@
*/
static
-Equality_dep *merge_func_deps(Equality_dep *start, Equality_dep *new_fields,
- Equality_dep *end, uint and_level)
+Equality_module *merge_func_deps(Equality_module *start, Equality_module *new_fields,
+ Equality_module *end, uint and_level)
{
if (start == new_fields)
return start; // Impossible or
if (new_fields == end)
return start; // No new fields, skip all
- Equality_dep *first_free=new_fields;
+ Equality_module *first_free=new_fields;
for (; new_fields != end ; new_fields++)
{
- for (Equality_dep *old=start ; old != first_free ; old++)
+ for (Equality_module *old=start ; old != first_free ; old++)
{
/*
TODO: does it make sense to attempt to merging multiple-equalities?
@@ -534,7 +542,7 @@
Ok, the results are within the [start, first_free) range, and the useful
elements have level==and_level. Now, lets remove all unusable elements:
*/
- for (Equality_dep *old=start ; old != first_free ;)
+ for (Equality_module *old=start ; old != first_free ;)
{
if (old->level != and_level)
{ // Not used in all levels
@@ -550,14 +558,14 @@
/*
- Add an Equality_dep element for a given predicate, if applicable
+ Add an Equality_module element for a given predicate, if applicable
DESCRIPTION
This function is modeled after add_key_field().
*/
static
-void add_eq_dep(Table_elimination *te, Equality_dep **eq_dep,
+void add_eq_dep(Table_elimination *te, Equality_module **eq_dep,
uint and_level, Item_func *cond, Field *field,
bool eq_func, Item **value, uint num_values,
table_map usable_tables)
@@ -622,22 +630,21 @@
DBUG_ASSERT(eq_func);
/* Store possible eq field */
- (*eq_dep)->type= Func_dep::FD_EXPRESSION; //psergey-todo;
- (*eq_dep)->field= get_field_dep(te, field);
+ (*eq_dep)->type= Module_dep::FD_EXPRESSION; //psergey-todo;
+ (*eq_dep)->field= get_field_value(te, field);
(*eq_dep)->val= *value;
(*eq_dep)->level= and_level;
(*eq_dep)++;
}
-
/*
- Get a Table_dep object for the given table, creating it if necessary.
+ Get a Table_value object for the given table, creating it if necessary.
*/
-static Table_dep *get_table_dep(Table_elimination *te, TABLE *table)
+static Table_value *get_table_value(Table_elimination *te, TABLE *table)
{
- Table_dep *tbl_dep= new Table_dep(table);
- Key_dep **key_list= &(tbl_dep->keys);
+ Table_value *tbl_dep= new Table_value(table);
+ Key_module **key_list= &(tbl_dep->keys);
/* Add dependencies for unique keys */
for (uint i=0; i < table->s->keys; i++)
@@ -645,7 +652,7 @@
KEY *key= table->key_info + i;
if ((key->flags & (HA_NOSAME | HA_END_SPACE_KEY)) == HA_NOSAME)
{
- Key_dep *key_dep= new Key_dep(tbl_dep, i, key->key_parts);
+ Key_module *key_dep= new Key_module(tbl_dep, i, key->key_parts);
*key_list= key_dep;
key_list= &(key_dep->next_table_key);
}
@@ -655,20 +662,20 @@
/*
- Get a Field_dep object for the given field, creating it if necessary
+ Get a Field_value object for the given field, creating it if necessary
*/
-static Field_dep *get_field_dep(Table_elimination *te, Field *field)
+static Field_value *get_field_value(Table_elimination *te, Field *field)
{
TABLE *table= field->table;
- Table_dep *tbl_dep;
+ Table_value *tbl_dep;
/* First, get the table*/
if (!(tbl_dep= te->table_deps[table->tablenr]))
- tbl_dep= get_table_dep(te, table);
+ tbl_dep= get_table_value(te, table);
/* Try finding the field in field list */
- Field_dep **pfield= &(tbl_dep->fields);
+ Field_value **pfield= &(tbl_dep->fields);
while (*pfield && (*pfield)->field->field_index < field->field_index)
{
pfield= &((*pfield)->next_table_field);
@@ -677,7 +684,7 @@
return *pfield;
/* Create the field and insert it in the list */
- Field_dep *new_field= new Field_dep(tbl_dep, field);
+ Field_value *new_field= new Field_value(tbl_dep, field);
new_field->next_table_field= *pfield;
*pfield= new_field;
@@ -686,19 +693,19 @@
/*
- Create an Outer_join_dep object for the given outer join
+ Create an Outer_join_module object for the given outer join
DESCRIPTION
- Outer_join_dep objects for children (or further descendants) are always
+ Outer_join_module objects for children (or further descendants) are always
created before the parents.
*/
static
-Outer_join_dep *get_outer_join_dep(Table_elimination *te,
+Outer_join_module *get_outer_join_dep(Table_elimination *te,
TABLE_LIST *outer_join, table_map deps_map)
{
- Outer_join_dep *oj_dep;
- oj_dep= new Outer_join_dep(outer_join, deps_map);
+ Outer_join_module *oj_dep;
+ oj_dep= new Outer_join_module(outer_join, my_count_bits(deps_map));
/*
Collect a bitmap fo tables that we depend on, and also set parent pointer
@@ -708,7 +715,7 @@
int idx;
while ((idx= it.next_bit()) != Table_map_iterator::BITMAP_END)
{
- Table_dep *table_dep;
+ Table_value *table_dep;
if (!(table_dep= te->table_deps[idx]))
{
/*
@@ -727,23 +734,24 @@
}
}
DBUG_ASSERT(table);
- table_dep= get_table_dep(te, table);
+ table_dep= get_table_value(te, table);
}
/*
Walk from the table up to its embedding outer joins. The goal is to
find the least embedded outer join nest and set its parent pointer to
- point to the newly created Outer_join_dep.
+ point to the newly created Outer_join_module.
to set the pointer of its near
*/
if (!table_dep->outer_join_dep)
table_dep->outer_join_dep= oj_dep;
else
{
- Outer_join_dep *oj= table_dep->outer_join_dep;
+ Outer_join_module *oj= table_dep->outer_join_dep;
while (oj->parent)
oj= oj->parent;
- oj->parent=oj_dep;
+ if (oj != oj_dep)
+ oj->parent=oj_dep;
}
}
return oj_dep;
@@ -757,7 +765,7 @@
collect_funcdeps_for_join_list()
te Table elimination context.
join_list Join list to work on
- build_eq_deps TRUE <=> build Equality_dep elements for all
+ build_eq_deps TRUE <=> build Equality_module elements for all
members of the join list, even if they cannot
be individually eliminated
tables_used_elsewhere Bitmap of tables that are referred to from
@@ -779,7 +787,7 @@
bool build_eq_deps,
table_map tables_used_elsewhere,
table_map *eliminable_tables,
- Equality_dep **eq_dep)
+ Equality_module **eq_dep)
{
TABLE_LIST *tbl;
List_iterator<TABLE_LIST> it(*join_list);
@@ -845,10 +853,10 @@
void see_field(Field *field)
{
- Table_dep *tbl_dep;
+ Table_value *tbl_dep;
if ((tbl_dep= te->table_deps[field->table->tablenr]))
{
- for (Field_dep *field_dep= tbl_dep->fields; field_dep;
+ for (Field_value *field_dep= tbl_dep->fields; field_dep;
field_dep= field_dep->next_table_field)
{
if (field->field_index == field_dep->field->field_index)
@@ -888,21 +896,21 @@
*/
static
-bool setup_equality_deps(Table_elimination *te, Func_dep **bound_deps_list)
+bool setup_equality_deps(Table_elimination *te, Module_dep **bound_deps_list)
{
DBUG_ENTER("setup_equality_deps");
/*
- Count Field_dep objects and assign each of them a unique bitmap_offset.
+ Count Field_value objects and assign each of them a unique bitmap_offset.
*/
uint offset= 0;
- for (Table_dep **tbl_dep=te->table_deps;
+ for (Table_value **tbl_dep=te->table_deps;
tbl_dep < te->table_deps + MAX_TABLES;
tbl_dep++)
{
if (*tbl_dep)
{
- for (Field_dep *field_dep= (*tbl_dep)->fields;
+ for (Field_value *field_dep= (*tbl_dep)->fields;
field_dep;
field_dep= field_dep->next_table_field)
{
@@ -926,9 +934,9 @@
Also collect a linked list of equalities that are bound.
*/
- Func_dep *bound_dep= NULL;
+ Module_dep *bound_dep= NULL;
Field_dependency_setter deps_setter(te);
- for (Equality_dep *eq_dep= te->equality_deps;
+ for (Equality_module *eq_dep= te->equality_deps;
eq_dep < te->equality_deps + te->n_equality_deps;
eq_dep++)
{
@@ -940,12 +948,11 @@
{
eq_dep->next= bound_dep;
bound_dep= eq_dep;
- eq_dep->bound= TRUE;
}
}
*bound_deps_list= bound_dep;
- DBUG_EXECUTE("test", dbug_print_deps(te); );
+ //DBUG_EXECUTE("test", dbug_print_deps(te); );
DBUG_RETURN(FALSE);
}
@@ -1042,9 +1049,9 @@
uint m= max(thd->lex->current_select->max_equal_elems,1);
uint max_elems= ((thd->lex->current_select->cond_count+1)*2 +
thd->lex->current_select->between_count)*m + 1 + 10;
- if (!(te.equality_deps= new Equality_dep[max_elems]))
+ if (!(te.equality_deps= new Equality_module[max_elems]))
DBUG_VOID_RETURN;
- Equality_dep *eq_deps_end= te.equality_deps;
+ Equality_module *eq_deps_end= te.equality_deps;
table_map eliminable_tables= 0;
collect_funcdeps_for_join_list(&te, join->join_list,
FALSE,
@@ -1052,96 +1059,125 @@
&eliminable_tables,
&eq_deps_end);
te.n_equality_deps= eq_deps_end - te.equality_deps;
- Func_dep *bound_dep;
- setup_equality_deps(&te, &bound_dep);
-
- /*
- Run the wave.
- All Func_dep-derived objects are divided into three classes:
- - Those that have bound=FALSE
- - Those that have bound=TRUE
- - Those that have bound=TRUE and are in the list..
-
- */
- while (bound_dep)
- {
- Func_dep *next= bound_dep->next;
- //e= list.remove_first();
- switch (bound_dep->type)
+
+ Module_dep *bound_modules;
+ //Value_dep *bound_values;
+ setup_equality_deps(&te, &bound_modules);
+
+ run_elimination_wave(&te, bound_modules);
+ }
+ DBUG_VOID_RETURN;
+}
+
+
+static
+void signal_from_field_to_exprs(Table_elimination* te, Field_value *field_dep,
+ Module_dep **bound_modules)
+{
+ /* Now, expressions */
+ for (uint i=0; i < te->n_equality_deps; i++)
+ {
+ if (bitmap_is_set(&te->expr_deps, field_dep->bitmap_offset + i) &&
+ te->equality_deps[i].unknown_args &&
+ !--te->equality_deps[i].unknown_args)
+ {
+ /* Mark as bound and add to the list */
+ Equality_module* eq_dep= &te->equality_deps[i];
+ eq_dep->next= *bound_modules;
+ *bound_modules= eq_dep;
+ }
+ }
+}
+
+
+static
+void run_elimination_wave(Table_elimination *te, Module_dep *bound_modules)
+{
+ Value_dep *bound_values= NULL;
+ /*
+ Run the wave.
+ All Func_dep-derived objects are divided into three classes:
+ - Those that have bound=FALSE
+ - Those that have bound=TRUE
+ - Those that have bound=TRUE and are in the list..
+
+ */
+ while (bound_modules)
+ {
+ for (;bound_modules; bound_modules= bound_modules->next)
+ {
+ switch (bound_modules->type)
{
- case Func_dep::FD_EXPRESSION:
+ case Module_dep::FD_EXPRESSION:
{
/* It's a field=expr and we got to know the expr, so we know the field */
- Equality_dep *eq_dep= (Equality_dep*)bound_dep;
+ Equality_module *eq_dep= (Equality_module*)bound_modules;
if (!eq_dep->field->bound)
{
/* Mark as bound and add to the list */
eq_dep->field->bound= TRUE;
- eq_dep->field->next= next;
- next= eq_dep->field;
- }
- break;
- }
- case Func_dep::FD_FIELD:
+ eq_dep->field->next= bound_values;
+ bound_values= eq_dep->field;
+ }
+ break;
+ }
+ case Module_dep::FD_UNIQUE_KEY:
+ {
+ /* Unique key is known means the table is known */
+ Table_value *table_dep=((Key_module*)bound_modules)->table;
+ if (!table_dep->bound)
+ {
+ /* Mark as bound and add to the list */
+ table_dep->bound= TRUE;
+ table_dep->next= bound_values;
+ bound_values= table_dep;
+ }
+ break;
+ }
+ case Module_dep::FD_OUTER_JOIN:
+ {
+ Outer_join_module *outer_join_dep= (Outer_join_module*)bound_modules;
+ mark_as_eliminated(te->join, outer_join_dep->table_list);
+ break;
+ }
+ case Module_dep::FD_MULTI_EQUALITY:
+ default:
+ DBUG_ASSERT(0);
+ }
+ }
+
+ for (;bound_values; bound_values=bound_values->next)
+ {
+ switch (bound_values->type)
+ {
+ case Value_dep::VALUE_FIELD:
{
/*
Field became known. Check out
- unique keys we belong to
- expressions that depend on us.
*/
- Field_dep *field_dep= (Field_dep*)bound_dep;
- for (Key_dep *key_dep= field_dep->table->keys; key_dep;
+ Field_value *field_dep= (Field_value*)bound_values;
+ for (Key_module *key_dep= field_dep->table->keys; key_dep;
key_dep= key_dep->next_table_key)
{
DBUG_PRINT("info", ("key %s.%s is now bound",
key_dep->table->table->alias,
key_dep->table->table->key_info[key_dep->keyno].name));
if (field_dep->field->part_of_key.is_set(key_dep->keyno) &&
- !key_dep->bound)
- {
- if (!--key_dep->n_missing_keyparts)
- {
- /* Mark as bound and add to the list */
- key_dep->bound= TRUE;
- key_dep->next= next;
- next= key_dep;
- }
- }
- }
-
- /* Now, expressions */
- for (uint i=0; i < te.n_equality_deps; i++)
- {
- if (bitmap_is_set(&te.expr_deps, field_dep->bitmap_offset + i))
- {
- Equality_dep* eq_dep= &te.equality_deps[i];
- if (!--eq_dep->unknown_args)
- {
- /* Mark as bound and add to the list */
- eq_dep->bound= TRUE;
- eq_dep->next= next;
- next= eq_dep;
- }
- }
- }
- break;
- }
- case Func_dep::FD_UNIQUE_KEY:
- {
- /* Unique key is known means the table is known */
- Table_dep *table_dep=((Key_dep*)bound_dep)->table;
- if (!table_dep->bound)
- {
- /* Mark as bound and add to the list */
- table_dep->bound= TRUE;
- table_dep->next= next;
- next= table_dep;
- }
- break;
- }
- case Func_dep::FD_TABLE:
- {
- Table_dep *table_dep=(Table_dep*)bound_dep;
+ key_dep->unknown_args && !--key_dep->unknown_args)
+ {
+ /* Mark as bound and add to the list */
+ key_dep->next= bound_modules;
+ bound_modules= key_dep;
+ }
+ }
+ signal_from_field_to_exprs(te, field_dep, &bound_modules);
+ break;
+ }
+ case Value_dep::VALUE_TABLE:
+ {
+ Table_value *table_dep=(Table_value*)bound_values;
DBUG_PRINT("info", ("table %s is now bound",
table_dep->table->alias));
/*
@@ -1149,50 +1185,35 @@
- all its fields are known
- one more element in outer join nest is known
*/
- for (Field_dep *field_dep= table_dep->fields; field_dep;
+ for (Field_value *field_dep= table_dep->fields; field_dep;
field_dep= field_dep->next_table_field)
{
if (!field_dep->bound)
{
/* Mark as bound and add to the list */
field_dep->bound= TRUE;
- field_dep->next= next;
- next= field_dep;
- }
- }
- Outer_join_dep *outer_join_dep= table_dep->outer_join_dep;
- if (!(outer_join_dep->missing_tables &= ~table_dep->table->map))
- {
- /* Mark as bound and add to the list */
- outer_join_dep->bound= TRUE;
- outer_join_dep->next= next;
- next= outer_join_dep;
- }
- break;
- }
- case Func_dep::FD_OUTER_JOIN:
- {
- Outer_join_dep *outer_join_dep= (Outer_join_dep*)bound_dep;
- mark_as_eliminated(te.join, outer_join_dep->table_list);
- Outer_join_dep *parent= outer_join_dep->parent;
- if (parent &&
- !(parent->missing_tables &= ~outer_join_dep->all_tables))
- {
- /* Mark as bound and add to the list */
- parent->bound= TRUE;
- parent->next= next;
- next= parent;
- }
- break;
- }
- case Func_dep::FD_MULTI_EQUALITY:
- default:
+ signal_from_field_to_exprs(te, field_dep, &bound_modules);
+ }
+ }
+ for (Outer_join_module *outer_join_dep= table_dep->outer_join_dep;
+ outer_join_dep; outer_join_dep= outer_join_dep->parent)
+ {
+ //if (!(outer_join_dep->missing_tables &= ~table_dep->table->map))
+ if (outer_join_dep->unknown_args &&
+ !--outer_join_dep->unknown_args)
+ {
+ /* Mark as bound and add to the list */
+ outer_join_dep->next= bound_modules;
+ bound_modules= outer_join_dep;
+ }
+ }
+ break;
+ }
+ default:
DBUG_ASSERT(0);
}
- bound_dep= next;
}
}
- DBUG_VOID_RETURN;
}
@@ -1232,7 +1253,7 @@
}
-
+#if 0
#ifndef DBUG_OFF
static
void dbug_print_deps(Table_elimination *te)
@@ -1243,7 +1264,7 @@
fprintf(DBUG_FILE,"deps {\n");
/* Start with printing equalities */
- for (Equality_dep *eq_dep= te->equality_deps;
+ for (Equality_module *eq_dep= te->equality_deps;
eq_dep != te->equality_deps + te->n_equality_deps; eq_dep++)
{
char buf[128];
@@ -1261,13 +1282,13 @@
/* Then tables and their fields */
for (uint i=0; i < MAX_TABLES; i++)
{
- Table_dep *table_dep;
+ Table_value *table_dep;
if ((table_dep= te->table_deps[i]))
{
/* Print table */
fprintf(DBUG_FILE, " table %s\n", table_dep->table->alias);
/* Print fields */
- for (Field_dep *field_dep= table_dep->fields; field_dep;
+ for (Field_value *field_dep= table_dep->fields; field_dep;
field_dep= field_dep->next_table_field)
{
fprintf(DBUG_FILE, " field %s.%s ->", table_dep->table->alias,
@@ -1288,7 +1309,7 @@
}
#endif
-
+#endif
/**
@} (end of group Table_Elimination)
*/

[Maria-developers] Updated (by Guest): improving mysqlbinlog output and doing rename (39)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: improving mysqlbinlog output and doing rename
CREATION DATE..: Sun, 09 Aug 2009, 12:24
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-RawIdeaBin
TASK ID........: 39 (http://askmonty.org/worklog/?tid=39)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 17
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 15:52)=-=-
Title modified.
--- /tmp/wklog.39.old.11123 2009-08-14 15:52:29.000000000 +0300
+++ /tmp/wklog.39.new.11123 2009-08-14 15:52:29.000000000 +0300
@@ -1 +1 @@
-Replication tasks
+improving mysqlbinlog output and doing rename
-=-=(Guest - Mon, 10 Aug 2009, 16:32)=-=-
Adding 1 hour for Monty's initial work on starting the architecture review.
Worked 1 hour and estimate 0 hours remain (original estimate increased by 1 hour).
-=-=(Psergey - Mon, 10 Aug 2009, 15:59)=-=-
Re-searched and added subtasks.
Worked 16 hours and estimate 0 hours remain (original estimate increased by 16 hours).
-=-=(Psergey - Mon, 10 Aug 2009, 15:31)=-=-
Dependency created: 39 now depends on 41
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Psergey - Sun, 09 Aug 2009, 12:27)=-=-
Dependency created: 39 now depends on 36
-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 38
-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 37
DESCRIPTION:
A combined task for all replication tasks.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Knielsen): Add a mysqlbinlog option to filter updates to certain tables (40)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter updates to certain tables
CREATION DATE..: Mon, 10 Aug 2009, 13:25
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 40 (http://askmonty.org/worklog/?tid=40)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Knielsen - Fri, 14 Aug 2009, 15:47)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.10896 2009-08-14 15:47:39.000000000 +0300
+++ /tmp/wklog.40.new.10896 2009-08-14 15:47:39.000000000 +0300
@@ -72,3 +72,21 @@
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
+
+2.4 Implement server functionality to ignore certain tables
+-----------------------------------------------------------
+
+We could add a general facility in the server to ignore certain tables:
+
+ SET SESSION ignored_tables = "db1.t1,db2.t2";
+
+This would work similar to --replicate-ignore-table, but in a general way not
+restricted to the slave SQL thread.
+
+It would then be trivial for mysqlbinlog to add such statements at the start
+of the output, or probably the user could just do it manually with no need for
+additional options for mysqlbinlog.
+
+It might be useful to integrate this with the code that already handles
+--replicate-ignore-db and similar slave options.
+
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.12989 2009-08-10 15:41:23.000000000 +0300
+++ /tmp/wklog.40.new.12989 2009-08-10 15:41:23.000000000 +0300
@@ -1,6 +1,7 @@
-
1. Context
----------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High Level Description modified.
--- /tmp/wklog.40.old.16985 2009-08-10 14:51:59.000000000 +0300
+++ /tmp/wklog.40.new.16985 2009-08-10 14:51:59.000000000 +0300
@@ -1,3 +1,4 @@
Replication slave can be set to filter updates to certain tables with
---replicate-[wild-]{do,ignore}-table options. This task is about adding similar
-functionality to mysqlbinlog.
+--replicate-[wild-]{do,ignore}-table options.
+
+This task is about adding similar functionality to mysqlbinlog.
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.16949 2009-08-10 14:51:33.000000000 +0300
+++ /tmp/wklog.40.new.16949 2009-08-10 14:51:33.000000000 +0300
@@ -1 +1,73 @@
+1. Context
+----------
+At the moment, the server has these replication slave options:
+
+ --replicate-do-table=db.tbl
+ --replicate-ignore-table=db.tbl
+ --replicate-wild-do-table=pattern.pattern
+ --replicate-wild-ignore-table=pattern.pattern
+
+They affect both RBR and SBR events. SBR events are checked after the
+statement has been parsed, the server iterates over list of used tables and
+checks them againist --replicate instructions.
+
+What is interesting is that this scheme still allows to update the ignored
+table through a VIEW.
+
+2. Table filtering in mysqlbinlog
+---------------------------------
+
+Per-table filtering of RBR events is easy (as it is relatively easy to extract
+the name of the table that the event applies to).
+
+Per-table filtering of SBR events is hard, as generally it is not apparent
+which tables the statement refers to.
+
+This opens possible options:
+
+2.1 Put the parser into mysqlbinlog
+-----------------------------------
+Once we have a full parser in mysqlbinlog, we'll be able to check which tables
+are used by a statement, and will allow to show behaviour identical to those
+that one obtains when using --replicate-* slave options.
+
+(It is not clear how much effort is needed to put the parser into mysqlbinlog.
+Any guesses?)
+
+
+2.2 Use dumb regexp match
+-------------------------
+Use a really dumb approach. A query is considered to be modifying table X if
+it matches an expression
+
+CREATE TABLE $tablename
+DROP $tablename
+UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
+DELETE ...$tablename ... WHERE // same as above
+ALTER TABLE $tablename
+.. etc (go get from the grammar) ..
+
+The advantage over doing the same in awk is that mysqlbinlog will also process
+RBR statements, and together with that will provide a working solution for
+those who are careful with their table names not mixing with string constants
+and such.
+
+(TODO: string constants are of particular concern as they come from
+[potentially hostile] users, unlike e.g. table aliases which come from
+[not hostile] developers. Remove also all string constants before attempting
+to do match?)
+
+2.3 Have the master put annotations
+-----------------------------------
+We could add a master option so that it injects into query a mark that tells
+which tables the query will affect, e.g. for the query
+
+ UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
+
+
+the binlog will have
+
+ /* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
+
+and further processing in mysqlbinlog will be trivial.
DESCRIPTION:
A replication slave can be set to filter updates to certain tables with
--replicate-[wild-]{do,ignore}-table options.
This task is about adding similar functionality to mysqlbinlog.
HIGH-LEVEL SPECIFICATION:
1. Context
----------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
--replicate-ignore-table=db.tbl
--replicate-wild-do-table=pattern.pattern
--replicate-wild-ignore-table=pattern.pattern
They affect both RBR and SBR events. SBR events are checked after the
statement has been parsed: the server iterates over the list of used tables
and checks them against the --replicate-* rules.
Interestingly, this scheme still allows the ignored table to be updated
through a VIEW.
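The per-table check described above can be sketched roughly as follows. This is
a simplified illustration and not the server's code: the function names are
hypothetical, the evaluation order (do, ignore, wild-do, wild-ignore, then a
default) is an assumption, and wild patterns are assumed to use LIKE-style
wildcards with no escape handling.

```python
import re


def like_to_regex(pattern):
    """Compile a LIKE-style pattern: '%' is any run of characters, '_' is one."""
    pieces = []
    for ch in pattern:
        if ch == "%":
            pieces.append(".*")
        elif ch == "_":
            pieces.append(".")
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces), re.DOTALL)


def statement_passes(tables, do=(), ignore=(), wild_do=(), wild_ignore=()):
    """Return True if a statement touching `tables` would be applied.

    `tables` holds fully qualified names such as "db1.t1". The first table
    that matches any rule decides the outcome (assumed order, see above).
    """
    for name in tables:
        if name in do:
            return True
        if name in ignore:
            return False
        if any(like_to_regex(p).fullmatch(name) for p in wild_do):
            return True
        if any(like_to_regex(p).fullmatch(name) for p in wild_ignore):
            return False
    # No rule matched: apply the statement unless "do" rules were configured.
    return not (do or wild_do)
```

Because only the names in the statement's table list are compared, a statement
that goes through a VIEW is checked against the view's name, which fits the
observation above that an ignored table can still be updated that way.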
2. Table filtering in mysqlbinlog
---------------------------------
Per-table filtering of RBR events is easy (as it is relatively easy to extract
the name of the table that the event applies to).
Per-table filtering of SBR events is hard, as generally it is not apparent
which tables the statement refers to.
This leaves the following options:
2.1 Put the parser into mysqlbinlog
-----------------------------------
Once we have a full parser in mysqlbinlog, we'll be able to check which tables
a statement uses, and mysqlbinlog will be able to show behaviour identical to
what one obtains with the --replicate-* slave options.
(It is not clear how much effort is needed to put the parser into mysqlbinlog.
Any guesses?)
2.2 Use dumb regexp match
-------------------------
Use a really dumb approach. A query is considered to be modifying table X if
it matches one of these expressions:
CREATE TABLE $tablename
DROP $tablename
UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
DELETE ...$tablename ... WHERE // same as above
ALTER TABLE $tablename
.. etc (go get from the grammar) ..
The advantage over doing the same in awk is that mysqlbinlog will also process
RBR statements, and together with that will provide a working solution for
those who take care that their table names do not also appear in string
constants and such.
(TODO: string constants are of particular concern as they come from
[potentially hostile] users, unlike e.g. table aliases, which come from
[not hostile] developers. Should all string constants also be removed before
attempting the match?)
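A minimal sketch of the regexp approach, including the string-constant removal
raised in the TODO, might look like this. The function name `modifies_table` is
hypothetical, the pattern list covers only the forms listed above, and the table
name is assumed to appear in the query exactly as given (no backticks, no
alternative qualification).

```python
import re

# Quoted string constants, allowing backslash escapes and doubled quotes.
SINGLE = r"'(?:[^'\\]|\\.|'')*'"
DOUBLE = r'"(?:[^"\\]|\\.|"")*"'
STRING_LITERAL = re.compile(SINGLE + "|" + DOUBLE)


def modifies_table(query, table):
    """Guess whether `query` modifies `table` using the dumb patterns above."""
    # Drop string constants first so user-supplied data cannot fake a match.
    text = STRING_LITERAL.sub("''", query)
    # The table name as a whole token, not part of a longer identifier.
    name = rf"(?<![\w.]){re.escape(table)}(?![\w.])"
    # '...' that must not contain the word SET (or WHERE, for DELETE).
    no_set = r"(?:(?!\bSET\b).)*?"
    no_where = r"(?:(?!\bWHERE\b).)*?"
    patterns = [
        rf"CREATE\s+TABLE\s+{name}",
        rf"DROP\s+(?:TABLE\s+)?{name}",
        rf"UPDATE\b{no_set}{name}{no_set}\bSET\b",
        rf"DELETE\b{no_where}{name}{no_where}\bWHERE\b",
        rf"ALTER\s+TABLE\s+{name}",
    ]
    flags = re.IGNORECASE | re.DOTALL
    return any(re.match(r"\s*" + p, text, flags) for p in patterns)
```

With string constants removed, a query such as
UPDATE t2 JOIN t3 ON t3.tag = ' t1 ' SET t2.a = 1 is no longer reported as
modifying t1, which it would be if the match ran on the raw text.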
2.3 Have the master put annotations
-----------------------------------
We could add a master option so that it injects into the query a marker that
tells which tables the query will affect, e.g. for the query
UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
the binlog will have
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
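To illustrate why the processing becomes trivial, here is a small sketch that
reads such an annotation and decides whether to keep a statement. The comment
syntax is taken from the example above; the function names, and the choice to
keep statements that carry no annotation, are assumptions made for the sketch.

```python
import re

# Matches a leading "/* !mysqlbinlog: updates a,b.c */" comment.
ANNOTATION = re.compile(r"\s*/\*\s*!mysqlbinlog:\s*updates\s+([^*]*?)\s*\*/")


def annotated_tables(statement):
    """Return the table list from the annotation, or None if there is none."""
    m = ANNOTATION.match(statement)
    if m is None:
        return None
    return [t.strip() for t in m.group(1).split(",") if t.strip()]


def keep_statement(statement, ignored):
    """Return False if the annotation names any table in `ignored`."""
    tables = annotated_tables(statement)
    if tables is None:
        return True  # assumption: statements without annotation pass through
    return not any(t in ignored for t in tables)
```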
2.4 Implement server functionality to ignore certain tables
-----------------------------------------------------------
We could add a general facility in the server to ignore certain tables:
SET SESSION ignored_tables = "db1.t1,db2.t2";
This would work similarly to --replicate-ignore-table, but in a general way not
restricted to the slave SQL thread.
It would then be trivial for mysqlbinlog to add such statements at the start
of the output, or probably the user could just do it manually with no need for
additional options for mysqlbinlog.
It might be useful to integrate this with the code that already handles
--replicate-ignore-db and similar slave options.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Knielsen): Add a mysqlbinlog option to filter certain kinds of statements (41)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter certain kinds of statements
CREATION DATE..: Mon, 10 Aug 2009, 15:30
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-BackLog
TASK ID........: 41 (http://askmonty.org/worklog/?tid=41)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Knielsen - Fri, 14 Aug 2009, 14:17)=-=-
High-Level Specification modified.
--- /tmp/wklog.41.old.6963 2009-08-14 14:17:32.000000000 +0300
+++ /tmp/wklog.41.new.6963 2009-08-14 14:17:32.000000000 +0300
@@ -1,6 +1,11 @@
The implementation will depend on design choices made in WL#40:
-- If we decide to parse the statement, SQL-verb filtering will be trivial
-- If we decide not to parse the statement, we still can reliably distinguish the
+
+Option 1:
+
+If we decide to parse the statement, SQL-verb filtering will be trivial
+
+Option 2:
+If we decide not to parse the statement, we still can reliably distinguish the
statement by matching the first characters against a set of patterns.
If we chose the second, we'll have to perform certain normalization before
-=-=(Psergey - Mon, 10 Aug 2009, 15:47)=-=-
High-Level Specification modified.
--- /tmp/wklog.41.old.13282 2009-08-10 15:47:13.000000000 +0300
+++ /tmp/wklog.41.new.13282 2009-08-10 15:47:13.000000000 +0300
@@ -2,3 +2,10 @@
- If we decide to parse the statement, SQL-verb filtering will be trivial
- If we decide not to parse the statement, we still can reliably distinguish the
statement by matching the first characters against a set of patterns.
+
+If we chose the second, we'll have to perform certain normalization before
+matching the patterns:
+ - Remove all comments from the command
+ - Remove all pre-space
+ - Compare the string case-insensitively
+ - etc
-=-=(Psergey - Mon, 10 Aug 2009, 15:35)=-=-
High-Level Specification modified.
--- /tmp/wklog.41.old.12689 2009-08-10 15:35:04.000000000 +0300
+++ /tmp/wklog.41.new.12689 2009-08-10 15:35:04.000000000 +0300
@@ -1 +1,4 @@
-
+The implementation will depend on design choices made in WL#40:
+- If we decide to parse the statement, SQL-verb filtering will be trivial
+- If we decide not to parse the statement, we still can reliably distinguish the
+statement by matching the first characters against a set of patterns.
-=-=(Psergey - Mon, 10 Aug 2009, 15:31)=-=-
Dependency created: 39 now depends on 41
DESCRIPTION:
Add a mysqlbinlog option to filter certain kinds of statements, i.e. (syntax
subject to discussion):
mysqlbinlog --exclude='alter table,drop table,alter database,...'
HIGH-LEVEL SPECIFICATION:
The implementation will depend on design choices made in WL#40:
Option 1:
If we decide to parse the statement, SQL-verb filtering will be trivial
Option 2:
If we decide not to parse the statement, we can still reliably distinguish the
statement by matching the first characters against a set of patterns.
If we choose the second option, we'll have to perform certain normalization
before matching the patterns:
- Remove all comments from the command
- Remove all leading whitespace
- Compare the string case-insensitively
- etc
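The normalization steps for Option 2 could be prototyped as below. This is only
a sketch: `normalize` and `is_excluded` are hypothetical names, the exclude list
uses the comma-separated syntax from the description (which is itself subject to
discussion), and comment removal is naive in that it does not skip comment
markers inside string literals and also drops /*! ... */ sections.

```python
import re

# C-style comments, "-- " comments and "#" comments up to end of line.
COMMENT = re.compile(r"/\*.*?\*/|--[ \t][^\n]*|#[^\n]*", re.DOTALL)


def normalize(statement):
    """Strip comments, collapse whitespace and lower-case the statement."""
    without_comments = COMMENT.sub(" ", statement)
    return " ".join(without_comments.split()).lower()


def is_excluded(statement, exclude):
    """Return True if the statement starts with one of the excluded verbs.

    `exclude` is a comma-separated list such as "alter table,drop table".
    """
    text = normalize(statement)
    for verb in exclude.split(","):
        prefix = " ".join(verb.split()).lower()
        # Require a word boundary so "drop table" does not match
        # "drop tablespace".
        if prefix and (text == prefix or text.startswith(prefix + " ")):
            return True
    return False
```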
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Progress (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 09:13)=-=-
2009-8-10: spent 3.5 hrs analysing the current implementation of UNION/UNION ALL;
came up with an idea for bypassing the temporary table when executing UNION ALL
2009-8-11: spent 6.5 hrs preparing a hack that executes UNION ALL without a temporary table
2009-8-12: spent 4 more hrs investigating in the debugger different cases that use union operations
(in subqueries, in queries that do not use tables)
2009-8-13: spent 6 hrs putting together and publishing an HLS document for the task
Worked 20 hours and estimate 0 hours remain (original estimate increased by 20 hours).
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Supervisor updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Bothorsen
+Monty
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Version updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Privacy level updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-y
+n
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. However, for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause
this is not necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be equivalently
replaced by UNION if another UNION occurs later in the sequence.
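The substitution rule can be sketched as a small normalization step. This is
only an illustration, not server code: the names UnionOp and
normalize_union_ops are hypothetical, and the operators of a union unit are
modelled as a plain vector in the order in which they appear.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one entry per union operator, in query order.
enum class UnionOp { Distinct, All };   // UNION vs UNION ALL

// Replace every UNION ALL that precedes the last UNION by UNION.
// The last UNION removes all duplicates accumulated so far, so the
// result set of the unit does not change.
std::vector<UnionOp> normalize_union_ops(std::vector<UnionOp> ops)
{
  std::size_t last_distinct = ops.size();       // marker: no UNION found
  for (std::size_t i = 0; i < ops.size(); i++)
    if (ops[i] == UnionOp::Distinct)
      last_distinct = i;
  if (last_distinct == ops.size())
    return ops;                                 // only UNION ALL: keep as is
  for (std::size_t i = 0; i < last_distinct; i++)
    ops[i] = UnionOp::Distinct;
  return ops;
}
```

With this model the operators of query (4) normalize to those of query (1),
while the operators of queries (2) and (3) stay unchanged.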
MySQL does not accept nested unions. For example the following valid SQL query
is considered by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling SELECT_LEX_UNIT::prepare.
The method validates each of the select constructs of the unit and then
checks that all selects are compatible: the selects must return the same
number of columns, and for each column position k there must be a type to
which the types of the k-th columns of all selects can be coerced. This type
becomes the type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will
be of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding
column of the result set will be varchar(20). If the columns have different
collations, then a collation from which all these collations can be derived is
looked for, and it is assigned as the collation of the third column in the
result set.
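The type aggregation for one column position can be illustrated with a toy
model. It is a sketch under strong assumptions, not the server's real type
system: only the three numeric kinds and varchar from the example above are
modelled, collations are ignored, and the names Kind, ColType and
unify_column are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy type model; the enumerators are ordered from narrowest to widest.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColType {
  Kind kind;
  unsigned length;   // only meaningful for Varchar, 0 otherwise
};

// Returns the type of column k of the union unit, given the types of
// column k in each select.
// Assumptions: numeric kinds widen Int -> BigInt -> Double, varchar keeps
// the maximal length, and a mix of numeric and varchar becomes varchar.
ColType unify_column(const std::vector<ColType> &types)
{
  assert(!types.empty());
  ColType result = types[0];
  for (const ColType &t : types) {
    result.kind = std::max(result.kind, t.kind);
    result.length = std::max(result.length, t.length);
  }
  return result;
}
```

For the columns b1, b2, b3 this yields double, and for c1, c2, c3 it yields
varchar(20), which matches the example in the text.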
After compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
rows are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select the method first
stores the fields of the row in the fields of the record buffer of the
temporary table. To do this the method calls the function fill_record. All
needed type conversions of the field values are performed when they are stored
in the record buffer. After this the method select_union::send_data calls the
ha_write_row handler function to write the record from the buffer to the
temporary table. A possible duplicate key error that occurs on an attempt to
write a duplicate row is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client).
If there is an ORDER BY clause to be applied to the result of the union unit,
then the rows read from the temporary table have to be sorted first.
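The current execution scheme can be modelled in a few lines. The sketch below
is not server code: rows are vectors of integers, a std::set stands in for the
unique index, and the function name exec_union_unit is hypothetical. It only
shows how enabling the index up to the last UNION operation and disabling it
afterwards produces the UNION and UNION ALL semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

typedef std::vector<int> Row;
typedef std::vector<Row> RowSet;   // the rows produced by one select

// selects[i] holds the rows of the i-th select; all_op[i] is true if the
// operator between select i and select i+1 is UNION ALL.
RowSet exec_union_unit(const std::vector<RowSet> &selects,
                       const std::vector<bool> &all_op)
{
  assert(!selects.empty() && all_op.size() + 1 == selects.size());
  // Number of leading selects written with the unique index enabled:
  // everything up to the second operand of the last UNION.
  std::size_t checked = 0;
  for (std::size_t i = 0; i < all_op.size(); i++)
    if (!all_op[i])
      checked = i + 2;

  RowSet temp_table;            // stands in for the temporary table
  std::set<Row> unique_index;   // stands in for the index on all fields
  for (std::size_t i = 0; i < selects.size(); i++)
    for (const Row &row : selects[i]) {
      if (i < checked && !unique_index.insert(row).second)
        continue;               // duplicate key: the error is ignored
      temp_table.push_back(row);
    }
  return temp_table;            // read afterwards and sent to the client
}
```

In this model every row passes through temp_table even when all_op contains
only UNION ALL operators, which is exactly the overhead the task wants to
remove.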
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. This would
improve the performance of UNION ALL operations since writing to the temporary
table and reading from it would not be needed anymore. When the result set is
so big that the temporary table cannot be kept in main memory the
performance gains would be significant. Besides, the client could get the first
result rows immediately as it would not have to wait until all selects have
been executed.
To keep a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
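The interceptor idea can be tried out in isolation with a toy model that does
not depend on any server code. Everything in it is an assumption made for
illustration: the names result_sink, toy_select_union and
toy_select_union_send are hypothetical, a select produces rows of ints, the
result columns are doubles, and a std::vector stands in for the client
connection.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<double> Row;   // a row in the result column types

// Simplified stand-in for the common interceptor interface.
struct result_sink {
  virtual ~result_sink() {}
  virtual bool send_data(const std::vector<int> &values) = 0;  // true = error
  virtual bool send_eof() = 0;
};

// Models select_union: each row is converted in the record buffer and
// then written to the temporary table.
struct toy_select_union : result_sink {
  std::vector<Row> temp_table;
  Row record;                      // record buffer of the temporary table
  void fill_record(const std::vector<int> &values) {
    record.assign(values.begin(), values.end());   // int -> double
  }
  bool send_data(const std::vector<int> &values) override {
    fill_record(values);
    temp_table.push_back(record);
    return false;
  }
  bool send_eof() override { return false; }
};

// Models select_union_send: the record buffer is still used for the type
// conversion, but the converted row goes straight to the output stream.
struct toy_select_union_send : toy_select_union {
  std::vector<Row> &output;        // stands in for the client connection
  bool eof_sent = false;
  explicit toy_select_union_send(std::vector<Row> &out) : output(out) {}
  bool send_data(const std::vector<int> &values) override {
    fill_record(values);
    output.push_back(record);
    return false;
  }
  bool send_eof() override { eof_sent = true; return false; }
};
```

Code that produces rows only sees result_sink, so switching a union unit from
the temporary table to direct sending amounts to handing it a different sink
object, which is the design choice proposed above.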
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
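A minimal sketch of such a bitmap-driven variant follows. It is a toy model
with hypothetical names (Value, send_row): a field value is either an integer
or a double, the only conversion is integer to double, and the function
reports how many fields went through the record buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy field value: either an integer or a double.
struct Value {
  bool is_double;   // tells which of the two members below is valid
  long i;
  double d;
};

// needs_conversion[k] is the per-select bitmap: true if field k must be
// converted (here: integer -> double) through the record buffer.
// Returns the number of fields that were copied into the record buffer.
std::size_t send_row(const std::vector<Value> &fields,
                     const std::vector<bool> &needs_conversion,
                     std::vector<Value> &record_buffer,
                     std::vector<Value> &output)
{
  assert(fields.size() == needs_conversion.size());
  record_buffer.resize(fields.size());
  std::size_t copied = 0;
  for (std::size_t k = 0; k < fields.size(); k++) {
    if (needs_conversion[k]) {
      record_buffer[k].is_double = true;
      record_buffer[k].i = 0;
      record_buffer[k].d = static_cast<double>(fields[k].i);
      output.push_back(record_buffer[k]);   // converted value
      copied++;
    } else {
      output.push_back(fields[k]);          // sent as is, no buffer copy
    }
  }
  return copied;
}
```

A select whose columns already have the result types would pass a bitmap of
all false values and would never touch the record buffer.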
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY
is used at the top level of a query, then any UNION ALL operation after the
last UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
undergo a serious re-work.
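The split between the two interceptors can be computed from the operator
sequence alone. The following sketch is only an illustration with hypothetical
names (Sink, assign_sinks); it says which kind of interceptor each select of
the unit would get, and leaves out the execution itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Which interceptor a select writes to in this model.
enum class Sink { TempTable, Direct };  // select_union vs select_union_send

// all_op[i] is true if the operator between select i and select i+1 is
// UNION ALL. All selects up to the second operand of the last UNION need
// the temporary table; the selects after it can send rows directly.
std::vector<Sink> assign_sinks(const std::vector<bool> &all_op)
{
  std::size_t n_selects = all_op.size() + 1;
  std::size_t with_table = 0;            // 0 if the unit has no UNION
  for (std::size_t i = 0; i < all_op.size(); i++)
    if (!all_op[i])
      with_table = i + 2;
  std::vector<Sink> sinks(n_selects, Sink::Direct);
  for (std::size_t i = 0; i < with_table; i++)
    sinks[i] = Sink::TempTable;
  return sinks;
}
```

For query (3) the first two selects get the temporary table and the third one
is sent directly; for query (2) all three selects are sent directly, which is
the case covered by section 2.1.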
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Progress (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 09:13)=-=-
2009-8-10: spent 3.5 hrs for analysis of the current implementation of UNION/UNION ALL
came up with the idea how to bypass temporary table when executing UNION ALL
2009-8-11: spent 6.5 hrs to prepare a hack that executed UNION ALL without temporary table
2009-8-12: spent 4 hrs more to investigate in debugger different cases with usage of union operations
(in subqueries, in queries that do not use tables)
2009-8-13: spent 6 hrs to put together and to publish an HLS document for the task
Worked 20 hours and estimate 0 hours remain (original estimate increased by 20 hours).
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Supervisor updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Bothorsen
+Monty
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Version updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Privacy level updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-y
+n
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently when any union operation is executed the rows received from its
operands are always sent to a temporary table. Meanwhile for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause it
is not necessary. In this case the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
replaced by UNION if another UNION occurs later in the sequence, since the later
UNION removes duplicates from all rows accumulated so far.
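The equivalence rule can be checked on a toy model. The sketch below is not
MySQL code: rows are plain strings, a union unit is evaluated strictly left to
right, and the names Rows, UnionOp, dedup and eval_union_unit are hypothetical
helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: a row is a string, a select is a vector of rows.
using Rows = std::vector<std::string>;

enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// Remove duplicate rows, keeping the first occurrence of each.
Rows dedup(const Rows &in) {
  std::set<std::string> seen;
  Rows out;
  for (const auto &r : in)
    if (seen.insert(r).second) out.push_back(r);
  return out;
}

// ops[i] combines the result accumulated so far with selects[i + 1].
Rows eval_union_unit(const std::vector<Rows> &selects,
                     const std::vector<UnionOp> &ops) {
  assert(selects.size() == ops.size() + 1);
  Rows acc = selects[0];
  for (std::size_t i = 0; i < ops.size(); ++i) {
    acc.insert(acc.end(), selects[i + 1].begin(), selects[i + 1].end());
    // A UNION deduplicates everything accumulated up to this point.
    if (ops[i] == UnionOp::Distinct) acc = dedup(acc);
  }
  return acc;
}
```

With three selects returning {a,a}, {a,b} and {b,c,c}, the operation sequences
of queries (1) and (4) both yield a,b,c in this model, while the sequence of
query (3) keeps the duplicates contributed by the last select.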
MySQL does not accept nested unions. For example the following valid SQL query
is considered by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare.
The method first validates each of the select constructs of the unit and then
checks that all selects are compatible. The method checks that the selects return
the same number of columns and that for each column position k there is a type
to which the types of all columns at that position can be coerced. This type is
considered as the type of column k of the result set returned by the union unit.
For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively then the second column of the union unit will be
of the type double. If the types of the columns c1,c2,c3 are specified as
varchar(10), varchar(20), varchar(10) then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
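The column type check can be illustrated with a heavily simplified model. The
sketch below assumes only four kinds of types and ignores signedness, precision
and collations; Kind, ColType, common_type and aggregate_column_types are
hypothetical names and do not correspond to server code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type model; numeric kinds are declared in widening order.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColType {
  Kind kind;
  unsigned length;  // only meaningful for Varchar
};

bool is_numeric(Kind k) { return k != Kind::Varchar; }

// Common type of two column types in this toy model:
// numeric kinds widen to the larger kind, varchars take the larger length.
// Mixing numeric and varchar columns is outside the scope of the sketch.
ColType common_type(const ColType &a, const ColType &b) {
  if (is_numeric(a.kind) && is_numeric(b.kind))
    return {std::max(a.kind, b.kind), 0u};
  assert(a.kind == Kind::Varchar && b.kind == Kind::Varchar);
  return {Kind::Varchar, std::max(a.length, b.length)};
}

// Returns false if the selects disagree on the number of columns; otherwise
// fills `result` with one aggregated type per column position.
bool aggregate_column_types(const std::vector<std::vector<ColType>> &selects,
                            std::vector<ColType> &result) {
  if (selects.empty()) return false;
  result = selects[0];
  for (std::size_t s = 1; s < selects.size(); ++s) {
    if (selects[s].size() != result.size()) return false;
    for (std::size_t k = 0; k < result.size(); ++k)
      result[k] = common_type(result[k], selects[s][k]);
  }
  return true;
}
```

For columns of the types int, bigint, double the model yields double, and for
varchar(10), varchar(20), varchar(10) it yields varchar(20), matching the
example given for query (1).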
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize(), then it is executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added, for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
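The index switching described above can be sketched as follows. This is a toy
model under stated assumptions, not the server implementation: rows are strings,
ToyTempTable stands in for the temporary table with its unique index, and
run_union_unit is a hypothetical driver playing the role of the execution loop.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the temporary table and its unique index.
class ToyTempTable {
 public:
  void enable_unique_index() { index_enabled_ = true; }
  void disable_unique_index() { index_enabled_ = false; }

  // Returns true if the row was stored, false if it was dropped as a
  // duplicate (the analogue of the ignored duplicate-key error).
  bool write_row(const std::string &row) {
    if (index_enabled_ && !unique_index_.insert(row).second) return false;
    rows_.push_back(row);
    return true;
  }
  const std::vector<std::string> &rows() const { return rows_; }

 private:
  bool index_enabled_ = false;
  std::set<std::string> unique_index_;
  std::vector<std::string> rows_;
};

// op_is_distinct[i] is true for UNION and false for UNION ALL; it combines
// the rows accumulated so far with selects[i + 1].
std::vector<std::string> run_union_unit(
    const std::vector<std::vector<std::string>> &selects,
    const std::vector<bool> &op_is_distinct) {
  assert(selects.size() == op_is_distinct.size() + 1);
  // Find the select that is the second operand of the last UNION.
  bool has_union = false;
  std::size_t last_union_operand = 0;
  for (std::size_t i = 0; i < op_is_distinct.size(); ++i)
    if (op_is_distinct[i]) { has_union = true; last_union_operand = i + 1; }

  ToyTempTable table;
  if (has_union) table.enable_unique_index();  // before the first select
  for (std::size_t s = 0; s < selects.size(); ++s) {
    for (const auto &row : selects[s]) table.write_row(row);
    // After the last UNION operation duplicates are no longer filtered.
    if (has_union && s == last_union_operand) table.disable_unique_index();
  }
  return table.rows();
}
```

In this model a unit with only UNION ALL operations never touches the index,
which is exactly the case where the temporary table adds nothing but cost.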
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible error on duplicate key that occurs with an attempt to write a duplicate
row is ignored.
After all rows received from all selects have been placed into the temporary
table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to result of the union unit then the
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. It would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. When the result set is so big
that the temporary table cannot be held in main memory the
performance gains would be significant. Besides, the client could get the first
result rows immediately, as it would not have to wait until all selects have
been executed.
To keep a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object of the class that
we'll call select_union_send (by analogy with the class select_send) shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
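The division of labour between the two interceptors can be sketched with
simplified stand-in types. Nothing below is the server API: Row, Client,
ToyUnionSink and ToyUnionSendSink are hypothetical, the select count passed to
the constructor is an assumption made so that send_eof can recognise the last
select, and a return value of false is assumed to mean success.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins: a row is a vector of strings and the "client"
// collects what would be written to the output stream.
using Row = std::vector<std::string>;

struct Client {
  std::vector<std::string> header;
  std::vector<Row> rows;
  bool eof = false;
};

// Plays the role of select_union: every row goes to the temporary table.
class ToyUnionSink {
 public:
  virtual ~ToyUnionSink() = default;
  virtual bool send_fields(const std::vector<std::string> &) { return false; }
  virtual bool send_data(const Row &row) {
    table_.push_back(row);
    return false;
  }
  virtual bool send_eof() { return false; }
  const std::vector<Row> &table() const { return table_; }

 protected:
  std::vector<Row> table_;
};

// Plays the role of select_union_send: rows pass through a record buffer
// (where type conversion would happen) straight to the client.
class ToyUnionSendSink : public ToyUnionSink {
 public:
  ToyUnionSendSink(Client &client, std::size_t select_count)
      : client_(client), remaining_(select_count) {}

  bool send_fields(const std::vector<std::string> &names) override {
    // The row format is sent only once, before the first select.
    if (!header_sent_) { client_.header = names; header_sent_ = true; }
    return false;
  }
  bool send_data(const Row &row) override {
    record_buffer_ = row;  // conversion to the result types would go here
    client_.rows.push_back(record_buffer_);
    return false;
  }
  bool send_eof() override {
    // Only the last select of the unit ends the result set.
    if (remaining_ > 0 && --remaining_ == 0) client_.eof = true;
    return false;
  }

 private:
  Client &client_;
  std::size_t remaining_;
  bool header_sent_ = false;
  Row record_buffer_;
};
```

Because the selects only see the base-class interface, swapping the shared
interceptor object is enough to redirect all rows of the unit.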
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data what fields should be sent to
the record buffer for type conversion and what can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
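A minimal sketch of the bitmap idea, assuming rows of strings and a single
made-up conversion. The names SendStats, convert_to_double_text and send_row are
hypothetical; the real variant of fill_record would work on Item and Field
objects instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts how many fields took each path, to make the saving visible.
struct SendStats {
  std::size_t buffered = 0;  // fields copied into the record buffer
  std::size_t direct = 0;    // fields sent without an intermediate copy
};

// Stand-in for a real type conversion: render an integer literal as a double.
std::string convert_to_double_text(const std::string &v) { return v + ".0"; }

// needs_conversion[k] is the per-select bitmap entry for column k.
// Returns the fields as they would appear in the output stream.
std::vector<std::string> send_row(const std::vector<std::string> &row,
                                  const std::vector<bool> &needs_conversion,
                                  std::vector<std::string> &record_buffer,
                                  SendStats &stats) {
  assert(row.size() == needs_conversion.size());
  record_buffer.resize(row.size());
  std::vector<std::string> out;
  out.reserve(row.size());
  for (std::size_t k = 0; k < row.size(); ++k) {
    if (needs_conversion[k]) {
      // Only flagged fields go through the record buffer.
      record_buffer[k] = convert_to_double_text(row[k]);
      out.push_back(record_buffer[k]);
      ++stats.buffered;
    } else {
      out.push_back(row[k]);
      ++stats.direct;
    }
  }
  return out;
}
```

Each select would carry its own bitmap, since a column that needs conversion in
one select may already have the result type in another.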
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than it is done in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
undergo a serious re-work.
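The split between the two groups of selects can be sketched as below. The
functions first_streamed_select and run_mixed_unit are hypothetical, rows are
strings, and the sketch flushes the temporary table to the client before it
starts streaming; that ordering is an assumption, the specification does not
prescribe one.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// op_is_distinct[i] is true for UNION, false for UNION ALL, and combines the
// rows accumulated so far with select i + 1.
// Returns the index of the first select whose rows can bypass the temporary
// table: the one right after the second operand of the last UNION.
std::size_t first_streamed_select(const std::vector<bool> &op_is_distinct) {
  std::size_t first = 0;  // no UNION at all: every select can stream
  for (std::size_t i = 0; i < op_is_distinct.size(); ++i)
    if (op_is_distinct[i]) first = i + 2;
  return first;
}

// Returns the rows in the order the client would receive them.
std::vector<std::string> run_mixed_unit(
    const std::vector<std::vector<std::string>> &selects,
    const std::vector<bool> &op_is_distinct) {
  assert(selects.size() == op_is_distinct.size() + 1);
  const std::size_t split = first_streamed_select(op_is_distinct);

  // First group: shared "select_union"-like sink with duplicate elimination.
  std::set<std::string> unique_index;
  std::vector<std::string> temp_table;
  for (std::size_t s = 0; s < split && s < selects.size(); ++s)
    for (const auto &row : selects[s])
      if (unique_index.insert(row).second) temp_table.push_back(row);

  // Flush the temporary table, then stream the remaining selects directly.
  std::vector<std::string> client = temp_table;
  for (std::size_t s = split; s < selects.size(); ++s)
    client.insert(client.end(), selects[s].begin(), selects[s].end());
  return client;
}
```

For the operation sequence of query (3) only the third select streams; for
query (4) the split index equals the number of selects, so nothing streams.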
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without ORDER BY clause send
any row received from an operand of a UNION operation directly to the output
stream as soon as it has been checked by a lookup in the temporary table that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Progress (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 20
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 09:13)=-=-
2009-8-10: spent 3.5 hrs for analysis of the current implementation of UNION/UNION ALL
came up with the idea how to bypass temporary table when executing UNION ALL
2009-8-11: spent 6.5 hrs to prepare a hack that executed UNION ALL without temporary table
2009-8-12: spent 4 hrs more to investigate in debugger different cases with usage of union operations
(in subqueries, in queries that do not use tables)
2009-8-13: spent 6 hrs to put together and to publish an HLS document for the task
Worked 20 hours and estimate 0 hours remain (original estimate increased by 20 hours).
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Supervisor updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Bothorsen
+Monty
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Version updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Privacy level updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-y
+n
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation used at
the top level of a query without an ORDER BY clause this is unnecessary: the
rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common
cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3); (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3); (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1). At the same time query (3) is
not equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be equivalently
replaced by UNION if another UNION occurs further on in the sequence.
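The replacement rule above can be sketched in a few lines of C++. This is only an illustration under assumed names: UnionOp and normalize_union_ops are hypothetical and do not exist in the server code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: ops[i] is the operation between select i and select i+1.
enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// Replace every UNION ALL that is followed by a later UNION with UNION.
// The returned sequence describes an equivalent union unit.
std::vector<UnionOp> normalize_union_ops(std::vector<UnionOp> ops) {
  bool union_seen_later = false;
  // Walk right to left so that we know whether a UNION occurs further on.
  for (std::size_t i = ops.size(); i-- > 0;) {
    if (ops[i] == UnionOp::Distinct)
      union_seen_later = true;
    else if (union_seen_later)
      ops[i] = UnionOp::Distinct;
  }
  return ops;
}
```

Applied to query (4), the sequence (UNION ALL, UNION) becomes (UNION, UNION), which is query (1); the sequence of query (3), (UNION, UNION ALL), is left unchanged.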
MySQL does not accept nested unions. For example, the following valid SQL query
is rejected by MySQL Server as erroneous:
  ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
  union all
  ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit in the method SELECT_LEX_UNIT::prepare.
The method first validates each of the select constructs of the unit and then
checks that all selects are compatible. It checks that the selects return
the same number of columns and that for each column position k there is a type
to which the types of the k-th columns of all selects can be coerced. This type
is taken as the type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and is assigned as the collation of the third column in the result set.
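The type aggregation in the example above can be sketched as follows. This is a heavily simplified model, not the server's logic: Kind, ColType and aggregate_type are hypothetical names, only the four types mentioned in the example are covered, and collations are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified column type. The numeric kinds are declared in
// order of increasing width so that they can be compared directly.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColType {
  Kind kind;
  unsigned length;  // only meaningful for Varchar
};

// Return a type to which all the given column types can be coerced.
// Assumption of this sketch: numeric kinds are never mixed with Varchar.
ColType aggregate_type(const std::vector<ColType>& types) {
  assert(!types.empty());
  ColType result = types.front();
  for (const ColType& t : types) {
    assert((t.kind == Kind::Varchar) == (result.kind == Kind::Varchar));
    if (t.kind == Kind::Varchar)
      result.length = std::max(result.length, t.length);  // longest wins
    else if (t.kind > result.kind)
      result.kind = t.kind;  // widest numeric kind wins
  }
  return result;
}
```

With this model int, bigint and double aggregate to double, and varchar(10), varchar(20), varchar(10) aggregate to varchar(20), matching the example for query (1).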
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize(), then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling usage of the
unique index defined on all fields of the temporary table. The index is never
used if only UNION ALL operations occur in the unit. Otherwise it is enabled
before the first select is executed and disabled after the last UNION operation.
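The effect of switching the unique index on and off can be modelled with a small simulation. It is a sketch under assumptions: rows are plain strings, the temporary table is a vector, the unique index is a std::set, and the name exec_union_unit is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::string;               // stand-in for a result row
enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// selects[i] holds the rows produced by select i; ops[i] joins select i and
// select i+1. Returns the content of the simulated temporary table.
std::vector<Row> exec_union_unit(const std::vector<std::vector<Row>>& selects,
                                 const std::vector<UnionOp>& ops) {
  assert(selects.size() == ops.size() + 1);
  // Index of the second operand of the last UNION (0 if there is no UNION).
  std::size_t last_distinct = 0;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct) last_distinct = i + 1;

  std::vector<Row> temp_table;  // rows in insertion order
  std::set<Row> unique_index;   // models the unique index over all fields
  for (std::size_t i = 0; i < selects.size(); ++i) {
    // Enabled from the first select up to the last operand of a UNION.
    const bool index_enabled = last_distinct > 0 && i <= last_distinct;
    for (const Row& row : selects[i]) {
      // A duplicate-key error on write is silently ignored.
      if (index_enabled && !unique_index.insert(row).second) continue;
      temp_table.push_back(row);
    }
  }
  return temp_table;
}
```

For three selects returning (a, a), (a, b) and (b, b), the sequence UNION, UNION ALL yields a, b, b, b in this model, while UNION ALL, UNION yields a, b.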
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
duplicate-key error that may occur on an attempt to write a duplicate row is
ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. In the cases when the result set is
so big that the temporary table cannot be kept in main memory the performance
gains would be significant. Besides, the client could get the first result rows
at once, as it would not have to wait until all selects have been executed.
To prevent a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects of the union unit with an
interceptor object different from the one they use now. In the current code
they use an object of the class select_union, which is derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be redefined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
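The idea of swapping the interceptor can be tried out in isolation with the following self-contained sketch. It does not use the server's types: Output, ResultSink, TableSink, SendSink and run_unit are hypothetical stand-ins, type conversion through the record buffer is left out, and returning true on error is merely the convention chosen for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for the client connection.
struct Output {
  std::vector<Row> rows;
  bool eof = false;
};

// Simplified interceptor interface: receives rows from the selects.
class ResultSink {
 public:
  virtual ~ResultSink() = default;
  virtual bool send_data(const Row& row) = 0;  // returns true on error
  virtual bool send_eof() = 0;
};

// Models the current behaviour: every row is accumulated in a temporary table.
class TableSink : public ResultSink {
 public:
  bool send_data(const Row& row) override {
    temp_table.push_back(row);
    return false;
  }
  bool send_eof() override { return false; }
  std::vector<Row> temp_table;
};

// Models the proposed behaviour: rows go straight to the client.
class SendSink : public TableSink {
 public:
  explicit SendSink(Output& out) : out_(out) {}
  bool send_data(const Row& row) override {
    assert(!out_.eof);         // no rows may follow the end-of-data signal
    out_.rows.push_back(row);  // nothing is written to temp_table
    return false;
  }
  bool send_eof() override {
    out_.eof = true;  // signalled once, after the last select
    return false;
  }

 private:
  Output& out_;
};

// All selects of the unit share one sink object.
bool run_unit(const std::vector<std::vector<Row>>& selects, ResultSink& sink) {
  for (const auto& rows : selects)
    for (const Row& row : rows)
      if (sink.send_data(row)) return true;
  return sink.send_eof();
}
```

Because run_unit only sees the base interface, choosing between TableSink and SendSink is the only change needed to bypass the table in this model.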
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
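A per-select bitmap of this kind could be built as in the sketch below. The names Kind, conversion_bitmap and fields_copied are hypothetical, and the sketch assumes that a column needs conversion exactly when its type differs from the result type, ignoring lengths and collations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Int, BigInt, Double, Varchar };  // simplified column types

// Bit k is set when column k of a select must be converted to the type of
// column k of the result set, i.e. must pass through the record buffer.
std::vector<bool> conversion_bitmap(const std::vector<Kind>& select_types,
                                    const std::vector<Kind>& result_types) {
  assert(select_types.size() == result_types.size());
  std::vector<bool> needs_conversion(select_types.size());
  for (std::size_t k = 0; k < select_types.size(); ++k)
    needs_conversion[k] = select_types[k] != result_types[k];
  return needs_conversion;
}

// Number of fields per row that would still be copied to the record buffer.
std::size_t fields_copied(const std::vector<bool>& needs_conversion) {
  std::size_t n = 0;
  for (bool bit : needs_conversion) n += bit ? 1 : 0;
  return n;
}
```

For a select returning (int, bigint, varchar) into a result set of (int, double, varchar) only the second field would go through the record buffer in this model.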
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the current
implementation. More exactly, the rows from any select that follows the second
operand of the last UNION operation could be sent directly to the output
stream. In this case two interceptor objects have to be created: one, of the
type select_union, is shared by the selects for which UNION operations are
performed; another, of the type select_union_send, is shared by the remaining
selects. For this optimization the method SELECT_LEX_UNIT::exec is to undergo a
serious re-work.
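Which selects would get which interceptor can be derived from the operator sequence alone. The following sketch uses hypothetical names (UnionOp, first_streamable_select) and only computes the split point; it says nothing about how SELECT_LEX_UNIT::exec would have to be restructured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// ops[i] joins select i and select i+1. Returns the index of the first select
// whose rows could be sent directly to the output stream. A result equal to
// the number of selects (ops.size() + 1) means that no select qualifies.
std::size_t first_streamable_select(const std::vector<UnionOp>& ops) {
  std::size_t first = 0;  // with no UNION at all every select qualifies
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct)
      first = i + 2;  // the select after the second operand of this UNION
  return first;
}
```

For query (3) the function returns 2, so only the third select would share the select_union_send object; for query (4) it returns 3, meaning that all three selects keep using select_union.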
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
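Optimization 2 of the list above amounts to checking each row against the rows seen so far and passing it on at once if it is new. A minimal sketch, assuming rows are plain strings and using a hash set in place of the lookup in the temporary table (StreamingDistinct is a hypothetical name):

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

using Row = std::string;  // stand-in for a result row

class StreamingDistinct {
 public:
  // Returns true if the row was new and has been passed on to the output.
  bool send_data(const Row& row) {
    if (!seen_.insert(row).second) return false;  // duplicate: drop it
    output.push_back(row);  // stand-in for writing to the client
    return true;
  }
  std::vector<Row> output;

 private:
  std::unordered_set<Row> seen_;  // plays the role of the temporary table
};
```

The set of seen rows is still needed for duplicate detection, so this variant would save the final read pass and the client's waiting time, not the storage.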
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Supervisor updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Bothorsen
+Monty
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Version updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Privacy level updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-y
+n
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), whereas query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be equivalently
replaced by UNION if another UNION occurs later in the sequence.
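The equivalence rule can be checked on a toy model. The following is only a
sketch, assuming that a union sequence is evaluated strictly left to right and
that rows are plain integers; the names Rows, SetOp, distinct and
eval_union_unit are hypothetical and do not come from the server code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A result set is modelled as a list of integer "rows" (illustrative only).
using Rows = std::vector<int>;

enum class SetOp { Union, UnionAll };

// Remove duplicates, keeping the first occurrence of every row.
Rows distinct(const Rows &in) {
  Rows out;
  for (int row : in)
    if (std::find(out.begin(), out.end(), row) == out.end())
      out.push_back(row);
  return out;
}

// Evaluate operands[0] ops[0] operands[1] ops[1] ... strictly left to right.
Rows eval_union_unit(const std::vector<Rows> &operands,
                     const std::vector<SetOp> &ops) {
  assert(!operands.empty() && ops.size() + 1 == operands.size());
  Rows acc = operands[0];
  for (std::size_t i = 0; i < ops.size(); ++i) {
    acc.insert(acc.end(), operands[i + 1].begin(), operands[i + 1].end());
    if (ops[i] == SetOp::Union)
      acc = distinct(acc);  // UNION de-duplicates everything gathered so far
  }
  return acc;
}
```

With the same three operands, the operator sequences {UNION, UNION} and
{UNION ALL, UNION} give the same rows in this model, while {UNION, UNION ALL}
keeps the duplicates contributed by the last operand.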
MySQL does not accept nested unions. For example the following valid SQL query
is considered by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. The method first validates each of the select
constructs of the unit and then checks that all selects are compatible: they
must return the same number of columns, and for each column position k there
must be a type to which the types of the k-th columns of all selects can be
coerced. This type is taken as the type of column k of the result set returned
by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and is assigned as the collation of the third column in the result set.
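The two type examples above can be reproduced with a small sketch. It assumes
a deliberately simplified rule set (numeric types widen in the order int,
bigint, double; varchar takes the largest length) and gives no rule for mixed
numeric/character columns or for collations. Kind, ColumnType, common_type and
result_column_type are hypothetical names, not server functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Numeric kinds are declared in widening order; Varchar is a separate family.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColumnType {
  Kind kind;
  unsigned length;  // only meaningful for Varchar
};

bool is_numeric(Kind kind) { return kind != Kind::Varchar; }

// Find a type both inputs can be coerced to. Returns false when this sketch
// has no rule for the pair (mixed numeric/character columns).
bool common_type(const ColumnType &a, const ColumnType &b, ColumnType *out) {
  if (is_numeric(a.kind) && is_numeric(b.kind)) {
    *out = ColumnType{std::max(a.kind, b.kind), 0u};
    return true;
  }
  if (a.kind == Kind::Varchar && b.kind == Kind::Varchar) {
    *out = ColumnType{Kind::Varchar, std::max(a.length, b.length)};
    return true;
  }
  return false;
}

// Fold the k-th column types of all selects into the type of result column k.
bool result_column_type(const std::vector<ColumnType> &operand_types,
                        ColumnType *out) {
  assert(!operand_types.empty() && out != nullptr);
  ColumnType acc = operand_types[0];
  for (std::size_t i = 1; i < operand_types.size(); ++i) {
    ColumnType next;
    if (!common_type(acc, operand_types[i], &next)) return false;
    acc = next;
  }
  *out = acc;
  return true;
}
```

Folding the types pairwise is a design choice of the sketch; it works here
because both toy rules (widest numeric kind, largest length) are associative.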
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force the selects to
send rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects
for the selects such that the JOIN::result field refers to an object of the
class select_union. All selects from a union unit share the same select_union
object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling usage of the
unique index defined on all fields of the temporary table. The index is never
used if only UNION ALL operations occur in the unit. Otherwise it is enabled
before the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible duplicate-key error that occurs on an attempt to write a duplicate
row is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
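The index switching described above can be modelled in a few lines. This is a
sketch under simplifying assumptions (single-column integer rows, a std::set
in place of the unique index, no ORDER BY); TempTable and exec_union_unit are
hypothetical stand-ins, not the server's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Row = int;  // single-column rows keep the sketch short
enum class SetOp { Union, UnionAll };

// Stand-in for the temporary table with a unique index over all fields.
class TempTable {
 public:
  void set_unique_index_enabled(bool enabled) { unique_enabled_ = enabled; }

  // Returns false on a duplicate-key error; the row is then not stored.
  bool write_row(const Row &row) {
    if (unique_enabled_ && !unique_index_.insert(row).second) return false;
    rows_.push_back(row);
    return true;
  }

  const std::vector<Row> &rows() const { return rows_; }

 private:
  bool unique_enabled_ = false;
  std::set<Row> unique_index_;
  std::vector<Row> rows_;
};

// ops[i] is the operator between selects[i] and selects[i + 1].
std::vector<Row> exec_union_unit(const std::vector<std::vector<Row>> &selects,
                                 const std::vector<SetOp> &ops) {
  assert(!selects.empty() && ops.size() + 1 == selects.size());

  // Locate the second operand of the last UNION, if there is any UNION.
  bool has_union = false;
  std::size_t last_union_operand = 0;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    if (ops[i] == SetOp::Union) {
      has_union = true;
      last_union_operand = i + 1;
    }
  }

  TempTable table;
  table.set_unique_index_enabled(has_union);  // never used for pure UNION ALL
  for (std::size_t i = 0; i < selects.size(); ++i) {
    for (const Row &row : selects[i])
      (void)table.write_row(row);  // a duplicate-key error is ignored
    if (has_union && i == last_union_operand)
      table.set_unique_index_enabled(false);  // later selects are UNION ALL
  }
  return table.rows();  // the final read from the table would happen here
}
```

Because the index stays enabled from the first select up to the second operand
of the last UNION, every UNION ALL before that point behaves like UNION in this
model as well.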
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from the
selects to a temporary table at all. After all needed type conversions have been
done, the row fields could be sent directly to the output stream. This would
improve the performance of UNION ALL operations since writing to the temporary
table and reading from it would not be needed anymore. In the cases when the
result set is so big that the temporary table cannot be allocated in main
memory, the performance gains would be significant. Besides, the client could
get the first result rows at once, as it would not have to wait until all
selects have been executed.
To make a UNION ALL operation not send rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
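To make the intended behaviour of the three methods concrete, here is a
self-contained sketch that does not use any server classes. It assumes that
every select calls send_fields and send_eof on the shared object and that the
number of selects is known up front; StreamingUnionSink, OutputStream and the
textual "packets" are hypothetical and only illustrate the call order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the client connection: every packet is recorded as text.
using OutputStream = std::vector<std::string>;

// Hypothetical model of the proposed select_union_send interceptor.
class StreamingUnionSink {
 public:
  StreamingUnionSink(std::size_t select_count, OutputStream *out)
      : select_count_(select_count), out_(out) {
    assert(select_count > 0 && out != nullptr);
  }

  // Called by every select; only the first call emits the row format.
  void send_fields(const std::vector<std::string> &column_names) {
    if (fields_sent_) return;
    out_->push_back("FIELDS " + join(column_names));
    fields_sent_ = true;
  }

  // Convert the row into the record buffer, then forward it immediately.
  void send_data(const std::vector<int> &row) {
    record_buffer_.clear();
    for (int value : row)
      record_buffer_.push_back(std::to_string(value));  // "type conversion"
    out_->push_back("ROW " + join(record_buffer_));
  }

  // Called by every select; only the last call signals the end of the rows.
  void send_eof() {
    if (++finished_selects_ == select_count_) out_->push_back("EOF");
  }

 private:
  static std::string join(const std::vector<std::string> &parts) {
    std::string joined;
    for (std::size_t i = 0; i < parts.size(); ++i) {
      if (i > 0) joined += ",";
      joined += parts[i];
    }
    return joined;
  }

  std::size_t select_count_;
  OutputStream *out_;
  bool fields_sent_ = false;
  std::size_t finished_selects_ = 0;
  std::vector<std::string> record_buffer_;
};
```

In this model the first row reaches the output right after the first select
produces it, which is the early-delivery effect described above; nothing is
ever written to a table.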
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion, it does not make sense to send it to
the record buffer. It can be sent directly to the output stream. Different
selects can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which don't. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as a parameter the info that says which fields are to be
stored in the record buffer.
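A possible shape for the per-select bitmap is sketched below. It assumes only
two column kinds and integer field values, with the "conversion" being a copy
into a double; conversion_bitmap, send_row and CopyStats are hypothetical
helpers introduced for illustration, not variants of fill_record.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Int, Double };

// One flag per column: true when the select's column type differs from the
// type chosen for the corresponding result column.
std::vector<bool> conversion_bitmap(const std::vector<Kind> &select_types,
                                    const std::vector<Kind> &result_types) {
  assert(select_types.size() == result_types.size());
  std::vector<bool> needs_conversion(select_types.size());
  for (std::size_t i = 0; i < select_types.size(); ++i)
    needs_conversion[i] = select_types[i] != result_types[i];
  return needs_conversion;
}

// Counts how many fields took each path, to make the saving visible.
struct CopyStats {
  std::size_t via_record_buffer = 0;
  std::size_t direct = 0;
};

// Send one row of integer fields: flagged fields go through a converted copy
// (the record buffer in the real design), the others are sent as they are.
std::vector<std::string> send_row(const std::vector<int> &fields,
                                  const std::vector<bool> &needs_conversion,
                                  CopyStats *stats) {
  assert(fields.size() == needs_conversion.size() && stats != nullptr);
  std::vector<std::string> wire_row;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (needs_conversion[i]) {
      const double buffered = static_cast<double>(fields[i]);  // Int -> Double
      wire_row.push_back(std::to_string(buffered));
      ++stats->via_record_buffer;
    } else {
      wire_row.push_back(std::to_string(fields[i]));
      ++stats->direct;
    }
  }
  return wire_row;
}
```

The bitmap is computed once per select, before its execution, so the per-row
cost is a single flag test per field.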
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo a serious re-work.
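The split between the two interceptors can be sketched as follows. The sketch
assumes integer rows and, as an additional assumption not stated in the
specification, that the buffered part is flushed to the client before the
streamed selects start; first_streamed_select and exec_mixed_unit are
hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class SetOp { Union, UnionAll };
using Rows = std::vector<int>;

// ops[i] joins select i and select i + 1. Returns the index of the first
// select whose rows may bypass the temporary table: the one that follows the
// second operand of the last UNION, or 0 if the unit has no UNION at all.
std::size_t first_streamed_select(const std::vector<SetOp> &ops) {
  std::size_t first = 0;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == SetOp::Union) first = i + 2;
  return first;
}

Rows exec_mixed_unit(const std::vector<Rows> &selects,
                     const std::vector<SetOp> &ops) {
  assert(!selects.empty() && ops.size() + 1 == selects.size());
  const std::size_t first_streamed = first_streamed_select(ops);

  // Selects served by the buffering interceptor: rows are de-duplicated.
  Rows temp_table;
  for (std::size_t i = 0; i < first_streamed; ++i)
    for (int row : selects[i])
      if (std::find(temp_table.begin(), temp_table.end(), row) ==
          temp_table.end())
        temp_table.push_back(row);

  // Assumption of this sketch: flush the buffered rows first, then stream.
  Rows output = temp_table;
  for (std::size_t i = first_streamed; i < selects.size(); ++i)
    output.insert(output.end(), selects[i].begin(), selects[i].end());
  return output;
}
```

When the unit contains no UNION the function returns 0 and every select is
streamed, which is the case of section 2.1; when the last operator is UNION it
returns the number of selects and nothing is streamed.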
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
the selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
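Item 2 of this list can be illustrated with a very small sketch. It assumes
integer rows and uses a std::set where the real design would look up the
temporary table; DistinctStreamer is a hypothetical name.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Forwards a row to the output the moment it is known not to be a duplicate,
// instead of waiting until all selects have finished.
class DistinctStreamer {
 public:
  // Returns true if the row was new and therefore forwarded to the output.
  bool send_data(int row, std::vector<int> *output) {
    assert(output != nullptr);
    if (!seen_.insert(row).second) return false;  // duplicate: drop it
    output->push_back(row);
    return true;
  }

 private:
  std::set<int> seen_;  // plays the role of the lookup in the temporary table
};
```

The rows still have to be remembered for the duplicate check, so this variant
saves the final read pass and the client's waiting time, not the storage.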
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Supervisor updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Bothorsen
+Monty
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Version updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Privacy level updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-y
+n
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c3 from t1 where a1=b1) union
(select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
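The equivalence rule above can be checked on small data. The following is only
an illustrative sketch, not server code: the names Row, RowSet, UnionOp,
eval_union_unit and same_rows are invented here, rows are modelled as plain
strings, and the union unit is evaluated left to right.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::string;          // stand-in for a result row (assumption)
using RowSet = std::vector<Row>;  // stand-in for a result set (assumption)

enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// Evaluate operands[0] ops[0] operands[1] ops[1] ... from left to right.
// A UNION removes duplicates from everything accumulated so far.
RowSet eval_union_unit(const std::vector<RowSet>& operands,
                       const std::vector<UnionOp>& ops) {
    RowSet acc = operands.at(0);
    for (std::size_t i = 0; i < ops.size(); ++i) {
        const RowSet& next = operands.at(i + 1);
        acc.insert(acc.end(), next.begin(), next.end());
        if (ops[i] == UnionOp::Distinct) {
            std::sort(acc.begin(), acc.end());
            acc.erase(std::unique(acc.begin(), acc.end()), acc.end());
        }
    }
    return acc;
}

// Compare as multisets: without ORDER BY the row order is not significant.
bool same_rows(RowSet a, RowSet b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}
```

With three operands, evaluating the operator sequences of queries (1) and (4)
gives the same rows, while the sequence of query (3) keeps the duplicates
contributed by its last operand.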
MySQL does not accept nested unions. For example the following valid SQL query
is considered by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare.
The method first validates each of the select constructs of the unit and then
checks that all selects are compatible: the selects must return the same number
of columns, and for each column position k there must be a type to which the
types of the k-th columns of all selects can be coerced. This type becomes the
type of column k of the result set returned by the union unit.
For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10) then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
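The column type check described above can be sketched as a fold over the
columns of the selects. This is a minimal sketch under strong assumptions: Kind,
ColType, join_types and union_column_types are hypothetical names, only the
types from the example are modelled, and collations are left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified column type. The numeric kinds are ordered
// from narrowest to widest so that the wider kind compares greater.
enum class Kind { Int = 0, BigInt = 1, Double = 2, Varchar = 3 };

struct ColType {
    Kind kind;
    unsigned length;  // only meaningful for Varchar
};

// Find a type to which both a and b can be coerced.
ColType join_types(const ColType& a, const ColType& b) {
    if (a.kind == Kind::Varchar && b.kind == Kind::Varchar)
        return {Kind::Varchar, std::max(a.length, b.length)};
    if (a.kind != Kind::Varchar && b.kind != Kind::Varchar)
        return {std::max(a.kind, b.kind), 0};
    // Mixed string/numeric columns are not covered by the text above;
    // this sketch simply keeps the string side.
    return a.kind == Kind::Varchar ? a : b;
}

// Compute the type of every result column k by joining the types of
// column k of all selects. Returns false if the selects do not return
// the same number of columns.
bool union_column_types(const std::vector<std::vector<ColType>>& selects,
                        std::vector<ColType>& result) {
    if (selects.empty()) return false;
    result = selects[0];
    for (std::size_t s = 1; s < selects.size(); ++s) {
        if (selects[s].size() != result.size()) return false;
        for (std::size_t k = 0; k < result.size(); ++k)
            result[k] = join_types(result[k], selects[s][k]);
    }
    return true;
}
```

For the example types this yields double for the (int, bigint, double) column
and varchar(20) for the (varchar(10), varchar(20), varchar(10)) column.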
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
rows are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
duplicate-key error that occurs on an attempt to write a duplicate row is
ignored.
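The interplay between the unique index and the ignored duplicate-key error can
be illustrated with a small model. It is a sketch only: TempTable, write_row,
enable_unique_index and exec_union_unit are hypothetical stand-ins for the
temporary table, ha_write_row and the execution loop, and rows are plain
strings.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::string;

// Hypothetical stand-in for the temporary table of a union unit.
class TempTable {
public:
    void enable_unique_index(bool on) { unique_ = on; }

    // Returns false on a duplicate-key error, true if the row was written.
    bool write_row(const Row& r) {
        if (unique_ && !seen_.insert(r).second) return false;
        rows_.push_back(r);
        return true;
    }
    const std::vector<Row>& rows() const { return rows_; }

private:
    bool unique_ = false;
    std::set<Row> seen_;     // models the unique index on all fields
    std::vector<Row> rows_;  // table contents in insertion order
};

// distinct_after[i] is true if the operator following select i is UNION
// and false if it is UNION ALL.
std::vector<Row> exec_union_unit(const std::vector<std::vector<Row>>& selects,
                                 const std::vector<bool>& distinct_after) {
    // Index of the second operand of the last UNION, or -1 if there is none.
    int last_distinct = -1;
    for (std::size_t i = 0; i < distinct_after.size(); ++i)
        if (distinct_after[i]) last_distinct = static_cast<int>(i) + 1;

    TempTable tmp;
    tmp.enable_unique_index(last_distinct >= 0);
    for (std::size_t s = 0; s < selects.size(); ++s) {
        for (const Row& r : selects[s])
            (void)tmp.write_row(r);  // a duplicate-key error is ignored
        if (static_cast<int>(s) == last_distinct)
            tmp.enable_unique_index(false);  // later UNION ALL keeps all rows
    }
    return tmp.rows();
}
```

In this model a UNION followed by a UNION ALL deduplicates only the first two
operands, while a unit with only UNION ALL never consults the index.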
After all rows received from all selects have been placed into the temporary
table the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly to the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would no longer be needed. When the result set is so big that
the temporary table cannot be kept in main memory the performance gains would
be significant. Besides, the client could get the first result rows
immediately, as it would not have to wait until all selects have been executed.
To make a UNION ALL operation not send rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
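To show how swapping the interceptor changes where rows end up, here is a
self-contained model of the two classes. It is a sketch under stated
assumptions, not the server code: Client, UnionSink and UnionSendSink are
hypothetical names that only play the roles of the output stream, select_union
and select_union_send, rows are vectors of strings, and every method returns
false on success purely as a convention of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for the client connection (the output stream).
struct Client {
    std::vector<std::string> header;
    std::vector<Row> rows;
    bool eof = false;
};

// Plays the role of select_union: rows go into a "temporary table".
class UnionSink {
public:
    virtual ~UnionSink() = default;
    virtual bool send_fields(const std::vector<std::string>&) { return false; }
    virtual bool send_data(const Row& row) {
        record_ = row;              // fill the record buffer
        table_.push_back(record_);  // write the record to the table
        return false;
    }
    virtual bool send_eof() { return false; }
    const std::vector<Row>& table() const { return table_; }

protected:
    Row record_;              // record buffer of the temporary table
    std::vector<Row> table_;  // contents of the temporary table
};

// Plays the role of select_union_send: rows bypass the table.
class UnionSendSink : public UnionSink {
public:
    UnionSendSink(Client& client, std::size_t select_count)
        : client_(client), remaining_(select_count) {}

    bool send_fields(const std::vector<std::string>& names) override {
        if (!header_sent_) {  // only the first select describes the rows
            client_.header = names;
            header_sent_ = true;
        }
        return false;
    }
    bool send_data(const Row& row) override {
        record_ = row;                    // type conversions would happen here
        client_.rows.push_back(record_);  // record buffer -> output stream
        return false;
    }
    bool send_eof() override {
        if (remaining_ > 0 && --remaining_ == 0)
            client_.eof = true;  // signal the end after the last select only
        return false;
    }

private:
    Client& client_;
    std::size_t remaining_;
    bool header_sent_ = false;
};
```

Because the selects only see the base interface, the executor can hand them
either object. With the sending variant the client receives rows while the
selects are still running and the table stays empty.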
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
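The per-select bitmap could be derived by comparing the column types of a
select with the column types of the result set. The following is a rough
sketch of that idea; Kind, Value, SendStats, conversion_bitmap and send_row are
names invented for the illustration, and the "conversion" is reduced to
retagging a value with the result type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Int, Double };  // hypothetical, simplified column kinds

struct Value {
    Kind kind;
    double num;
};

// One bit per column: true if column k of this select must be converted to
// the type of column k of the result set, false if it can be passed through.
std::vector<bool> conversion_bitmap(const std::vector<Kind>& select_types,
                                    const std::vector<Kind>& result_types) {
    std::vector<bool> needs(result_types.size(), false);
    for (std::size_t k = 0; k < result_types.size(); ++k)
        needs[k] = select_types.at(k) != result_types[k];
    return needs;
}

struct SendStats {
    std::size_t buffered = 0;  // fields that went through the record buffer
    std::size_t direct = 0;    // fields sent straight to the output
};

// Send one row: only the fields flagged in the bitmap are copied to the
// record buffer and converted there.
std::vector<Value> send_row(const std::vector<Value>& fields,
                            const std::vector<bool>& needs_conversion,
                            const std::vector<Kind>& result_types,
                            SendStats& stats) {
    std::vector<Value> out;
    out.reserve(fields.size());
    for (std::size_t k = 0; k < fields.size(); ++k) {
        if (needs_conversion.at(k)) {
            Value buffered = fields[k];          // copy into the record buffer
            buffered.kind = result_types.at(k);  // conversion happens on store
            out.push_back(buffered);
            ++stats.buffered;
        } else {
            out.push_back(fields[k]);  // no copy through the buffer
            ++stats.direct;
        }
    }
    return out;
}
```

The bitmap is computed once per select, so the per-row cost is a single bit
test per field.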
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec needs
a serious rework.
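Deciding which selects may bypass the temporary table amounts to finding the
last UNION in the operator sequence. A minimal sketch, assuming the operators
are available as a vector (UnionOp, Sink, first_direct_select and plan_sinks
are hypothetical names):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// ops[i] is the operator between select i and select i + 1.
// Returns the index of the first select whose rows may be sent directly to
// the output stream, i.e. the first select after the second operand of the
// last UNION. The result is 0 if there is no UNION at all, and equals the
// number of selects if the last operator is a UNION.
std::size_t first_direct_select(const std::vector<UnionOp>& ops) {
    std::size_t first = 0;
    for (std::size_t i = 0; i < ops.size(); ++i)
        if (ops[i] == UnionOp::Distinct) first = i + 2;
    return first;
}

enum class Sink { TempTable, Direct };  // which interceptor a select gets

// Assign an interceptor kind to each of the ops.size() + 1 selects.
std::vector<Sink> plan_sinks(const std::vector<UnionOp>& ops) {
    std::vector<Sink> plan(ops.size() + 1, Sink::TempTable);
    for (std::size_t s = first_direct_select(ops); s < plan.size(); ++s)
        plan[s] = Sink::Direct;
    return plan;
}
```

Note that in this scheme the rows accumulated in the temporary table still have
to be sent to the client before (or after) the directly streamed rows; the
sketch only covers the assignment of interceptors.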
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without ORDER BY clause send
any row received from an operand of a UNION operation directly to the output
stream as soon as it has been checked by a lookup in the temporary table that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
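Item 2 of the list above can be pictured as follows. This is only a sketch of
the idea, with a hash set standing in for the lookup in the temporary table;
Row and stream_distinct are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

using Row = std::string;

// Emit every row the first time it is seen, in arrival order, instead of
// waiting until all selects have finished.
std::vector<Row> stream_distinct(const std::vector<std::vector<Row>>& selects) {
    std::unordered_set<Row> seen;  // plays the role of the temporary table
    std::vector<Row> output;       // plays the role of the output stream
    for (const std::vector<Row>& select : selects)
        for (const Row& row : select)
            if (seen.insert(row).second)  // not a duplicate: send at once
                output.push_back(row);
    return output;
}
```

The temporary table is still filled in this variant, but the client no longer
has to wait for the last select before it gets the first rows.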
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Supervisor updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Bothorsen
+Monty
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Version updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Fri, 14 Aug 2009, 08:52)=-=-
Privacy level updated.
--- /tmp/wklog.44.old.22769 2009-08-14 08:52:13.000000000 +0300
+++ /tmp/wklog.44.new.22769 2009-08-14 08:52:13.000000000 +0300
@@ -1 +1 @@
-y
+n
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation used
at the top level of a query without an ORDER BY clause this is unnecessary:
the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
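
The equivalence can be checked on a toy model. The sketch below is only an
illustration, not server code: it assumes that a sequence of union operations
is evaluated left to right, and the names Row, UnionOp and eval_union_sequence
are hypothetical. Rows are flattened to strings for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::string;               // one row, flattened to a string
using Rows = std::vector<Row>;         // a multiset of rows (duplicates allowed)

enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// Evaluates operands[0] ops[0] operands[1] ops[1] ... from left to right.
// The result is sorted only so that two results can be compared easily.
Rows eval_union_sequence(const std::vector<Rows>& operands,
                         const std::vector<UnionOp>& ops) {
  assert(!operands.empty() && ops.size() + 1 == operands.size());
  Rows acc = operands[0];
  for (std::size_t i = 0; i < ops.size(); ++i) {
    acc.insert(acc.end(), operands[i + 1].begin(), operands[i + 1].end());
    if (ops[i] == UnionOp::Distinct) {
      // UNION removes duplicates from everything accumulated so far,
      // which is why an earlier UNION ALL leaves no trace in the result.
      std::sort(acc.begin(), acc.end());
      acc.erase(std::unique(acc.begin(), acc.end()), acc.end());
    }
  }
  std::sort(acc.begin(), acc.end());
  return acc;
}
```

With the same three operands, the operator lists {All, Distinct} and
{Distinct, Distinct} give the same rows, as queries (4) and (1) do, while
{Distinct, All} keeps the duplicates contributed by the last operand, as
query (3) does.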
MySQL does not accept nested unions. For example the following valid SQL query
is considered by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. This method validates each of the select constructs
of the unit and then checks that all selects are compatible: the selects must
return the same number of columns, and for each column position k there must
be a type to which the types of the k-th columns of all selects can be coerced.
This type is taken as the type of column k of the result set returned by the
union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a common collation to which all of them can be coerced is looked for and
is assigned as the collation of the third column in the result set.
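
The column type rule from the example can be sketched as follows. This is a
simplified illustration, not the server's type aggregation code: it only knows
the four types mentioned above, ignores collations, and the names Kind, ColType,
common_type and column_type are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy type model: only the types used in the example above.
// The numeric kinds are ordered from narrowest to widest.
enum class Kind { Int = 0, BigInt = 1, Double = 2, Varchar = 3 };

struct ColType {
  Kind kind;
  unsigned length;  // meaningful for Varchar only
};

// Returns a type to which both a and b can be coerced.
// Assumption: numeric kinds widen Int -> BigInt -> Double, two Varchars widen
// to the larger length, and mixing numeric with Varchar is out of scope.
ColType common_type(const ColType& a, const ColType& b) {
  if (a.kind == Kind::Varchar && b.kind == Kind::Varchar)
    return {Kind::Varchar, std::max(a.length, b.length)};
  assert(a.kind != Kind::Varchar && b.kind != Kind::Varchar);
  return {std::max(a.kind, b.kind), 0};
}

// Folds the types of column k over all selects of the union unit.
ColType column_type(const std::vector<ColType>& per_select) {
  assert(!per_select.empty());
  ColType result = per_select[0];
  for (std::size_t i = 1; i < per_select.size(); ++i)
    result = common_type(result, per_select[i]);
  return result;
}
```

For the types int, bigint, double the fold yields double, and for varchar(10),
varchar(20), varchar(10) it yields varchar(20), matching the example.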
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After
this, select_union::send_data calls the ha_write_row handler function to write
the record from the buffer to the temporary table. A duplicate-key error raised
by an attempt to write a duplicate row is ignored.
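
The index switching described above can be modelled in a few lines. The sketch
is a toy under stated assumptions, not the server code: ToyTempTable and
run_union_unit are hypothetical names, a std::set stands in for the unique
index, and rows are flattened to strings.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::string;

// Toy stand-in for the temporary table and its unique index over all fields.
class ToyTempTable {
 public:
  void enable_unique_index(bool on) { unique_ = on; }

  // A duplicate-key failure is swallowed here, so the caller never sees it.
  void write_row(const Row& row) {
    if (unique_ && !index_.insert(row).second) return;  // duplicate skipped
    rows_.push_back(row);
  }

  const std::vector<Row>& rows() const { return rows_; }

 private:
  bool unique_ = false;
  std::set<Row> index_;
  std::vector<Row> rows_;
};

// op_is_union[i] is true when the operator between select i and select i+1
// is UNION and false when it is UNION ALL.
std::vector<Row> run_union_unit(const std::vector<std::vector<Row>>& selects,
                                const std::vector<bool>& op_is_union) {
  assert(!selects.empty() && op_is_union.size() + 1 == selects.size());
  // Position of the second operand of the last UNION; 0 if there is no UNION.
  std::size_t last_union_operand = 0;
  for (std::size_t i = 0; i < op_is_union.size(); ++i)
    if (op_is_union[i]) last_union_operand = i + 1;

  ToyTempTable table;
  table.enable_unique_index(last_union_operand > 0);
  for (std::size_t s = 0; s < selects.size(); ++s) {
    for (const Row& row : selects[s]) table.write_row(row);
    // After the last UNION has been served, later rows are simply appended.
    if (s == last_union_operand) table.enable_unique_index(false);
  }
  return table.rows();
}
```

Keeping the index enabled for every select up to the last UNION is safe because
any UNION ALL before that point could be replaced by UNION without changing the
result.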
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, the
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly to the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would no longer be needed. When the result set is so large that
the temporary table cannot be kept in main memory, the performance gains would
be significant. Besides, the client could get the first result rows at once, as
it would not have to wait until all selects have been executed.
To keep a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
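
The difference between the two interceptors can be shown with a stand-alone toy
model. It is only a sketch and does not use the server classes: ToyUnionToTable
plays the role of select_union, ToyUnionSend plays the role of the proposed
select_union_send, and a std::vector stands in for the output stream. All names
are hypothetical.

```cpp
#include <string>
#include <vector>

using Row = std::string;

// Plays the role of select_union: rows are buffered in a "temporary table"
// and reach the client only when the whole unit has been executed.
class ToyUnionToTable {
 public:
  explicit ToyUnionToTable(std::vector<Row>& client) : client_(client) {}
  virtual ~ToyUnionToTable() = default;

  virtual void send_data(const Row& row) { table_.push_back(row); }
  virtual void send_eof() {
    client_.insert(client_.end(), table_.begin(), table_.end());
    table_.clear();
  }

 protected:
  std::vector<Row>& client_;  // stand-in for the output stream

 private:
  std::vector<Row> table_;    // stand-in for the temporary table
};

// Plays the role of select_union_send: each row goes straight to the client.
class ToyUnionSend : public ToyUnionToTable {
 public:
  explicit ToyUnionSend(std::vector<Row>& client) : ToyUnionToTable(client) {}
  void send_data(const Row& row) override { client_.push_back(row); }
  void send_eof() override {}  // nothing is buffered, so nothing to flush
};

// All selects of the unit share one interceptor object.
void run_selects(const std::vector<std::vector<Row>>& selects,
                 ToyUnionToTable& sink) {
  for (const auto& select : selects)
    for (const Row& row : select) sink.send_data(row);
  sink.send_eof();
}
```

With ToyUnionSend the client vector grows after every send_data call, whereas
with ToyUnionToTable it stays empty until send_eof. That is the behaviour that
lets the client see the first rows before the remaining selects have run.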
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion, it does not make sense to copy it to a
record buffer: it can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
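
A minimal sketch of the per-select bitmap follows. It is an illustration under
assumptions, not a proposal for the actual interface: send_row, SendStats and
convert_in_record_buffer are hypothetical names, fields are plain strings, and
the "conversion" is a placeholder that appends ".0" to mimic widening an
integer to a double.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counters that record which path each field took.
struct SendStats {
  std::size_t via_record_buffer = 0;
  std::size_t direct = 0;
};

// Placeholder for storing a field in the record buffer with type conversion.
std::string convert_in_record_buffer(const std::string& value) {
  return value + ".0";
}

// needs_conversion is the per-select bitmap: one flag per result column.
std::vector<std::string> send_row(const std::vector<std::string>& fields,
                                  const std::vector<bool>& needs_conversion,
                                  SendStats& stats) {
  assert(fields.size() == needs_conversion.size());
  std::vector<std::string> out;
  out.reserve(fields.size());
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (needs_conversion[i]) {
      out.push_back(convert_in_record_buffer(fields[i]));
      ++stats.via_record_buffer;
    } else {
      out.push_back(fields[i]);  // no copy through the record buffer
      ++stats.direct;
    }
  }
  return out;
}
```

Each select would carry its own bitmap, since a column that needs conversion in
one select may already have the result type in another.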
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; the other, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo a serious rework.
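
The split between the two interceptors depends only on the position of the last
UNION. A minimal sketch, with the hypothetical names UnionOp and
selects_needing_temp_table, computes how many leading selects still have to go
through the temporary table:

```cpp
#include <cstddef>
#include <vector>

enum class UnionOp { Distinct, All };  // UNION vs UNION ALL

// ops[i] is the operator between select i and select i+1, so a unit with
// ops.size() operators has ops.size() + 1 selects.
// Returns the number of leading selects that need the temporary table;
// every select after them could send its rows directly to the client.
std::size_t selects_needing_temp_table(const std::vector<UnionOp>& ops) {
  std::size_t count = 0;  // no UNION at all: no select needs the table
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct)
      count = i + 2;  // both operands of this UNION, and everything before
  return count;
}
```

For the operator list {Distinct, All, All} the function returns 2, so the first
two selects would share the select_union object and the last two would share the
select_union_send object. For a unit with only UNION ALL it returns 0, which is
the case covered by section 2.1.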
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common
cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), whereas query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
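The equivalence rule above can be checked with a small model. The following is
only a sketch under stated assumptions: a row is reduced to a single string, a
chain of union operations is evaluated strictly from left to right, and the
names eval_chain, distinct and same_rows are hypothetical helpers that do not
exist in the server code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::string;            // a row is reduced to one string here
using Rows = std::vector<Row>;

enum class Op { Union, UnionAll };  // operator standing before an operand

// Remove duplicates, keeping the first occurrence of every row.
Rows distinct(const Rows &in) {
  Rows out;
  for (const Row &r : in)
    if (std::find(out.begin(), out.end(), r) == out.end()) out.push_back(r);
  return out;
}

// Evaluate operands[0] ops[0] operands[1] ops[1] ... from left to right.
Rows eval_chain(const std::vector<Rows> &operands, const std::vector<Op> &ops) {
  assert(!operands.empty() && ops.size() + 1 == operands.size());
  Rows acc = operands[0];
  for (std::size_t i = 0; i < ops.size(); ++i) {
    acc.insert(acc.end(), operands[i + 1].begin(), operands[i + 1].end());
    if (ops[i] == Op::Union) acc = distinct(acc);  // UNION removes duplicates
  }
  return acc;
}

// Compare two results as multisets: row order is not significant.
bool same_rows(Rows a, Rows b) {
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());
  return a == b;
}
```

With three operands that share some rows, the chains {UNION ALL, UNION} and
{UNION, UNION} produce the same rows, as queries (4) and (1) do, while
{UNION, UNION ALL} keeps the duplicates contributed by the last operand, as
query (3) does.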
MySQL does not accept nested unions. For example, the following valid SQL query
is rejected by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare.
The method validates each of the select constructs of the unit and then checks
that all selects are compatible: the selects must return the same number of
columns, and for each column position k there must be a type to which the types
of the k-th columns of all selects can be coerced. This type is taken as the
type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and is assigned as the collation of the third column in the result set.
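The type check described above can be sketched as a fold over the k-th columns
of all selects. This is a deliberately reduced illustration, not the server's
type aggregation: the types Kind and ColType and the functions common_type and
result_column_type are hypothetical, only the types named in the example are
modelled, and mixing numeric and character columns is simply reported as "no
common type" in this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Int, BigInt, Double, VarChar };

struct ColType {
  Kind kind;
  unsigned length;  // only meaningful for VarChar
};

// Rank of the numeric kinds: a higher rank can hold values of a lower one.
static int numeric_rank(Kind k) {
  switch (k) {
    case Kind::Int:    return 0;
    case Kind::BigInt: return 1;
    case Kind::Double: return 2;
    default:           return -1;  // not numeric
  }
}

// Find a common type for two columns. Returns false when this sketch knows
// no common type (numeric mixed with character data is left out on purpose).
bool common_type(const ColType &a, const ColType &b, ColType *out) {
  if (a.kind == Kind::VarChar && b.kind == Kind::VarChar) {
    *out = ColType{Kind::VarChar, std::max(a.length, b.length)};
    return true;
  }
  if (numeric_rank(a.kind) < 0 || numeric_rank(b.kind) < 0) return false;
  *out = numeric_rank(a.kind) >= numeric_rank(b.kind) ? a : b;
  return true;
}

// Fold the k-th column of every select into the type of result column k.
bool result_column_type(const std::vector<ColType> &column_k, ColType *out) {
  assert(!column_k.empty());
  ColType acc = column_k[0];
  for (std::size_t i = 1; i < column_k.size(); ++i)
    if (!common_type(acc, column_k[i], &acc)) return false;
  *out = acc;
  return true;
}
```

Applied to the example, folding int, bigint and double yields double, and
folding varchar(10), varchar(20), varchar(10) yields varchar(20).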
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize(), then it is executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added, for UNION ALL operations all
records are added. This is achieved by enabling and disabling usage of the
unique index defined on all fields of the temporary table. The index is never
used if only UNION ALL operations occur in the unit. Otherwise it is enabled
before the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For a row it receives from the currently executed select the method first stores
the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible duplicate-key error that occurs on an attempt to write a duplicate
row is ignored.
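The interplay of the unique index and the ignored duplicate-key error described
in the two paragraphs above can be modelled as follows. This is a sketch under
assumptions: TempTable, write_row and exec_unit are hypothetical stand-ins for
the temporary table, ha_write_row and SELECT_LEX_UNIT::exec, a row is reduced
to one string, and ops[i] denotes the operator between select i and select i+1.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::string;
enum class Op { Union, UnionAll };

// Stand-in for the temporary table with a unique index over all fields.
struct TempTable {
  std::vector<Row> rows;
  std::set<Row> unique_index;
  bool index_enabled = false;

  // Returns false on a duplicate-key error, true when the row was written.
  bool write_row(const Row &r) {
    if (index_enabled && !unique_index.insert(r).second) return false;
    rows.push_back(r);
    return true;
  }
};

// Stand-in for select_union::send_data: a duplicate-key error is ignored.
void send_data(TempTable &t, const Row &r) { (void)t.write_row(r); }

// Stand-in for the loop over selects in SELECT_LEX_UNIT::exec.
TempTable exec_unit(const std::vector<std::vector<Row>> &selects,
                    const std::vector<Op> &ops) {
  assert(ops.size() + 1 == selects.size());
  bool any_union = false;
  std::size_t last_union_operand = 0;  // second operand of the last UNION
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == Op::Union) { any_union = true; last_union_operand = i + 1; }

  TempTable t;
  t.index_enabled = any_union;         // never used for pure UNION ALL units
  for (std::size_t i = 0; i < selects.size(); ++i) {
    for (const Row &r : selects[i]) send_data(t, r);
    // After the last UNION has been processed, later rows are kept as-is.
    if (any_union && i == last_union_operand) t.index_enabled = false;
  }
  return t;
}
```

For a unit shaped like query (3) the index is switched off after the second
select, so the third select may add duplicates; for a unit shaped like query
(4) the index stays on until the end and every duplicate is rejected.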
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not followed by an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. When the result set is so big that
the temporary table cannot be allocated in main memory the performance gains
would be significant. Besides, the client could get the first result rows at
once as it would not have to wait until all selects have been executed.
To make a UNION ALL operation not send rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
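To show how the proposed interceptor would behave, here is a self-contained
sketch that does not use any server headers. Everything in it is a simplified
stand-in and an assumption of this example: SelectUnion and SelectUnionSend
model select_union and select_union_send, OutputStream models the client
connection, a row is a vector of strings, and false is used as the "no error"
return value.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Stand-in for the client connection.
struct OutputStream {
  std::vector<std::string> header;
  std::vector<Row> rows;
  bool eof = false;
};

// Stand-in for select_union: every row ends up in the temporary table.
class SelectUnion {
 public:
  virtual ~SelectUnion() {}
  virtual bool send_data(const Row &items) {
    fill_record(items);
    temp_table.push_back(record);
    return false;  // false signals success in this sketch
  }
  std::vector<Row> temp_table;

 protected:
  // Type conversions would happen here, while storing into the record buffer.
  void fill_record(const Row &items) { record = items; }
  Row record;  // record buffer of the temporary table
};

// Stand-in for select_union_send: the record buffer is still used for
// conversions, but the row goes to the output stream, not to the table.
class SelectUnionSend : public SelectUnion {
 public:
  explicit SelectUnionSend(OutputStream &out) : out_(out) {}

  bool send_fields(const std::vector<std::string> &names) {
    if (out_.header.empty()) out_.header = names;  // sent once, before rows
    return false;
  }
  bool send_data(const Row &items) override {
    fill_record(items);
    out_.rows.push_back(record);
    return false;
  }
  bool send_eof() {
    out_.eof = true;  // called after the last select has finished
    return false;
  }

 private:
  OutputStream &out_;
};
```

Because send_data is virtual, code that only holds a pointer to the base class
streams rows to the client as soon as it is handed a SelectUnionSend object,
and the temporary table stays empty.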
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which don't. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as a parameter the info that says which fields are to be
stored in the record buffer.
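The per-select bitmap could work roughly as in the following sketch. All names
are hypothetical: UnionSender models the select_union_send object, convert is a
placeholder for a real type conversion (it only appends ".0" to imitate an
integer to double conversion), and the counter buffer_stores exists only to
make the saved copies visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct UnionSender {
  std::vector<std::string> record_buffer;  // one slot per result column
  std::vector<std::string> output;         // flattened output stream
  std::size_t buffer_stores = 0;           // how many fields were buffered

  explicit UnionSender(std::size_t columns) : record_buffer(columns) {}

  // Placeholder conversion, standing in for what fill_record would do.
  static std::string convert(const std::string &v) { return v + ".0"; }

  // needs_conversion is the bitmap of the select currently being executed.
  void send_data(const std::vector<std::string> &items,
                 const std::vector<bool> &needs_conversion) {
    assert(items.size() == record_buffer.size());
    assert(needs_conversion.size() == items.size());
    for (std::size_t i = 0; i < items.size(); ++i) {
      if (needs_conversion[i]) {
        record_buffer[i] = convert(items[i]);  // via the record buffer
        ++buffer_stores;
        output.push_back(record_buffer[i]);
      } else {
        output.push_back(items[i]);            // directly to the output
      }
    }
  }
};
```

For a three-column row where only the second column needs conversion, a single
field passes through the record buffer and the other two are sent as they are.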
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the current
implementation. More exactly, the rows from any select that follows the second
operand of the last UNION operation could be sent directly to the output
stream. In this case two interceptor objects have to be created: one, of the
type select_union, is shared by the selects for which UNION operations are
performed; another, of the type select_union_send, is shared by the remaining
selects. For this optimization the method SELECT_LEX_UNIT::exec would have to
undergo a serious re-work.
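The split between the two interceptors can be sketched as below. The functions
first_streamed_select and exec_mixed are hypothetical, ops[i] is the operator
between select i and select i+1, and a std::set stands in for the temporary
table with its unique index. The sketch also assumes that the rows accumulated
in the temporary table are sent to the client before the remaining selects are
streamed; the text above does not fix that order.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::string;
enum class Op { Union, UnionAll };

// Index of the first select whose rows can go directly to the output stream:
// the select right after the second operand of the last UNION. Returns 0 for
// a pure UNION ALL unit and selects.size() when nothing can be streamed.
std::size_t first_streamed_select(const std::vector<Op> &ops) {
  std::size_t first = 0;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == Op::Union) first = i + 2;
  return first;
}

struct MixedResult {
  std::vector<Row> output;        // rows as received by the client
  std::size_t temp_table_rows;    // rows that went through the temp table
};

MixedResult exec_mixed(const std::vector<std::vector<Row>> &selects,
                       const std::vector<Op> &ops) {
  assert(ops.size() + 1 == selects.size());
  const std::size_t first = first_streamed_select(ops);

  // Selects served by the select_union-like interceptor (duplicates removed).
  std::set<Row> temp_table;
  for (std::size_t i = 0; i < first && i < selects.size(); ++i)
    temp_table.insert(selects[i].begin(), selects[i].end());

  MixedResult res;
  res.output.assign(temp_table.begin(), temp_table.end());
  res.temp_table_rows = temp_table.size();

  // Selects served by the select_union_send-like interceptor (streamed).
  for (std::size_t i = first; i < selects.size(); ++i)
    res.output.insert(res.output.end(), selects[i].begin(), selects[i].end());
  return res;
}
```

For a unit shaped like query (3) only the first two selects touch the
temporary table, and the rows of the third select are streamed unchanged.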
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as it has been checked by a lookup in the temporary table
that it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
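Item 2 of the list above amounts to streaming with a duplicate check. A minimal
sketch, assuming that a std::set can stand in for the lookup in the temporary
table and using the hypothetical name DistinctStreamer:

```cpp
#include <set>
#include <string>
#include <vector>

using Row = std::string;

struct DistinctStreamer {
  std::set<Row> seen;        // plays the role of the temporary table lookup
  std::vector<Row> output;   // rows already sent to the client

  // Returns true if the row was new and has been sent immediately.
  bool send_data(const Row &r) {
    if (!seen.insert(r).second) return false;  // duplicate: nothing is sent
    output.push_back(r);
    return true;
  }
};
```

The client receives each distinct row as soon as it is produced, while the set
still has to hold every distinct row seen so far.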
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation used
at the top level of a query without an ORDER BY clause this is not necessary:
the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most commonly
used cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the
sequence.
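The equivalence rule above can be checked with a small model. The sketch below
is only an illustration, not server code: it assumes a union sequence is
evaluated strictly left to right, reduces each row to a single int, and uses
the made-up names SetOp and eval_union_sequence.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Row = int;                 // a whole row, reduced to one int for brevity
using Rows = std::vector<Row>;

enum class SetOp { Union, UnionAll };

// Evaluates operands[0] ops[0] operands[1] ops[1] ... from left to right.
// ops must contain exactly one entry fewer than operands.
Rows eval_union_sequence(const std::vector<Rows>& operands,
                         const std::vector<SetOp>& ops) {
    Rows result = operands.at(0);
    for (std::size_t i = 0; i < ops.size(); ++i) {
        const Rows& next = operands.at(i + 1);
        result.insert(result.end(), next.begin(), next.end());
        if (ops[i] == SetOp::Union) {
            // UNION removes duplicates from everything accumulated so far,
            // which is why an earlier UNION ALL makes no difference.
            std::sort(result.begin(), result.end());
            result.erase(std::unique(result.begin(), result.end()),
                         result.end());
        }
    }
    std::sort(result.begin(), result.end());  // canonical order for comparing
    return result;
}
```

With the same three operands, the sequences UNION/UNION and UNION ALL/UNION
produce the same rows in this model, while UNION/UNION ALL keeps the
duplicates contributed by the last operand.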
MySQL does not accept nested unions. For example, MySQL Server rejects the
following query as erroneous, although it is valid SQL:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. This method validates each of the select constructs
of the unit and then checks that all the selects are compatible: they must
return the same number of columns, and for each column position k there must be
a type to which the types of the k-th columns of all selects can be coerced.
This type is taken as the type of column k of the result set returned by the
union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and is assigned as the collation of the third column in the result set.
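The two type examples above can be reproduced with a toy model. The sketch
below is not the server's coercion logic: it assumes only the four types
mentioned in the text, orders the numeric ones as int < bigint < double, takes
the maximum length for varchar, and refuses to mix numeric and string columns.
The names Kind, ColType and aggregate_column_type are invented for the
illustration.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

enum class Kind { Int, BigInt, Double, Varchar };  // numeric kinds in widening order

struct ColType {
    Kind kind;
    unsigned length;  // only meaningful for Varchar
};

// Returns the type all given column types can be coerced to in this toy model,
// or std::nullopt when the model knows no common type.
std::optional<ColType> aggregate_column_type(const std::vector<ColType>& cols) {
    if (cols.empty()) return std::nullopt;
    ColType acc = cols.front();
    for (const ColType& c : cols) {
        const bool acc_is_string = acc.kind == Kind::Varchar;
        const bool c_is_string = c.kind == Kind::Varchar;
        if (acc_is_string != c_is_string) return std::nullopt;  // simplification
        if (acc_is_string) {
            acc.length = std::max(acc.length, c.length);  // widest varchar wins
        } else if (static_cast<int>(c.kind) > static_cast<int>(acc.kind)) {
            acc.kind = c.kind;                            // widest numeric wins
        }
    }
    return acc;
}
```

Feeding it int, bigint, double gives double, and varchar(10), varchar(20),
varchar(10) gives varchar(20), matching the examples. Collation handling is
left out entirely.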
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force the selects to
send rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects
for the selects such that the JOIN::result field refers to an object of the
class select_union. All selects from a union unit share the same select_union
object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this, select_union::send_data calls the ha_write_row handler
function to write the record from the buffer to the temporary table. A
duplicate-key error that may occur on an attempt to write a duplicate row is
ignored.
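The interplay of the unique index and the ignored duplicate-key error can be
modelled in a few lines. The sketch below is a simplified stand-in, not the
server code: TempTable, WriteResult and run_union_unit are hypothetical names,
rows are vectors of int, and the "index" is an ordinary std::set.

```cpp
#include <cstddef>
#include <set>
#include <vector>

using Row = std::vector<int>;

enum class WriteResult { Ok, DuplicateKey };

// Toy stand-in for the temporary table with a unique index on all fields.
class TempTable {
public:
    void set_unique_index(bool enabled) { unique_enabled_ = enabled; }
    WriteResult write_row(const Row& row) {
        if (unique_enabled_ && index_.count(row) != 0)
            return WriteResult::DuplicateKey;
        index_.insert(row);
        rows_.push_back(row);
        return WriteResult::Ok;
    }
    const std::vector<Row>& rows() const { return rows_; }
private:
    bool unique_enabled_ = false;
    std::set<Row> index_;
    std::vector<Row> rows_;
};

// is_distinct[i] is true when select i+1 is attached by UNION and false when
// it is attached by UNION ALL; it has one entry fewer than selects.
std::vector<Row> run_union_unit(const std::vector<std::vector<Row>>& selects,
                                const std::vector<bool>& is_distinct) {
    bool any_distinct = false;
    std::size_t last_distinct_operand = 0;  // second operand of the last UNION
    for (std::size_t i = 0; i < is_distinct.size(); ++i) {
        if (is_distinct[i]) {
            any_distinct = true;
            last_distinct_operand = i + 1;
        }
    }
    TempTable table;
    table.set_unique_index(any_distinct);   // enabled before the first select
    for (std::size_t s = 0; s < selects.size(); ++s) {
        for (const Row& row : selects[s]) {
            // As in send_data: a duplicate-key error is silently ignored.
            (void)table.write_row(row);
        }
        if (any_distinct && s == last_distinct_operand)
            table.set_unique_index(false);  // disabled after the last UNION
    }
    return table.rows();
}
```

In this model a unit with only UNION ALL never enables the index, and a unit
ending in UNION keeps it enabled for every select.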
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done,
the row fields could be sent directly to the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would no longer be needed. When the result set is so big that
the temporary table cannot be kept in main memory, the performance gains would
be significant. Besides, the client could get the first result rows at once, as
it would not have to wait until all selects have been executed.
To make a UNION ALL operation not send rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be redefined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
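How the two interceptors would differ can be shown with a self-contained
model. The sketch below does not use the server classes: UnionSink and
UnionSendSink are hypothetical stand-ins for select_union and
select_union_send, a row is a vector of strings, the "client" is a plain
vector, and a return value of true is assumed to mean an error.

```cpp
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for select_union: every row goes through the record
// buffer and is then written to the "temporary table".
class UnionSink {
public:
    virtual ~UnionSink() = default;
    virtual bool send_data(const Row& row) {  // true would signal an error
        fill_record(row);
        table_.push_back(record_);
        return false;
    }
    virtual bool send_eof() { return false; }
    const std::vector<Row>& table() const { return table_; }
protected:
    // Copies the row into the record buffer; type conversion would happen here.
    void fill_record(const Row& row) { record_ = row; }
    Row record_;
private:
    std::vector<Row> table_;
};

// Hypothetical stand-in for select_union_send: same record buffer, but the
// row is forwarded to the client instead of being written to the table.
class UnionSendSink : public UnionSink {
public:
    explicit UnionSendSink(std::vector<Row>& client) : client_(client) {}
    bool send_data(const Row& row) override {
        fill_record(row);
        client_.push_back(record_);  // straight to the output stream
        return false;
    }
    bool send_eof() override {
        eof_sent_ = true;            // sent once, after the last select
        return false;
    }
    bool eof_sent() const { return eof_sent_; }
private:
    std::vector<Row>& client_;
    bool eof_sent_ = false;
};
```

Because the selects only see the base-class interface, swapping the object
they share is enough to redirect their rows in this model; the temporary
table of the base class simply stays empty.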
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion, it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. For this, another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
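The bitmap idea can be sketched as follows. This is an illustration under
stated assumptions, not the proposed fill_record variant: fields are strings,
the bitmap is a std::vector<bool>, convert_field is a made-up conversion, and
SendStats only exists to show which path each field took.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct SendStats {
    std::size_t buffered = 0;  // fields that went through the record buffer
    std::size_t direct = 0;    // fields sent straight to the output
};

// Made-up conversion standing in for a real type conversion.
std::string convert_field(const std::string& value) { return value + ".0"; }

// Sends one row; needs_conversion must have one entry per field.
std::vector<std::string> send_row(const std::vector<std::string>& fields,
                                  const std::vector<bool>& needs_conversion,
                                  SendStats& stats) {
    std::vector<std::string> out;
    out.reserve(fields.size());
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (needs_conversion.at(i)) {
            // Only these fields pay for the copy into the record buffer.
            std::string record_field = convert_field(fields[i]);
            out.push_back(record_field);
            ++stats.buffered;
        } else {
            out.push_back(fields[i]);  // no intermediate copy
            ++stats.direct;
        }
    }
    return out;
}
```

Each select of the unit would carry its own bitmap, since a column that needs
conversion in one select may already have the result type in another.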
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; the other, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo serious rework.
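Which selects would get which interceptor follows directly from the position
of the last UNION. The helper below is a hypothetical illustration of that
rule only; the name first_streamable_select and the encoding of the operations
as a std::vector<bool> are assumptions, not part of the server.

```cpp
#include <cstddef>
#include <vector>

// is_distinct[i] is true when select i+1 is attached by UNION and false when
// it is attached by UNION ALL. Returns the zero-based index of the first
// select whose rows could bypass the temporary table; a value equal to the
// number of selects means that no select qualifies.
std::size_t first_streamable_select(const std::vector<bool>& is_distinct) {
    std::size_t first = 0;  // with no UNION at all, every select qualifies
    for (std::size_t i = 0; i < is_distinct.size(); ++i) {
        if (is_distinct[i]) {
            // Select i+1 is the second operand of this UNION, so the first
            // candidate is the select after it.
            first = i + 2;
        }
    }
    return first;
}
```

For a unit shaped like query (3), UNION followed by UNION ALL, the result is 2,
so only the third select would share the select_union_send object.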
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);          (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);          (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);          (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);          (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
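The equivalence rule can be illustrated with a small sketch. It is not taken
from the server sources: the names UnionOp and normalize are hypothetical, and
the operator sequence is modelled as a plain vector in which ops[i] joins
select i and select i+1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the operators in a union unit.
enum class UnionOp { Distinct, All };  // UNION vs. UNION ALL

// Rewrites every UNION ALL that precedes the last UNION as UNION. The result
// of the unit is unchanged, because the later UNION removes duplicates from
// everything accumulated before it.
std::vector<UnionOp> normalize(std::vector<UnionOp> ops) {
  std::size_t last_distinct = ops.size();  // ops.size() means "no UNION found"
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct) last_distinct = i;
  if (last_distinct == ops.size()) return ops;  // only UNION ALL: unchanged
  for (std::size_t i = 0; i < last_distinct; ++i)
    ops[i] = UnionOp::Distinct;
  return ops;
}
```

With this model, the operators of query (4), {All, Distinct}, normalize to
those of query (1), {Distinct, Distinct}, while the operators of queries (2)
and (3) are left as they are.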
MySQL does not accept nested unions. For example, the following query, which is
valid SQL, is rejected by MySQL Server as erroneous:
  ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
  union all
  ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, the further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. This method validates each of the select constructs
of the unit and then checks that all selects are compatible: they must return
the same number of columns, and for each column position k there must be a type
to which the types of the k-th columns of all selects can be coerced. This type
becomes the type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will
be of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
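The two examples above can be reproduced with a deliberately simplified
sketch. The types Kind and ColType and the functions common_type and
union_column_type are hypothetical; the sketch covers only the numeric
ordering int < bigint < double and the varchar length rule mentioned in the
text, and does not model collations or the server's real coercion rules.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Simplified stand-in for a column type. Numeric kinds are declared in
// widening order: Int < BigInt < Double.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColType {
  Kind kind;
  unsigned length;  // only meaningful for Varchar
};

// Picks a type to which both inputs can be coerced.
ColType common_type(const ColType &a, const ColType &b) {
  const bool a_str = a.kind == Kind::Varchar;
  const bool b_str = b.kind == Kind::Varchar;
  if (a_str && b_str)
    return {Kind::Varchar, std::max(a.length, b.length)};
  if (a_str != b_str)
    throw std::invalid_argument("mixed string/numeric is outside this sketch");
  return static_cast<int>(a.kind) >= static_cast<int>(b.kind) ? a : b;
}

// Folds common_type over the k-th columns of all selects of a union unit.
ColType union_column_type(const std::vector<ColType> &cols) {
  ColType result = cols.at(0);  // throws if the unit has no selects
  for (std::size_t i = 1; i < cols.size(); ++i)
    result = common_type(result, cols[i]);
  return result;
}
```

Under these assumptions the columns (int, bigint, double) fold to double, and
(varchar(10), varchar(20), varchar(10)) fold to varchar(20).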
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique index
defined on all fields of the temporary table. The index is never used if only
UNION ALL operations occur in the unit. Otherwise it is enabled before the
first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After this
select_union::send_data calls the ha_write_row handler function to write the
record from the buffer to the temporary table. A duplicate-key error that may
occur on an attempt to write a duplicate row is ignored.
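The interplay of the unique index and the ignored duplicate-key error can be
modelled roughly as follows. This is only a sketch under stated assumptions:
exec_union, Row and UnionOp are hypothetical names, a std::vector stands in
for the temporary table and a std::set for its unique index, and rows are
plain integer tuples that need no type conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Row = std::vector<int>;
enum class UnionOp { Distinct, All };  // UNION vs. UNION ALL

// selects[i] holds the rows produced by the i-th select; ops[i] joins
// select i and select i+1, so ops.size() == selects.size() - 1.
std::vector<Row> exec_union(const std::vector<std::vector<Row>> &selects,
                            const std::vector<UnionOp> &ops) {
  // Find the second operand of the last UNION, if the unit has any UNION.
  bool has_distinct = false;
  std::size_t last_distinct_select = 0;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    if (ops[i] == UnionOp::Distinct) {
      has_distinct = true;
      last_distinct_select = i + 1;
    }
  }

  std::vector<Row> table;      // stands in for the temporary table
  std::set<Row> unique_index;  // stands in for the index on all fields
  for (std::size_t i = 0; i < selects.size(); ++i) {
    // The index is enabled up to the last UNION operand and disabled after.
    const bool index_enabled = has_distinct && i <= last_distinct_select;
    for (const Row &row : selects[i]) {
      if (index_enabled && !unique_index.insert(row).second)
        continue;  // models the duplicate-key error, which is ignored
      table.push_back(row);
    }
  }
  return table;
}
```

In this model a unit with only UNION ALL never touches the index, and in a
mixed unit the rows of the selects after the last UNION are appended without
any duplicate check.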
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. It would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. In cases where the result set is
so big that the temporary table cannot be held in main memory, the performance
gains would be significant. Besides, the client could get the first result rows
immediately, as it would not have to wait until all selects have been executed.
To prevent a UNION ALL operation from sending rows to a temporary table, we
could provide the JOIN objects created for the selects from the union unit with
an interceptor object that differs from the one they use now. In the current
code they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
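To show how the two interceptors would differ in behaviour, here is a
self-contained model that does not use any server classes. Interceptor,
UnionToTable, UnionToClient, join and run_union_all are hypothetical names;
rows are vectors of strings instead of List<Item>, the type conversion step
through the record buffer is left out, and methods are assumed to return
false on success.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Simplified stand-in for the result interface used by JOIN objects.
struct Interceptor {
  virtual ~Interceptor() = default;
  virtual bool send_fields(const Row &names) = 0;
  virtual bool send_data(const Row &row) = 0;
  virtual bool send_eof() = 0;
};

// Joins the fields of a row with commas.
std::string join(const Row &row) {
  std::string out;
  for (std::size_t i = 0; i < row.size(); ++i) {
    if (i) out += ',';
    out += row[i];
  }
  return out;
}

// Models the current select_union: every row lands in the temporary table.
struct UnionToTable : Interceptor {
  std::vector<Row> temp_table;
  bool send_fields(const Row &) override { return false; }
  bool send_data(const Row &row) override {
    temp_table.push_back(row);
    return false;
  }
  bool send_eof() override { return false; }
};

// Models the proposed select_union_send: rows bypass the temporary table.
struct UnionToClient : UnionToTable {
  std::vector<std::string> wire;  // stands in for the client connection
  bool send_fields(const Row &names) override {
    wire.push_back("FIELDS " + join(names));
    return false;
  }
  bool send_data(const Row &row) override {
    wire.push_back(join(row));
    return false;
  }
  bool send_eof() override {
    wire.push_back("EOF");
    return false;
  }
};

// Drives one UNION ALL unit: header once, rows of every select, EOF once.
bool run_union_all(Interceptor &result, const Row &names,
                   const std::vector<std::vector<Row>> &selects) {
  if (result.send_fields(names)) return true;
  for (const auto &rows : selects)
    for (const Row &row : rows)
      if (result.send_data(row)) return true;
  return result.send_eof();
}
```

Passing a UnionToClient object to run_union_all leaves temp_table empty and
puts the header, the rows of all selects and the end marker on the wire in
that order, which is the behaviour section 2.1 asks for.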
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which don't. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
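A possible shape of the bitmap-driven send_data is sketched below. All names
(send_row, SendStats, the convert callback) are hypothetical, fields are
modelled as strings, and std::vector<bool> plays the role of the bitmap; the
real fill_record variant would of course work on Item and Field objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts how many fields took each path, to make the saving visible.
struct SendStats {
  std::size_t buffered = 0;  // fields copied into the record buffer
  std::size_t direct = 0;    // fields sent without an intermediate copy
};

// needs_conversion[k] is true when column k of the current select must pass
// through the record buffer. convert(k, value) models the type conversion
// that happens when a field is stored there.
template <class Convert>
std::vector<std::string> send_row(const std::vector<std::string> &fields,
                                  const std::vector<bool> &needs_conversion,
                                  Convert convert, SendStats &stats) {
  std::vector<std::string> out;
  out.reserve(fields.size());
  for (std::size_t k = 0; k < fields.size(); ++k) {
    if (needs_conversion.at(k)) {
      std::string record_buffer_field = convert(k, fields[k]);
      ++stats.buffered;
      out.push_back(record_buffer_field);
    } else {
      ++stats.direct;
      out.push_back(fields[k]);  // goes straight to the output stream
    }
  }
  return out;
}
```

Since different selects may need conversions for different columns, the
caller would pass a different needs_conversion bitmap for each select of the
unit.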
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the current
implementation. More exactly, the rows from any select that follows the second
operand of the last UNION operation could be sent directly to the output
stream. In this case two interceptor objects have to be created: one, of the
type select_union, is shared by the selects for which UNION operations are
performed; another, of the type select_union_send, is shared by the remaining
selects. For this optimization the method SELECT_LEX_UNIT::exec has to undergo
a serious re-work.
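The split between the two interceptors could be computed as in the following
sketch. The names Sink and assign_sinks are hypothetical; ops[i] is again the
operator between select i and select i+1, and the unit is assumed to be at
the top level of the query with no ORDER BY.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class UnionOp { Distinct, All };   // UNION vs. UNION ALL
enum class Sink { TempTable, Client };  // select_union vs. select_union_send

// Returns, for every select of the unit, which interceptor it would share.
std::vector<Sink> assign_sinks(const std::vector<UnionOp> &ops) {
  std::vector<Sink> sinks(ops.size() + 1, Sink::Client);
  // Scan backwards for the last UNION; everything up to and including its
  // second operand has to go through the temporary table.
  for (std::size_t i = ops.size(); i-- > 0;) {
    if (ops[i] == UnionOp::Distinct) {
      for (std::size_t j = 0; j <= i + 1; ++j) sinks[j] = Sink::TempTable;
      break;
    }
  }
  return sinks;
}
```

For the operators {All, Distinct, All, All} the first three selects are
assigned to the temporary table and the last two to the client. A unit with
only UNION ALL assigns every select to the client, which is the case covered
by section 2.1.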
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that it
is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
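Item 2 of this list can be pictured with the following sketch, which is an
illustration only and not a proposed implementation. StreamingUnion is a
hypothetical class; a std::set stands in for the lookup in the temporary
table, and the emit callback stands in for the output stream.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>
#include <vector>

using Row = std::vector<int>;

// Forwards each row to the client the first time it is seen, instead of
// waiting until all operands of the UNION have been executed.
class StreamingUnion {
 public:
  explicit StreamingUnion(std::function<void(const Row &)> emit)
      : emit_(std::move(emit)) {}

  void send_data(const Row &row) {
    // insert() reports whether the row was new; only new rows are emitted.
    if (seen_.insert(row).second) emit_(row);
  }

 private:
  std::set<Row> seen_;  // stands in for the temporary table with its index
  std::function<void(const Row &)> emit_;
};
```

The set of seen rows is still needed for the duplicate check, so this variant
saves the final read pass over the temporary table rather than the table
itself.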
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:50)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22656 2009-08-14 08:50:48.000000000 +0300
+++ /tmp/wklog.44.new.22656 2009-08-14 08:50:48.000000000 +0300
@@ -19,28 +19,29 @@
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
- (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union all
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
- (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
-a2!=b2) union all
+ (select a1,b1,c3 from t1 where a1=b1) union
+ (select a2,b2,c3 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
- (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
-a2!=b2) union
+ (select a1,b1,c1 from t1 where a1=b1) union all
+ (select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
+
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
substituted for UNION if there occur another UNION further in the sequence.
-MySQL does not accept nested unions. For example the following valid query is
-considered by MySQL Server as erroneous:
- ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
-) union all
- ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+MySQL does not accept nested unions. For example the following valid SQL query
+is considered by MySQL Server as erroneous:
+ ((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
+ union all
+ ((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called 'union
unit' if it s not a part of another such sequence.
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the
sequence, because that later UNION removes duplicates from all rows accumulated
so far.
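The equivalence rule above can be sketched as a small normalization pass. This
is only an illustration, not server code: the enum UnionOp and the function
normalize_union_ops are hypothetical names, and ops[i] is assumed to be the
operation between select i and select i+1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: ops[i] is the operation joining select i and select i+1.
enum class UnionOp { Distinct, All };

// Rewrites every UNION ALL that is followed by a later UNION into UNION.
// The returned sequence describes an equivalent union unit.
std::vector<UnionOp> normalize_union_ops(std::vector<UnionOp> ops) {
    bool distinct_seen_later = false;
    // Walk right to left so that we know whether a UNION occurs further on.
    for (std::size_t i = ops.size(); i-- > 0;) {
        if (ops[i] == UnionOp::Distinct) {
            distinct_seen_later = true;
        } else if (distinct_seen_later) {
            ops[i] = UnionOp::Distinct;
        }
    }
    return ops;
}
```

Under this model query (4) normalizes to the operator sequence of query (1),
while query (3) is left unchanged.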
MySQL does not accept nested unions. For example, the following query is valid
SQL but is rejected by MySQL Server as erroneous:
((select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2))
union all
((select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4))
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. The method first validates each of the select
constructs of the unit and then checks that all selects are compatible: the
selects must return the same number of columns, and for each column position k
there must be a type to which the types of the k-th columns of all selects can
be coerced. This type becomes the type of column k of the result set returned
by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will
be of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
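The two examples above (int/bigint/double and varchar(10)/varchar(20)) can be
reproduced with a deliberately simplified sketch. The types Kind and ColumnType
and both functions are hypothetical; the real coercion rules of the server
cover many more types and are not modelled here. Mixing string and numeric
columns is out of scope for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified type model.
// Numeric enumerators are declared in widening order: Int < BigInt < Double.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColumnType {
    Kind kind;
    unsigned length;  // meaningful only for Varchar
};

// Returns a type to which both argument types can be coerced.
ColumnType common_type(const ColumnType &a, const ColumnType &b) {
    const bool a_is_string = a.kind == Kind::Varchar;
    const bool b_is_string = b.kind == Kind::Varchar;
    assert(a_is_string == b_is_string && "mixed string/numeric is out of scope");
    if (a_is_string)
        return {Kind::Varchar, std::max(a.length, b.length)};
    return {std::max(a.kind, b.kind), 0};
}

// Folds the types found at one column position across all selects.
ColumnType union_column_type(const std::vector<ColumnType> &column) {
    assert(!column.empty());
    ColumnType result = column.front();
    for (std::size_t i = 1; i < column.size(); ++i)
        result = common_type(result, column[i]);
    return result;
}
```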
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After
this, select_union::send_data calls the ha_write_row handler function to write
the record from the buffer to the temporary table. A duplicate-key error
caused by an attempt to write a duplicate row is ignored.
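The index handling described above can be modelled as follows. This is a
minimal sketch under stated assumptions: TempTable and execute_union_unit are
hypothetical stand-ins, a row is flattened to a single string, and
is_distinct[i] tells whether the operation between select i and select i+1 is
UNION (true) or UNION ALL (false).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::string;  // hypothetical: one row flattened to a string

// Hypothetical stand-in for the temporary table with its unique index.
class TempTable {
public:
    void enable_unique_index(bool enabled) { unique_enabled_ = enabled; }

    // Plays the role of the write done by send_data: a duplicate-key
    // failure is swallowed, so the row is simply not stored.
    void write_row(const Row &row) {
        if (unique_enabled_ && !index_.insert(row).second)
            return;  // duplicate key: ignored
        rows_.push_back(row);
    }

    const std::vector<Row> &rows() const { return rows_; }

private:
    bool unique_enabled_ = false;
    std::set<Row> index_;
    std::vector<Row> rows_;
};

std::vector<Row> execute_union_unit(const std::vector<std::vector<Row>> &selects,
                                    const std::vector<bool> &is_distinct) {
    assert(!selects.empty() && is_distinct.size() + 1 == selects.size());
    // Find the second operand of the last UNION, if there is any UNION.
    bool any_distinct = false;
    std::size_t last_distinct_select = 0;
    for (std::size_t i = 0; i < is_distinct.size(); ++i) {
        if (is_distinct[i]) {
            any_distinct = true;
            last_distinct_select = i + 1;
        }
    }
    TempTable table;
    table.enable_unique_index(any_distinct);  // enabled before the first select
    for (std::size_t i = 0; i < selects.size(); ++i) {
        for (const Row &row : selects[i])
            table.write_row(row);
        if (any_distinct && i == last_distinct_select)
            table.enable_unique_index(false);  // disabled after the last UNION
    }
    return table.rows();
}
```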
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. It would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would no longer be needed. When the result set is so big that
the temporary table cannot be held in main memory, the performance gains would
be significant. Besides, the client could get the first result rows at once, as
it would not have to wait until all selects have been executed.
To prevent a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
 class select_union_send :public select_union
 {
   ... // private structures
 public:
   select_union_send() :select_union(), ...{...}
   bool send_data(List<Item> &items);
   bool send_fields(List<Item> &list, uint flags);
   bool send_eof();
   bool create_result_table(THD *thd, List<Item> *column_types,
                            bool is_distinct, ulonglong options,
                            const char *alias);
 };
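The intended behaviour of the three methods can be illustrated with a
stand-alone model. Everything in it is an assumption made for the sketch:
OutputStream, UnionSendSketch and the "HEADER"/"ROW"/"EOF" packet strings are
hypothetical and do not correspond to server classes or to the client
protocol. Returning false for success is assumed here as a convention.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical row representation

// Hypothetical stand-in for the connection to the client.
struct OutputStream {
    std::vector<std::string> packets;
};

static std::string join(const std::vector<std::string> &parts) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out += ",";
        out += parts[i];
    }
    return out;
}

// Models an interceptor shared by all selects of a UNION ALL-only unit.
class UnionSendSketch {
public:
    UnionSendSketch(OutputStream &out, std::size_t select_count)
        : out_(out), remaining_selects_(select_count) {
        assert(select_count > 0);
    }

    // Called once per select; only the first call reaches the client.
    bool send_fields(const std::vector<std::string> &names) {
        if (!header_sent_) {
            out_.packets.push_back("HEADER " + join(names));
            header_sent_ = true;
        }
        return false;
    }

    // The copy into record_buffer_ stands in for the type conversions;
    // the row then goes straight to the output, not to a temporary table.
    bool send_data(const Row &items) {
        record_buffer_ = items;
        out_.packets.push_back("ROW " + join(record_buffer_));
        return false;
    }

    // Called once per select; only the last call signals the end of rows.
    bool send_eof() {
        assert(remaining_selects_ > 0);
        if (--remaining_selects_ == 0)
            out_.packets.push_back("EOF");
        return false;
    }

private:
    OutputStream &out_;
    std::size_t remaining_selects_;
    bool header_sent_ = false;
    Row record_buffer_;
};
```

The point of the sketch is the call pattern: every select calls all three
methods, but the client sees one header, all rows in arrival order, and one
end marker.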
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. This requires another variant of the fill_record procedure
that takes as a parameter the info that says which fields are to be
stored in the record buffer.
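A possible shape of such a selective send_data is sketched below. The names
send_data_selective, convert_to_result_type and SendStats are hypothetical,
the bitmap is modelled as std::vector<bool>, and the conversion itself is a
placeholder that only marks the value as converted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical row representation

// Counts how many fields took each path, to make the saving visible.
struct SendStats {
    std::size_t buffered = 0;
    std::size_t direct = 0;
};

// Placeholder for a real type conversion (assumption: int shown as double).
std::string convert_to_result_type(const std::string &value) {
    return value + ".0";
}

// Builds the row that goes to the output stream. Only fields flagged in
// needs_conversion pass through the record buffer.
Row send_data_selective(const Row &items,
                        const std::vector<bool> &needs_conversion,
                        SendStats &stats) {
    assert(items.size() == needs_conversion.size());
    Row output;
    output.reserve(items.size());
    std::string record_buffer_field;  // stands in for the record buffer
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (needs_conversion[i]) {
            record_buffer_field = convert_to_result_type(items[i]);
            output.push_back(record_buffer_field);
            ++stats.buffered;
        } else {
            output.push_back(items[i]);  // sent directly, no extra copy step
            ++stats.direct;
        }
    }
    return output;
}
```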
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo a serious re-work.
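The split between the two interceptor objects can be computed from the
operator sequence alone. The function below is a hypothetical helper, not
part of the server: is_distinct[i] is assumed to describe the operation
between select i and select i+1 (true for UNION), and selects are numbered
from zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first select whose rows could be sent directly
// to the output stream. Selects before that index share the select_union
// interceptor; selects from that index on share select_union_send.
// A result equal to the number of selects means that no select can stream.
std::size_t first_streamed_select(const std::vector<bool> &is_distinct) {
    for (std::size_t i = is_distinct.size(); i-- > 0;) {
        if (is_distinct[i])
            return i + 2;  // skip the second operand of the last UNION
    }
    return 0;  // UNION ALL only: every select can stream
}
```

For a unit of the form s0 UNION s1 UNION ALL s2 UNION ALL s3 the function
returns 2, so s0 and s1 go through the temporary table and s2 and s3 are
streamed.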
3. Other possible optimizations for union units
===============================================
The following optimizations are not intended to be implemented within the
framework of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
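Item 2 of the list above can be illustrated by the following sketch. It is
an assumption-laden model only: a hash set plays the role of the temporary
table that is kept for duplicate lookups, a row is flattened to one string,
and stream_distinct and the emit callback are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

using Row = std::string;  // hypothetical: one row flattened to a string

// Emits each distinct row as soon as it is seen for the first time, so the
// client does not have to wait until all selects have been executed.
void stream_distinct(const std::vector<std::vector<Row>> &selects,
                     const std::function<void(const Row &)> &emit) {
    assert(emit && "an output callback is required");
    std::unordered_set<Row> seen;  // used only for duplicate lookups
    for (const auto &select_rows : selects) {
        for (const Row &row : select_rows) {
            if (seen.insert(row).second)  // true only for a new row
                emit(row);
        }
    }
}
```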
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, whenever a union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common of
these cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                   (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                   (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                   (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                   (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced
with UNION without changing the result if another UNION occurs later in the
sequence.
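The equivalence rule above can be illustrated with a small, self-contained
sketch. It is not server code: the UnionOp enum and the function
normalize_union_ops are hypothetical names introduced only for this
illustration, and a union unit is modelled simply as the list of operators
between its selects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: one entry per operator between two adjacent selects.
enum class UnionOp { Distinct, All };   // Distinct stands for plain UNION

// Rewrites every UNION ALL that precedes the last UNION as UNION.
// Operators after the last UNION are left unchanged.
std::vector<UnionOp> normalize_union_ops(std::vector<UnionOp> ops)
{
  std::size_t last_distinct = ops.size();          // marker: no UNION found
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct)
      last_distinct = i;

  if (last_distinct == ops.size())
    return ops;                                    // only UNION ALL operators

  for (std::size_t i = 0; i < last_distinct; ++i)
    ops[i] = UnionOp::Distinct;
  return ops;
}
```

Under this model the operator list of query (4), {All, Distinct}, normalizes to
{Distinct, Distinct}, which is the operator list of query (1), while the list
of query (3), {Distinct, All}, is left as it is.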
MySQL does not accept nested unions. For example, the following query, although
valid SQL, is rejected by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union
    (select a2,b2 from t2 where a2!=b2) )
  union all
  ( (select a3,b3 from t3 where a3=b3) union
    (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare.
The method first validates each of the select constructs of the unit and then
checks that all selects are compatible: the selects must return the same number
of columns, and for each column position k there must be a type to which the
types of the k-th columns of all selects can be coerced. This type is taken as
the type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will
be of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the third column of the
result set will be varchar(20). If these columns have different collations,
then a collation from which all these collations can be derived is looked for
and it is assigned as the collation of the third column in the result set.
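The two type examples above can be reproduced with a deliberately simplified
model. This is only a sketch: Kind, ColType, aggregate_two and
aggregate_column are hypothetical names, only the types mentioned in the text
are modelled, and collations as well as mixtures of numeric and string columns
are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified type model. The numeric kinds are declared in
// increasing order, so the "wider" of two numeric kinds is the larger value.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColType {
  Kind kind;
  unsigned length;   // meaningful for Varchar only, 0 otherwise
};

inline bool is_numeric(Kind k) { return k != Kind::Varchar; }

// Finds a type both operands can be coerced to. This sketch handles only
// numeric/numeric and varchar/varchar pairs.
ColType aggregate_two(ColType a, ColType b)
{
  assert(is_numeric(a.kind) == is_numeric(b.kind));
  if (a.kind == Kind::Varchar)
    return {Kind::Varchar, std::max(a.length, b.length)};
  return {std::max(a.kind, b.kind), 0u};
}

// Type of column k of the union unit, given the types of the k-th column
// of every select of the unit.
ColType aggregate_column(const std::vector<ColType> &operands)
{
  assert(!operands.empty());
  ColType result = operands.front();
  for (std::size_t i = 1; i < operands.size(); ++i)
    result = aggregate_two(result, operands[i]);
  return result;
}
```

With this model the columns int, bigint, double aggregate to double, and
varchar(10), varchar(20), varchar(10) aggregate to varchar(20), as in the
examples above.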
After the compatibility of the corresponding select columns has been checked
and the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects of the
union unit are always written into a temporary table. To force the selects to
send rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects
for the selects such that the JOIN::result field refers to an object of the
class select_union. All selects of a union unit share the same select_union
object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the use of the
unique index defined on all fields of the temporary table. The index is never
used if only UNION ALL operations occur in the unit. Otherwise it is enabled
before the first select is executed and disabled after the last UNION
operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the
temporary table. To do this the method calls the function fill_record. All
needed type conversions of the field values are performed when they are stored
in the record buffer. After this the method select_union::send_data calls the
ha_write_row handler function to write the record from the buffer to the
temporary table. A duplicate-key error that may occur on an attempt to write a
duplicate row is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client).
If there is an ORDER BY clause to be applied to the result of the union unit,
then the rows read from the temporary table have to be sorted first.
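The execution scheme described in this section can be simulated with a short
sketch. It is an illustration under simplifying assumptions and not the server
code: rows are vectors of integers, the temporary table is a std::vector, the
unique index is a std::set, and the names Row, UnionOp and exec_union_unit are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Row = std::vector<int>;
enum class UnionOp { Distinct, All };   // Distinct stands for plain UNION

// selects[i] holds the rows produced by the i-th select of the unit;
// ops[i] is the operator between select i and select i+1.
std::vector<Row> exec_union_unit(const std::vector<std::vector<Row>> &selects,
                                 const std::vector<UnionOp> &ops)
{
  assert(!selects.empty() && ops.size() + 1 == selects.size());

  // Number of leading selects written while the unique index is enabled:
  // everything up to and including the second operand of the last UNION.
  std::size_t dedup_selects = 0;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct)
      dedup_selects = i + 2;

  std::vector<Row> temp_table;
  std::set<Row> unique_index;
  for (std::size_t s = 0; s < selects.size(); ++s) {
    const bool index_enabled = s < dedup_selects;
    for (const Row &row : selects[s]) {
      // With the index enabled a duplicate row is rejected and the
      // resulting "duplicate key" condition is silently ignored.
      if (index_enabled && !unique_index.insert(row).second)
        continue;
      temp_table.push_back(row);
    }
  }
  return temp_table;   // in the server these rows would then go to the client
}
```

For three selects returning {1,2}, {2,3} and {3,3}, the operator list
{Distinct, All} yields 1,2,3,3,3 in this model, {All, Distinct} yields 1,2,3,
and {All, All} yields all six rows.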
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed for implementation as part of
this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send the rows received from the
selects to a temporary table at all. After all needed type conversions have
been done, the row fields could be sent directly to the output stream. This
would improve the performance of UNION ALL operations, since writing to the
temporary table and reading from it would no longer be needed. When the result
set is so big that the temporary table cannot be kept in main memory, the
performance gains would be significant. Besides, the client could get the first
result rows at once, as it would not have to wait until all selects have been
executed.
To make a UNION ALL operation not send rows to a temporary table, we could
provide the JOIN objects created for the selects of the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, which is derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall store the fields received from selects in the record
buffer of the temporary table and then send them from this buffer to the output
stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
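The intended behaviour of the three methods can be sketched outside the server
with stand-in types. Everything below is hypothetical and simplified:
ClientStream plays the role of the output stream, UnionSendSketch plays the
role of the proposed interceptor, rows are vectors of strings, and it is
assumed that every select calls send_fields and send_eof once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Stand-in for the output stream to the client.
struct ClientStream {
  std::vector<std::string> header;   // row format, sent once
  std::vector<Row> rows;             // rows in the order they were sent
  bool eof = false;                  // end-of-rows marker
};

// Simplified model of an interceptor shared by all selects of a unit.
class UnionSendSketch {
  ClientStream &out_;
  std::size_t selects_left_;
  bool header_sent_ = false;
  Row record_buffer_;                // where type conversions would happen
public:
  UnionSendSketch(ClientStream &out, std::size_t select_count)
    : out_(out), selects_left_(select_count)
  { assert(select_count > 0); }

  // Only the call made for the first select reaches the client.
  void send_fields(const std::vector<std::string> &column_names)
  {
    if (header_sent_)
      return;
    out_.header = column_names;
    header_sent_ = true;
  }

  // The row goes through the record buffer and then straight to the client.
  void send_data(const Row &row)
  {
    record_buffer_ = row;
    out_.rows.push_back(record_buffer_);
  }

  // The end of rows is signalled only after the last select has finished.
  void send_eof()
  {
    assert(selects_left_ > 0);
    if (--selects_left_ == 0)
      out_.eof = true;
  }
};
```

In this model the client sees each row as soon as send_data is called, which
corresponds to the benefit described above: the first rows arrive before the
remaining selects have been executed.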
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion, it does not make sense to send it to
a record buffer. It can be sent directly to the output stream. Different
selects can require type conversions for different columns.
Let's provide each select of the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before a
select is executed, this data structure must be passed to the
select_union_send object shared by all selects of the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info saying which fields are to be stored
in the record buffer.
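A per-select conversion map of this kind could be computed as in the sketch
below. It is an assumption-laden illustration, not a proposal for the actual
data structure: Kind and conversion_map are hypothetical names, types are
reduced to a small enum, and a column is taken to need conversion exactly when
its type differs from the type of the corresponding result column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified column types.
enum class Kind { Int, BigInt, Double };

// Returns one flag per column of a select: true if the column has to go
// through the record buffer for conversion, false if it could be sent to
// the output stream directly.
std::vector<bool> conversion_map(const std::vector<Kind> &select_types,
                                 const std::vector<Kind> &result_types)
{
  assert(select_types.size() == result_types.size());
  std::vector<bool> needs_conversion(select_types.size());
  for (std::size_t i = 0; i < select_types.size(); ++i)
    needs_conversion[i] = select_types[i] != result_types[i];
  return needs_conversion;
}
```

For a result set of types (int, double), a select returning (int, int) gets
the map (false, true), while a select returning (int, double) gets
(false, false) and would not need the record buffer at all in this model.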
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY
is used at the top level of a query, then any UNION ALL operation after the
last UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created:
one, of the type select_union, is shared by the selects for which UNION
operations are performed; the other, of the type select_union_send, is shared
by the remaining selects. For this optimization the method
SELECT_LEX_UNIT::exec will have to undergo a serious re-work.
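The split point between the two interceptor objects can be computed from the
operator list alone. The following sketch shows one way to do it; UnionOp and
first_streamable_select are hypothetical names, and ops[i] is assumed to be the
operator between select i and select i+1 (numbering from 0).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class UnionOp { Distinct, All };   // Distinct stands for plain UNION

// Returns the index of the first select whose rows could bypass the
// temporary table. Selects with a smaller index would share the
// select_union object, the others the select_union_send object.
// A return value equal to the number of selects (ops.size() + 1) means
// that no select can bypass the temporary table.
std::size_t first_streamable_select(const std::vector<UnionOp> &ops)
{
  std::size_t first = 0;   // no UNION at all: every select can stream
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct)
      first = i + 2;       // select i+1 is the second operand of this UNION
  return first;
}
```

For the operator list {Distinct, All, All} the function returns 2, so the
third and fourth selects could send their rows directly. For {All, Distinct}
it returns 3, the number of selects, so nothing can be streamed.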
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
the selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, whenever a union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1)
  union
  (select a2,b2,c2 from t2 where a2!=b2)
  union
  (select a3,b3,c3 from t3 where a3>b3);       (1)

  (select a1,b1,c1 from t1 where a1=b1)
  union all
  (select a2,b2,c2 from t2 where a2!=b2)
  union all
  (select a3,b3,c3 from t3 where a3>b3);       (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1)
  union
  (select a2,b2,c2 from t2 where a2!=b2)
  union all
  (select a3,b3,c3 from t3 where a3>b3);       (3)

  (select a1,b1,c1 from t1 where a1=b1)
  union all
  (select a2,b2,c2 from t2 where a2!=b2)
  union
  (select a3,b3,c3 from t3 where a3>b3);       (4)
Note that query (4) is equivalent to query (1), whereas query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
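
The equivalence claims above can be checked with a small model. The following
is a minimal sketch, not server code: a row is reduced to a single integer,
the names SetOp, distinct and eval_union_chain are hypothetical, and the chain
is assumed to be evaluated strictly left to right, with duplicates removed
from the accumulated result at every UNION.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model types; these are not MySQL server structures.
enum class SetOp { Union, UnionAll };
using Rows = std::vector<int>;  // one int stands for a whole row

// Removes duplicate rows, keeping the first occurrence of each.
Rows distinct(const Rows &in) {
  std::set<int> seen;
  Rows out;
  for (int r : in)
    if (seen.insert(r).second) out.push_back(r);
  return out;
}

// Evaluates operands[0] ops[0] operands[1] ops[1] ... from left to right.
Rows eval_union_chain(const std::vector<Rows> &operands,
                      const std::vector<SetOp> &ops) {
  assert(!operands.empty() && ops.size() + 1 == operands.size());
  Rows acc = operands[0];
  for (std::size_t i = 0; i < ops.size(); ++i) {
    acc.insert(acc.end(), operands[i + 1].begin(), operands[i + 1].end());
    // UNION removes duplicates from everything accumulated so far,
    // which is why an earlier UNION ALL has no visible effect.
    if (ops[i] == SetOp::Union) acc = distinct(acc);
  }
  return acc;
}
```

With the same three operands, the operator sequences (UNION, UNION) and
(UNION ALL, UNION) give the same rows in this model, while (UNION, UNION ALL)
keeps the duplicates contributed by the last operand.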
MySQL does not accept nested unions. For example, the following otherwise valid
query is rejected by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2) )
  union all
  ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare.
The method first validates each of the select constructs of the unit and then
checks that all selects are compatible. It checks that the selects return
the same number of columns and that for each set of columns with the same
number k there is a type to which the types of the columns can be coerced. This
type is taken as the type of column k of the result set returned by the union
unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and is assigned as the collation of the third column in the result set.
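
The column compatibility check can be illustrated with a toy model. The sketch
below is an assumption-laden simplification and does not reproduce the
server's real coercion rules: it knows only the four types used in the
examples above, treats a mix of numeric and varchar columns as "not modelled",
and ignores collations. The names Kind, ColType, common_type and
union_column_types are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy column type; the enum order Int < BigInt < Double encodes "wider".
enum class Kind { Int, BigInt, Double, Varchar };
struct ColType {
  Kind kind;
  unsigned length;  // meaningful for Varchar only
  bool operator==(const ColType &o) const {
    return kind == o.kind && length == o.length;
  }
};
using SelectTypes = std::vector<ColType>;

// Stores the common type of a and b in *out and returns true, or returns
// false when this toy model has no rule for the pair.
bool common_type(const ColType &a, const ColType &b, ColType *out) {
  const bool a_str = a.kind == Kind::Varchar;
  const bool b_str = b.kind == Kind::Varchar;
  if (a_str != b_str) return false;  // numeric/varchar mix: not modelled
  if (a_str)
    *out = ColType{Kind::Varchar, std::max(a.length, b.length)};
  else
    *out = ColType{std::max(a.kind, b.kind), 0};
  return true;
}

// Computes the column types of the union result. Returns false when the
// selects differ in column count or some column pair cannot be coerced.
bool union_column_types(const std::vector<SelectTypes> &selects,
                        SelectTypes *result) {
  assert(!selects.empty() && result != nullptr);
  SelectTypes acc = selects[0];
  for (std::size_t s = 1; s < selects.size(); ++s) {
    if (selects[s].size() != acc.size()) return false;
    for (std::size_t k = 0; k < acc.size(); ++k) {
      ColType merged;
      if (!common_type(acc[k], selects[s][k], &merged)) return false;
      acc[k] = merged;
    }
  }
  *result = acc;
  return true;
}
```

For columns of types int, bigint and double the model yields double, and for
varchar(10), varchar(20), varchar(10) it yields varchar(20), matching the
examples in the text.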
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently, rows returned by the selects of the
union unit are always written into a temporary table. To force the selects to
send rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects
for the selects such that the JOIN::result field refers to an object of the
class select_union. All selects of a union unit share the same select_union
object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for the rows of the result sets
returned by the selects of the unit, and has prepared all data structures needed
for execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects of the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to the temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
rows are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After
this, select_union::send_data calls the ha_write_row handler function to write
the record from the buffer to the temporary table. A duplicate-key error that
occurs on an attempt to write a duplicate row is ignored.
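
The interplay of the unique index and the ignored duplicate-key error can be
sketched as follows. This is an illustrative model only: TempTable,
write_row and exec_union_unit are hypothetical stand-ins for the temporary
table, the ha_write_row call and SELECT_LEX_UNIT::exec, and a row is a plain
vector of integers.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum class SetOp { Union, UnionAll };
using Row = std::vector<int>;

// Stand-in for the temporary table with a unique index over all fields.
class TempTable {
 public:
  void enable_unique_index(bool on) { unique_ = on; }
  // Returns false on a duplicate-key error; the row is then not stored.
  bool write_row(const Row &row) {
    if (unique_ && !index_.insert(row).second) return false;
    rows_.push_back(row);
    return true;
  }
  const std::vector<Row> &rows() const { return rows_; }

 private:
  bool unique_ = false;
  std::set<Row> index_;
  std::vector<Row> rows_;
};

// ops[i] joins select i and select i + 1. The index stays enabled up to and
// including the second operand of the last UNION and is disabled afterwards.
std::vector<Row> exec_union_unit(const std::vector<std::vector<Row>> &selects,
                                 const std::vector<SetOp> &ops) {
  assert(!selects.empty() && ops.size() + 1 == selects.size());
  int last_distinct = -1;  // last select that is an operand of a UNION
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == SetOp::Union) last_distinct = static_cast<int>(i) + 1;

  TempTable table;
  for (std::size_t i = 0; i < selects.size(); ++i) {
    table.enable_unique_index(static_cast<int>(i) <= last_distinct);
    for (const Row &row : selects[i])
      (void)table.write_row(row);  // a duplicate-key error is ignored
  }
  return table.rows();
}
```

When the unit contains only UNION ALL operations, last_distinct stays at -1 and
the index is never enabled, so every row is stored.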
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would no longer be needed. In cases where the result set is
so big that the temporary table cannot be allocated in main memory the
performance gains would be significant. Besides, the client could get the first
result rows at once, as it would not have to wait until all selects have been
executed.
To make a UNION ALL operation not send rows to a temporary table we could
provide the JOIN objects created for the selects of the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send : public select_union
  {
    ... // private structures
  public:
    select_union_send() : select_union(), ... {...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
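
To make the intended behaviour of the two interceptors concrete, here is a
self-contained sketch that uses simplified stand-in types instead of the
server's List<Item>, THD and protocol classes. All names in it (OutputStream,
UnionSink, UnionSendSink, run_selects) are hypothetical, and it assumes the
convention that these methods return false on success. It shows the header
being sent once, rows being streamed as they arrive, and the end of the result
being signalled only after the last select.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;
using Header = std::vector<std::string>;

// Stand-in for the client connection: records what would go on the wire.
struct OutputStream {
  Header header;
  std::vector<Row> rows;
  bool eof = false;
};

// Stand-in for select_union: every row ends up in a "temporary table".
class UnionSink {
 public:
  virtual ~UnionSink() = default;
  virtual bool send_fields(const Header &) { return false; }
  virtual bool send_data(const Row &row) {
    temp_table_.push_back(row);
    return false;
  }
  virtual bool send_eof() { return false; }
  std::size_t temp_table_rows() const { return temp_table_.size(); }

 private:
  std::vector<Row> temp_table_;
};

// Stand-in for the proposed select_union_send: rows go straight to the client.
class UnionSendSink : public UnionSink {
 public:
  UnionSendSink(OutputStream *out, std::size_t select_count)
      : out_(out), remaining_selects_(select_count) {
    assert(out != nullptr && select_count > 0);
  }
  bool send_fields(const Header &names) override {
    if (!header_sent_) {  // the row format is sent before the first select only
      out_->header = names;
      header_sent_ = true;
    }
    return false;
  }
  bool send_data(const Row &row) override {
    record_buffer_ = row;  // type conversions would happen at this step
    out_->rows.push_back(record_buffer_);
    return false;
  }
  bool send_eof() override {
    if (remaining_selects_ > 0 && --remaining_selects_ == 0) out_->eof = true;
    return false;
  }

 private:
  OutputStream *out_;
  std::size_t remaining_selects_;
  bool header_sent_ = false;
  Row record_buffer_;
};

// Drives one shared sink with the rows of each select in turn.
void run_selects(UnionSink &sink, const Header &names,
                 const std::vector<std::vector<Row>> &selects) {
  for (const auto &select_rows : selects) {
    sink.send_fields(names);
    for (const Row &row : select_rows) sink.send_data(row);
    sink.send_eof();
  }
}
```

Because both sinks share one interface, the code that executes the selects
does not need to know which of them it is feeding; only the object handed to
the selects changes.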
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select of the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
the execution of a select this data structure must be passed to the
select_union_send object shared by all selects of the unit. The information in
this structure will tell select_union_send::send_data which fields should be
sent to the record buffer for type conversion and which can be sent directly to
the output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the information that says which fields are to
be stored in the record buffer.
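
A minimal sketch of the bitmap idea, under simplifying assumptions: every
result column of the union is a string, a select column is either an integer
(needs conversion) or a string (does not), and the output stream is a vector
of strings. Kind, Value, conversion_bitmap and UnionSendModel are hypothetical
names, not server code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Int, Str };

// Toy field value as produced by a select.
struct Value {
  Kind kind;
  long long int_value;    // used when kind == Kind::Int
  std::string str_value;  // used when kind == Kind::Str
};

// Builds the per-select bitmap: bit k is set when column k needs conversion
// to the (assumed) string type of the result column.
std::vector<bool> conversion_bitmap(const std::vector<Kind> &select_columns) {
  std::vector<bool> bits;
  for (Kind k : select_columns) bits.push_back(k == Kind::Int);
  return bits;
}

struct UnionSendModel {
  std::vector<std::string> record_buffer;  // one slot per result column
  std::vector<std::string> output;         // values sent to the client
  std::size_t buffer_stores = 0;           // writes to the record buffer

  // Sends one row; only fields flagged in 'convert' pass through the buffer.
  void send_data(const std::vector<Value> &row,
                 const std::vector<bool> &convert) {
    assert(row.size() == convert.size());
    record_buffer.resize(row.size());
    for (std::size_t k = 0; k < row.size(); ++k) {
      if (convert[k]) {
        record_buffer[k] = std::to_string(row[k].int_value);  // conversion
        ++buffer_stores;
        output.push_back(record_buffer[k]);
      } else {
        output.push_back(row[k].str_value);  // sent directly, no copy
      }
    }
  }
};
```

The counter buffer_stores makes the saving visible: for a row with one integer
and one string column only a single field is written to the record buffer.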
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec will
need a serious rework.
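
Which selects may bypass the temporary table follows directly from the position
of the last UNION. The sketch below only computes that split; the names
first_streamable_select, SinkKind and sink_for_select are hypothetical, and
ops[i] is assumed to be the operator between select i and select i + 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class SetOp { Union, UnionAll };
enum class SinkKind { TempTable, Stream };

// Returns the index of the first select whose rows can be sent directly to
// the output stream. The result equals the number of selects when the last
// operator is a UNION, that is, when no select can be streamed.
std::size_t first_streamable_select(const std::vector<SetOp> &ops) {
  std::size_t first = 0;  // no UNION at all: every select can be streamed
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == SetOp::Union) first = i + 2;  // skip its second operand
  return first;
}

// Chooses the interceptor kind for the select with the given index.
SinkKind sink_for_select(const std::vector<SetOp> &ops,
                         std::size_t select_index) {
  assert(select_index <= ops.size());  // there are ops.size() + 1 selects
  return select_index >= first_streamable_select(ops) ? SinkKind::Stream
                                                      : SinkKind::TempTable;
}
```

For the operator sequence UNION, UNION ALL, UNION ALL the first two selects
use the temporary table and the last two are streamed; for a sequence
consisting only of UNION ALL this reduces to the case of section 2.1.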
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation used
at the top level of a query without an ORDER BY clause this is not necessary:
the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows these operations to be used in a sequence, one after another. For
example, the following queries are accepted by MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                                    (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                                    (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                                    (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                                    (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
MySQL does not accept nested unions. For example, the following otherwise valid
query is rejected by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union
    (select a2,b2 from t2 where a2!=b2) ) union all
  ( (select a3,b3 from t3 where a3=b3) union
    (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. The method validates each of the select constructs of
the unit and then checks that all selects are compatible: they must return the
same number of columns, and for each column position k there must be a type to
which the types of the k-th columns of all selects can be coerced. This type is
taken as the type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
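The type resolution described above can be illustrated with a minimal sketch.
It is not the server's code: the names Kind, ColumnType, common_type and
result_column_type are hypothetical, only three numeric kinds and varchar are
modelled, collations are left out, and mixed string/numeric columns are simply
reported as unsupported.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, heavily simplified column type. The enumerators are ordered so
// that a wider numeric kind has a larger value.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColumnType {
  Kind kind;
  std::size_t length;  // only meaningful for Varchar
};

// Returns a type both inputs can be coerced to, or nullopt when this toy model
// has no rule for the pair (assumption: string/numeric mixes are not modelled).
std::optional<ColumnType> common_type(const ColumnType &a, const ColumnType &b) {
  const bool a_is_string = a.kind == Kind::Varchar;
  const bool b_is_string = b.kind == Kind::Varchar;
  if (a_is_string && b_is_string)
    return ColumnType{Kind::Varchar, std::max(a.length, b.length)};
  if (a_is_string || b_is_string)
    return std::nullopt;
  return ColumnType{std::max(a.kind, b.kind), 0};
}

// Folds common_type over the k-th columns of all selects of a union unit.
std::optional<ColumnType> result_column_type(const std::vector<ColumnType> &cols) {
  if (cols.empty()) return std::nullopt;
  ColumnType acc = cols.front();
  for (std::size_t i = 1; i < cols.size(); ++i) {
    const std::optional<ColumnType> next = common_type(acc, cols[i]);
    if (!next) return std::nullopt;  // the selects are not compatible
    acc = *next;
  }
  return acc;
}
```

Under these assumptions, feeding the sketch int, bigint and double yields
double, and varchar(10), varchar(20), varchar(10) yields varchar(20), matching
the examples given for query (1).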
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force the selects to
send their rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN
objects for the selects such that the JOIN::result field refers to an object of
the class select_union. All selects from a union unit share the same
select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After
this, select_union::send_data calls the ha_write_row handler function to write
the record from the buffer to the temporary table. A duplicate-key error raised
by an attempt to write a duplicate row is ignored.
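The effect of switching the unique index on and off can be shown with a small
model. This is only a sketch: TempTable, write_row and execute_union are
hypothetical names, rows are plain integer vectors, and a std::set stands in
for the unique index on all fields.

```cpp
#include <cstddef>
#include <set>
#include <vector>

using Row = std::vector<int>;
enum class Op { Union, UnionAll };

// Toy temporary table whose unique index can be switched on and off.
class TempTable {
 public:
  void enable_unique_index(bool on) { unique_ = on; }
  // Models send_data followed by ha_write_row: while the index is enabled a
  // duplicate row fails to insert, and that failure is silently ignored.
  void write_row(const Row &row) {
    if (unique_ && !index_.insert(row).second) return;
    rows_.push_back(row);
  }
  const std::vector<Row> &rows() const { return rows_; }

 private:
  bool unique_ = false;
  std::set<Row> index_;
  std::vector<Row> rows_;
};

// ops[i] is the operation between selects[i] and selects[i + 1].
std::vector<Row> execute_union(const std::vector<std::vector<Row>> &selects,
                               const std::vector<Op> &ops) {
  std::size_t last_union = ops.size();  // ops.size() means "no UNION at all"
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == Op::Union) last_union = i;
  const bool any_union = last_union != ops.size();

  TempTable table;
  table.enable_unique_index(any_union);  // enabled before the first select
  for (std::size_t s = 0; s < selects.size(); ++s) {
    for (const Row &row : selects[s]) table.write_row(row);
    // selects[last_union + 1] is the second operand of the last UNION;
    // once it has been processed the index is no longer needed.
    if (any_union && s == last_union + 1) table.enable_unique_index(false);
  }
  return table.rows();
}
```

In this model a unit of the shape of query (4) produces the same rows as one
of the shape of query (1), because the index stays enabled from the first
select until the last UNION has been processed.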
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send the rows received from the
selects to a temporary table at all. After all needed type conversions have been
done the row fields could be sent directly to the output stream. This would
improve the performance of UNION ALL operations, since writing to the temporary
table and reading from it would no longer be needed. When the result set is so
big that the temporary table cannot be kept in main memory, the performance
gains would be significant. Besides, the client could get the first result rows
at once, as it would not have to wait until all selects have been executed.
To make a UNION ALL operation not send rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send the fields received from the selects to the
record buffer of the temporary table and then from this buffer to the output
stream. The method send_fields shall send the format of the rows to the client
before it starts getting records from the first select, while the method
send_eof shall signal the end of the rows after the last select finishes sending
records.
The method create_result_table of the class select_union shall be redefined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();  // named in the text above among the overridden methods
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
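The idea of swapping the interceptor can be shown in isolation with a toy
model. This is only a sketch: ToyUnionSink and ToyUnionSendSink are
hypothetical stand-ins for select_union and the proposed select_union_send,
rows are vectors of strings, a std::ostream plays the role of the client
connection, and a return value of false is taken to mean success.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Toy stand-in for select_union: every row is buffered in a "temporary table".
class ToyUnionSink {
 public:
  virtual ~ToyUnionSink() = default;
  virtual bool send_data(const Row &row) {
    table_.push_back(row);
    return false;  // false = no error (convention assumed for this sketch)
  }
  virtual bool send_eof() { return false; }
  const std::vector<Row> &table() const { return table_; }

 private:
  std::vector<Row> table_;
};

// Toy stand-in for select_union_send: rows bypass the table and go straight
// to the output stream.
class ToyUnionSendSink : public ToyUnionSink {
 public:
  explicit ToyUnionSendSink(std::ostream &out) : out_(out) {}
  bool send_data(const Row &row) override {
    for (std::size_t i = 0; i < row.size(); ++i)
      out_ << (i ? "," : "") << row[i];
    out_ << '\n';
    return false;
  }
  bool send_eof() override {
    out_ << "<eof>\n";  // end-of-rows marker, sent once after the last select
    return false;
  }

 private:
  std::ostream &out_;
};

// All selects of the unit push their rows through the same shared sink.
void run_unit(const std::vector<std::vector<Row>> &selects, ToyUnionSink &sink) {
  for (const std::vector<Row> &rows : selects)
    for (const Row &row : rows) sink.send_data(row);
  sink.send_eof();
}
```

With a ToyUnionSendSink the first row reaches the stream before the second
select has produced anything, which is the early-delivery benefit described
for the real class.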
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion, it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
the execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as a parameter the info that says which fields are to be
stored in the record buffer.
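A rough sketch of the bitmap idea follows. All names here (Kind, Value,
conversion_bitmap, send_row, SendStats) are hypothetical, only integer and
double columns are modelled, and the only "conversion" is integer to double;
the real variant of fill_record would operate on Item and Field objects.

```cpp
#include <cstddef>
#include <variant>
#include <vector>

enum class Kind { Int, Double };
using Value = std::variant<long long, double>;

struct SendStats {
  std::size_t converted = 0;  // fields that went through the record buffer
  std::size_t direct = 0;     // fields sent straight to the output
};

// One bit per column: true if this select's column type differs from the
// result column type, so the value has to be converted.
// Assumes both vectors have the same length.
std::vector<bool> conversion_bitmap(const std::vector<Kind> &select_cols,
                                    const std::vector<Kind> &result_cols) {
  std::vector<bool> bitmap(select_cols.size());
  for (std::size_t i = 0; i < select_cols.size(); ++i)
    bitmap[i] = select_cols[i] != result_cols[i];
  return bitmap;
}

// Sends one row. Only flagged fields are stored in the record buffer (where
// the conversion to double happens); the rest are copied to the output as is.
std::vector<Value> send_row(const std::vector<Value> &fields,
                            const std::vector<bool> &needs_conversion,
                            std::vector<Value> &record_buffer,
                            SendStats &stats) {
  std::vector<Value> output;
  output.reserve(fields.size());
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (needs_conversion[i]) {
      record_buffer[i] = std::visit(
          [](auto v) { return static_cast<double>(v); }, fields[i]);
      output.push_back(record_buffer[i]);
      ++stats.converted;
    } else {
      output.push_back(fields[i]);
      ++stats.direct;
    }
  }
  return output;
}
```

The bitmap is computed once per select, before its execution, so the per-row
cost is a single flag test per field.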
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the current
implementation. More exactly, the rows from any select that follows the second
operand of the last UNION operation could be sent directly to the output stream.
In this case two interceptor objects have to be created: one, of the type
select_union, is shared by the selects for which UNION operations are performed;
the other, of the type select_union_send, is shared by the remaining selects.
For this optimization the method SELECT_LEX_UNIT::exec has to undergo a serious
rework.
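The split between the two interceptor objects comes down to finding the last
UNION in the unit. A minimal sketch, with the hypothetical helper
first_streamed_select and the convention that ops[i] is the operation between
select i and select i + 1 (so the unit has ops.size() + 1 selects):

```cpp
#include <cstddef>
#include <vector>

enum class Op { Union, UnionAll };

// Returns the index of the first select whose rows could bypass the temporary
// table. Selects before that index would share the select_union object, the
// rest the select_union_send object. A result equal to the number of selects
// (ops.size() + 1) means that no select can be streamed.
std::size_t first_streamed_select(const std::vector<Op> &ops) {
  std::size_t first = 0;  // only UNION ALL: every select can be streamed
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == Op::Union) first = i + 2;  // skip both operands of this UNION
  return first;
}
```

For a unit of the shape of query (3) the helper returns 2, so only the third
select is streamed; for the shape of query (4) it returns 3, the number of
selects, so everything still goes through the temporary table.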
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL operations with an ORDER BY, send
rows from the selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that it
is not a duplicate.
3. For a union unit used in an EXISTS or IN subquery, do not use a temporary
table at all.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:45)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22406 2009-08-14 08:45:22.000000000 +0300
+++ /tmp/wklog.44.new.22406 2009-08-14 08:45:22.000000000 +0300
@@ -6,15 +6,15 @@
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
- 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
-==================================
+============================================
1.1. Specifics of MySQL union operations
-------------------------------------------------------
+----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
@@ -49,7 +49,7 @@
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------------
+-----------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
@@ -77,7 +77,7 @@
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
-----------------------------------
+----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
@@ -109,13 +109,13 @@
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
-=================================================
+===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
-------------------------------------------------------------------
+--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
@@ -159,7 +159,7 @@
};
2.2. Avoiding unnecessary copying
-------------------------------------------
+---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
@@ -174,8 +174,8 @@
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
-2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
-----------------------------------------------------------------------------------------------------------
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
+----------------------------------------------------------------------
If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
@@ -190,7 +190,7 @@
3. Other possible optimizations for union units
-=================================
+===============================================
The following optimizations are not supposed to be implemented in the framework
this task.
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Not to use temporary table for any union unit used in EXIST or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation used
at the top level of a query without an ORDER BY clause this is not necessary:
the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in the most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
============================================
1.1. Specifics of MySQL union operations
----------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows these operations to be used in a sequence, one after another. For
example, the following queries are accepted by MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                                    (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                                    (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                                    (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                                    (4)
Note that query (4) is equivalent to query (1), while query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
MySQL does not accept nested unions. For example, the following otherwise valid
query is rejected by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union
    (select a2,b2 from t2 where a2!=b2) ) union all
  ( (select a3,b3 from t3 where a3=b3) union
    (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
-----------------------------
When the parser stage is over, the further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. The method first validates each of the select
constructs of the unit and then checks that all selects are compatible: that
they return the same number of columns and that, for each column position k,
there is a type to which the types of the k-th columns of all selects can be
coerced. This type is taken as the type of column k of the result set returned
by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
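The type aggregation in the example above can be illustrated with a toy model.
This is a sketch under loud assumptions, not the server's coercion rules: it
knows only int, bigint, double and varchar(n), ranks the numeric kinds in that
order, and reports failure for numeric/string mixes, which it does not model at
all. The names Kind, ColType, merge_types and result_type are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy column type: a numeric kind or a varchar with a maximum length.
enum class Kind { Int, BigInt, Double, Varchar };

struct ColType {
  Kind kind;
  unsigned length;  // only meaningful for Varchar
};

// Rank used to pick the widest numeric kind (assumed ordering for the sketch).
int numeric_rank(Kind k) {
  switch (k) {
    case Kind::Int: return 0;
    case Kind::BigInt: return 1;
    case Kind::Double: return 2;
    default: return -1;
  }
}

// Merge two column types into one that both can be coerced to.
// Returns false when the sketch does not model the combination.
bool merge_types(const ColType &a, const ColType &b, ColType *out) {
  const bool a_str = a.kind == Kind::Varchar;
  const bool b_str = b.kind == Kind::Varchar;
  if (a_str && b_str) {
    *out = ColType{Kind::Varchar, std::max(a.length, b.length)};
    return true;
  }
  if (!a_str && !b_str) {
    const Kind wider =
        numeric_rank(a.kind) >= numeric_rank(b.kind) ? a.kind : b.kind;
    *out = ColType{wider, 0};
    return true;
  }
  return false;  // numeric/string mixes are left out of this sketch
}

// Fold merge_types over column k of every select in the unit.
bool result_type(const std::vector<ColType> &column_k, ColType *out) {
  assert(!column_k.empty());
  ColType acc = column_k.front();
  for (std::size_t i = 1; i < column_k.size(); ++i)
    if (!merge_types(acc, column_k[i], &acc)) return false;
  *out = acc;
  return true;
}
```

Feeding it the types of b1, b2, b3 yields double, and the types of c1, c2, c3
yield varchar(20), as in the example. Collations are not modelled here.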
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique index
defined on all fields of the temporary table. The index is never used if only
UNION ALL operations occur in the unit. Otherwise it is enabled before the
first select is executed and disabled after the last UNION operation.
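The enabling and disabling of the unique index can be pictured with a small
simulation. It is only a sketch: TempTable and exec_unit are hypothetical
stand-ins (a std::set plays the role of the unique index over all fields), and
the operators are passed as a vector of flags rather than taken from the
parsed query.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Row = std::vector<int>;
using Rows = std::vector<Row>;

// Stand-in for the temporary table: a row store plus an optional unique key
// over all columns.
class TempTable {
 public:
  void enable_unique(bool on) { unique_enabled_ = on; }

  // Returns false when the row is rejected as a duplicate.
  bool write_row(const Row &row) {
    if (unique_enabled_ && !keys_.insert(row).second) return false;
    rows_.push_back(row);
    return true;
  }

  const Rows &rows() const { return rows_; }

 private:
  bool unique_enabled_ = false;
  std::set<Row> keys_;
  Rows rows_;
};

// distinct[i] is true when the operator between select i and select i+1 is
// UNION, false when it is UNION ALL.
Rows exec_unit(const std::vector<Rows> &selects,
               const std::vector<bool> &distinct) {
  assert(!selects.empty() && distinct.size() + 1 == selects.size());
  // Position of the second operand of the last UNION, if there is any UNION.
  bool any_distinct = false;
  std::size_t last_distinct_select = 0;
  for (std::size_t i = 0; i < distinct.size(); ++i) {
    if (distinct[i]) {
      any_distinct = true;
      last_distinct_select = i + 1;
    }
  }

  TempTable table;
  table.enable_unique(any_distinct);  // enabled before the first select
  for (std::size_t i = 0; i < selects.size(); ++i) {
    for (const Row &r : selects[i]) table.write_row(r);  // duplicates ignored
    if (any_distinct && i == last_distinct_select) table.enable_unique(false);
  }
  return table.rows();
}
```

For three selects joined as UNION then UNION ALL, the rows of the third select
are appended without any duplicate check, while the first two are deduplicated.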
To send rows to the temporary table the method select_union::send_data is used.
For a row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible duplicate-key error that occurs on an attempt to write a duplicate row
is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
===============================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
--------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. In the cases when the result set is
so big that the temporary table cannot be kept in main memory the performance
gains would be significant. Besides, the client could get the first result rows
at once as it would not have to wait until all selects have been executed.
To keep a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects of the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be redefined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
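The difference between the two interceptors can be shown with a self-contained
toy that does not use any server types. Everything here is an assumption made
for illustration: BufferingSink stands in for select_union, StreamingSink for
the proposed select_union_send, a std::vector plays the role of the client
connection, and send_data returns false on success.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Analog of select_union: rows are collected first and emitted at the end.
class BufferingSink {
 public:
  explicit BufferingSink(std::vector<Row> *client) : client_(client) {}
  virtual ~BufferingSink() {}

  // Store the row in the stand-in temporary table. Returns false on success.
  virtual bool send_data(const Row &row) {
    table_.push_back(row);
    return false;
  }

  // Flush everything accumulated so far to the client.
  virtual bool send_eof() {
    client_->insert(client_->end(), table_.begin(), table_.end());
    table_.clear();
    return false;
  }

 protected:
  std::vector<Row> *client_;

 private:
  std::vector<Row> table_;
};

// Analog of the proposed select_union_send: each row goes out immediately.
class StreamingSink : public BufferingSink {
 public:
  explicit StreamingSink(std::vector<Row> *client) : BufferingSink(client) {}

  bool send_data(const Row &row) override {
    client_->push_back(row);
    return false;
  }

  bool send_eof() override { return false; }
};

// Drive every select of the unit through one shared sink.
bool run_unit(const std::vector<std::vector<Row> > &selects,
              BufferingSink *sink) {
  assert(sink != nullptr);
  for (const std::vector<Row> &select_rows : selects)
    for (const Row &row : select_rows)
      if (sink->send_data(row)) return true;
  return sink->send_eof();
}
```

With StreamingSink the client sees a row as soon as send_data returns; with
BufferingSink nothing is visible until send_eof, which is the waiting that the
proposal aims to remove.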
2.2. Avoiding unnecessary copying
---------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which don't. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as a parameter the info that says which fields are to be
stored in the record buffer.
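A per-select bitmap of this kind could drive the sending of a row roughly as
follows. This is a sketch only: fields are modelled as strings, the only
conversion modelled is "store as DOUBLE", and store_as_double, SendResult and
send_row are hypothetical names, not server functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for storing a field in the record buffer with conversion to DOUBLE.
std::string store_as_double(const std::string &field) {
  return std::to_string(std::stod(field));
}

struct SendResult {
  std::vector<std::string> output;  // what is written to the output stream
  std::size_t buffered_fields;      // fields that went through the buffer
};

// needs_conversion[k] is true when column k of the current select has a type
// different from column k of the result set.
SendResult send_row(const std::vector<std::string> &fields,
                    const std::vector<bool> &needs_conversion) {
  assert(fields.size() == needs_conversion.size());
  SendResult res;
  res.buffered_fields = 0;
  for (std::size_t k = 0; k < fields.size(); ++k) {
    if (needs_conversion[k]) {
      // Converted via the record buffer before being sent.
      res.output.push_back(store_as_double(fields[k]));
      ++res.buffered_fields;
    } else {
      // Sent as is, with no intermediate copy.
      res.output.push_back(fields[k]);
    }
  }
  return res;
}
```

Each select would carry its own bitmap, so the same column may be copied
through the buffer for one select and passed straight through for another.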
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL
----------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the current
implementation. More exactly, the rows from any select that follows the second
operand of the last UNION operation could be sent directly to the output
stream. In this case two interceptor objects have to be created: one, of the
type select_union, is shared by the selects for which UNION operations are
performed; another, of the type select_union_send, is shared by the remaining
selects. For this optimization the method SELECT_LEX_UNIT::exec is to undergo a
serious re-work.
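The split between the two interceptors follows directly from the position of
the last UNION. A minimal sketch, assuming a top-level unit without ORDER BY
and using hypothetical names (Target, plan_targets); the order in which
buffered and streamed rows reach the client is not modelled:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Target { TempTable, OutputStream };

// distinct[i] is true when the operator between select i and select i+1 is
// UNION, false when it is UNION ALL. The result says, for each select, where
// its rows would be sent.
std::vector<Target> plan_targets(const std::vector<bool> &distinct) {
  const std::size_t n_selects = distinct.size() + 1;
  // Number of leading selects that must go through the temporary table:
  // everything up to and including the second operand of the last UNION.
  std::size_t buffered = 0;
  for (std::size_t i = 0; i < distinct.size(); ++i)
    if (distinct[i]) buffered = i + 2;
  assert(buffered <= n_selects);

  std::vector<Target> plan(n_selects, Target::OutputStream);
  for (std::size_t i = 0; i < buffered; ++i) plan[i] = Target::TempTable;
  return plan;
}
```

For s0 UNION ALL s1 UNION s2 UNION ALL s3 the first three selects are buffered
and only s3 is streamed; a unit with no UNION at all degenerates to the case of
section 2.1, where every select is streamed.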
3. Other possible optimizations for union units
===============================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that it's
not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
DESCRIPTION:
Currently when any union operation is executed the rows received from its
operands are always sent to a temporary table. However, for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause
this is not necessary. In this case the rows could be sent directly to the
client. The goal of this task is to provide an implementation of the UNION ALL
operation that would not use a temporary table at all in certain, most common
cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
==================================
1.1. Specifics of MySQL union operations
------------------------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                                  (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                                  (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);                                  (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);                                  (4)
Note that query (4) is equivalent to query (1), whereas query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced with
UNION without changing the result if another UNION occurs later in the sequence.
MySQL does not accept nested unions. For example, the following otherwise valid
query is rejected by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union
    (select a2,b2 from t2 where a2!=b2) )
  union all
  ( (select a3,b3 from t3 where a3=b3) union
    (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
----------------------------------
When the parser stage is over, the further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. The method first validates each of the select
constructs of the unit and then checks that all selects are compatible: that
they return the same number of columns and that, for each column position k,
there is a type to which the types of the k-th columns of all selects can be
coerced. This type is taken as the type of column k of the result set returned
by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and assigned as the collation of the third column in the result set.
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique index
defined on all fields of the temporary table. The index is never used if only
UNION ALL operations occur in the unit. Otherwise it is enabled before the
first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For a row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible duplicate-key error that occurs on an attempt to write a duplicate row
is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
=================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
------------------------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. This would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. In the cases when the result set is
so big that the temporary table cannot be kept in main memory the performance
gains would be significant. Besides, the client could get the first result rows
at once as it would not have to wait until all selects have been executed.
To keep a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects of the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be redefined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
2.2. Avoiding unnecessary copying
------------------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which don't. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as a parameter the info that says which fields are to be
stored in the record buffer.
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
----------------------------------------------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo a serious rework.
3. Other possible optimizations for union units
=================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without ORDER BY clause send
any row received from an operand of a UNION operation directly to the output
stream as soon as it has been checked by a lookup in the temporary table that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+replaced with UNION if another UNION occurs later in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called a 'union
+unit' if it is not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validates the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all selects are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select is first optimized with JOIN::optimize(), then it's executed with
+JOIN::exec(). The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. This is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UNION ALL operations occur in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in the fields of the record buffer of the temporary
+table. To do this the method calls the function fill_record. All needed type
+conversions of the field values are performed when they are stored in the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to the result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To prevent a UNION ALL operation from sending rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select, while the method send_eof shall
+signal the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_union_send::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says which fields require conversion and which do not. Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in a more efficient way than in the
+current implementation. More exactly, the rows from any select that follows
+the second operand of the last UNION operation could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed; another, of the type select_union_send, is shared by the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
+undergo a serious rework.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+of this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Do not use a temporary table for any union unit used in an EXISTS or IN subquery.
+
DESCRIPTION:
Currently when any union operation is executed the rows received from its
operands are always sent to a temporary table. Meanwhile for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause it
is not necessary. In this case the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
==================================
1.1. Specifics of MySQL union operations
------------------------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows us to use these operations in a sequence, one after another. For example
the following queries are accepted by the MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);            (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);            (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);            (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);            (4)
It should be noted that query (4) is equivalent to query (1). At the same time
query (3) is not equivalent to any of the queries (1),(2),(4).
In general any UNION ALL in a sequence of union operations can be equivalently
replaced with UNION if another UNION occurs later in the sequence.
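The substitution rule can be sketched in C++ as follows. This is only an
illustration under assumed names: the enum UnionOp and the helpers
last_distinct_op and normalize_ops are hypothetical and do not exist in the
server code. A sequence of operations is modelled as a vector in which element i
is the operation between select i and select i+1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: ops[i] is the operation between select i and select i+1.
enum class UnionOp { Distinct, All };  // UNION and UNION ALL

// Returns the index of the last UNION operation, or -1 if there is none.
int last_distinct_op(const std::vector<UnionOp> &ops) {
  int last = -1;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Distinct) last = static_cast<int>(i);
  return last;
}

// Replaces every UNION ALL that precedes the last UNION with UNION.
// The returned sequence describes an equivalent union unit.
std::vector<UnionOp> normalize_ops(std::vector<UnionOp> ops) {
  const int last = last_distinct_op(ops);
  assert(last < static_cast<int>(ops.size()));
  for (int i = 0; i < last; ++i) ops[i] = UnionOp::Distinct;
  return ops;
}
```

Under this model the operations of query (4), {All, Distinct}, normalize to
{Distinct, Distinct}, the operations of query (1), while those of query (3),
{Distinct, All}, are left unchanged.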
MySQL does not accept nested unions. For example the following valid query is
considered by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union
    (select a2,b2 from t2 where a2!=b2) )
  union all
  ( (select a3,b3 from t3 where a3=b3) union
    (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It also can be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
----------------------------------
When the parser stage is over the further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit in the method SELECT_LEX_UNIT::prepare.
The method first validates each of the select constructs of the unit and then it
checks that all selects are compatible. The method checks that the selects return
the same number of columns and for each set of columns with the same number k
there is a type to which the types of the columns can be coerced. This type is
considered as the type of column k of the result set returned by the union unit.
For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively then the second column of the union unit will be
of the type double. If the types of the columns c1,c2,c3 are specified as
varchar(10), varchar(20), varchar(10) then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations
then a collation from which all these collations can be derived is looked for
and it is assigned as the collation of the third column in the result set.
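The two type examples above can be sketched in C++ as follows. This is a
deliberately small model and not the server's type aggregation code: the names
BaseType, ColumnType, numeric_rank and aggregate_column_type are hypothetical,
only the types mentioned in the examples are covered, and mixing numeric and
string columns as well as collations are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified column type.
enum class BaseType { Int, BigInt, Double, Varchar };

struct ColumnType {
  BaseType base;
  std::size_t length;  // used only for Varchar in this sketch
};

// Order used only by this sketch: Int < BigInt < Double.
static int numeric_rank(BaseType t) {
  switch (t) {
    case BaseType::Int: return 0;
    case BaseType::BigInt: return 1;
    case BaseType::Double: return 2;
    default: return -1;  // not a numeric type
  }
}

// Computes the type of result column k from the types of column k in each
// select. Mixing numeric and string columns is not modelled here.
ColumnType aggregate_column_type(const std::vector<ColumnType> &types) {
  assert(!types.empty());
  ColumnType result = types.front();
  for (const ColumnType &t : types) {
    const bool both_varchar =
        result.base == BaseType::Varchar && t.base == BaseType::Varchar;
    const bool both_numeric =
        numeric_rank(result.base) >= 0 && numeric_rank(t.base) >= 0;
    assert(both_varchar || both_numeric);
    if (both_varchar)
      result.length = std::max(result.length, t.length);  // widest varchar
    else if (numeric_rank(t.base) > numeric_rank(result.base))
      result.base = t.base;  // widest numeric type
  }
  return result;
}
```

With int, bigint and double as input the sketch yields double, and with
varchar(10), varchar(20), varchar(10) it yields varchar(20), matching the
examples for query (1).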
After compatibility of the corresponding select columns has been checked and the
types of the columns of the result set have been determined the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently rows returned by the selects from the
union unit are always written into a temporary table. To force selects to send
rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize(), then it's executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added, for UNION ALL operations all
records are added. This is achieved by enabling and disabling usage of the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For a row it receives from the currently executed select the method first stores
the fields of the row in the fields of the record buffer of the temporary
table. To do this the method calls the function fill_record. All needed type
conversions of the field values are performed when they are stored in the record
buffer. After this the method select_union::send_data calls the ha_write_row
handler function to write the record from the buffer to the temporary table. A
possible error on duplicate key that occurs with an attempt to write a duplicate
row is ignored.
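The interplay between the unique index and the ignored duplicate-key error can
be illustrated with a small C++ model. It is a sketch under assumptions, not the
server code: TempTable, send_data and exec_union are hypothetical stand-ins, rows
are plain vectors of strings, and type conversion is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for the temporary table and its unique index.
class TempTable {
 public:
  enum WriteResult { kOk, kDuplicateKey };

  void enable_unique_index(bool on) { unique_enabled_ = on; }

  // Plays the role of ha_write_row: a duplicate row is rejected only while
  // the unique index is enabled.
  WriteResult write_row(const Row &row) {
    if (unique_enabled_ && index_.count(row) != 0) return kDuplicateKey;
    index_.insert(row);
    rows_.push_back(row);
    return kOk;
  }

  const std::vector<Row> &rows() const { return rows_; }

 private:
  bool unique_enabled_ = false;
  std::set<Row> index_;
  std::vector<Row> rows_;
};

// Plays the role of select_union::send_data: the duplicate-key error is
// deliberately ignored.
void send_data(TempTable &table, const Row &row) { (void)table.write_row(row); }

// op_is_distinct[i] is true for UNION and false for UNION ALL between
// select i and select i+1.
std::vector<Row> exec_union(const std::vector<std::vector<Row>> &selects,
                            const std::vector<bool> &op_is_distinct) {
  assert(!selects.empty() && op_is_distinct.size() + 1 == selects.size());
  int last_distinct = -1;
  for (std::size_t i = 0; i < op_is_distinct.size(); ++i)
    if (op_is_distinct[i]) last_distinct = static_cast<int>(i);

  TempTable table;
  table.enable_unique_index(last_distinct >= 0);
  for (std::size_t i = 0; i < selects.size(); ++i) {
    for (const Row &row : selects[i]) send_data(table, row);
    // Select i is the second operand of operation i-1, so the index is
    // switched off once the last UNION operation has been processed.
    if (static_cast<int>(i) == last_distinct + 1)
      table.enable_unique_index(false);
  }
  return table.rows();
}
```

In this model three selects returning {a, a}, {a, b} and {b} give two rows for
the operation pattern of query (4), three rows for that of query (3), and five
rows when only UNION ALL is used.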
After all rows received from all selects have been placed into the temporary
table the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit then the
rows read from the temporary table have to be sorted first.
2. Optimizations improving performance of UNION ALL operations
=================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
------------------------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words it's not used as a subquery) and is not appended with an
ORDER BY clause then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done
the row fields could be sent directly into the output stream. It would improve
the performance of UNION ALL operations since writing to the temporary table and
reading from it would not be needed anymore. In the cases when the result set is
big enough and the temporary table cannot be allocated in the main memory the
performance gains would be significant. Besides, the client could get the first
result rows at once as it would not have to wait until all selects have been
executed.
To prevent a UNION ALL operation from sending rows to a temporary table we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union derived from the
select_result_interceptor class. The new interceptor object of the class that
we'll call select_union_send (by analogy with the class select_send) shall
inherit from the select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be re-defined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then shall build internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
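The intended behaviour of the three methods can be sketched outside the server
as follows. Everything here is an assumption made for illustration:
OutputStream stands in for the client connection, UnionSendSketch is not the
proposed class itself, the counter of remaining selects is one possible way to
detect the last select, and the methods return false on success on the
assumption that this matches the convention of the existing interceptors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for the client connection (the output stream).
struct OutputStream {
  std::vector<std::string> header;
  std::vector<Row> rows;
  bool eof = false;
};

// Sketch of an interceptor shared by all selects of a UNION ALL unit.
class UnionSendSketch {
 public:
  UnionSendSketch(OutputStream *out, std::size_t select_count)
      : out_(out), remaining_selects_(select_count) {
    assert(out != nullptr && select_count > 0);
  }

  // Sends the row format once, before the first select produces rows.
  bool send_fields(const std::vector<std::string> &names) {
    if (!fields_sent_) {
      out_->header = names;
      fields_sent_ = true;
    }
    return false;  // false means success in this sketch
  }

  // Stores the row in the record buffer (where type conversions would
  // happen) and forwards it to the output stream immediately.
  bool send_data(const Row &row) {
    record_buffer_ = row;
    out_->rows.push_back(record_buffer_);
    return false;
  }

  // Called once per select; only the call for the last select signals the
  // end of the rows to the client.
  bool send_eof() {
    assert(remaining_selects_ > 0);
    if (--remaining_selects_ == 0) out_->eof = true;
    return false;
  }

 private:
  OutputStream *out_;
  std::size_t remaining_selects_;
  bool fields_sent_ = false;
  Row record_buffer_;
};
```

In this model the client sees a row as soon as the select that produced it
calls send_data, and the end of the result set is signalled only after the last
select has finished.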
2.2. Avoiding unnecessary copying
------------------------------------------
If a field does not need type conversion it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data what fields should be sent to
the record buffer for type conversion and what can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that would take as parameter the info that says what fields are to be
stored in the record buffer.
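A minimal C++ sketch of this idea is given below. The names ConversionMap,
convert_via_record_buffer and send_row are hypothetical, the conversion is only
a placeholder that tags the value, and the counter exists solely to show how
many fields went through the record buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical per-select descriptor: needs_conversion[k] is true if column k
// of this select must pass through the record buffer.
struct ConversionMap {
  std::vector<bool> needs_conversion;
};

// Placeholder for a real type conversion; it only tags the value.
std::string convert_via_record_buffer(const std::string &value) {
  return "conv(" + value + ")";
}

// Variant of the fill_record idea that touches only the flagged fields.
// Unflagged fields are passed to the output row unchanged.
Row send_row(const Row &row, const ConversionMap &map,
             std::size_t *buffered_fields) {
  assert(row.size() == map.needs_conversion.size());
  Row out;
  out.reserve(row.size());
  for (std::size_t k = 0; k < row.size(); ++k) {
    if (map.needs_conversion[k]) {
      out.push_back(convert_via_record_buffer(row[k]));
      ++*buffered_fields;
    } else {
      out.push_back(row[k]);  // sent directly, no copy into the buffer
    }
  }
  return out;
}
```

Each select would carry its own ConversionMap, since the columns that need
conversion depend on how the types of that select differ from the types of the
result set.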
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
----------------------------------------------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo a serious rework.
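The choice of interceptor for each select can be sketched as a small planning
function. The enum Sink and the function plan_sinks are hypothetical names used
only for this illustration; the sketch decides where the rows of each select go
and says nothing about how SELECT_LEX_UNIT::exec would be restructured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical labels: TempTable corresponds to the select_union interceptor,
// Direct to the proposed select_union_send interceptor.
enum class Sink { TempTable, Direct };

// op_is_distinct[i] is true for UNION and false for UNION ALL between
// select i and select i+1. Returns one sink per select.
std::vector<Sink> plan_sinks(const std::vector<bool> &op_is_distinct,
                             bool top_level, bool has_order_by) {
  const std::size_t select_count = op_is_distinct.size() + 1;
  std::vector<Sink> plan(select_count, Sink::TempTable);
  if (!top_level || has_order_by) return plan;  // optimization not applicable

  // With no UNION at all every select can stream (the case of section 2.1).
  std::size_t first_direct = 0;
  for (std::size_t i = 0; i < op_is_distinct.size(); ++i)
    if (op_is_distinct[i]) first_direct = i + 2;  // skip both operands of op i
  assert(first_direct <= select_count);

  for (std::size_t i = first_direct; i < select_count; ++i)
    plan[i] = Sink::Direct;
  return plan;
}
```

For the operation pattern of query (3) the first two selects go to the
temporary table and the third is sent directly; for the pattern of query (4)
all three selects go to the temporary table.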
3. Other possible optimizations for union units
=================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without ORDER BY clause send
any row received from an operand of a UNION operation directly to the output
stream as soon as it has been checked by a lookup in the temporary table that
it's not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+replaced with UNION if another UNION occurs later in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called a 'union
+unit' if it is not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validates the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all selects are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec(). The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UNION ALL operations occur in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored in the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make a UNION ALL operation not send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select, while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_union_send::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't. Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in a more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operation could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+of this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Do not use a temporary table for any union unit used in an EXISTS or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. For a UNION ALL operation that
is used at the top level of a query without an ORDER BY clause this is not
necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
  1.1. Specifics of MySQL union operations
  1.2 Validation of union units
  1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
  2.1 Execution of UNION ALL without temporary table
  2.2. Avoiding unnecessary copying
  2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
==================================
1.1. Specifics of MySQL union operations
------------------------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows these operations to be used in a sequence, one after another. For example,
the following queries are accepted by MySQL Server:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);            (1)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);            (2)
Any mix of UNION and UNION ALL is also acceptable:
  (select a1,b1,c1 from t1 where a1=b1) union
  (select a2,b2,c2 from t2 where a2!=b2) union all
  (select a3,b3,c3 from t3 where a3>b3);            (3)
  (select a1,b1,c1 from t1 where a1=b1) union all
  (select a2,b2,c2 from t2 where a2!=b2) union
  (select a3,b3,c3 from t3 where a3>b3);            (4)
Note that query (4) is equivalent to query (1), whereas query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
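The substitution rule can be illustrated with a minimal sketch. The enum
UnionOp and the function normalize_union_ops are hypothetical names used only
for this illustration; they are not part of the server code. The operator list
has one entry per UNION/UNION ALL between consecutive selects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the operators between consecutive selects.
enum UnionOp { OP_UNION, OP_UNION_ALL };

// Rewrites every UNION ALL that is followed by a later UNION into UNION.
// Operators after the last UNION are left unchanged.
std::vector<UnionOp> normalize_union_ops(std::vector<UnionOp> ops) {
  std::size_t last_union = ops.size();  // marker for "no UNION found"
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == OP_UNION) last_union = i;
  if (last_union == ops.size()) return ops;  // only UNION ALL: nothing to do
  for (std::size_t i = 0; i < last_union; ++i) ops[i] = OP_UNION;
  return ops;
}
```

Under this model the operator list of query (4) normalizes to that of query
(1), while the lists of queries (2) and (3) stay as they are.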
MySQL does not accept nested unions. For example, the following otherwise valid
query is rejected by MySQL Server as erroneous:
  ( (select a1,b1 from t1 where a1=b1) union
    (select a2,b2 from t2 where a2!=b2) )
  union all
  ( (select a3,b3 from t3 where a3=b3) union
    (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
----------------------------------
When the parser stage is over, further processing of a union unit is performed
by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. This method validates each of the select constructs of
the unit and then checks that all selects are compatible: the selects must
return the same number of columns, and for each set of columns at the same
position k there must be a type to which the types of these columns can be
coerced. This type is taken as the type of column k of the result set returned
by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the third column of the
result set will be varchar(20). If these columns have different collations, then
a collation from which all these collations can be derived is looked for and
assigned as the collation of the third column in the result set.
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set of the union unit. Currently rows returned by the selects of the
union unit are always written into a temporary table. To force the selects to
send rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects
for the selects such that the JOIN::result field refers to an object of the
class select_union. All selects of a union unit share the same select_union
object.
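The choice of a common column type can be sketched as follows. This is a toy
model that covers only the two cases from the example above (int/bigint/double
widening and varchar lengths); the names Kind, ColType, unify and
result_column_type are hypothetical, and the real coercion rules of the server
are considerably richer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum Kind { KIND_INT, KIND_BIGINT, KIND_DOUBLE, KIND_VARCHAR };

struct ColType {
  Kind kind;
  unsigned length;  // used for KIND_VARCHAR only, 0 otherwise
};

// Rank of a numeric kind (wider type = larger rank), or -1 if not numeric.
static int numeric_rank(Kind k) {
  switch (k) {
    case KIND_INT: return 0;
    case KIND_BIGINT: return 1;
    case KIND_DOUBLE: return 2;
    default: return -1;
  }
}

// Merges two column types into *out. Returns false when this sketch has no
// rule for the pair (numeric mixed with varchar is deliberately not modelled).
static bool unify(const ColType& a, const ColType& b, ColType* out) {
  if (a.kind == KIND_VARCHAR && b.kind == KIND_VARCHAR) {
    out->kind = KIND_VARCHAR;
    out->length = std::max(a.length, b.length);
    return true;
  }
  if (numeric_rank(a.kind) < 0 || numeric_rank(b.kind) < 0) return false;
  *out = numeric_rank(a.kind) >= numeric_rank(b.kind) ? a : b;
  return true;
}

// column_k holds the type of column k in each select of the unit.
bool result_column_type(const std::vector<ColType>& column_k, ColType* out) {
  if (column_k.empty()) return false;
  ColType acc = column_k[0];
  for (std::size_t i = 1; i < column_k.size(); ++i) {
    ColType merged;
    if (!unify(acc, column_k[i], &merged)) return false;
    acc = merged;
  }
  *out = acc;
  return true;
}
```

Folding the types pairwise keeps the sketch independent of the number of
selects in the unit.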
1.3 Execution of union units
----------------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
rows are added. This is achieved by enabling and disabling the unique index
defined on all fields of the temporary table. The index is never used if only
UNION ALL operations occur in the unit. Otherwise it is enabled before the
first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After
this, select_union::send_data calls the ha_write_row handler function to write
the record from the buffer to the temporary table. A duplicate-key error raised
by an attempt to write a duplicate row is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads the rows
from the temporary table and sends them to the output stream (to the client). If
an ORDER BY clause is to be applied to the result of the union unit, the rows
read from the temporary table have to be sorted first.
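The accumulation scheme described in this section can be modelled with a small
sketch. Everything here is hypothetical and simplified: Row is a vector of
ints, TempTableSketch stands in for the temporary table, and a std::set plays
the role of the unique index. It only illustrates when the index is enabled and
that a duplicate-key failure is silently ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

typedef std::vector<int> Row;  // hypothetical row: one int per column

// Toy stand-in for the temporary table. The std::set plays the role of the
// unique index defined on all fields.
class TempTableSketch {
 public:
  TempTableSketch() : unique_enabled_(false) {}
  void set_unique_index(bool enabled) { unique_enabled_ = enabled; }
  // Analogue of the write done by send_data: a duplicate-key failure is
  // ignored rather than reported.
  void write_row(const Row& row) {
    if (unique_enabled_ && !index_.insert(row).second) return;
    rows_.push_back(row);
  }
  const std::vector<Row>& rows() const { return rows_; }

 private:
  bool unique_enabled_;
  std::set<Row> index_;
  std::vector<Row> rows_;
};

// is_distinct[i] is true when the operator between select i and select i+1
// is UNION, false when it is UNION ALL.
std::vector<Row> run_union_unit(const std::vector<std::vector<Row> >& selects,
                                const std::vector<bool>& is_distinct) {
  std::size_t selects_with_index = 0;  // selects executed with index enabled
  for (std::size_t i = 0; i < is_distinct.size(); ++i)
    if (is_distinct[i]) selects_with_index = i + 2;  // both operands of op i

  TempTableSketch tmp;
  tmp.set_unique_index(selects_with_index > 0);
  for (std::size_t s = 0; s < selects.size(); ++s) {
    if (s == selects_with_index) tmp.set_unique_index(false);
    for (std::size_t r = 0; r < selects[s].size(); ++r)
      tmp.write_row(selects[s][r]);
  }
  return tmp.rows();
}
```

With three selects that each return the same single row, the operator pattern
of query (3) yields two rows, that of query (4) one row, and that of query (2)
three rows.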
2. Optimizations improving performance of UNION ALL operations
=================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
------------------------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from the
selects to a temporary table at all. After all needed type conversions have been
done, the row fields could be sent directly to the output stream. This would
improve the performance of UNION ALL operations, since writing to the temporary
table and reading from it would no longer be needed. When the result set is so
large that the temporary table cannot be held in main memory, the performance
gains would be significant. Besides, the client could get the first result rows
immediately, as it would not have to wait until all selects have been executed.
To make a UNION ALL operation not send rows to a temporary table, we could
provide the JOIN objects created for the selects of the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from the selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be redefined
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
  class select_union_send :public select_union
  {
    ... // private structures
  public:
    select_union_send() :select_union(), ...{...}
    bool send_data(List<Item> &items);
    bool send_fields(List<Item> &list, uint flags);
    bool send_eof();  // signals the end of rows after the last select
    bool create_result_table(THD *thd, List<Item> *column_types,
                             bool is_distinct, ulonglong options,
                             const char *alias);
  };
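The intended behaviour of the three methods can be sketched in a self-contained
form. This is not the server code: ClientStreamSketch and UnionSendSketch are
hypothetical names, rows are plain strings, the record-buffer conversion step
is omitted, and the convention that false means success is an assumption of
this sketch. It also assumes that the shared object is told the number of
selects up front, which is one possible way to recognize the last send_eof.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the client connection: records what was sent.
struct ClientStreamSketch {
  std::vector<std::string> events;
};

// Simplified analogue of the proposed select_union_send. Return values follow
// the convention assumed here that false means success.
class UnionSendSketch {
 public:
  UnionSendSketch(ClientStreamSketch* out, int select_count)
      : out_(out), selects_left_(select_count), format_sent_(false) {}

  // Called once per select; only the first call reaches the client.
  bool send_fields(const std::string& row_format) {
    if (!format_sent_) {
      out_->events.push_back("FIELDS " + row_format);
      format_sent_ = true;
    }
    return false;
  }

  // Every row is forwarded immediately; nothing is stored.
  bool send_data(const std::string& row) {
    out_->events.push_back("ROW " + row);
    return false;
  }

  // Called once per select; only the last call signals end-of-rows.
  bool send_eof() {
    if (--selects_left_ == 0) out_->events.push_back("EOF");
    return false;
  }

 private:
  ClientStreamSketch* out_;
  int selects_left_;
  bool format_sent_;
};
```

Because one object is shared by all selects of the unit, the row format and
the end-of-rows signal each reach the client exactly once, while rows arrive
as soon as each select produces them.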
2.2. Avoiding unnecessary copying
------------------------------------------
If a field does not need type conversion, it does not make sense to send it to a
record buffer: it can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select of the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects of the unit. The information in
this structure will tell select_union_send::send_data which fields should be
sent to the record buffer for type conversion and which can be sent directly to
the output stream. For this, another variant of the fill_record procedure is
needed that takes as a parameter the information about which fields are to be
stored in the record buffer.
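A per-select conversion bitmap could be built along the following lines. The
names ConversionMap, build_conversion_map and buffered_field_count are
hypothetical, and so is the criterion used: this sketch assumes that a column
needs conversion exactly when its type name in the select differs from the type
name chosen for the result set, which may be stricter than what the server
would actually require.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// needs_conversion[k] is true when column k of a given select has to pass
// through the record buffer before being sent.
typedef std::vector<bool> ConversionMap;

// Assumption of this sketch: a column needs conversion exactly when its type
// name in the select differs from the type name chosen for the result set.
ConversionMap build_conversion_map(const std::vector<std::string>& select_types,
                                   const std::vector<std::string>& result_types) {
  assert(select_types.size() == result_types.size());
  ConversionMap map(select_types.size(), false);
  for (std::size_t k = 0; k < select_types.size(); ++k)
    map[k] = (select_types[k] != result_types[k]);
  return map;
}

// Number of fields of one row that would still be copied to the record buffer.
std::size_t buffered_field_count(const ConversionMap& map) {
  std::size_t n = 0;
  for (std::size_t k = 0; k < map.size(); ++k)
    if (map[k]) ++n;
  return n;
}
```

A select whose columns already have the result types gets an all-false map, so
none of its fields would go through the record buffer.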
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
----------------------------------------------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed more efficiently than in the current
implementation. More precisely, the rows from any select that follows the
second operand of the last UNION operation could be sent directly to the output
stream. In this case two interceptor objects have to be created: one, of the
type select_union, is shared by the selects for which UNION operations are
performed; the other, of the type select_union_send, is shared by the remaining
selects. For this optimization the method SELECT_LEX_UNIT::exec would have to
undergo serious rework.
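The split of the selects between the two interceptors can be expressed as a
single index. The function first_direct_select below is a hypothetical helper,
not part of the server code; selects are numbered from 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// is_distinct[i] is true when the operator between select i and select i+1
// is UNION, false when it is UNION ALL. A unit with n selects has n-1 entries.
// Returns the index of the first select whose rows could go to the
// select_union_send-style interceptor; a value equal to the number of selects
// means that no select qualifies.
std::size_t first_direct_select(const std::vector<bool>& is_distinct) {
  std::size_t first = 0;  // no UNION at all: every select can stream
  for (std::size_t i = 0; i < is_distinct.size(); ++i)
    if (is_distinct[i]) first = i + 2;  // skip both operands of this UNION
  return first;
}
```

For a unit of the form s0 UNION s1 UNION ALL s2 UNION ALL s3 the function
returns 2, so s0 and s1 would share the select_union object while s2 and s3
would share the select_union_send object.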
3. Other possible optimizations for union units
=================================
The following optimizations are not planned to be implemented within the
framework of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Fri, 14 Aug 2009, 08:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.44.old.22182 2009-08-14 08:41:17.000000000 +0300
+++ /tmp/wklog.44.new.22182 2009-08-14 08:41:17.000000000 +0300
@@ -1 +1,205 @@
+<contents>
+1. Handling union operations in MySQL Server
+ 1.1. Specifics of MySQL union operations
+ 1.2 Validation of union units
+ 1.3 Execution of union units
+2. Optimizations improving performance of UNION ALL operations
+ 2.1 Execution of UNION ALL without temporary table
+ 2.2. Avoiding unnecessary copying
+ 2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+3. Other possible optimizations for union units
+</contents>
+
+1. Handling union operations in MySQL Server
+==================================
+
+1.1. Specifics of MySQL union operations
+------------------------------------------------------
+
+UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
+allows us to use these operations in a sequence, one after another. For example
+the following queries are accepted by the MySQL Server:
+ (select a1,b1,c1 from t1 where a1=b1) union (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (1)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (2)
+Any mix of UNION and UNION ALL is also acceptable:
+ (select a1,b1,c3 from t1 where a1=b1) union (select a2,b2,c3 from t2 where
+a2!=b2) union all
+ (select a3,b3,c3 from t3 where a3>b3); (3)
+ (select a1,b1,c1 from t1 where a1=b1) union all (select a2,b2,c2 from t2 where
+a2!=b2) union
+ (select a3,b3,c3 from t3 where a3>b3); (4)
+It should be noted that query (4) is equivalent to query (1). At the same time
+query (3) is not equivalent to any of the queries (1),(2),(4).
+In general any UNION ALL in a sequence of union operations can be equivalently
+substituted for UNION if there occur another UNION further in the sequence.
+MySQL does not accept nested unions. For example the following valid query is
+considered by MySQL Server as erroneous:
+ ( (select a1,b1 from t1 where a1=b1) union (select a2,b2 from t2 where a2!=b2)
+) union all
+ ( (select a3,b3 from t3 where a3=b3) union (select a4,b4 from t4 where a4!=b4) )
+
+A sequence of select constructs separated by UNION/UNION ALL is called 'union
+unit' if it s not a part of another such sequence.
+A union unit can be executed as a query. It also can be used as a subquery.
+A union unit can be optionally appended by an ORDER BY and/or LIMIT construct.
+In this case it cannot be used as a subquery.
+
+1.2 Validation of union units
+----------------------------------
+
+When the parser stage is over the further processing of a union unit is
+performed by the function mysql_union.
+The function first validate the unit in the method SELECT_LEX_UNIT::prepare.
+The method first validates each of the select constructs of the unit and then it
+checks that all select are compatible. The method checks that the selects return
+the same number of columns and for each set of columns with the same number k
+there is a type to which the types of the columns can be coerced. This type is
+considered as the type of column k of the result set returned by the union unit.
+For example, if in the query (1) the columns b1, b2 and b3 are of the types int,
+bigint and double respectively then the second column of the union unit will be
+of the type double. If the types of the columns c1,c2,c3 are specified as
+varchar(10), varchar(20), varchar(10) then the type of the corresponding column
+of the result set will be varchar(20). If the columns have different collations
+then a collation from which all these collations can be derived is looked for
+and it is assigned as the
+collation of the third column in the result set.
+After compatibility of the corresponding select columns has been checked and the
+types of the columns from of the result set have been determined the method
+SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
+result set for the union unit. Currently rows returned by the selects from the
+union unit are always written into a temporary table. To force selects to send
+rows to this temporary table SELECT_LEX_UNIT::prepare creates JOIN objects for
+the selects such that the JOIN::result field refers to an object of the class
+select_union. All selects from a union unit share the same select_union object.
+
+1.3 Execution of union units
+----------------------------------
+
+After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
+created a temporary table as a container for rows from the result sets returned
+by the selects of the unit, and has prepared all data structures needed for
+execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
+The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
+by one.
+Each select first is optimized with JOIN::optimize(), then it's executed with
+JOIN::exec().The result rows from each select are sent to a temporary table.
+This table accumulates all rows that are to be returned by the union unit. For
+UNION operations duplicate rows are not added, for UNION ALL operations all
+records are added. It is achieved by enabling and disabling usage of the unique
+index defined on all fields of the temporary table. The index is never used if
+only UINION ALL operation occurs in the unit. Otherwise it is enabled before
+the first select is executed and disabled after the last UNION operation.
+To send rows to the temporary table the method select_union::send_data is used.
+For a row it receives from the currently executed select the method first stores
+the fields of the row in in the fields of the record buffer of the temporary
+table. To do this the method calls function fill_record. All needed type
+conversions of the field values are performed when they are stored the record
+buffer. After this the method select_union::send_data calls the ha_write_row
+handler function to write the record from the buffer to the temporary table. A
+possible error on duplicate key that occurs with an attempt to write a duplicate
+row is ignored.
+After all rows received from all selects have been placed into the temporary
+table the method SELECT_LEX_UNIT::exec calls mysql_select that reads rows
+from the temporary table and sends them to the output stream (to the client). If
+there is an ORDER BY clause to be applied to result of the union unit then the
+rows read from the temporary table have to be sorted first.
+
+2. Optimizations improving performance of UNION ALL operations
+=================================================
+
+The following three optimizations are proposed to be implemented in the
+framework of this task.
+
+2.1 Execution of UNION ALL without temporary table
+------------------------------------------------------------------
+
+If a union unit with only UNION ALL operations is used at the top level of the
+query (in other words it's not used as a subquery) and is not appended with an
+ORDER BY clause then it does not make sense to send rows received from selects
+to a temporary table at all. After all needed type conversions have been done
+the row fields could be sent directly into the output stream. It would improve
+the performance of UNION ALL operations since writing to the temporary table and
+reading from it would not be needed anymore. In the cases when the result set is
+big enough and the temporary table cannot be allocated in the main memory the
+performance gains would be significant. Besides, the client could get the first
+result rows at once as it would not have to wait until all selects have been
+executed.
+To make an UNION ALL operation not to send rows to a temporary table we could
+provide the JOIN objects created for the selects from the union unit with an
+interceptor object that differs from the one they use now. In the current code
+they use an object of the class select_union derived from the
+select_result_interceptor class. The new interceptor object of the class that
+we'll call select_union_send (by analogy with the class select_send) shall
+inherit from the select_union and shall have its own implementations of the
+virtual methods send_data, send_fields, and send_eof.
+The method send_data shall send fields received from selects to the record
+buffer of the temporary table and then from this buffer to the output stream.
+The method send_fields shall send the format of the rows to the client before it
+starts getting records from the first select , while the method send_eof shall
+signal about the end of the rows after the last select finishes sending records.
+The method create_result_table of the class select_union shall be re-defined
+as virtual. The implementation of this method for the class select_union_send
+shall call select_union::create_result_table and then shall build internal
+structures needed for select_unionsend::send_data. So, the definition of the
+class select_union_send should look like this:
+ class select_union_send :public select_union
+ {
+ ... // private structures
+ public:
+ select_union_send() :select_union(), ...{...}
+ bool send_data(List<Item> &items);
+ bool send_fields(List<Item> &list, uint flags);
+ bool create_result_table(THD *thd, List<Item> *column_types,
+ bool is_distinct, ulonglong options,
+ const char *alias);
+ };
+
+2.2. Avoiding unnecessary copying
+------------------------------------------
+
+If a field does not need type conversion it does not make sense to send it to a
+record buffer. It can be sent directly to the output stream. Different selects
+can require type conversions for different columns.
+Let's provide each select from the union unit with a data structure (e.g. a
+bitmap) that says what fields require conversions, and what don't . Before
+execution of a select this data structure must be passed to the
+select_union_send object shared by all selects from the unit. The info in this
+structure will tell select_union_send::send_data what fields should be sent to
+the record buffer for type conversion and what can be sent directly to the
+output stream. In this case another variant of the fill_record procedure is
+needed that would take as parameter the info that says what fields are to be
+stored in the record buffer.
+
+2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
+----------------------------------------------------------------------------------------------------------
+
+If a union unit with a mix of UNIIN/UNION ALL operations and without ORDER BY is
+used at the top level of a query then any UNION ALL operation after the last
+UNION operation can be executed in more efficient way than it's done in the
+current implementation. More exactly, the rows from any select that follows
+after the second operand of the last UNION operations could be sent directly to
+the output stream. In this case two interceptor objects have to be created: one,
+of the type select_union, is shared by the selects for which UNION operations
+are performed, another, of the type select_union_send, is shared by the the
+remaining selects. For this optimization the method SELECT_LEX_UNIT::exec is to
+undergo a serious re-work.
+
+
+3. Other possible optimizations for union units
+=================================
+
+The following optimizations are not supposed to be implemented in the framework
+of this task.
+1. For a union unit containing only UNION ALL with an ORDER BY send rows from
+selects directly to the sorting procedure.
+2. For a union unit at the top level of the query without ORDER BY clause send
+any row received from an operand of a UNION operation directly to the output
+stream as soon as it has been checked by a lookup in the temporary table that
+it's not a duplicate.
+3. Do not use a temporary table for any union unit used in an EXISTS or IN subquery.
+
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. However, for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause
this is not necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
HIGH-LEVEL SPECIFICATION:
<contents>
1. Handling union operations in MySQL Server
1.1. Specifics of MySQL union operations
1.2 Validation of union units
1.3 Execution of union units
2. Optimizations improving performance of UNION ALL operations
2.1 Execution of UNION ALL without temporary table
2.2. Avoiding unnecessary copying
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
3. Other possible optimizations for union units
</contents>
1. Handling union operations in MySQL Server
==================================
1.1. Specifics of MySQL union operations
------------------------------------------------------
UNION and UNION ALL are the only set operations supported by MySQL Server. MySQL
allows these operations to be used in a sequence, one after another. For
example, the following queries are accepted by MySQL Server:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (1)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (2)
Any mix of UNION and UNION ALL is also acceptable:
(select a1,b1,c1 from t1 where a1=b1) union
(select a2,b2,c2 from t2 where a2!=b2) union all
(select a3,b3,c3 from t3 where a3>b3); (3)
(select a1,b1,c1 from t1 where a1=b1) union all
(select a2,b2,c2 from t2 where a2!=b2) union
(select a3,b3,c3 from t3 where a3>b3); (4)
Note that query (4) is equivalent to query (1), whereas query (3) is not
equivalent to any of the queries (1), (2), (4).
In general, any UNION ALL in a sequence of union operations can be replaced by
UNION without changing the result if another UNION occurs later in the sequence.
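The substitution rule above can be illustrated with a small stand-alone C++
sketch. The names UnionOp, acts_as_union and normalize_union_ops are
hypothetical and are not taken from the server source; the sketch only models
the sequence of operators, not the selects themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: ops[i] is the operator between select i and select i+1.
enum class UnionOp { Union, UnionAll };

// True if operator i effectively eliminates duplicates: it is a UNION, or a
// UNION ALL that is followed by a UNION later in the sequence.
bool acts_as_union(const std::vector<UnionOp> &ops, std::size_t i)
{
  assert(i < ops.size());
  for (std::size_t j = i; j < ops.size(); ++j)
    if (ops[j] == UnionOp::Union)
      return true;
  return false;
}

// Rewrites the sequence into its equivalent form in which every UNION ALL
// that precedes a UNION has been replaced by UNION.
std::vector<UnionOp> normalize_union_ops(std::vector<UnionOp> ops)
{
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (acts_as_union(ops, i))
      ops[i] = UnionOp::Union;
  return ops;
}
```

Under this model the operators of query (4), {UnionAll, Union}, normalize to
those of query (1), {Union, Union}, while the operators of query (3),
{Union, UnionAll}, stay unchanged.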
MySQL does not accept nested unions. For example, the following valid query is
rejected by MySQL Server as erroneous:
( (select a1,b1 from t1 where a1=b1) union
  (select a2,b2 from t2 where a2!=b2) ) union all
( (select a3,b3 from t3 where a3=b3) union
  (select a4,b4 from t4 where a4!=b4) )
A sequence of select constructs separated by UNION/UNION ALL is called a 'union
unit' if it is not a part of another such sequence.
A union unit can be executed as a query. It can also be used as a subquery.
A union unit can optionally be followed by an ORDER BY and/or LIMIT construct.
In this case it cannot be used as a subquery.
1.2 Validation of union units
----------------------------------
When the parser stage is over, further processing of a union unit is
performed by the function mysql_union.
The function first validates the unit by calling the method
SELECT_LEX_UNIT::prepare. The method validates each of the select constructs of
the unit and then checks that all selects are compatible: they must return
the same number of columns, and for each column position k there must be a type
to which the types of the k-th columns of all selects can be coerced. This type
is taken as the type of column k of the result set returned by the union unit.
For example, if in query (1) the columns b1, b2 and b3 are of the types int,
bigint and double respectively, then the second column of the union unit will be
of the type double. If the types of the columns c1, c2, c3 are specified as
varchar(10), varchar(20), varchar(10), then the type of the corresponding column
of the result set will be varchar(20). If the columns have different collations,
then a collation from which all these collations can be derived is looked for
and is assigned as the collation of the third column in the result set.
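The two type examples above can be reproduced with a deliberately simplified
C++ sketch. It assumes that the numeric types are ordered from narrow to wide
and that the common varchar length is the maximum of the lengths; the names
NumType, common_numeric_type and common_varchar_length are hypothetical, and
the real coercion rules of the server cover many more cases (including
collations) than this sketch does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption of this sketch: enumerators are listed from narrowest to widest.
enum class NumType { Int = 0, BigInt = 1, Double = 2 };

// The common type of one column position is the widest type among the selects.
NumType common_numeric_type(const std::vector<NumType> &types)
{
  assert(!types.empty());
  return *std::max_element(types.begin(), types.end());
}

// For varchar columns the common length is the largest declared length.
unsigned common_varchar_length(const std::vector<unsigned> &lengths)
{
  assert(!lengths.empty());
  return *std::max_element(lengths.begin(), lengths.end());
}
```

With the inputs from the text, {Int, BigInt, Double} yields Double and the
lengths {10, 20, 10} yield 20.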
After the compatibility of the corresponding select columns has been checked and
the types of the columns of the result set have been determined, the method
SELECT_LEX_UNIT::prepare creates a temporary table to store the rows of the
result set for the union unit. Currently, rows returned by the selects from the
union unit are always written into a temporary table. To force the selects to
send rows to this temporary table, SELECT_LEX_UNIT::prepare creates JOIN objects
for the selects such that the JOIN::result field refers to an object of the class
select_union. All selects from a union unit share the same select_union object.
1.3 Execution of union units
----------------------------------
After SELECT_LEX_UNIT::prepare has successfully validated the union unit, has
created a temporary table as a container for rows from the result sets returned
by the selects of the unit, and has prepared all data structures needed for
execution, the function mysql_union invokes SELECT_LEX_UNIT::exec.
The method SELECT_LEX_UNIT::exec processes the selects from the union unit one
by one.
Each select is first optimized with JOIN::optimize() and then executed with
JOIN::exec(). The result rows from each select are sent to a temporary table.
This table accumulates all rows that are to be returned by the union unit. For
UNION operations duplicate rows are not added; for UNION ALL operations all
records are added. This is achieved by enabling and disabling the unique
index defined on all fields of the temporary table. The index is never used if
only UNION ALL operations occur in the unit. Otherwise it is enabled before
the first select is executed and disabled after the last UNION operation.
To send rows to the temporary table the method select_union::send_data is used.
For each row it receives from the currently executed select, the method first
stores the fields of the row in the fields of the record buffer of the temporary
table by calling the function fill_record. All needed type conversions of the
field values are performed when they are stored in the record buffer. After this
the method select_union::send_data calls the ha_write_row handler function to
write the record from the buffer to the temporary table. A duplicate-key error
that may occur on an attempt to write a duplicate row is ignored.
After all rows received from all selects have been placed into the temporary
table, the method SELECT_LEX_UNIT::exec calls mysql_select, which reads rows
from the temporary table and sends them to the output stream (to the client). If
there is an ORDER BY clause to be applied to the result of the union unit, then
the rows read from the temporary table have to be sorted first.
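The execution scheme described in this section can be modelled by the following
stand-alone C++ sketch. It is a toy simulation, not server code: TempTable,
write_row and exec_union_unit are hypothetical names, rows are vectors of
strings, and a std::set plays the role of the unique index that is switched
off after the second operand of the last UNION has been processed.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class UnionOp { Union, UnionAll };
using Row = std::vector<std::string>;

// Toy stand-in for the temporary table with an optional unique index.
struct TempTable {
  std::vector<Row> rows;
  std::set<Row> unique_index;
  bool index_enabled = false;

  // Plays the role of ha_write_row: a duplicate-key error is silently ignored.
  void write_row(const Row &row) {
    if (index_enabled && !unique_index.insert(row).second)
      return;
    rows.push_back(row);
  }
};

// selects[i] holds the rows of select i; ops[i] joins select i and select i+1.
std::vector<Row> exec_union_unit(const std::vector<std::vector<Row>> &selects,
                                 const std::vector<UnionOp> &ops)
{
  assert(selects.empty() ? ops.empty() : ops.size() + 1 == selects.size());

  std::size_t last_union = ops.size();   // ops.size() means "no UNION at all"
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Union)
      last_union = i;
  const bool has_union = last_union != ops.size();

  TempTable tmp;
  tmp.index_enabled = has_union;         // enabled before the first select
  for (std::size_t i = 0; i < selects.size(); ++i) {
    // Select last_union + 1 is the second operand of the last UNION;
    // every select after it only appends rows.
    if (has_union && i == last_union + 2)
      tmp.index_enabled = false;
    for (const Row &row : selects[i])
      tmp.write_row(row);
  }
  return tmp.rows;                       // read back and sent to the client
}
```

In this model a unit of the form of query (3) keeps duplicates only for the
rows of the third select, and a unit with only UNION ALL never touches the
index.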
2. Optimizations improving performance of UNION ALL operations
=================================================
The following three optimizations are proposed to be implemented in the
framework of this task.
2.1 Execution of UNION ALL without temporary table
------------------------------------------------------------------
If a union unit with only UNION ALL operations is used at the top level of the
query (in other words, it is not used as a subquery) and is not followed by an
ORDER BY clause, then it does not make sense to send rows received from selects
to a temporary table at all. After all needed type conversions have been done,
the row fields could be sent directly into the output stream. This would improve
the performance of UNION ALL operations, since writing to the temporary table and
reading from it would no longer be needed. When the result set is so large that
the temporary table cannot be kept in main memory, the performance gains
would be significant. Besides, the client could get the first result rows
immediately, as it would not have to wait until all selects have been executed.
To keep a UNION ALL operation from sending rows to a temporary table, we could
provide the JOIN objects created for the selects from the union unit with an
interceptor object that differs from the one they use now. In the current code
they use an object of the class select_union, derived from the
select_result_interceptor class. The new interceptor object, of a class that
we'll call select_union_send (by analogy with the class select_send), shall
inherit from select_union and shall have its own implementations of the
virtual methods send_data, send_fields, and send_eof.
The method send_data shall send fields received from selects to the record
buffer of the temporary table and then from this buffer to the output stream.
The method send_fields shall send the format of the rows to the client before it
starts getting records from the first select, while the method send_eof shall
signal the end of the rows after the last select finishes sending records.
The method create_result_table of the class select_union shall be redeclared
as virtual. The implementation of this method for the class select_union_send
shall call select_union::create_result_table and then build the internal
structures needed for select_union_send::send_data. So, the definition of the
class select_union_send should look like this:
class select_union_send :public select_union
{
  ... // private structures
public:
  select_union_send() :select_union(), ...{...}
  bool send_data(List<Item> &items);
  bool send_fields(List<Item> &list, uint flags);
  bool send_eof(); // listed in the text above as one of the overridden methods
  bool create_result_table(THD *thd, List<Item> *column_types,
                           bool is_distinct, ulonglong options,
                           const char *alias);
};
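How swapping the shared interceptor changes the data flow can be shown with a
self-contained toy model. The classes below are not the server classes:
ToySelectUnion and ToySelectUnionSend are hypothetical stand-ins for
select_union and the proposed select_union_send, rows are vectors of strings,
a std::ostream stands in for the client connection, and the convention that
a false return value means success is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Toy counterpart of select_union: every row is buffered in a "temporary table".
class ToySelectUnion {
public:
  virtual ~ToySelectUnion() = default;
  virtual bool send_fields(const Row &) { return false; }
  virtual bool send_data(const Row &row) { table_.push_back(row); return false; }
  virtual bool send_eof() { return false; }
  const std::vector<Row> &table() const { return table_; }
private:
  std::vector<Row> table_;
};

// Toy counterpart of select_union_send: rows go straight to the output stream.
class ToySelectUnionSend : public ToySelectUnion {
public:
  explicit ToySelectUnionSend(std::ostream &out) : out_(out) {}
  bool send_fields(const Row &names) override {
    columns_ = names.size();             // row format is sent once, up front
    write(names);
    return false;
  }
  bool send_data(const Row &row) override {
    assert(row.size() == columns_);      // every select must match the format
    write(row);
    return false;
  }
  bool send_eof() override { out_ << "EOF\n"; return false; }
private:
  void write(const Row &row) {
    for (std::size_t i = 0; i < row.size(); ++i)
      out_ << (i ? "," : "") << row[i];
    out_ << '\n';
  }
  std::ostream &out_;
  std::size_t columns_ = 0;
};

// All selects of the unit share one interceptor, as they share JOIN::result.
bool run_union_all(const Row &names,
                   const std::vector<std::vector<Row>> &selects,
                   ToySelectUnion &result)
{
  if (result.send_fields(names)) return true;
  for (const auto &rows : selects)
    for (const Row &row : rows)
      if (result.send_data(row)) return true;
  return result.send_eof();              // signalled once, after the last select
}
```

Passing a ToySelectUnion to run_union_all buffers all rows, while passing a
ToySelectUnionSend constructed over, say, a std::ostringstream delivers each
row as soon as its select produces it and leaves the buffer empty.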
2.2. Avoiding unnecessary copying
------------------------------------------
If a field does not need type conversion, it does not make sense to send it to a
record buffer. It can be sent directly to the output stream. Different selects
can require type conversions for different columns.
Let's provide each select from the union unit with a data structure (e.g. a
bitmap) that says which fields require conversion and which do not. Before
execution of a select this data structure must be passed to the
select_union_send object shared by all selects from the unit. The info in this
structure will tell select_union_send::send_data which fields should be sent to
the record buffer for type conversion and which can be sent directly to the
output stream. In this case another variant of the fill_record procedure is
needed that takes as a parameter the info that says which fields are to be
stored in the record buffer.
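A minimal sketch of such a per-select bitmap, in stand-alone C++, is given
below. ColType, conversion_bitmap, Routing and route_fields are hypothetical
names. The sketch assumes that a column needs conversion exactly when its type
in the select differs from the type of the result column; it ignores lengths,
collations and other attributes that the server would also have to compare.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ColType { Int, BigInt, Double, Varchar };

// One bit per column of a select: true if the column type differs from the
// type of the corresponding result column and therefore needs conversion.
std::vector<bool> conversion_bitmap(const std::vector<ColType> &select_types,
                                    const std::vector<ColType> &result_types)
{
  assert(select_types.size() == result_types.size());
  std::vector<bool> bits(select_types.size());
  for (std::size_t i = 0; i < select_types.size(); ++i)
    bits[i] = select_types[i] != result_types[i];
  return bits;
}

// Column positions split by the path their fields take in send_data.
struct Routing {
  std::vector<std::size_t> via_buffer;   // stored in the record buffer first
  std::vector<std::size_t> direct;       // sent straight to the output stream
};

Routing route_fields(const std::vector<bool> &bitmap)
{
  Routing r;
  for (std::size_t i = 0; i < bitmap.size(); ++i)
    (bitmap[i] ? r.via_buffer : r.direct).push_back(i);
  return r;
}
```

For a select with columns (Int, Double, Varchar) and a result set with columns
(Double, Double, Varchar), only the first column is routed through the record
buffer in this model.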
2.3 Optimizing execution of a union unit with a mix of UNION/UNION ALL operations
----------------------------------------------------------------------------------------------------------
If a union unit with a mix of UNION/UNION ALL operations and without ORDER BY is
used at the top level of a query, then any UNION ALL operation after the last
UNION operation can be executed in a more efficient way than in the
current implementation. More exactly, the rows from any select that follows
the second operand of the last UNION operation could be sent directly to
the output stream. In this case two interceptor objects have to be created: one,
of the type select_union, is shared by the selects for which UNION operations
are performed; another, of the type select_union_send, is shared by the
remaining selects. For this optimization the method SELECT_LEX_UNIT::exec has to
undergo serious rework.
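Which selects would get which interceptor can be computed from the operator
sequence alone. The stand-alone C++ sketch below uses the hypothetical names
UnionOp and first_streamed_select; it only determines the split point and says
nothing about how SELECT_LEX_UNIT::exec would have to be reorganized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class UnionOp { Union, UnionAll };

// ops[i] joins select i and select i+1, so there are ops.size() + 1 selects.
// Returns the index of the first select whose rows could be sent directly to
// the output stream. Selects before that index share the select_union-style
// interceptor; a return value equal to the number of selects means that no
// select can be streamed.
std::size_t first_streamed_select(const std::vector<UnionOp> &ops)
{
  std::size_t first = 0;                 // no UNION at all: stream everything
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i] == UnionOp::Union)
      first = i + 2;                     // skip the second operand of this UNION
  assert(first <= ops.size() + 1);       // never beyond the number of selects
  return first;
}
```

For query (3) the result is 2, so only the third select is streamed; for
query (2) it is 0, which is the case of section 2.1; for queries (1) and (4)
it is 3, so every select still goes through the temporary table.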
3. Other possible optimizations for union units
=================================
The following optimizations are not supposed to be implemented in the framework
of this task.
1. For a union unit containing only UNION ALL with an ORDER BY, send rows from
selects directly to the sorting procedure.
2. For a union unit at the top level of the query without an ORDER BY clause,
send any row received from an operand of a UNION operation directly to the
output stream as soon as a lookup in the temporary table has confirmed that
it is not a duplicate.
3. Do not use a temporary table for any union unit used in an EXISTS or IN
subquery.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Igor): Implement UNION ALL without usage of a temporary table (44)
by worklog-noreply@askmonty.org 14 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Implement UNION ALL without usage of a temporary table
CREATION DATE..: Fri, 14 Aug 2009, 08:31
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....:
COPIES TO......: Monty, Psergey
CATEGORY.......: Client-BackLog
TASK ID........: 44 (http://askmonty.org/worklog/?tid=44)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
Currently, when any union operation is executed, the rows received from its
operands are always sent to a temporary table. However, for a UNION ALL
operation that is used at the top level of a query without an ORDER BY clause
this is not necessary: the rows could be sent directly to the client.
The goal of this task is to provide an implementation of the UNION ALL
operation that does not use a temporary table at all in certain, most common cases.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Rev 2720: Merge maria-5.1 -> maria-5.1-table-elimination in file:///home/psergey/dev/maria-5.1-table-elim-r10/
by Sergey Petrunya 13 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r10/
------------------------------------------------------------
revno: 2720
revision-id: psergey(a)askmonty.org-20090813211212-jghejwxsl6adtopl
parent: knielsen(a)knielsen-hq.org-20090805072137-wg97dcem1cxnzt3p
parent: psergey(a)askmonty.org-20090813204452-o8whzlbio19cgkyv
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r10
timestamp: Fri 2009-08-14 01:12:12 +0400
message:
Merge maria-5.1 -> maria-5.1-table-elimination
added:
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
sql-bench/test-table-elimination.sh testtableelimination-20090616194329-gai92muve732qknl-1
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
modified:
.bzrignore sp1f-ignore-20001018235455-q4gxfbritt5f42nwix354ufpsvrf5ebj
libmysqld/Makefile.am sp1f-makefile.am-20010411110351-26htpk3ynkyh7pkfvnshztqrxx3few4g
mysql-test/r/mysql-bug41486.result mysqlbug41486.result-20090323135900-fobg67a3yzg0b7e8-1
mysql-test/r/ps_11bugs.result sp1f-ps_11bugs.result-20041012140047-4pktjlfeq27q6bxqfdsbcszr5nybv6zz
mysql-test/r/select.result sp1f-select.result-20010103001548-znkoalxem6wchsbxizfosjhpfmhfyxuk
mysql-test/r/subselect.result sp1f-subselect.result-20020512204640-zgegcsgavnfd7t7eyrf7ibuqomsw7uzo
mysql-test/r/union.result sp1f-unions_one.result-20010725122836-ofxtwraxeohz7whhrmfdz57sl4a5prmp
mysql-test/t/mysql-bug41486.test mysqlbug41486.test-20090323135900-fobg67a3yzg0b7e8-2
mysql-test/valgrind.supp sp1f-valgrind.supp-20050406142216-yg7xhezklqhgqlc3inx36vbghodhbovy
sql/CMakeLists.txt sp1f-cmakelists.txt-20060831175237-esoeu5kpdtwjvehkghwy6fzbleniq2wy
sql/Makefile.am sp1f-makefile.am-19700101030959-xsjdiakci3nqcdd4xl4yomwdl5eo2f3q
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/item_subselect.cc sp1f-item_subselect.cc-20020512204640-qep43aqhsfrwkqmrobni6czc3fqj36oo
sql/item_subselect.h sp1f-item_subselect.h-20020512204640-qdg77wil56cxyhtc2bjjdrppxq3wqgh3
sql/item_sum.cc sp1f-item_sum.cc-19700101030959-4woo23bi3am2t2zvsddqbpxk7xbttdkm
sql/item_sum.h sp1f-item_sum.h-19700101030959-ecgohlekwm355wxl5fv4zzq3alalbwyl
sql/sql_bitmap.h sp1f-sql_bitmap.h-20031024204444-g4eiad7vopzqxe2trxmt3fn3xsvnomvj
sql/sql_lex.cc sp1f-sql_lex.cc-19700101030959-4pizwlu5rqkti27gcwsvxkawq6bc2kph
sql/sql_lex.h sp1f-sql_lex.h-19700101030959-sgldb2sooc7twtw5q7pgjx7qzqiaa3sn
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
sql/table.h sp1f-table.h-19700101030959-dv72bajftxj5fbdjuajquappanuv2ija
------------------------------------------------------------
revno: 2707.1.27
revision-id: psergey(a)askmonty.org-20090813204452-o8whzlbio19cgkyv
parent: psergey(a)askmonty.org-20090813191053-g1xfeieoti4bqgbc
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Fri 2009-08-14 00:44:52 +0400
message:
MWL#17: Table elimination
- More function renames, added comments
modified:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
------------------------------------------------------------
revno: 2707.1.26
revision-id: psergey(a)askmonty.org-20090813191053-g1xfeieoti4bqgbc
parent: psergey(a)askmonty.org-20090813093613-hy7tdlsgdy83xszq
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 23:10:53 +0400
message:
MWL#17: Table elimination
- Better comments
modified:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.25
revision-id: psergey(a)askmonty.org-20090813093613-hy7tdlsgdy83xszq
parent: psergey(a)askmonty.org-20090813092402-jlqucf6nultxlv4b
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 13:36:13 +0400
message:
MWL#17: Table elimination
Fixes after post-review fixes:
- Don't search for tables in JOIN_TAB array. it's not initialized yet.
use select_lex->leaf_tables instead.
modified:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
------------------------------------------------------------
revno: 2707.1.24
revision-id: psergey(a)askmonty.org-20090813092402-jlqucf6nultxlv4b
parent: psergey(a)askmonty.org-20090813000143-dukzk352hjywidk7
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 13:24:02 +0400
message:
MWL#17: Table elimination
- Post-postreview changes fix: Do set NESTED_JOIN::n_tables to number of
tables left after elimination.
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.23
revision-id: psergey(a)askmonty.org-20090813000143-dukzk352hjywidk7
parent: psergey(a)askmonty.org-20090812234302-10es7qmf0m09ahbq
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 04:01:43 +0400
message:
MWL#17: Table elimination
- When making inferences "field is bound" -> "key is bound", do check
that the field is part of the key
modified:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
------------------------------------------------------------
revno: 2707.1.22
revision-id: psergey(a)askmonty.org-20090812234302-10es7qmf0m09ahbq
parent: psergey(a)askmonty.org-20090812223421-w4xyzj7azqgo83ps
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 03:43:02 +0400
message:
MWL#17: Table elimination
- Continue addressing review feedback: remove "unusable KEYUSEs"
extension as it is no longer needed.
modified:
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
------------------------------------------------------------
revno: 2707.1.21
revision-id: psergey(a)askmonty.org-20090812223421-w4xyzj7azqgo83ps
parent: psergey(a)askmonty.org-20090708171038-9nyc3hcg1o7h8635
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 02:34:21 +0400
message:
MWL#17: Table elimination
Address review feedback:
- Change from Wave-based approach (a-la const table detection) to
building and walking functional dependency graph.
- Change from piggy-backing on ref-access code and KEYUSE structures
to using our own expression analyzer.
modified:
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
sql/sql_bitmap.h sp1f-sql_bitmap.h-20031024204444-g4eiad7vopzqxe2trxmt3fn3xsvnomvj
------------------------------------------------------------
revno: 2707.1.20
revision-id: psergey(a)askmonty.org-20090708171038-9nyc3hcg1o7h8635
parent: psergey(a)askmonty.org-20090630132018-8qwou8bqiq5z1qjg
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-07-08 21:10:38 +0400
message:
MWL#17: Table elimination
- When collecting Item_subselect::refers_to, put references to the correct
subselect entry.
modified:
sql/sql_lex.cc sp1f-sql_lex.cc-19700101030959-4pizwlu5rqkti27gcwsvxkawq6bc2kph
------------------------------------------------------------
revno: 2707.1.19
revision-id: psergey(a)askmonty.org-20090630132018-8qwou8bqiq5z1qjg
parent: psergey(a)askmonty.org-20090630131100-r6o8yqzse4yvny9l
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Tue 2009-06-30 17:20:18 +0400
message:
MWL#17: Table elimination
- More comments
- Remove old code
modified:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
------------------------------------------------------------
revno: 2707.1.18
revision-id: psergey(a)askmonty.org-20090630131100-r6o8yqzse4yvny9l
parent: psergey(a)askmonty.org-20090629135115-472up9wsj0dq843i
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Tue 2009-06-30 17:11:00 +0400
message:
MWL#17: Table elimination
- Last fixes
modified:
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
sql/table.h sp1f-table.h-19700101030959-dv72bajftxj5fbdjuajquappanuv2ija
------------------------------------------------------------
revno: 2707.1.17
revision-id: psergey(a)askmonty.org-20090629135115-472up9wsj0dq843i
parent: psergey(a)askmonty.org-20090625200729-u11xpwwn5ebddx09
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Mon 2009-06-29 17:51:15 +0400
message:
MWL#17: Table elimination
modified:
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
sql/table.h sp1f-table.h-19700101030959-dv72bajftxj5fbdjuajquappanuv2ija
------------------------------------------------------------
revno: 2707.1.16
revision-id: psergey(a)askmonty.org-20090625200729-u11xpwwn5ebddx09
parent: psergey(a)askmonty.org-20090625100947-mg9xwnbeyyjgzl3w
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-movearound
timestamp: Fri 2009-06-26 00:07:29 +0400
message:
MWL#17: Table elimination
- Better comments, variable/function renames
modified:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
------------------------------------------------------------
revno: 2707.1.15
revision-id: psergey(a)askmonty.org-20090625100947-mg9xwnbeyyjgzl3w
parent: psergey(a)askmonty.org-20090624224414-71xqbljy8jf4z1qs
parent: psergey(a)askmonty.org-20090625100553-j1xenbz3o5nekiu2
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Thu 2009-06-25 14:09:47 +0400
message:
Automerge
added:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
modified:
.bzrignore sp1f-ignore-20001018235455-q4gxfbritt5f42nwix354ufpsvrf5ebj
libmysqld/Makefile.am sp1f-makefile.am-20010411110351-26htpk3ynkyh7pkfvnshztqrxx3few4g
sql/CMakeLists.txt sp1f-cmakelists.txt-20060831175237-esoeu5kpdtwjvehkghwy6fzbleniq2wy
sql/Makefile.am sp1f-makefile.am-19700101030959-xsjdiakci3nqcdd4xl4yomwdl5eo2f3q
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/item_subselect.cc sp1f-item_subselect.cc-20020512204640-qep43aqhsfrwkqmrobni6czc3fqj36oo
sql/item_sum.h sp1f-item_sum.h-19700101030959-ecgohlekwm355wxl5fv4zzq3alalbwyl
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
------------------------------------------------------------
revno: 2707.3.1
revision-id: psergey(a)askmonty.org-20090625100553-j1xenbz3o5nekiu2
parent: psergey(a)askmonty.org-20090624090104-c63mp3sfxcxytk0d
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-movearound
timestamp: Thu 2009-06-25 14:05:53 +0400
message:
MWL#17: Table elimination
- Moved table elimination code to sql/opt_table_elimination.cc
- Added comments
added:
sql/opt_table_elimination.cc opt_table_eliminatio-20090625095316-7ka9w3zr7n5114iv-1
modified:
.bzrignore sp1f-ignore-20001018235455-q4gxfbritt5f42nwix354ufpsvrf5ebj
libmysqld/Makefile.am sp1f-makefile.am-20010411110351-26htpk3ynkyh7pkfvnshztqrxx3few4g
sql/CMakeLists.txt sp1f-cmakelists.txt-20060831175237-esoeu5kpdtwjvehkghwy6fzbleniq2wy
sql/Makefile.am sp1f-makefile.am-19700101030959-xsjdiakci3nqcdd4xl4yomwdl5eo2f3q
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/item_subselect.cc sp1f-item_subselect.cc-20020512204640-qep43aqhsfrwkqmrobni6czc3fqj36oo
sql/item_sum.h sp1f-item_sum.h-19700101030959-ecgohlekwm355wxl5fv4zzq3alalbwyl
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
------------------------------------------------------------
revno: 2707.1.14
revision-id: psergey(a)askmonty.org-20090624224414-71xqbljy8jf4z1qs
parent: psergey(a)askmonty.org-20090624090104-c63mp3sfxcxytk0d
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Thu 2009-06-25 02:44:14 +0400
message:
MWL#17: Table elimination
- fix a typo bug in has_eqref_access_candidate()
- Adjust test to remove race condition
modified:
mysql-test/r/mysql-bug41486.result mysqlbug41486.result-20090323135900-fobg67a3yzg0b7e8-1
mysql-test/t/mysql-bug41486.test mysqlbug41486.test-20090323135900-fobg67a3yzg0b7e8-2
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
------------------------------------------------------------
revno: 2707.1.13
revision-id: psergey(a)askmonty.org-20090624090104-c63mp3sfxcxytk0d
parent: psergey(a)askmonty.org-20090623200613-w9dl8g41ysf51r80
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-06-24 13:01:04 +0400
message:
More comments
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.12
revision-id: psergey(a)askmonty.org-20090623200613-w9dl8g41ysf51r80
parent: psergey(a)askmonty.org-20090622114631-yop0q2p8ktmfnctm
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-06-24 00:06:13 +0400
message:
MWL#17: Table elimination
- More testcases
- Let add_ft_key() set keyuse->usable
modified:
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
sql-bench/test-table-elimination.sh testtableelimination-20090616194329-gai92muve732qknl-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.11
revision-id: psergey(a)askmonty.org-20090622114631-yop0q2p8ktmfnctm
parent: psergey(a)askmonty.org-20090617052739-37i1r8lip0m4ft9r
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Mon 2009-06-22 15:46:31 +0400
message:
MWL#17: Table elimination
- Make the elimination check able to detect cases like t.primary_key_col1=othertbl.col AND t.primary_key_col2=func(t.primary_key_col1).
These are needed to handle e.g. the case of func() being a correlated subquery that selects the latest value.
- If we've removed a condition with subquery predicate, EXPLAIN [EXTENDED] won't show the subquery anymore
modified:
sql/item.cc sp1f-item.cc-19700101030959-u7hxqopwpfly4kf5ctlyk2dvrq4l3dhn
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/item_subselect.cc sp1f-item_subselect.cc-20020512204640-qep43aqhsfrwkqmrobni6czc3fqj36oo
sql/item_subselect.h sp1f-item_subselect.h-20020512204640-qdg77wil56cxyhtc2bjjdrppxq3wqgh3
sql/item_sum.cc sp1f-item_sum.cc-19700101030959-4woo23bi3am2t2zvsddqbpxk7xbttdkm
sql/sql_lex.cc sp1f-sql_lex.cc-19700101030959-4pizwlu5rqkti27gcwsvxkawq6bc2kph
sql/sql_lex.h sp1f-sql_lex.h-19700101030959-sgldb2sooc7twtw5q7pgjx7qzqiaa3sn
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
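
The check described in revno 2707.1.11 (t.primary_key_col1=othertbl.col AND t.primary_key_col2=func(t.primary_key_col1)) can be pictured as a small fixed-point computation. What follows is only a sketch under assumptions, not the code from the changeset: Eq_sketch and key_is_bound are hypothetical names, columns are plain integers, and an equality whose expression refers only to outside tables is modelled as having an empty needs_cols list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one "t.col = expr" equality (not a server class).
struct Eq_sketch
{
  int col;                      // column of t on the left-hand side
  std::vector<int> needs_cols;  // columns of t that expr refers to
};

// Returns true if every part of the unique key ends up bound.
bool key_is_bound(const std::vector<int> &key_cols,
                  const std::vector<Eq_sketch> &eqs, int n_cols)
{
  assert(n_cols > 0);
  std::vector<bool> bound(n_cols, false);
  bool changed= true;
  while (changed)               // repeat until no new column becomes bound
  {
    changed= false;
    for (const Eq_sketch &eq : eqs)
    {
      if (bound[eq.col])
        continue;
      bool ready= true;         // expr is usable once all its columns are bound
      for (int c : eq.needs_cols)
        ready= ready && bound[c];
      if (ready)
      {
        bound[eq.col]= true;
        changed= true;
      }
    }
  }
  for (int c : key_cols)
    if (!bound[c])
      return false;
  return true;
}
```

With equalities for both key parts the key is reported as bound regardless of the order in which they are listed; dropping the col1 equality leaves col2=func(col1) unusable, so the key stays unbound.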
------------------------------------------------------------
revno: 2707.1.10
revision-id: psergey(a)askmonty.org-20090617052739-37i1r8lip0m4ft9r
parent: psergey(a)askmonty.org-20090616204358-yjkyfxczsomrn9yn
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-06-17 09:27:39 +0400
message:
* Use excessive parentheses to stop compiler warning
* Fix test results to account for changes in previous cset
modified:
mysql-test/r/select.result sp1f-select.result-20010103001548-znkoalxem6wchsbxizfosjhpfmhfyxuk
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.9
revision-id: psergey(a)askmonty.org-20090616204358-yjkyfxczsomrn9yn
parent: psergey(a)askmonty.org-20090616195413-rfmi9un20za8gn8g
parent: psergey(a)askmonty.org-20090615162208-p4w8s8jo06bdz1vj
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-06-17 00:43:58 +0400
message:
* Merge
* Change valgrind suppression to work on valgrind 3.3.0
modified:
mysql-test/valgrind.supp sp1f-valgrind.supp-20050406142216-yg7xhezklqhgqlc3inx36vbghodhbovy
------------------------------------------------------------
revno: 2707.2.1
revision-id: psergey(a)askmonty.org-20090615162208-p4w8s8jo06bdz1vj
parent: psergey(a)askmonty.org-20090614205924-1vnfwbuo4brzyfhp
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-movearound
timestamp: Mon 2009-06-15 20:22:08 +0400
message:
Fix spurious valgrind warnings in rpl_trigger.test
modified:
mysql-test/valgrind.supp sp1f-valgrind.supp-20050406142216-yg7xhezklqhgqlc3inx36vbghodhbovy
------------------------------------------------------------
revno: 2707.1.8
revision-id: psergey(a)askmonty.org-20090616195413-rfmi9un20za8gn8g
parent: psergey(a)askmonty.org-20090614205924-1vnfwbuo4brzyfhp
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Tue 2009-06-16 23:54:13 +0400
message:
MWL#17: Table elimination
- Move eliminate_tables() to before constant table detection.
- First code for benchmark
added:
sql-bench/test-table-elimination.sh testtableelimination-20090616194329-gai92muve732qknl-1
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.7
revision-id: psergey(a)askmonty.org-20090614205924-1vnfwbuo4brzyfhp
parent: psergey(a)askmonty.org-20090614123504-jf4pcb333ojwaxfy
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Mon 2009-06-15 00:59:24 +0400
message:
MWL#17: Table elimination
- Fix print_join() to work both for EXPLAIN EXTENDED (after table elimination) and for
CREATE VIEW (after join->prepare() but without any optimization).
modified:
mysql-test/r/union.result sp1f-unions_one.result-20010725122836-ofxtwraxeohz7whhrmfdz57sl4a5prmp
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.6
revision-id: psergey(a)askmonty.org-20090614123504-jf4pcb333ojwaxfy
parent: psergey(a)askmonty.org-20090614100110-u7l54gk0b6zbtj50
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Sun 2009-06-14 16:35:04 +0400
message:
MWL#17: Table elimination
- Fix the previous cset: take into account that select_lex may be printed when
1. There is no select_lex->join at all (in that case, assume that no tables were eliminated)
2. select_lex->join exists but there was no JOIN::optimize() call yet. Handle this by initializing join->eliminated very early.
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
------------------------------------------------------------
revno: 2707.1.5
revision-id: psergey(a)askmonty.org-20090614100110-u7l54gk0b6zbtj50
parent: psergey(a)askmonty.org-20090609211133-wfau2tgwo2vpgc5d
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Sun 2009-06-14 14:01:10 +0400
message:
MWL#17: Table elimination
- Do not show eliminated tables in the output of EXPLAIN EXTENDED
modified:
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
sql/table.h sp1f-table.h-19700101030959-dv72bajftxj5fbdjuajquappanuv2ija
------------------------------------------------------------
revno: 2707.1.4
revision-id: psergey(a)askmonty.org-20090609211133-wfau2tgwo2vpgc5d
parent: psergey(a)askmonty.org-20090608135546-ut1yrzbah4gdw6e6
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-06-10 01:11:33 +0400
message:
MWL#17: Table elimination
- Make elimination work with aggregate functions. The problem was that aggregate functions
reported all table bits in used_tables(), and that prevented table elimination. Fixed by
making aggregate functions return a more accurate value from used_tables().
modified:
mysql-test/r/ps_11bugs.result sp1f-ps_11bugs.result-20041012140047-4pktjlfeq27q6bxqfdsbcszr5nybv6zz
mysql-test/r/subselect.result sp1f-subselect.result-20020512204640-zgegcsgavnfd7t7eyrf7ibuqomsw7uzo
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
sql/item.h sp1f-item.h-19700101030959-rrkb43htudd62batmoteashkebcwykpa
sql/item_sum.cc sp1f-item_sum.cc-19700101030959-4woo23bi3am2t2zvsddqbpxk7xbttdkm
sql/item_sum.h sp1f-item_sum.h-19700101030959-ecgohlekwm355wxl5fv4zzq3alalbwyl
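
To illustrate why the used_tables() change in revno 2707.1.4 matters, here is a rough sketch. It assumes that the "more correct value" is the union of the tables referenced by the aggregate's arguments; the log message does not spell that out. Agg_sketch, table_map_sketch and blocks_elimination are hypothetical names and do not exist in the server.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::uint64_t table_map_sketch;   // one bit per table in the join

// Hypothetical stand-in for an aggregate function item.
struct Agg_sketch
{
  std::vector<table_map_sketch> arg_tables;  // used_tables() of each argument

  // "Before": report every table in the join as used.
  table_map_sketch used_tables_all(unsigned n_tables) const
  {
    assert(n_tables > 0 && n_tables < 64);
    return (table_map_sketch(1) << n_tables) - 1;
  }

  // "After" (assumed): report only the tables the arguments refer to.
  table_map_sketch used_tables_precise() const
  {
    table_map_sketch map= 0;
    for (table_map_sketch m : arg_tables)
      map|= m;
    return map;
  }
};

// A table referenced from the select list cannot be eliminated.
inline bool blocks_elimination(table_map_sketch used, unsigned tableno)
{
  return (used & (table_map_sketch(1) << tableno)) != 0;
}
```

For SUM(t1.a) in a two-table join, the first variant marks the second table as used and so blocks its elimination, while the second variant does not.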
------------------------------------------------------------
revno: 2707.1.3
revision-id: psergey(a)askmonty.org-20090608135546-ut1yrzbah4gdw6e6
parent: psergey(a)askmonty.org-20090607182938-ycajee5ozg33b7c8
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-fix
timestamp: Mon 2009-06-08 17:55:46 +0400
message:
Fix valgrind failure: provide an implementation of strmov_overlapp() that really can
handle overlapping.
added:
strings/strmov_overlapp.c strmov_overlapp.c-20090608135132-403c5p4dlnexqwxi-1
modified:
include/m_string.h sp1f-m_string.h-19700101030959-rraattbvw5ffkokv4sixxf3s7brqqaga
libmysql/Makefile.shared sp1f-makefile.shared-20000818182429-m3kdhxi23vorlqjct2y2hl3yw357jtxt
strings/Makefile.am sp1f-makefile.am-19700101030959-jfitkanzc3r4h2otoyaaprgqn7muf4ux
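
For readers unfamiliar with the problem fixed in revno 2707.1.3: a string move that copies forward byte by byte is not guaranteed to be valid when source and destination overlap. The sketch below is not the contents of strings/strmov_overlapp.c; strmov_overlap_sketch is a hypothetical name, and it assumes the strmov() convention of returning a pointer to the terminating NUL of the destination.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy the NUL-terminated string at src to dst and
// return a pointer to the terminating NUL in dst. The two ranges may overlap.
char *strmov_overlap_sketch(char *dst, const char *src)
{
  assert(dst != nullptr && src != nullptr);
  std::size_t len= std::strlen(src);
  std::memmove(dst, src, len + 1);   // memmove is defined for overlapping ranges
  return dst + len;
}
```

A typical use is shifting a string left inside its own buffer, e.g. strmov_overlap_sketch(buf, buf + 2) to drop a two-character prefix.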
------------------------------------------------------------
revno: 2707.1.2
revision-id: psergey(a)askmonty.org-20090607182938-ycajee5ozg33b7c8
parent: psergey(a)askmonty.org-20090603182330-ll3gc91iowhtgb23
parent: psergey(a)askmonty.org-20090607182403-6sfpvdr7nkkekcy9
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1
timestamp: Sun 2009-06-07 22:29:38 +0400
message:
Merge MWL#17: Table elimination
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2705.2.2
revision-id: psergey(a)askmonty.org-20090607182403-6sfpvdr7nkkekcy9
parent: psergey(a)askmonty.org-20090603131045-c8jqhwlanli7eimv
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Sun 2009-06-07 22:24:03 +0400
message:
MWL#17: Table Elimination
- Fix trivial valgrind warning
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
------------------------------------------------------------
revno: 2707.1.1
revision-id: psergey(a)askmonty.org-20090603182330-ll3gc91iowhtgb23
parent: knielsen(a)knielsen-hq.org-20090602110359-n4q9gof38buucrny
parent: psergey(a)askmonty.org-20090603131045-c8jqhwlanli7eimv
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1
timestamp: Wed 2009-06-03 22:23:30 +0400
message:
Merge MWL#17 with maria/5.1
added:
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
sql/table.h sp1f-table.h-19700101030959-dv72bajftxj5fbdjuajquappanuv2ija
------------------------------------------------------------
revno: 2705.2.1
revision-id: psergey(a)askmonty.org-20090603131045-c8jqhwlanli7eimv
parent: knielsen(a)knielsen-hq.org-20090522175325-xpwm83ilnhqoqjz0
committer: Sergey Petrunia <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim
timestamp: Wed 2009-06-03 17:10:45 +0400
message:
MWL#17: Table elimination
- First code. Elimination works for simple cases, passes the testsuite.
- Known issues:
= No elimination is done for aggregate functions.
= EXPLAIN EXTENDED shows eliminated tables (I think it had better not)
= No benchmark yet
= The code needs some polishing.
added:
mysql-test/r/table_elim.result table_elim.result-20090603125022-nge13y0ohk1g2tt2-1
mysql-test/t/table_elim.test table_elim.test-20090603125018-ka3vcfrm07bsldz8-1
modified:
sql/sql_select.cc sp1f-sql_select.cc-19700101030959-egb7whpkh76zzvikycs5nsnuviu4fdlb
sql/sql_select.h sp1f-sql_select.h-19700101030959-oqegfxr76xlgmrzd6qlevonoibfnwzoz
sql/table.h sp1f-table.h-19700101030959-dv72bajftxj5fbdjuajquappanuv2ija
Diff too large for email (3022 lines, the limit is 1000).

[Maria-developers] Rev 2734: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 13 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2734
revision-id: psergey(a)askmonty.org-20090813204452-o8whzlbio19cgkyv
parent: psergey(a)askmonty.org-20090813191053-g1xfeieoti4bqgbc
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Fri 2009-08-14 00:44:52 +0400
message:
MWL#17: Table elimination
- More function renames, added comments
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-13 19:10:53 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-13 20:44:52 +0000
@@ -93,11 +93,9 @@
/*
- A field.
- - Depends on table or equality
- - Has expressions it participates as dependencies
-
- There is no counter, bound fields are in $list, not bound are not.
+ A table field. There is only one such object for any tblX.fieldY
+ - the field depends on its table and equalities
+ - expressions that use the field are its dependencies
*/
class Field_dep : public Func_dep
{
@@ -107,19 +105,23 @@
{
type= Func_dep::FD_FIELD;
}
- /* Table we're from. It also has pointers to keys that we're part of */
- Table_dep *table;
+
+ Table_dep *table; /* Table this field is from */
Field *field;
+ /*
+ Field_deps that belong to one table form a linked list. List members are
+ ordered by field_index
+ */
Field_dep *next_table_field;
uint bitmap_offset; /* Offset of our part of the bitmap */
};
/*
- A unique key.
- - Depends on all its components
- - Has its table as dependency
+ A Unique key.
+ - Unique key depends on all of its components
+ - Key's table is its dependency
*/
class Key_dep: public Func_dep
{
@@ -133,14 +135,15 @@
Table_dep *table; /* Table this key is from */
uint keyno;
uint n_missing_keyparts;
+ /* Unique keys form a linked list, ordered by keyno */
Key_dep *next_table_key;
};
/*
- A table.
- - Depends on any of its unique keys
- - Has its fields and embedding outer join as dependency.
+ A table.
+ - table depends on any of its unique keys
+ - has its fields and embedding outer join as dependency.
*/
class Table_dep : public Func_dep
{
@@ -151,16 +154,16 @@
type= Func_dep::FD_TABLE;
}
TABLE *table;
- Field_dep *fields; /* Fields that belong to this table */
- Key_dep *keys; /* Unique keys */
- Outer_join_dep *outer_join_dep;
+ Field_dep *fields; /* Ordered list of fields that belong to this table */
+ Key_dep *keys; /* Ordered list of Unique keys in this table */
+ Outer_join_dep *outer_join_dep; /* Innermost eliminable outer join we're in */
};
/*
- An outer join nest.
- - Depends on all tables inside it.
- - (And that's it).
+ An outer join nest that is subject to elimination
+ - it depends on all tables inside it
+ - has its parent outer join as dependency
*/
class Outer_join_dep: public Func_dep
{
@@ -171,14 +174,27 @@
{
type= Func_dep::FD_OUTER_JOIN;
}
+ /*
+ Outer join we're representing. This can be a join nest or a single table
+ that is outer-joined.
+ */
TABLE_LIST *table_list;
+
+ /*
+ Tables within this outer join (and its descendants) that are not yet known
+ to be functionally dependent.
+ */
table_map missing_tables;
+ /* All tables within this outer join and its descendants */
table_map all_tables;
+ /* Parent eliminable outer join, if any */
Outer_join_dep *parent;
};
-/* TODO need this? */
+/*
+ Table elimination context
+*/
class Table_elimination
{
public:
@@ -204,20 +220,22 @@
static
-void build_funcdeps_for_cond(Table_elimination *te, Equality_dep **fdeps,
- uint *and_level, Item *cond,
- table_map usable_tables);
+void build_eq_deps_for_cond(Table_elimination *te, Equality_dep **fdeps,
+ uint *and_level, Item *cond,
+ table_map usable_tables);
static
-void add_funcdep(Table_elimination *te,
- Equality_dep **eq_dep, uint and_level,
- Item_func *cond, Field *field,
- bool eq_func, Item **value,
- uint num_values, table_map usable_tables);
+void add_eq_dep(Table_elimination *te,
+ Equality_dep **eq_dep, uint and_level,
+ Item_func *cond, Field *field,
+ bool eq_func, Item **value,
+ uint num_values, table_map usable_tables);
static
Equality_dep *merge_func_deps(Equality_dep *start, Equality_dep *new_fields,
Equality_dep *end, uint and_level);
-Field_dep *get_field_dep(Table_elimination *te, Field *field);
+static Table_dep *get_table_dep(Table_elimination *te, TABLE *table);
+static Field_dep *get_field_dep(Table_elimination *te, Field *field);
+
void eliminate_tables(JOIN *join);
static void mark_as_eliminated(JOIN *join, TABLE_LIST *tbl);
@@ -228,24 +246,25 @@
/*******************************************************************************************/
/*
- Produce FUNC_DEP elements for the given item (i.e. condition) and add them
- to fdeps array.
+ Produce Eq_dep elements for given condition.
SYNOPSIS
- build_funcdeps_for_cond()
- fdeps INOUT Put created FUNC_DEP structures here
-
+ build_eq_deps_for_cond()
+ te Table elimination context
+ fdeps INOUT Put produced equality conditions here
+ and_level INOUT AND-level (like in add_key_fields)
+ cond Condition to process
+ usable_tables Tables whose fields we're interested in. That is,
+ Equality_dep elements represent "tbl.col=expr" and
+ we'll produce them only if tbl is in usable_tables.
DESCRIPTION
- a
-
- SEE ALSO
- add_key_fields()
-
+ This function is modeled after add_key_fields()
*/
+
static
-void build_funcdeps_for_cond(Table_elimination *te,
- Equality_dep **fdeps, uint *and_level, Item *cond,
- table_map usable_tables)
+void build_eq_deps_for_cond(Table_elimination *te, Equality_dep **fdeps,
+ uint *and_level, Item *cond,
+ table_map usable_tables)
{
if (cond->type() == Item_func::COND_ITEM)
{
@@ -258,7 +277,7 @@
Item *item;
while ((item=li++))
{
- build_funcdeps_for_cond(te, fdeps, and_level, item, usable_tables);
+ build_eq_deps_for_cond(te, fdeps, and_level, item, usable_tables);
}
/*
TODO: inject here a "if we have {t.col=const AND t.col=smth_else}, then
@@ -270,13 +289,13 @@
else
{
(*and_level)++;
- build_funcdeps_for_cond(te, fdeps, and_level, li++, usable_tables);
+ build_eq_deps_for_cond(te, fdeps, and_level, li++, usable_tables);
Item *item;
while ((item=li++))
{
Equality_dep *start_key_fields= *fdeps;
(*and_level)++;
- build_funcdeps_for_cond(te, fdeps, and_level, item, usable_tables);
+ build_eq_deps_for_cond(te, fdeps, and_level, item, usable_tables);
*fdeps= merge_func_deps(org_key_fields, start_key_fields, *fdeps,
++(*and_level));
}
@@ -304,11 +323,11 @@
values--;
DBUG_ASSERT(cond_func->functype() != Item_func::IN_FUNC ||
cond_func->argument_count() != 2);
- add_funcdep(te, fdeps, *and_level, cond_func,
- ((Item_field*)(cond_func->key_item()->real_item()))->field,
- 0, values,
- cond_func->argument_count()-1,
- usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func,
+ ((Item_field*)(cond_func->key_item()->real_item()))->field,
+ 0, values,
+ cond_func->argument_count()-1,
+ usable_tables);
}
if (cond_func->functype() == Item_func::BETWEEN)
{
@@ -321,8 +340,8 @@
!(cond_func->arguments()[i]->used_tables() & OUTER_REF_TABLE_BIT))
{
field_item= (Item_field *) (cond_func->arguments()[i]->real_item());
- add_funcdep(te, fdeps, *and_level, cond_func,
- field_item->field, 0, values, 1, usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func,
+ field_item->field, 0, values, 1, usable_tables);
}
}
}
@@ -336,19 +355,19 @@
if (cond_func->arguments()[0]->real_item()->type() == Item::FIELD_ITEM &&
!(cond_func->arguments()[0]->used_tables() & OUTER_REF_TABLE_BIT))
{
- add_funcdep(te, fdeps, *and_level, cond_func,
- ((Item_field*)(cond_func->arguments()[0])->real_item())->field,
- equal_func,
- cond_func->arguments()+1, 1, usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func,
+ ((Item_field*)(cond_func->arguments()[0])->real_item())->field,
+ equal_func,
+ cond_func->arguments()+1, 1, usable_tables);
}
if (cond_func->arguments()[1]->real_item()->type() == Item::FIELD_ITEM &&
cond_func->functype() != Item_func::LIKE_FUNC &&
!(cond_func->arguments()[1]->used_tables() & OUTER_REF_TABLE_BIT))
{
- add_funcdep(te, fdeps, *and_level, cond_func,
- ((Item_field*)(cond_func->arguments()[1])->real_item())->field,
- equal_func,
- cond_func->arguments(),1,usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func,
+ ((Item_field*)(cond_func->arguments()[1])->real_item())->field,
+ equal_func,
+ cond_func->arguments(),1,usable_tables);
}
break;
}
@@ -360,10 +379,10 @@
Item *tmp=new Item_null;
if (unlikely(!tmp)) // Should never be true
return;
- add_funcdep(te, fdeps, *and_level, cond_func,
- ((Item_field*)(cond_func->arguments()[0])->real_item())->field,
- cond_func->functype() == Item_func::ISNULL_FUNC,
- &tmp, 1, usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func,
+ ((Item_field*)(cond_func->arguments()[0])->real_item())->field,
+ cond_func->functype() == Item_func::ISNULL_FUNC,
+ &tmp, 1, usable_tables);
}
break;
case Item_func::OPTIMIZE_EQUAL:
@@ -380,8 +399,8 @@
*/
while ((item= it++))
{
- add_funcdep(te, fdeps, *and_level, cond_func, item->field,
- TRUE, &const_item, 1, usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func, item->field,
+ TRUE, &const_item, 1, usable_tables);
}
}
else
@@ -400,8 +419,8 @@
{
if (!field->eq(item->field))
{
- add_funcdep(te, fdeps, *and_level, cond_func, field/*item*/,
- TRUE, (Item **) &item, 1, usable_tables);
+ add_eq_dep(te, fdeps, *and_level, cond_func, field,
+ TRUE, (Item **) &item, 1, usable_tables);
}
}
it.rewind();
@@ -411,15 +430,19 @@
}
}
+
/*
- Perform an OR operation on two (adjacent) FUNC_DEP arrays.
+ Perform an OR operation on two (adjacent) Equality_dep arrays.
SYNOPSIS
merge_func_deps()
+ start Start of left OR-part
+ new_fields Start of right OR-part
+ end End of right OR-part
+ and_level AND-level.
DESCRIPTION
-
- This function is invoked for two adjacent arrays of FUNC_DEP elements:
+ This function is invoked for two adjacent arrays of Equality_dep elements:
$LEFT_PART $RIGHT_PART
+-----------------------+-----------------------+
@@ -527,17 +550,18 @@
/*
- Add a funcdep for a given equality.
+ Add an Equality_dep element for a given predicate, if applicable
+
+ DESCRIPTION
+ This function is modeled after add_key_field().
*/
static
-void add_funcdep(Table_elimination *te,
- Equality_dep **eq_dep, uint and_level,
- Item_func *cond, Field *field,
- bool eq_func, Item **value,
- uint num_values, table_map usable_tables)
+void add_eq_dep(Table_elimination *te, Equality_dep **eq_dep,
+ uint and_level, Item_func *cond, Field *field,
+ bool eq_func, Item **value, uint num_values,
+ table_map usable_tables)
{
- // Field *field= item_field->field;
if (!(field->table->map & usable_tables))
return;
@@ -606,7 +630,11 @@
}
-Table_dep *get_table_dep(Table_elimination *te, TABLE *table)
+/*
+ Get a Table_dep object for the given table, creating it if necessary.
+*/
+
+static Table_dep *get_table_dep(Table_elimination *te, TABLE *table)
{
Table_dep *tbl_dep= new Table_dep(table);
Key_dep **key_list= &(tbl_dep->keys);
@@ -625,19 +653,21 @@
return te->table_deps[table->tablenr] = tbl_dep;
}
+
/*
- Given a field, get its dependency element: if it already exists, find it,
- otherwise create it.
+ Get a Field_dep object for the given field, creating it if necessary
*/
-Field_dep *get_field_dep(Table_elimination *te, Field *field)
+static Field_dep *get_field_dep(Table_elimination *te, Field *field)
{
TABLE *table= field->table;
Table_dep *tbl_dep;
+ /* First, get the table*/
if (!(tbl_dep= te->table_deps[table->tablenr]))
tbl_dep= get_table_dep(te, table);
-
+
+ /* Try finding the field in field list */
Field_dep **pfield= &(tbl_dep->fields);
while (*pfield && (*pfield)->field->field_index < field->field_index)
{
@@ -646,20 +676,34 @@
if (*pfield && (*pfield)->field->field_index == field->field_index)
return *pfield;
+ /* Create the field and insert it in the list */
Field_dep *new_field= new Field_dep(tbl_dep, field);
-
new_field->next_table_field= *pfield;
*pfield= new_field;
+
return new_field;
}
+/*
+ Create an Outer_join_dep object for the given outer join
+
+ DESCRIPTION
+ Outer_join_dep objects for children (or further descendants) are always
+ created before the parents.
+*/
+
+static
Outer_join_dep *get_outer_join_dep(Table_elimination *te,
TABLE_LIST *outer_join, table_map deps_map)
{
Outer_join_dep *oj_dep;
oj_dep= new Outer_join_dep(outer_join, deps_map);
-
+
+ /*
+ Collect a bitmap of tables that we depend on, and also set parent pointer
+ for descendant outer join elements.
+ */
Table_map_iterator it(deps_map);
int idx;
while ((idx= it.next_bit()) != Table_map_iterator::BITMAP_END)
@@ -667,6 +711,11 @@
Table_dep *table_dep;
if (!(table_dep= te->table_deps[idx]))
{
+ /*
+ We get here only when the ON expression had no references to inner tables
+ and Table_dep objects weren't created for them. This is a rare/
+ unimportant case so it's OK to use a not-so-efficient search.
+ */
TABLE *table= NULL;
for (TABLE_LIST *tlist= te->join->select_lex->leaf_tables; tlist;
tlist=tlist->next_leaf)
@@ -680,7 +729,13 @@
DBUG_ASSERT(table);
table_dep= get_table_dep(te, table);
}
-
+
+ /*
+ Walk from the table up to its embedding outer joins. The goal is to
+ find the least embedded (outermost) outer join nest recorded so far
+ and set its parent pointer to point to the newly created
+ Outer_join_dep.
+ */
if (!table_dep->outer_join_dep)
table_dep->outer_join_dep= oj_dep;
else
@@ -690,43 +745,35 @@
oj= oj->parent;
oj->parent=oj_dep;
}
-
}
return oj_dep;
}
/*
- Perform table elimination in a given join list
+ Build functional dependency graph for elements of given join list
SYNOPSIS
collect_funcdeps_for_join_list()
- te Table elimination context.
- join_list Join list to work on
- its_outer_join TRUE <=> the join_list is an inner side of an
- outer join
- FALSE <=> otherwise (this is top-level join
- list, simplify_joins flattens out all
- other kinds of join lists)
-
- tables_in_list Bitmap of tables embedded in the join_list.
- tables_used_elsewhere Bitmap of tables that are referred to from
- somewhere outside of the join list (e.g.
- select list, HAVING, etc).
+ te Table elimination context.
+ join_list Join list to work on
+ build_eq_deps TRUE <=> build Equality_dep elements for all
+ members of the join list, even if they cannot
+ be individually eliminated
+ tables_used_elsewhere Bitmap of tables that are referred to from
+ somewhere outside of this join list (e.g.
+ select list, HAVING, ON expressions of parent
+ joins, etc).
+ eliminable_tables INOUT Tables that can potentially be eliminated
+ (needed so we know for which tables to build
+ dependencies for)
+ eq_dep INOUT End of array of equality dependencies.
DESCRIPTION
- Perform table elimination for a join list.
- Try eliminating children nests first.
- The "all tables in join nest can produce only one matching record
- combination" property checking is modeled after constant table detection,
- plus we reuse info attempts to eliminate child join nests.
-
- RETURN
- Number of children left after elimination. 0 means everything was
- eliminated.
+ .
*/
-static uint
+static void
collect_funcdeps_for_join_list(Table_elimination *te,
List<TABLE_LIST> *join_list,
bool build_eq_deps,
@@ -771,7 +818,7 @@
{
// build comp_cond from ON expression
uint and_level=0;
- build_funcdeps_for_cond(te, eq_dep, &and_level, tbl->on_expr,
+ build_eq_deps_for_cond(te, eq_dep, &and_level, tbl->on_expr,
*eliminable_tables);
}
@@ -781,19 +828,13 @@
tables_used_on_left |= tbl->on_expr->used_tables();
}
}
- return 0;
+ return;
}
+
/*
- Analyze exising FUNC_DEP array and add elements for tables and uniq keys
-
- SYNOPSIS
-
- DESCRIPTION
- Add FUNC_DEP elements
-
- RETURN
- .
+ This is used to analyse expressions in "tbl.col=expr" dependencies so
+ that we can figure out which fields the expression depends on.
*/
class Field_dependency_setter : public Field_enumerator
@@ -819,20 +860,41 @@
return;
}
}
- /* We didn't find the field. Bump the dependency anyway */
+ /*
+ We get here if we didn't find this field. It's not a part of
+ a unique key, and/or there is no field=expr element for it.
+ Bump the dependency anyway; this will signal that this dependency
+ cannot be satisfied.
+ */
te->equality_deps[expr_offset].unknown_args++;
}
}
+
Table_elimination *te;
- uint expr_offset; /* Offset of the expression we're processing */
+ /* Offset of the expression we're processing in the dependency bitmap */
+ uint expr_offset;
};
+/*
+ Setup equality dependencies
+
+ SYNOPSIS
+ setup_equality_deps()
+ te Table elimination context
+ bound_deps_list OUT Start of linked list of elements that were found to
+ be bound (the caller will use this to see whether
+ further elements can be declared bound)
+*/
+
static
bool setup_equality_deps(Table_elimination *te, Func_dep **bound_deps_list)
{
DBUG_ENTER("setup_equality_deps");
+ /*
+ Count Field_dep objects and assign each of them a unique bitmap_offset.
+ */
uint offset= 0;
for (Table_dep **tbl_dep=te->table_deps;
tbl_dep < te->table_deps + MAX_TABLES;
@@ -859,7 +921,10 @@
bitmap_clear_all(&te->expr_deps);
/*
- Walk through all field=expr elements and collect all fields.
+ Analyze all "field=expr" dependencies, and have te->expr_deps encode
+ dependencies of expressions on fields.
+
+ Also collect a linked list of equalities that are bound.
*/
Func_dep *bound_dep= NULL;
Field_dependency_setter deps_setter(te);
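
As a reading aid for get_field_dep() in the patch above: the per-table list of Field_dep objects is kept ordered by field_index, so a single walk both finds an existing element and locates the insertion point. Below is a minimal standalone sketch of that pattern. Node_sketch and get_or_create are hypothetical names, the loop body (elided at the hunk boundary in the diff) is assumed to advance to the next pointer, and plain new is used here where the server code allocates through Sql_alloc.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for Field_dep.
struct Node_sketch
{
  unsigned field_index;
  Node_sketch *next;
};

// Find the node for field_index in a list ordered by field_index,
// creating and linking a new node in the right place if none exists yet.
Node_sketch *get_or_create(Node_sketch **head, unsigned field_index)
{
  assert(head != nullptr);
  Node_sketch **pnode= head;
  // Skip nodes with smaller indexes (assumed loop body: advance to next)
  while (*pnode && (*pnode)->field_index < field_index)
    pnode= &(*pnode)->next;
  if (*pnode && (*pnode)->field_index == field_index)
    return *pnode;                       // already present
  Node_sketch *node= new Node_sketch{field_index, *pnode};
  *pnode= node;                          // link in, keeping the list ordered
  return node;
}
```

Walking a pointer-to-pointer avoids special-casing insertion at the head of the list, which is presumably why the patch uses the same shape.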

[Maria-developers] Rev 2733: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 13 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2733
revision-id: psergey(a)askmonty.org-20090813191053-g1xfeieoti4bqgbc
parent: psergey(a)askmonty.org-20090813093613-hy7tdlsgdy83xszq
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 23:10:53 +0400
message:
MWL#17: Table elimination
- Better comments
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-13 09:36:13 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-13 19:10:53 +0000
@@ -20,19 +20,16 @@
OVERVIEW
The module has one entry point - eliminate_tables() function, which one
- needs to call (once) sometime after update_ref_and_keys() but before the
- join optimization.
+ needs to call (once) at some point before the join optimization.
eliminate_tables() operates over the JOIN structures. Logically, it
removes the right sides of outer join nests. Physically, it changes the
following members:
* Eliminated tables are marked as constant and moved to the front of the
join order.
+
* In addition to this, they are recorded in JOIN::eliminated_tables bitmap.
- * All join nests have their NESTED_JOIN::n_tables updated to discount
- the eliminated tables
-
* Items that became disused because they were in the ON expression of an
eliminated outer join are notified by means of the Item tree walk which
calls Item::mark_as_eliminated_processor for every item
@@ -40,26 +37,13 @@
Item_subselect with its Item_subselect::eliminated flag which is used
by EXPLAIN code to check if the subquery should be shown in EXPLAIN.
- Table elimination is redone on every PS re-execution. (TODO reasons?)
+ Table elimination is redone on every PS re-execution.
*/
+
/*
- A structure that represents a functional dependency of something over
- something else. This can be one of:
-
- 1. A "tbl.field = expr" equality. The field depends on the expression.
-
- 2. An Item_equal(...) multi-equality. Each participating field depends on
- every other participating field. (TODO???)
-
- 3. A UNIQUE_KEY(field1, field2, fieldN). The key depends on the fields that
- it is composed of.
-
- 4. A table (which is within an outer join nest). Table depends on a unique
- key (value of a unique key identifies a table record)
-
- 5. An outer join nest. It depends on all tables it contains.
-
+ An abstract structure that represents some entity that's being dependent on
+ some other entity.
*/
class Func_dep : public Sql_alloc
@@ -73,9 +57,14 @@
FD_UNIQUE_KEY,
FD_TABLE,
FD_OUTER_JOIN
- } type;
- Func_dep *next;
- bool bound;
+ } type; /* Type of the object */
+
+ /*
+ Used to make a linked list of elements that became bound and thus can
+ make elements that depend on them bound, too.
+ */
+ Func_dep *next;
+ bool bound; /* TRUE<=> The entity is considered bound */
Func_dep() : next(NULL), bound(FALSE) {}
};
@@ -84,10 +73,10 @@
class Table_dep;
class Outer_join_dep;
+
/*
- An equality
- - Depends on multiple fields (those in its expression), unknown_args is a
- counter of unsatisfied dependencies.
+ A "tbl.column= expr" equality dependency. tbl.column depends on fields
+ used in expr.
*/
class Equality_dep : public Func_dep
{
@@ -95,8 +84,11 @@
Field_dep *field;
Item *val;
- uint level; /* Used during condition analysis only */
- uint unknown_args; /* Number of yet unknown arguments */
+ /* Used during condition analysis only, similar to KEYUSE::level */
+ uint level;
+
+ /* Number of fields referenced from *val that are not yet 'bound' */
+ uint unknown_args;
};
@@ -139,7 +131,7 @@
type= Func_dep::FD_UNIQUE_KEY;
}
Table_dep *table; /* Table this key is from */
- uint keyno; // TODO do we care about this
+ uint keyno;
uint n_missing_keyparts;
Key_dep *next_table_key;
};
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc 2009-08-13 09:24:02 +0000
+++ b/sql/sql_select.cc 2009-08-13 19:10:53 +0000
@@ -114,7 +114,7 @@
COND *conds, bool top);
static bool check_interleaving_with_nj(JOIN_TAB *next);
static void restore_prev_nj_state(JOIN_TAB *last);
-static void reset_nj_counters(JOIN *join, List<TABLE_LIST> *join_list);
+static uint reset_nj_counters(JOIN *join, List<TABLE_LIST> *join_list);
static uint build_bitmap_for_nested_joins(List<TABLE_LIST> *join_list,
uint first_unused);
@@ -8791,23 +8791,26 @@
tables which will be ignored.
*/
-static void reset_nj_counters(JOIN *join, List<TABLE_LIST> *join_list)
+static uint reset_nj_counters(JOIN *join, List<TABLE_LIST> *join_list)
{
List_iterator<TABLE_LIST> li(*join_list);
TABLE_LIST *table;
DBUG_ENTER("reset_nj_counters");
+ uint n=0;
while ((table= li++))
{
NESTED_JOIN *nested_join;
if ((nested_join= table->nested_join))
{
nested_join->counter= 0;
- nested_join->n_tables= my_count_bits(nested_join->used_tables &
- ~join->eliminated_tables);
- reset_nj_counters(join, &nested_join->join_list);
+ //nested_join->n_tables= my_count_bits(nested_join->used_tables &
+ // ~join->eliminated_tables);
+ nested_join->n_tables= reset_nj_counters(join, &nested_join->join_list);
}
+ if (table->table && (table->table->map & ~join->eliminated_tables))
+ n++;
}
- DBUG_VOID_RETURN;
+ DBUG_RETURN(n);
}
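The change above makes reset_nj_counters() return a count, so that each nest's n_tables is computed by the recursive call instead of from a bitmap. A minimal standalone sketch of that idea follows. JoinItem and reset_counters are hypothetical, simplified stand-ins for TABLE_LIST/NESTED_JOIN and reset_nj_counters(), not the server's actual types. As in this revision of the patch, only base tables that are direct members of a list are counted.

```cpp
#include <cstdint>
#include <vector>

using table_map = std::uint64_t;

// Hypothetical, simplified stand-in for TABLE_LIST / NESTED_JOIN.
struct JoinItem {
  table_map map = 0;               // bit of a base table; unused for a nest
  bool is_nest = false;            // true <=> this item is a join nest
  unsigned counter = 0;            // reset to 0 for nests, as in the patch
  unsigned n_tables = 0;           // meaningful for nests only
  std::vector<JoinItem> children;  // members of the nest
};

// Resets per-nest counters and returns how many base tables that are
// direct members of `items` survive elimination.
unsigned reset_counters(std::vector<JoinItem> &items, table_map eliminated) {
  unsigned n = 0;
  for (JoinItem &item : items) {
    if (item.is_nest) {
      item.counter = 0;
      // The nest's table count comes from the recursive call.
      item.n_tables = reset_counters(item.children, eliminated);
    }
    // Count a base table only if its bit is not in the eliminated set.
    if (!item.is_nest && (item.map & ~eliminated))
      n++;
  }
  return n;
}
```

In this sketch a nest whose members were all eliminated ends up with n_tables equal to 0, and the nest itself does not add to its parent's count.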

[Maria-developers] Rev 2732: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 13 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2732
revision-id: psergey(a)askmonty.org-20090813093613-hy7tdlsgdy83xszq
parent: psergey(a)askmonty.org-20090813092402-jlqucf6nultxlv4b
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 13:36:13 +0400
message:
MWL#17: Table elimination
Fixes after post-review fixes:
- Don't search for tables in the JOIN_TAB array; it's not initialized yet.
Use select_lex->leaf_tables instead.
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-13 00:01:43 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-13 09:36:13 +0000
@@ -676,16 +676,12 @@
if (!(table_dep= te->table_deps[idx]))
{
TABLE *table= NULL;
- /*
- Locate and create the table. The search isnt very efficient but
- typically we won't get here as we process the ON expression first
- and that will create the Table_dep
- */
- for (uint i= 0; i < te->join->tables; i++)
+ for (TABLE_LIST *tlist= te->join->select_lex->leaf_tables; tlist;
+ tlist=tlist->next_leaf)
{
- if (te->join->join_tab[i].table->tablenr == (uint)idx)
+ if (tlist->table->tablenr == (uint)idx)
{
- table= te->join->join_tab[i].table;
+ table=tlist->table;
break;
}
}

[Maria-developers] Rev 2731: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 13 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2731
revision-id: psergey(a)askmonty.org-20090813092402-jlqucf6nultxlv4b
parent: psergey(a)askmonty.org-20090813000143-dukzk352hjywidk7
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 13:24:02 +0400
message:
MWL#17: Table elimination
- Post-postreview changes fix: Do set NESTED_JOIN::n_tables to number of
tables left after elimination.
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc 2009-08-12 23:43:02 +0000
+++ b/sql/sql_select.cc 2009-08-13 09:24:02 +0000
@@ -114,7 +114,7 @@
COND *conds, bool top);
static bool check_interleaving_with_nj(JOIN_TAB *next);
static void restore_prev_nj_state(JOIN_TAB *last);
-static void reset_nj_counters(List<TABLE_LIST> *join_list);
+static void reset_nj_counters(JOIN *join, List<TABLE_LIST> *join_list);
static uint build_bitmap_for_nested_joins(List<TABLE_LIST> *join_list,
uint first_unused);
@@ -1011,7 +1011,7 @@
DBUG_RETURN(1);
}
- reset_nj_counters(join_list);
+ reset_nj_counters(this, join_list);
make_outerjoin_info(this);
/*
@@ -4625,7 +4625,7 @@
DBUG_ENTER("choose_plan");
join->cur_embedding_map= 0;
- reset_nj_counters(join->join_list);
+ reset_nj_counters(join, join->join_list);
/*
if (SELECT_STRAIGHT_JOIN option is set)
reorder tables so dependent tables come after tables they depend
@@ -8791,7 +8791,7 @@
tables which will be ignored.
*/
-static void reset_nj_counters(List<TABLE_LIST> *join_list)
+static void reset_nj_counters(JOIN *join, List<TABLE_LIST> *join_list)
{
List_iterator<TABLE_LIST> li(*join_list);
TABLE_LIST *table;
@@ -8802,7 +8802,9 @@
if ((nested_join= table->nested_join))
{
nested_join->counter= 0;
- reset_nj_counters(&nested_join->join_list);
+ nested_join->n_tables= my_count_bits(nested_join->used_tables &
+ ~join->eliminated_tables);
+ reset_nj_counters(join, &nested_join->join_list);
}
}
DBUG_VOID_RETURN;

[Maria-developers] Rev 2730: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 13 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2730
revision-id: psergey(a)askmonty.org-20090813000143-dukzk352hjywidk7
parent: psergey(a)askmonty.org-20090812234302-10es7qmf0m09ahbq
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 04:01:43 +0400
message:
MWL#17: Table elimination
- When making inferences "field is bound" -> "key is bound", do check
that the field is part of the key
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-12 23:43:02 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-13 00:01:43 +0000
@@ -1043,7 +1043,8 @@
DBUG_PRINT("info", ("key %s.%s is now bound",
key_dep->table->table->alias,
key_dep->table->table->key_info[key_dep->keyno].name));
- if (!key_dep->bound)
+ if (field_dep->field->part_of_key.is_set(key_dep->keyno) &&
+ !key_dep->bound)
{
if (!--key_dep->n_missing_keyparts)
{
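The inference fixed above ("field is bound" -> "key is bound") can be sketched in isolation as follows. FieldDep, KeyDep and on_field_bound are hypothetical, simplified stand-ins for the patch's Field_dep and Key_dep handling; the bitset plays the role of Field::part_of_key. The point is only that a key's missing-part counter is decremented when the newly bound field really is a part of that key.

```cpp
#include <bitset>
#include <vector>

// Hypothetical, simplified stand-in for Key_dep.
struct KeyDep {
  unsigned keyno;               // index of the key within its table
  unsigned n_missing_keyparts;  // key parts not yet known to be bound
  bool bound;                   // true <=> all key parts are bound
};

// Hypothetical, simplified stand-in for Field_dep.
struct FieldDep {
  std::bitset<64> part_of_key;  // bit k set <=> the field is a part of key k
};

// Called once when `field` becomes bound. Returns the keys of the field's
// table that became bound as a result.
std::vector<KeyDep *> on_field_bound(const FieldDep &field,
                                     std::vector<KeyDep> &table_keys) {
  std::vector<KeyDep *> newly_bound;
  for (KeyDep &key : table_keys) {
    // Skip keys that the field does not participate in, and keys that
    // are already bound.
    if (!field.part_of_key.test(key.keyno) || key.bound)
      continue;
    if (--key.n_missing_keyparts == 0) {
      key.bound = true;
      newly_bound.push_back(&key);
    }
  }
  return newly_bound;
}
```

Without the part_of_key test, binding any field of the table would decrement the counter of every key, and a key could be reported as bound even though none of its own columns were.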

[Maria-developers] Rev 2729: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 12 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2729
revision-id: psergey(a)askmonty.org-20090812234302-10es7qmf0m09ahbq
parent: psergey(a)askmonty.org-20090812223421-w4xyzj7azqgo83ps
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 03:43:02 +0400
message:
MWL#17: Table elimination
- Continue addressing review feedback: remove "unusable KEYUSEs"
extension as it is no longer needed.
=== modified file 'sql/item.h'
--- a/sql/item.h 2009-08-12 22:34:21 +0000
+++ b/sql/item.h 2009-08-12 23:43:02 +0000
@@ -1017,18 +1017,6 @@
bool eq_by_collation(Item *item, bool binary_cmp, CHARSET_INFO *cs);
};
-#if 0
-typedef struct
-{
- TABLE *table; /* Table of interest */
- uint keyno; /* Index of interest */
- uint forbidden_part; /* key part which one is not allowed to refer to */
- /* [Set by processor] used tables, besides the table of interest */
- table_map used_tables;
- /* [Set by processor] Parts of index of interest that expression refers to */
- uint needed_key_parts;
-} Field_processor_info;
-#endif
/* Data for Item::check_column_usage_processor */
class Field_enumerator
=== modified file 'sql/opt_table_elimination.cc'
--- a/sql/opt_table_elimination.cc 2009-08-12 22:34:21 +0000
+++ b/sql/opt_table_elimination.cc 2009-08-12 23:43:02 +0000
@@ -1119,7 +1119,6 @@
case Func_dep::FD_OUTER_JOIN:
{
Outer_join_dep *outer_join_dep= (Outer_join_dep*)bound_dep;
- /* TODO what do here? Stop if eliminated the top-level? */
mark_as_eliminated(te.join, outer_join_dep->table_list);
Outer_join_dep *parent= outer_join_dep->parent;
if (parent &&
@@ -1236,38 +1235,6 @@
#endif
-/***********************************************************************************************/
-
-#if 0
-static void dbug_print_fdep(FUNC_DEP *fd)
-{
- switch (fd->type) {
- case FUNC_DEP::FD_OUTER_JOIN:
- {
- fprintf(DBUG_FILE, "outer_join(");
- if (fd->table_list->nested_join)
- {
- bool first= TRUE;
- List_iterator<TABLE_LIST> it(fd->table_list->nested_join->join_list);
- TABLE_LIST *tbl;
- while ((tbl= it++))
- {
- fprintf(DBUG_FILE, "%s%s", first?"":" ",
- tbl->table? tbl->table->alias : "...");
- first= FALSE;
- }
- fprintf(DBUG_FILE, ")");
- }
- else
- fprintf(DBUG_FILE, "%s", fd->table_list->table->alias);
- fprintf(DBUG_FILE, ")");
- break;
- }
- }
-}
-
-#endif
-
/**
@} (end of group Table_Elimination)
*/
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc 2009-06-30 13:11:00 +0000
+++ b/sql/sql_select.cc 2009-08-12 23:43:02 +0000
@@ -2474,7 +2474,6 @@
DBUG_RETURN(HA_POS_ERROR); /* This shouldn't happend */
}
-
/*
This structure is used to collect info on potentially sargable
predicates in order to check whether they become sargable after
@@ -2762,16 +2761,14 @@
{
start_keyuse=keyuse;
key=keyuse->key;
- if (keyuse->type == KEYUSE_USABLE)
- s->keys.set_bit(key); // QQ: remove this ?
+ s->keys.set_bit(key); // QQ: remove this ?
refs=0;
const_ref.clear_all();
eq_part.clear_all();
do
{
- if (keyuse->type == KEYUSE_USABLE &&
- keyuse->val->type() != Item::NULL_ITEM && !keyuse->optimize)
+ if (keyuse->val->type() != Item::NULL_ITEM && !keyuse->optimize)
{
if (!((~found_const_table_map) & keyuse->used_tables))
const_ref.set_bit(keyuse->keypart);
@@ -2971,9 +2968,11 @@
*/
bool null_rejecting;
bool *cond_guard; /* See KEYUSE::cond_guard */
- enum keyuse_type type; /* See KEYUSE::type */
} KEY_FIELD;
+/* Values in optimize */
+#define KEY_OPTIMIZE_EXISTS 1
+#define KEY_OPTIMIZE_REF_OR_NULL 2
/**
Merge new key definitions to old ones, remove those not used in both.
@@ -3064,18 +3063,13 @@
KEY_OPTIMIZE_REF_OR_NULL));
old->null_rejecting= (old->null_rejecting &&
new_fields->null_rejecting);
- /*
- The conditions are the same, hence their usabilities should
- be, too (TODO: shouldn't that apply to the above
- null_rejecting and optimize attributes?)
- */
- DBUG_ASSERT(old->type == new_fields->type);
}
}
else if (old->eq_func && new_fields->eq_func &&
old->val->eq_by_collation(new_fields->val,
old->field->binary(),
old->field->charset()))
+
{
old->level= and_level;
old->optimize= ((old->optimize & new_fields->optimize &
@@ -3084,15 +3078,10 @@
KEY_OPTIMIZE_REF_OR_NULL));
old->null_rejecting= (old->null_rejecting &&
new_fields->null_rejecting);
- // "t.key_col=const" predicates are always usable
- DBUG_ASSERT(old->type == KEYUSE_USABLE &&
- new_fields->type == KEYUSE_USABLE);
}
else if (old->eq_func && new_fields->eq_func &&
- ((new_fields->type == KEYUSE_USABLE &&
- old->val->const_item() && old->val->is_null()) ||
- ((old->type == KEYUSE_USABLE && new_fields->val->is_null()))))
- /* TODO ^ why is the above asymmetric, why const_item()? */
+ ((old->val->const_item() && old->val->is_null()) ||
+ new_fields->val->is_null()))
{
/* field = expression OR field IS NULL */
old->level= and_level;
@@ -3163,7 +3152,6 @@
table_map usable_tables, SARGABLE_PARAM **sargables)
{
uint exists_optimize= 0;
- bool optimizable=0;
if (!(field->flags & PART_KEY_FLAG))
{
// Don't remove column IS NULL on a LEFT JOIN table
@@ -3176,12 +3164,15 @@
else
{
table_map used_tables=0;
+ bool optimizable=0;
for (uint i=0; i<num_values; i++)
{
used_tables|=(value[i])->used_tables();
if (!((value[i])->used_tables() & (field->table->map | RAND_TABLE_BIT)))
optimizable=1;
}
+ if (!optimizable)
+ return;
if (!(usable_tables & field->table->map))
{
if (!eq_func || (*value)->type() != Item::NULL_ITEM ||
@@ -3194,8 +3185,7 @@
JOIN_TAB *stat=field->table->reginfo.join_tab;
key_map possible_keys=field->key_start;
possible_keys.intersect(field->table->keys_in_use_for_query);
- if (optimizable)
- stat[0].keys.merge(possible_keys); // Add possible keys
+ stat[0].keys.merge(possible_keys); // Add possible keys
/*
Save the following cases:
@@ -3288,7 +3278,6 @@
(*key_fields)->val= *value;
(*key_fields)->level= and_level;
(*key_fields)->optimize= exists_optimize;
- (*key_fields)->type= optimizable? KEYUSE_USABLE : KEYUSE_UNKNOWN;
/*
If the condition has form "tbl.keypart = othertbl.field" and
othertbl.field can be NULL, there will be no matches if othertbl.field
@@ -3600,7 +3589,6 @@
keyuse.optimize= key_field->optimize & KEY_OPTIMIZE_REF_OR_NULL;
keyuse.null_rejecting= key_field->null_rejecting;
keyuse.cond_guard= key_field->cond_guard;
- keyuse.type= key_field->type;
VOID(insert_dynamic(keyuse_array,(uchar*) &keyuse));
}
}
@@ -3609,6 +3597,7 @@
}
+#define FT_KEYPART (MAX_REF_PARTS+10)
static void
add_ft_keys(DYNAMIC_ARRAY *keyuse_array,
@@ -3667,7 +3656,6 @@
keyuse.used_tables=cond_func->key_item()->used_tables();
keyuse.optimize= 0;
keyuse.keypart_map= 0;
- keyuse.type= KEYUSE_USABLE;
VOID(insert_dynamic(keyuse_array,(uchar*) &keyuse));
}
@@ -3682,13 +3670,6 @@
return (int) (a->key - b->key);
if (a->keypart != b->keypart)
return (int) (a->keypart - b->keypart);
-
- // Usable ones go before the unusable
- int a_ok= test(a->type == KEYUSE_USABLE);
- int b_ok= test(b->type == KEYUSE_USABLE);
- if (a_ok != b_ok)
- return a_ok? -1 : 1;
-
// Place const values before other ones
if ((res= test((a->used_tables & ~OUTER_REF_TABLE_BIT)) -
test((b->used_tables & ~OUTER_REF_TABLE_BIT))))
@@ -3899,8 +3880,7 @@
found_eq_constant=0;
for (i=0 ; i < keyuse->elements-1 ; i++,use++)
{
- if (use->type == KEYUSE_USABLE && !use->used_tables &&
- use->optimize != KEY_OPTIMIZE_REF_OR_NULL)
+ if (!use->used_tables && use->optimize != KEY_OPTIMIZE_REF_OR_NULL)
use->table->const_key_parts[use->key]|= use->keypart_map;
if (use->keypart != FT_KEYPART)
{
@@ -3924,8 +3904,7 @@
/* Save ptr to first use */
if (!use->table->reginfo.join_tab->keyuse)
use->table->reginfo.join_tab->keyuse=save_pos;
- if (use->type == KEYUSE_USABLE)
- use->table->reginfo.join_tab->checked_keys.set_bit(use->key);
+ use->table->reginfo.join_tab->checked_keys.set_bit(use->key);
save_pos++;
}
i=(uint) (save_pos-(KEYUSE*) keyuse->buffer);
@@ -3955,7 +3934,7 @@
To avoid bad matches, we don't make ref_table_rows less than 100.
*/
keyuse->ref_table_rows= ~(ha_rows) 0; // If no ref
- if (keyuse->type == KEYUSE_USABLE && keyuse->used_tables &
+ if (keyuse->used_tables &
(map= (keyuse->used_tables & ~join->const_table_map &
~OUTER_REF_TABLE_BIT)))
{
@@ -4147,8 +4126,7 @@
if 1. expression doesn't refer to forward tables
2. we won't get two ref-or-null's
*/
- if (keyuse->type == KEYUSE_USABLE &&
- !(remaining_tables & keyuse->used_tables) &&
+ if (!(remaining_tables & keyuse->used_tables) &&
!(ref_or_null_part && (keyuse->optimize &
KEY_OPTIMIZE_REF_OR_NULL)))
{
@@ -5602,8 +5580,7 @@
*/
do
{
- if (!(~used_tables & keyuse->used_tables) &&
- keyuse->type == KEYUSE_USABLE)
+ if (!(~used_tables & keyuse->used_tables))
{
if (keyparts == keyuse->keypart &&
!(found_part_ref_or_null & keyuse->optimize))
@@ -5653,11 +5630,9 @@
uint i;
for (i=0 ; i < keyparts ; keyuse++,i++)
{
- while (keyuse->keypart != i || ((~used_tables) & keyuse->used_tables) ||
- !(keyuse->type == KEYUSE_USABLE))
- {
+ while (keyuse->keypart != i ||
+ ((~used_tables) & keyuse->used_tables))
keyuse++; /* Skip other parts */
- }
uint maybe_null= test(keyinfo->key_part[i].null_bit);
j->ref.items[i]=keyuse->val; // Save for cond removal
=== modified file 'sql/sql_select.h'
--- a/sql/sql_select.h 2009-06-30 13:11:00 +0000
+++ b/sql/sql_select.h 2009-08-12 23:43:02 +0000
@@ -28,45 +28,6 @@
#include "procedure.h"
#include <myisam.h>
-#define FT_KEYPART (MAX_REF_PARTS+10)
-/* Values in optimize */
-#define KEY_OPTIMIZE_EXISTS 1
-#define KEY_OPTIMIZE_REF_OR_NULL 2
-
-/* KEYUSE element types */
-enum keyuse_type
-{
- /*
- val refers to the same table, this is either KEYUSE_BIND or KEYUSE_NO_BIND
- type, we didn't determine which one yet.
- */
- KEYUSE_UNKNOWN= 0,
- /*
- 'regular' keyuse, i.e. it represents one of the following
- * t.keyXpartY = func(constants, other-tables)
- * t.keyXpartY IS NULL
- * t.keyXpartY = func(constants, other-tables) OR t.keyXpartY IS NULL
- and can be used to construct ref acces
- */
- KEYUSE_USABLE,
- /*
- The keyuse represents a condition in form:
-
- t.uniq_keyXpartY = func(other parts of uniq_keyX)
-
- This can't be used to construct uniq_keyX but we could use it to determine
- that the table will produce at most one match.
- */
- KEYUSE_BIND,
- /*
- Keyuse that's not usable for ref access and doesn't meet the criteria of
- KEYUSE_BIND. Examples:
- t.keyXpartY = func(t.keyXpartY)
- t.keyXpartY = func(column of t that's not covered by keyX)
- */
- KEYUSE_NO_BIND
-};
-
typedef struct keyuse_t {
TABLE *table;
Item *val; /**< or value if no field */
@@ -90,15 +51,6 @@
NULL - Otherwise (the source equality can't be turned off)
*/
bool *cond_guard;
- /*
- 1 <=> This keyuse can be used to construct key access.
- 0 <=> Otherwise. Currently unusable KEYUSEs represent equalities
- where one table column refers to another one, like this:
- t.keyXpartA=func(t.keyXpartB)
- This equality cannot be used for index access but is useful
- for table elimination.
- */
- enum keyuse_type type;
} KEYUSE;
class store_key;
@@ -258,7 +210,7 @@
JOIN *join;
/** Bitmap of nested joins this table is part of */
nested_join_map embedding_map;
-
+
void cleanup();
inline bool is_using_loose_index_scan()
{

[Maria-developers] Rev 2728: MWL#17: Table elimination in file:///home/psergey/dev/maria-5.1-table-elim-r5/
by Sergey Petrunya 12 Aug '09
At file:///home/psergey/dev/maria-5.1-table-elim-r5/
------------------------------------------------------------
revno: 2728
revision-id: psergey(a)askmonty.org-20090812223421-w4xyzj7azqgo83ps
parent: psergey(a)askmonty.org-20090708171038-9nyc3hcg1o7h8635
committer: Sergey Petrunya <psergey(a)askmonty.org>
branch nick: maria-5.1-table-elim-r5
timestamp: Thu 2009-08-13 02:34:21 +0400
message:
MWL#17: Table elimination
Address review feedback:
- Change from Wave-based approach (a-la const table detection) to
building and walking functional dependency graph.
- Change from piggy-backing on ref-access code and KEYUSE structures
to using our own expression analyzer.
Diff too large for email (1602 lines, the limit is 1000).
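Since the diff itself is not included, here is a rough, generic sketch of what walking a functional dependency graph can look like. It is not the code from this revision. DepNode and propagate_bound are hypothetical names, and the counter-plus-worklist scheme is an assumption based on the unknown_args and n_missing_keyparts counters and the "bound" linked list that appear in the later revisions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node of a functional dependency graph: a field, an
// equality, a unique key, a table or an outer join nest.
struct DepNode {
  unsigned unknown_args;                // prerequisites that are not bound yet
  bool bound;                           // true <=> the entity is bound
  std::vector<std::size_t> dependents;  // nodes that depend on this one
};

// Propagates "bound" from the initially bound nodes (assumed to be listed
// once each) through the graph. Returns how many nodes ended up bound.
std::size_t propagate_bound(std::vector<DepNode> &graph,
                            std::vector<std::size_t> worklist) {
  std::size_t n_bound = 0;
  for (std::size_t idx : worklist)
    graph[idx].bound = true;
  while (!worklist.empty()) {
    std::size_t cur = worklist.back();
    worklist.pop_back();
    n_bound++;
    for (std::size_t dep : graph[cur].dependents) {
      DepNode &node = graph[dep];
      // A dependent becomes bound when its last prerequisite is bound.
      if (!node.bound && node.unknown_args > 0 && --node.unknown_args == 0) {
        node.bound = true;
        worklist.push_back(dep);
      }
    }
  }
  return n_bound;
}
```

For example, with two fields feeding a two-part unique key that in turn feeds a table node, binding only one field leaves the key and the table unbound, while binding both makes all four nodes bound.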
1
0
I have now implemented and installed in our Buildbot enhanced facilities for
dealing with compiler warnings.
We already have a file support-files/compiler_warnings.supp, which I think is
used by PushBuild @ MySQL. The new facilities in our Buildbot use the same
file to suppress certain warnings that for some reason cannot be removed or
are not desirable to remove.
See for example:
https://askmonty.org/buildbot/waterfall?branch=5.1
https://askmonty.org/buildbot/builders/hardy-amd64-valgrind/builds/113/step…
So there are still a few warnings that need to be eliminated, patches welcome :-)
Note that old builds from earlier than today still have the old log files,
without these new warning facilities.
Would be great to get us to compile without any warnings. The Drizzle people
already compile with -pedantic -Werror, so we are trailing behind there!
- Kristian.

[Maria-developers] Updated (by Guest): options for CREATE TABLE (43)
by worklog-noreply@askmonty.org 11 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: options for CREATE TABLE
CREATION DATE..: Tue, 11 Aug 2009, 17:02
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....: Sanja
COPIES TO......: Monty
CATEGORY.......: Server-BackLog
TASK ID........: 43 (http://askmonty.org/worklog/?tid=43)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 32 (hours remain)
ORIG. ESTIMATE.: 32
PROGRESS NOTES:
-=-=(Guest - Tue, 11 Aug 2009, 19:57)=-=-
High-Level Specification modified.
--- /tmp/wklog.43.old.31856 2009-08-11 19:57:38.000000000 +0300
+++ /tmp/wklog.43.new.31856 2009-08-11 19:57:38.000000000 +0300
@@ -5,8 +5,43 @@
key key1(field) key_opt1=kval1 key_opt2=kval2)
table_option1=tval1, table_option2=tval2;
-Exclusion should be made for old table and key (KEY_BLOCK_SIZE) options where
+Exclusion should be made for old table and key options where
'=' was not obligatory.
+Old key options:
+KEY_BLOCK_SIZE <num> -> KEY_BLOCK_SIZE=num
+WITH PARSER <name> -> PARSER=name
+
+Old table options:
+ENGINE name -> ENGINE=name
+TYPE name -> TYPE=name
+MAX_ROWS num -> MAX_ROWS=num
+MIX_ROWS num -> MIX_ROWS=num
+AVG_ROW_LENGTH num -> AVG_ROW_LENGTH=num
+PASSWORD string -> PASSWORD=string
+COMMENT string -> COMMENT=string
+AUTO_INCREMENT num -> AUTO_INCREMENT=num
+PACK_KEYS num/default -> PACK_KEYS=num/default
+CHECKSUM num -> CHECKSUM=num
+TABLE_CHECKSUM num -> TABLE_CHECKSUM=num
+PAGE_CHECKSUM num -> PAGE_CHECKSUM=num
+DELAY_KEY_WRITE num -> DELAY_KEY_WRITE=num
+ROW_FORMAT name -> ROW_FORMAT=name
+INSERT_METHOD name -> INSERT_METHOD=name
+KEY_BLOCK_SIZE num -> KEY_BLOCK_SIZE=num
+TRANSACTIONAL num -> TRANSACTIONAL=num
+
+Table options which will be left hardcoded
+UNION
+default charset
+default collation
+DATA DIRECTORY
+TABLESPACE
+STORAGE
+
For fields options can go with field attributes (NOT NULL, UNIQUE and so on) can
be separated from them by '=' sign.
+
+
+
+
-=-=(Guest - Tue, 11 Aug 2009, 19:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.43.old.30883 2009-08-11 19:36:45.000000000 +0300
+++ /tmp/wklog.43.new.30883 2009-08-11 19:36:45.000000000 +0300
@@ -1 +1,12 @@
+Table definition ca looks like following
+CREATE TABLE table
+ (field int ... field_opt1=fval1 field_opt2=fval2,
+ key key1(field) key_opt1=kval1 key_opt2=kval2)
+ table_option1=tval1, table_option2=tval2;
+
+Exclusion should be made for old table and key (KEY_BLOCK_SIZE) options where
+'=' was not obligatory.
+
+For fields options can go with field attributes (NOT NULL, UNIQUE and so on) can
+be separated from them by '=' sign.
DESCRIPTION:
Add the ability to create a table with additional options that can be passed to the engine.
Also make current options such as TRANSACTIONAL work via this mechanism.
HIGH-LEVEL SPECIFICATION:
A table definition can look like the following:
CREATE TABLE table
(field int ... field_opt1=fval1 field_opt2=fval2,
key key1(field) key_opt1=kval1 key_opt2=kval2)
table_option1=tval1, table_option2=tval2;
An exception should be made for old table and key options, where
'=' was not obligatory.
Old key options:
KEY_BLOCK_SIZE <num> -> KEY_BLOCK_SIZE=num
WITH PARSER <name> -> PARSER=name
Old table options:
ENGINE name -> ENGINE=name
TYPE name -> TYPE=name
MAX_ROWS num -> MAX_ROWS=num
MIX_ROWS num -> MIX_ROWS=num
AVG_ROW_LENGTH num -> AVG_ROW_LENGTH=num
PASSWORD string -> PASSWORD=string
COMMENT string -> COMMENT=string
AUTO_INCREMENT num -> AUTO_INCREMENT=num
PACK_KEYS num/default -> PACK_KEYS=num/default
CHECKSUM num -> CHECKSUM=num
TABLE_CHECKSUM num -> TABLE_CHECKSUM=num
PAGE_CHECKSUM num -> PAGE_CHECKSUM=num
DELAY_KEY_WRITE num -> DELAY_KEY_WRITE=num
ROW_FORMAT name -> ROW_FORMAT=name
INSERT_METHOD name -> INSERT_METHOD=name
KEY_BLOCK_SIZE num -> KEY_BLOCK_SIZE=num
TRANSACTIONAL num -> TRANSACTIONAL=num
Table options that will be left hardcoded:
UNION
default charset
default collation
DATA DIRECTORY
TABLESPACE
STORAGE
For fields, options can go alongside the field attributes (NOT NULL, UNIQUE and so on) and
can be separated from them by an '=' sign.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): options for CREATE TABLE (43)
by worklog-noreply@askmonty.org 11 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: options for CREATE TABLE
CREATION DATE..: Tue, 11 Aug 2009, 17:02
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....: Sanja
COPIES TO......: Monty
CATEGORY.......: Server-BackLog
TASK ID........: 43 (http://askmonty.org/worklog/?tid=43)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 32 (hours remain)
ORIG. ESTIMATE.: 32
PROGRESS NOTES:
-=-=(Guest - Tue, 11 Aug 2009, 19:57)=-=-
High-Level Specification modified.
--- /tmp/wklog.43.old.31856 2009-08-11 19:57:38.000000000 +0300
+++ /tmp/wklog.43.new.31856 2009-08-11 19:57:38.000000000 +0300
@@ -5,8 +5,43 @@
key key1(field) key_opt1=kval1 key_opt2=kval2)
table_option1=tval1, table_option2=tval2;
-Exclusion should be made for old table and key (KEY_BLOCK_SIZE) options where
+Exclusion should be made for old table and key options where
'=' was not obligatory.
+Old key options:
+KEY_BLOCK_SIZE <num> -> KEY_BLOCK_SIZE=num
+WITH PARSER <name> -> PARSER=name
+
+Old table options:
+ENGINE name -> ENGINE=name
+TYPE name -> TYPE=name
+MAX_ROWS num -> MAX_ROWS=num
+MIX_ROWS num -> MIX_ROWS=num
+AVG_ROW_LENGTH num -> AVG_ROW_LENGTH=num
+PASSWORD string -> PASSWORD=string
+COMMENT string -> COMMENT=string
+AUTO_INCREMENT num -> AUTO_INCREMENT=num
+PACK_KEYS num/default -> PACK_KEYS=num/default
+CHECKSUM num -> CHECKSUM=num
+TABLE_CHECKSUM num -> TABLE_CHECKSUM=num
+PAGE_CHECKSUM num -> PAGE_CHECKSUM=num
+DELAY_KEY_WRITE num -> DELAY_KEY_WRITE=num
+ROW_FORMAT name -> ROW_FORMAT=name
+INSERT_METHOD name -> INSERT_METHOD=name
+KEY_BLOCK_SIZE num -> KEY_BLOCK_SIZE=num
+TRANSACTIONAL num -> TRANSACTIONAL=num
+
+Table options which will be left hardcoded
+UNION
+default charset
+default collation
+DATA DIRECTORY
+TABLESPACE
+STORAGE
+
For fields options can go with field attributes (NOT NULL, UNIQUE and so on) can
be separated from them by '=' sign.
+
+
+
+
-=-=(Guest - Tue, 11 Aug 2009, 19:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.43.old.30883 2009-08-11 19:36:45.000000000 +0300
+++ /tmp/wklog.43.new.30883 2009-08-11 19:36:45.000000000 +0300
@@ -1 +1,12 @@
+Table definition ca looks like following
+CREATE TABLE table
+ (field int ... field_opt1=fval1 field_opt2=fval2,
+ key key1(field) key_opt1=kval1 key_opt2=kval2)
+ table_option1=tval1, table_option2=tval2;
+
+Exclusion should be made for old table and key (KEY_BLOCK_SIZE) options where
+'=' was not obligatory.
+
+For fields options can go with field attributes (NOT NULL, UNIQUE and so on) can
+be separated from them by '=' sign.
DESCRIPTION:
Add the ability to create a table with additional options that are passed to the
storage engine. Also make existing options such as TRANSACTIONAL work via this mechanism.
HIGH-LEVEL SPECIFICATION:
A table definition can look like the following:
CREATE TABLE table
(field int ... field_opt1=fval1 field_opt2=fval2,
key key1(field) key_opt1=kval1 key_opt2=kval2)
table_option1=tval1, table_option2=tval2;
An exception should be made for old table and key options, where
'=' was not obligatory.
Old key options:
KEY_BLOCK_SIZE <num> -> KEY_BLOCK_SIZE=num
WITH PARSER <name> -> PARSER=name
Old table options:
ENGINE name -> ENGINE=name
TYPE name -> TYPE=name
MAX_ROWS num -> MAX_ROWS=num
MIN_ROWS num -> MIN_ROWS=num
AVG_ROW_LENGTH num -> AVG_ROW_LENGTH=num
PASSWORD string -> PASSWORD=string
COMMENT string -> COMMENT=string
AUTO_INCREMENT num -> AUTO_INCREMENT=num
PACK_KEYS num/default -> PACK_KEYS=num/default
CHECKSUM num -> CHECKSUM=num
TABLE_CHECKSUM num -> TABLE_CHECKSUM=num
PAGE_CHECKSUM num -> PAGE_CHECKSUM=num
DELAY_KEY_WRITE num -> DELAY_KEY_WRITE=num
ROW_FORMAT name -> ROW_FORMAT=name
INSERT_METHOD name -> INSERT_METHOD=name
KEY_BLOCK_SIZE num -> KEY_BLOCK_SIZE=num
TRANSACTIONAL num -> TRANSACTIONAL=num
Table options which will be left hardcoded
UNION
default charset
default collation
DATA DIRECTORY
TABLESPACE
STORAGE
For fields, options can appear alongside field attributes (NOT NULL, UNIQUE and
so on) and can be distinguished from them by the '=' sign.
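The mapping from the old "NAME value" spelling to the uniform "NAME=value" form can be sketched roughly as below. This is only an illustration under stated assumptions, not the server's parser: `normalize_table_options` and `kLegacyOptions` are hypothetical names, only a few of the listed option names are included, and tokens are assumed to be separated by whitespace (quoted values containing spaces and the WITH PARSER key option are not handled).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical subset of the old table options for which '=' was optional.
static const std::set<std::string> kLegacyOptions = {
    "ENGINE", "TYPE", "MAX_ROWS", "AVG_ROW_LENGTH", "AUTO_INCREMENT",
    "CHECKSUM", "ROW_FORMAT", "KEY_BLOCK_SIZE", "TRANSACTIONAL"};

static std::string to_upper(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return s;
}

// Rewrites "NAME value" and "NAME = value" into "NAME=value" for the known
// legacy names; every other token is passed through unchanged.
std::string normalize_table_options(const std::string &input) {
  std::istringstream in(input);
  std::vector<std::string> tokens;
  for (std::string tok; in >> tok;) tokens.push_back(tok);

  std::string out;
  for (std::size_t i = 0; i < tokens.size(); ++i) {
    std::string piece = tokens[i];
    std::size_t value = i + 1;
    // Skip a free-standing '=' between the name and its value.
    if (value < tokens.size() && tokens[value] == "=") ++value;
    if (kLegacyOptions.count(to_upper(piece)) && value < tokens.size()) {
      assert(value > i);  // the value always follows the option name
      piece = to_upper(piece) + "=" + tokens[value];
      i = value;  // consume the value (and the '=' if there was one)
    }
    if (!out.empty()) out += ' ';
    out += piece;
  }
  return out;
}
```

With this sketch, `ENGINE MyISAM MAX_ROWS 100` becomes `ENGINE=MyISAM MAX_ROWS=100`, while a token that already carries its '=' (such as `TRANSACTIONAL=1`) is left untouched.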
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Guest): options for CREATE TABLE (43)
by worklog-noreply@askmonty.org 11 Aug '09
11 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: options for CREATE TABLE
CREATION DATE..: Tue, 11 Aug 2009, 17:02
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....: Sanja
COPIES TO......: Monty
CATEGORY.......: Server-BackLog
TASK ID........: 43 (http://askmonty.org/worklog/?tid=43)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 32 (hours remain)
ORIG. ESTIMATE.: 32
PROGRESS NOTES:
-=-=(Guest - Tue, 11 Aug 2009, 19:36)=-=-
High-Level Specification modified.
--- /tmp/wklog.43.old.30883 2009-08-11 19:36:45.000000000 +0300
+++ /tmp/wklog.43.new.30883 2009-08-11 19:36:45.000000000 +0300
@@ -1 +1,12 @@
+Table definition ca looks like following
+CREATE TABLE table
+ (field int ... field_opt1=fval1 field_opt2=fval2,
+ key key1(field) key_opt1=kval1 key_opt2=kval2)
+ table_option1=tval1, table_option2=tval2;
+
+Exclusion should be made for old table and key (KEY_BLOCK_SIZE) options where
+'=' was not obligatory.
+
+For fields options can go with field attributes (NOT NULL, UNIQUE and so on) can
+be separated from them by '=' sign.
DESCRIPTION:
Add the ability to create a table with additional options that are passed to the
storage engine. Also make existing options such as TRANSACTIONAL work via this mechanism.
HIGH-LEVEL SPECIFICATION:
A table definition can look like the following:
CREATE TABLE table
(field int ... field_opt1=fval1 field_opt2=fval2,
key key1(field) key_opt1=kval1 key_opt2=kval2)
table_option1=tval1, table_option2=tval2;
An exception should be made for old table and key (KEY_BLOCK_SIZE) options, where
'=' was not obligatory.
For fields, options can appear alongside field attributes (NOT NULL, UNIQUE and
so on) and can be distinguished from them by the '=' sign.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

Hi!
I am copying to maria-developers@ to ensure that everyone has a chance
to answer...
>>>>> "Patrick" == Patrick Galbraith <patg(a)patg.net> writes:
Patrick> Monty,
Patrick> I saw your message in IRC - I replied in case you don't see it . I want
Patrick> to get this into the tree soon and am only having a small problem right now:
Patrick> [08:06] <CaptTofu> montywi: I am striving to
Patrick> [08:08] <CaptTofu> montywi: I just have one issue to solve - if the
Patrick> engine is build as a plugin, how I can get the test to run. right now,
Patrick> when it runs, it doesn't find the engine loaded, so it skips the test. I
Patrick> tried to add a 'load plugin' to the test, but it can find the shared
Patrick> library because it expects it to be in "(errno: 2
Patrick> dlopen(/Users/patg/code_devel/federated/lib/mysql/plugin/ha_federatedx.so,
Patrick> 2): image not found)"
We should probably try to fix that for the test suite.
Kristian, do you have any ideas for this?
Patrick> So, I'm wondering if to test properly, one needs to compile the engine
Patrick> into the server versus as a plugin?
Yes, that is what you need to do (as far as I know).
Regards,
Monty

[Maria-developers] New (by Sanja): options for CREATE TABLE (43)
by worklog-noreply@askmonty.org 11 Aug '09
11 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: options for CREATE TABLE
CREATION DATE..: Tue, 11 Aug 2009, 17:02
SUPERVISOR.....: Bothorsen
IMPLEMENTOR....: Sanja
COPIES TO......: Monty
CATEGORY.......: Server-BackLog
TASK ID........: 43 (http://askmonty.org/worklog/?tid=43)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 32 (hours remain)
ORIG. ESTIMATE.: 32
PROGRESS NOTES:
DESCRIPTION:
Add the ability to create a table with additional options that are passed to the
storage engine. Also make existing options such as TRANSACTIONAL work via this mechanism.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Progress (by Monty): Backporting pool of threads to MariaDB (6)
by worklog-noreply@askmonty.org 11 Aug '09
11 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Backporting pool of threads to MariaDB
CREATION DATE..: Mon, 09 Mar 2009, 17:21
SUPERVISOR.....: Monty
IMPLEMENTOR....: Monty
COPIES TO......: Monty
CATEGORY.......: Server-Sprint
TASK ID........: 6 (http://askmonty.org/worklog/?tid=6)
VERSION........: Server-5.1
STATUS.........: Complete
PRIORITY.......: 60
WORKED HOURS...: 16
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 8
PROGRESS NOTES:
-=-=(Monty - Tue, 11 Aug 2009, 16:58)=-=-
Done, ages ago
Worked 16 hours and estimate 0 hours remain (original estimate increased by 8 hours).
-=-=(Monty - Thu, 26 Mar 2009, 00:32)=-=-
Privacy level updated.
--- /tmp/wklog.6.old.6586 2009-03-26 00:32:23.000000000 +0200
+++ /tmp/wklog.6.new.6586 2009-03-26 00:32:23.000000000 +0200
@@ -1 +1 @@
-y
+n
-=-=(Monty - Thu, 26 Mar 2009, 00:31)=-=-
Supervisor updated.
--- /tmp/wklog.6.old.6580 2009-03-26 00:31:30.000000000 +0200
+++ /tmp/wklog.6.new.6580 2009-03-26 00:31:30.000000000 +0200
@@ -1 +1 @@
-Knielsen
+Monty
-=-=(Monty - Fri, 13 Mar 2009, 02:43)=-=-
Low Level Design modified.
--- /tmp/wklog.6.old.26076 2009-03-13 02:43:17.000000000 +0200
+++ /tmp/wklog.6.new.26076 2009-03-13 02:43:17.000000000 +0200
@@ -1 +1,20 @@
+To be able to work with both one-thread-per-connection and pool-of-threads at
+the same time, I added a new global scheduler variable 'extra_thread_scheduler'
+that is always using the one-thread-per-connection method.
+
+To the THD structure was added a pointer to the 'scheduler' variable that should
+be used for this connection.
+
+To do easy handing of two connect counter and two max_connection variables, I
+added pointer to these pointer in the scheduler variable.:
+
+Other changes was:
+
+- If extra-port was <> 0, start listing to this port too
+- At connect time, set THD->scheduler to point to the given scheduler (based on
+the port that was used to connect)
+- Change some calls that was done trough functions pointer in the scheduler to
+instead use thd->scheduler->
+- Change max_connections to *thd->scheduler->max_connections
+- Change connection_count to *thd->scheduler->connection_count
-=-=(Monty - Fri, 13 Mar 2009, 02:29)=-=-
Version updated.
--- /tmp/wklog.6.old.25818 2009-03-13 02:29:16.000000000 +0200
+++ /tmp/wklog.6.new.25818 2009-03-13 02:29:16.000000000 +0200
@@ -1 +1 @@
-Server-9.x
+Server-5.1
-=-=(Monty - Fri, 13 Mar 2009, 02:29)=-=-
Status updated.
--- /tmp/wklog.6.old.25818 2009-03-13 02:29:16.000000000 +0200
+++ /tmp/wklog.6.new.25818 2009-03-13 02:29:16.000000000 +0200
@@ -1 +1 @@
-Assigned
+Complete
-=-=(Monty - Fri, 13 Mar 2009, 02:28)=-=-
High Level Description modified.
--- /tmp/wklog.6.old.25790 2009-03-13 02:28:25.000000000 +0200
+++ /tmp/wklog.6.new.25790 2009-03-13 02:28:25.000000000 +0200
@@ -8,3 +8,6 @@
Add option --extra-port to allow connections with old one-thread-per-connection
method. This is needed to allow root to login and kill threads if something
goes wrong.
+Add option --extra-max-connections to regulate how many connections can be made
+to 'extra-port'. This should work in a similar way as 'max-connections', in the
+way that one connection is reserved for a SUPER user.
-=-=(Knielsen - Mon, 09 Mar 2009, 19:02)=-=-
Version updated.
--- /tmp/wklog.6.old.10740 2009-03-09 19:02:38.000000000 +0200
+++ /tmp/wklog.6.new.10740 2009-03-09 19:02:38.000000000 +0200
@@ -1 +1 @@
-WorkLog-3.4
+Server-9.x
-=-=(Knielsen - Mon, 09 Mar 2009, 19:02)=-=-
Title modified.
--- /tmp/wklog.6.old.10740 2009-03-09 19:02:38.000000000 +0200
+++ /tmp/wklog.6.new.10740 2009-03-09 19:02:38.000000000 +0200
@@ -1 +1 @@
-Backporting pool of threads tro MariaDB
+Backporting pool of threads to MariaDB
DESCRIPTION:
Back porting pool of threads to MariaDB
We will use code for Maria 6.0, with the following extensions:
Add option: --test-ignore-wrong-options to ignore errors in enum values for
testing pool-of-threads. (Better than having --pool-of-threads command line
option just for testing)
Add option --extra-port to allow connections with old one-thread-per-connection
method. This is needed to allow root to login and kill threads if something
goes wrong.
Add option --extra-max-connections to regulate how many connections can be made
to 'extra-port'. This should work in a similar way as 'max-connections', in the
way that one connection is reserved for a SUPER user.
LOW-LEVEL DESIGN:
To be able to work with both one-thread-per-connection and pool-of-threads at
the same time, I added a new global scheduler variable 'extra_thread_scheduler'
that always uses the one-thread-per-connection method.
A pointer to the 'scheduler' variable that should be used for this connection
was added to the THD structure.
To make it easy to handle two connection counters and two max_connections
variables, I added pointers to these variables in the scheduler variable.
Other changes were:
- If extra-port is <> 0, start listening on this port too
- At connect time, set THD->scheduler to point to the given scheduler (based on
the port that was used to connect)
- Change some calls that were done through function pointers in the scheduler to
instead use thd->scheduler->
- Change max_connections to *thd->scheduler->max_connections
- Change connection_count to *thd->scheduler->connection_count
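The design above can be pictured with a small, self-contained model. This is a sketch only: the `Scheduler` and `THD` structs here are simplified stand-ins rather than the server's definitions, the limits are made-up values, `accept_connection` and `close_connection` are hypothetical helpers, and the locking and the connection reserved for a SUPER user are left out.

```cpp
#include <cassert>

// Simplified stand-in for the scheduler variable: it carries pointers to
// the limit and the counter that belong to it.
struct Scheduler {
  const char *name;
  unsigned long *max_connections;
  unsigned long *connection_count;
};

// Simplified stand-in for THD: each connection remembers its scheduler.
struct THD {
  Scheduler *scheduler;
};

// Illustrative globals: one limit/counter pair per listening port.
unsigned long max_connections = 100, connection_count = 0;
unsigned long extra_max_connections = 1, extra_connection_count = 0;

Scheduler thread_scheduler = {"pool-of-threads", &max_connections,
                              &connection_count};
Scheduler extra_thread_scheduler = {"one-thread-per-connection",
                                    &extra_max_connections,
                                    &extra_connection_count};

// Picks the scheduler from the port used to connect, then checks and bumps
// the counter through the scheduler's pointers. Returns false when the
// limit for that scheduler is already reached.
bool accept_connection(THD *thd, bool via_extra_port) {
  thd->scheduler = via_extra_port ? &extra_thread_scheduler : &thread_scheduler;
  Scheduler *s = thd->scheduler;
  assert(s->max_connections && s->connection_count);
  if (*s->connection_count >= *s->max_connections) return false;
  ++*s->connection_count;
  return true;
}

// Releases the slot taken by accept_connection().
void close_connection(THD *thd) {
  assert(*thd->scheduler->connection_count > 0);
  --*thd->scheduler->connection_count;
}
```

Because every check goes through `thd->scheduler`, the code that accepts or closes a connection does not need to know which of the two ports was used.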
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] New (by Monty): Add Sphinx storage engine to MariaDB (42)
by worklog-noreply@askmonty.org 10 Aug '09
10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add Sphinx storage engine to MariaDB
CREATION DATE..: Mon, 10 Aug 2009, 23:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Maria-BackLog
TASK ID........: 42 (http://askmonty.org/worklog/?tid=42)
VERSION........: Connector/.NET-5.1
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 16 (hours remain)
ORIG. ESTIMATE.: 16
PROGRESS NOTES:
DESCRIPTION:
Add the Sphinx storage engine to the MariaDB tree
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
Hi!
For those that don't know what this is about:
- This is a fix for the case where you do a DROP TABLE of a MyISAM
table with delayed key writes: MyISAM writes all the changed key pages for the
file to disk before closing and then deleting the table.
This patch is a first attempt to ensure that we don't write the key pages
in the case of a drop.
>>>>> "Oleksandr" == Oleksandr Byelkin <Oleksandr> writes:
Oleksandr> Hi!
Oleksandr> I made different patch from you suggested maybe I am wrong but IMHO it
Oleksandr> is better because:
Oleksandr> 1) use already existing keycache calls
Oleksandr> 2) do not require additional finding table MI_INFO by name and locking
Oleksandr> around it
Oleksandr> The main idea was that if we are going to drop table it can be passed
Oleksandr> via existing table descriptors to the place where we call flush and it
Oleksandr> does not matter if other threads trying to do something with the table
Oleksandr> we will drop it in any case.
Oleksandr> === modified file 'sql/handler.h'
Oleksandr> --- sql/handler.h 2009-06-29 21:03:30 +0000
Oleksandr> +++ sql/handler.h 2009-08-09 19:52:01 +0000
Oleksandr> @@ -1342,6 +1342,7 @@ public:
Oleksandr> virtual void column_bitmaps_signal();
Oleksandr> uint get_index(void) const { return active_index; }
Oleksandr> virtual int close(void)=0;
Oleksandr> + virtual void prepare_for_delete() {}
Oleksandr> /**
Oleksandr> @retval 0 Bulk update used by handler
Why use prepare_for_delete() instead of adding a more general call?
Oleksandr> === modified file 'sql/lock.cc'
Oleksandr> --- sql/lock.cc 2009-04-25 10:05:32 +0000
Oleksandr> +++ sql/lock.cc 2009-08-09 22:08:41 +0000
Oleksandr> @@ -1049,10 +1049,12 @@ int lock_table_name(THD *thd, TABLE_LIST
Oleksandr> DBUG_RETURN(-1);
Oleksandr> table_list->table=table;
Oleksandr> + table->s->deleting= table_list->deleting;
Oleksandr> /* Return 1 if table is in use */
Oleksandr> DBUG_RETURN(test(remove_table_from_cache(thd, db, table_list->table_name,
Oleksandr> - check_in_use ? RTFC_NO_FLAG : RTFC_WAIT_OTHER_THREAD_FLAG)));
Oleksandr> + check_in_use ? RTFC_NO_FLAG : RTFC_WAIT_OTHER_THREAD_FLAG,
Oleksandr> + table_list->deleting)));
Oleksandr> }
Oleksandr> === modified file 'sql/mysql_priv.h'
Oleksandr> --- sql/mysql_priv.h 2009-04-25 10:05:32 +0000
Oleksandr> +++ sql/mysql_priv.h 2009-08-09 21:51:48 +0000
Oleksandr> @@ -1609,7 +1609,7 @@ uint prep_alter_part_table(THD *thd, TAB
Oleksandr> #define RTFC_WAIT_OTHER_THREAD_FLAG 0x0002
Oleksandr> #define RTFC_CHECK_KILLED_FLAG 0x0004
Oleksandr> bool remove_table_from_cache(THD *thd, const char *db, const char *table,
Oleksandr> - uint flags);
Oleksandr> + uint flags, my_bool deleting);
Oleksandr> #define NORMAL_PART_NAME 0
Oleksandr> #define TEMP_PART_NAME 1
Oleksandr> === modified file 'sql/sql_base.cc'
Oleksandr> --- sql/sql_base.cc 2009-05-19 09:28:05 +0000
Oleksandr> +++ sql/sql_base.cc 2009-08-09 21:54:54 +0000
Oleksandr> @@ -927,7 +927,7 @@ bool close_cached_tables(THD *thd, TABLE
Oleksandr> for (TABLE_LIST *table= tables; table; table= table->next_local)
Oleksandr> {
Oleksandr> if (remove_table_from_cache(thd, table->db, table->table_name,
Oleksandr> - RTFC_OWNED_BY_THD_FLAG))
Oleksandr> + RTFC_OWNED_BY_THD_FLAG, table->deleting))
Oleksandr> found=1;
Oleksandr> }
Oleksandr> if (!found)
Oleksandr> @@ -8395,7 +8395,7 @@ void flush_tables()
Oleksandr> */
Oleksandr> bool remove_table_from_cache(THD *thd, const char *db, const char *table_name,
Oleksandr> - uint flags)
Oleksandr> + uint flags, my_bool deleting)
Oleksandr> {
Oleksandr> char key[MAX_DBKEY_LENGTH];
Oleksandr> uint key_length;
Oleksandr> @@ -8482,7 +8482,10 @@ bool remove_table_from_cache(THD *thd, c
Oleksandr> }
Oleksandr> }
Oleksandr> while (unused_tables && !unused_tables->s->version)
Oleksandr> + {
Oleksandr> + unused_tables->s->deleting= deleting;
Oleksandr> VOID(hash_delete(&open_cache,(uchar*) unused_tables));
Oleksandr> + }
Oleksandr> DBUG_PRINT("info", ("Removing table from table_def_cache"));
Oleksandr> /* Remove table from table definition cache if it's not in use */
Oleksandr> @@ -8676,7 +8679,8 @@ int abort_and_upgrade_lock(ALTER_PARTITI
Oleksandr> /* If MERGE child, forward lock handling to parent. */
Oleksandr> mysql_lock_abort(lpt->thd, lpt->table->parent ? lpt->table->parent :
Oleksandr> lpt->table, TRUE);
Oleksandr> - VOID(remove_table_from_cache(lpt->thd, lpt->db, lpt->table_name, flags));
Oleksandr> + VOID(remove_table_from_cache(lpt->thd, lpt->db, lpt->table_name, flags,
Oleksandr> + FALSE));
Oleksandr> VOID(pthread_mutex_unlock(&LOCK_open));
Oleksandr> DBUG_RETURN(0);
Oleksandr> }
Oleksandr> @@ -8701,7 +8705,7 @@ void close_open_tables_and_downgrade(ALT
Oleksandr> {
Oleksandr> VOID(pthread_mutex_lock(&LOCK_open));
Oleksandr> remove_table_from_cache(lpt->thd, lpt->db, lpt->table_name,
Oleksandr> - RTFC_WAIT_OTHER_THREAD_FLAG);
Oleksandr> + RTFC_WAIT_OTHER_THREAD_FLAG, FALSE);
Oleksandr> VOID(pthread_mutex_unlock(&LOCK_open));
Oleksandr> /* If MERGE child, forward lock handling to parent. */
Oleksandr> mysql_lock_downgrade_write(lpt->thd, lpt->table->parent ? lpt->table->parent :
Oleksandr> === modified file 'sql/sql_table.cc'
Oleksandr> --- sql/sql_table.cc 2009-06-18 12:39:21 +0000
Oleksandr> +++ sql/sql_table.cc 2009-08-09 21:48:04 +0000
Oleksandr> @@ -1599,6 +1599,8 @@ int mysql_rm_table_part2(THD *thd, TABLE
Oleksandr> if ((share= get_cached_table_share(table->db, table->table_name)))
Oleksandr> table->db_type= share->db_type();
Oleksandr> + table->deleting= TRUE;
Oleksandr> +
Oleksandr> /* Disable drop of enabled log tables */
Oleksandr> if (share && (share->table_category == TABLE_CATEGORY_PERFORMANCE) &&
Oleksandr> check_if_log_table(table->db_length, table->db,
Oleksandr> @@ -1676,7 +1678,7 @@ int mysql_rm_table_part2(THD *thd, TABLE
Oleksandr> abort_locked_tables(thd, db, table->table_name);
Oleksandr> remove_table_from_cache(thd, db, table->table_name,
Oleksandr> RTFC_WAIT_OTHER_THREAD_FLAG |
Oleksandr> - RTFC_CHECK_KILLED_FLAG);
Oleksandr> + RTFC_CHECK_KILLED_FLAG, TRUE);
Oleksandr> /*
Oleksandr> If the table was used in lock tables, remember it so that
Oleksandr> unlock_table_names can free it
Oleksandr> @@ -3862,7 +3864,7 @@ void wait_while_table_is_used(THD *thd,T
Oleksandr> /* Wait until all there are no other threads that has this table open */
Oleksandr> remove_table_from_cache(thd, table->s->db.str,
Oleksandr> table->s->table_name.str,
Oleksandr> - RTFC_WAIT_OTHER_THREAD_FLAG);
Oleksandr> + RTFC_WAIT_OTHER_THREAD_FLAG, FALSE);
Oleksandr> /* extra() call must come only after all instances above are closed */
Oleksandr> VOID(table->file->extra(function));
Oleksandr> DBUG_VOID_RETURN;
Oleksandr> @@ -4366,7 +4368,7 @@ static bool mysql_admin_table(THD* thd,
Oleksandr> remove_table_from_cache(thd, table->table->s->db.str,
Oleksandr> table->table->s->table_name.str,
Oleksandr> RTFC_WAIT_OTHER_THREAD_FLAG |
Oleksandr> - RTFC_CHECK_KILLED_FLAG);
Oleksandr> + RTFC_CHECK_KILLED_FLAG, FALSE);
Oleksandr> thd->exit_cond(old_message);
Oleksandr> DBUG_EXECUTE_IF("wait_in_mysql_admin_table", wait_for_kill_signal(thd););
Oleksandr> if (thd->killed)
Oleksandr> @@ -4624,7 +4626,8 @@ send_result_message:
Oleksandr> {
Oleksandr> pthread_mutex_lock(&LOCK_open);
Oleksandr> remove_table_from_cache(thd, table->table->s->db.str,
Oleksandr> - table->table->s->table_name.str, RTFC_NO_FLAG);
Oleksandr> + table->table->s->table_name.str,
Oleksandr> + RTFC_NO_FLAG, FALSE);
Oleksandr> pthread_mutex_unlock(&LOCK_open);
Oleksandr> }
Oleksandr> /* May be something modified consequently we have to invalidate cache */
Oleksandr> === modified file 'sql/table.cc'
Oleksandr> --- sql/table.cc 2009-06-29 21:03:30 +0000
Oleksandr> +++ sql/table.cc 2009-08-09 20:46:07 +0000
Oleksandr> @@ -1960,7 +1960,12 @@ int closefrm(register TABLE *table, bool
Oleksandr> DBUG_PRINT("enter", ("table: 0x%lx", (long) table));
Oleksandr> if (table->db_stat)
Oleksandr> - error=table->file->close();
Oleksandr> + {
Oleksandr> + if (table->s->deleting)
Oleksandr> + table->file->prepare_for_delete();
Oleksandr> + error= table->file->close();
Oleksandr> + }
Oleksandr> +
As we have a handler here, why not instead do:
table->file->extra(HA_EXTRA_PREPARE_FOR_DROP);
There is no reason to add an extra prepare_for_delete() here.
Oleksandr> --- storage/myisam/ha_myisam.cc 2009-06-29 21:03:30 +0000
Oleksandr> +++ storage/myisam/ha_myisam.cc 2009-08-09 20:42:15 +0000
Oleksandr> @@ -26,7 +26,9 @@
Oleksandr> #include <myisampack.h>
Oleksandr> #include "ha_myisam.h"
Oleksandr> #include <stdarg.h>
Oleksandr> +C_MODE_START
Oleksandr> #include "myisamdef.h"
Oleksandr> +C_MODE_END
Oleksandr> #include "rt_index.h"
With my suggested change, there is no reason to make any changes in ha_myisam.cc
or ha_myisam.h
<cut>
Oleksandr> +++ storage/myisam/mi_close.c 2009-08-09 22:01:32 +0000
Oleksandr> @@ -65,8 +65,9 @@ int mi_close(register MI_INFO *info)
Oleksandr> {
Oleksandr> if (share->kfile >= 0 &&
Oleksandr> flush_key_blocks(share->key_cache, share->kfile,
Oleksandr> - share->temporary ? FLUSH_IGNORE_CHANGED :
Oleksandr> - FLUSH_RELEASE))
Oleksandr> + (share->temporary || share->deleting) ?
Oleksandr> + FLUSH_IGNORE_CHANGED :
Oleksandr> + FLUSH_RELEASE))
Oleksandr> error=my_errno;
Oleksandr> if (share->kfile >= 0)
Oleksandr> {
No reason for the above change.
1) With my suggestion, there is no reason to do this.
2) If we implement it your way, we could reuse 'share->temporary' for
this case.
Oleksandr> === modified file 'storage/myisam/mi_locking.c'
Oleksandr> --- storage/myisam/mi_locking.c 2009-04-01 09:34:52 +0000
Oleksandr> +++ storage/myisam/mi_locking.c 2009-08-09 20:42:00 +0000
Oleksandr> @@ -68,7 +68,10 @@ int mi_lock_database(MI_INFO *info, int
Oleksandr> --share->tot_locks;
Oleksandr> if (info->lock_type == F_WRLCK && !share->w_locks &&
Oleksandr> !share->delay_key_write && flush_key_blocks(share->key_cache,
Oleksandr> - share->kfile,FLUSH_KEEP))
Oleksandr> + share->kfile,
Oleksandr> + (share->deleting ?
Oleksandr> + FLUSH_IGNORE_CHANGED :
Oleksandr> + FLUSH_KEEP)))
No reason to do the above. Reasons:
- In case of delay_key_write, the above code will not be executed.
- If delay_key_write is not set, things were flushed at the previous
statement.
Did I miss some case?
Oleksandr> --- storage/myisam/myisamdef.h 2009-04-25 09:04:38 +0000
Oleksandr> +++ storage/myisam/myisamdef.h 2009-08-09 20:41:25 +0000
Oleksandr> @@ -218,6 +218,7 @@ typedef struct st_mi_isam_share
Oleksandr> my_bool changed, /* If changed since lock */
Oleksandr> global_changed, /* If changed since open */
Oleksandr> not_flushed, temporary, delay_key_write, concurrent_insert;
Oleksandr> + my_bool deleting; /* we are going to delete this table */
Not needed.
---------------
Other fixes:
Please fix mi_extra.c as we discussed (move the #ifdef so that things
are flushed).
Also make the following fix to ma_extra.c:
if (share->kfile.file >= 0)
_ma_decrement_open_count(info);
->
if (share->kfile.file >= 0 && do_flush)
_ma_decrement_open_count(info);
The idea is that we should not decrement the open_count in case of
drop. This will ensure that if we die between flushing the key cache
and close, the index will be rechecked.
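A toy model of that idea, for readers who want to see the effect of the do_flush condition. Everything here is an assumption made for illustration: `KeyFileState`, `open_for_write`, `prepare_for_close` and `needs_recheck_after_crash` are hypothetical names, and the real MyISAM/Maria code tracks far more state than two counters.

```cpp
#include <cassert>

// Hypothetical model of the state that matters for the argument above.
struct KeyFileState {
  int open_count = 0;           // > 0 on disk means "not closed cleanly"
  int unwritten_key_pages = 0;  // changed key pages still only in the cache
  int pages_written = 0;        // key pages actually written to disk
};

// Opening for write marks the index as open.
void open_for_write(KeyFileState &f) { ++f.open_count; }

// do_flush == true:  normal close, write changed pages and mark clean.
// do_flush == false: drop, discard changed pages and leave open_count alone.
void prepare_for_close(KeyFileState &f, bool do_flush) {
  if (do_flush) f.pages_written += f.unwritten_key_pages;
  f.unwritten_key_pages = 0;
  if (do_flush) {
    assert(f.open_count > 0);
    --f.open_count;
  }
}

// If the server dies before the file is deleted, a non-zero open_count is
// what makes the next open recheck the index.
bool needs_recheck_after_crash(const KeyFileState &f) {
  return f.open_count > 0;
}
```

In the drop case no key pages are written and the index still looks "open", so a crash before the file is removed leads to a recheck rather than to trusting a stale index.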
-------------
Regards,
Monty

[Maria-developers] Progress (by Guest): Replication tasks (39)
by worklog-noreply@askmonty.org 10 Aug '09
10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Replication tasks
CREATION DATE..: Sun, 09 Aug 2009, 12:24
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-RawIdeaBin
TASK ID........: 39 (http://askmonty.org/worklog/?tid=39)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 17
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Mon, 10 Aug 2009, 16:32)=-=-
Adding 1 hour for Monty's initial work on starting the architecture review.
Worked 1 hour and estimate 0 hours remain (original estimate increased by 1 hour).
-=-=(Psergey - Mon, 10 Aug 2009, 15:59)=-=-
Re-searched and added subtasks.
Worked 16 hours and estimate 0 hours remain (original estimate increased by 16 hours).
-=-=(Psergey - Mon, 10 Aug 2009, 15:31)=-=-
Dependency created: 39 now depends on 41
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Psergey - Sun, 09 Aug 2009, 12:27)=-=-
Dependency created: 39 now depends on 36
-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 38
-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 37
DESCRIPTION:
A combined task for all replication tasks.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Progress (by Psergey): Replication tasks (39)
by worklog-noreply@askmonty.org 10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Replication tasks
CREATION DATE..: Sun, 09 Aug 2009, 12:24
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-RawIdeaBin
TASK ID........: 39 (http://askmonty.org/worklog/?tid=39)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 16
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 10 Aug 2009, 15:59)=-=-
Re-searched and added subtasks.
Worked 16 hours and estimate 0 hours remain (original estimate increased by 16 hours).
-=-=(Psergey - Mon, 10 Aug 2009, 15:31)=-=-
Dependency created: 39 now depends on 41
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Psergey - Sun, 09 Aug 2009, 12:27)=-=-
Dependency created: 39 now depends on 36
-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 38
-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 37
DESCRIPTION:
A combined task for all replication tasks.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Add a mysqlbinlog option to filter certain kinds of statements (41)
by worklog-noreply@askmonty.org 10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter certain kinds of statements
CREATION DATE..: Mon, 10 Aug 2009, 15:30
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-BackLog
TASK ID........: 41 (http://askmonty.org/worklog/?tid=41)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 10 Aug 2009, 15:47)=-=-
High-Level Specification modified.
--- /tmp/wklog.41.old.13282 2009-08-10 15:47:13.000000000 +0300
+++ /tmp/wklog.41.new.13282 2009-08-10 15:47:13.000000000 +0300
@@ -2,3 +2,10 @@
- If we decide to parse the statement, SQL-verb filtering will be trivial
- If we decide not to parse the statement, we still can reliably distinguish the
statement by matching the first characters against a set of patterns.
+
+If we chose the second, we'll have to perform certain normalization before
+matching the patterns:
+ - Remove all comments from the command
+ - Remove all pre-space
+ - Compare the string case-insensitively
+ - etc
-=-=(Psergey - Mon, 10 Aug 2009, 15:35)=-=-
High-Level Specification modified.
--- /tmp/wklog.41.old.12689 2009-08-10 15:35:04.000000000 +0300
+++ /tmp/wklog.41.new.12689 2009-08-10 15:35:04.000000000 +0300
@@ -1 +1,4 @@
-
+The implementation will depend on design choices made in WL#40:
+- If we decide to parse the statement, SQL-verb filtering will be trivial
+- If we decide not to parse the statement, we still can reliably distinguish the
+statement by matching the first characters against a set of patterns.
-=-=(Psergey - Mon, 10 Aug 2009, 15:31)=-=-
Dependency created: 39 now depends on 41
DESCRIPTION:
Add a mysqlbinlog option to filter certain kinds of statements, e.g. (syntax
subject to discussion):
mysqlbinlog --exclude='alter table,drop table,alter database,...'
HIGH-LEVEL SPECIFICATION:
The implementation will depend on design choices made in WL#40:
- If we decide to parse the statement, SQL-verb filtering will be trivial.
- If we decide not to parse the statement, we can still reliably distinguish the
  statement by matching its first characters against a set of patterns.
If we choose the second option, we'll have to perform certain normalization before
matching the patterns:
- Remove all comments from the command
- Remove all leading whitespace
- Compare the string case-insensitively
- etc
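The normalization steps above could look roughly like the following Python sketch. It is an illustration only, not mysqlbinlog code: the function names (normalize_statement, parse_exclude, is_excluded) are hypothetical, the option value format follows the --exclude example in the description, and the comment stripping is deliberately naive (it does not treat string literals or /*! ... */ version comments specially).

```python
import re


def normalize_statement(sql):
    """Strip comments, collapse whitespace, and lowercase a statement."""
    # Remove /* ... */ comments (naive: ignores string literals).
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.S)
    # Remove "-- comment" and "# comment" up to the end of the line.
    sql = re.sub(r"--([ \t].*)?$", " ", sql, flags=re.M)
    sql = re.sub(r"#.*$", " ", sql, flags=re.M)
    # Collapse runs of whitespace, drop leading/trailing space, ignore case.
    return re.sub(r"\s+", " ", sql).strip().lower()


def parse_exclude(option_value):
    """Turn 'alter table,drop table' into ['alter table', 'drop table']."""
    return [" ".join(verb.split()).lower()
            for verb in option_value.split(",") if verb.strip()]


def is_excluded(sql, excluded_verbs):
    """True if the normalized statement starts with one of the excluded verbs."""
    normalized = normalize_statement(sql)
    for verb in excluded_verbs:
        if normalized.startswith(verb):
            rest = normalized[len(verb):]
            # Require a word boundary so 'drop table' does not match
            # 'drop tablespace'.
            if not rest or not (rest[0].isalnum() or rest[0] == "_"):
                return True
    return False
```

For example, is_excluded("/* hint */ Drop Table t1", parse_exclude("alter table,drop table")) is true, while an INSERT into a table named drop_table_log is not matched because only the start of the statement is compared.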
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Add a mysqlbinlog option to change the used database (36)
by worklog-noreply@askmonty.org 10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to change the used database
CREATION DATE..: Fri, 07 Aug 2009, 14:57
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 36 (http://askmonty.org/worklog/?tid=36)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.13035 2009-08-10 15:41:51.000000000 +0300
+++ /tmp/wklog.36.new.13035 2009-08-10 15:41:51.000000000 +0300
@@ -1,5 +1,7 @@
Context
-------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has a replication slave option
--replicate-rewrite-db="from->to"
-=-=(Guest - Mon, 10 Aug 2009, 11:12)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.6580 2009-08-10 11:12:36.000000000 +0300
+++ /tmp/wklog.36.new.6580 2009-08-10 11:12:36.000000000 +0300
@@ -1,4 +1,3 @@
-
Context
-------
At the moment, the server has a replication slave option
@@ -67,6 +66,6 @@
It will be possible to do the rewrites either on the slave (
--replicate-rewrite-db will work for all kinds of statements), or in
-mysqlbinlog (adding a comment is easy and doesn't require use to parse the
-statement).
+mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
+parse the statement).
-=-=(Psergey - Sun, 09 Aug 2009, 23:53)=-=-
High-Level Specification modified.
--- /tmp/wklog.36.old.13425 2009-08-09 23:53:54.000000000 +0300
+++ /tmp/wklog.36.new.13425 2009-08-09 23:53:54.000000000 +0300
@@ -1 +1,72 @@
+Context
+-------
+At the moment, the server has a replication slave option
+
+ --replicate-rewrite-db="from->to"
+
+the option affects
+- Table_map_log_event (all RBR events)
+- Load_log_event (LOAD DATA)
+- Query_log_event (SBR-based updates, with the usual assumption that the
+ statement refers to tables in current database, so that changing the current
+ database will make the statement to work on a table in a different database).
+
+What we could do
+----------------
+
+Option1: make mysqlbinlog accept --replicate-rewrite-db option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
+same extent as replication slave would process --replicate-rewrite-db option.
+
+
+Option2: Add database-agnostic RBR events and --strip-db option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Right now RBR events require a databasename. It is not possible to have RBR
+event stream that won't mention which database the events are for. When I
+tried to use debugger and specify empty database name, attempt to apply the
+binlog resulted in this error:
+
+090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
+opening tables,
+
+We could do as follows:
+- Make the server interpret empty database name in RBR event (i.e. in a
+ Table_map_log_event) as "use current database". Binlog slave thread
+ probably should not allow such events as it doesn't have a natural current
+ database.
+- Add a mysqlbinlog --strip-db option that would
+ = not produce any "USE dbname" statements
+ = change databasename for all RBR events to be empty
+
+That way, mysqlbinlog output will be database-agnostic and apply to the
+current database.
+(this will have the usual limitations that we assume that all statements in
+the binlog refer to the current database).
+
+Option3: Enhance database rewrite
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If there is a need to support database change for statements that use
+dbname.tablename notation and are replicated as statements (i.e. are DDL
+statements and/or DML statements that are binlogged as statements),
+then that could be supported as follows:
+
+- Make the server's parser recognize special form of comments
+
+ /* !database-alias(oldname,newname) */
+
+ and save the mapping somewhere
+
+- Put the hooks in table open and name resolution code to use the saved
+ mapping.
+
+
+Once we've done the above, it will be easy to perform a complete,
+no-compromise or restrictions database name change in binary log.
+
+It will be possible to do the rewrites either on the slave (
+--replicate-rewrite-db will work for all kinds of statements), or in
+mysqlbinlog (adding a comment is easy and doesn't require use to parse the
+statement).
+
-=-=(Psergey - Sun, 09 Aug 2009, 12:27)=-=-
Dependency created: 39 now depends on 36
-=-=(Psergey - Fri, 07 Aug 2009, 14:57)=-=-
Title modified.
--- /tmp/wklog.36.old.14687 2009-08-07 14:57:49.000000000 +0300
+++ /tmp/wklog.36.new.14687 2009-08-07 14:57:49.000000000 +0300
@@ -1 +1 @@
-Add a mysqlbinlog option to change the database
+Add a mysqlbinlog option to change the used database
DESCRIPTION:
Sometimes there is a need to take a binary log and apply it to a database with
a different name than the original name of the database on the binlog producer.
With statement-based replication, one can achieve this by filtering
"USE dbname" statements out of the output of mysqlbinlog(*). With
row-based replication this is no longer possible, as the database name is encoded
within the BINLOG '....' statement.
This task is about adding an option to mysqlbinlog that would allow changing
the names of the used databases in both RBR and SBR events.
(*) This assumes that all statements refer to tables in the current database
and doesn't catch updates made inside stored functions and so forth, but it still
works for a practically important subset of cases.
HIGH-LEVEL SPECIFICATION:
Context
-------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has a replication slave option
  --replicate-rewrite-db="from->to"
The option affects
- Table_map_log_event (all RBR events)
- Load_log_event (LOAD DATA)
- Query_log_event (SBR-based updates, with the usual assumption that the
  statement refers to tables in the current database, so that changing the current
  database will make the statement work on a table in a different database).
What we could do
----------------
Option1: make mysqlbinlog accept --replicate-rewrite-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Make mysqlbinlog accept --replicate-rewrite-db options and process them to the
same extent as a replication slave would process the --replicate-rewrite-db option.
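As a rough illustration of Option1, the sketch below parses "from->to" rules and applies them to database names. It is written in Python and is not mysqlbinlog code: the function names and the dictionary used to stand in for an event are hypothetical, and only the "from->to" value syntax is taken from the text above.

```python
def parse_rewrite_rule(value):
    """Split one 'from->to' option value into a (from, to) pair."""
    source, arrow, target = value.partition("->")
    source, target = source.strip(), target.strip()
    if not arrow or not source or not target:
        raise ValueError("expected 'from->to', got %r" % value)
    return source, target


def build_rewrite_map(option_values):
    """Collect several --replicate-rewrite-db values into one mapping."""
    return dict(parse_rewrite_rule(value) for value in option_values)


def rewrite_db(db_name, rules):
    """Return the rewritten database name, or the original if no rule applies."""
    return rules.get(db_name, db_name)


def rewrite_event(event, rules):
    """Apply the rules to a toy event; 'db' is a made-up field name."""
    rewritten = dict(event)
    rewritten["db"] = rewrite_db(event["db"], rules)
    return rewritten
```

For instance, with the single rule "prod->staging", a toy event {"type": "Table_map", "db": "prod", "table": "t1"} would come out with "db" set to "staging", and events for other databases would pass through unchanged.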
Option2: Add database-agnostic RBR events and --strip-db option
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Right now RBR events require a database name. It is not possible to have an RBR
event stream that doesn't mention which database the events are for. When I
used a debugger to specify an empty database name, the attempt to apply the
binlog resulted in this error:
090809 17:38:44 [ERROR] Slave SQL: Error 'Table '.tablename' doesn't exist' on
opening tables,
We could do as follows:
- Make the server interpret an empty database name in an RBR event (i.e. in a
  Table_map_log_event) as "use current database". The binlog slave thread
  probably should not allow such events as it doesn't have a natural current
  database.
- Add a mysqlbinlog --strip-db option that would
  = not produce any "USE dbname" statements
  = change the database name for all RBR events to be empty
That way, mysqlbinlog output will be database-agnostic and apply to the
current database.
(This will have the usual limitation that we assume that all statements in
the binlog refer to the current database.)
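The first effect of the proposed --strip-db option (no "USE dbname" statements in the output) can be approximated today by a text filter. The Python sketch below is such a filter and nothing more: the function name is hypothetical, and the accepted line shapes ("use dbname/*!*/;" and "USE `dbname`;") are an assumption about what the text output looks like. It does not address RBR events, whose database name sits inside the encoded BINLOG statement.

```python
import re

# Assumed shapes of a USE line: optional backticks around the name and an
# optional /*!*/ delimiter before the semicolon.
USE_STATEMENT = re.compile(
    r"^\s*use\s+`?[^`\s;/]+`?\s*(/\*!\*/)?\s*;\s*$", re.IGNORECASE)


def strip_use_statements(lines):
    """Return the lines with standalone USE statements removed."""
    return [line for line in lines if not USE_STATEMENT.match(line)]
```

Applied to a list of output lines, this keeps every statement except the standalone USE lines, so the remaining statements run against whatever database is current when the output is replayed.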
Option3: Enhance database rewrite
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If there is a need to support database change for statements that use
dbname.tablename notation and are replicated as statements (i.e. are DDL
statements and/or DML statements that are binlogged as statements),
then that could be supported as follows:
- Make the server's parser recognize a special form of comment
    /* !database-alias(oldname,newname) */
  and save the mapping somewhere
- Put hooks in the table open and name resolution code to use the saved
  mapping.
Once we've done the above, it will be easy to perform a complete
database name change in the binary log, with no compromises or restrictions.
It will be possible to do the rewrites either on the slave
(--replicate-rewrite-db will work for all kinds of statements), or in
mysqlbinlog (adding a comment is easy and doesn't require mysqlbinlog to
parse the statement).
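To make Option3 more concrete, here is a small Python sketch of both halves: prepending the alias comment (the mysqlbinlog side) and extracting the mapping for name resolution (the server side). The comment syntax is the one proposed above; the function names and the dictionary used to hold the mapping are hypothetical, and a real implementation would live in the server's parser rather than in a regular expression.

```python
import re

# Matches the proposed comment form: /* !database-alias(oldname,newname) */
ALIAS_COMMENT = re.compile(
    r"/\*\s*!database-alias\(\s*([^,\s)]+)\s*,\s*([^,\s)]+)\s*\)\s*\*/")


def add_alias_comment(statement, old_name, new_name):
    """Prefix a statement with an alias comment; the statement is not parsed."""
    return "/* !database-alias(%s,%s) */ %s" % (old_name, new_name, statement)


def extract_aliases(statement):
    """Collect every oldname -> newname mapping found in the statement."""
    return {old: new for old, new in ALIAS_COMMENT.findall(statement)}


def resolve_db(db_name, aliases):
    """Name-resolution hook: map a database name through the saved aliases."""
    return aliases.get(db_name, db_name)
```

The point of the design is visible in add_alias_comment: the producer only concatenates strings, and all knowledge of where database names occur stays in the server's name resolution.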
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)

[Maria-developers] Updated (by Psergey): Add a mysqlbinlog option to filter updates to certain tables (40)
by worklog-noreply@askmonty.org 10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter updates to certain tables
CREATION DATE..: Mon, 10 Aug 2009, 13:25
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 40 (http://askmonty.org/worklog/?tid=40)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 10 Aug 2009, 15:41)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.12989 2009-08-10 15:41:23.000000000 +0300
+++ /tmp/wklog.40.new.12989 2009-08-10 15:41:23.000000000 +0300
@@ -1,6 +1,7 @@
-
1. Context
----------
+(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
+overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
-=-=(Guest - Mon, 10 Aug 2009, 14:52)=-=-
Dependency created: 39 now depends on 40
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High Level Description modified.
--- /tmp/wklog.40.old.16985 2009-08-10 14:51:59.000000000 +0300
+++ /tmp/wklog.40.new.16985 2009-08-10 14:51:59.000000000 +0300
@@ -1,3 +1,4 @@
Replication slave can be set to filter updates to certain tables with
---replicate-[wild-]{do,ignore}-table options. This task is about adding similar
-functionality to mysqlbinlog.
+--replicate-[wild-]{do,ignore}-table options.
+
+This task is about adding similar functionality to mysqlbinlog.
-=-=(Guest - Mon, 10 Aug 2009, 14:51)=-=-
High-Level Specification modified.
--- /tmp/wklog.40.old.16949 2009-08-10 14:51:33.000000000 +0300
+++ /tmp/wklog.40.new.16949 2009-08-10 14:51:33.000000000 +0300
@@ -1 +1,73 @@
+1. Context
+----------
+At the moment, the server has these replication slave options:
+
+ --replicate-do-table=db.tbl
+ --replicate-ignore-table=db.tbl
+ --replicate-wild-do-table=pattern.pattern
+ --replicate-wild-ignore-table=pattern.pattern
+
+They affect both RBR and SBR events. SBR events are checked after the
+statement has been parsed: the server iterates over the list of used tables
+and checks them against the --replicate-* rules.
+
+Interestingly, this scheme still allows the ignored table to be updated
+through a VIEW.
+
+2. Table filtering in mysqlbinlog
+---------------------------------
+
+Per-table filtering of RBR events is easy (as it is relatively easy to extract
+the name of the table that the event applies to).
+
+Per-table filtering of SBR events is hard, as generally it is not apparent
+which tables the statement refers to.
+
+This leaves the following options:
+
+2.1 Put the parser into mysqlbinlog
+-----------------------------------
+Once we have a full parser in mysqlbinlog, we'll be able to check which tables
+a statement uses, and mysqlbinlog will behave identically to a slave
+running with the --replicate-* options.
+
+(It is not clear how much effort is needed to put the parser into mysqlbinlog.
+Any guesses?)
+
+
+2.2 Use dumb regexp match
+-------------------------
+Use a really dumb approach. A query is considered to be modifying table X if
+it matches one of these expressions:
+
+CREATE TABLE $tablename
+DROP $tablename
+UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
+DELETE ...$tablename ... WHERE // same as above
+ALTER TABLE $tablename
+.. etc (go get from the grammar) ..
+
+The advantage over doing the same in awk is that mysqlbinlog will also process
+RBR events, and together with that will provide a working solution for
+those who take care that their table names do not also appear in string
+constants and such.
+
+(TODO: string constants are of particular concern as they come from
+[potentially hostile] users, unlike e.g. table aliases which come from
+[not hostile] developers. Should we also remove all string constants before
+attempting the match?)
+
+2.3 Have the master put annotations
+-----------------------------------
+We could add a master option that makes it inject into the query a marker
+telling which tables the query will affect, e.g. for the query
+
+ UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
+
+
+the binlog will have
+
+ /* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
+
+and further processing in mysqlbinlog will be trivial.
DESCRIPTION:
Replication slave can be set to filter updates to certain tables with
--replicate-[wild-]{do,ignore}-table options.
This task is about adding similar functionality to mysqlbinlog.
HIGH-LEVEL SPECIFICATION:
1. Context
----------
(See http://askmonty.org/wiki/index.php/Scratch/ReplicationOptions for global
overview)
At the moment, the server has these replication slave options:
--replicate-do-table=db.tbl
--replicate-ignore-table=db.tbl
--replicate-wild-do-table=pattern.pattern
--replicate-wild-ignore-table=pattern.pattern
They affect both RBR and SBR events. SBR events are checked after the
statement has been parsed: the server iterates over the list of used tables
and checks them against the --replicate-* rules.
Interestingly, this scheme still allows the ignored table to be updated
through a VIEW.
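As a rough illustration of the four options listed above, the per-table
decision can be modelled as below. This is a simplified sketch, not the
server's code: the names TableFilter and wild_to_regex and the evaluation
order are assumptions made for illustration, and escape characters in
wildcard patterns are not handled.

```python
import re


def wild_to_regex(pattern):
    """Translate a LIKE-style pattern ('%' and '_') into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")      # '%' matches any run of characters
        elif ch == "_":
            parts.append(".")       # '_' matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z", re.DOTALL)


class TableFilter:
    """Hypothetical model of --replicate-[wild-]{do,ignore}-table rules."""

    def __init__(self, do=(), ignore=(), wild_do=(), wild_ignore=()):
        self.do = set(do)
        self.ignore = set(ignore)
        self.wild_do = [wild_to_regex(p) for p in wild_do]
        self.wild_ignore = [wild_to_regex(p) for p in wild_ignore]

    def allows(self, db, tbl):
        name = f"{db}.{tbl}"
        if name in self.do:
            return True
        if name in self.ignore:
            return False
        if any(r.match(name) for r in self.wild_do):
            return True
        if any(r.match(name) for r in self.wild_ignore):
            return False
        # Assumed default: if any "do" rule exists, unmatched tables are skipped.
        return not (self.do or self.wild_do)
```

For RBR events the (db, tbl) pair is available directly from the event; for
SBR events it has to come from one of the approaches discussed below.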
2. Table filtering in mysqlbinlog
---------------------------------
Per-table filtering of RBR events is easy (as it is relatively easy to extract
the name of the table that the event applies to).
Per-table filtering of SBR events is hard, as generally it is not apparent
which tables the statement refers to.
This leaves the following options:
2.1 Put the parser into mysqlbinlog
-----------------------------------
Once we have a full parser in mysqlbinlog, we'll be able to check which tables
a statement uses, and mysqlbinlog will behave identically to a slave
running with the --replicate-* options.
(It is not clear how much effort is needed to put the parser into mysqlbinlog.
Any guesses?)
2.2 Use dumb regexp match
-------------------------
Use a really dumb approach. A query is considered to be modifying table X if
it matches one of these expressions:
CREATE TABLE $tablename
DROP $tablename
UPDATE ...$tablename ... SET // here '...' can't contain the word 'SET'
DELETE ...$tablename ... WHERE // same as above
ALTER TABLE $tablename
.. etc (go get from the grammar) ..
The advantage over doing the same in awk is that mysqlbinlog will also process
RBR events, and together with that will provide a working solution for
those who take care that their table names do not also appear in string
constants and such.
(TODO: string constants are of particular concern as they come from
[potentially hostile] users, unlike e.g. table aliases which come from
[not hostile] developers. Should we also remove all string constants before
attempting the match?)
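A minimal sketch of what the dumb match could look like, including the
string-constant removal raised in the TODO. Everything here is an assumption
for illustration: the function names are made up, DROP is taken to mean
DROP TABLE, only the five patterns listed above are covered, and comments,
quoted identifiers and multi-table DDL are ignored.

```python
import re

# Quoted string constants, allowing backslash escapes and doubled quotes.
_SINGLE = r"'(?:[^'\\]|\\.|'')*'"
_DOUBLE = r'"(?:[^"\\]|\\.|"")*"'
_STRING = re.compile(_SINGLE + "|" + _DOUBLE, re.DOTALL)


def table_patterns(table):
    """Build the regexps for one table name (optionally db-qualified in SQL)."""
    name = r"\b" + re.escape(table) + r"\b"
    no_set = r"(?:(?!\bSET\b).)*?"      # '...' that must not contain SET
    no_where = r"(?:(?!\bWHERE\b).)*?"  # '...' that must not contain WHERE
    sources = [
        r"^\s*CREATE\s+TABLE\s+(?:\w+\.)?" + name,
        r"^\s*DROP\s+TABLE\s+(?:\w+\.)?" + name,
        r"^\s*ALTER\s+TABLE\s+(?:\w+\.)?" + name,
        r"^\s*UPDATE\b" + no_set + name + no_set + r"\bSET\b",
        r"^\s*DELETE\b" + no_where + name + no_where + r"\bWHERE\b",
    ]
    return [re.compile(s, re.IGNORECASE | re.DOTALL) for s in sources]


def modifies_table(query, table):
    """Guess whether the statement modifies the table; may well be wrong."""
    # Blank out string constants first so user data cannot fake a table name.
    stripped = _STRING.sub("''", query)
    return any(p.search(stripped) for p in table_patterns(table))
```

Note that, following the pattern list literally, a DELETE without a WHERE
clause is not matched; the real list would have to be taken from the grammar.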
2.3 Have the master put annotations
-----------------------------------
We could add a master option that makes it inject into the query a marker
telling which tables the query will affect, e.g. for the query
UPDATE t1 LEFT JOIN db3.t2 ON ... WHERE ...
the binlog will have
/* !mysqlbinlog: updates t1,db3.t2 */ UPDATE t1 LEFT JOIN ...
and further processing in mysqlbinlog will be trivial.
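To show how little work would be left for mysqlbinlog, here is a sketch of
reading such an annotation. The comment syntax is the one proposed above and
is not final; the function name annotated_tables and the default_db parameter
(the database to assume for unqualified names) are assumptions.

```python
import re

# Matches the proposed leading comment: /* !mysqlbinlog: updates t1,db3.t2 */
_ANNOTATION = re.compile(r"^\s*/\*\s*!mysqlbinlog:\s*updates\s+(.*?)\s*\*/")


def annotated_tables(query, default_db):
    """Return [(db, table), ...] from the annotation, or None if absent."""
    m = _ANNOTATION.match(query)
    if m is None:
        return None
    tables = []
    for item in m.group(1).split(","):
        item = item.strip()
        if not item:
            continue
        db, sep, tbl = item.rpartition(".")
        # Unqualified names fall back to the statement's default database.
        tables.append((db if sep else default_db, tbl))
    return tables
```

A statement without the annotation yields None, so the caller can decide
whether to pass it through, drop it, or fall back to another approach.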
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)


[Maria-developers] Updated (by Psergey): Add a mysqlbinlog option to filter certain kinds of statements (41)
by worklog-noreply@askmonty.org 10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter certain kinds of statements
CREATION DATE..: Mon, 10 Aug 2009, 15:30
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-BackLog
TASK ID........: 41 (http://askmonty.org/worklog/?tid=41)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Mon, 10 Aug 2009, 15:35)=-=-
High-Level Specification modified.
--- /tmp/wklog.41.old.12689 2009-08-10 15:35:04.000000000 +0300
+++ /tmp/wklog.41.new.12689 2009-08-10 15:35:04.000000000 +0300
@@ -1 +1,4 @@
-
+The implementation will depend on design choices made in WL#40:
+- If we decide to parse the statement, SQL-verb filtering will be trivial
+- If we decide not to parse the statement, we can still reliably identify the
+kind of statement by matching its first characters against a set of patterns.
-=-=(Psergey - Mon, 10 Aug 2009, 15:31)=-=-
Dependency created: 39 now depends on 41
DESCRIPTION:
Add a mysqlbinlog option to filter certain kinds of statements, e.g. (syntax
subject to discussion):
mysqlbinlog --exclude='alter table,drop table,alter database,...'
HIGH-LEVEL SPECIFICATION:
The implementation will depend on design choices made in WL#40:
- If we decide to parse the statement, SQL-verb filtering will be trivial
- If we decide not to parse the statement, we can still reliably identify the
kind of statement by matching its first characters against a set of patterns.
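The non-parsing variant could be sketched as follows. The --exclude syntax is
still subject to discussion, so the comma-separated format is taken from the
example above as an assumption, and the function names are hypothetical.
Statements that start with a comment are not handled.

```python
import re


def parse_exclude(option_value):
    """Turn 'alter table,drop table' into a list of compiled prefix patterns."""
    patterns = []
    for kind in option_value.split(","):
        words = kind.split()
        if words:
            # Allow any amount of whitespace between the words of the verb.
            body = r"\s+".join(re.escape(w) for w in words)
            patterns.append(re.compile(r"\s*" + body + r"\b", re.IGNORECASE))
    return patterns


def is_excluded(query, patterns):
    """True if the statement starts with one of the excluded verbs."""
    return any(p.match(query) for p in patterns)
```

Because only the beginning of the statement is matched, string constants
further inside the query cannot trigger the filter.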
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)


[Maria-developers] New (by Psergey): Add a mysqlbinlog option to filter certain kinds of statements (41)
by worklog-noreply@askmonty.org 10 Aug '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add a mysqlbinlog option to filter certain kinds of statements
CREATION DATE..: Mon, 10 Aug 2009, 15:30
SUPERVISOR.....: Monty
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-BackLog
TASK ID........: 41 (http://askmonty.org/worklog/?tid=41)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
DESCRIPTION:
Add a mysqlbinlog option to filter certain kinds of statements, e.g. (syntax
subject to discussion):
mysqlbinlog --exclude='alter table,drop table,alter database,...'
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
