[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (sanja:2712) Bug#45632
by sanja@askmonty.org 25 Jun '09
#At lp:maria
2712 sanja(a)askmonty.org 2009-06-25
Fix of BUG#45632 (http://bugs.mysql.com/bug.php?id=45632) - non-default debug settings were shared between sessions. This bugfix was proposed by Monty.
modified:
mysql-test/r/variables_debug.result
mysql-test/t/variables_debug.test
sql/set_var.cc
per-file messages:
mysql-test/r/variables_debug.result
Test that sessions do not share the same session debug variable.
mysql-test/t/variables_debug.test
Test that sessions do not share the same session debug variable.
sql/set_var.cc
Since the default settings are shared between sessions, we should push the dbug state before changing the debug setting for the first time.
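A minimal sketch of the idea the patch below implements in sys_var_thd_dbug::update(): as long as a session has never changed 'debug', it shares the global dbug state, so the first per-session change must create a thread-local copy with DBUG_PUSH(); later changes can modify that copy with DBUG_SET(). The wrapper function name is hypothetical; _db_is_pushed_(), DBUG_PUSH() and DBUG_SET() are the dbug library calls used by the patch.

#include <my_global.h>
#include <my_dbug.h>

/* Hypothetical helper showing the decision made by the fix below */
static void set_session_debug(const char *command)
{
#ifndef DBUG_OFF
  if (_db_is_pushed_())
    DBUG_SET(command);    /* already have a thread-local state; modify it */
  else
    DBUG_PUSH(command);   /* still sharing the global state; push a local copy */
#endif
}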
=== modified file 'mysql-test/r/variables_debug.result'
--- a/mysql-test/r/variables_debug.result 2008-02-26 15:03:59 +0000
+++ b/mysql-test/r/variables_debug.result 2009-06-24 22:22:20 +0000
@@ -10,3 +10,18 @@ set debug= '-P';
select @@debug;
@@debug
T
+set session debug="t";
+show session variables like 'debug';
+Variable_name Value
+debug t
+set session debug="t";
+show session variables like 'debug';
+Variable_name Value
+debug t
+set session debug="d:t";
+show session variables like 'debug';
+Variable_name Value
+debug d:t
+show session variables like 'debug';
+Variable_name Value
+debug t
=== modified file 'mysql-test/t/variables_debug.test'
--- a/mysql-test/t/variables_debug.test 2008-02-26 15:03:59 +0000
+++ b/mysql-test/t/variables_debug.test 2009-06-24 22:22:20 +0000
@@ -10,3 +10,31 @@ set debug= '+P';
select @@debug;
set debug= '-P';
select @@debug;
+
+#
+# Checks that assigning variable 'debug' in one session has no influence on
+# other session. (BUG#45632 of bugs.mysql.com)
+#
+connect(con1,localhost,root,,test,,);
+connect(con2,localhost,root,,test,,);
+
+# makes output independent of current debug status
+connection con1;
+set session debug="t";
+show session variables like 'debug';
+connection con2;
+set session debug="t";
+show session variables like 'debug';
+
+# checks influence one session debug variable on another
+connection con1;
+set session debug="d:t";
+show session variables like 'debug';
+connection con2;
+show session variables like 'debug';
+
+disconnect con1;
+disconnect con2;
+
+connection default;
+
=== modified file 'sql/set_var.cc'
--- a/sql/set_var.cc 2009-05-19 09:28:05 +0000
+++ b/sql/set_var.cc 2009-06-24 22:22:20 +0000
@@ -4231,11 +4231,28 @@ bool sys_var_thd_dbug::check(THD *thd, s
bool sys_var_thd_dbug::update(THD *thd, set_var *var)
{
+#ifndef DBUG_OFF
+ const char *command= var ? var->value->str_value.c_ptr() : "";
+
if (var->type == OPT_GLOBAL)
- DBUG_SET_INITIAL(var ? var->value->str_value.c_ptr() : "");
+ DBUG_SET_INITIAL(command);
else
- DBUG_SET(var ? var->value->str_value.c_ptr() : "");
-
+ {
+ if (_db_is_pushed_())
+ {
+ /* We have already a local state done with DBUG_PUSH; Modify the state */
+ DBUG_SET(command);
+ }
+ else
+ {
+ /*
+ We are sharing the state with the global state;
+ Create a local state for this thread.
+ */
+ DBUG_PUSH(command);
+ }
+ }
+#endif
return 0;
}
Hi All,
This is a heads-up that we are actively testing Tungsten Replicator against the latest MariaDB build and hope to complete initial certification this week. So far there are no problems. We are also going to add a feature to use Maria for our catalog tables instead of InnoDB. Tungsten Replicator builds are GPL V2, by the way.
This brings up a question for the Maria dev team: what are your plans, if any, for replication support in MariaDB? In particular, are there any plans that would affect binlog formats?
Thanks, Robert
P.s., I don't see a location on AskMonty.org to post builds for related products. Any hints where to go? We would like to post our open source offerings in a location that is easy for MariaDB users to find.
--
Robert Hodges, CTO, Continuent, Inc.
Email: robert.hodges(a)continuent.com
Mobile: +1-510-501-3728 Skype: hodgesrm

[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2716)
by knielsen@knielsen-hq.org 23 Jun '09
#At lp:maria
2716 knielsen(a)knielsen-hq.org 2009-06-23
Fix memory leak in mysql_ssl_set() when called more than once.
Fix sleep() synchronisation in innodb_information_schema test case.
modified:
mysql-test/t/innodb_information_schema.test
sql-common/client.c
per-file messages:
mysql-test/t/innodb_information_schema.test
Using sleep for synchronisation does not work!!!
Replace by looping until the required condition is met.
sql-common/client.c
mysql_ssl_set() did not free old pointers before overwriting with new ones (happens when
mysql_ssl_set() is called twice without calling mysql_close() in-between).
This sometimes caused memory leaks in the slave depending on exact timing of
master/slave shutdown.
Fixed by freeing old pointers before installing new ones in mysql_ssl_set(), just like
mysql_options() does.
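As an illustration of the leak scenario described above (a sketch, not code from the tree): a client that calls mysql_ssl_set() twice on the same handle without an intervening mysql_close(), for example to retry a connection with a different certificate. Before this fix the second call overwrote the strdup'ed option strings from the first call without freeing them; the patch below frees the old pointers first, as mysql_options() already did. The function name and file paths are made up for the example.

#include <mysql.h>

static void retry_with_new_cert(MYSQL *mysql)
{
  mysql_ssl_set(mysql, "old.key", "old.pem", "ca.pem", NULL, NULL);
  /* ... the connection attempt fails, try another client certificate ... */
  mysql_ssl_set(mysql, "new.key", "new.pem", "ca.pem", NULL, NULL);
  /* without the fix, the ssl_* strings from the first call leak */
}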
=== modified file 'mysql-test/t/innodb_information_schema.test'
--- a/mysql-test/t/innodb_information_schema.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb_information_schema.test 2009-06-23 12:00:24 +0000
@@ -109,14 +109,19 @@ SELECT * FROM ```t'\"_str` WHERE c1 = '3
-- send
SELECT * FROM ```t'\"_str` WHERE c1 = '4' FOR UPDATE;
-# Give time to the above 2 queries to execute before continuing.
-# Without this sleep it sometimes happens that the SELECT from innodb_locks
+-- connection con_verify_innodb_locks
+
+# Loop, giving time for the above 2 queries to execute before continuing.
+# Without this, it sometimes happens that the SELECT FROM innodb_locks
# executes before some of them, resulting in less than expected number
# of rows being selected from innodb_locks.
--- sleep 0.1
+SET @counter := 0;
+while (`SELECT (@counter := @counter + 1) <= 50 AND COUNT(*) != 14 FROM INFORMATION_SCHEMA.INNODB_LOCKS`)
+{
+ sleep 0.1;
+}
-- enable_result_log
--- connection con_verify_innodb_locks
SELECT lock_mode, lock_type, lock_table, lock_index, lock_rec, lock_data
FROM INFORMATION_SCHEMA.INNODB_LOCKS ORDER BY lock_data;
=== modified file 'sql-common/client.c'
--- a/sql-common/client.c 2009-04-25 10:05:32 +0000
+++ b/sql-common/client.c 2009-06-23 12:00:24 +0000
@@ -1585,6 +1585,11 @@ mysql_ssl_set(MYSQL *mysql __attribute__
{
DBUG_ENTER("mysql_ssl_set");
#ifdef HAVE_OPENSSL
+ my_free(mysql->options.ssl_key, MYF(MY_ALLOW_ZERO_PTR));
+ my_free(mysql->options.ssl_cert, MYF(MY_ALLOW_ZERO_PTR));
+ my_free(mysql->options.ssl_ca, MYF(MY_ALLOW_ZERO_PTR));
+ my_free(mysql->options.ssl_capath, MYF(MY_ALLOW_ZERO_PTR));
+ my_free(mysql->options.ssl_cipher, MYF(MY_ALLOW_ZERO_PTR));
mysql->options.ssl_key= strdup_if_not_null(key);
mysql->options.ssl_cert= strdup_if_not_null(cert);
mysql->options.ssl_ca= strdup_if_not_null(ca);

[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2703)
by knielsen@knielsen-hq.org 22 Jun '09
#At lp:maria
2703 knielsen(a)knielsen-hq.org 2009-06-10
Fix XtraDB to build with atomic operations, for good performance.
The root of the problem is that ./configure mixed together two different things. One is the
availability of GCC atomic operation intrinsics. The other is the selection of which
primitives to use for the my_atomic implementation.
At some point a hack was made to not use GCC intrinsics in my_atomic, to work around
some test failures. But because the two things are mixed in ./configure, this also had the
side effect of making GCC intrinsics unavailable to XtraDB.
Fixed by splitting this in two in configure, so that we have HAVE_GCC_ATOMIC_BUILTINS for
GCC intrinsics availability and MY_ATOMIC_MODE_GCC_BUILTINS for use in my_atomic.
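A sketch (hypothetical code, not from the tree) of the intended division of labour after the split: HAVE_GCC_ATOMIC_BUILTINS only says that the compiler's __sync_* intrinsics are available, so XtraDB or any other code can use them directly, while MY_ATOMIC_MODE_GCC_BUILTINS is consulted only by the my_atomic code (e.g. include/atomic/nolock.h in the hunk below) to pick gcc_builtins.h as its backend. increment_counter() is an invented example.

#ifdef HAVE_GCC_ATOMIC_BUILTINS
/* The compiler provides __sync_* intrinsics; use them directly,
   independently of which backend my_atomic was configured to use. */
static inline long increment_counter(long *counter)
{
  return __sync_fetch_and_add(counter, 1);
}
#else
/* Fall back to e.g. a mutex-protected increment. */
#endif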
modified:
configure.in
include/atomic/nolock.h
=== modified file 'configure.in'
--- a/configure.in 2009-04-23 13:06:16 +0000
+++ b/configure.in 2009-06-10 09:13:53 +0000
@@ -1726,6 +1726,30 @@ then
fi
fi
+AC_CACHE_CHECK([whether the compiler provides atomic builtins],
+ [mysql_cv_gcc_atomic_builtins], [AC_TRY_RUN([
+ int main()
+ {
+ int foo= -10; int bar= 10;
+ if (!__sync_fetch_and_add(&foo, bar) || foo)
+ return -1;
+ bar= __sync_lock_test_and_set(&foo, bar);
+ if (bar || foo != 10)
+ return -1;
+ bar= __sync_val_compare_and_swap(&bar, foo, 15);
+ if (bar)
+ return -1;
+ return 0;
+ }
+], [mysql_cv_gcc_atomic_builtins=yes],
+ [mysql_cv_gcc_atomic_builtins=no],
+ [mysql_cv_gcc_atomic_builtins=no])])
+
+if test "x$mysql_cv_gcc_atomic_builtins" = xyes; then
+ AC_DEFINE(HAVE_GCC_ATOMIC_BUILTINS, 1,
+ [Define to 1 if compiler provides atomic builtins.])
+fi
+
AC_ARG_WITH([atomic-ops],
AC_HELP_STRING([--with-atomic-ops=rwlocks|smp|up],
[Implement atomic operations using pthread rwlocks or atomic CPU
@@ -1739,28 +1763,9 @@ case "$with_atomic_ops" in
[Use pthread rwlocks for atomic ops]) ;;
"smp") ;;
"")
- AC_CACHE_CHECK([whether the compiler provides atomic builtins],
- [mysql_cv_gcc_atomic_builtins], [AC_TRY_RUN([
- int main()
- {
- int foo= -10; int bar= 10;
- if (!__sync_fetch_and_add(&foo, bar) || foo)
- return -1;
- bar= __sync_lock_test_and_set(&foo, bar);
- if (bar || foo != 10)
- return -1;
- bar= __sync_val_compare_and_swap(&bar, foo, 15);
- if (bar)
- return -1;
- return 0;
- }
- ], [mysql_cv_gcc_atomic_builtins=yes_but_disabled],
- [mysql_cv_gcc_atomic_builtins=no],
- [mysql_cv_gcc_atomic_builtins=no])])
-
- if test "x$mysql_cv_gcc_atomic_builtins" = xyes; then
- AC_DEFINE(HAVE_GCC_ATOMIC_BUILTINS, 1,
- [Define to 1 if compiler provides atomic builtins.])
+ if test "x$mysql_cv_gcc_atomic_builtins" = xyes_but_disabled; then
+ AC_DEFINE([MY_ATOMIC_MODE_GCC_BUILTINS], [1],
+ [Use GCC atomic builtins for atomic ops])
fi
;;
*) AC_MSG_ERROR(["$with_atomic_ops" is not a valid value for --with-atomic-ops]) ;;
=== modified file 'include/atomic/nolock.h'
--- a/include/atomic/nolock.h 2008-02-05 15:47:11 +0000
+++ b/include/atomic/nolock.h 2009-06-10 09:13:53 +0000
@@ -14,7 +14,7 @@
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#if defined(__i386__) || defined(_MSC_VER) || \
- defined(__x86_64__) || defined(HAVE_GCC_ATOMIC_BUILTINS)
+ defined(__x86_64__) || defined(MY_ATOMIC_MODE_GCC_BUILTINS)
# ifdef MY_ATOMIC_MODE_DUMMY
# define LOCK_prefix ""
@@ -22,7 +22,7 @@
# define LOCK_prefix "lock"
# endif
-# ifdef HAVE_GCC_ATOMIC_BUILTINS
+# ifdef MY_ATOMIC_MODE_GCC_BUILTINS
# include "gcc_builtins.h"
# elif __GNUC__
# include "x86-gcc.h"

[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (psergey:2718)
by Sergey Petrunia 22 Jun '09
#At lp:maria based on revid:psergey@askmonty.org-20090617052739-37i1r8lip0m4ft9r
2718 Sergey Petrunia 2009-06-22
MWL#17: Table elimination
- Make the elimination check able to detect cases like t.primary_key_col1=othertbl.col AND t.primary_key_col2=func(t.primary_key_col1).
These are needed to handle e.g. the case of func() being a correlated subquery that selects the latest value.
- If we've removed a condition with a subquery predicate, EXPLAIN [EXTENDED] no longer shows the subquery
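The core of the new check in has_eq_ref_access_candidate() (in the sql/sql_select.cc hunk below) is a fixed-point computation over a bitmap of bound key parts: a part bound by an expression such as func(t.primary_key_col1) only becomes usable once every part it depends on is already bound. A simplified, self-contained sketch of that loop follows; the names key_part_map_t, ALL_PARTS and key_fully_bound are made up, while the real code uses key_part_map, bound_parts and PREV_BITS.

typedef unsigned long key_part_map_t;                 /* stand-in for key_part_map */
#define ALL_PARTS(n) ((key_part_map_t) (((key_part_map_t) 1 << (n)) - 1))

/* deps[i] = key parts that the expression binding part i depends on
   bound   = parts bound directly by columns of other tables         */
static bool key_fully_bound(key_part_map_t bound,
                            const key_part_map_t *deps,
                            unsigned n_parts)
{
  bool progress;
  do
  {
    progress= false;
    for (unsigned i= 0; i < n_parts; i++)
    {
      /* part i becomes bound once everything it depends on is bound */
      if (!(bound & ((key_part_map_t) 1 << i)) && !(deps[i] & ~bound))
      {
        bound|= (key_part_map_t) 1 << i;
        progress= true;
      }
    }
  } while (progress && bound != ALL_PARTS(n_parts));
  return bound == ALL_PARTS(n_parts);
}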
modified:
sql/item.cc
sql/item.h
sql/item_subselect.cc
sql/item_subselect.h
sql/item_sum.cc
sql/sql_lex.cc
sql/sql_lex.h
sql/sql_select.cc
sql/sql_select.h
per-file messages:
sql/item.cc
MWL#17: Table elimination
- Add Item_field::check_column_usage_processor(). It allows checking which key parts a condition depends on.
sql/item.h
MWL#17: Table elimination
- Add Item_field::check_column_usage_processor(). It allows checking which key parts a condition depends on.
sql/item_subselect.cc
MWL#17: Table elimination
- Item_subselect got an 'eliminated' attribute. It is used only to determine whether the subselect should be printed by EXPLAIN.
- Item_subselect got List<Item> refers_to - a list of items in the current select that are referred to from within the subselect.
- Added Item_*::check_column_usage_processor(). It allows checking which key parts a condition depends on.
- Added a comment about possible problem in Item_subselect::walk
sql/item_subselect.h
MWL#17: Table elimination
- Item_subselect got an 'eliminated' attribute. It is used only to determine whether the subselect should be printed by EXPLAIN.
- Item_subselect got List<Item> refers_to - a list of items in the current select that are referred to from within the subselect.
- Added Item_*::check_column_usage_processor(). It allows checking which key parts a condition depends on.
sql/item_sum.cc
MWL#17: Table elimination
sql/sql_lex.cc
MWL#17: Table elimination
sql/sql_lex.h
MWL#17: Table elimination
sql/sql_select.cc
MWL#17: Table elimination
- Make the elimination check able to detect cases like t.primary_key_col1=othertbl.col AND t.primary_key_col2=func(t.primary_key_col1).
These are needed to handle e.g. the case of func() being a correlated subquery that selects the latest value.
- If we've removed a condition with a subquery predicate, EXPLAIN [EXTENDED] no longer shows the subquery
sql/sql_select.h
MWL#17: Table elimination
=== modified file 'sql/item.cc'
--- a/sql/item.cc 2009-04-25 10:05:32 +0000
+++ b/sql/item.cc 2009-06-22 11:46:31 +0000
@@ -1915,6 +1915,30 @@ void Item_field::reset_field(Field *f)
name= (char*) f->field_name;
}
+bool Item_field::check_column_usage_processor(uchar *arg)
+{
+ Field_processor_info* info=(Field_processor_info*)arg;
+ if (used_tables() & ~info->allowed_tables)
+ return FALSE;
+
+ if (field->table == info->table)
+ {
+ if (!(field->part_of_key.is_set(info->keyno)))
+ return TRUE;
+
+ KEY *key= &field->table->key_info[info->keyno];
+ for (uint part= 0; part < key->key_parts; part++)
+ {
+ if (field->field_index == key->key_part[part].field->field_index)
+ {
+ info->needed_key_parts |= key_part_map(1) << part;
+ break;
+ }
+ }
+ }
+ return FALSE;
+}
+
const char *Item_ident::full_name() const
{
char *tmp;
@@ -3380,7 +3404,7 @@ static void mark_as_dependent(THD *thd,
/* store pointer on SELECT_LEX from which item is dependent */
if (mark_item)
mark_item->depended_from= last;
- current->mark_as_dependent(last);
+ current->mark_as_dependent(last, resolved_item);
if (thd->lex->describe & DESCRIBE_EXTENDED)
{
char warn_buff[MYSQL_ERRMSG_SIZE];
=== modified file 'sql/item.h'
--- a/sql/item.h 2009-06-09 21:11:33 +0000
+++ b/sql/item.h 2009-06-22 11:46:31 +0000
@@ -888,6 +888,8 @@ public:
virtual bool reset_query_id_processor(uchar *query_id_arg) { return 0; }
virtual bool is_expensive_processor(uchar *arg) { return 0; }
virtual bool register_field_in_read_map(uchar *arg) { return 0; }
+ virtual bool check_column_usage_processor(uchar *arg) { return 0; }
+ virtual bool mark_as_eliminated_processor(uchar *arg) { return 0; }
/*
Check if a partition function is allowed
SYNOPSIS
@@ -1012,6 +1014,14 @@ public:
};
+typedef struct
+{
+ table_map allowed_tables;
+ TABLE *table;
+ uint keyno;
+ uint needed_key_parts;
+} Field_processor_info;
+
class sp_head;
@@ -1477,6 +1487,7 @@ public:
bool find_item_in_field_list_processor(uchar *arg);
bool register_field_in_read_map(uchar *arg);
bool check_partition_func_processor(uchar *int_arg) {return FALSE;}
+ bool check_column_usage_processor(uchar *arg);
void cleanup();
bool result_as_longlong()
{
=== modified file 'sql/item_subselect.cc'
--- a/sql/item_subselect.cc 2009-01-31 21:22:44 +0000
+++ b/sql/item_subselect.cc 2009-06-22 11:46:31 +0000
@@ -39,7 +39,7 @@ inline Item * and_items(Item* cond, Item
Item_subselect::Item_subselect():
Item_result_field(), value_assigned(0), thd(0), substitution(0),
engine(0), old_engine(0), used_tables_cache(0), have_to_be_excluded(0),
- const_item_cache(1), engine_changed(0), changed(0), is_correlated(FALSE)
+ const_item_cache(1), in_fix_fields(0), engine_changed(0), changed(0), is_correlated(FALSE)
{
with_subselect= 1;
reset();
@@ -151,10 +151,14 @@ bool Item_subselect::fix_fields(THD *thd
DBUG_ASSERT(fixed == 0);
engine->set_thd((thd= thd_param));
+ if (!in_fix_fields)
+ refers_to.empty();
+ eliminated= FALSE;
if (check_stack_overrun(thd, STACK_MIN_SIZE, (uchar*)&res))
return TRUE;
-
+
+ in_fix_fields++;
res= engine->prepare();
// all transformation is done (used by prepared statements)
@@ -181,12 +185,14 @@ bool Item_subselect::fix_fields(THD *thd
if (!(*ref)->fixed)
ret= (*ref)->fix_fields(thd, ref);
thd->where= save_where;
+ in_fix_fields--;
return ret;
}
// Is it one field subselect?
if (engine->cols() > max_columns)
{
my_error(ER_OPERAND_COLUMNS, MYF(0), 1);
+ in_fix_fields--;
return TRUE;
}
fix_length_and_dec();
@@ -203,11 +209,30 @@ bool Item_subselect::fix_fields(THD *thd
fixed= 1;
err:
+ in_fix_fields--;
thd->where= save_where;
return res;
}
+bool Item_subselect::check_column_usage_processor(uchar *arg)
+{
+ List_iterator<Item> it(refers_to);
+ Item *item;
+ while ((item= it++))
+ {
+ if (item->walk(&Item::check_column_usage_processor,FALSE, arg))
+ return TRUE;
+ }
+ return FALSE;
+}
+
+bool Item_subselect::mark_as_eliminated_processor(uchar *arg)
+{
+ eliminated= TRUE;
+ return FALSE;
+}
+
bool Item_subselect::walk(Item_processor processor, bool walk_subquery,
uchar *argument)
{
@@ -225,6 +250,7 @@ bool Item_subselect::walk(Item_processor
if (lex->having && (lex->having)->walk(processor, walk_subquery,
argument))
return 1;
+ /* TODO: why doesn't this walk the OUTER JOINs' ON expressions */
while ((item=li++))
{
=== modified file 'sql/item_subselect.h'
--- a/sql/item_subselect.h 2008-02-22 10:30:33 +0000
+++ b/sql/item_subselect.h 2009-06-22 11:46:31 +0000
@@ -52,8 +52,16 @@ protected:
bool have_to_be_excluded;
/* cache of constant state */
bool const_item_cache;
-
+
public:
+ /*
+ References from inside the subquery to the select that this predicate is
+ in. References to parent selects not included.
+ */
+ List<Item> refers_to;
+ int in_fix_fields;
+ bool eliminated;
+
/* changed engine indicator */
bool engine_changed;
/* subquery is transformed */
@@ -126,6 +134,8 @@ public:
virtual void reset_value_registration() {}
enum_parsing_place place() { return parsing_place; }
bool walk(Item_processor processor, bool walk_subquery, uchar *arg);
+ bool mark_as_eliminated_processor(uchar *arg);
+ bool check_column_usage_processor(uchar *arg);
/**
Get the SELECT_LEX structure associated with this Item.
=== modified file 'sql/item_sum.cc'
--- a/sql/item_sum.cc 2009-06-09 21:11:33 +0000
+++ b/sql/item_sum.cc 2009-06-22 11:46:31 +0000
@@ -350,7 +350,7 @@ bool Item_sum::register_sum_func(THD *th
sl= sl->master_unit()->outer_select() )
sl->master_unit()->item->with_sum_func= 1;
}
- thd->lex->current_select->mark_as_dependent(aggr_sel);
+ thd->lex->current_select->mark_as_dependent(aggr_sel, NULL);
return FALSE;
}
=== modified file 'sql/sql_lex.cc'
--- a/sql/sql_lex.cc 2009-04-25 10:05:32 +0000
+++ b/sql/sql_lex.cc 2009-06-22 11:46:31 +0000
@@ -1778,7 +1778,7 @@ void st_select_lex_unit::exclude_tree()
'last' should be reachable from this st_select_lex_node
*/
-void st_select_lex::mark_as_dependent(st_select_lex *last)
+void st_select_lex::mark_as_dependent(st_select_lex *last, Item *dependency)
{
/*
Mark all selects from resolved to 1 before select where was
@@ -1804,6 +1804,8 @@ void st_select_lex::mark_as_dependent(st
}
is_correlated= TRUE;
this->master_unit()->item->is_correlated= TRUE;
+ if (dependency)
+ this->master_unit()->item->refers_to.push_back(dependency);
}
bool st_select_lex_node::set_braces(bool value) { return 1; }
=== modified file 'sql/sql_lex.h'
--- a/sql/sql_lex.h 2009-03-17 20:29:24 +0000
+++ b/sql/sql_lex.h 2009-06-22 11:46:31 +0000
@@ -743,7 +743,7 @@ public:
return master_unit()->return_after_parsing();
}
- void mark_as_dependent(st_select_lex *last);
+ void mark_as_dependent(st_select_lex *last, Item *dependency);
bool set_braces(bool value);
bool inc_in_sum_expr();
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc 2009-06-17 05:27:39 +0000
+++ b/sql/sql_select.cc 2009-06-22 11:46:31 +0000
@@ -2478,6 +2478,14 @@ static ha_rows get_quick_record_count(TH
}
+typedef struct st_keyuse_w_needed_reg
+{
+ KEYUSE *first;
+ key_part_map second;
+
+} Keyuse_w_needed_reg;
+
+static
bool has_eq_ref_access_candidate(TABLE *table, table_map can_refer_to_these)
{
KEYUSE *keyuse= table->reginfo.join_tab->keyuse;
@@ -2494,24 +2502,85 @@ bool has_eq_ref_access_candidate(TABLE *
{
uint key= keyuse->key;
key_part_map bound_parts=0;
- bool ft_key= test(keyuse->keypart == FT_KEYPART);
-
+ uint n_unusable=0;
+ bool ft_key= test(keyuse->keypart == FT_KEYPART);
+ KEY *keyinfo= table->key_info + key;
+ KEYUSE *key_start = keyuse;
+
do /* For each keypart and each way to read it */
{
- if (!(keyuse->used_tables & ~can_refer_to_these) &&
- !(keyuse->optimize & KEY_OPTIMIZE_REF_OR_NULL))
+ if (keyuse->usable)
{
- bound_parts |= keyuse->keypart_map;
+ if(!(keyuse->used_tables & ~can_refer_to_these) &&
+ !(keyuse->optimize & KEY_OPTIMIZE_REF_OR_NULL))
+ {
+ bound_parts |= keyuse->keypart_map;
+ }
}
+ else
+ n_unusable++;
keyuse++;
- } while (keyuse->table && keyuse->key == key);
+ } while (keyuse->table == table && keyuse->key == key);
+
+ if (ft_key || ((keyinfo->flags & (HA_NOSAME | HA_NULL_PART_KEY))
+ != HA_NOSAME))
+ {
+ continue;
+ }
- KEY *keyinfo= table->key_info + key;
- if (!ft_key &&
- ((keyinfo->flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME) &&
- bound_parts == PREV_BITS(key_part_map, keyinfo->key_parts))
- {
+ if (bound_parts == PREV_BITS(key_part_map, keyinfo->key_parts))
return TRUE;
+ /*
+ Ok, usable keyuse elements didn't help us. Try making use of
+ unusable KEYUSEs (psergey-todo: sane comments:)
+ */
+ if (n_unusable && bound_parts)
+ {
+ /*
+ Check if unusable KEYUSE elements cause all parts of key to be
+ bound. An unusable keyuse element makes a key part bound when it
+ represents the following:
+
+ keyXpartY=func(bound_columns, preceding_tables)
+
+ .
+ */
+ Keyuse_w_needed_reg *uses;
+ if (!(uses= (Keyuse_w_needed_reg*)my_alloca(sizeof(Keyuse_w_needed_reg)*n_unusable)))
+ return FALSE;
+ uint n_uses=0;
+ for (KEYUSE *k= key_start; k!=keyuse; k++)
+ {
+ if (!k->usable && !(k->used_tables & ~can_refer_to_these))
+ {
+ //Walk k->val and check which key parts it depends on.
+ Field_processor_info fp= {can_refer_to_these, table, k->key, 0};
+ if (!k->val->walk(&Item::check_column_usage_processor, FALSE,
+ (uchar*)&fp))
+ {
+ uses[n_uses].first= k;
+ uses[n_uses].second= fp.needed_key_parts;
+ n_uses++;
+ }
+ }
+ }
+ /* Now compute transitive closure */
+ uint n_bounded;
+ do
+ {
+ n_bounded= 0;
+ for (uint i=0; i< n_uses; i++)
+ {
+ /* needed_parts is covered by what is already bound*/
+ if (!(uses[i].second & ~bound_parts))
+ {
+ bound_parts|= key_part_map(1) << uses[i].first->keypart;
+ n_bounded++;
+ }
+ if (bound_parts == PREV_BITS(key_part_map, keyinfo->key_parts))
+ return TRUE;
+ }
+ } while (n_bounded != 0);
}
}
}
@@ -2657,6 +2726,7 @@ eliminate_tables_for_join_list(JOIN *joi
eliminated += tbl->nested_join->join_list.elements;
//psergey-todo: do we need to do anything about removing the join
//nest?
+ tbl->on_expr->walk(&Item::mark_as_eliminated_processor, FALSE, NULL);
}
else
{
@@ -2673,6 +2743,7 @@ eliminate_tables_for_join_list(JOIN *joi
{
mark_table_as_eliminated(join, tbl->table, const_tbl_count,
const_tables);
+ tbl->on_expr->walk(&Item::mark_as_eliminated_processor, FALSE, NULL);
eliminated += 1;
}
}
@@ -3065,14 +3136,16 @@ make_join_statistics(JOIN *join, TABLE_L
{
start_keyuse=keyuse;
key=keyuse->key;
- s->keys.set_bit(key); // QQ: remove this ?
+ if (keyuse->usable)
+ s->keys.set_bit(key); // QQ: remove this ?
refs=0;
const_ref.clear_all();
eq_part.clear_all();
do
{
- if (keyuse->val->type() != Item::NULL_ITEM && !keyuse->optimize)
+ if (keyuse->usable && keyuse->val->type() != Item::NULL_ITEM &&
+ !keyuse->optimize)
{
if (!((~found_const_table_map) & keyuse->used_tables))
const_ref.set_bit(keyuse->keypart);
@@ -3276,6 +3349,7 @@ typedef struct key_field_t {
*/
bool null_rejecting;
bool *cond_guard; /* See KEYUSE::cond_guard */
+ bool usable;
} KEY_FIELD;
@@ -3284,6 +3358,26 @@ typedef struct key_field_t {
This is called for OR between different levels.
+ That is, the function operates on an array of KEY_FIELD elements which has
+ two parts:
+
+ $LEFT_PART $RIGHT_PART
+ +-----------------------+-----------------------+
+ start new_fields end
+
+ $LEFT_PART and $RIGHT_PART are arrays that have KEY_FIELD elements for two
+ parts of the OR condition. Our task is to produce an array of KEY_FIELD
+ elements that would correspond to "$LEFT_PART OR $RIGHT_PART".
+
+ The rules for combining elements are as follows:
+ (keyfieldA1 AND keyfieldA2 AND ...) OR (keyfieldB1 AND keyfieldB2 AND ...)=
+ AND_ij (keyfieldA_i OR keyfieldB_j)
+
+ We discard all (keyfieldA_i OR keyfieldB_j) that refer to different
+ fields. For those referring to the same field, the logic is as follows:
+
+ t.keycol=
+
To be able to do 'ref_or_null' we merge a comparison of a column
and 'column IS NULL' to one test. This is useful for sub select queries
that are internally transformed to something like:.
@@ -3348,13 +3442,18 @@ merge_key_fields(KEY_FIELD *start,KEY_FI
KEY_OPTIMIZE_REF_OR_NULL));
old->null_rejecting= (old->null_rejecting &&
new_fields->null_rejecting);
+ /*
+ The conditions are the same, hence their usabilities should
+ be, too (TODO: shouldn't that apply to the above
+ null_rejecting and optimize attributes?)
+ */
+ DBUG_ASSERT(old->usable == new_fields->usable);
}
}
else if (old->eq_func && new_fields->eq_func &&
old->val->eq_by_collation(new_fields->val,
old->field->binary(),
old->field->charset()))
-
{
old->level= and_level;
old->optimize= ((old->optimize & new_fields->optimize &
@@ -3363,10 +3462,14 @@ merge_key_fields(KEY_FIELD *start,KEY_FI
KEY_OPTIMIZE_REF_OR_NULL));
old->null_rejecting= (old->null_rejecting &&
new_fields->null_rejecting);
+ // "t.key_col=const" predicates are always usable
+ DBUG_ASSERT(old->usable && new_fields->usable);
}
else if (old->eq_func && new_fields->eq_func &&
- ((old->val->const_item() && old->val->is_null()) ||
- new_fields->val->is_null()))
+ ((new_fields->usable && old->val->const_item() &&
+ old->val->is_null()) ||
+ ((old->usable && new_fields->val->is_null()))))
+ /* TODO ^ why is the above asymmetric, why const_item()? */
{
/* field = expression OR field IS NULL */
old->level= and_level;
@@ -3437,6 +3540,7 @@ add_key_field(KEY_FIELD **key_fields,uin
table_map usable_tables, SARGABLE_PARAM **sargables)
{
uint exists_optimize= 0;
+ bool optimizable=0;
if (!(field->flags & PART_KEY_FLAG))
{
// Don't remove column IS NULL on a LEFT JOIN table
@@ -3449,15 +3553,15 @@ add_key_field(KEY_FIELD **key_fields,uin
else
{
table_map used_tables=0;
- bool optimizable=0;
for (uint i=0; i<num_values; i++)
{
used_tables|=(value[i])->used_tables();
if (!((value[i])->used_tables() & (field->table->map | RAND_TABLE_BIT)))
optimizable=1;
}
- if (!optimizable)
- return;
+ // psergey-tbl-elim:
+ // if (!optimizable)
+ // return;
if (!(usable_tables & field->table->map))
{
if (!eq_func || (*value)->type() != Item::NULL_ITEM ||
@@ -3470,7 +3574,8 @@ add_key_field(KEY_FIELD **key_fields,uin
JOIN_TAB *stat=field->table->reginfo.join_tab;
key_map possible_keys=field->key_start;
possible_keys.intersect(field->table->keys_in_use_for_query);
- stat[0].keys.merge(possible_keys); // Add possible keys
+ if (optimizable)
+ stat[0].keys.merge(possible_keys); // Add possible keys
/*
Save the following cases:
@@ -3563,6 +3668,7 @@ add_key_field(KEY_FIELD **key_fields,uin
(*key_fields)->val= *value;
(*key_fields)->level= and_level;
(*key_fields)->optimize= exists_optimize;
+ (*key_fields)->usable= optimizable;
/*
If the condition has form "tbl.keypart = othertbl.field" and
othertbl.field can be NULL, there will be no matches if othertbl.field
@@ -3874,6 +3980,7 @@ add_key_part(DYNAMIC_ARRAY *keyuse_array
keyuse.optimize= key_field->optimize & KEY_OPTIMIZE_REF_OR_NULL;
keyuse.null_rejecting= key_field->null_rejecting;
keyuse.cond_guard= key_field->cond_guard;
+ keyuse.usable= key_field->usable;
VOID(insert_dynamic(keyuse_array,(uchar*) &keyuse));
}
}
@@ -3954,6 +4061,11 @@ sort_keyuse(KEYUSE *a,KEYUSE *b)
return (int) (a->key - b->key);
if (a->keypart != b->keypart)
return (int) (a->keypart - b->keypart);
+
+ // Usable ones go before the unusable
+ if (a->usable != b->usable)
+ return (int)a->usable - (int)b->usable;
+
// Place const values before other ones
if ((res= test((a->used_tables & ~OUTER_REF_TABLE_BIT)) -
test((b->used_tables & ~OUTER_REF_TABLE_BIT))))
@@ -4164,7 +4276,8 @@ update_ref_and_keys(THD *thd, DYNAMIC_AR
found_eq_constant=0;
for (i=0 ; i < keyuse->elements-1 ; i++,use++)
{
- if (!use->used_tables && use->optimize != KEY_OPTIMIZE_REF_OR_NULL)
+ if (use->usable && !use->used_tables &&
+ use->optimize != KEY_OPTIMIZE_REF_OR_NULL)
use->table->const_key_parts[use->key]|= use->keypart_map;
if (use->keypart != FT_KEYPART)
{
@@ -4188,7 +4301,8 @@ update_ref_and_keys(THD *thd, DYNAMIC_AR
/* Save ptr to first use */
if (!use->table->reginfo.join_tab->keyuse)
use->table->reginfo.join_tab->keyuse=save_pos;
- use->table->reginfo.join_tab->checked_keys.set_bit(use->key);
+ if (use->usable)
+ use->table->reginfo.join_tab->checked_keys.set_bit(use->key);
save_pos++;
}
i=(uint) (save_pos-(KEYUSE*) keyuse->buffer);
@@ -4218,7 +4332,7 @@ static void optimize_keyuse(JOIN *join,
To avoid bad matches, we don't make ref_table_rows less than 100.
*/
keyuse->ref_table_rows= ~(ha_rows) 0; // If no ref
- if (keyuse->used_tables &
+ if (keyuse->usable && keyuse->used_tables &
(map= (keyuse->used_tables & ~join->const_table_map &
~OUTER_REF_TABLE_BIT)))
{
@@ -4411,7 +4525,8 @@ best_access_path(JOIN *join,
if 1. expression doesn't refer to forward tables
2. we won't get two ref-or-null's
*/
- if (!(remaining_tables & keyuse->used_tables) &&
+ if (keyuse->usable &&
+ !(remaining_tables & keyuse->used_tables) &&
!(ref_or_null_part && (keyuse->optimize &
KEY_OPTIMIZE_REF_OR_NULL)))
{
@@ -5915,9 +6030,11 @@ static bool create_ref_for_key(JOIN *joi
uint i;
for (i=0 ; i < keyparts ; keyuse++,i++)
{
- while (keyuse->keypart != i ||
- ((~used_tables) & keyuse->used_tables))
+ while (keyuse->keypart != i || ((~used_tables) & keyuse->used_tables) ||
+ !keyuse->usable)
+ {
keyuse++; /* Skip other parts */
+ }
uint maybe_null= test(keyinfo->key_part[i].null_bit);
j->ref.items[i]=keyuse->val; // Save for cond removal
@@ -16853,8 +16970,11 @@ static void select_describe(JOIN *join,
unit;
unit= unit->next_unit())
{
- if (mysql_explain_union(thd, unit, result))
- DBUG_VOID_RETURN;
+ if (!(unit->item && unit->item->eliminated))
+ {
+ if (mysql_explain_union(thd, unit, result))
+ DBUG_VOID_RETURN;
+ }
}
DBUG_VOID_RETURN;
}
=== modified file 'sql/sql_select.h'
--- a/sql/sql_select.h 2009-06-14 12:35:04 +0000
+++ b/sql/sql_select.h 2009-06-22 11:46:31 +0000
@@ -51,6 +51,7 @@ typedef struct keyuse_t {
NULL - Otherwise (the source equality can't be turned off)
*/
bool *cond_guard;
+ bool usable;
} KEYUSE;
class store_key;

[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2715)
by knielsen@knielsen-hq.org 22 Jun '09
#At lp:maria
2715 knielsen(a)knielsen-hq.org 2009-06-22
More XtraDB after-merge fixes following review and buildbot runs:
- Better fix for --innodb-use-sys-malloc causing Valgrind warnings.
- Different fix for INNODB_IBUF_MAX_SIZE variable changing default value.
- Fix some problems with the safe mutex lazy init patch.
modified:
mysql-test/include/mtr_check.sql
mysql-test/lib/mtr_cases.pm
mysql-test/mysql-test-run.pl
mysys/thr_mutex.c
storage/xtradb/ibuf/ibuf0ibuf.c
per-file messages:
mysql-test/include/mtr_check.sql
Do not check INNODB_IBUF_MAX_SIZE for changes. It is not a dynamic variable, so cannot
be changed by a test case anyway, and the value may vary slightly from one start of the
server to the next.
mysql-test/lib/mtr_cases.pm
Even just starting and stopping the server with --innodb-use-sys-malloc to check for
disabled test case under valgrind will cause valgrind leak warnings. So add not_valgrind
to the list of conditions also tested for directly in mysql-test-run.pl.
mysql-test/mysql-test-run.pl
Even just starting and stopping the server with --innodb-use-sys-malloc to check for
disabled test case under valgrind will cause valgrind leak warnings. So add not_valgrind
to the list of conditions also tested for directly in mysql-test-run.pl.
mysys/thr_mutex.c
Fix a few problems found during review of the lazy init safe mutex patch.
storage/xtradb/ibuf/ibuf0ibuf.c
Revert previous fix of INNODB_IBUF_MAX_SIZE default varying slightly between server starts.
(Fixed instead by ignoring that variable in the test suite).
=== modified file 'mysql-test/include/mtr_check.sql'
--- a/mysql-test/include/mtr_check.sql 2009-02-19 09:01:25 +0000
+++ b/mysql-test/include/mtr_check.sql 2009-06-22 08:06:35 +0000
@@ -12,7 +12,9 @@ BEGIN
-- Dump all global variables except those
-- that are supposed to change
SELECT * FROM INFORMATION_SCHEMA.GLOBAL_VARIABLES
- WHERE variable_name != 'timestamp' and variable_name != "debug" order by variable_name;
+ WHERE variable_name != 'timestamp' AND variable_name != "debug"
+ AND variable_name != 'INNODB_IBUF_MAX_SIZE'
+ ORDER BY variable_name;
-- Dump all databases, there should be none
-- except those that was created during bootstrap
=== modified file 'mysql-test/lib/mtr_cases.pm'
--- a/mysql-test/lib/mtr_cases.pm 2009-03-20 14:39:37 +0000
+++ b/mysql-test/lib/mtr_cases.pm 2009-06-22 08:06:35 +0000
@@ -970,6 +970,16 @@ sub collect_one_test_case {
}
}
+ if ( $tinfo->{'not_valgrind'} )
+ {
+ if ( $::opt_valgrind_mysqld )
+ {
+ $tinfo->{'skip'}= 1;
+ $tinfo->{'comment'}= "Not compatible with Valgrind testing";
+ return $tinfo;
+ }
+ }
+
# ----------------------------------------------------------------------
# Find config file to use if not already selected in <testname>.opt file
# ----------------------------------------------------------------------
@@ -1050,6 +1060,7 @@ my @tags=
["include/ndb_master-slave.inc", "ndb_test", 1],
["federated.inc", "federated_test", 1],
["include/not_embedded.inc", "not_embedded", 1],
+ ["include/not_valgrind.inc", "not_valgrind", 1],
);
=== modified file 'mysql-test/mysql-test-run.pl'
--- a/mysql-test/mysql-test-run.pl 2009-06-18 12:39:21 +0000
+++ b/mysql-test/mysql-test-run.pl 2009-06-22 08:06:35 +0000
@@ -224,7 +224,7 @@ my $opt_strace_client;
our $opt_user = "root";
my $opt_valgrind= 0;
-my $opt_valgrind_mysqld= 0;
+our $opt_valgrind_mysqld= 0;
my $opt_valgrind_mysqltest= 0;
my @default_valgrind_args= ("--show-reachable=yes");
my @valgrind_args;
=== modified file 'mysys/thr_mutex.c'
--- a/mysys/thr_mutex.c 2009-06-09 15:08:46 +0000
+++ b/mysys/thr_mutex.c 2009-06-22 08:06:35 +0000
@@ -160,6 +160,9 @@ static int safe_mutex_lazy_init_deadlock
&mp->locked_mutex, sizeof(*mp->locked_mutex),
&mp->used_mutex, sizeof(*mp->used_mutex), NullS))
{
+ /* Disable deadlock handling for this mutex */
+ mp->create_flags|= MYF_NO_DEADLOCK_DETECTION;
+ mp->active_flags|= MYF_NO_DEADLOCK_DETECTION;
return 1; /* Error */
}
@@ -196,6 +199,9 @@ int safe_mutex_init(safe_mutex_t *mp,
mp->line= line;
/* Skip the very common '&' prefix from the autogenerated name */
mp->name= name[0] == '&' ? name + 1 : name;
+
+ if (!safe_mutex_deadlock_detector)
+ my_flags|= MYF_NO_DEADLOCK_DETECTION;
/* Deadlock detection is initialised only lazily, on first use. */
mp->create_flags= my_flags;
=== modified file 'storage/xtradb/ibuf/ibuf0ibuf.c'
--- a/storage/xtradb/ibuf/ibuf0ibuf.c 2009-06-09 15:08:46 +0000
+++ b/storage/xtradb/ibuf/ibuf0ibuf.c 2009-06-22 08:06:35 +0000
@@ -422,12 +422,7 @@ ibuf_init_at_db_start(void)
grow in size, as the references on the upper levels of the tree can
change */
- /* The default for ibuf_max_size is calculated from the requested
- buffer pool size srv_buf_pool_size, not the actual size as returned
- by buf_pool_get_curr_size(). The latter can differ from the former
- by one page due to alignment requirements, and we do not want a
- user-visible variable like INNODB_IBUF_MAX_SIZE to vary at random. */
- ibuf->max_size = ut_min( srv_buf_pool_size / UNIV_PAGE_SIZE
+ ibuf->max_size = ut_min( buf_pool_get_curr_size() / UNIV_PAGE_SIZE
/ IBUF_POOL_SIZE_PER_MAX_SIZE, (ulint) srv_ibuf_max_size / UNIV_PAGE_SIZE);
srv_ibuf_max_size = (long long) ibuf->max_size * UNIV_PAGE_SIZE;

[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (sanja:2712) Bug#41098
by sanja@askmonty.org 20 Jun '09
#At lp:maria
2712 sanja(a)askmonty.org 2009-06-11
Real fix for Bug#41098 (http://bugs.mysql.com/bug.php?id=41098): invalidate tables changed by INSERT after unlocking the tables, when the result of the insert actually becomes visible.
modified:
sql/handler.h
sql/mysql_priv.h
sql/sql_base.cc
sql/sql_cache.cc
sql/sql_cache.h
sql/sql_delete.cc
sql/sql_insert.cc
sql/sql_load.cc
sql/sql_parse.cc
sql/sql_partition.cc
sql/sql_rename.cc
sql/sql_table.cc
sql/sql_update.cc
sql/sql_view.cc
sql/table.h
storage/maria/ha_maria.h
storage/myisam/ha_myisam.h
per-file messages:
sql/handler.h
Table cache type for non-transactional tables whose insert results become visible to other threads only on unlock.
sql/mysql_priv.h
Invalidate call changed.
sql/sql_base.cc
Invalidation of marked tables.
sql/sql_cache.cc
Marking tables for on-unlock-invalidation added.
sql/sql_cache.h
Invalidate call changed.
sql/sql_delete.cc
Invalidate call changed.
sql/sql_insert.cc
Invalidate call changed.
sql/sql_load.cc
Invalidate call changed.
sql/sql_parse.cc
Invalidate call changed.
sql/sql_partition.cc
Invalidate call changed.
sql/sql_rename.cc
Invalidate call changed.
sql/sql_table.cc
Invalidate call changed.
sql/sql_update.cc
Invalidate call changed.
sql/sql_view.cc
Invalidate call changed.
sql/table.h
Mark for tables which need query cache invalidation on unlock.
storage/maria/ha_maria.h
MyISAM and maria 1.5 use the new type of table for query cache.
storage/myisam/ha_myisam.h
MyISAM and maria 1.5 use the new type of table for query cache.
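A simplified, self-contained sketch of the deferred invalidation this patch implements (not the server code; the real logic is in Query_cache::invalidate() and close_thread_table() in the hunks below, and cache_invalidate() here stands in for Query_cache::invalidate_table()):

#define HA_CACHE_TBL_NTRNS_INS2LOCK 8       /* new constant from sql/handler.h */

struct table_stub
{
  int  cache_type;                          /* handler::table_cache_type() */
  bool changed_in_insert;                   /* new flag from sql/table.h   */
};

static void cache_invalidate(table_stub *t) { (void) t; /* drop cached queries */ }

/* Where query_cache_invalidate4(..., insert=TRUE) ends up for such tables */
static void invalidate_after_insert(table_stub *t)
{
  if (t->cache_type == HA_CACHE_TBL_NTRNS_INS2LOCK)
  {
    t->changed_in_insert= true;             /* other threads: defer to unlock */
    cache_invalidate(t);                    /* this thread: invalidate now    */
  }
  else
    cache_invalidate(t);
}

/* Called from close_thread_table(): the inserted rows are now visible */
static void invalidate_on_unlock(table_stub *t)
{
  if (t->changed_in_insert)
  {
    t->changed_in_insert= false;
    cache_invalidate(t);
  }
}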
=== modified file 'sql/handler.h'
--- a/sql/handler.h 2009-02-19 09:01:25 +0000
+++ b/sql/handler.h 2009-06-11 12:45:53 +0000
@@ -247,6 +247,11 @@
#define HA_CACHE_TBL_NOCACHE 1
#define HA_CACHE_TBL_ASKTRANSACT 2
#define HA_CACHE_TBL_TRANSACT 4
+/**
+ Non transactional table but insert results visible for other threads
+ only on unlock
+*/
+#define HA_CACHE_TBL_NTRNS_INS2LOCK 8
/* Options of START TRANSACTION statement (and later of SET TRANSACTION stmt) */
#define MYSQL_START_TRANS_OPT_WITH_CONS_SNAPSHOT 1
=== modified file 'sql/mysql_priv.h'
--- a/sql/mysql_priv.h 2009-04-25 10:05:32 +0000
+++ b/sql/mysql_priv.h 2009-06-11 12:45:53 +0000
@@ -893,7 +893,7 @@ struct Query_cache_query_flags
#define query_cache_init() query_cache.init()
#define query_cache_resize(A) query_cache.resize(A)
#define query_cache_set_min_res_unit(A) query_cache.set_min_res_unit(A)
-#define query_cache_invalidate3(A, B, C) query_cache.invalidate(A, B, C)
+#define query_cache_invalidate4(A, B, C, D) query_cache.invalidate(A, B, C, D)
#define query_cache_invalidate1(A) query_cache.invalidate(A)
#define query_cache_send_result_to_client(A, B, C) \
query_cache.send_result_to_client(A, B, C)
@@ -912,7 +912,7 @@ struct Query_cache_query_flags
#define query_cache_init()
#define query_cache_resize(A)
#define query_cache_set_min_res_unit(A)
-#define query_cache_invalidate3(A, B, C)
+#define query_cache_invalidate4(A, B, C, D)
#define query_cache_invalidate1(A)
#define query_cache_send_result_to_client(A, B, C) 0
#define query_cache_invalidate_by_MyISAM_filename_ref NULL
=== modified file 'sql/sql_base.cc'
--- a/sql/sql_base.cc 2009-05-19 09:28:05 +0000
+++ b/sql/sql_base.cc 2009-06-11 12:45:53 +0000
@@ -1373,6 +1373,15 @@ bool close_thread_table(THD *thd, TABLE
table->s->table_name.str, (long) table));
*table_ptr=table->next;
+
+ /* Invalidate if the table is marked as changed in insert
+ (not all tables have such marks) */
+ if (table->changed_in_insert)
+ {
+ table->changed_in_insert= FALSE;
+ query_cache_invalidate4(thd, table, FALSE, FALSE);
+ }
+
/*
When closing a MERGE parent or child table, detach the children first.
Clear child table references to force new assignment at next open.
=== modified file 'sql/sql_cache.cc'
--- a/sql/sql_cache.cc 2009-04-25 10:05:32 +0000
+++ b/sql/sql_cache.cc 2009-06-11 12:45:53 +0000
@@ -1542,12 +1542,19 @@ err:
}
-/*
+/**
Remove all cached queries that uses any of the tables in the list
+
+ @param thd Thread handler
+ @param tables_used List of tables used in this operation
+ @param using_transactions Not in autocommit mode
+ @param insert It is insert operation
+
*/
void Query_cache::invalidate(THD *thd, TABLE_LIST *tables_used,
- my_bool using_transactions)
+ my_bool using_transactions,
+ my_bool insert)
{
DBUG_ENTER("Query_cache::invalidate (table list)");
@@ -1567,6 +1574,15 @@ void Query_cache::invalidate(THD *thd, T
force transaction finish.
*/
thd->add_changed_table(tables_used->table);
+ else if (insert &&
+ (tables_used->table->file->table_cache_type() ==
+ HA_CACHE_TBL_NTRNS_INS2LOCK))
+ {
+ /* for other threads */
+ tables_used->table->changed_in_insert= TRUE;
+ /* for this thread */
+ invalidate_table(thd, tables_used);
+ }
else
invalidate_table(thd, tables_used);
}
@@ -1619,12 +1635,18 @@ void Query_cache::invalidate_locked_for_
DBUG_VOID_RETURN;
}
-/*
+/**
Remove all cached queries that uses the given table
+
+ @param thd Thread handler
+ @param table TABLE descriptor
+ @param using_transactions Not in autocommit mode
+ @param insert It is insert operation
*/
-void Query_cache::invalidate(THD *thd, TABLE *table,
- my_bool using_transactions)
+void Query_cache::invalidate(THD *thd, TABLE *table,
+ my_bool using_transactions,
+ my_bool insert)
{
DBUG_ENTER("Query_cache::invalidate (table)");
@@ -1633,13 +1655,22 @@ void Query_cache::invalidate(THD *thd, T
if (using_transactions &&
(table->file->table_cache_type() == HA_CACHE_TBL_TRANSACT))
thd->add_changed_table(table);
+ else if (insert &&
+ (table->file->table_cache_type() ==
+ HA_CACHE_TBL_NTRNS_INS2LOCK))
+ {
+ /* for other threads */
+ table->changed_in_insert= TRUE;
+ /* for this thread */
+ invalidate_table(thd, table);
+ }
else
invalidate_table(thd, table);
-
DBUG_VOID_RETURN;
}
+
void Query_cache::invalidate(THD *thd, const char *key, uint32 key_length,
my_bool using_transactions)
{
=== modified file 'sql/sql_cache.h'
--- a/sql/sql_cache.h 2008-07-24 13:41:55 +0000
+++ b/sql/sql_cache.h 2009-06-11 12:45:53 +0000
@@ -445,10 +445,11 @@ protected:
/* Remove all queries that uses any of the listed following tables */
void invalidate(THD* thd, TABLE_LIST *tables_used,
- my_bool using_transactions);
+ my_bool using_transactions, my_bool insert);
void invalidate(CHANGED_TABLE_LIST *tables_used);
void invalidate_locked_for_write(TABLE_LIST *tables_used);
- void invalidate(THD* thd, TABLE *table, my_bool using_transactions);
+ void invalidate(THD* thd, TABLE *table,
+ my_bool using_transactions, my_bool insert);
void invalidate(THD *thd, const char *key, uint32 key_length,
my_bool using_transactions);
=== modified file 'sql/sql_delete.cc'
--- a/sql/sql_delete.cc 2009-04-25 09:04:38 +0000
+++ b/sql/sql_delete.cc 2009-06-11 12:45:53 +0000
@@ -370,7 +370,7 @@ cleanup:
*/
if (deleted)
{
- query_cache_invalidate3(thd, table_list, 1);
+ query_cache_invalidate4(thd, table_list, TRUE, FALSE);
}
delete select;
@@ -783,7 +783,7 @@ void multi_delete::abort()
/* Something already deleted so we have to invalidate cache */
if (deleted)
- query_cache_invalidate3(thd, delete_tables, 1);
+ query_cache_invalidate4(thd, delete_tables, TRUE, FALSE);
/*
If rows from the first table only has been deleted and it is
@@ -933,7 +933,7 @@ bool multi_delete::send_eof()
*/
if (deleted)
{
- query_cache_invalidate3(thd, delete_tables, 1);
+ query_cache_invalidate4(thd, delete_tables, TRUE, FALSE);
}
if ((local_error == 0) || thd->transaction.stmt.modified_non_trans_table)
{
@@ -1074,7 +1074,7 @@ bool mysql_truncate(THD *thd, TABLE_LIST
error= ha_create_table(thd, path, table_list->db, table_list->table_name,
&create_info, 1);
VOID(pthread_mutex_unlock(&LOCK_open));
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
end:
if (!dont_send_ok)
=== modified file 'sql/sql_insert.cc'
--- a/sql/sql_insert.cc 2009-04-25 10:05:32 +0000
+++ b/sql/sql_insert.cc 2009-06-11 12:45:53 +0000
@@ -880,7 +880,7 @@ bool mysql_insert(THD *thd,TABLE_LIST *t
For the transactional algorithm to work the invalidation must be
before binlog writing and ha_autocommit_or_rollback
*/
- query_cache_invalidate3(thd, table_list, 1);
+ query_cache_invalidate4(thd, table_list, TRUE, TRUE);
}
if ((changed && error <= 0) ||
thd->transaction.stmt.modified_non_trans_table ||
@@ -2743,7 +2743,7 @@ bool Delayed_insert::handle_inserts(void
DBUG_PRINT("error", ("HA_EXTRA_NO_CACHE failed in loop"));
goto err;
}
- query_cache_invalidate3(&thd, table, 1);
+ query_cache_invalidate4(&thd, table, TRUE, TRUE);
if (thr_reschedule_write_lock(*thd.lock->locks))
{
/* This is not known to happen. */
@@ -2785,7 +2785,7 @@ bool Delayed_insert::handle_inserts(void
DBUG_PRINT("error", ("HA_EXTRA_NO_CACHE failed after loop"));
goto err;
}
- query_cache_invalidate3(&thd, table, 1);
+ query_cache_invalidate4(&thd, table, TRUE, TRUE);
pthread_mutex_lock(&mutex);
DBUG_RETURN(0);
@@ -3208,7 +3208,7 @@ bool select_insert::send_eof()
We must invalidate the table in the query cache before binlog writing
and ha_autocommit_or_rollback.
*/
- query_cache_invalidate3(thd, table, 1);
+ query_cache_invalidate4(thd, table, TRUE, TRUE);
if (thd->transaction.stmt.modified_non_trans_table)
thd->transaction.all.modified_non_trans_table= TRUE;
}
@@ -3299,7 +3299,7 @@ void select_insert::abort() {
if (!thd->current_stmt_binlog_row_based && !can_rollback_data())
thd->transaction.all.modified_non_trans_table= TRUE;
if (changed)
- query_cache_invalidate3(thd, table, 1);
+ query_cache_invalidate4(thd, table, TRUE, TRUE);
}
DBUG_ASSERT(transactional_table || !changed ||
thd->transaction.stmt.modified_non_trans_table);
=== modified file 'sql/sql_load.cc'
--- a/sql/sql_load.cc 2009-05-19 09:28:05 +0000
+++ b/sql/sql_load.cc 2009-06-11 12:45:53 +0000
@@ -446,7 +446,7 @@ int mysql_load(THD *thd,sql_exchange *ex
We must invalidate the table in query cache before binlog writing and
ha_autocommit_...
*/
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
if (error)
{
if (read_file_from_client)
=== modified file 'sql/sql_parse.cc'
--- a/sql/sql_parse.cc 2009-04-25 10:05:32 +0000
+++ b/sql/sql_parse.cc 2009-06-11 12:45:53 +0000
@@ -3211,7 +3211,7 @@ end_with_restore_list:
/* INSERT ... SELECT should invalidate only the very first table */
TABLE_LIST *save_table= first_table->next_local;
first_table->next_local= 0;
- query_cache_invalidate3(thd, first_table, 1);
+ query_cache_invalidate4(thd, first_table, TRUE, FALSE);
first_table->next_local= save_table;
}
delete sel_result;
=== modified file 'sql/sql_partition.cc'
--- a/sql/sql_partition.cc 2009-02-15 10:58:34 +0000
+++ b/sql/sql_partition.cc 2009-06-11 12:45:53 +0000
@@ -4012,7 +4012,7 @@ static int fast_end_partition(THD *thd,
thd->proc_info="end";
if (!is_empty)
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
error= ha_autocommit_or_rollback(thd, 0);
if (end_active_trans(thd))
=== modified file 'sql/sql_rename.cc'
--- a/sql/sql_rename.cc 2008-02-19 12:45:21 +0000
+++ b/sql/sql_rename.cc 2009-06-11 12:45:53 +0000
@@ -182,7 +182,7 @@ bool mysql_rename_tables(THD *thd, TABLE
}
if (!error)
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
pthread_mutex_lock(&LOCK_open);
unlock_table_names(thd, table_list, (TABLE_LIST*) 0);
=== modified file 'sql/sql_table.cc'
--- a/sql/sql_table.cc 2009-06-02 09:58:27 +0000
+++ b/sql/sql_table.cc 2009-06-11 12:45:53 +0000
@@ -1775,7 +1775,7 @@ int mysql_rm_table_part2(THD *thd, TABLE
if (some_tables_deleted || tmp_table_deleted || !error)
{
- query_cache_invalidate3(thd, tables, 0);
+ query_cache_invalidate4(thd, tables, FALSE, FALSE);
if (!dont_log_query)
{
if (!thd->current_stmt_binlog_row_based ||
@@ -4372,7 +4372,7 @@ static bool mysql_admin_table(THD* thd,
if (thd->killed)
goto err;
/* Flush entries in the query cache involving this table. */
- query_cache_invalidate3(thd, table->table, 0);
+ query_cache_invalidate4(thd, table->table, FALSE, FALSE);
open_for_modify= 0;
}
@@ -4628,7 +4628,7 @@ send_result_message:
pthread_mutex_unlock(&LOCK_open);
}
/* May be something modified consequently we have to invalidate cache */
- query_cache_invalidate3(thd, table->table, 0);
+ query_cache_invalidate4(thd, table->table, FALSE, FALSE);
}
}
ha_autocommit_or_rollback(thd, 0);
@@ -5163,7 +5163,7 @@ mysql_discard_or_import_tablespace(THD *
The 0 in the call below means 'not in a transaction', which means
immediate invalidation; that is probably what we wish here
*/
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
/* The ALTER TABLE is always in its own transaction */
error = ha_autocommit_or_rollback(thd, 0);
@@ -6433,7 +6433,7 @@ view_err:
unlink_open_table(thd, name_lock, FALSE);
VOID(pthread_mutex_unlock(&LOCK_open));
table_list->table= NULL; // For query cache
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
DBUG_RETURN(error);
}
@@ -7097,7 +7097,7 @@ view_err:
ha_flush_logs(old_db_type);
}
table_list->table=0; // For query cache
- query_cache_invalidate3(thd, table_list, 0);
+ query_cache_invalidate4(thd, table_list, FALSE, FALSE);
if (thd->locked_tables && (new_name != table_name || new_db != db))
{
=== modified file 'sql/sql_update.cc'
--- a/sql/sql_update.cc 2009-04-25 09:04:38 +0000
+++ b/sql/sql_update.cc 2009-06-11 12:45:53 +0000
@@ -779,7 +779,7 @@ int mysql_update(THD *thd,
*/
if (updated)
{
- query_cache_invalidate3(thd, table_list, 1);
+ query_cache_invalidate4(thd, table_list, TRUE, FALSE);
}
/*
@@ -1780,7 +1780,7 @@ void multi_update::abort()
/* Something already updated so we have to invalidate cache */
if (updated)
- query_cache_invalidate3(thd, update_tables, 1);
+ query_cache_invalidate4(thd, update_tables, TRUE, FALSE);
/*
If all tables that has been updated are trans safe then just do rollback.
If not attempt to do remaining updates.
@@ -2023,7 +2023,7 @@ bool multi_update::send_eof()
if (updated)
{
- query_cache_invalidate3(thd, update_tables, 1);
+ query_cache_invalidate4(thd, update_tables, TRUE, FALSE);
}
/*
Write the SQL statement to the binlog if we updated
=== modified file 'sql/sql_view.cc'
--- a/sql/sql_view.cc 2009-04-25 10:05:32 +0000
+++ b/sql/sql_view.cc 2009-06-11 12:45:53 +0000
@@ -667,7 +667,7 @@ bool mysql_create_view(THD *thd, TABLE_L
VOID(pthread_mutex_unlock(&LOCK_open));
if (mode != VIEW_CREATE_NEW)
- query_cache_invalidate3(thd, view, 0);
+ query_cache_invalidate4(thd, view, FALSE, FALSE);
start_waiting_global_read_lock(thd);
if (res)
goto err;
@@ -1631,7 +1631,7 @@ bool mysql_drop_view(THD *thd, TABLE_LIS
pthread_mutex_unlock(&share->mutex);
release_table_share(share, RELEASE_WAIT_FOR_DROP);
}
- query_cache_invalidate3(thd, view, 0);
+ query_cache_invalidate4(thd, view, FALSE, FALSE);
sp_cache_invalidate();
}
@@ -1983,7 +1983,7 @@ mysql_rename_view(THD *thd,
DBUG_RETURN(1);
/* remove cache entries */
- query_cache_invalidate3(thd, view, 0);
+ query_cache_invalidate4(thd, view, FALSE, FALSE);
sp_cache_invalidate();
error= FALSE;
=== modified file 'sql/table.h'
--- a/sql/table.h 2009-02-19 09:01:25 +0000
+++ b/sql/table.h 2009-06-11 12:45:53 +0000
@@ -790,6 +790,7 @@ struct st_table {
my_bool get_fields_in_item_tree; /* Signal to fix_field */
/* If MERGE children attached to parent. See top comment in ha_myisammrg.cc */
my_bool children_attached;
+ my_bool changed_in_insert; /* Have been changed in insert since last lock */
REGINFO reginfo; /* field connections */
MEM_ROOT mem_root;
=== modified file 'storage/maria/ha_maria.h'
--- a/storage/maria/ha_maria.h 2008-12-02 22:02:52 +0000
+++ b/storage/maria/ha_maria.h 2009-06-11 12:45:53 +0000
@@ -164,4 +164,6 @@ public:
return file;
}
static int implicit_commit(THD *thd, bool new_trn);
+ /** Type of table for caching query */
+ virtual uint8 table_cache_type() { return HA_CACHE_TBL_NTRNS_INS2LOCK; }
};
=== modified file 'storage/myisam/ha_myisam.h'
--- a/storage/myisam/ha_myisam.h 2008-06-28 12:45:15 +0000
+++ b/storage/myisam/ha_myisam.h 2009-06-11 12:45:53 +0000
@@ -147,4 +147,6 @@ class ha_myisam: public handler
{
return file;
}
+ /** Type of table for caching query */
+ virtual uint8 table_cache_type() { return HA_CACHE_TBL_NTRNS_INS2LOCK; }
};

[Maria-developers] Updated (by Guest): Using the Valgrind API in mysqld (23)
by worklog-noreply@askmonty.org 20 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Using the Valgrind API in mysqld
CREATION DATE..: Fri, 22 May 2009, 11:43
SUPERVISOR.....: Psergey
IMPLEMENTOR....: Knielsen
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 23 (http://askmonty.org/worklog/?tid=23)
VERSION........: Server-9.x
STATUS.........: Complete
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 40 (hours remain)
ORIG. ESTIMATE.: 40
PROGRESS NOTES:
-=-=(Guest - Sat, 20 Jun 2009, 21:35)=-=-
Version updated.
--- /tmp/wklog.23.old.7421 2009-06-20 21:35:40.000000000 +0300
+++ /tmp/wklog.23.new.7421 2009-06-20 21:35:40.000000000 +0300
@@ -1 +1 @@
-Connector/J-3.2
+Server-9.x
-=-=(Guest - Fri, 19 Jun 2009, 20:48)=-=-
Version updated.
--- /tmp/wklog.23.old.1520 2009-06-19 20:48:56.000000000 +0300
+++ /tmp/wklog.23.new.1520 2009-06-19 20:48:56.000000000 +0300
@@ -1 +1 @@
-GUI-Tools-3.0
+Connector/J-3.2
-=-=(Guest - Fri, 19 Jun 2009, 20:45)=-=-
Version updated.
--- /tmp/wklog.23.old.1480 2009-06-19 20:45:53.000000000 +0300
+++ /tmp/wklog.23.new.1480 2009-06-19 20:45:53.000000000 +0300
@@ -1 +1 @@
-Connector/.NET-2.0
+GUI-Tools-3.0
-=-=(Guest - Fri, 19 Jun 2009, 20:42)=-=-
Version updated.
--- /tmp/wklog.23.old.1373 2009-06-19 20:42:48.000000000 +0300
+++ /tmp/wklog.23.new.1373 2009-06-19 20:42:48.000000000 +0300
@@ -1 +1 @@
-WorkLog-3.4
+Connector/.NET-2.0
-=-=(Guest - Fri, 19 Jun 2009, 20:39)=-=-
Version updated.
--- /tmp/wklog.23.old.1347 2009-06-19 20:39:44.000000000 +0300
+++ /tmp/wklog.23.new.1347 2009-06-19 20:39:44.000000000 +0300
@@ -1 +1 @@
-Server-6.0
+WorkLog-3.4
-=-=(Guest - Fri, 19 Jun 2009, 20:36)=-=-
Version updated.
--- /tmp/wklog.23.old.1240 2009-06-19 20:36:40.000000000 +0300
+++ /tmp/wklog.23.new.1240 2009-06-19 20:36:40.000000000 +0300
@@ -1 +1 @@
-Connector/J-5.2
+Server-6.0
-=-=(Guest - Fri, 19 Jun 2009, 20:33)=-=-
Version updated.
--- /tmp/wklog.23.old.1136 2009-06-19 20:33:36.000000000 +0300
+++ /tmp/wklog.23.new.1136 2009-06-19 20:33:36.000000000 +0300
@@ -1 +1 @@
-Server-5.1
+Connector/J-5.2
-=-=(Guest - Fri, 19 Jun 2009, 20:30)=-=-
Version updated.
--- /tmp/wklog.23.old.1109 2009-06-19 20:30:32.000000000 +0300
+++ /tmp/wklog.23.new.1109 2009-06-19 20:30:32.000000000 +0300
@@ -1 +1 @@
-Maria-1.0
+Server-5.1
-=-=(Guest - Fri, 19 Jun 2009, 20:27)=-=-
Version updated.
--- /tmp/wklog.23.old.1004 2009-06-19 20:27:28.000000000 +0300
+++ /tmp/wklog.23.new.1004 2009-06-19 20:27:28.000000000 +0300
@@ -1 +1 @@
-Connector/J-3.1
+Maria-1.0
-=-=(Guest - Fri, 19 Jun 2009, 20:24)=-=-
Version updated.
--- /tmp/wklog.23.old.907 2009-06-19 20:24:24.000000000 +0300
+++ /tmp/wklog.23.new.907 2009-06-19 20:24:24.000000000 +0300
@@ -1 +1 @@
-Maria-1.1
+Connector/J-3.1
------------------------------------------------------------
-=-=(View All Progress Notes, 173 total)=-=-
http://askmonty.org/worklog/index.pl?tid=23&nolimit=1
DESCRIPTION:
Valgrind (the memcheck tool) has some very useful APIs that can be used in mysqld
when testing with Valgrind to improve testing and/or debugging:
file:///usr/share/doc/valgrind/html/mc-manual.html#mc-manual.clientreqs
file:///usr/share/doc/valgrind/html/mc-manual.html#mc-manual.mempools
This worklog is about adding configure checks and headers that allow these APIs
to be used in a way that continues to work on machines where the Valgrind
headers or functionality are missing (a sketch of such a wrapper follows the
list below).
It also includes adding some basic Valgrind enhancements:
- Adding Valgrind annotations to custom memory allocators so that Valgrind can
detect leaks, use-before-init, and use-after-free problems for these
allocators as well.
- Adding checks for definedness in appropriate places (e.g. when calling libz).
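A minimal sketch of what such a wrapper header could look like. This is only an
illustration: the HAVE_VALGRIND_MEMCHECK_H define and the MEM_* macro names are
assumptions, while the client-request macros themselves come from
<valgrind/memcheck.h>:
  /* Hypothetical wrapper header; macro names are illustrative only. */
  #ifdef HAVE_VALGRIND_MEMCHECK_H              /* set by a configure check */
  #include <valgrind/memcheck.h>
  #define MEM_UNDEFINED(addr, len)     VALGRIND_MAKE_MEM_UNDEFINED(addr, len)
  #define MEM_NOACCESS(addr, len)      VALGRIND_MAKE_MEM_NOACCESS(addr, len)
  #define MEM_CHECK_DEFINED(addr, len) VALGRIND_CHECK_MEM_IS_DEFINED(addr, len)
  #else                                        /* no Valgrind headers: no-ops */
  #define MEM_UNDEFINED(addr, len)     ((void) 0)
  #define MEM_NOACCESS(addr, len)      ((void) 0)
  #define MEM_CHECK_DEFINED(addr, len) ((void) 0)
  #endif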
HIGH-LEVEL SPECIFICATION:
With custom memory allocators, using the Valgrind APIs we can tell Valgrind when
a memory block is allocated (so that data read from memory is marked as undefined
instead of being defined or not at random depending on prior use); and when a
memory block is freed (so that use after freeing can be reported as an error).
In some cases checking for leaks may also be appropriate.
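A minimal sketch of how a custom allocator could be annotated, assuming the
MEM_UNDEFINED()/MEM_NOACCESS() wrappers sketched in the description above; the
pool type and the internal_alloc()/internal_free() calls are purely hypothetical
placeholders for whatever the real allocator does:
  void *pool_alloc(MY_POOL *pool, size_t size)
  {
    void *ptr= internal_alloc(pool, size);   /* take a block from the pool */
    /* Reading this block before it has been written to is now an error */
    MEM_UNDEFINED(ptr, size);
    return ptr;
  }

  void pool_free(MY_POOL *pool, void *ptr, size_t size)
  {
    internal_free(pool, ptr, size);          /* return the block to the pool */
    /* Any later read or write of this block is now reported as an error */
    MEM_NOACCESS(ptr, size);
  }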
Another possibility is to add an explicit check that memory is defined.
One place this would be useful is when calling libz. Due to the design of that
library, Valgrind produces lots of false alarms about using undefined values
(I think the issue is that it reads a few bytes past the initialized memory to
reduce boundary checks in each loop iteration, and only after the loop checks
that the undefined part of the result is not used). This means we have lots of
libz Valgrind suppressions and continue to add more as new warnings surface, so
we might easily miss a real problem in this area. This could be improved by
adding, at the calls to libz functions, explicit checks that the passed memory
is properly defined.
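As an illustration (not the actual patch), a call into libz could be guarded
like this, using the MEM_CHECK_DEFINED() wrapper from above; compress() is the
standard zlib function, the surrounding function is hypothetical:
  #include <zlib.h>

  int checked_compress(Bytef *dest, uLongf *dest_len,
                       const Bytef *source, uLong source_len)
  {
    /* Report any undefined input bytes here, at the call site, instead of
       getting a hard-to-suppress warning from deep inside libz */
    MEM_CHECK_DEFINED(source, source_len);
    return compress(dest, dest_len, source, source_len);
  }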
Another use is to improve debugging. When debugging a warning about use of
uninitialised memory, it is often the case that the detection happens long after
the real problem: the uninitialised value is passed along through the code for a
long time before being detected. This makes debugging the problem slow.
By adding, in strategic places, code that asserts that a specific value must be
initialised, it is possible to detect problems earlier, speeding up debugging.
Such code can be added in more places over time as development and debugging
go on.
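For example, a one-line check like the following could be dropped at a suspect
point in the code (the variable name is just an illustration);
VALGRIND_CHECK_VALUE_IS_DEFINED() is a client request from <valgrind/memcheck.h>:
  #ifdef HAVE_VALGRIND_MEMCHECK_H
    /* Make Valgrind complain here if 'found_rows' is not fully initialised,
       rather than much later when the value is finally used */
    VALGRIND_CHECK_VALUE_IS_DEFINED(found_rows);
  #endif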
See also a patch here: http://bugs.mysql.com/bug.php?id=44582
LOW-LEVEL DESIGN:
Two places where we call into libz, and where checking for defined parameters
would be good:
- mysys/my_compress.c
- sql/item_strfunc.cc (Item_func_compress).
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Better choice between range and index_merge/intersection options (26)
by worklog-noreply@askmonty.org 20 Jun '09
by worklog-noreply@askmonty.org 20 Jun '09
20 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Better choice between range and index_merge/intersection options
CREATION DATE..: Wed, 27 May 2009, 15:20
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 26 (http://askmonty.org/worklog/?tid=26)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sat, 20 Jun 2009, 09:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.26.old.21828 2009-06-20 09:39:25.000000000 +0300
+++ /tmp/wklog.26.new.21828 2009-06-20 09:39:25.000000000 +0300
@@ -1 +1,9 @@
+* Not a spec but rather preliminary notes*
+User cases
+----------
+
+* BUG#32254: Index merge used unnecessarily
+* BUG#34869: Almost 300% regression in index merge intersect performance
+* BUG#40051: needed way to prevent optimizer from using index_merge on useless
+keys (no testcase)
-=-=(Guest - Sat, 13 Jun 2009, 06:51)=-=-
Category updated.
--- /tmp/wklog.26.old.26624 2009-06-13 06:51:56.000000000 +0300
+++ /tmp/wklog.26.new.26624 2009-06-13 06:51:56.000000000 +0300
@@ -1 +1 @@
-Server-BackLog
+Server-Sprint
-=-=(Guest - Sat, 13 Jun 2009, 06:06)=-=-
Category updated.
--- /tmp/wklog.26.old.24784 2009-06-13 06:06:28.000000000 +0300
+++ /tmp/wklog.26.new.24784 2009-06-13 06:06:28.000000000 +0300
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-BackLog
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 26
DESCRIPTION:
The optimizer makes a cost-based choice between possible range and
index_merge/intersection scans. There are some issues with this:
- index_merge/intersection gets chosen even when there is a single
multi-part index that covers all keys. Measurements show that this is
a poor choice.
- The picked index_merge/intersection can use a redundant set of indexes:
it will be intersect(idx1, ..., idxN) where all columns in idxN are covered
by other used indexes.
This WL is to fix these limitations.
HIGH-LEVEL SPECIFICATION:
* Not a spec but rather preliminary notes*
User cases
----------
* BUG#32254: Index merge used unnecessarily
* BUG#34869: Almost 300% regression in index merge intersect performance
* BUG#40051: needed way to prevent optimizer from using index_merge on useless
keys (no testcase)
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 20 Jun '09
by worklog-noreply@askmonty.org 20 Jun '09
20 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sat, 20 Jun 2009, 09:34)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.21663 2009-06-20 09:34:48.000000000 +0300
+++ /tmp/wklog.24.new.21663 2009-06-20 09:34:48.000000000 +0300
@@ -4,6 +4,7 @@
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
+3. Testing and required coverage
</contents>
1. Current implementation overview
@@ -240,3 +241,14 @@
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
+
+3. Testing and required coverage
+================================
+So far could find the following user cases:
+
+* BUG#17259: Query optimizer chooses wrong index
+* BUG#17673: Optimizer does not use Index Merge optimization in some cases
+* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
+* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
+
+
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Guest - Sat, 13 Jun 2009, 06:29)=-=-
Category updated.
--- /tmp/wklog.24.old.25753 2009-06-13 06:29:10.000000000 +0300
+++ /tmp/wklog.24.new.25753 2009-06-13 06:29:10.000000000 +0300
@@ -1 +1 @@
-Server-BackLog
+Server-Sprint
-=-=(Guest - Sat, 13 Jun 2009, 06:14)=-=-
Category updated.
--- /tmp/wklog.24.old.24991 2009-06-13 06:14:03.000000000 +0300
+++ /tmp/wklog.24.new.24991 2009-06-13 06:14:03.000000000 +0300
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-BackLog
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
------------------------------------------------------------
-=-=(View All Progress Notes, 12 total)=-=-
http://askmonty.org/worklog/index.pl?tid=24&nolimit=1
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses in form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that possible poor range access will cause possible
good index_merge access not to be considered.
This WL is to about lifting this limitation at least for some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to make many changes in the range analyzer, but will be
able to keep potential index_merge plans around just long enough that they can
be taken into consideration together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...)
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := a condition that allows constructing a range access over keyX
and doesn't allow constructing range/index_merge accesses
over any other keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1})
SEL_TREE(key3_2})
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
Limitations
-----------
This will not be a full solution, in the sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in another form
(e.g. if the brackets have been opened up).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be in
(google's/percona's/maria/etc)?
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
3. Testing and required coverage
</contents>
1. Current implementation overview
==================================
At the moment, range analyzer works as follows:
SEL_TREE structure represents
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
# merge tree represents several ways to do index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set of different indexes to use.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
if (both A and B are range trees)
return a range_tree with computed intersection for each range;
if (only one of A and B is a range tree)
return that tree; // DISCARD-IMERGE-1
// at this point both trees are index_merge trees
return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
tree_or(A, B)
{
if (A and B are range trees)
{
R = new range_tree;
for each index i
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
return R;
else
{
/* could not build any range accesses. construct index_merge */
remove non-ranges from A; // DISCARD-IMERGE-2
remove non-ranges from B;
return new index_merge(A, B);
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
{
Perform this transformation:
range_treeA // this is A
OR
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
(range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
Now each line represents an index_merge..
}
else if (both A and B are index_merge trees)
{
Perform this transformation:
imergeA1 AND imergeA2 AND ... AND imergeAN
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
-> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
imergeA1
OR
imergeB1 AND imergeB2 AND ... AND imergeBN =
= (combine imergeA1 with each of the imergeB{i} ) =
combine(imergeA1 OR imergeB1) AND
combine(imergeA1 OR imergeB2) AND
... AND
combine(imergeA1 OR imergeBN)
}
}
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause the index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
DISCARD-IMERGE-2 step will cause the index_merge option to be discarded when
the WHERE clause has this form (the conditions on t.badkey may have arbitrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
The SEL_TREE structure will now be able to hold both index_merge and range scan
candidates at the same time. That is,
sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty)
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes (a pseudocode sketch of the
new tree_and() follows after rule A3):
A1. Don't remove the index_merge part of the tree (this will take care of
the DISCARD-IMERGE-1 problem)
A2. Push range conditions down into index_merge trees that may support them.
If one tree has range(key1) and the other tree has imerge(key1 OR key2),
then perform an equivalent of this operation:
rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
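A rough sketch of the new tree_and(), in the same pseudocode style as the
overview above; the member names (range_part, imerge_part) are illustrative,
not the actual SEL_TREE fields:
  tree_and(A, B)
  {
    /* A1: keep both the range part and the index_merge part of each tree */
    result.range_part= intersection of per-key ranges in A.range_part, B.range_part;
    /* A3: concatenate the index_merge candidates of both trees */
    result.imerge_part= concat_lists(A.imerge_part, B.imerge_part);

    /* A2: push the collected range conditions down into those imerge trees
       that have a disjunct over the same key */
    for each imerge IM in result.imerge_part
      for each range R in result.range_part
        if (IM has a disjunct over the same key as R)
          AND R into every disjunct of IM;

    return result;
  }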
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
The current tree_or() code will refuse to produce index_merge plans for
conditions like
"t.key1part2=const OR t.key2part1=const"
(this is marked as DISCARD-IMERGE-3). This was justified because the left part of
the AND condition is not usable for range access, and the operation of
tree_and() guaranteed that there was no way it could be changed to make a
usable range plan. With the new tree_and() and rule A2, this is no longer the
case. For example, for this query:
(t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
it will construct a
imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
then tree_and() will apply rule A2 to push the range down into index merge
and after that we'll have:
range(t.key1part1=const)
imerge(
t.key1part2=const AND t.key1part1=const,
t.key2part1=const
)
note that imerge(...) describes a usable index_merge plan and it's possible
that it will be the best access path.
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create index_merge made for index scans that didn't have
their match in the other sel_tree.
Ilustrating it with an example:
| sel_tree_A | sel_tree_B | A or B | include in index_merge?
------+------------+------------+--------+------------------------
key1 | cond1 | cond2 | condM | no
key2 | cond3 | cond4 | NULL | no
key3 | cond5 | | | yes, A-side
key4 | cond6 | | | yes, A-side
key5 | | cond7 | | yes, B-side
key6 | | cond8 | | yes, B-side
here we assume that
- (cond1 OR cond2) did produce a combined range. Not including them in
index_merge.
- (cond3 OR cond4) didn't produce a usable range (e.g. they were
t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
didn't yield any range list)
- All the other scans didn't have counterparts, so we'll end up with a
SEL_TREE of:
range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
.
O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain in place. The idea is
that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
lift it, and produce a cross-product:
((key1p OR key2p) AND (key3p OR key4p))
OR
((key5p OR key6p) AND (key7p OR key8p))
= (key1p OR key2p OR key5p OR key6p) AND // this part is currently
(key3p OR key4p OR key5p OR key6p) AND // produced
(key1p OR key2p OR key5p OR key6p) AND // this part will be added
(key3p OR key4p OR key5p OR key6p) //.
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
3. Testing and required coverage
================================
So far we could find the following user cases:
* BUG#17259: Query optimizer chooses wrong index
* BUG#17673: Optimizer does not use Index Merge optimization in some cases
* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 20 Jun '09
by worklog-noreply@askmonty.org 20 Jun '09
20 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Sat, 20 Jun 2009, 09:34)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.21663 2009-06-20 09:34:48.000000000 +0300
+++ /tmp/wklog.24.new.21663 2009-06-20 09:34:48.000000000 +0300
@@ -4,6 +4,7 @@
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
+3. Testing and required coverage
</contents>
1. Current implementation overview
@@ -240,3 +241,14 @@
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
+
+3. Testing and required coverage
+================================
+So far could find the following user cases:
+
+* BUG#17259: Query optimizer chooses wrong index
+* BUG#17673: Optimizer does not use Index Merge optimization in some cases
+* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
+* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
+
+
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Guest - Sat, 13 Jun 2009, 06:29)=-=-
Category updated.
--- /tmp/wklog.24.old.25753 2009-06-13 06:29:10.000000000 +0300
+++ /tmp/wklog.24.new.25753 2009-06-13 06:29:10.000000000 +0300
@@ -1 +1 @@
-Server-BackLog
+Server-Sprint
-=-=(Guest - Sat, 13 Jun 2009, 06:14)=-=-
Category updated.
--- /tmp/wklog.24.old.24991 2009-06-13 06:14:03.000000000 +0300
+++ /tmp/wklog.24.new.24991 2009-06-13 06:14:03.000000000 +0300
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-BackLog
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
------------------------------------------------------------
-=-=(View All Progress Notes, 12 total)=-=-
http://askmonty.org/worklog/index.pl?tid=24&nolimit=1
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses in form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that possible poor range access will cause possible
good index_merge access not to be considered.
This WL is to about lifting this limitation at least for some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to do much changes in the range analyzer, but will be
able to keep potential index_merge plan just enough so that it's possible to
take it into consideration together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...)
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := condition that allows to construct range access of keyX
and doesn't allow to construct range/index_merge accesses
for any keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1})
SEL_TREE(key3_2})
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
Limitations
-----------
This will not be a full solution in a sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in other form
(e.g. brackets were opened).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches are should be in
(google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
3. Testing and required coverage
</contents>
1. Current implementation overview
==================================
At the moment, range analyzer works as follows:
SEL_TREE structure represents
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
# merge tree represents several way to index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set to use of different indexes.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
if (both A and B are range trees)
return a range_tree with computed intersection for each range;
if (only one of A and B is a range tree)
return that tree; // DISCARD-IMERGE-1
// at this point both trees are index_merge trees
return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
tree_or(A, B)
{
  if (A and B are range trees)
  {
    R = new range_tree;
    for each index i
      R.add(range_union(A.range(i), B.range(i)));
    if (R has at least one range access)
      return R;
    else
    {
      /* could not build any range accesses. construct index_merge */
      remove non-ranges from A; // DISCARD-IMERGE-2
      remove non-ranges from B;
      return new index_merge(A, B);
    }
  }
  else if (A is a range tree and B is an index_merge tree (or vice versa))
  {
    Perform this transformation:
      range_treeA                                                  // this is A
      OR
      (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
      (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
      ...
      (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
      =
      (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
      (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
      ...
      (range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)
    Now each line represents an index_merge.
  }
  else if (both A and B are index_merge trees)
  {
    Perform this transformation:
      imergeA1 AND imergeA2 AND ... AND imergeAN
      OR
      imergeB1 AND imergeB2 AND ... AND imergeBN
      -> (discard all imergeA{i=2,3,...}) ->              // DISCARD-IMERGE-3
      imergeA1
      OR
      imergeB1 AND imergeB2 AND ... AND imergeBN
      = (combine imergeA1 with each of the imergeB{i}) =
      combine(imergeA1 OR imergeB1) AND
      combine(imergeA1 OR imergeB2) AND
      ... AND
      combine(imergeA1 OR imergeBN)
  }
}
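To make the DISCARD-IMERGE-1 behaviour concrete, here is a minimal C++ sketch of
the current tree_and() logic over deliberately simplified types. The names and
types are hypothetical illustrations (plain strings stand in for range
conditions); this is not the real SEL_TREE/SEL_IMERGE code from sql/opt_range.cc.

  // Simplified model for illustration only; names and types are hypothetical.
  #include <map>
  #include <optional>
  #include <string>
  #include <vector>

  struct RangeTree {                              // range(key1) AND range(key2) AND ...
    std::map<std::string, std::string> ranges;    // key name -> range condition
  };
  typedef std::vector<RangeTree> Imerge;          // range_tree1 OR range_tree2 OR ...

  struct SelTree {                                // current code: one part or the other
    std::optional<RangeTree> range_part;
    std::vector<Imerge> imerge_part;              // imerge1 AND imerge2 AND ...
  };

  SelTree tree_and_current(const SelTree &a, const SelTree &b)
  {
    SelTree out;
    if (a.range_part && b.range_part)             // intersect range conditions per key
    {
      RangeTree merged= *a.range_part;
      for (const auto &kv : b.range_part->ranges)
      {
        auto it= merged.ranges.find(kv.first);
        if (it == merged.ranges.end())
          merged.ranges[kv.first]= kv.second;
        else
          it->second= "(" + it->second + " AND " + kv.second + ")";
      }
      out.range_part= merged;
      return out;
    }
    if (a.range_part)                             // DISCARD-IMERGE-1: b's imerges are lost
      return a;
    if (b.range_part)                             // DISCARD-IMERGE-1: a's imerges are lost
      return b;
    out.imerge_part= a.imerge_part;               // both are index_merge trees: concatenate
    out.imerge_part.insert(out.imerge_part.end(),
                           b.imerge_part.begin(), b.imerge_part.end());
    return out;
  }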
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
The DISCARD-IMERGE-1 step will cause the index_merge option to be discarded when
the WHERE clause has this form:
  (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
The DISCARD-IMERGE-2 step will cause the index_merge option to be discarded when
the WHERE clause has this form (the conditions on t.badkey may have arbitrary form):
  (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
  INDEX i1(col1, col2),
  INDEX i2(col1, col3)
and this WHERE clause:
  col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will only generate plans that use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
The SEL_TREE structure will now be able to hold both index_merge and range scan
candidates at the same time. That is,
  sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty).
Operations on SEL_ARG trees will be modified to produce/process trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove the index_merge part of the tree (this takes care of the
    DISCARD-IMERGE-1 problem).
A2. Push range conditions down into index_merge trees that may support them:
    if one tree has range(key1) and the other tree has imerge(key1 OR key2),
    then perform the equivalent of this operation:
      rangeA(key1) AND (rangeB(key1) OR rangeB(key2)) =
      (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
    concatenate them together.
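A hedged sketch of how rules A1-A3 could look, reusing the simplified
RangeTree/Imerge/SelTree types from the sketch in section 1. This is an
illustration under those assumptions, not the actual MariaDB implementation.

  // Rules A1-A3 over the simplified types above; illustration only.
  SelTree tree_and_new(const SelTree &a, const SelTree &b)
  {
    SelTree out;

    /* Intersect the range parts, as before. */
    if (a.range_part && b.range_part)
    {
      RangeTree merged= *a.range_part;
      for (const auto &kv : b.range_part->ranges)
      {
        auto it= merged.ranges.find(kv.first);
        if (it == merged.ranges.end())
          merged.ranges[kv.first]= kv.second;
        else
          it->second= "(" + it->second + " AND " + kv.second + ")";
      }
      out.range_part= merged;
    }
    else if (a.range_part)
      out.range_part= a.range_part;
    else if (b.range_part)
      out.range_part= b.range_part;

    /* A1 + A3: never drop index_merge candidates; concatenate both lists. */
    out.imerge_part= a.imerge_part;
    out.imerge_part.insert(out.imerge_part.end(),
                           b.imerge_part.begin(), b.imerge_part.end());

    /* A2: push a range condition on keyX into every imerge branch that also
       ranges over keyX, so rangeA(key1) AND (rangeB(key1) OR rangeB(key2))
       turns the key1 branch into (rangeB(key1) AND rangeA(key1)). */
    if (out.range_part)
    {
      for (Imerge &im : out.imerge_part)
        for (RangeTree &branch : im)
          for (auto &kv : branch.ranges)
          {
            auto it= out.range_part->ranges.find(kv.first);
            if (it != out.range_part->ranges.end())
              kv.second= "(" + kv.second + " AND " + it->second + ")";
          }
    }
    return out;
  }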
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
The current tree_or() code will refuse to produce index_merge plans for
conditions like
  "t.key1part2=const OR t.key2part1=const"
(this is marked as DISCARD-IMERGE-3). This was justified because the left part
of the AND condition is not usable for range access, and the operation of
tree_and() guaranteed that there was no way it could be changed into a
usable range plan. With the new tree_and() and rule A2, this is no longer the
case. For example, for this query:
  (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
it will construct
  imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
then tree_and() will apply rule A2 to push the range down into the index_merge,
and after that we'll have:
  range(t.key1part1=const)
  imerge(
    t.key1part2=const AND t.key1part1=const,
    t.key2part1=const
  )
Note that imerge(...) describes a usable index_merge plan, and it's possible
that it will be the best access path.
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create index_merge made for index scans that didn't have
their match in the other sel_tree.
Ilustrating it with an example:
      | sel_tree_A | sel_tree_B | A or B | include in index_merge?
------+------------+------------+--------+------------------------
 key1 | cond1      | cond2      | condM  | no
 key2 | cond3      | cond4      | NULL   | no
 key3 | cond5      |            |        | yes, A-side
 key4 | cond6      |            |        | yes, A-side
 key5 |            | cond7      |        | yes, B-side
 key6 |            | cond8      |        | yes, B-side
Here we assume that:
- (cond1 OR cond2) did produce a combined range, so these are not included in
  the index_merge;
- (cond3 OR cond4) didn't produce a usable range (e.g. they were
  t.key1part1=c1 and t.key1part2=c1, respectively, and combining them
  didn't yield any range list);
- all other scans didn't have counterparts on the other side, so we'll end up
  with a SEL_TREE of:
  range(condM) AND index_merge((cond5 AND cond6), (cond7 AND cond8))
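A small sketch of rule O2, again over the simplified types from the earlier
sketches. It assumes that whenever both sides have a range on the same key the
union yields a usable range; in the table above, key2 shows the case where it
does not, and such a key would simply be dropped. Illustration only, not the
actual implementation.

  // Rule O2 over the simplified types above; illustration only.
  SelTree tree_or_new(const RangeTree &a, const RangeTree &b)
  {
    SelTree out;
    RangeTree combined;                 // keys where (condA OR condB) is a usable range
    RangeTree a_left, b_left;           // scans with no counterpart on the other side

    for (const auto &kv : a.ranges)
    {
      auto it= b.ranges.find(kv.first);
      if (it != b.ranges.end())
        combined.ranges[kv.first]= "(" + kv.second + " OR " + it->second + ")";
      else
        a_left.ranges[kv.first]= kv.second;
    }
    for (const auto &kv : b.ranges)
      if (!a.ranges.count(kv.first))
        b_left.ranges[kv.first]= kv.second;

    if (!combined.ranges.empty())
      out.range_part= combined;         // e.g. range(condM)
    if (!a_left.ranges.empty() && !b_left.ranges.empty())
    {
      Imerge im;                        // index_merge(A-side leftovers OR B-side leftovers)
      im.push_back(a_left);
      im.push_back(b_left);
      out.imerge_part.push_back(im);
    }
    return out;
  }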
O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain in place. The idea is
that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
lift it, and produce a cross-product:
  ((key1p OR key2p) AND (key3p OR key4p))
  OR
  ((key5p OR key6p) AND (key7p OR key8p))
  = (key1p OR key2p OR key5p OR key6p) AND   // this part is currently
    (key3p OR key4p OR key5p OR key6p) AND   // produced
    (key1p OR key2p OR key7p OR key8p) AND   // this part will be added
    (key3p OR key4p OR key7p OR key8p)
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than a #defined
MAX_IMERGE_OPTS number of options.
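One possible shape of that cap, as a sketch. MAX_IMERGE_OPTS and its value are
assumptions taken from the sentence above, the types are the simplified ones
from the earlier sketches, and "combining" two imerges is simplified to
concatenating their OR-branches.

  // Hypothetical cap on the cross-product of two imerge AND-lists.
  #define MAX_IMERGE_OPTS 64            /* assumed value, for illustration only */

  std::vector<Imerge> cross_product_capped(const std::vector<Imerge> &a,
                                           const std::vector<Imerge> &b)
  {
    std::vector<Imerge> out;
    for (const Imerge &ia : a)
      for (const Imerge &ib : b)
      {
        if (out.size() >= MAX_IMERGE_OPTS)
          return out;                   /* stop expanding; keep what we have */
        Imerge combined= ia;            /* combine(ia OR ib): concatenate branches */
        combined.insert(combined.end(), ib.begin(), ib.end());
        out.push_back(combined);
      }
    return out;
  }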
3. Testing and required coverage
================================
So far we could find the following use cases:
* BUG#17259: Query optimizer chooses wrong index
* BUG#17673: Optimizer does not use Index Merge optimization in some cases
* BUG#23322: Optimizer sometimes erroniously prefers other index over index merge
* BUG#30151: optimizer is very reluctant to chose index_merge algorithm
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2702)
by knielsen@knielsen-hq.org 19 Jun '09
by knielsen@knielsen-hq.org 19 Jun '09
19 Jun '09
#At lp:maria
2702 knielsen(a)knielsen-hq.org 2009-06-09
XtraDB after-merge fixes.
Fixes to get the test suite to run without failures.
removed:
storage/xtradb/setup.sh
modified:
mysql-test/r/information_schema.result
mysql-test/r/information_schema_all_engines.result
mysql-test/r/innodb-autoinc.result
mysql-test/r/innodb-index.result
mysql-test/r/innodb-zip.result
mysql-test/r/innodb.result
mysql-test/r/innodb_bug36169.result
mysql-test/r/innodb_xtradb_bug317074.result
mysql-test/r/row-checksum-old.result
mysql-test/r/row-checksum.result
mysql-test/t/information_schema.test
mysql-test/t/innodb-analyze.test
mysql-test/t/innodb-autoinc.test
mysql-test/t/innodb-index.test
mysql-test/t/innodb-zip.test
mysql-test/t/innodb.test
mysql-test/t/innodb_bug34300.test
mysql-test/t/innodb_bug36169.test
mysql-test/t/innodb_bug36172.test
mysql-test/t/innodb_xtradb_bug317074.test
mysql-test/t/partition_innodb.test
mysys/thr_mutex.c
storage/xtradb/ibuf/ibuf0ibuf.c
storage/xtradb/include/sync0rw.h
storage/xtradb/include/sync0rw.ic
storage/xtradb/include/univ.i
storage/xtradb/srv/srv0start.c
per-file messages:
mysql-test/r/information_schema.result
Additional variables available now.
Sort output to avoid depending on engine order.
mysql-test/r/information_schema_all_engines.result
More variables now.
mysql-test/r/innodb-autoinc.result
Avoid picking up pbxt variables in result
mysql-test/r/innodb-index.result
Save state to not corrupt following testcases.
Suppress an expected warning.
mysql-test/r/innodb-zip.result
Work around a problem with dependency on zlib version
mysql-test/r/innodb.result
Checksums have changed in Maria.
Save and restore server state to not corrupt following testcases.
mysql-test/r/innodb_bug36169.result
Save and restore server state to not corrupt following testcases.
mysql-test/r/innodb_xtradb_bug317074.result
Save and restore server state to not corrupt following testcases.
mysql-test/r/row-checksum-old.result
Update result file
mysql-test/r/row-checksum.result
Update result file
mysql-test/t/information_schema.test
Sort output to avoid depending on engine order.
mysql-test/t/innodb-analyze.test
Save and restore server state to not corrupt following testcases.
mysql-test/t/innodb-autoinc.test
Save and restore server state to not corrupt following testcases.
mysql-test/t/innodb-index.test
Save state to not corrupt following testcases.
Suppress an expected warning.
mysql-test/t/innodb-zip.test
Work around a problem with dependency on zlib version
mysql-test/t/innodb.test
Save and restore server state to not corrupt following testcases.
Update --replace statements for new mysql-test-run
mysql-test/t/innodb_bug34300.test
Save and restore server state to not corrupt following testcases.
mysql-test/t/innodb_bug36169.test
Save and restore server state to not corrupt following testcases.
mysql-test/t/innodb_bug36172.test
Save and restore server state to not corrupt following testcases.
mysql-test/t/innodb_xtradb_bug317074.test
Save and restore server state to not corrupt following testcases.
mysql-test/t/partition_innodb.test
Fix regexps to work with new SHOW INNODB STATUS output.
mysys/thr_mutex.c
Initialize mutex deadlock detection lazily.
This makes it possible to test XtraDB, which initializes a huge number of mutexes but uses only a few of them.
storage/xtradb/ibuf/ibuf0ibuf.c
Fix problem where value of INNODB_IBUF_MAX_SIZE would depend on the alignment of memory
allocated by the buffer pool.
storage/xtradb/include/sync0rw.h
Fix XtraDB to compile without GCC atomic operation intrinsics (performance may suffer
when they are not available though).
storage/xtradb/include/sync0rw.ic
Fix XtraDB to compile without GCC atomic operation intrinsics (performance may suffer
when they are not available though).
storage/xtradb/include/univ.i
Fix for MariaDB
storage/xtradb/setup.sh
Remove no longer needed file from XtraDB.
storage/xtradb/srv/srv0start.c
Fix for MariaDB
=== modified file 'mysql-test/r/information_schema.result'
--- a/mysql-test/r/information_schema.result 2009-04-08 16:55:26 +0000
+++ b/mysql-test/r/information_schema.result 2009-06-09 15:08:46 +0000
@@ -42,7 +42,7 @@ WHERE table_schema IN ('mysql', 'INFORMA
table_name<>'ndb_binlog_index' AND
table_name<>'ndb_apply_status' AND
NOT (table_schema = 'INFORMATION_SCHEMA' AND table_name LIKE 'PBXT_%');
-select * from v1;
+select * from v1 ORDER BY c COLLATE utf8_bin;
c
CHARACTER_SETS
COLLATIONS
@@ -54,6 +54,17 @@ EVENTS
FILES
GLOBAL_STATUS
GLOBAL_VARIABLES
+INNODB_BUFFER_POOL_PAGES
+INNODB_BUFFER_POOL_PAGES_BLOB
+INNODB_BUFFER_POOL_PAGES_INDEX
+INNODB_CMP
+INNODB_CMPMEM
+INNODB_CMPMEM_RESET
+INNODB_CMP_RESET
+INNODB_LOCKS
+INNODB_LOCK_WAITS
+INNODB_RSEG
+INNODB_TRX
KEY_COLUMN_USAGE
PARTITIONS
PLUGINS
@@ -72,6 +83,7 @@ TABLE_PRIVILEGES
TRIGGERS
USER_PRIVILEGES
VIEWS
+XTRADB_ENHANCEMENTS
columns_priv
db
event
@@ -87,6 +99,11 @@ proc
procs_priv
servers
slow_log
+t1
+t2
+t3
+t4
+t5
tables_priv
time_zone
time_zone_leap_second
@@ -94,11 +111,6 @@ time_zone_name
time_zone_transition
time_zone_transition_type
user
-t1
-t4
-t2
-t3
-t5
v1
select c,table_name from v1
inner join information_schema.TABLES v2 on (v1.c=v2.table_name)
@@ -800,6 +812,8 @@ TABLES CREATE_TIME datetime
TABLES UPDATE_TIME datetime
TABLES CHECK_TIME datetime
TRIGGERS CREATED datetime
+INNODB_TRX trx_started datetime
+INNODB_TRX trx_wait_started datetime
event execute_at datetime
event last_executed datetime
event starts datetime
@@ -848,6 +862,7 @@ TABLES TABLE_NAME select
TABLE_CONSTRAINTS TABLE_NAME select
TABLE_PRIVILEGES TABLE_NAME select
VIEWS TABLE_NAME select
+INNODB_BUFFER_POOL_PAGES_INDEX table_name select
delete from mysql.user where user='mysqltest_4';
delete from mysql.db where user='mysqltest_4';
flush privileges;
@@ -1223,12 +1238,12 @@ DROP PROCEDURE p1;
DROP USER mysql_bug20230@localhost;
SELECT MAX(table_name) FROM information_schema.tables WHERE table_schema IN ('mysql', 'INFORMATION_SCHEMA', 'test');
MAX(table_name)
-VIEWS
+XTRADB_ENHANCEMENTS
SELECT table_name from information_schema.tables
WHERE table_name=(SELECT MAX(table_name)
FROM information_schema.tables WHERE table_schema IN ('mysql', 'INFORMATION_SCHEMA', 'test'));
table_name
-VIEWS
+XTRADB_ENHANCEMENTS
DROP TABLE IF EXISTS bug23037;
DROP FUNCTION IF EXISTS get_value;
SELECT COLUMN_NAME, MD5(COLUMN_DEFAULT), LENGTH(COLUMN_DEFAULT) FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME='bug23037';
=== modified file 'mysql-test/r/information_schema_all_engines.result'
--- a/mysql-test/r/information_schema_all_engines.result 2009-04-08 16:55:26 +0000
+++ b/mysql-test/r/information_schema_all_engines.result 2009-06-09 15:08:46 +0000
@@ -29,6 +29,18 @@ TABLE_PRIVILEGES
TRIGGERS
USER_PRIVILEGES
VIEWS
+INNODB_BUFFER_POOL_PAGES_INDEX
+INNODB_RSEG
+INNODB_LOCKS
+INNODB_BUFFER_POOL_PAGES
+XTRADB_ENHANCEMENTS
+INNODB_TRX
+INNODB_BUFFER_POOL_PAGES_BLOB
+INNODB_LOCK_WAITS
+INNODB_CMP_RESET
+INNODB_CMP
+INNODB_CMPMEM_RESET
+INNODB_CMPMEM
PBXT_STATISTICS
SELECT t.table_name, c1.column_name
FROM information_schema.tables t
@@ -73,6 +85,18 @@ TABLE_PRIVILEGES TABLE_SCHEMA
TRIGGERS TRIGGER_SCHEMA
USER_PRIVILEGES GRANTEE
VIEWS TABLE_SCHEMA
+INNODB_BUFFER_POOL_PAGES_INDEX schema_name
+INNODB_RSEG rseg_id
+INNODB_LOCKS lock_id
+INNODB_BUFFER_POOL_PAGES page_type
+XTRADB_ENHANCEMENTS name
+INNODB_TRX trx_id
+INNODB_BUFFER_POOL_PAGES_BLOB space_id
+INNODB_LOCK_WAITS requesting_trx_id
+INNODB_CMP_RESET page_size
+INNODB_CMP page_size
+INNODB_CMPMEM_RESET page_size
+INNODB_CMPMEM page_size
PBXT_STATISTICS ID
SELECT t.table_name, c1.column_name
FROM information_schema.tables t
@@ -117,6 +141,18 @@ TABLE_PRIVILEGES TABLE_SCHEMA
TRIGGERS TRIGGER_SCHEMA
USER_PRIVILEGES GRANTEE
VIEWS TABLE_SCHEMA
+INNODB_BUFFER_POOL_PAGES_INDEX schema_name
+INNODB_RSEG rseg_id
+INNODB_LOCKS lock_id
+INNODB_BUFFER_POOL_PAGES page_type
+XTRADB_ENHANCEMENTS name
+INNODB_TRX trx_id
+INNODB_BUFFER_POOL_PAGES_BLOB space_id
+INNODB_LOCK_WAITS requesting_trx_id
+INNODB_CMP_RESET page_size
+INNODB_CMP page_size
+INNODB_CMPMEM_RESET page_size
+INNODB_CMPMEM page_size
PBXT_STATISTICS ID
select 1 as f1 from information_schema.tables where "CHARACTER_SETS"=
(select cast(table_name as char) from information_schema.tables
@@ -149,6 +185,17 @@ EVENTS information_schema.EVENTS 1
FILES information_schema.FILES 1
GLOBAL_STATUS information_schema.GLOBAL_STATUS 1
GLOBAL_VARIABLES information_schema.GLOBAL_VARIABLES 1
+INNODB_BUFFER_POOL_PAGES information_schema.INNODB_BUFFER_POOL_PAGES 1
+INNODB_BUFFER_POOL_PAGES_BLOB information_schema.INNODB_BUFFER_POOL_PAGES_BLOB 1
+INNODB_BUFFER_POOL_PAGES_INDEX information_schema.INNODB_BUFFER_POOL_PAGES_INDEX 1
+INNODB_CMP information_schema.INNODB_CMP 1
+INNODB_CMPMEM information_schema.INNODB_CMPMEM 1
+INNODB_CMPMEM_RESET information_schema.INNODB_CMPMEM_RESET 1
+INNODB_CMP_RESET information_schema.INNODB_CMP_RESET 1
+INNODB_LOCKS information_schema.INNODB_LOCKS 1
+INNODB_LOCK_WAITS information_schema.INNODB_LOCK_WAITS 1
+INNODB_RSEG information_schema.INNODB_RSEG 1
+INNODB_TRX information_schema.INNODB_TRX 1
KEY_COLUMN_USAGE information_schema.KEY_COLUMN_USAGE 1
PARTITIONS information_schema.PARTITIONS 1
PBXT_STATISTICS information_schema.PBXT_STATISTICS 1
@@ -168,6 +215,7 @@ TABLE_PRIVILEGES information_schema.TABL
TRIGGERS information_schema.TRIGGERS 1
USER_PRIVILEGES information_schema.USER_PRIVILEGES 1
VIEWS information_schema.VIEWS 1
+XTRADB_ENHANCEMENTS information_schema.XTRADB_ENHANCEMENTS 1
Database: information_schema
+---------------------------------------+
| Tables |
@@ -200,6 +248,18 @@ Database: information_schema
| TRIGGERS |
| USER_PRIVILEGES |
| VIEWS |
+| INNODB_BUFFER_POOL_PAGES_INDEX |
+| INNODB_RSEG |
+| INNODB_LOCKS |
+| INNODB_BUFFER_POOL_PAGES |
+| XTRADB_ENHANCEMENTS |
+| INNODB_TRX |
+| INNODB_BUFFER_POOL_PAGES_BLOB |
+| INNODB_LOCK_WAITS |
+| INNODB_CMP_RESET |
+| INNODB_CMP |
+| INNODB_CMPMEM_RESET |
+| INNODB_CMPMEM |
| PBXT_STATISTICS |
+---------------------------------------+
Database: INFORMATION_SCHEMA
@@ -234,6 +294,18 @@ Database: INFORMATION_SCHEMA
| TRIGGERS |
| USER_PRIVILEGES |
| VIEWS |
+| INNODB_BUFFER_POOL_PAGES_INDEX |
+| INNODB_RSEG |
+| INNODB_LOCKS |
+| INNODB_BUFFER_POOL_PAGES |
+| XTRADB_ENHANCEMENTS |
+| INNODB_TRX |
+| INNODB_BUFFER_POOL_PAGES_BLOB |
+| INNODB_LOCK_WAITS |
+| INNODB_CMP_RESET |
+| INNODB_CMP |
+| INNODB_CMPMEM_RESET |
+| INNODB_CMPMEM |
| PBXT_STATISTICS |
+---------------------------------------+
Wildcard: inf_rmation_schema
@@ -244,5 +316,5 @@ Wildcard: inf_rmation_schema
+--------------------+
SELECT table_schema, count(*) FROM information_schema.TABLES WHERE table_schema IN ('mysql', 'INFORMATION_SCHEMA', 'test', 'mysqltest') AND table_name<>'ndb_binlog_index' AND table_name<>'ndb_apply_status' GROUP BY TABLE_SCHEMA;
table_schema count(*)
-information_schema 29
+information_schema 41
mysql 22
=== modified file 'mysql-test/r/innodb-autoinc.result'
--- a/mysql-test/r/innodb-autoinc.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb-autoinc.result 2009-06-09 15:08:46 +0000
@@ -197,7 +197,7 @@ c1 c2
5 9
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 100
auto_increment_offset 10
@@ -230,7 +230,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -269,7 +269,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -282,7 +282,7 @@ SELECT * FROM t1;
c1
-1
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 100
auto_increment_offset 10
@@ -315,7 +315,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -330,7 +330,7 @@ SELECT * FROM t1;
c1
1
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 100
auto_increment_offset 10
@@ -370,7 +370,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -385,7 +385,7 @@ SELECT * FROM t1;
c1
1
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 100
auto_increment_offset 10
@@ -419,7 +419,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -434,7 +434,7 @@ c1
1
9223372036854775794
SET @@SESSION.AUTO_INCREMENT_INCREMENT=2, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 2
auto_increment_offset 10
@@ -452,7 +452,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -467,7 +467,7 @@ c1
1
18446744073709551603
SET @@SESSION.AUTO_INCREMENT_INCREMENT=2, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 2
auto_increment_offset 10
@@ -485,7 +485,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -500,7 +500,7 @@ c1
1
18446744073709551603
SET @@SESSION.AUTO_INCREMENT_INCREMENT=5, @@SESSION.AUTO_INCREMENT_OFFSET=7;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 5
auto_increment_offset 7
@@ -514,7 +514,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -533,7 +533,7 @@ c1
-9223372036854775806
1
SET @@SESSION.AUTO_INCREMENT_INCREMENT=3, @@SESSION.AUTO_INCREMENT_OFFSET=3;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 3
auto_increment_offset 3
@@ -550,7 +550,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
@@ -568,7 +568,7 @@ SET @@SESSION.AUTO_INCREMENT_INCREMENT=1
Warnings:
Warning 1292 Truncated incorrect auto_increment_increment value: '1152921504606846976'
Warning 1292 Truncated incorrect auto_increment_offset value: '1152921504606846976'
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 65535
auto_increment_offset 65535
@@ -581,7 +581,7 @@ c1
DROP TABLE t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
=== modified file 'mysql-test/r/innodb-index.result'
--- a/mysql-test/r/innodb-index.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb-index.result 2009-06-09 15:08:46 +0000
@@ -1,3 +1,4 @@
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
create table t1(a int not null, b int, c char(10) not null, d varchar(20)) engine = innodb;
insert into t1 values (5,5,'oo','oo'),(4,4,'tr','tr'),(3,4,'ad','ad'),(2,3,'ak','ak');
commit;
@@ -47,6 +48,7 @@ t1 CREATE TABLE `t1` (
KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `t1#1`(a INT PRIMARY KEY) ENGINE=InnoDB;
+call mtr.add_suppression(" table `test`\\.`t1#[12]` already exists in InnoDB internal");
alter table t1 add unique index (c), add index (d);
ERROR HY000: Table 'test.t1#1' already exists
rename table `t1#1` to `t1#2`;
@@ -1132,3 +1134,4 @@ t2 CREATE TABLE `t2` (
) ENGINE=InnoDB DEFAULT CHARSET=latin1
DROP TABLE t2;
DROP TABLE t1;
+SET GLOBAL innodb_file_format_check=@save_innodb_file_format_check;
=== modified file 'mysql-test/r/innodb-zip.result'
--- a/mysql-test/r/innodb-zip.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb-zip.result 2009-06-09 15:08:46 +0000
@@ -141,7 +141,7 @@ drop table t1;
CREATE TABLE t1(c TEXT, PRIMARY KEY (c(440)))
ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1 CHARSET=ASCII;
ERROR 42000: Row size too large. The maximum row size for the used table type, not counting BLOBs, is 8126. You have to change some columns to TEXT or BLOBs
-CREATE TABLE t1(c TEXT, PRIMARY KEY (c(439)))
+CREATE TABLE t1(c TEXT, PRIMARY KEY (c(438)))
ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1 CHARSET=ASCII;
INSERT INTO t1 VALUES(REPEAT('A',512)),(REPEAT('B',512));
DROP TABLE t1;
=== modified file 'mysql-test/r/innodb.result'
--- a/mysql-test/r/innodb.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb.result 2009-06-09 15:08:46 +0000
@@ -1433,7 +1433,7 @@ insert t2 select * from t1;
insert t3 select * from t1;
checksum table t1, t2, t3, t4 quick;
Table Checksum
-test.t1 2948697075
+test.t1 3442722830
test.t2 NULL
test.t3 NULL
test.t4 NULL
@@ -1441,17 +1441,17 @@ Warnings:
Error 1146 Table 'test.t4' doesn't exist
checksum table t1, t2, t3, t4;
Table Checksum
-test.t1 2948697075
-test.t2 2948697075
-test.t3 2948697075
+test.t1 3442722830
+test.t2 3442722830
+test.t3 3442722830
test.t4 NULL
Warnings:
Error 1146 Table 'test.t4' doesn't exist
checksum table t1, t2, t3, t4 extended;
Table Checksum
-test.t1 2948697075
-test.t2 2948697075
-test.t3 2948697075
+test.t1 3442722830
+test.t2 3442722830
+test.t3 3442722830
test.t4 NULL
Warnings:
Error 1146 Table 'test.t4' doesn't exist
@@ -1781,6 +1781,7 @@ set global innodb_sync_spin_loops=20;
show variables like "innodb_sync_spin_loops";
Variable_name Value
innodb_sync_spin_loops 20
+SET @old_innodb_thread_concurrency= @@global.innodb_thread_concurrency;
show variables like "innodb_thread_concurrency";
Variable_name Value
innodb_thread_concurrency 0
@@ -1798,6 +1799,7 @@ set global innodb_thread_concurrency=16;
show variables like "innodb_thread_concurrency";
Variable_name Value
innodb_thread_concurrency 16
+SET @@global.innodb_thread_concurrency= @old_innodb_thread_concurrency;
show variables like "innodb_concurrency_tickets";
Variable_name Value
innodb_concurrency_tickets 500
=== modified file 'mysql-test/r/innodb_bug36169.result'
--- a/mysql-test/r/innodb_bug36169.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb_bug36169.result 2009-06-09 15:08:46 +0000
@@ -1,2 +1,5 @@
+SET @save_innodb_file_format=@@global.innodb_file_format;
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
+SET @save_innodb_file_per_table=@@global.innodb_file_per_table;
SET GLOBAL innodb_file_format='Barracuda';
SET GLOBAL innodb_file_per_table=ON;
=== modified file 'mysql-test/r/innodb_xtradb_bug317074.result'
--- a/mysql-test/r/innodb_xtradb_bug317074.result 2009-06-09 13:19:13 +0000
+++ b/mysql-test/r/innodb_xtradb_bug317074.result 2009-06-09 15:08:46 +0000
@@ -1,2 +1,5 @@
+SET @save_innodb_file_format=@@global.innodb_file_format;
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
+SET @save_innodb_file_per_table=@@global.innodb_file_per_table;
SET GLOBAL innodb_file_format='Barracuda';
SET GLOBAL innodb_file_per_table=ON;
=== modified file 'mysql-test/r/row-checksum-old.result'
--- a/mysql-test/r/row-checksum-old.result 2008-06-28 12:45:15 +0000
+++ b/mysql-test/r/row-checksum-old.result 2009-06-09 15:08:46 +0000
@@ -72,6 +72,8 @@ Table Checksum
test.t1 4108368782
drop table if exists t1;
create table t1 (a int null, v varchar(100)) engine=innodb checksum=0 row_format=fixed;
+Warnings:
+Warning 1478 InnoDB: assuming ROW_FORMAT=COMPACT.
insert into t1 values(null, null), (1, "hello");
checksum table t1;
Table Checksum
=== modified file 'mysql-test/r/row-checksum.result'
--- a/mysql-test/r/row-checksum.result 2008-06-28 12:45:15 +0000
+++ b/mysql-test/r/row-checksum.result 2009-06-09 15:08:46 +0000
@@ -72,6 +72,8 @@ Table Checksum
test.t1 3885665021
drop table if exists t1;
create table t1 (a int null, v varchar(100)) engine=innodb checksum=0 row_format=fixed;
+Warnings:
+Warning 1478 InnoDB: assuming ROW_FORMAT=COMPACT.
insert into t1 values(null, null), (1, "hello");
checksum table t1;
Table Checksum
=== modified file 'mysql-test/t/information_schema.test'
--- a/mysql-test/t/information_schema.test 2009-04-08 16:55:26 +0000
+++ b/mysql-test/t/information_schema.test 2009-06-09 15:08:46 +0000
@@ -43,7 +43,7 @@ create view v1 (c) as
table_name<>'ndb_binlog_index' AND
table_name<>'ndb_apply_status' AND
NOT (table_schema = 'INFORMATION_SCHEMA' AND table_name LIKE 'PBXT_%');
-select * from v1;
+select * from v1 ORDER BY c COLLATE utf8_bin;
select c,table_name from v1
inner join information_schema.TABLES v2 on (v1.c=v2.table_name)
=== modified file 'mysql-test/t/innodb-analyze.test'
--- a/mysql-test/t/innodb-analyze.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-analyze.test 2009-06-09 15:08:46 +0000
@@ -11,6 +11,7 @@
-- disable_result_log
-- enable_warnings
+SET @save_innodb_stats_sample_pages=@@innodb_stats_sample_pages;
SET GLOBAL innodb_stats_sample_pages=0;
# check that the value has been adjusted to 1
@@ -60,4 +61,5 @@ ANALYZE TABLE innodb_analyze;
SET GLOBAL innodb_stats_sample_pages=16;
ANALYZE TABLE innodb_analyze;
+SET GLOBAL innodb_stats_sample_pages=@save_innodb_stats_sample_pages;
DROP TABLE innodb_analyze;
=== modified file 'mysql-test/t/innodb-autoinc.test'
--- a/mysql-test/t/innodb-autoinc.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-autoinc.test 2009-06-09 15:08:46 +0000
@@ -156,7 +156,7 @@ DROP TABLE t1;
#
# Test changes to AUTOINC next value calculation
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 INT AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
INSERT INTO t1 VALUES (NULL),(5),(NULL);
@@ -173,7 +173,7 @@ DROP TABLE t1;
# Reset the AUTOINC session variables
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 INT AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(0);
@@ -193,13 +193,13 @@ DROP TABLE t1;
# Reset the AUTOINC session variables
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 INT AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(-1);
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
INSERT INTO t1 VALUES (-2), (NULL),(2),(NULL);
INSERT INTO t1 VALUES (250),(NULL);
SELECT * FROM t1;
@@ -214,13 +214,13 @@ DROP TABLE t1;
# Reset the AUTOINC session variables
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 INT UNSIGNED AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(-1);
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
INSERT INTO t1 VALUES (-2);
INSERT INTO t1 VALUES (NULL);
INSERT INTO t1 VALUES (2);
@@ -240,13 +240,13 @@ DROP TABLE t1;
# Reset the AUTOINC session variables
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 INT UNSIGNED AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(-1);
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=100, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
INSERT INTO t1 VALUES (-2),(NULL),(2),(NULL);
INSERT INTO t1 VALUES (250),(NULL);
SELECT * FROM t1;
@@ -262,7 +262,7 @@ DROP TABLE t1;
# Check for overflow handling when increment is > 1
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 BIGINT AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
# TODO: Fix the autoinc init code
@@ -271,7 +271,7 @@ INSERT INTO t1 VALUES(NULL);
INSERT INTO t1 VALUES (9223372036854775794); #-- 2^63 - 14
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=2, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
# This should just fit
INSERT INTO t1 VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL);
SELECT * FROM t1;
@@ -281,7 +281,7 @@ DROP TABLE t1;
# Check for overflow handling when increment and offser are > 1
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 BIGINT UNSIGNED AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
# TODO: Fix the autoinc init code
@@ -290,7 +290,7 @@ INSERT INTO t1 VALUES(NULL);
INSERT INTO t1 VALUES (18446744073709551603); #-- 2^64 - 13
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=2, @@SESSION.AUTO_INCREMENT_OFFSET=10;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
# This should fail because of overflow but it doesn't, it seems to be
# a MySQL server bug. It wraps around to 0 for the last value.
# See MySQL Bug# 39828
@@ -313,7 +313,7 @@ DROP TABLE t1;
# Check for overflow handling when increment and offset are odd numbers
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 BIGINT UNSIGNED AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
# TODO: Fix the autoinc init code
@@ -322,7 +322,7 @@ INSERT INTO t1 VALUES(NULL);
INSERT INTO t1 VALUES (18446744073709551603); #-- 2^64 - 13
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=5, @@SESSION.AUTO_INCREMENT_OFFSET=7;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
# This should fail because of overflow but it doesn't. It fails with
# a duplicate entry message because of a MySQL server bug, it wraps
# around. See MySQL Bug# 39828, once MySQL fix the bug we can replace
@@ -344,7 +344,7 @@ DROP TABLE t1;
# and check for large -ve numbers
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 BIGINT AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
# TODO: Fix the autoinc init code
@@ -355,7 +355,7 @@ INSERT INTO t1 VALUES(-92233720368547758
INSERT INTO t1 VALUES(-9223372036854775808); #-- -2^63
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=3, @@SESSION.AUTO_INCREMENT_OFFSET=3;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
INSERT INTO t1 VALUES (NULL),(NULL), (NULL);
SELECT * FROM t1;
DROP TABLE t1;
@@ -364,7 +364,7 @@ DROP TABLE t1;
# large numbers 2^60
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (c1 BIGINT UNSIGNED AUTO_INCREMENT, PRIMARY KEY(c1)) ENGINE=InnoDB;
# TODO: Fix the autoinc init code
@@ -373,7 +373,7 @@ INSERT INTO t1 VALUES(NULL);
INSERT INTO t1 VALUES (18446744073709551610); #-- 2^64 - 2
SELECT * FROM t1;
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1152921504606846976, @@SESSION.AUTO_INCREMENT_OFFSET=1152921504606846976;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
# This should fail because of overflow but it doesn't. It wraps around
# and the autoinc values look bogus too.
# See MySQL Bug# 39828, once MySQL fix the bug we can enable the error
@@ -396,7 +396,7 @@ DROP TABLE t1;
#
SET @@SESSION.AUTO_INCREMENT_INCREMENT=1, @@SESSION.AUTO_INCREMENT_OFFSET=1;
SET @@INSERT_ID=1;
-SHOW VARIABLES LIKE "%auto_inc%";
+SHOW VARIABLES LIKE "auto_inc%";
CREATE TABLE t1 (c1 DOUBLE NOT NULL AUTO_INCREMENT, c2 INT, PRIMARY KEY (c1)) ENGINE=InnoDB;
INSERT INTO t1 VALUES(NULL, 1);
INSERT INTO t1 VALUES(NULL, 2);
=== modified file 'mysql-test/t/innodb-index.test'
--- a/mysql-test/t/innodb-index.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-index.test 2009-06-09 15:08:46 +0000
@@ -1,5 +1,7 @@
-- source include/have_innodb.inc
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
+
create table t1(a int not null, b int, c char(10) not null, d varchar(20)) engine = innodb;
insert into t1 values (5,5,'oo','oo'),(4,4,'tr','tr'),(3,4,'ad','ad'),(2,3,'ak','ak');
commit;
@@ -20,6 +22,8 @@ show create table t1;
# Check how existing tables interfere with temporary tables.
CREATE TABLE `t1#1`(a INT PRIMARY KEY) ENGINE=InnoDB;
+call mtr.add_suppression(" table `test`\\.`t1#[12]` already exists in InnoDB internal");
+
--error 156
alter table t1 add unique index (c), add index (d);
rename table `t1#1` to `t1#2`;
@@ -509,3 +513,4 @@ SHOW CREATE TABLE t2;
DROP TABLE t2;
DROP TABLE t1;
+SET GLOBAL innodb_file_format_check=@save_innodb_file_format_check;
=== modified file 'mysql-test/t/innodb-zip.test'
--- a/mysql-test/t/innodb-zip.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-zip.test 2009-06-09 15:08:46 +0000
@@ -105,7 +105,11 @@ drop table t1;
--error ER_TOO_BIG_ROWSIZE
CREATE TABLE t1(c TEXT, PRIMARY KEY (c(440)))
ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1 CHARSET=ASCII;
-CREATE TABLE t1(c TEXT, PRIMARY KEY (c(439)))
+# The maximum key size for a compressed row actually depends on the
+# version of libz used, as account must be taken for the maximum
+# compressed size of a key, and this differs between libz
+# versions. Some libz versions allow a size of 439, some only 438.
+CREATE TABLE t1(c TEXT, PRIMARY KEY (c(438)))
ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1 CHARSET=ASCII;
INSERT INTO t1 VALUES(REPEAT('A',512)),(REPEAT('B',512));
DROP TABLE t1;
=== modified file 'mysql-test/t/innodb.test'
--- a/mysql-test/t/innodb.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb.test 2009-06-09 15:08:46 +0000
@@ -1163,7 +1163,7 @@ drop table t2;
# Test error handling
# Embedded server doesn't chdir to data directory
---replace_result $MYSQLTEST_VARDIR . master-data/ ''
+--replace_result $MYSQLTEST_VARDIR . mysqld.1/data/ ''
--error ER_WRONG_FK_DEF
create table t2 (id int(11) not null, id2 int(11) not null, constraint t1_id_fk foreign key (id2,id) references t1 (id)) engine = innodb;
@@ -1318,6 +1318,7 @@ set global innodb_sync_spin_loops=20;
show variables like "innodb_sync_spin_loops";
# Test for innodb_thread_concurrency variable
+SET @old_innodb_thread_concurrency= @@global.innodb_thread_concurrency;
show variables like "innodb_thread_concurrency";
set global innodb_thread_concurrency=1001;
show variables like "innodb_thread_concurrency";
@@ -1325,6 +1326,7 @@ set global innodb_thread_concurrency=0;
show variables like "innodb_thread_concurrency";
set global innodb_thread_concurrency=16;
show variables like "innodb_thread_concurrency";
+SET @@global.innodb_thread_concurrency= @old_innodb_thread_concurrency;
# Test for innodb_concurrency_tickets variable
show variables like "innodb_concurrency_tickets";
@@ -1357,7 +1359,7 @@ source include/varchar.inc;
#
# Embedded server doesn't chdir to data directory
---replace_result $MYSQLTEST_VARDIR . master-data/ ''
+--replace_result $MYSQLTEST_VARDIR . mysqld.1/data/ ''
create table t1 (v varchar(65530), key(v));
drop table t1;
create table t1 (v varchar(65536));
@@ -1632,7 +1634,7 @@ disconnect b;
set foreign_key_checks=0;
create table t2 (a int primary key, b int, foreign key (b) references t1(a)) engine = innodb;
# Embedded server doesn't chdir to data directory
---replace_result $MYSQLTEST_VARDIR . master-data/ ''
+--replace_result $MYSQLTEST_VARDIR . mysqld.1/data/ ''
-- error 1005
create table t1(a char(10) primary key, b varchar(20)) engine = innodb;
set foreign_key_checks=1;
@@ -1644,7 +1646,7 @@ drop table t2;
set foreign_key_checks=0;
create table t1(a varchar(10) primary key) engine = innodb DEFAULT CHARSET=latin1;
# Embedded server doesn't chdir to data directory
---replace_result $MYSQLTEST_VARDIR . master-data/ ''
+--replace_result $MYSQLTEST_VARDIR . mysqld.1/data/ ''
-- error 1005
create table t2 (a varchar(10), foreign key (a) references t1(a)) engine = innodb DEFAULT CHARSET=utf8;
set foreign_key_checks=1;
@@ -1675,7 +1677,7 @@ set foreign_key_checks=0;
create table t2 (a varchar(10), foreign key (a) references t1(a)) engine = innodb DEFAULT CHARSET=latin1;
create table t3(a varchar(10) primary key) engine = innodb DEFAULT CHARSET=utf8;
# Embedded server doesn't chdir to data directory
---replace_result $MYSQLTEST_VARDIR . master-data/ ''
+--replace_result $MYSQLTEST_VARDIR . mysqld.1/data/ ''
-- error 1025
rename table t3 to t1;
set foreign_key_checks=1;
@@ -2315,7 +2317,7 @@ ALTER TABLE t2 ADD FOREIGN KEY (a) REFER
# mysqltest first does replace_regex, then replace_result
--replace_regex /'[^']*test\/#sql-[0-9a-f_]*'/'#sql-temporary'/
# Embedded server doesn't chdir to data directory
---replace_result $MYSQLTEST_VARDIR . master-data/ ''
+--replace_result $MYSQLTEST_VARDIR . mysqld.1/data/ ''
--error 1025
ALTER TABLE t2 MODIFY a INT NOT NULL;
DELETE FROM t1;
=== modified file 'mysql-test/t/innodb_bug34300.test'
--- a/mysql-test/t/innodb_bug34300.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb_bug34300.test 2009-06-09 15:08:46 +0000
@@ -9,6 +9,7 @@
-- disable_result_log
# set packet size and reconnect
+SET @save_max_allowed_packet=@@global.max_allowed_packet;
SET @@global.max_allowed_packet=16777216;
--connect (newconn, localhost, root,,)
@@ -30,3 +31,6 @@ ALTER TABLE bug34300 ADD COLUMN (f10 INT
SELECT f4, f8 FROM bug34300;
DROP TABLE bug34300;
+disconnect newconn;
+connection default;
+SET @@global.max_allowed_packet=@save_max_allowed_packet;
=== modified file 'mysql-test/t/innodb_bug36169.test'
--- a/mysql-test/t/innodb_bug36169.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb_bug36169.test 2009-06-09 15:08:46 +0000
@@ -5,6 +5,9 @@
-- source include/have_innodb.inc
+SET @save_innodb_file_format=@@global.innodb_file_format;
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
+SET @save_innodb_file_per_table=@@global.innodb_file_per_table;
SET GLOBAL innodb_file_format='Barracuda';
SET GLOBAL innodb_file_per_table=ON;
@@ -1145,6 +1148,10 @@ KEY `idx44` (`col176`(100),`col42`,`col7
KEY `idx45` (`col2`(27),`col27`(116))
)engine=innodb ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1;
+SET GLOBAL innodb_file_format=@save_innodb_file_format;
+SET GLOBAL innodb_file_format_check=@save_innodb_file_format_check;
+SET GLOBAL innodb_file_per_table=@save_innodb_file_per_table;
+
DROP TABLE IF EXISTS table0;
DROP TABLE IF EXISTS table1;
DROP TABLE IF EXISTS table2;
=== modified file 'mysql-test/t/innodb_bug36172.test'
--- a/mysql-test/t/innodb_bug36172.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb_bug36172.test 2009-06-09 15:08:46 +0000
@@ -14,6 +14,9 @@ SET storage_engine=InnoDB;
-- disable_query_log
-- disable_result_log
+SET @save_innodb_file_format=@@global.innodb_file_format;
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
+SET @save_innodb_file_per_table=@@global.innodb_file_per_table;
SET GLOBAL innodb_file_format='Barracuda';
SET GLOBAL innodb_file_per_table=on;
@@ -23,4 +26,8 @@ insert ignore into `table0` set `col23`
CHECK TABLE table0 EXTENDED;
INSERT IGNORE INTO `table0` SET `col19` = '19940127002709', `col20` = 2383927.9055146948, `col21` = 4293243420.5621204000, `col22` = '20511211123705', `col23` = 4289899778.6573381000, `col24` = 4293449279.0540481000, `col25` = 'emphysemic', `col26` = 'dentally', `col27` = '2347406', `col28` = 'eruct', `col30` = 1222, `col31` = 4294372994.9941406000, `col32` = 4291385574.1173744000, `col33` = 'borrowing\'s', `col34` = 'septics', `col35` = 'ratter\'s', `col36` = 'Kaye', `col37` = 'Florentia', `col38` = 'allium', `col39` = 'barkeep', `col40` = '19510407003441', `col41` = 4293559200.4215522000, `col42` = 22482, `col43` = 'decussate', `col44` = 'Brom\'s', `col45` = 'violated', `col46` = 4925506.4635456400, `col47` = 930549, `col48` = '51296066', `col49` = 'voluminously', `col50` = '29306676', `col51` = -88, `col52` = -2153690, `col53` = 4290250202.1464887000, `col54` = 'expropriation', `col55` = 'Aberdeen\'s', `col56` = 20343, `col58` = '19640415171532', `col59` = 'extern', `col60` = 'Ubana', `col61` = 4290487961.8539081000, `col62` = '2147', `col63` = -24271, `col64` = '20750801194548', `col65` = 'Cunaxa\'s', `col66` = 'pasticcio', `col67` = 2795817, `col68` = 'Indore\'s', `col70` = 6864127, `col71` = '1817832', `col72` = '20540506114211', `col73` = '20040101012300', `col74` = 'rationalized', `col75` = '45522', `col76` = 'indene', `col77` = -6964559, `col78` = 4247535.5266884370, `col79` = '20720416124357', `col80` = '2143', `col81` = 4292060102.4466386000, `col82` = 'striving', `col83` = 'boneblack\'s', `col84` = 'redolent', `col85` = 6489697.9009369183, `col86` = 4287473465.9731131000, `col87` = 7726015, `col88` = 'perplexed', `col89` = '17153791', `col90` = 5478587.1108127078, `col91` = 4287091404.7004304000, `col92` = 'Boulez\'s', `col93` = '2931278';
CHECK TABLE table0 EXTENDED;
+
+SET GLOBAL innodb_file_format=@save_innodb_file_format;
+SET GLOBAL innodb_file_format_check=@save_innodb_file_format_check;
+SET GLOBAL innodb_file_per_table=@save_innodb_file_per_table;
DROP TABLE table0;
=== modified file 'mysql-test/t/innodb_xtradb_bug317074.test'
--- a/mysql-test/t/innodb_xtradb_bug317074.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb_xtradb_bug317074.test 2009-06-09 15:08:46 +0000
@@ -1,5 +1,8 @@
-- source include/have_innodb.inc
+SET @save_innodb_file_format=@@global.innodb_file_format;
+SET @save_innodb_file_format_check=@@global.innodb_file_format_check;
+SET @save_innodb_file_per_table=@@global.innodb_file_per_table;
SET GLOBAL innodb_file_format='Barracuda';
SET GLOBAL innodb_file_per_table=ON;
@@ -35,4 +38,8 @@ DROP PROCEDURE insert_many;
# The bug is hangup at the following statement
ALTER TABLE test1 ENGINE=MyISAM;
+SET GLOBAL innodb_file_format=@save_innodb_file_format;
+SET GLOBAL innodb_file_format_check=@save_innodb_file_format_check;
+SET GLOBAL innodb_file_per_table=@save_innodb_file_per_table;
+
DROP TABLE test1;
=== modified file 'mysql-test/t/partition_innodb.test'
--- a/mysql-test/t/partition_innodb.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/partition_innodb.test 2009-06-09 15:08:46 +0000
@@ -27,14 +27,14 @@ UPDATE t1 SET DATA = data*2 WHERE id = 3
# grouping/referencing in replace_regex is very slow on long strings,
# removing all before/after the interesting row before grouping/referencing
---replace_regex /.*---TRANSACTION [0-9A-F]+, .*, OS thread id [0-9]+// /MySQL thread id [0-9]+, query id [0-9]+ .*// /.*([0-9]+ lock struct\(s\)), heap size [0-9]+, ([0-9]+ row lock\(s\)).*/\1 \2/
+--replace_regex /.*LIST OF TRANSACTIONS FOR EACH SESSION:// /MySQL thread id [0-9]+, query id [0-9]+ .*// /.*([0-9]+ lock struct\(s\)), heap size [0-9]+, ([0-9]+ row lock\(s\)).*/\1 \2/
SHOW ENGINE InnoDB STATUS;
UPDATE t1 SET data = data*2 WHERE data = 2;
# grouping/referencing in replace_regex is very slow on long strings,
# removing all before/after the interesting row before grouping/referencing
---replace_regex /.*---TRANSACTION [0-9A-F]+, .*, OS thread id [0-9]+// /MySQL thread id [0-9]+, query id [0-9]+ .*// /.*([0-9]+ lock struct\(s\)), heap size [0-9]+, ([0-9]+ row lock\(s\)).*/\1 \2/
+--replace_regex /.*LIST OF TRANSACTIONS FOR EACH SESSION:// /MySQL thread id [0-9]+, query id [0-9]+ .*// /.*([0-9]+ lock struct\(s\)), heap size [0-9]+, ([0-9]+ row lock\(s\)).*/\1 \2/
SHOW ENGINE InnoDB STATUS;
SET @@session.tx_isolation = @old_tx_isolation;
=== modified file 'mysys/thr_mutex.c'
--- a/mysys/thr_mutex.c 2009-02-19 09:01:25 +0000
+++ b/mysys/thr_mutex.c 2009-06-09 15:08:46 +0000
@@ -149,6 +149,35 @@ static inline void remove_from_active_li
mp->prev= mp->next= 0;
}
+/*
+ We initialise the hashes for deadlock detection lazily.
+ This greatly helps with performance when lots of mutexes are initiased but
+ only a few of them are actually used (eg. XtraDB).
+*/
+static int safe_mutex_lazy_init_deadlock_detection(safe_mutex_t *mp)
+{
+ if (!my_multi_malloc(MY_FAE | MY_WME,
+ &mp->locked_mutex, sizeof(*mp->locked_mutex),
+ &mp->used_mutex, sizeof(*mp->used_mutex), NullS))
+ {
+ return 1; /* Error */
+ }
+
+ pthread_mutex_lock(&THR_LOCK_mutex);
+ mp->id= ++safe_mutex_id;
+ pthread_mutex_unlock(&THR_LOCK_mutex);
+ hash_init(mp->locked_mutex, &my_charset_bin,
+ 1000,
+ offsetof(safe_mutex_deadlock_t, id),
+ sizeof(mp->id),
+ 0, 0, HASH_UNIQUE);
+ hash_init(mp->used_mutex, &my_charset_bin,
+ 1000,
+ offsetof(safe_mutex_t, id),
+ sizeof(mp->id),
+ 0, 0, HASH_UNIQUE);
+ return 0;
+}
int safe_mutex_init(safe_mutex_t *mp,
const pthread_mutexattr_t *attr __attribute__((unused)),
@@ -167,35 +196,8 @@ int safe_mutex_init(safe_mutex_t *mp,
mp->line= line;
/* Skip the very common '&' prefix from the autogenerated name */
mp->name= name[0] == '&' ? name + 1 : name;
+ /* Deadlock detection is initialised only lazily, on first use. */
- if (safe_mutex_deadlock_detector && !( my_flags & MYF_NO_DEADLOCK_DETECTION))
- {
- if (!my_multi_malloc(MY_FAE | MY_WME,
- &mp->locked_mutex, sizeof(*mp->locked_mutex),
- &mp->used_mutex, sizeof(*mp->used_mutex), NullS))
- {
- /* Disable deadlock handling for this mutex */
- my_flags|= MYF_NO_DEADLOCK_DETECTION;
- }
- else
- {
- pthread_mutex_lock(&THR_LOCK_mutex);
- mp->id= ++safe_mutex_id;
- pthread_mutex_unlock(&THR_LOCK_mutex);
- hash_init(mp->locked_mutex, &my_charset_bin,
- 1000,
- offsetof(safe_mutex_deadlock_t, id),
- sizeof(mp->id),
- 0, 0, HASH_UNIQUE);
- hash_init(mp->used_mutex, &my_charset_bin,
- 1000,
- offsetof(safe_mutex_t, id),
- sizeof(mp->id),
- 0, 0, HASH_UNIQUE);
- }
- }
- else
- my_flags|= MYF_NO_DEADLOCK_DETECTION;
mp->create_flags= my_flags;
#ifdef SAFE_MUTEX_DETECT_DESTROY
@@ -310,7 +312,8 @@ int safe_mutex_lock(safe_mutex_t *mp, my
/* Deadlock detection */
mp->prev= mp->next= 0;
- if (!(mp->active_flags & (MYF_TRY_LOCK | MYF_NO_DEADLOCK_DETECTION)))
+ if (!(mp->active_flags & (MYF_TRY_LOCK | MYF_NO_DEADLOCK_DETECTION)) &&
+ (mp->used_mutex != NULL || !safe_mutex_lazy_init_deadlock_detection(mp)))
{
safe_mutex_t **mutex_in_use= my_thread_var_mutex_in_use();
@@ -643,7 +646,7 @@ int safe_mutex_destroy(safe_mutex_t *mp,
void safe_mutex_free_deadlock_data(safe_mutex_t *mp)
{
/* Free all entries that points to this one */
- if (!(mp->create_flags & MYF_NO_DEADLOCK_DETECTION))
+ if (!(mp->create_flags & MYF_NO_DEADLOCK_DETECTION) && mp->used_mutex != NULL)
{
pthread_mutex_lock(&THR_LOCK_mutex);
my_hash_iterate(mp->used_mutex,
=== modified file 'storage/xtradb/ibuf/ibuf0ibuf.c'
--- a/storage/xtradb/ibuf/ibuf0ibuf.c 2009-03-26 06:11:11 +0000
+++ b/storage/xtradb/ibuf/ibuf0ibuf.c 2009-06-09 15:08:46 +0000
@@ -422,7 +422,12 @@ ibuf_init_at_db_start(void)
grow in size, as the references on the upper levels of the tree can
change */
- ibuf->max_size = ut_min( buf_pool_get_curr_size() / UNIV_PAGE_SIZE
+ /* The default for ibuf_max_size is calculated from the requested
+ buffer pool size srv_buf_pool_size, not the actual size as returned
+ by buf_pool_get_curr_size(). The latter can differ from the former
+ by one page due to alignment requirements, and we do not want a
+ user-visible variable like INNODB_IBUF_MAX_SIZE to vary at random. */
+ ibuf->max_size = ut_min( srv_buf_pool_size / UNIV_PAGE_SIZE
/ IBUF_POOL_SIZE_PER_MAX_SIZE, (ulint) srv_ibuf_max_size / UNIV_PAGE_SIZE);
srv_ibuf_max_size = (long long) ibuf->max_size * UNIV_PAGE_SIZE;
=== modified file 'storage/xtradb/include/sync0rw.h'
--- a/storage/xtradb/include/sync0rw.h 2009-03-26 06:11:11 +0000
+++ b/storage/xtradb/include/sync0rw.h 2009-06-09 15:08:46 +0000
@@ -357,6 +357,8 @@ rw_lock_get_x_lock_count(
rw_lock_t* lock); /* in: rw-lock */
/************************************************************************
Accessor functions for rw lock. */
+
+#ifdef INNODB_RW_LOCKS_USE_ATOMICS
UNIV_INLINE
ulint
rw_lock_get_s_waiters(
@@ -372,6 +374,14 @@ ulint
rw_lock_get_wx_waiters(
/*================*/
rw_lock_t* lock);
+#else /* !INNODB_RW_LOCKS_USE_ATOMICS */
+UNIV_INLINE
+ulint
+rw_lock_get_waiters(
+/*==================*/
+ rw_lock_t* lock);
+#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
+
UNIV_INLINE
ulint
rw_lock_get_writer(
@@ -488,6 +498,7 @@ rw_lock_debug_print(
rw_lock_debug_t* info); /* in: debug struct */
#endif /* UNIV_SYNC_DEBUG */
+/*
#ifndef INNODB_RW_LOCKS_USE_ATOMICS
#error INNODB_RW_LOCKS_USE_ATOMICS is not defined. Do you use enough new GCC or compatibles?
#error Or do you use exact options for CFLAGS?
@@ -495,6 +506,7 @@ rw_lock_debug_print(
#error e.g. (for Sparc_64): "-m64 -mcpu=v9"
#error Otherwise, this build may be slower than normal version.
#endif
+*/
/* NOTE! The structure appears here only for the compiler to know its size.
Do not use its fields directly! The structure used in the spin lock
=== modified file 'storage/xtradb/include/sync0rw.ic'
--- a/storage/xtradb/include/sync0rw.ic 2009-03-26 06:11:11 +0000
+++ b/storage/xtradb/include/sync0rw.ic 2009-06-09 15:08:46 +0000
@@ -68,6 +68,8 @@ rw_lock_remove_debug_info(
/************************************************************************
Accessor functions for rw lock. */
+
+#ifdef INNODB_RW_LOCKS_USE_ATOMICS
UNIV_INLINE
ulint
rw_lock_get_s_waiters(
@@ -93,23 +95,32 @@ rw_lock_get_wx_waiters(
{
return(lock->wait_ex_waiters);
}
+#else /* !INNODB_RW_LOCKS_USE_ATOMICS */
+UNIV_INLINE
+ulint
+rw_lock_get_waiters(
+/*================*/
+ /* out: 1 if waiters, 0 otherwise */
+ rw_lock_t* lock) /* in: rw-lock */
+{
+ return(lock->waiters);
+}
+#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
+
/************************************************************************
Sets lock->waiters to 1. It is not an error if lock->waiters is already
1. On platforms where ATOMIC builtins are used this function enforces a
memory barrier. */
+#ifdef INNODB_RW_LOCKS_USE_ATOMICS
UNIV_INLINE
void
rw_lock_set_s_waiter_flag(
/*====================*/
rw_lock_t* lock) /* in: rw-lock */
{
-#ifdef INNODB_RW_LOCKS_USE_ATOMICS
// os_compare_and_swap(&lock->s_waiters, 0, 1);
__sync_lock_test_and_set(&lock->s_waiters, 1);
-#else /* INNODB_RW_LOCKS_USE_ATOMICS */
- lock->s_waiters = 1;
-#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
}
UNIV_INLINE
void
@@ -117,12 +128,8 @@ rw_lock_set_x_waiter_flag(
/*====================*/
rw_lock_t* lock) /* in: rw-lock */
{
-#ifdef INNODB_RW_LOCKS_USE_ATOMICS
// os_compare_and_swap(&lock->x_waiters, 0, 1);
__sync_lock_test_and_set(&lock->x_waiters, 1);
-#else /* INNODB_RW_LOCKS_USE_ATOMICS */
- lock->x_waiters = 1;
-#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
}
UNIV_INLINE
void
@@ -130,30 +137,34 @@ rw_lock_set_wx_waiter_flag(
/*====================*/
rw_lock_t* lock) /* in: rw-lock */
{
-#ifdef INNODB_RW_LOCKS_USE_ATOMICS
// os_compare_and_swap(&lock->wait_ex_waiters, 0, 1);
__sync_lock_test_and_set(&lock->wait_ex_waiters, 1);
-#else /* INNODB_RW_LOCKS_USE_ATOMICS */
- lock->wait_ex_waiters = 1;
-#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
}
+#else /* !INNODB_RW_LOCKS_USE_ATOMICS */
+UNIV_INLINE
+void
+rw_lock_set_waiter_flag(
+/*====================*/
+ rw_lock_t* lock) /* in: rw-lock */
+{
+ lock->waiters = 1;
+}
+#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
/************************************************************************
Resets lock->waiters to 0. It is not an error if lock->waiters is already
0. On platforms where ATOMIC builtins are used this function enforces a
memory barrier. */
+#ifdef INNODB_RW_LOCKS_USE_ATOMICS
+
UNIV_INLINE
void
rw_lock_reset_s_waiter_flag(
/*======================*/
rw_lock_t* lock) /* in: rw-lock */
{
-#ifdef INNODB_RW_LOCKS_USE_ATOMICS
// os_compare_and_swap(&lock->s_waiters, 1, 0);
__sync_lock_test_and_set(&lock->s_waiters, 0);
-#else /* INNODB_RW_LOCKS_USE_ATOMICS */
- lock->s_waiters = 0;
-#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
}
UNIV_INLINE
void
@@ -161,12 +172,8 @@ rw_lock_reset_x_waiter_flag(
/*======================*/
rw_lock_t* lock) /* in: rw-lock */
{
-#ifdef INNODB_RW_LOCKS_USE_ATOMICS
// os_compare_and_swap(&lock->x_waiters, 1, 0);
__sync_lock_test_and_set(&lock->x_waiters, 0);
-#else /* INNODB_RW_LOCKS_USE_ATOMICS */
- lock->x_waiters = 0;
-#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
}
UNIV_INLINE
void
@@ -174,13 +181,19 @@ rw_lock_reset_wx_waiter_flag(
/*======================*/
rw_lock_t* lock) /* in: rw-lock */
{
-#ifdef INNODB_RW_LOCKS_USE_ATOMICS
// os_compare_and_swap(&lock->wait_ex_waiters, 1, 0);
__sync_lock_test_and_set(&lock->wait_ex_waiters, 0);
-#else /* INNODB_RW_LOCKS_USE_ATOMICS */
- lock->wait_ex_waiters = 0;
-#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
}
+#else /* !INNODB_RW_LOCKS_USE_ATOMICS */
+UNIV_INLINE
+void
+rw_lock_reset_waiter_flag(
+/*======================*/
+ rw_lock_t* lock) /* in: rw-lock */
+{
+ lock->waiters = 0;
+}
+#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
/**********************************************************************
Returns the write-status of the lock - this function made more sense
=== modified file 'storage/xtradb/include/univ.i'
--- a/storage/xtradb/include/univ.i 2009-03-26 06:11:11 +0000
+++ b/storage/xtradb/include/univ.i 2009-06-09 15:08:46 +0000
@@ -210,7 +210,7 @@ operations (very slow); also UNIV_DEBUG
#define UNIV_BTR_DEBUG /* check B-tree links */
#define UNIV_LIGHT_MEM_DEBUG /* light memory debugging */
-#ifdef HAVE_purify
+#ifdef HAVE_valgrind
/* The following sets all new allocated memory to zero before use:
this can be used to eliminate unnecessary Purify warnings, but note that
it also masks many bugs Purify could detect. For detailed Purify analysis it
=== removed file 'storage/xtradb/setup.sh'
--- a/storage/xtradb/setup.sh 2009-06-09 11:16:11 +0000
+++ b/storage/xtradb/setup.sh 1970-01-01 00:00:00 +0000
@@ -1,47 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 1995, 2009, Innobase Oy. All Rights Reserved.
-#
-# This program is free software; you can redistribute it and/or modify it under
-# the terms of the GNU General Public License as published by the Free Software
-# Foundation; version 2 of the License.
-#
-# This program is distributed in the hope that it will be useful, but WITHOUT
-# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License along with
-# this program; if not, write to the Free Software Foundation, Inc., 59 Temple
-# Place, Suite 330, Boston, MA 02111-1307 USA
-#
-# Prepare the MySQL source code tree for building
-# with checked-out InnoDB Subversion directory.
-
-# This script assumes that the current directory is storage/innobase.
-
-set -eu
-
-TARGETDIR=../storage/innobase
-
-# link the build scripts
-BUILDSCRIPTS="compile-innodb compile-innodb-debug"
-for script in $BUILDSCRIPTS ; do
- ln -sf $TARGETDIR/$script ../../BUILD/
-done
-
-cd ../../mysql-test/t
-ln -sf ../$TARGETDIR/mysql-test/*.test ../$TARGETDIR/mysql-test/*.opt .
-cd ../r
-ln -sf ../$TARGETDIR/mysql-test/*.result .
-cd ../include
-ln -sf ../$TARGETDIR/mysql-test/*.inc .
-
-# Apply any patches that are needed to make the mysql-test suite successful.
-# These patches are usually needed because of deviations of behavior between
-# the stock InnoDB and the InnoDB Plugin.
-cd ../..
-for patch in storage/innobase/mysql-test/patches/*.diff ; do
- if [ "${patch}" != "storage/innobase/mysql-test/patches/*.diff" ] ; then
- patch -p0 < ${patch}
- fi
-done
=== modified file 'storage/xtradb/srv/srv0start.c'
--- a/storage/xtradb/srv/srv0start.c 2009-03-26 06:11:11 +0000
+++ b/storage/xtradb/srv/srv0start.c 2009-06-09 15:08:46 +0000
@@ -124,7 +124,7 @@ static char* srv_monitor_file_name;
/* Avoid warnings when using purify */
-#ifdef HAVE_purify
+#ifdef HAVE_valgrind
static int inno_bcmp(register const char *s1, register const char *s2,
register uint len)
{
[Maria-developers] bzr commit into MariaDB 5.1, with Maria 1.5:maria branch (knielsen:2714)
by knielsen@knielsen-hq.org 19 Jun '09
#At lp:maria
2714 knielsen(a)knielsen-hq.org 2009-06-18
Fix test cases after merge of XtraDB into MariaDB.
Manually merge some InnoDB changes into XtraDB.
Fix ALTER TABLE bug in XtraDB with wrong comparison of row type.
modified:
mysql-test/include/varchar.inc
mysql-test/mysql-test-run.pl
mysql-test/r/innodb.result
mysql-test/t/innodb-use-sys-malloc.test
sql/sql_table.cc
storage/xtradb/handler/ha_innodb.cc
storage/xtradb/include/pars0pars.h
storage/xtradb/include/univ.i
per-file messages:
mysql-test/include/varchar.inc
Fix the test case: which of several duplicate keys triggers the error is not
deterministic.
mysql-test/mysql-test-run.pl
InnoDB does not bother to free resources individually during shutdown; thanks to its own
memory tracking it can nevertheless free everything at exit. XtraDB adds an option, on by
default, to skip this extra tracking. This causes lots of Valgrind warnings, so the option
needs to be disabled for Valgrind testing.
mysql-test/r/innodb.result
Fix the test case: which of several duplicate keys triggers the error is not
deterministic.
mysql-test/t/innodb-use-sys-malloc.test
InnoDB does not bother to free resources individually during shutdown; thanks to its own
memory tracking it can nevertheless free everything at exit. XtraDB adds an option, on by
default, to skip this extra tracking. This causes lots of Valgrind warnings, so the option
needs to be disabled for Valgrind testing.
sql/sql_table.cc
Add some useful DBUG output for debugging ALTER TABLE.
storage/xtradb/handler/ha_innodb.cc
Fix that check_if_incompatible_data() did not realise that ROW_TYPE_DEFAULT is identical
to the default row format ROW_TYPE_COMPACT, causing excessive table copying in
ALTER TABLE.
Add some useful DBUG output for debugging ALTER TABLE.
Manually merge into XtraDB a few small changes for InnoDB from upstream MySQL.
storage/xtradb/include/pars0pars.h
Manually merge into XtraDB a few small changes for InnoDB from upstream MySQL.
storage/xtradb/include/univ.i
Manually merge a MariaDB fix in InnoDB into XtraDB.
=== modified file 'mysql-test/include/varchar.inc'
--- a/mysql-test/include/varchar.inc 2008-05-14 06:50:16 +0000
+++ b/mysql-test/include/varchar.inc 2009-06-18 12:39:21 +0000
@@ -86,6 +86,8 @@ explain select count(*) from t1 where v
--replace_column 9 #
explain select count(*) from t1 where v between 'a' and 'a ' and v between 'a ' and 'b\n';
+# Which duplicate entry triggers error is not deterministic.
+--replace_regex /Duplicate entry '[^']+' for key/Duplicate entry '{ ' for key/
--error ER_DUP_ENTRY
alter table t1 add unique(v);
alter table t1 add key(v);
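A minimal sketch of the non-determinism (hypothetical table and values, not taken from
varchar.inc): with several groups of rows that collide under the new unique key, because
trailing spaces are insignificant in VARCHAR comparisons, the server may name any of the
colliding values in the ER_DUP_ENTRY message.
  CREATE TABLE t1 (v VARCHAR(10)) ENGINE=InnoDB;
  INSERT INTO t1 VALUES ('a'), ('a '), ('b'), ('b ');
  -- Fails with ER_DUP_ENTRY; whether 'a' or 'b' is reported is not deterministic,
  -- hence the --replace_regex above.
  ALTER TABLE t1 ADD UNIQUE (v);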
=== modified file 'mysql-test/mysql-test-run.pl'
--- a/mysql-test/mysql-test-run.pl 2009-06-05 15:35:22 +0000
+++ b/mysql-test/mysql-test-run.pl 2009-06-18 12:39:21 +0000
@@ -1356,6 +1356,18 @@ sub command_line_setup {
join(" ", @valgrind_args), "\"");
}
+ # InnoDB does not bother to do individual de-allocations at exit. Instead it
+ # relies on a custom allocator to track every allocation, and frees all at
+ # once during exit.
+ # In XtraDB, an option use-sys-malloc is introduced (and on by default) to
+ # disable this (for performance). But this exposes Valgrind to all the
+ # missing de-allocations, so we need to disable it to at least get
+ # meaningful leak checking for the rest of the server.
+ if ($opt_valgrind_mysqld)
+ {
+ push(@opt_extra_mysqld_opt, "--loose-skip-innodb-use-sys-malloc");
+ }
+
mtr_report("Checking supported features...");
check_ndbcluster_support(\%mysqld_variables);
=== modified file 'mysql-test/r/innodb.result'
--- a/mysql-test/r/innodb.result 2009-06-09 15:08:46 +0000
+++ b/mysql-test/r/innodb.result 2009-06-18 12:39:21 +0000
@@ -1970,7 +1970,7 @@ explain select count(*) from t1 where v
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 ref v v 13 const # Using where; Using index
alter table t1 add unique(v);
-ERROR 23000: Duplicate entry 'v' for key 'v_2'
+ERROR 23000: Duplicate entry '{ ' for key 'v_2'
alter table t1 add key(v);
select concat('*',v,'*',c,'*',t,'*') as qq from t1 where v='a';
qq
=== modified file 'mysql-test/t/innodb-use-sys-malloc.test'
--- a/mysql-test/t/innodb-use-sys-malloc.test 2009-06-09 13:19:13 +0000
+++ b/mysql-test/t/innodb-use-sys-malloc.test 2009-06-18 12:39:21 +0000
@@ -1,4 +1,7 @@
--source include/have_innodb.inc
+# XtraDB has lots of memory leak warnings at shutdown when
+# --innodb-use-sys-malloc
+--source include/not_valgrind.inc
#display current value of innodb_use_sys_malloc
SELECT @@GLOBAL.innodb_use_sys_malloc;
=== modified file 'sql/sql_table.cc'
--- a/sql/sql_table.cc 2009-06-02 09:58:27 +0000
+++ b/sql/sql_table.cc 2009-06-18 12:39:21 +0000
@@ -5333,6 +5333,7 @@ compare_tables(TABLE *table,
!table->s->mysql_version ||
(table->s->frm_version < FRM_VER_TRUE_VARCHAR && varchar))
{
+ DBUG_PRINT("info", ("Basic checks -> ALTER_TABLE_DATA_CHANGED"));
*need_copy_table= ALTER_TABLE_DATA_CHANGED;
DBUG_RETURN(0);
}
@@ -5361,6 +5362,8 @@ compare_tables(TABLE *table,
if ((tmp_new_field->flags & NOT_NULL_FLAG) !=
(uint) (field->flags & NOT_NULL_FLAG))
{
+ DBUG_PRINT("info", ("NULL behaviour difference in field '%s' -> "
+ "ALTER_TABLE_DATA_CHANGED", new_field->field_name));
*need_copy_table= ALTER_TABLE_DATA_CHANGED;
DBUG_RETURN(0);
}
@@ -5382,6 +5385,8 @@ compare_tables(TABLE *table,
/* Evaluate changes bitmap and send to check_if_incompatible_data() */
if (!(tmp= field->is_equal(tmp_new_field)))
{
+ DBUG_PRINT("info", ("!field_is_equal('%s') -> ALTER_TABLE_DATA_CHANGED",
+ new_field->field_name));
*need_copy_table= ALTER_TABLE_DATA_CHANGED;
DBUG_RETURN(0);
}
@@ -5515,16 +5520,22 @@ compare_tables(TABLE *table,
/* Check if changes are compatible with current handler without a copy */
if (table->file->check_if_incompatible_data(create_info, changes))
{
+ DBUG_PRINT("info", ("check_if_incompatible_data() -> "
+ "ALTER_TABLE_DATA_CHANGED"));
*need_copy_table= ALTER_TABLE_DATA_CHANGED;
DBUG_RETURN(0);
}
if (*index_drop_count || *index_add_count)
{
+ DBUG_PRINT("info", ("Index dropped=%u added=%u -> "
+ "ALTER_TABLE_INDEX_CHANGED",
+ *index_drop_count, *index_add_count));
*need_copy_table= ALTER_TABLE_INDEX_CHANGED;
DBUG_RETURN(0);
}
+ DBUG_PRINT("info", (" -> ALTER_TABLE_METADATA_ONLY"));
*need_copy_table= ALTER_TABLE_METADATA_ONLY; // Tables are compatible
DBUG_RETURN(0);
}
=== modified file 'storage/xtradb/handler/ha_innodb.cc'
--- a/storage/xtradb/handler/ha_innodb.cc 2009-06-11 12:53:26 +0000
+++ b/storage/xtradb/handler/ha_innodb.cc 2009-06-18 12:39:21 +0000
@@ -5018,7 +5018,6 @@ convert_search_mode_to_innobase(
case HA_READ_MBR_WITHIN:
case HA_READ_MBR_DISJOINT:
case HA_READ_MBR_EQUAL:
- my_error(ER_TABLE_CANT_HANDLE_SPKEYS, MYF(0));
return(PAGE_CUR_UNSUPP);
/* do not use "default:" in order to produce a gcc warning:
enumeration value '...' not handled in switch
@@ -6720,6 +6719,7 @@ innobase_rename_table(
int error;
char* norm_to;
char* norm_from;
+ DBUG_ENTER("innobase_rename_table");
if (lower_case_table_names) {
srv_lower_case_table_names = TRUE;
@@ -6747,6 +6747,7 @@ innobase_rename_table(
if (error != DB_SUCCESS) {
FILE* ef = dict_foreign_err_file;
+ DBUG_PRINT("info", ("rename failed: %d", error));
fputs("InnoDB: Renaming table ", ef);
ut_print_name(ef, trx, TRUE, norm_from);
fputs(" to ", ef);
@@ -6767,7 +6768,7 @@ innobase_rename_table(
my_free(norm_to, MYF(0));
my_free(norm_from, MYF(0));
- return error;
+ DBUG_RETURN(error);
}
/*************************************************************************
Renames an InnoDB table. */
@@ -6900,7 +6901,7 @@ ha_innobase::records_in_range(
mode2);
} else {
- n_rows = 0;
+ n_rows = HA_POS_ERROR;
}
mem_heap_free(heap);
@@ -7614,7 +7615,7 @@ ha_innobase::get_foreign_key_list(THD *t
f_key_info.referenced_key_name = thd_make_lex_string(
thd, f_key_info.referenced_key_name,
foreign->referenced_index->name,
- strlen(foreign->referenced_index->name), 1);
+ (uint) strlen(foreign->referenced_index->name), 1);
}
else
f_key_info.referenced_key_name= 0;
@@ -8227,7 +8228,7 @@ innodb_show_status(
bool result = FALSE;
- if (stat_print(thd, innobase_hton_name, strlen(innobase_hton_name),
+ if (stat_print(thd, innobase_hton_name, (uint) strlen(innobase_hton_name),
STRING_WITH_LEN(""), str, flen)) {
result= TRUE;
}
@@ -8258,7 +8259,7 @@ innodb_mutex_show_status(
ulint rw_lock_count_os_yield= 0;
ulonglong rw_lock_wait_time= 0;
#endif /* UNIV_DEBUG */
- uint hton_name_len= strlen(innobase_hton_name), buf1len, buf2len;
+ uint hton_name_len= (uint) strlen(innobase_hton_name), buf1len, buf2len;
DBUG_ENTER("innodb_mutex_show_status");
DBUG_ASSERT(hton == innodb_hton_ptr);
@@ -8302,9 +8303,9 @@ innodb_mutex_show_status(
rw_lock_wait_time += mutex->lspent_time;
}
#else /* UNIV_DEBUG */
- buf1len= my_snprintf(buf1, sizeof(buf1), "%s:%lu",
+ buf1len= (uint) my_snprintf(buf1, sizeof(buf1), "%s:%lu",
mutex->cfile_name, (ulong) mutex->cline);
- buf2len= my_snprintf(buf2, sizeof(buf2), "os_waits=%lu",
+ buf2len= (uint) my_snprintf(buf2, sizeof(buf2), "os_waits=%lu",
mutex->count_os_wait);
if (stat_print(thd, innobase_hton_name,
@@ -8860,7 +8861,7 @@ ha_innobase::get_error_message(int error
{
trx_t* trx = check_trx_exists(ha_thd());
- buf->copy(trx->detailed_error, strlen(trx->detailed_error),
+ buf->copy(trx->detailed_error, (uint) strlen(trx->detailed_error),
system_charset_info);
return(FALSE);
@@ -9294,31 +9295,49 @@ ha_innobase::check_if_incompatible_data(
HA_CREATE_INFO* info,
uint table_changes)
{
+ enum row_type row_type, info_row_type;
+ DBUG_ENTER("ha_innobase::check_if_incompatible_data");
+
if (table_changes != IS_EQUAL_YES) {
- return(COMPATIBLE_DATA_NO);
+ DBUG_PRINT("info", ("table_changes != IS_EQUAL_YES "
+ "-> COMPATIBLE_DATA_NO"));
+ DBUG_RETURN(COMPATIBLE_DATA_NO);
}
/* Check that auto_increment value was not changed */
if ((info->used_fields & HA_CREATE_USED_AUTO) &&
info->auto_increment_value != 0) {
- return(COMPATIBLE_DATA_NO);
+ DBUG_PRINT("info", ("auto_increment_value changed -> "
+ "COMPATIBLE_DATA_NO"));
+ DBUG_RETURN(COMPATIBLE_DATA_NO);
}
/* Check that row format didn't change */
+ row_type = get_row_type();
+ info_row_type = info->row_type;
+ /* Default is compact. */
+ if (info_row_type == ROW_TYPE_DEFAULT)
+ info_row_type = ROW_TYPE_COMPACT;
if ((info->used_fields & HA_CREATE_USED_ROW_FORMAT) &&
- get_row_type() != info->row_type) {
+ row_type != info_row_type) {
- return(COMPATIBLE_DATA_NO);
+ DBUG_PRINT("info", ("get_row_type()=%d != info->row_type=%d -> "
+ "COMPATIBLE_DATA_NO",
+ row_type, info->row_type));
+ DBUG_RETURN(COMPATIBLE_DATA_NO);
}
/* Specifying KEY_BLOCK_SIZE requests a rebuild of the table. */
if (info->used_fields & HA_CREATE_USED_KEY_BLOCK_SIZE) {
- return(COMPATIBLE_DATA_NO);
+ DBUG_PRINT("info", ("HA_CREATE_USED_KEY_BLOCK_SIZE -> "
+ "COMPATIBLE_DATA_NO"));
+ DBUG_RETURN(COMPATIBLE_DATA_NO);
}
- return(COMPATIBLE_DATA_YES);
+ DBUG_PRINT("info", (" -> COMPATIBLE_DATA_YES"));
+ DBUG_RETURN(COMPATIBLE_DATA_YES);
}
/****************************************************************
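A hedged sketch of the kind of statement the row-format part of this fix targets (the
table and statement here are invented, assuming the table was created with the default
COMPACT row format; this is not taken from the patch itself):
  -- t1 gets the default (COMPACT) row format.
  CREATE TABLE t1 (a INT, b INT, KEY (a)) ENGINE=InnoDB;
  -- Passing ROW_FORMAT=DEFAULT used to compare ROW_TYPE_DEFAULT against the
  -- table's ROW_TYPE_COMPACT, look incompatible, and force a full table copy;
  -- with the fix the two should be treated as the same format.
  ALTER TABLE t1 ROW_FORMAT=DEFAULT;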
=== modified file 'storage/xtradb/include/pars0pars.h'
--- a/storage/xtradb/include/pars0pars.h 2009-03-26 06:11:11 +0000
+++ b/storage/xtradb/include/pars0pars.h 2009-06-18 12:39:21 +0000
@@ -700,7 +700,7 @@ struct for_node_struct{
definition */
que_node_t* loop_start_limit;/* initial value of loop variable */
que_node_t* loop_end_limit; /* end value of loop variable */
- int loop_end_value; /* evaluated value for the end value:
+ lint loop_end_value; /* evaluated value for the end value:
it is calculated only when the loop
is entered, and will not change within
the loop */
=== modified file 'storage/xtradb/include/univ.i'
--- a/storage/xtradb/include/univ.i 2009-06-11 12:53:26 +0000
+++ b/storage/xtradb/include/univ.i 2009-06-18 12:39:21 +0000
@@ -133,10 +133,10 @@ from Makefile.in->ut0auxconf.h */
# endif /* HAVE_ATOMIC_PTHREAD_T */
#endif /* HAVE_GCC_ATOMIC_BUILTINS */
-/* We only try to do explicit inlining of functions with gcc and
-Microsoft Visual C++ */
+/* Enable explicit inlining of functions only for compilers known to
+support it. */
-# if !defined(__GNUC__)
+# if !defined(__GNUC__) && !defined(__SUNPRO_C)
# undef UNIV_MUST_NOT_INLINE /* Remove compiler warning */
# define UNIV_MUST_NOT_INLINE
# endif
[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Guest - Sat, 13 Jun 2009, 06:29)=-=-
Category updated.
--- /tmp/wklog.24.old.25753 2009-06-13 06:29:10.000000000 +0300
+++ /tmp/wklog.24.new.25753 2009-06-13 06:29:10.000000000 +0300
@@ -1 +1 @@
-Server-BackLog
+Server-Sprint
-=-=(Guest - Sat, 13 Jun 2009, 06:14)=-=-
Category updated.
--- /tmp/wklog.24.old.24991 2009-06-13 06:14:03.000000000 +0300
+++ /tmp/wklog.24.new.24991 2009-06-13 06:14:03.000000000 +0300
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-BackLog
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Version updated.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
------------------------------------------------------------
-=-=(View All Progress Notes, 11 total)=-=-
http://askmonty.org/worklog/index.pl?tid=24&nolimit=1
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses of the form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that a possibly poor range access will cause a possibly
good index_merge access not to be considered.
This WL is about lifting this limitation, at least for some subset of WHERE
clauses.
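A concrete illustration of the above, with invented table, index and constant names
(key1 is assumed to have one very frequent value, key2 and key3 to be highly selective):
  CREATE TABLE t (
    key1 INT, key2 INT, key3 INT, filler VARCHAR(100),
    INDEX i1 (key1), INDEX i2 (key2), INDEX i3 (key3)
  ) ENGINE=InnoDB;

  SELECT * FROM t
  WHERE key1 = 1                         -- 'very-frequent-value'
    AND (key2 = 12345 OR key3 = 67890);  -- 'rare-value1' OR 'rare-value2'
  -- union(i2,i3) is discarded early in favour of range(i1), and range(i1)
  -- may then itself be rejected on cost.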
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to make many changes in the range analyzer, but will be
able to keep the potential index_merge plan around just long enough that it's possible to
take it into consideration together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...)
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := condition that allows one to construct range access on keyX
and doesn't allow one to construct range/index_merge accesses
for any keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1})
SEL_TREE(key3_2})
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
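For example, a query of the following shape (table and column names invented here, each
keyN assumed to be the first part of some index on the table) matches the template above
and would get a SEL_TREE like (1):
  SELECT * FROM tbl
  WHERE key1 = 10                  -- range_cond(key1_1)
    AND key2 BETWEEN 1 AND 100     -- range_cond(key2_1)
    AND note LIKE '%foo%'          -- other_cond
    AND (key3 = 1 OR key4 = 2)     -- index_merge_OR_cond1
    AND (key5 = 3 OR key6 = 4);    -- index_merge_OR_cond2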
Limitations
-----------
This will not be a full solution in a sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in other form
(e.g. brackets were opened).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be included
(google's/percona's/maria/etc.)?
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
</contents>
1. Current implementation overview
==================================
At the moment, range analyzer works as follows:
SEL_TREE structure represents
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
# a merge tree represents several ways to do index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set of different indexes to use.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
if (both A and B are range trees)
return a range_tree with computed intersection for each range;
if (only one of A and B is a range tree)
return that tree; // DISCARD-IMERGE-1
// at this point both trees are index_merge trees
return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
tree_or(A, B)
{
if (A and B are range trees)
{
R = new range_tree;
for each index i
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
return R;
else
{
/* could not build any range accesses. construct index_merge */
remove non-ranges from A; // DISCARD-IMERGE-2
remove non-ranges from B;
return new index_merge(A, B);
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
{
Perform this transformation:
range_treeA // this is A
OR
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
(range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
Now each line represents an index_merge..
}
else if (both A and B are index_merge trees)
{
Perform this transformation:
imergeA1 AND imergeA2 AND ... AND imergeAN
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
-> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
imergeA1
OR
imergeB1 AND imergeB2 AND ... AND imergeBN =
= (combine imergeA1 with each of the imergeB{i} ) =
combine(imergeA1 OR imergeB1) AND
combine(imergeA1 OR imergeB2) AND
... AND
combine(imergeA1 OR imergeBN)
}
}
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (the condition on t.badkey may have arbitrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
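In SQL terms (constants invented, indexes as defined above), the discarded plan looks
like this:
  CREATE TABLE t2 (
    col1 INT, col2 INT, col3 INT,
    INDEX i1 (col1, col2),
    INDEX i2 (col1, col3)
  ) ENGINE=InnoDB;

  SELECT * FROM t2 WHERE col1 = 10 AND (col2 = 20 OR col3 = 30);
  -- Only the col1=10 prefix is used as a range on i1 or i2; the union of
  -- (col1=10, col2=20) on i1 with (col1=10, col3=30) on i2 is never built.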
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
SEL_TREE structure will be now able to hold both index_merge and range scan
candidates at the same time. That is,
sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty)
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove index_merge part of the tree (this will take care of
DISCARD-IMERGE-1 problem)
A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equivalent of this operation:
rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
2.2 New tree_or()
-----------------
O1. Don't remove non-range plans:
Current tree_or() code will refuse to produce index_merge plans for
conditions like
"t.key1part2=const OR t.key2part1=const"
(this is marked as DISCARD-IMERGE-3). This was justified because the left part of
the AND condition is not usable for range access, and the operation of
tree_and() guaranteed that there was no way it could be changed to make a
usable range plan. With the new tree_and() and rule A2, this is no longer the
case. For example, for this query:
(t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
it will construct a
imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
then tree_and() will apply rule A2 to push the range down into index merge
and after that we'll have:
range(t.key1part1=const)
imerge(
t.key1part2=const AND t.key1part1=const,
t.key2part1=const
)
note that imerge(...) describes a usable index_merge plan and it's possible
that it will be the best access path.
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create an index_merge made of the index scans that didn't have
a match in the other sel_tree.
Illustrating it with an example:
| sel_tree_A | sel_tree_B | A or B | include in index_merge?
------+------------+------------+--------+------------------------
key1 | cond1 | cond2 | condM | no
key2 | cond3 | cond4 | NULL | no
key3 | cond5 | | | yes, A-side
key4 | cond6 | | | yes, A-side
key5 | | cond7 | | yes, B-side
key6 | | cond8 | | yes, B-side
here we assume that
- (cond1 OR cond2) did produce a combined range. Not including them in
index_merge.
- (cond3 OR cond4) didn't produce a usable range (e.g. they were
t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
didn't yield any range list)
- All other scans didn't have their counterparts, so we'll end up with a
SEL_TREE of:
range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
.
O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
lift it, and produce a cross-product:
((key1p OR key2p) AND (key3p OR key4p))
OR
((key5p OR key6p) AND (key7p OR key8p))
= (key1p OR key2p OR key5p OR key6p) AND // this part is currently
(key3p OR key4p OR key5p OR key6p) AND // produced
(key1p OR key2p OR key5p OR key6p) AND // this part will be added
(key3p OR key4p OR key5p OR key6p) //.
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 18 Jun '09
by worklog-noreply@askmonty.org 18 Jun '09
18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 16:55)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.19152 2009-06-18 16:55:00.000000000 +0300
+++ /tmp/wklog.24.new.19152 2009-06-18 16:55:00.000000000 +0300
@@ -141,13 +141,15 @@
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
+
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
-1. Don't remove index_merge part of the tree.
+A1. Don't remove index_merge part of the tree (this will take care of
+ DISCARD-IMERGE-1 problem)
-2. Push range conditions down into index_merge trees that may support them.
+A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
@@ -155,8 +157,86 @@
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
-3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
-2.2 New tree_or()
+2.2 New tree_or()
+-----------------
+O1. Dont remove non-range plans:
+ Current tree_or() code will refuse to produce index_merge plans for
+ conditions like
+
+ "t.key1part2=const OR t.key2part1=const"
+
+ (this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
+ the AND condition is not usable for range access, and the operation of
+ tree_and() guaranteed that there was no way it could changed to make a
+ usable range plan. With new tree_and() and rule A2, this is no longer the
+ case. For example for this query:
+
+ (t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
+
+ it will construct a
+
+ imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
+
+ then tree_and() will apply rule A2 to push the range down into index merge
+ and after that we'll have:
+
+ range(t.key1part1=const)
+ imerge(
+ t.key1part2=const AND t.key1part1=const,
+ t.key2part1=const
+ )
+ note that imerge(...) describes a usable index_merge plan and it's possible
+ that it will be the best access path.
+
+O2. "Create index_merge accesses when possible"
+ Current tree_or() will not create index_merge access when it could create
+ non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
+ in the current implementation" section). This will be changed to work as
+ follows: we will create index_merge made for index scans that didn't have
+ their match in the other sel_tree.
+ Ilustrating it with an example:
+
+ | sel_tree_A | sel_tree_B | A or B | include in index_merge?
+ ------+------------+------------+--------+------------------------
+ key1 | cond1 | cond2 | condM | no
+ key2 | cond3 | cond4 | NULL | no
+ key3 | cond5 | | | yes, A-side
+ key4 | cond6 | | | yes, A-side
+ key5 | | cond7 | | yes, B-side
+ key6 | | cond8 | | yes, B-side
+
+ here we assume that
+ - (cond1 OR cond2) did produce a combined range. Not including them in
+ index_merge.
+ - (cond3 OR cond4) didn't produce a usable range (e.g. they were
+ t.key1part1=c1 AND t.key1part2=c1, respectively, and combining them
+ didn't yield any range list)
+ - All other scand didn't have their counterparts, so we'll end up with a
+ SEL_TREE of:
+
+ range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8))
+ .
+
+O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain there. The idea is
+that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven
+seen any complaints that could be attributed to it.
+If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
+lift it ,and produce a cross-product:
+
+ ((key1p OR key2p) AND (key3p OR key4p))
+ OR
+ ((key5p OR key6p) AND (key7p OR key8p))
+
+ = (key1p OR key2p OR key5p OR key6p) AND // this part is currently
+ (key3p OR key4p OR key5p OR key6p) AND // produced
+
+ (key1p OR key2p OR key5p OR key6p) AND // this part will be added
+ (key3p OR key4p OR key5p OR key6p) //.
+
+In order to limit the impact of this combinatorial explosion, we will
+introduce a rule that we won't generate more than #defined
+MAX_IMERGE_OPTS options.
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Guest - Sat, 13 Jun 2009, 06:29)=-=-
Category updated.
--- /tmp/wklog.24.old.25753 2009-06-13 06:29:10.000000000 +0300
+++ /tmp/wklog.24.new.25753 2009-06-13 06:29:10.000000000 +0300
@@ -1 +1 @@
-Server-BackLog
+Server-Sprint
-=-=(Guest - Sat, 13 Jun 2009, 06:14)=-=-
Category updated.
--- /tmp/wklog.24.old.24991 2009-06-13 06:14:03.000000000 +0300
+++ /tmp/wklog.24.new.24991 2009-06-13 06:14:03.000000000 +0300
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-BackLog
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Version updated.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
------------------------------------------------------------
-=-=(View All Progress Notes, 11 total)=-=-
http://askmonty.org/worklog/index.pl?tid=24&nolimit=1
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses in form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that possible poor range access will cause possible
good index_merge access not to be considered.
This WL is to about lifting this limitation at least for some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to do much changes in the range analyzer, but will be
able to keep potential index_merge plan just enough so that it's possible to
take it into consideration together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...)
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := condition that allows to construct range access of keyX
and doesn't allow to construct range/index_merge accesses
for any keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1})
SEL_TREE(key3_2})
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
Limitations
-----------
This will not be a full solution in a sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in other form
(e.g. brackets were opened).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches are should be in
(google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
</contents>
1. Current implementation overview
==================================
At the moment, range analyzer works as follows:
SEL_TREE structure represents
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
# merge tree represents several way to index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set to use of different indexes.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
if (both A and B are range trees)
return a range_tree with computed intersection for each range;
if (only one of A and B is a range tree)
return that tree; // DISCARD-IMERGE-1
// at this point both trees are index_merge trees
return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
tree_or(A, B)
{
if (A and B are range trees)
{
R = new range_tree;
for each index i
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
return R;
else
{
/* could not build any range accesses. construct index_merge */
remove non-ranges from A; // DISCARD-IMERGE-2
remove non-ranges from B;
return new index_merge(A, B);
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
{
Perform this transformation:
range_treeA // this is A
OR
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
(range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
Now each line represents an index_merge..
}
else if (both A and B are index_merge trees)
{
Perform this transformation:
imergeA1 AND imergeA2 AND ... AND imergeAN
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
-> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
imergeA1
OR
imergeB1 AND imergeB2 AND ... AND imergeBN =
= (combine imergeA1 with each of the imergeB{i} ) =
combine(imergeA1 OR imergeB1) AND
combine(imergeA1 OR imergeB2) AND
... AND
combine(imergeA1 OR imergeBN)
}
}
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
the WHERE clause has this form (conditions t.badkey may have abritrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will generate the plans that only use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
SEL_TREE structure will be now able to hold both index_merge and range scan
candidates at the same time. That is,
sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty)
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
A1. Don't remove index_merge part of the tree (this will take care of
DISCARD-IMERGE-1 problem)
A2. Push range conditions down into index_merge trees that may support them.
if one tree has range(key1) and the other tree has imerge(key1 OR key2)
then perform an equvalent of this operation:
rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
A3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
2.2 New tree_or()
-----------------
O1. Dont remove non-range plans:
Current tree_or() code will refuse to produce index_merge plans for
conditions like
"t.key1part2=const OR t.key2part1=const"
(this is marked as DISCARD-IMERGE-3). This was justifed as the left part of
the AND condition is not usable for range access, and the operation of
tree_and() guaranteed that there was no way it could changed to make a
usable range plan. With new tree_and() and rule A2, this is no longer the
case. For example for this query:
(t.key1part2=const OR t.key2part1=const) AND t.key1part1=const
it will construct a
imerge(t.key1part2=const OR t.key2part1=const), range(t.key1part1=const)
then tree_and() will apply rule A2 to push the range down into index merge
and after that we'll have:
range(t.key1part1=const)
imerge(
t.key1part2=const AND t.key1part1=const,
t.key2part1=const
)
Note that imerge(...) describes a usable index_merge plan, and it's possible
that it will be the best access path.
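A hedged illustration of O1 (schema and constants invented; whether the
imerge plan is actually chosen still depends on costing):

  CREATE TABLE t2 (
    key1part1 INT,
    key1part2 INT,
    key2part1 INT,
    INDEX key1 (key1part1, key1part2),
    INDEX key2 (key2part1)
  );

  -- With A2 + O1 the analyzer can keep both candidates for this query:
  --   range:  key1part1=10                         (prefix of index key1)
  --   imerge: (key1part1=10 AND key1part2=20) OR (key2part1=30)
  SELECT * FROM t2 WHERE (key1part2=20 OR key2part1=30) AND key1part1=10;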
O2. "Create index_merge accesses when possible"
Current tree_or() will not create index_merge access when it could create
non-index merge access (see DISCARD-IMERGE-3 and its example in the "Problems
in the current implementation" section). This will be changed to work as
follows: we will create an index_merge from the index scans that didn't have
a match in the other sel_tree.
Illustrating it with an example:
| sel_tree_A | sel_tree_B | A or B | include in index_merge?
------+------------+------------+--------+------------------------
key1 | cond1 | cond2 | condM | no
key2 | cond3 | cond4 | NULL | no
key3 | cond5 | | | yes, A-side
key4 | cond6 | | | yes, A-side
key5 | | cond7 | | yes, B-side
key6 | | cond8 | | yes, B-side
Here we assume that
- (cond1 OR cond2) did produce a combined range, so we do not include them in
the index_merge.
- (cond3 OR cond4) didn't produce a usable range (e.g. they were
t.key1part1=c1 and t.key1part2=c1, respectively, and combining them
didn't yield any range list).
- All other scans didn't have their counterparts, so we'll end up with a
SEL_TREE of:
range(condM) AND index_merge((cond5 AND cond6),(cond7 AND cond8)).
O4. There is no O4. DISCARD-INDEX-MERGE-4 will remain in place. The idea is
that although DISCARD-INDEX-MERGE-4 does discard plans, so far we haven't
seen any complaints that could be attributed to it.
If we face the need to lift DISCARD-INDEX-MERGE-4, our answer will be to
lift it, and produce a cross-product:
((key1p OR key2p) AND (key3p OR key4p))
OR
((key5p OR key6p) AND (key7p OR key8p))
= (key1p OR key2p OR key5p OR key6p) AND // this part is currently
(key1p OR key2p OR key7p OR key8p) AND // produced
(key3p OR key4p OR key5p OR key6p) AND // this part will be added
(key3p OR key4p OR key7p OR key8p)
In order to limit the impact of this combinatorial explosion, we will
introduce a rule that we won't generate more than #defined
MAX_IMERGE_OPTS options.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
by worklog-noreply@askmonty.org 18 Jun '09
by worklog-noreply@askmonty.org 18 Jun '09
18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-Sprint
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 14:56)=-=-
Low Level Design modified.
--- /tmp/wklog.24.old.15612 2009-06-18 14:56:09.000000000 +0300
+++ /tmp/wklog.24.new.15612 2009-06-18 14:56:09.000000000 +0300
@@ -1 +1,162 @@
+<contents>
+1. Current implementation overview
+1.1. Problems in the current implementation
+2. New implementation
+2.1 New tree_and()
+2.2 New tree_or()
+</contents>
+
+1. Current implementation overview
+==================================
+At the moment, range analyzer works as follows:
+
+SEL_TREE structure represents
+
+ # There are sel_trees, a sel_tree is either range or merge tree
+ sel_tree = range_tree | imerge_tree
+
+ # a range tree has range access options, possibly for several keys
+ range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
+
+ # merge tree represents several way to index_merge
+ imerge_tree = imerge1 AND imerge2 AND ...
+
+ # a way to do index merge == a set to use of different indexes.
+ imergeX = range_tree1 OR range_tree2 OR ..
+ where no pair of range_treeX have ranges over the same index.
+
+
+ tree_and(A, B)
+ {
+ if (both A and B are range trees)
+ return a range_tree with computed intersection for each range;
+ if (only one of A and B is a range tree)
+ return that tree; // DISCARD-IMERGE-1
+ // at this point both trees are index_merge trees
+ return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
+ }
+
+
+ tree_or(A, B)
+ {
+ if (A and B are range trees)
+ {
+ R = new range_tree;
+ for each index i
+ R.add(range_union(A.range(i), B.range(i)));
+
+ if (R has at least one range access)
+ return R;
+ else
+ {
+ /* could not build any range accesses. construct index_merge */
+ remove non-ranges from A; // DISCARD-IMERGE-2
+ remove non-ranges from B;
+ return new index_merge(A, B);
+ }
+ }
+ else if (A is range tree and B is index_merge tree (or vice versa))
+ {
+ Perform this transformation:
+
+ range_treeA // this is A
+ OR
+ (range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
+ (range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_kN) AND
+ =
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+ (range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
+ ...
+ (range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
+
+ Now each line represents an index_merge..
+ }
+ else if (both A and B are index_merge trees)
+ {
+ Perform this transformation:
+
+ imergeA1 AND imergeA2 AND ... AND imergeAN
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN
+
+ -> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
+
+ imergeA1
+ OR
+ imergeB1 AND imergeB2 AND ... AND imergeBN =
+
+ = (combine imergeA1 with each of the imergeB{i} ) =
+
+ combine(imergeA1 OR imergeB1) AND
+ combine(imergeA1 OR imergeB2) AND
+ ... AND
+ combine(imergeA1 OR imergeBN)
+ }
+ }
+
+1.1. Problems in the current implementation
+-------------------------------------------
+As marked in the code above:
+
+DISCARD-IMERGE-1 step will cause index_merge option to be discarded when
+the WHERE clause has this form:
+
+ (t.key1=c1 OR t.key2=c2) AND t.badkey < c3
+
+DISCARD-IMERGE-2 step will cause index_merge option to be discarded when
+the WHERE clause has this form (conditions t.badkey may have abritrary form):
+
+ (t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
+
+DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
+two indexes:
+
+ INDEX i1(col1, col2),
+ INDEX i2(col1, col3)
+
+and this WHERE clause:
+
+ col1=c1 AND (col2=c2 OR col3=c3)
+
+The optimizer will generate the plans that only use the "col1=c1" part. The
+right side of the AND will be ignored even if it has good selectivity.
+
+
+2. New implementation
+=====================
+
+<general idea>
+* Don't start fighting combinatorial explosion until we've actually got one.
+</>
+
+SEL_TREE structure will be now able to hold both index_merge and range scan
+candidates at the same time. That is,
+
+ sel_tree2 = range_tree AND imerge_tree
+
+where both parts are optional (i.e. can be empty)
+
+Operations on SEL_ARG trees will be modified to produce/process the trees of
+this kind:
+
+2.1 New tree_and()
+------------------
+In order not to lose plans, we'll make these changes:
+
+1. Don't remove index_merge part of the tree.
+
+2. Push range conditions down into index_merge trees that may support them.
+ if one tree has range(key1) and the other tree has imerge(key1 OR key2)
+ then perform an equvalent of this operation:
+
+ rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
+
+ (rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
+
+3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
+ concatenate them together.
+
+2.2 New tree_or()
-=-=(Guest - Sat, 13 Jun 2009, 06:29)=-=-
Category updated.
--- /tmp/wklog.24.old.25753 2009-06-13 06:29:10.000000000 +0300
+++ /tmp/wklog.24.new.25753 2009-06-13 06:29:10.000000000 +0300
@@ -1 +1 @@
-Server-BackLog
+Server-Sprint
-=-=(Guest - Sat, 13 Jun 2009, 06:14)=-=-
Category updated.
--- /tmp/wklog.24.old.24991 2009-06-13 06:14:03.000000000 +0300
+++ /tmp/wklog.24.new.24991 2009-06-13 06:14:03.000000000 +0300
@@ -1 +1 @@
-Server-RawIdeaBin
+Server-BackLog
-=-=(Psergey - Wed, 03 Jun 2009, 12:09)=-=-
Dependency created: 30 now depends on 24
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Version updated.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305 2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305 2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point
+where it is really necessary.
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to
+take it into consideration together with range access plans.
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND
+ range_cond(key2_1) AND
+ other_cond AND
+ index_merge_OR_cond1(key3_1, key3_2, ...)
+ index_merge_OR_cond2(key4_1, key4_2, ...)
+
+where
+
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR
+ range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+ and doesn't allow to construct range/index_merge accesses
+ for any keys of the table in question.
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+ SEL_TREE(
+ range(key1_1),
+ ...
+ range(key2_1),
+ SEL_IMERGE( (1)
+ SEL_TREE(key3_1})
+ SEL_TREE(key3_2})
+ ...
+ )
+ ...
+ )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+ index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok
+ it did but that was done for accesses to different tables). Will there be any
+ possible gotchas here?
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses of the form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider the cost of using range(key1) and discard that plan as well
and the overall effect is that a possibly poor range access will cause a
possibly good index_merge access not to be considered.
This WL is about lifting this limitation, at least for some subset of WHERE
clauses.
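A hedged, self-contained version of the example above (table, index and value
names are invented); today the union over key2 and key3 is never costed for
this query:

  CREATE TABLE t (
    key1 VARCHAR(32), key2 VARCHAR(32), key3 VARCHAR(32),
    INDEX (key1), INDEX (key2), INDEX (key3)
  );

  SELECT * FROM t
  WHERE t.key1 = 'very-frequent-value'
    AND (t.key2 = 'rare-value1' OR t.key3 = 'rare-value2');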
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to make many changes in the range analyzer, but will
be able to keep potential index_merge plans around just long enough that they
can be taken into consideration together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...) AND
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := a condition that allows constructing range access on keyX
and doesn't allow constructing range/index_merge accesses
for any other keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1)
SEL_TREE(key3_2)
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
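For example (a hypothetical table tbl with single-column indexes on key1_1,
key3_1 and key3_2; names invented to match the grammar above), a clause of
this shape would be analyzed into a SEL_TREE like (1):

  -- range_cond(key1_1) AND index_merge_OR_cond1(key3_1, key3_2)
  SELECT * FROM tbl
  WHERE key1_1 = 100
    AND (key3_1 = 1 OR key3_2 = 2);

  -- sketched analyzer output, in the notation of (1):
  --   SEL_TREE( range(key1_1),
  --             SEL_IMERGE( SEL_TREE(key3_1), SEL_TREE(key3_2) ) )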
Limitations
-----------
This will not be a full solution, in the sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in another
form (e.g. the brackets were expanded).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches should be in
(google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
LOW-LEVEL DESIGN:
<contents>
1. Current implementation overview
1.1. Problems in the current implementation
2. New implementation
2.1 New tree_and()
2.2 New tree_or()
</contents>
1. Current implementation overview
==================================
At the moment, range analyzer works as follows:
SEL_TREE structure represents
# There are sel_trees, a sel_tree is either range or merge tree
sel_tree = range_tree | imerge_tree
# a range tree has range access options, possibly for several keys
range_tree = range(key1) AND range(key2) AND ... AND range(keyN);
# a merge tree represents several ways to do index_merge
imerge_tree = imerge1 AND imerge2 AND ...
# a way to do index merge == a set of different indexes to use.
imergeX = range_tree1 OR range_tree2 OR ..
where no pair of range_treeX have ranges over the same index.
tree_and(A, B)
{
if (both A and B are range trees)
return a range_tree with computed intersection for each range;
if (only one of A and B is a range tree)
return that tree; // DISCARD-IMERGE-1
// at this point both trees are index_merge trees
return concat_lists( A.imerge1 ... A.imergeN, B.imerge1 ... B.imergeN);
}
tree_or(A, B)
{
if (A and B are range trees)
{
R = new range_tree;
for each index i
R.add(range_union(A.range(i), B.range(i)));
if (R has at least one range access)
return R;
else
{
/* could not build any range accesses. construct index_merge */
remove non-ranges from A; // DISCARD-IMERGE-2
remove non-ranges from B;
return new index_merge(A, B);
}
}
else if (A is range tree and B is index_merge tree (or vice versa))
{
Perform this transformation:
range_treeA // this is A
OR
(range_treeB_11 OR range_treeB_12 OR ... OR range_treeB_1N) AND
(range_treeB_21 OR range_treeB_22 OR ... OR range_treeB_2N) AND
...
(range_treeB_K1 OR range_treeB_K2 OR ... OR range_treeB_KN)
=
(range_treeA OR range_treeB_11 OR ... OR range_treeB_1N) AND
(range_treeA OR range_treeB_21 OR ... OR range_treeB_2N) AND
...
(range_treeA OR range_treeB_K1 OR ... OR range_treeB_KN)
Now each line represents an index_merge.
}
else if (both A and B are index_merge trees)
{
Perform this transformation:
imergeA1 AND imergeA2 AND ... AND imergeAN
OR
imergeB1 AND imergeB2 AND ... AND imergeBN
-> (discard all imergeA{i=2,3,...} -> // DISCARD-IMERGE-3
imergeA1
OR
imergeB1 AND imergeB2 AND ... AND imergeBN =
= (combine imergeA1 with each of the imergeB{i} ) =
combine(imergeA1 OR imergeB1) AND
combine(imergeA1 OR imergeB2) AND
... AND
combine(imergeA1 OR imergeBN)
}
}
1.1. Problems in the current implementation
-------------------------------------------
As marked in the code above:
DISCARD-IMERGE-1 step will cause the index_merge option to be discarded when
the WHERE clause has this form:
(t.key1=c1 OR t.key2=c2) AND t.badkey < c3
DISCARD-IMERGE-2 step will cause the index_merge option to be discarded when
the WHERE clause has this form (the conditions on t.badkey may have an arbitrary form):
(t.badkey<c1 AND t.key1=c1) OR (t.key1=c2 AND t.badkey < c2)
DISCARD-IMERGE-3 manifests itself as the following effect: suppose there are
two indexes:
INDEX i1(col1, col2),
INDEX i2(col1, col3)
and this WHERE clause:
col1=c1 AND (col2=c2 OR col3=c3)
The optimizer will only generate plans that use the "col1=c1" part. The
right side of the AND will be ignored even if it has good selectivity.
2. New implementation
=====================
<general idea>
* Don't start fighting combinatorial explosion until we've actually got one.
</>
SEL_TREE structure will be now able to hold both index_merge and range scan
candidates at the same time. That is,
sel_tree2 = range_tree AND imerge_tree
where both parts are optional (i.e. can be empty)
Operations on SEL_ARG trees will be modified to produce/process the trees of
this kind:
2.1 New tree_and()
------------------
In order not to lose plans, we'll make these changes:
1. Don't remove the index_merge part of the tree.
2. Push range conditions down into index_merge trees that may support them:
if one tree has range(key1) and the other tree has imerge(key1 OR key2),
then perform the equivalent of this operation:
rangeA(key1) AND ( rangeB(key1) OR rangeB(key2)) =
(rangeA(key1) AND rangeB(key1)) OR (rangeA(key1) AND rangeB(key2))
3. Just as before: if both sel_tree A and sel_tree B have index_merge options,
concatenate them together.
2.2 New tree_or()
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 18 Jun '09
by worklog-noreply@askmonty.org 18 Jun '09
18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 04:15)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.29969 2009-06-18 04:15:23.000000000 +0300
+++ /tmp/wklog.17.new.29969 2009-06-18 04:15:23.000000000 +0300
@@ -158,3 +158,43 @@
from user/EXPLAIN point of view: no. constant table is the one that we read
one record from. eliminated table is the one that we don't acccess at all.
+* What is described above will not be able to eliminate this outer join
+ create unique index idx on tableB (id, fromDate);
+ ...
+ left outer join
+ tableB B
+ on
+ B.id = A.id
+ and
+ B.fromDate = (select max(sub.fromDate)
+ from tableB sub where sub.id = A.id);
+
+ This is because condition "B.fromDate= func(tableB)" cannot be used.
+ Reason#1: update_ref_and_keys() does not consider such conditions to
+ be of any use (and indeed they are not usable for ref access)
+ so they are not put into KEYUSE array.
+ Reason#2: even if they were put there, we would need to be able to tell
+ between predicates like
+ B.fromDate= func(B.id) // guarantees only one matching row as
+ // B.id is already bound by B.id=A.id
+ // hence B.fromDate becomes bound too.
+ and
+ "B.fromDate= func(B.*)" // Can potentially have many matching
+ // records.
+ We need to
+ - Have update_ref_and_keys() create KEYUSE elements for such equalities
+ - Have eliminate_tables() and friends make a more accurate check.
+ The right check is to check whether all parts of a unique key are bound.
+ If we have keypartX to be bound, then t.keypartY=func(keypartX) makes
+ keypartY to be bound.
+ The difficulty here is that correlated subquery predicate cannot tell what
+ columns it depends on (it only remembers tables).
+ Traversing the predicate is expensive and complicated.
+ We're leaning towards making each subquery predicate have a List<Item> with
+ items that
+ - are in the current select
+ - and it depends on.
+ This list will be useful in certain other subquery optimizations as well,
+ it is cheap to collect it in fix_fields() phase, so it will be collected
+ for every subquery predicate.
+
-=-=(Guest - Thu, 18 Jun 2009, 02:48)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27792 2009-06-18 02:48:45.000000000 +0300
+++ /tmp/wklog.17.new.27792 2009-06-18 02:48:45.000000000 +0300
@@ -89,14 +89,14 @@
- queries that would use elimination
- queries that are very similar to one above (so that they would have same
QEP, execution cost, etc) but cannot use table elimination.
+then compare run times and make a conclusion about whether dbms supports table
+elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
-- Re-check how this works with equality propagation.
-
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
@@ -141,8 +141,13 @@
7. Additional issues
--------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
+* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+ Yes. Current approach: when removing an outer join nest, walk the ON clause
+ and mark subselects as eliminated. Then let EXPLAIN code check if the
+ SELECT was eliminated before the printing (EXPLAIN is generated by doing
+ a recursive descent, so the check will also cause children of eliminated
+ selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Thu, 18 Jun 2009, 02:24)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27162 2009-06-18 02:24:14.000000000 +0300
+++ /tmp/wklog.17.new.27162 2009-06-18 02:24:14.000000000 +0300
@@ -83,9 +83,12 @@
5. Tests and benchmarks
-----------------------
-Should create a benchmark in sql-bench which checks if the dbms has table
+Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
-TODO elaborate
+[According to Monty] Run
+ - queries that would use elimination
+ - queries that are very similar to one above (so that they would have same
+ QEP, execution cost, etc) but cannot use table elimination.
6. Todo, issues to resolve
--------------------------
@@ -109,33 +112,37 @@
6.2 Resolved
~~~~~~~~~~~~
-- outer->inner join conversion is not a problem for table elimination.
+* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
-7. Additional issues
---------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
- subqueries, they probably should be gone from EXPLAIN output also?
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-* Aggregate functions report they depend on all tables, that is,
+* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
- always. If we want table elimination to work in presence of grouping, need
- to devise some other way of analyzing aggregate functions.
+ always. Fixed it, now aggregate function reports it depends on
+ tables that its arguments depend on. In particular, COUNT(*) reports
+ that it depends on no tables (item_count_star->used_tables()==0).
+ One consequence of that is that "item->used_tables()==0" is not
+ equivalent to "item->const_item()==true" anymore (not sure if it's
+ "anymore" or this has been already happening).
+
+* EXPLAIN EXTENDED warning text was generated after the JOIN object has
+ been discarded. This didn't allow to use information about join plan
+ when printing the warning. Fixed this by keeping the JOIN objects until
+ we've printed the warning (have also an intent to remove the const
+ tables from the join output).
-* Should eliminated tables be shown in EXPLAIN EXTENDED?
- - If we just ignore the question, they will be shown
- - this is what happens for constant tables, too.
- - I don't see how showing them could be of any use. They only make it
- harder to read the rewritten query.
- It turns out that
- - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
- lifetime) changes.
- - it is hard to have it show per-execution data. This is because the warning
- text is generated after the execution structures have been destroyed.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
@@ -143,8 +150,6 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+ from user/EXPLAIN point of view: no. constant table is the one that we read
+ one record from. eliminated table is the one that we don't acccess at all.
-* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- - affected tables must not be eliminated
- - tables that are used on the right side of the SET x=y assignments must
- not be eliminated either.
-=-=(Guest - Tue, 16 Jun 2009, 17:01)=-=-
Dependency deleted: 29 no longer depends on 17
-=-=(Guest - Wed, 10 Jun 2009, 01:23)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1842 2009-06-10 01:23:42.000000000 +0300
+++ /tmp/wklog.17.new.1842 2009-06-10 01:23:42.000000000 +0300
@@ -131,6 +131,11 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+ It turns out that
+ - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
+ lifetime) changes.
+ - it is hard to have it show per-execution data. This is because the warning
+ text is generated after the execution structures have been destroyed.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Wed, 03 Jun 2009, 22:01)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.21801 2009-06-03 22:01:34.000000000 +0300
+++ /tmp/wklog.17.new.21801 2009-06-03 22:01:34.000000000 +0300
@@ -1,3 +1,6 @@
+The code (currently in development) is at lp:
+~maria-captains/maria/maria-5.1-table-elimination tree.
+
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
------------------------------------------------------------
-=-=(View All Progress Notes, 26 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate tables that are not needed from SELECT queries.
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE tableB (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
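A hedged note on how the effect would be visible once implemented (per
section 4 of the low-level design below, EXPLAIN is expected to simply omit
eliminated tables):

  EXPLAIN SELECT A.colA
  FROM tableA A LEFT OUTER JOIN tableB B ON B.id = A.id;
  -- expectation: only tableA appears in the plan; the join to tableB has
  -- been removed before execution.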
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does, is to remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met, the selected column (A.colA) will
still contain its original value. The unmatched B.* columns would all be NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example (and most often) a
primary key. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations. Either when you have
a denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
The code (currently in development) is at lp:
~maria-captains/maria/maria-5.1-table-elimination tree.
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest
- is a constant table:
= because it can be accessed via eq_ref(const) access, or
= it is a zero-rows or one-row MyISAM-like table [MARK1]
- or has an eq_ref access method candidate.
#2 means that the WHERE clause, the ON clauses of embedding outer joins, ORDER BY,
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
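A sketch of the two conditions on the example schema from the DESCRIPTION
(column names are invented; a minimal sketch, not the implementation):

  CREATE TABLE tableA (id INT PRIMARY KEY, colA INT);
  CREATE TABLE tableB (id INT PRIMARY KEY, colB INT);

  -- Eliminable: B is accessed by eq_ref on its primary key (#1) and no
  -- column of B is referenced outside the ON clause (#2).
  SELECT A.colA
  FROM tableA A LEFT OUTER JOIN tableB B ON B.id = A.id;

  -- Not eliminable: #2 is violated because B.colB is referenced in WHERE.
  SELECT A.colA
  FROM tableA A LEFT OUTER JOIN tableB B ON B.id = A.id
  WHERE B.colB IS NULL;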
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
if ((tables used in select_list |
tables used in group/order by UNION |
tables used in where) != bitmap_of_all_tables)
{
attempt table elimination;
}
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to the front like it is done with const
tables, with the exception that if the eliminated outer join nest was within
another outer join nest, that shouldn't prevent us from moving away the
eliminated tables.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer_switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of the above is questioned, as table elimination can never
be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This will allow checking
whether tables were removed, and will also behave nicely with the anchor model
and VIEWs: stuff that the user doesn't care about just won't be there.
5. Tests and benchmarks
-----------------------
Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
[According to Monty] Run
- queries that would use elimination
- queries that are very similar to the one above (so that they would have the
same QEP, execution cost, etc.) but cannot use table elimination,
then compare run times and conclude whether the DBMS supports table
elimination.
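A possible shape for such a query pair (schema and names invented, matching
the earlier sketches; the second query selects a column of B, so elimination
cannot apply and every DBMS must actually execute the join):

  -- Q1: inner side eliminable, tableB should not be touched at all.
  SELECT A.colA
  FROM tableA A LEFT OUTER JOIN tableB B ON B.id = A.id;

  -- Q2: same join structure, but B.colB is selected, so the join must run.
  SELECT A.colA, B.colB
  FROM tableA A LEFT OUTER JOIN tableB B ON B.id = A.id;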
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1] as that can change during
lifetime of the statement.
The other option is to do table elimination every time. This will require to
rework operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
6.2 Resolved
~~~~~~~~~~~~
* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. Fixed it, now aggregate function reports it depends on
tables that its arguments depend on. In particular, COUNT(*) reports
that it depends on no tables (item_count_star->used_tables()==0).
One consequence of that is that "item->used_tables()==0" is not
equivalent to "item->const_item()==true" anymore (not sure if it's
"anymore" or this has been already happening).
* The EXPLAIN EXTENDED warning text was generated after the JOIN object had
been discarded, which made it impossible to use information about the join plan
when printing the warning. Fixed by keeping the JOIN objects until
we've printed the warning (we also intend to remove the const
tables from the join output).
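A sketch of the interaction described in the first item above (the schema is
illustrative): a WHERE predicate on the inner table both triggers the
outer->inner conversion and, by itself, already disqualifies that table from
elimination.

  create table tableA (id int primary key, colA int);
  create table tableB (id int primary key, colB int);

  -- B.colB in WHERE rejects NULL-complemented rows, so the outer join can be
  -- converted to an inner join; the same reference to B also means B cannot
  -- be eliminated.
  select A.colA
  from tableA A left outer join tableB B on B.id = A.id
  where B.colB = 1;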
7. Additional issues
--------------------
* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
Yes. Current approach: when removing an outer join nest, walk the ON clause
and mark subselects as eliminated. Then let EXPLAIN code check if the
SELECT was eliminated before the printing (EXPLAIN is generated by doing
a recursive descent, so the check will also cause children of eliminated
selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
From the user/EXPLAIN point of view: no. A constant table is one that we read
one record from; an eliminated table is one that we don't access at all.
* What is described above will not be able to eliminate this outer join
create unique index idx on tableB (id, fromDate);
...
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (select max(sub.fromDate)
from tableB sub where sub.id = A.id);
This is because condition "B.fromDate= func(tableB)" cannot be used.
Reason#1: update_ref_and_keys() does not consider such conditions to
be of any use (and indeed they are not usable for ref access)
so they are not put into KEYUSE array.
Reason#2: even if they were put there, we would need to be able to
distinguish between predicates like
B.fromDate= func(B.id) // guarantees only one matching row as
// B.id is already bound by B.id=A.id
// hence B.fromDate becomes bound too.
and
"B.fromDate= func(B.*)" // Can potentially have many matching
// records.
We need to
- Have update_ref_and_keys() create KEYUSE elements for such equalities
- Have eliminate_tables() and friends make a more accurate check.
The right check is whether all parts of a unique key are bound: if keypartX
is bound, then t.keypartY=func(keypartX) makes keypartY bound as well (see the
sketch at the end of this section).
The difficulty here is that a correlated subquery predicate cannot tell what
columns it depends on (it only remembers tables), and traversing the predicate
is expensive and complicated.
We're leaning towards making each subquery predicate carry a List<Item> of the
items that
- are in the current select
- and that it depends on.
This list will be useful in certain other subquery optimizations as well; it is
cheap to collect during the fix_fields() phase, so it will be collected for
every subquery predicate.
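A sketch of the case described above, with an illustrative schema (this is the
kind of query the extra KEYUSE/binding checks are meant to handle, not something
the current code eliminates):

  create table tableA (id int primary key, colA int);
  create table tableB (id int, fromDate date, colB int,
                       unique key idx (id, fromDate));

  select A.colA
  from tableA A
    left outer join tableB B
      on B.id = A.id                            -- binds keypart1 (B.id)
     and B.fromDate = (select max(sub.fromDate)
                       from tableB sub
                       where sub.id = A.id);    -- binds keypart2 via func(bound columns)

  -- With both parts of the unique key idx bound, at most one row of B can match
  -- each row of A, so B could be eliminated once the two items above are done.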
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 04:15)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.29969 2009-06-18 04:15:23.000000000 +0300
+++ /tmp/wklog.17.new.29969 2009-06-18 04:15:23.000000000 +0300
@@ -158,3 +158,43 @@
from user/EXPLAIN point of view: no. constant table is the one that we read
one record from. eliminated table is the one that we don't acccess at all.
+* What is described above will not be able to eliminate this outer join
+ create unique index idx on tableB (id, fromDate);
+ ...
+ left outer join
+ tableB B
+ on
+ B.id = A.id
+ and
+ B.fromDate = (select max(sub.fromDate)
+ from tableB sub where sub.id = A.id);
+
+ This is because condition "B.fromDate= func(tableB)" cannot be used.
+ Reason#1: update_ref_and_keys() does not consider such conditions to
+ be of any use (and indeed they are not usable for ref access)
+ so they are not put into KEYUSE array.
+ Reason#2: even if they were put there, we would need to be able to tell
+ between predicates like
+ B.fromDate= func(B.id) // guarantees only one matching row as
+ // B.id is already bound by B.id=A.id
+ // hence B.fromDate becomes bound too.
+ and
+ "B.fromDate= func(B.*)" // Can potentially have many matching
+ // records.
+ We need to
+ - Have update_ref_and_keys() create KEYUSE elements for such equalities
+ - Have eliminate_tables() and friends make a more accurate check.
+ The right check is to check whether all parts of a unique key are bound.
+ If we have keypartX to be bound, then t.keypartY=func(keypartX) makes
+ keypartY to be bound.
+ The difficulty here is that correlated subquery predicate cannot tell what
+ columns it depends on (it only remembers tables).
+ Traversing the predicate is expensive and complicated.
+ We're leaning towards making each subquery predicate have a List<Item> with
+ items that
+ - are in the current select
+ - and it depends on.
+ This list will be useful in certain other subquery optimizations as well,
+ it is cheap to collect it in fix_fields() phase, so it will be collected
+ for every subquery predicate.
+
-=-=(Guest - Thu, 18 Jun 2009, 02:48)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27792 2009-06-18 02:48:45.000000000 +0300
+++ /tmp/wklog.17.new.27792 2009-06-18 02:48:45.000000000 +0300
@@ -89,14 +89,14 @@
- queries that would use elimination
- queries that are very similar to one above (so that they would have same
QEP, execution cost, etc) but cannot use table elimination.
+then compare run times and make a conclusion about whether dbms supports table
+elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
-- Re-check how this works with equality propagation.
-
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
@@ -141,8 +141,13 @@
7. Additional issues
--------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
+* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+ Yes. Current approach: when removing an outer join nest, walk the ON clause
+ and mark subselects as eliminated. Then let EXPLAIN code check if the
+ SELECT was eliminated before the printing (EXPLAIN is generated by doing
+ a recursive descent, so the check will also cause children of eliminated
+ selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Thu, 18 Jun 2009, 02:24)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27162 2009-06-18 02:24:14.000000000 +0300
+++ /tmp/wklog.17.new.27162 2009-06-18 02:24:14.000000000 +0300
@@ -83,9 +83,12 @@
5. Tests and benchmarks
-----------------------
-Should create a benchmark in sql-bench which checks if the dbms has table
+Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
-TODO elaborate
+[According to Monty] Run
+ - queries that would use elimination
+ - queries that are very similar to one above (so that they would have same
+ QEP, execution cost, etc) but cannot use table elimination.
6. Todo, issues to resolve
--------------------------
@@ -109,33 +112,37 @@
6.2 Resolved
~~~~~~~~~~~~
-- outer->inner join conversion is not a problem for table elimination.
+* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
-7. Additional issues
---------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
- subqueries, they probably should be gone from EXPLAIN output also?
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-* Aggregate functions report they depend on all tables, that is,
+* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
- always. If we want table elimination to work in presence of grouping, need
- to devise some other way of analyzing aggregate functions.
+ always. Fixed it, now aggregate function reports it depends on
+ tables that its arguments depend on. In particular, COUNT(*) reports
+ that it depends on no tables (item_count_star->used_tables()==0).
+ One consequence of that is that "item->used_tables()==0" is not
+ equivalent to "item->const_item()==true" anymore (not sure if it's
+ "anymore" or this has been already happening).
+
+* EXPLAIN EXTENDED warning text was generated after the JOIN object has
+ been discarded. This didn't allow to use information about join plan
+ when printing the warning. Fixed this by keeping the JOIN objects until
+ we've printed the warning (have also an intent to remove the const
+ tables from the join output).
-* Should eliminated tables be shown in EXPLAIN EXTENDED?
- - If we just ignore the question, they will be shown
- - this is what happens for constant tables, too.
- - I don't see how showing them could be of any use. They only make it
- harder to read the rewritten query.
- It turns out that
- - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
- lifetime) changes.
- - it is hard to have it show per-execution data. This is because the warning
- text is generated after the execution structures have been destroyed.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
@@ -143,8 +150,6 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+ from user/EXPLAIN point of view: no. constant table is the one that we read
+ one record from. eliminated table is the one that we don't acccess at all.
-* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- - affected tables must not be eliminated
- - tables that are used on the right side of the SET x=y assignments must
- not be eliminated either.
-=-=(Guest - Tue, 16 Jun 2009, 17:01)=-=-
Dependency deleted: 29 no longer depends on 17
-=-=(Guest - Wed, 10 Jun 2009, 01:23)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1842 2009-06-10 01:23:42.000000000 +0300
+++ /tmp/wklog.17.new.1842 2009-06-10 01:23:42.000000000 +0300
@@ -131,6 +131,11 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+ It turns out that
+ - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
+ lifetime) changes.
+ - it is hard to have it show per-execution data. This is because the warning
+ text is generated after the execution structures have been destroyed.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Wed, 03 Jun 2009, 22:01)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.21801 2009-06-03 22:01:34.000000000 +0300
+++ /tmp/wklog.17.new.21801 2009-06-03 22:01:34.000000000 +0300
@@ -1,3 +1,6 @@
+The code (currently in development) is at lp:
+~maria-captains/maria/maria-5.1-table-elimination tree.
+
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
------------------------------------------------------------
-=-=(View All Progress Notes, 26 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate tables that are not needed from SELECT queries.
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE tableB (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does is remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met, the selected column (A.colA) will
still contain its original value, and the unmatched B.* columns would all be NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example a primary key, as is
often the case. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
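For concreteness, a schema under which the query above behaves as described
might look as follows (only the primary key (id, fromDate) is fixed by the text;
the other columns are assumptions for illustration):

  create table tableA (id int primary key, colA int);
  create table tableB (id int, fromDate date, colB int,
                       primary key (id, fromDate));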
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations: either when you have a
denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
The code (currently in development) is at lp:
~maria-captains/maria/maria-5.1-table-elimination tree.
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about eliminating tables; it's about eliminating the inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest either:
- is a constant table:
= because it can be accessed via eq_ref(const) access, or
= because it is a zero-row or one-row MyISAM-like table [MARK1]
- or has an eq_ref access method candidate.
#2 means that WHERE clause, ON clauses of embedding outer joins, ORDER BY,
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
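A minimal example that satisfies both conditions (the schema is illustrative):

  create table tableA (id int primary key, colA int);
  create table tableB (id int primary key, colB int);

  select A.colA                                  -- no B.* columns outside the ON clause
  from tableA A
    left outer join tableB B on B.id = A.id;     -- eq_ref on B's primary key

Condition #1 holds because B is accessed via eq_ref on its primary key, and
condition #2 holds because no clause other than the ON refers to B, so the join
to B can be removed.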
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
if ((tables used in select_list |
tables used in group/order by UNION |
tables used in where) != bitmap_of_all_tables)
{
attempt table elimination;
}
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, there is no need for any cost calculations; it's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to the front, as is done with const
tables, with the exception that if the eliminated outer join nest was within
another outer join nest, that should not prevent us from moving the
eliminated tables away.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer_switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of this flag is questionable, as table elimination can never
be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This will allow checking
whether tables were removed, and will also behave nicely with the anchor model
and VIEWs: stuff the user doesn't care about simply won't be there.
5. Tests and benchmarks
-----------------------
Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
[According to Monty] Run
- queries that would use elimination
- queries that are very similar to the ones above (so that they would have the
same QEP, execution cost, etc.) but cannot use table elimination.
Then compare the run times and draw a conclusion about whether the DBMS supports
table elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1] as that can change during
lifetime of the statement.
The other option is to do table elimination every time. This will require
reworking operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
6.2 Resolved
~~~~~~~~~~~~
* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
* For multi-table UPDATEs/DELETEs, we also need to analyze the SET clause
(see the sketch after this list):
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. This is now fixed: an aggregate function reports that it depends on the
tables its arguments depend on. In particular, COUNT(*) reports
that it depends on no tables (item_count_star->used_tables()==0).
One consequence is that "item->used_tables()==0" is no longer
equivalent to "item->const_item()==true" (or possibly never was; it is not
clear whether this is new behaviour).
* The EXPLAIN EXTENDED warning text was generated after the JOIN object had
been discarded, which made it impossible to use information about the join plan
when printing the warning. Fixed by keeping the JOIN objects until
we've printed the warning (we also intend to remove the const
tables from the join output).
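A sketch of the multi-table UPDATE case from the list above (the schema is
illustrative):

  create table t1 (id int primary key, a int);
  create table t2 (id int primary key, b int);
  create table t3 (id int primary key, c int);

  update t1
    left join t2 on t2.id = t1.id    -- t2 is assigned to: must not be eliminated
    left join t3 on t3.id = t1.id    -- t3 is read on the right side of SET: must not be eliminated
  set t1.a = t3.c,
      t2.b = 0;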
7. Additional issues
--------------------
* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
Yes. Current approach: when removing an outer join nest, walk the ON clause
and mark subselects as eliminated. Then let EXPLAIN code check if the
SELECT was eliminated before the printing (EXPLAIN is generated by doing
a recursive descent, so the check will also cause children of eliminated
selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
From the user/EXPLAIN point of view: no. A constant table is one that we read
one record from; an eliminated table is one that we don't access at all.
* What is described above will not be able to eliminate this outer join
create unique index idx on tableB (id, fromDate);
...
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (select max(sub.fromDate)
from tableB sub where sub.id = A.id);
This is because condition "B.fromDate= func(tableB)" cannot be used.
Reason#1: update_ref_and_keys() does not consider such conditions to
be of any use (and indeed they are not usable for ref access)
so they are not put into KEYUSE array.
Reason#2: even if they were put there, we would need to be able to
distinguish between predicates like
B.fromDate= func(B.id) // guarantees only one matching row as
// B.id is already bound by B.id=A.id
// hence B.fromDate becomes bound too.
and
"B.fromDate= func(B.*)" // Can potentially have many matching
// records.
We need to
- Have update_ref_and_keys() create KEYUSE elements for such equalities
- Have eliminate_tables() and friends make a more accurate check.
The right check is whether all parts of a unique key are bound: if keypartX
is bound, then t.keypartY=func(keypartX) makes keypartY bound as well.
The difficulty here is that a correlated subquery predicate cannot tell what
columns it depends on (it only remembers tables), and traversing the predicate
is expensive and complicated.
We're leaning towards making each subquery predicate carry a List<Item> of the
items that
- are in the current select
- and that it depends on.
This list will be useful in certain other subquery optimizations as well; it is
cheap to collect during the fix_fields() phase, so it will be collected for
every subquery predicate.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 02:48)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27792 2009-06-18 02:48:45.000000000 +0300
+++ /tmp/wklog.17.new.27792 2009-06-18 02:48:45.000000000 +0300
@@ -89,14 +89,14 @@
- queries that would use elimination
- queries that are very similar to one above (so that they would have same
QEP, execution cost, etc) but cannot use table elimination.
+then compare run times and make a conclusion about whether dbms supports table
+elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
-- Re-check how this works with equality propagation.
-
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
@@ -141,8 +141,13 @@
7. Additional issues
--------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
+* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+ Yes. Current approach: when removing an outer join nest, walk the ON clause
+ and mark subselects as eliminated. Then let EXPLAIN code check if the
+ SELECT was eliminated before the printing (EXPLAIN is generated by doing
+ a recursive descent, so the check will also cause children of eliminated
+ selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Thu, 18 Jun 2009, 02:24)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27162 2009-06-18 02:24:14.000000000 +0300
+++ /tmp/wklog.17.new.27162 2009-06-18 02:24:14.000000000 +0300
@@ -83,9 +83,12 @@
5. Tests and benchmarks
-----------------------
-Should create a benchmark in sql-bench which checks if the dbms has table
+Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
-TODO elaborate
+[According to Monty] Run
+ - queries that would use elimination
+ - queries that are very similar to one above (so that they would have same
+ QEP, execution cost, etc) but cannot use table elimination.
6. Todo, issues to resolve
--------------------------
@@ -109,33 +112,37 @@
6.2 Resolved
~~~~~~~~~~~~
-- outer->inner join conversion is not a problem for table elimination.
+* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
-7. Additional issues
---------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
- subqueries, they probably should be gone from EXPLAIN output also?
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-* Aggregate functions report they depend on all tables, that is,
+* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
- always. If we want table elimination to work in presence of grouping, need
- to devise some other way of analyzing aggregate functions.
+ always. Fixed it, now aggregate function reports it depends on
+ tables that its arguments depend on. In particular, COUNT(*) reports
+ that it depends on no tables (item_count_star->used_tables()==0).
+ One consequence of that is that "item->used_tables()==0" is not
+ equivalent to "item->const_item()==true" anymore (not sure if it's
+ "anymore" or this has been already happening).
+
+* EXPLAIN EXTENDED warning text was generated after the JOIN object has
+ been discarded. This didn't allow to use information about join plan
+ when printing the warning. Fixed this by keeping the JOIN objects until
+ we've printed the warning (have also an intent to remove the const
+ tables from the join output).
-* Should eliminated tables be shown in EXPLAIN EXTENDED?
- - If we just ignore the question, they will be shown
- - this is what happens for constant tables, too.
- - I don't see how showing them could be of any use. They only make it
- harder to read the rewritten query.
- It turns out that
- - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
- lifetime) changes.
- - it is hard to have it show per-execution data. This is because the warning
- text is generated after the execution structures have been destroyed.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
@@ -143,8 +150,6 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+ from user/EXPLAIN point of view: no. constant table is the one that we read
+ one record from. eliminated table is the one that we don't acccess at all.
-* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- - affected tables must not be eliminated
- - tables that are used on the right side of the SET x=y assignments must
- not be eliminated either.
-=-=(Guest - Tue, 16 Jun 2009, 17:01)=-=-
Dependency deleted: 29 no longer depends on 17
-=-=(Guest - Wed, 10 Jun 2009, 01:23)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1842 2009-06-10 01:23:42.000000000 +0300
+++ /tmp/wklog.17.new.1842 2009-06-10 01:23:42.000000000 +0300
@@ -131,6 +131,11 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+ It turns out that
+ - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
+ lifetime) changes.
+ - it is hard to have it show per-execution data. This is because the warning
+ text is generated after the execution structures have been destroyed.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Wed, 03 Jun 2009, 22:01)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.21801 2009-06-03 22:01:34.000000000 +0300
+++ /tmp/wklog.17.new.21801 2009-06-03 22:01:34.000000000 +0300
@@ -1,3 +1,6 @@
+The code (currently in development) is at lp:
+~maria-captains/maria/maria-5.1-table-elimination tree.
+
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
-=-=(Guest - Mon, 01 Jun 2009, 12:49)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.32202 2009-06-01 12:49:15.000000000 +0300
+++ /tmp/wklog.17.new.32202 2009-06-01 12:49:15.000000000 +0300
@@ -8,7 +8,7 @@
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
-
+7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -116,3 +116,9 @@
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+* Aggregate functions report they depend on all tables, that is,
+
+ item_agg_func->used_tables() == (1ULL << join->tables) - 1
+
+ always. If we want table elimination to work in presence of grouping, need
+ to devise some other way of analyzing aggregate functions.
------------------------------------------------------------
-=-=(View All Progress Notes, 25 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate tables that are not needed from SELECT queries.
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE tableB (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does is remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met, the selected column (A.colA) will
still contain its original value, and the unmatched B.* columns would all be NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example a primary key, as is
often the case. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If further, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations: either when you have a
denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
The code (currently in development) is at lp:
~maria-captains/maria/maria-5.1-table-elimination tree.
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about eliminating tables; it's about eliminating the inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest either:
- is a constant table:
= because it can be accessed via eq_ref(const) access, or
= because it is a zero-row or one-row MyISAM-like table [MARK1]
- or has an eq_ref access method candidate.
#2 means that WHERE clause, ON clauses of embedding outer joins, ORDER BY,
GROUP BY and HAVING do not refer to the inner tables of the outer join
nest.
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
if ((tables used in select_list |
tables used in group/order by UNION |
tables used in where) != bitmap_of_all_tables)
{
attempt table elimination;
}
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, there is no need for any cost calculations; it's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to the front, as is done with const
tables, with the exception that if the eliminated outer join nest was within
another outer join nest, that should not prevent us from moving the
eliminated tables away.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer_switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of this flag is questionable, as table elimination can never
be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This will allow checking
whether tables were removed, and will also behave nicely with the anchor model
and VIEWs: stuff the user doesn't care about simply won't be there.
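For example, given the eliminable query from the description, the effect can be
checked simply by running EXPLAIN and verifying that the inner table does not
appear in the output (a sketch; no particular output format is implied here):

  create table tableA (id int primary key, colA int);
  create table tableB (id int primary key, colB int);

  explain
  select A.colA
  from tableA A left outer join tableB B on B.id = A.id;
  -- with table elimination, no row for B is expected in the EXPLAIN output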
5. Tests and benchmarks
-----------------------
Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
[According to Monty] Run
- queries that would use elimination
- queries that are very similar to the ones above (so that they would have the
same QEP, execution cost, etc.) but cannot use table elimination.
Then compare the run times and draw a conclusion about whether the DBMS supports
table elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1] as that can change during
lifetime of the statement.
The other option is to do table elimination every time. This will require
reworking operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are 1-row
MyISAM tables that do not have primary key.
6.2 Resolved
~~~~~~~~~~~~
* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. This is now fixed: an aggregate function reports that it depends on the
tables its arguments depend on. In particular, COUNT(*) reports
that it depends on no tables (item_count_star->used_tables()==0); see the
sketch after this list.
One consequence is that "item->used_tables()==0" is no longer
equivalent to "item->const_item()==true" (or possibly never was; it is not
clear whether this is new behaviour).
* The EXPLAIN EXTENDED warning text was generated after the JOIN object had
been discarded, which made it impossible to use information about the join plan
when printing the warning. Fixed by keeping the JOIN objects until
we've printed the warning (we also intend to remove the const
tables from the join output).
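A sketch of why the changed used_tables() behaviour matters (the schema is
illustrative): in the grouped query below the aggregate depends on no tables and
A.grp depends only on tableA, so nothing outside the ON clause refers to tableB
and it remains a candidate for elimination.

  create table tableA (id int primary key, grp int);
  create table tableB (id int primary key, colB int);

  select A.grp, count(*)
  from tableA A left outer join tableB B on B.id = A.id
  group by A.grp;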
7. Additional issues
--------------------
* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
Yes. Current approach: when removing an outer join nest, walk the ON clause
and mark subselects as eliminated. Then let EXPLAIN code check if the
SELECT was eliminated before the printing (EXPLAIN is generated by doing
a recursive descent, so the check will also cause children of eliminated
selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
From the user/EXPLAIN point of view: no. A constant table is one that we read
one record from; an eliminated table is one that we don't access at all.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
1
0

[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 02:48)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27792 2009-06-18 02:48:45.000000000 +0300
+++ /tmp/wklog.17.new.27792 2009-06-18 02:48:45.000000000 +0300
@@ -89,14 +89,14 @@
- queries that would use elimination
- queries that are very similar to one above (so that they would have same
QEP, execution cost, etc) but cannot use table elimination.
+then compare run times and make a conclusion about whether dbms supports table
+elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
-- Re-check how this works with equality propagation.
-
- Relationship with prepared statements.
On one hand, it's natural to desire to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
@@ -141,8 +141,13 @@
7. Additional issues
--------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
+* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+ Yes. Current approach: when removing an outer join nest, walk the ON clause
+ and mark subselects as eliminated. Then let EXPLAIN code check if the
+ SELECT was eliminated before the printing (EXPLAIN is generated by doing
+ a recursive descent, so the check will also cause children of eliminated
+ selects not to be printed)
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Thu, 18 Jun 2009, 02:24)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27162 2009-06-18 02:24:14.000000000 +0300
+++ /tmp/wklog.17.new.27162 2009-06-18 02:24:14.000000000 +0300
@@ -83,9 +83,12 @@
5. Tests and benchmarks
-----------------------
-Should create a benchmark in sql-bench which checks if the dbms has table
+Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
-TODO elaborate
+[According to Monty] Run
+ - queries that would use elimination
+ - queries that are very similar to one above (so that they would have same
+ QEP, execution cost, etc) but cannot use table elimination.
6. Todo, issues to resolve
--------------------------
@@ -109,33 +112,37 @@
6.2 Resolved
~~~~~~~~~~~~
-- outer->inner join conversion is not a problem for table elimination.
+* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
-7. Additional issues
---------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
- subqueries, they probably should be gone from EXPLAIN output also?
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-* Aggregate functions report they depend on all tables, that is,
+* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
- always. If we want table elimination to work in presence of grouping, need
- to devise some other way of analyzing aggregate functions.
+ always. Fixed it, now aggregate function reports it depends on
+ tables that its arguments depend on. In particular, COUNT(*) reports
+ that it depends on no tables (item_count_star->used_tables()==0).
+ One consequence of that is that "item->used_tables()==0" is not
+ equivalent to "item->const_item()==true" anymore (not sure if it's
+ "anymore" or this has been already happening).
+
+* EXPLAIN EXTENDED warning text was generated after the JOIN object has
+ been discarded. This didn't allow to use information about join plan
+ when printing the warning. Fixed this by keeping the JOIN objects until
+ we've printed the warning (have also an intent to remove the const
+ tables from the join output).
-* Should eliminated tables be shown in EXPLAIN EXTENDED?
- - If we just ignore the question, they will be shown
- - this is what happens for constant tables, too.
- - I don't see how showing them could be of any use. They only make it
- harder to read the rewritten query.
- It turns out that
- - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
- lifetime) changes.
- - it is hard to have it show per-execution data. This is because the warning
- text is generated after the execution structures have been destroyed.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
@@ -143,8 +150,6 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+ from user/EXPLAIN point of view: no. constant table is the one that we read
+ one record from. eliminated table is the one that we don't acccess at all.
-* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- - affected tables must not be eliminated
- - tables that are used on the right side of the SET x=y assignments must
- not be eliminated either.
-=-=(Guest - Tue, 16 Jun 2009, 17:01)=-=-
Dependency deleted: 29 no longer depends on 17
-=-=(Guest - Wed, 10 Jun 2009, 01:23)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1842 2009-06-10 01:23:42.000000000 +0300
+++ /tmp/wklog.17.new.1842 2009-06-10 01:23:42.000000000 +0300
@@ -131,6 +131,11 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+ It turns out that
+ - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
+ lifetime) changes.
+ - it is hard to have it show per-execution data. This is because the warning
+ text is generated after the execution structures have been destroyed.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Wed, 03 Jun 2009, 22:01)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.21801 2009-06-03 22:01:34.000000000 +0300
+++ /tmp/wklog.17.new.21801 2009-06-03 22:01:34.000000000 +0300
@@ -1,3 +1,6 @@
+The code (currently in development) is at lp:
+~maria-captains/maria/maria-5.1-table-elimination tree.
+
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
-=-=(Guest - Mon, 01 Jun 2009, 12:49)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.32202 2009-06-01 12:49:15.000000000 +0300
+++ /tmp/wklog.17.new.32202 2009-06-01 12:49:15.000000000 +0300
@@ -8,7 +8,7 @@
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
-
+7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -116,3 +116,9 @@
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+* Aggregate functions report they depend on all tables, that is,
+
+ item_agg_func->used_tables() == (1ULL << join->tables) - 1
+
+ always. If we want table elimination to work in presence of grouping, need
+ to devise some other way of analyzing aggregate functions.
------------------------------------------------------------
-=-=(View All Progress Notes, 25 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate tables that are not needed from SELECT queries.
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE tableB (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
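To make the example concrete, here is a minimal sketch (the table definitions are
hypothetical and chosen only so that the conditions described below hold): tableB
has a unique key on id and no tableB column is used outside the ON clause, so the
join can be dropped.
  create table tableA (id int primary key, colA int);
  create table tableB (id int primary key);
  -- No B.* column is referenced outside the ON clause and B.id is unique,
  -- so the query can be answered from tableA alone:
  select A.colA
  from tableA A
    left outer join tableB B on B.id = A.id;
Once elimination is implemented, EXPLAIN for this query is expected to list only
tableA (see the user interface notes in the low-level design below).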
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does is to remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met, the selected column (A.colA) will
still contain its original value, and the unmatched B.* columns will all be NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
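As a short worked version of the duplicate case above (hypothetical data, and
deliberately no unique key on B.id):
  create table tableA (id int primary key, colA varchar(10));
  create table tableB (id int, other varchar(10));  -- no unique key on id
  insert into tableA values (1, 'val1');
  insert into tableB values (1, 'other1a'), (1, 'other1b');
  -- Both tableB rows match A.id = 1, so the join returns 'val1' twice.
  -- The join changes the row count, so tableB cannot be eliminated here.
  select A.colA
  from tableA A
    left outer join tableB B on B.id = A.id;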
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example (and most often)
a primary key. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If, furthermore, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
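A sketch of the schema this reasoning assumes (column types are hypothetical);
the point is that uniqueness now requires the composite key:
  create table tableB (
    id int not null,
    fromDate date not null,
    someAttr varchar(32),
    primary key (id, fromDate)  -- B.id alone is no longer unique
  );
Because the correlated subselect picks at most one fromDate per id, at most one
tableB row can match the ON clause, and the join is again removable when no
tableB columns are selected.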
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations: either when you have a
denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
The code (currently in development) is at lp:
~maria-captains/maria/maria-5.1-table-elimination tree.
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of an outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest either:
- is a constant table:
= because it can be accessed via eq_ref(const) access, or
= it is a zero-rows or one-row MyISAM-like table [MARK1]
- or has an eq_ref access method candidate.
#2 means that the WHERE clause, the ON clauses of embedding outer joins,
ORDER BY, GROUP BY and HAVING do not refer to the inner tables of the outer
join nest.
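A hedged illustration of condition #2, assuming hypothetical tables where
tableB has a primary key on id (so #1 holds for both queries):
  -- Eliminable: tableB is not referenced outside the ON clause.
  select A.colA
  from tableA A left join tableB B on B.id = A.id;
  -- Not eliminable: the WHERE clause refers to an inner-table column,
  -- so condition #2 fails and tableB must actually be read.
  select A.colA
  from tableA A left join tableB B on B.id = A.id
  where B.id is null;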
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
  if ((tables used in select_list |
       tables used in group/order by UNION |
       tables used in where) != bitmap_of_all_tables)
  {
    attempt table elimination;
  }
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to the front, like it is done with const
tables, with the exception that if the eliminated outer join nest was within
another outer join nest, that shouldn't prevent us from moving the
eliminated tables away.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of the above is questioned, as table elimination can never
be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This makes it possible to
check whether tables were removed, and also behaves nicely with the anchor
model and VIEWs: stuff that the user doesn't care about just won't be there.
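If the tentative flag is added under the usual @@optimizer_switch mechanism
(an assumption; as noted above we may not add it at all), usage would look
roughly like this, and EXPLAIN is how the rewrite can be observed either way:
  -- Hypothetical: only valid if a 'table_elimination' switch is actually added.
  set optimizer_switch='table_elimination=off';
  set optimizer_switch='table_elimination=on';
  -- Independent of any flag: an eliminated table simply does not appear
  -- in the EXPLAIN output.
  explain
  select A.colA
  from tableA A left join tableB B on B.id = A.id;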
5. Tests and benchmarks
-----------------------
Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
[According to Monty] Run
- queries that would use elimination
- queries that are very similar to the ones above (so that they would have the
same QEP, execution cost, etc.) but cannot use table elimination,
then compare run times and conclude whether the DBMS supports table
elimination.
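A sketch of such a query pair for the sql-bench test (hypothetical schema; the
second query references a tableB column, which blocks elimination while keeping
the plan otherwise comparable):
  -- Query 1: eliminable, no tableB columns outside the ON clause.
  select A.colA
  from tableA A left join tableB B on B.id = A.id;
  -- Query 2: nearly identical, but selecting B.id forces the join to run.
  select A.colA, B.id
  from tableA A left join tableB B on B.id = A.id;
Compare the run times of the two; a large gap suggests the DBMS performs table
elimination.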
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Relationship with prepared statements.
On one hand, it's natural to want to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1], as that can change during
the lifetime of the statement.
The other option is to do table elimination every time. This will require
reworking operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are one-row
MyISAM tables that do not have a primary key.
6.2 Resolved
~~~~~~~~~~~~
* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
* For multi-table UPDATEs/DELETEs, we need to also analyze the SET clause
(a sketch follows after this list):
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. Fixed: now an aggregate function reports that it depends on the
tables that its arguments depend on. In particular, COUNT(*) reports
that it depends on no tables (item_count_star->used_tables()==0).
One consequence of that is that "item->used_tables()==0" is not
equivalent to "item->const_item()==true" anymore (not sure if it's
"anymore" or this has been already happening).
* EXPLAIN EXTENDED warning text was generated after the JOIN object had
been discarded. This made it impossible to use information about the join
plan when printing the warning. Fixed by keeping the JOIN objects until
we've printed the warning (we also intend to remove the const
tables from the join output).
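A sketch of the multi-table UPDATE case from the list above (hypothetical
tables): no tableB column appears in a select list, but tableB is used on the
right side of SET, so it must not be eliminated.
  -- tableB is referenced only in the ON clause and the SET right-hand side,
  -- but its values are needed to compute the assignment, so it must be read.
  update tableA A
    left join tableB B on B.id = A.id
  set A.colA = B.id;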
7. Additional issues
--------------------
* We remove ON clauses within outer join nests. If these clauses contain
subqueries, they probably should be gone from the EXPLAIN output also?
Yes. Current approach: when removing an outer join nest, walk the ON clause
and mark subselects as eliminated. Then let the EXPLAIN code check whether the
SELECT was eliminated before printing it (EXPLAIN is generated by doing
a recursive descent, so the check also causes children of eliminated
selects not to be printed). An illustration follows after this list.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
From the user/EXPLAIN point of view: no. A constant table is one that we read
one record from; an eliminated table is one that we don't access at all.
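To illustrate the first bullet above (hypothetical query, with tableC standing
in for any other table): if the outer join below is eliminated, the subselect
in its ON clause is never executed, so it should be marked as eliminated and
skipped by EXPLAIN as well.
  select A.colA
  from tableA A
    left join tableB B
      on B.id = A.id
     and B.id in (select C.id from tableC C);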
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)
[Maria-developers] Updated (by Guest): Table elimination (17)
by worklog-noreply@askmonty.org 18 Jun '09
by worklog-noreply@askmonty.org 18 Jun '09
18 Jun '09
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Table elimination
CREATION DATE..: Sun, 10 May 2009, 19:57
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......:
CATEGORY.......: Server-Sprint
TASK ID........: 17 (http://askmonty.org/worklog/?tid=17)
VERSION........: Server-5.1
STATUS.........: Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Thu, 18 Jun 2009, 02:24)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.27162 2009-06-18 02:24:14.000000000 +0300
+++ /tmp/wklog.17.new.27162 2009-06-18 02:24:14.000000000 +0300
@@ -83,9 +83,12 @@
5. Tests and benchmarks
-----------------------
-Should create a benchmark in sql-bench which checks if the dbms has table
+Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
-TODO elaborate
+[According to Monty] Run
+ - queries that would use elimination
+ - queries that are very similar to one above (so that they would have same
+ QEP, execution cost, etc) but cannot use table elimination.
6. Todo, issues to resolve
--------------------------
@@ -109,33 +112,37 @@
6.2 Resolved
~~~~~~~~~~~~
-- outer->inner join conversion is not a problem for table elimination.
+* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
-7. Additional issues
---------------------
-* We remove ON clauses within semi-join nests. If these clauses contain
- subqueries, they probably should be gone from EXPLAIN output also?
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-* Aggregate functions report they depend on all tables, that is,
+* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
- always. If we want table elimination to work in presence of grouping, need
- to devise some other way of analyzing aggregate functions.
+ always. Fixed it, now aggregate function reports it depends on
+ tables that its arguments depend on. In particular, COUNT(*) reports
+ that it depends on no tables (item_count_star->used_tables()==0).
+ One consequence of that is that "item->used_tables()==0" is not
+ equivalent to "item->const_item()==true" anymore (not sure if it's
+ "anymore" or this has been already happening).
+
+* EXPLAIN EXTENDED warning text was generated after the JOIN object has
+ been discarded. This didn't allow to use information about join plan
+ when printing the warning. Fixed this by keeping the JOIN objects until
+ we've printed the warning (have also an intent to remove the const
+ tables from the join output).
-* Should eliminated tables be shown in EXPLAIN EXTENDED?
- - If we just ignore the question, they will be shown
- - this is what happens for constant tables, too.
- - I don't see how showing them could be of any use. They only make it
- harder to read the rewritten query.
- It turns out that
- - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
- lifetime) changes.
- - it is hard to have it show per-execution data. This is because the warning
- text is generated after the execution structures have been destroyed.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
@@ -143,8 +150,6 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+ from user/EXPLAIN point of view: no. constant table is the one that we read
+ one record from. eliminated table is the one that we don't acccess at all.
-* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
- - affected tables must not be eliminated
- - tables that are used on the right side of the SET x=y assignments must
- not be eliminated either.
-=-=(Guest - Tue, 16 Jun 2009, 17:01)=-=-
Dependency deleted: 29 no longer depends on 17
-=-=(Guest - Wed, 10 Jun 2009, 01:23)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1842 2009-06-10 01:23:42.000000000 +0300
+++ /tmp/wklog.17.new.1842 2009-06-10 01:23:42.000000000 +0300
@@ -131,6 +131,11 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+ It turns out that
+ - it is easy to have EXPLAIN EXTENDED show permanent (once-per-statement
+ lifetime) changes.
+ - it is hard to have it show per-execution data. This is because the warning
+ text is generated after the execution structures have been destroyed.
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
-=-=(Guest - Wed, 03 Jun 2009, 22:01)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.21801 2009-06-03 22:01:34.000000000 +0300
+++ /tmp/wklog.17.new.21801 2009-06-03 22:01:34.000000000 +0300
@@ -1,3 +1,6 @@
+The code (currently in development) is at lp:
+~maria-captains/maria/maria-5.1-table-elimination tree.
+
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
-=-=(Guest - Wed, 03 Jun 2009, 15:04)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.20378 2009-06-03 15:04:54.000000000 +0300
+++ /tmp/wklog.17.new.20378 2009-06-03 15:04:54.000000000 +0300
@@ -135,3 +135,8 @@
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
+
+* For Multi-table UPDATEs/DELETEs, need to also analyze the SET clause:
+ - affected tables must not be eliminated
+ - tables that are used on the right side of the SET x=y assignments must
+ not be eliminated either.
-=-=(Psergey - Wed, 03 Jun 2009, 12:07)=-=-
Dependency created: 29 now depends on 17
-=-=(Guest - Tue, 02 Jun 2009, 00:54)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.23548 2009-06-02 00:54:13.000000000 +0300
+++ /tmp/wklog.17.new.23548 2009-06-02 00:54:13.000000000 +0300
@@ -128,3 +128,10 @@
- this is what happens for constant tables, too.
- I don't see how showing them could be of any use. They only make it
harder to read the rewritten query.
+
+* Table elimination is performed after constant table detection (but before
+ the range analysis). Constant tables are technically different from
+ eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
+ Considering we've already done the join_read_const_table() call, is there any
+ real difference between constant table and eliminated one? If there is, should
+ we mark const tables also as eliminated?
-=-=(Psergey - Mon, 01 Jun 2009, 20:46)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.17448 2009-06-01 20:46:40.000000000 +0300
+++ /tmp/wklog.17.new.17448 2009-06-01 20:46:40.000000000 +0300
@@ -122,3 +122,9 @@
always. If we want table elimination to work in presence of grouping, need
to devise some other way of analyzing aggregate functions.
+
+* Should eliminated tables be shown in EXPLAIN EXTENDED?
+ - If we just ignore the question, they will be shown
+ - this is what happens for constant tables, too.
+ - I don't see how showing them could be of any use. They only make it
+ harder to read the rewritten query.
-=-=(Guest - Mon, 01 Jun 2009, 12:49)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.32202 2009-06-01 12:49:15.000000000 +0300
+++ /tmp/wklog.17.new.32202 2009-06-01 12:49:15.000000000 +0300
@@ -8,7 +8,7 @@
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
-
+7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
@@ -116,3 +116,9 @@
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
+* Aggregate functions report they depend on all tables, that is,
+
+ item_agg_func->used_tables() == (1ULL << join->tables) - 1
+
+ always. If we want table elimination to work in presence of grouping, need
+ to devise some other way of analyzing aggregate functions.
-=-=(Guest - Fri, 29 May 2009, 00:45)=-=-
Low Level Design modified.
--- /tmp/wklog.17.old.1348 2009-05-29 00:45:21.000000000 +0300
+++ /tmp/wklog.17.new.1348 2009-05-29 00:45:21.000000000 +0300
@@ -111,3 +111,8 @@
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
+7. Additional issues
+--------------------
+* We remove ON clauses within semi-join nests. If these clauses contain
+ subqueries, they probably should be gone from EXPLAIN output also?
+
------------------------------------------------------------
-=-=(View All Progress Notes, 24 total)=-=-
http://askmonty.org/worklog/index.pl?tid=17&nolimit=1
DESCRIPTION:
Eliminate tables that are not needed from SELECT queries.
This will speed up some views and automatically generated queries.
Example:
CREATE TABLE tableB (id int primary key);
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
In this case we can remove table B and the join from the query.
HIGH-LEVEL SPECIFICATION:
Here is an extended explanation of table elimination.
Table elimination is a feature found in some modern query optimizers, of
which Microsoft SQL Server 2005/2008 seems to have the most advanced
implementation. Oracle 11g has also been confirmed to use table
elimination but not to the same extent.
Basically, what table elimination does is to remove tables from the
execution plan when it is unnecessary to include them. This can, of
course, only happen if the right circumstances arise. Let us for example
look at the following query:
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id;
When using A as the left table we ensure that the query will return at
least as many rows as there are in that table. For rows where the join
condition (B.id = A.id) is not met, the selected column (A.colA) will
still contain its original value, and the unmatched B.* columns will all be NULL.
However, the result set could actually contain more rows than what is
found in tableA if there are duplicates of the column B.id in tableB. If
A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"]
then two rows will match in the join condition. The only way to know
what the result will look like is to actually touch both tables during
execution.
Instead, let's say that tableB contains rows that make it possible to
place a unique constraint on the column B.id, for example (and most often)
a primary key. In this situation we know that we will get exactly
as many rows as there are in tableA, since joining with tableB cannot
introduce any duplicates. If, furthermore, as in the example query, we do not
select any columns from tableB, touching that table during execution is
unnecessary. We can remove the whole join operation from the execution
plan.
Both SQL Server 2005/2008 and Oracle 11g will deploy table elimination
in the case described above. Let us look at a more advanced query, where
Oracle fails.
select
A.colA
from
tableA A
left outer join
tableB B
on
B.id = A.id
and
B.fromDate = (
select
max(sub.fromDate)
from
tableB sub
where
sub.id = A.id
);
In this example we have added another join condition, which ensures
that we only pick the matching row from tableB having the latest
fromDate. In this case tableB will contain duplicates of the column
B.id, so in order to ensure uniqueness the primary key has to contain
the fromDate column as well. In other words the primary key of tableB
is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest
B.fromDate for a given B.id we know that at most one row will match
the join condition. We will again have the situation where joining
with tableB cannot affect the number of rows in the result set. Since
we do not select any columns from tableB, the whole join operation can
be eliminated from the execution plan.
SQL Server 2005/2008 will deploy table elimination in this situation as
well. We have not found a way to make Oracle 11g use it for this type of
query. Queries like these arise in two situations: either when you have a
denormalized model consisting of a fact table with several related
dimension tables, or when you have a highly normalized model where each
attribute is stored in its own table. The example with the subselect is
common whenever you store historized/versioned data.
LOW-LEVEL DESIGN:
The code (currently in development) is at lp:
~maria-captains/maria/maria-5.1-table-elimination tree.
<contents>
1. Conditions for removal
1.1 Quick check if there are candidates
2. Removal operation properties
3. Removal operation
4. User interface
5. Tests and benchmarks
6. Todo, issues to resolve
6.1 To resolve
6.2 Resolved
7. Additional issues
</contents>
It's not really about elimination of tables, it's about elimination of inner
sides of outer joins.
1. Conditions for removal
-------------------------
We can eliminate an inner side of an outer join if:
1. For each record combination of outer tables, it will always produce
exactly one record.
2. There are no references to columns of the inner tables anywhere else in
the query.
#1 means that every table inside the outer join nest either:
- is a constant table:
= because it can be accessed via eq_ref(const) access, or
= it is a zero-rows or one-row MyISAM-like table [MARK1]
- or has an eq_ref access method candidate.
#2 means that the WHERE clause, the ON clauses of embedding outer joins,
ORDER BY, GROUP BY and HAVING do not refer to the inner tables of the outer
join nest.
1.1 Quick check if there are candidates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before we start to enumerate join nests, here is a quick way to check if
there *can be* something to be removed:
  if ((tables used in select_list |
       tables used in group/order by UNION |
       tables used in where) != bitmap_of_all_tables)
  {
    attempt table elimination;
  }
2. Removal operation properties
-------------------------------
* There is always one way to remove (no choice to remove either this or that)
* It is always better to remove as many tables as possible (at least within
our cost model).
Thus, no need for any cost calculations/etc. It's an unconditional rewrite.
3. Removal operation
--------------------
* Remove the outer join nest's nested join structure (i.e. get the
outer join's TABLE_LIST object $OJ and remove it from $OJ->embedding,
$OJ->embedding->nested_join. Update table_map's of all ancestor nested
joins). [MARK2]
* Move the tables and their JOIN_TABs to the front, like it is done with const
tables, with the exception that if the eliminated outer join nest was within
another outer join nest, that shouldn't prevent us from moving the
eliminated tables away.
* Update join->table_count and all-join-tables bitmap.
* That's it. Nothing else?
4. User interface
-----------------
* We'll add an @@optimizer switch flag for table elimination. Tentative
name: 'table_elimination'.
(Note: the utility of the above is questioned, as table elimination can never
be worse than no elimination. We're leaning towards not adding the flag.)
* EXPLAIN will not show the removed tables at all. This makes it possible to
check whether tables were removed, and also behaves nicely with the anchor
model and VIEWs: stuff that the user doesn't care about just won't be there.
5. Tests and benchmarks
-----------------------
Create a benchmark in sql-bench which checks if the DBMS has table
elimination.
[According to Monty] Run
- queries that would use elimination
- queries that are very similar to the ones above (so that they would have the
same QEP, execution cost, etc.) but cannot use table elimination.
6. Todo, issues to resolve
--------------------------
6.1 To resolve
~~~~~~~~~~~~~~
- Re-check how this works with equality propagation.
- Relationship with prepared statements.
On one hand, it's natural to want to make table elimination a
once-per-statement operation, like outer->inner join conversion. We'll have
to limit the applicability by removing [MARK1], as that can change during
the lifetime of the statement.
The other option is to do table elimination every time. This will require
reworking operation [MARK2] to be undoable.
I'm leaning towards doing the former. With anchor modeling, it is unlikely
that we'll meet outer joins which have N inner tables of which some are one-row
MyISAM tables that do not have a primary key.
6.2 Resolved
~~~~~~~~~~~~
* outer->inner join conversion is not a problem for table elimination.
We make outer->inner conversions based on predicates in WHERE. If the WHERE
referred to an inner table (requirement for OJ->IJ conversion) then table
elimination would not be applicable anyway.
* For multi-table UPDATEs/DELETEs, we need to also analyze the SET clause:
- affected tables must not be eliminated
- tables that are used on the right side of the SET x=y assignments must
not be eliminated either.
* Aggregate functions used to report that they depend on all tables, that is,
item_agg_func->used_tables() == (1ULL << join->tables) - 1
always. Fixed: now an aggregate function reports that it depends on the
tables that its arguments depend on. In particular, COUNT(*) reports
that it depends on no tables (item_count_star->used_tables()==0).
One consequence of that is that "item->used_tables()==0" is not
equivalent to "item->const_item()==true" anymore (not sure if it's
"anymore" or this has been already happening).
* EXPLAIN EXTENDED warning text was generated after the JOIN object had
been discarded. This made it impossible to use information about the join
plan when printing the warning. Fixed by keeping the JOIN objects until
we've printed the warning (we also intend to remove the const
tables from the join output).
7. Additional issues
--------------------
* We remove ON clauses within semi-join nests. If these clauses contain
subqueries, they probably should be gone from EXPLAIN output also?
* Table elimination is performed after constant table detection (but before
the range analysis). Constant tables are technically different from
eliminated ones (e.g. the former are shown in EXPLAIN and the latter aren't).
Considering we've already done the join_read_const_table() call, is there any
real difference between constant table and eliminated one? If there is, should
we mark const tables also as eliminated?
From the user/EXPLAIN point of view: no. A constant table is one that we read
one record from; an eliminated table is one that we don't access at all.
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)