Hi Timour, On Fri, Nov 18, 2011 at 11:16:27PM +0200, Timour Katchaounov wrote:
Sergey,
Please review the following patch. Do you have any idea about the change in func_compress.result ? I've seen that before, but don't remember what was the issue.
I think in this case it is fairly clear - a warning is produced every time a constant expression is evaluated. Your patch makes one less evaluation, so one fewer warning is emitted. The patch is ok to push after the below notes are addressed.
------------------------------------------------------------ revno: 3296 revision-id: timour@askmonty.org-20111118201920-fac3c9870u7eh898 parent: igor@askmonty.org-20111118173551-78xd3uuu5h25756u fixes bug(s): https://launchpad.net/bugs/833777 committer: timour@askmonty.org branch nick: 5.3-lpb833777 timestamp: Fri 2011-11-18 22:19:20 +0200 message: Fix bug lp:833777
Analysis: The optimizer distinguishes two kinds of 'constant' conditions: expensive ones, and non-expensive ones. The non-expensive conditions are evaluated inside make_join_select(), and if false, already the optimizer detects empty query results.
In order to avoid arbitrarily expensive optimization, the evaluation of expensive constant conditions is delayed until execution. These conditions are attached to JOIN::exec_const_cond and evaluated in the beginning of JOIN::exec. The relevant execution logic is:
JOIN::exec() { if (! join->exec_const_cond->val_int()) { produce an empty result; stop execution } continue execution execute the original WHERE clause (that contains exec_const_cond) ... }
As a result, when an expensive constant condition is TRUE, it is evaluated twice - once through JOIN::exec_const_cond, and once through JOIN::cond. When the expensive constant condition is a subquery, predicate, the subquery is evaluated twice. If we have many levels of subqueries, this logic results in a chain of recursive subquery executions that walk a perfect binary tree. The result is that for subquries with depth N, JOIN::exec is executed O(2^N) times.
Solution: Notice that the second execution of the constant conditions happens inside do_select(), in the branch: if (join->table_count == join->const_tables) { ... } In this case exec_const_cond is equivalent to the whole WHERE clause, therefore the WHERE clause has already been checked in the beginnig of JOIN::exec, and has been found to be true. The bug is addressed by not evaluating the WHERE clause if there was exec_const_conds, and it was TRUE.
=== modified file 'mysql-test/r/func_compress.result' --- a/mysql-test/r/func_compress.result 2011-02-03 15:00:28 +0000 +++ b/mysql-test/r/func_compress.result 2011-11-18 20:19:20 +0000 @@ -103,7 +103,6 @@ foo Warnings: Error 1259 ZLIB: Input data corrupted Error 1259 ZLIB: Input data corrupted -Error 1259 ZLIB: Input data corrupted explain select *, uncompress(a) from t1; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE t1 system NULL NULL NULL NULL 1
=== modified file 'mysql-test/r/subselect.result' --- a/mysql-test/r/subselect.result 2011-11-15 21:03:00 +0000 +++ b/mysql-test/r/subselect.result 2011-11-18 20:19:20 +0000 @@ -5665,4 +5665,14 @@ ERROR 21000: Subquery returns more than SET SESSION sql_mode=@old_sql_mode; DEALLOCATE PREPARE stmt; DROP TABLE t1; +# +# LP BUG#833777 Performance regression with deeply nested subqueries +# +create table t1 (a int not null, b char(10) not null); +insert into t1 values (1, 'a'); +set @@optimizer_switch='in_to_exists=on,semijoin=off,materialization=off,subquery_cache=off'; +select benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1)))))))))))))))))))))))))))))))))); +benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a +0
Why use 'benchmark' function? It seems totally redundant. Please change the test to look like: select (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in ... (select a from t1) ))))))))))))))))))))))))))))))))) AS foo; to make it less cluttered.
+drop table t1; set optimizer_switch=@subselect_tmp;
=== modified file 'mysql-test/r/subselect_no_mat.result' --- a/mysql-test/r/subselect_no_mat.result 2011-11-15 21:03:00 +0000 +++ b/mysql-test/r/subselect_no_mat.result 2011-11-18 20:19:20 +0000 @@ -5670,6 +5670,16 @@ ERROR 21000: Subquery returns more than SET SESSION sql_mode=@old_sql_mode; DEALLOCATE PREPARE stmt; DROP TABLE t1; +# +# LP BUG#833777 Performance regression with deeply nested subqueries +# +create table t1 (a int not null, b char(10) not null); +insert into t1 values (1, 'a'); +set @@optimizer_switch='in_to_exists=on,semijoin=off,materialization=off,subquery_cache=off'; +select benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1)))))))))))))))))))))))))))))))))); +benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a +0 +drop table t1; set optimizer_switch=@subselect_tmp; set optimizer_switch=default; select @@optimizer_switch like '%materialization=on%';
=== modified file 'mysql-test/r/subselect_no_opts.result' --- a/mysql-test/r/subselect_no_opts.result 2011-11-15 21:03:00 +0000 +++ b/mysql-test/r/subselect_no_opts.result 2011-11-18 20:19:20 +0000 @@ -5666,5 +5666,15 @@ ERROR 21000: Subquery returns more than SET SESSION sql_mode=@old_sql_mode; DEALLOCATE PREPARE stmt; DROP TABLE t1; +# +# LP BUG#833777 Performance regression with deeply nested subqueries +# +create table t1 (a int not null, b char(10) not null); +insert into t1 values (1, 'a'); +set @@optimizer_switch='in_to_exists=on,semijoin=off,materialization=off,subquery_cache=off'; +select benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1)))))))))))))))))))))))))))))))))); +benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a +0 +drop table t1; set optimizer_switch=@subselect_tmp; set @optimizer_switch_for_subselect_test=null;
=== modified file 'mysql-test/r/subselect_no_scache.result' --- a/mysql-test/r/subselect_no_scache.result 2011-11-15 21:03:00 +0000 +++ b/mysql-test/r/subselect_no_scache.result 2011-11-18 20:19:20 +0000 @@ -5669,6 +5669,16 @@ ERROR 21000: Subquery returns more than SET SESSION sql_mode=@old_sql_mode; DEALLOCATE PREPARE stmt; DROP TABLE t1; +# +# LP BUG#833777 Performance regression with deeply nested subqueries +# +create table t1 (a int not null, b char(10) not null); +insert into t1 values (1, 'a'); +set @@optimizer_switch='in_to_exists=on,semijoin=off,materialization=off,subquery_cache=off'; +select benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1)))))))))))))))))))))))))))))))))); +benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a +0 +drop table t1; set optimizer_switch=@subselect_tmp; set optimizer_switch=default; select @@optimizer_switch like '%subquery_cache=on%';
=== modified file 'mysql-test/r/subselect_no_semijoin.result' --- a/mysql-test/r/subselect_no_semijoin.result 2011-11-15 21:03:00 +0000 +++ b/mysql-test/r/subselect_no_semijoin.result 2011-11-18 20:19:20 +0000 @@ -5666,5 +5666,15 @@ ERROR 21000: Subquery returns more than SET SESSION sql_mode=@old_sql_mode; DEALLOCATE PREPARE stmt; DROP TABLE t1; +# +# LP BUG#833777 Performance regression with deeply nested subqueries +# +create table t1 (a int not null, b char(10) not null); +insert into t1 values (1, 'a'); +set @@optimizer_switch='in_to_exists=on,semijoin=off,materialization=off,subquery_cache=off'; +select benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1)))))))))))))))))))))))))))))))))); +benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a +0 +drop table t1; set optimizer_switch=@subselect_tmp; set @optimizer_switch_for_subselect_test=null;
=== modified file 'mysql-test/t/subselect.test' --- a/mysql-test/t/subselect.test 2011-11-15 21:03:00 +0000 +++ b/mysql-test/t/subselect.test 2011-11-18 20:19:20 +0000 @@ -4768,4 +4768,14 @@ SET SESSION sql_mode=@old_sql_mode; DEALLOCATE PREPARE stmt; DROP TABLE t1;
+--echo # +--echo # LP BUG#833777 Performance regression with deeply nested subqueries +--echo # + +create table t1 (a int not null, b char(10) not null); +insert into t1 values (1, 'a'); +set @@optimizer_switch='in_to_exists=on,semijoin=off,materialization=off,subquery_cache=off'; +select benchmark(1, (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1 where a in (select a from t1)))))))))))))))))))))))))))))))))); +drop table t1; + set optimizer_switch=@subselect_tmp;
=== modified file 'sql/sql_select.cc' --- a/sql/sql_select.cc 2011-11-15 21:03:00 +0000 +++ b/sql/sql_select.cc 2011-11-18 20:19:20 +0000 @@ -14782,9 +14782,15 @@ do_select(JOIN *join,List<Item> *fields, { /* HAVING will be checked after processing aggregate functions, - But WHERE should checkd here (we alredy have read tables) + But WHERE should checkd here (we alredy have read tables). + If there is join->exec_const_cond, and all tables are constant, then it + is equivalent to join->conds. exec_const_cond is already checked in the + beginning of JOIN::exec. If it is false, JOIN::exec returns zero + result already there, therefore execution reaches this point only if + exec_const_cond is TRUE. Since it is equvalent to join->conds, then + join->conds is also TRUE. */ - if (!join->conds || join->conds->val_int()) + if (!join->conds || join->exec_const_cond || join->conds->val_int()) { error= (*end_select)(join, 0, 0); if (error == NESTED_LOOP_OK || error == NESTED_LOOP_QUERY_LIMIT)
_______________________________________________ commits mailing list commits@mariadb.org https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
-- BR Sergei -- Sergei Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog