[Maria-developers] Rev 2765: Change Field_enumerator to enumerate Item_field-s not Field-s. in file:///home/psergey/dev/maria-5.3-subqueries-r7/
At file:///home/psergey/dev/maria-5.3-subqueries-r7/ ------------------------------------------------------------ revno: 2765 revision-id: psergey@askmonty.org-20100221063223-h0f7u2low7rtjixc parent: psergey@askmonty.org-20100221033618-83dgm2h9ingzmhcc committer: Sergey Petrunya <psergey@askmonty.org> branch nick: maria-5.3-subqueries-r7 timestamp: Sun 2010-02-21 08:32:23 +0200 message: Change Field_enumerator to enumerate Item_field-s not Field-s. In Item_ref::fix_fields() do invoke mark_as_dependent() for outside references in all cases (see email for more details) === modified file 'sql/item.cc' --- a/sql/item.cc 2010-02-11 23:59:58 +0000 +++ b/sql/item.cc 2010-02-21 06:32:23 +0000 @@ -1959,7 +1959,7 @@ bool Item_field::enumerate_field_refs_processor(uchar *arg) { Field_enumerator *fe= (Field_enumerator*)arg; - fe->visit_field(field); + fe->visit_field(this); return FALSE; } @@ -5779,6 +5779,35 @@ set_properties(); } +/* + A Field_enumerator-compatible class that invokes mark_as_dependent() for + each field that is a reference to some ancestor of current_select. +*/ +class Dependency_marker: public Field_enumerator +{ +public: + THD *thd; + st_select_lex *current_select; + virtual void visit_field(Item_field *item) + { + // Find which select the field is in. This is achieved by walking up + // the select tree and looking for the table of interest. + st_select_lex *sel; + for (sel= current_select; sel; sel= sel->outer_select()) + { + TABLE_LIST *tbl; + for (tbl= sel->leaf_tables; tbl; tbl= tbl->next_leaf) + { + if (tbl->table == item->field->table) + { + if (sel != current_select) + mark_as_dependent(thd, sel, current_select, item, item); + return; + } + } + } + } +}; /** Resolve the name of a reference to a column reference. @@ -6038,6 +6067,20 @@ last_checked_context->select_lex->nest_level); } } + else + { + ; + /* + It could be that we're referring to something that's in ancestor selects. + We must make an appropriate mark_as_dependent() call for each such + outside reference. + */ + Dependency_marker dep_marker; + dep_marker.current_select= current_sel; + dep_marker.thd= thd; + (*ref)->walk(&Item::enumerate_field_refs_processor, FALSE, + (uchar*)&dep_marker); + } DBUG_ASSERT(*ref); /* === modified file 'sql/item.h' --- a/sql/item.h 2010-02-21 03:36:18 +0000 +++ b/sql/item.h 2010-02-21 06:32:23 +0000 @@ -1134,7 +1134,7 @@ class Field_enumerator { public: - virtual void visit_field(Field *field)= 0; + virtual void visit_field(Item_field *field)= 0; virtual ~Field_enumerator() {}; /* purecov: inspected */ }; === modified file 'sql/item_subselect.cc' --- a/sql/item_subselect.cc 2010-02-21 03:36:18 +0000 +++ b/sql/item_subselect.cc 2010-02-21 06:32:23 +0000 @@ -319,13 +319,13 @@ public: table_map used_tables; /* Collect used_tables here */ st_select_lex *new_parent; /* Select we're in */ - virtual void visit_field(Field *field) + virtual void visit_field(Item_field *item) { //for (TABLE_LIST *tbl= new_parent->leaf_tables; tbl; tbl= tbl->next_local) //{ // if (tbl->table == field->table) // { - used_tables|= field->table->map; + used_tables|= item->field->table->map; // return; // } //} === modified file 'sql/opt_table_elimination.cc' --- a/sql/opt_table_elimination.cc 2010-01-17 14:51:10 +0000 +++ b/sql/opt_table_elimination.cc 2010-02-21 06:32:23 +0000 @@ -922,8 +922,9 @@ Field_dependency_recorder(Dep_analysis_context *ctx_arg): ctx(ctx_arg) {} - void visit_field(Field *field) + void visit_field(Item_field *item) { + Field *field= item->field; Dep_value_table *tbl_dep; if ((tbl_dep= ctx->table_deps[field->table->tablenr])) {
On Sun, Feb 21, 2010 at 09:32:26AM +0300, Sergey Petrunya wrote:
At file:///home/psergey/dev/maria-5.3-subqueries-r7/
------------------------------------------------------------ revno: 2765 revision-id: psergey@askmonty.org-20100221063223-h0f7u2low7rtjixc parent: psergey@askmonty.org-20100221033618-83dgm2h9ingzmhcc committer: Sergey Petrunya <psergey@askmonty.org> branch nick: maria-5.3-subqueries-r7 timestamp: Sun 2010-02-21 08:32:23 +0200 message: Change Field_enumerator to enumerate Item_field-s not Field-s. In Item_ref::fix_fields() do invoke mark_as_dependent() for outside references in all cases (see email for more details)
The problem occurs only when using --ps-protocol and manifests itself as follows: CURRENT_TEST: main.subselect mysqltest: At line 258: query 'select numeropost as a FROM t1 ORDER BY (SELECT 1 FROM t1 HAVING a=1)' succeeded - should have failed with errno 1242... The result from queries just before the failure was: < snip > CREATE TABLE `t1` ( `numeropost` mediumint(8) unsigned NOT NULL auto_increment, `maxnumrep` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`numeropost`), UNIQUE KEY `maxnumrep` (`maxnumrep`) ) ENGINE=MyISAM ROW_FORMAT=FIXED; INSERT INTO t1 (numeropost,maxnumrep) VALUES (1,0),(2,1); select numeropost as a FROM t1 GROUP BY (SELECT 1 FROM t1 HAVING a=1); ERROR 21000: Subquery returns more than 1 row The lack of error is caused by "HAVING a=1" not being true, which itself is caused by "a" having garbage, because the optimizer evaluates the subquery before it starts exection of the upper select, which it does because it sees that subselect_item->used_tables() == 0 (*) which is wrong. (*) occurs because during EXECUTE running fix_fields() for "HAVING a=1" does not trigger any mark_as_dependent() calls. (I'm not sure why this happens in this case but not in others. Normally, fix_fields() call for outer references will cause mark_as_dependent calls. There is a whole mark_select_range_as_dependent() function which is called only by EXECUTE (and not by regular non-PS execution). Phyisically it looks as follows: bool Item_ref::fix_fields(THD *thd, Item **reference) { ... if (!ref || ref == not_found_item) { (All mark_as_dependent() or mark_select_range_as_dependent() calls are made from here) } DBUG_ASSERT(*ref); ... } For the example query, the if branch is taken during PREPARE command, but not in EXECUTE. This means that during EXECUTE branch we don't call mark.*as_dependent(), which causes subselect's outside reference not to be recorded, and then subselect_item->used_tables()==0 which leads to errors. To avoid the problem, I've changed the code to be: bool Item_ref::fix_fields(THD *thd, Item **reference) { ... if (!ref || ref == not_found_item) { (All mark_as_dependent() or mark_select_range_as_dependent() calls are made from here) } else { (Enumerate all table field references in *ref and call mark_as_dependent() as necessary) } DBUG_ASSERT(*ref); ... } BR Sergey -- Sergey Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog
On Sun, Feb 21, 2010 at 09:07:34AM +0200, Sergey Petrunya wrote:
On Sun, Feb 21, 2010 at 09:32:26AM +0300, Sergey Petrunya wrote:
At file:///home/psergey/dev/maria-5.3-subqueries-r7/
------------------------------------------------------------ revno: 2765 revision-id: psergey@askmonty.org-20100221063223-h0f7u2low7rtjixc parent: psergey@askmonty.org-20100221033618-83dgm2h9ingzmhcc committer: Sergey Petrunya <psergey@askmonty.org> branch nick: maria-5.3-subqueries-r7 timestamp: Sun 2010-02-21 08:32:23 +0200 message: Change Field_enumerator to enumerate Item_field-s not Field-s. In Item_ref::fix_fields() do invoke mark_as_dependent() for outside references in all cases (see email for more details)
The problem occurs only when using --ps-protocol and manifests itself as follows:
CURRENT_TEST: main.subselect mysqltest: At line 258: query 'select numeropost as a FROM t1 ORDER BY (SELECT 1 FROM t1 HAVING a=1)' succeeded - should have failed with errno 1242...
The result from queries just before the failure was: < snip > CREATE TABLE `t1` ( `numeropost` mediumint(8) unsigned NOT NULL auto_increment, `maxnumrep` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`numeropost`), UNIQUE KEY `maxnumrep` (`maxnumrep`) ) ENGINE=MyISAM ROW_FORMAT=FIXED; INSERT INTO t1 (numeropost,maxnumrep) VALUES (1,0),(2,1); select numeropost as a FROM t1 GROUP BY (SELECT 1 FROM t1 HAVING a=1); ERROR 21000: Subquery returns more than 1 row
The lack of error is caused by "HAVING a=1" not being true, which itself is caused by "a" having garbage, because the optimizer evaluates the subquery before it starts exection of the upper select, which it does because it sees that
subselect_item->used_tables() == 0 (*)
which is wrong.
(*) occurs because during EXECUTE running fix_fields() for "HAVING a=1" does not trigger any mark_as_dependent() calls.
(I'm not sure why this happens in this case but not in others. Normally, fix_fields() call for outer references will cause mark_as_dependent calls. There is a whole mark_select_range_as_dependent() function which is called only by EXECUTE (and not by regular non-PS execution).
Phyisically it looks as follows:
bool Item_ref::fix_fields(THD *thd, Item **reference) { ... if (!ref || ref == not_found_item) { (All mark_as_dependent() or mark_select_range_as_dependent() calls are made from here) } DBUG_ASSERT(*ref); ... }
For the example query, the if branch is taken during PREPARE command, but not in EXECUTE. This means that during EXECUTE branch we don't call mark.*as_dependent(), which causes subselect's outside reference not to be recorded, and then subselect_item->used_tables()==0 which leads to errors.
To avoid the problem, I've changed the code to be:
bool Item_ref::fix_fields(THD *thd, Item **reference) { ... if (!ref || ref == not_found_item) { (All mark_as_dependent() or mark_select_range_as_dependent() calls are made from here) } else { (Enumerate all table field references in *ref and call mark_as_dependent() as necessary) } DBUG_ASSERT(*ref); ... }
An alternative would be to make Item_ref resolve its reference in every fix_fields() call. So I've tried to do that in this way: === modified file 'sql/item.cc' --- sql/item.cc 2010-02-24 11:33:42 +0000 +++ sql/item.cc 2010-02-27 15:51:00 +0000 @@ -6141,6 +6141,7 @@ void Item_ref::cleanup() { DBUG_ENTER("Item_ref::cleanup"); Item_ident::cleanup(); + ref= 0; //psergey-try result_field= 0; DBUG_VOID_RETURN; } And then I get numerous failures like this: ./mysql-test-run --retry=1 --ps-protocol t/subselect.test CURRENT_TEST: main.subselect mysqltest: At line 101: query 'select * from t3 where a not in (select b from t2)' failed: 1054: Unknown column '<no matter>.<left expr>' in 'where clause' This means that Item_ref is unable to re-resolve Item_ref::ref on second re-execution. The question for Sanja is whether it's possible to make Item_ref re-resolve Item_ref::ref and make appropriate mark_*as_dependent() calls. BR Sergey -- Sergey Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog
participants (1)
-
Sergey Petrunya