Re: [Maria-developers] Rev 2765: Change Field_enumerator to enumerate Item_field-s not Field-s. in file:///home/psergey/dev/maria-5.3-subqueries-r7/

27 Feb 2010

On Sun, Feb 21, 2010 at 09:07:34AM +0200, Sergey Petrunya wrote:
...
On Sun, Feb 21, 2010 at 09:32:26AM +0300, Sergey Petrunya wrote:
...
At file:///home/psergey/dev/maria-5.3-subqueries-r7/
------------------------------------------------------------
revno: 2765
revision-id: psergey@askmonty.org-20100221063223-h0f7u2low7rtjixc
parent: psergey@askmonty.org-20100221033618-83dgm2h9ingzmhcc
committer: Sergey Petrunya <psergey@askmonty.org>
branch nick: maria-5.3-subqueries-r7
timestamp: Sun 2010-02-21 08:32:23 +0200
message:
  Change Field_enumerator to enumerate Item_field-s not Field-s.
  In Item_ref::fix_fields() do invoke mark_as_dependent() for outside 
  references in all cases (see email for more details)
The problem occurs only when using --ps-protocol and manifests itself as
follows:
CURRENT_TEST: main.subselect
mysqltest: At line 258: query 'select numeropost as a FROM t1 ORDER BY (SELECT 1 FROM t1 HAVING a=1)' succeeded - should have failed with errno 1242...
The result from queries just before the failure was:
< snip >
CREATE TABLE `t1` (
`numeropost` mediumint(8) unsigned NOT NULL auto_increment,
`maxnumrep` int(10) unsigned NOT NULL default '0',
PRIMARY KEY  (`numeropost`),
UNIQUE KEY `maxnumrep` (`maxnumrep`)
) ENGINE=MyISAM ROW_FORMAT=FIXED;
INSERT INTO t1 (numeropost,maxnumrep) VALUES (1,0),(2,1);
select numeropost as a FROM t1 GROUP BY (SELECT 1 FROM t1 HAVING a=1);
ERROR 21000: Subquery returns more than 1 row
The lack of error is caused by "HAVING a=1" not being true, which itself is
caused by "a" having garbage, because the optimizer evaluates the subquery
before it starts exection of the upper select, which it does because it sees
that
subselect_item->used_tables() == 0  (*)
which is wrong.
(*) occurs because during EXECUTE running fix_fields() for "HAVING a=1" does
not trigger any mark_as_dependent() calls.
(I'm not sure why this happens in this case but not in others. Normally,
fix_fields() call for outer references will cause mark_as_dependent calls. 
There is a whole mark_select_range_as_dependent() function which is called 
only by EXECUTE (and not by regular non-PS execution).
Phyisically it looks as follows:
bool Item_ref::fix_fields(THD *thd, Item **reference)
{
  ...
  if (!ref || ref == not_found_item)
  {
     (All mark_as_dependent() or mark_select_range_as_dependent() calls are
     made from here)
  }
  DBUG_ASSERT(*ref);
  ...
}
For the example query, the if branch is taken during PREPARE command, but not 
in EXECUTE. This means that during EXECUTE branch we don't call
mark.*as_dependent(), which causes subselect's outside reference not to be
recorded, and then subselect_item->used_tables()==0 which leads to errors.
To avoid the problem, I've changed the code to be:
bool Item_ref::fix_fields(THD *thd, Item **reference)
{
  ...
  if (!ref || ref == not_found_item)
  {
     (All mark_as_dependent() or mark_select_range_as_dependent() calls are
     made from here)
  }
  else
  {
    (Enumerate all table field references in *ref and call mark_as_dependent()
    as necessary)
  }
  DBUG_ASSERT(*ref);
  ...
}
An alternative would be to make Item_ref resolve its reference in every
fix_fields() call. So I've tried to do that in this way:

=== modified file 'sql/item.cc'

--- sql/item.cc 2010-02-24 11:33:42 +0000
+++ sql/item.cc 2010-02-27 15:51:00 +0000
@@ -6141,6 +6141,7 @@ void Item_ref::cleanup()
 {
   DBUG_ENTER("Item_ref::cleanup");
   Item_ident::cleanup();
+  ref= 0; //psergey-try
   result_field= 0;
   DBUG_VOID_RETURN;
 }


And then I get numerous failures like this:

./mysql-test-run --retry=1 --ps-protocol t/subselect.test

CURRENT_TEST: main.subselect
mysqltest: At line 101: query 'select * from t3 where a not in (select b from t2)' failed: 1054: Unknown column '<no matter>.<left expr>' in 'where clause'

This means that Item_ref is unable to re-resolve Item_ref::ref on second re-execution.

The question for Sanja is whether it's possible to make Item_ref re-resolve 
Item_ref::ref and make appropriate mark_*as_dependent() calls.

BR
 Sergey
-- 
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog