[Maria-developers] GROUP BY with non-grouped column returns NULL in some cases?
Hi, I've noticed a difference in MariaDB's behavior compared to its documentation and compared to MySQL 5.1 (didn't check with MySQL 5.5 or 5.6 yet), and I wanted to know what's your opinion -- is it a bug or a conscious difference and documentation is just lagging. This doc https://mariadb.com/kb/en/select/#group-by says: "If you select a non-grouped column or a value computed from a non-grouped column, it is undefined which row the returned value is taken from". I believe this implies that selected value must be returned from a row group, not just something random. But I have a scenario when NULL is returned instead of real values. I understand that this is completely undefined behavior in SQL standard, but MySQL and MariaDB seem to have a documented support for this, that's why I'm asking. Here's the mysql session log: Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 5 Server version: 10.0.8-MariaDB-log Source distribution Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> create database test; Query OK, 1 row affected (0.00 sec) MariaDB [(none)]> use test; Database changed MariaDB [test]> CREATE TABLE t (id bigint(20), name varchar(30), parent bigint(20), primary key (id)) ENGINE=InnoDB; Query OK, 0 rows affected (0.27 sec) MariaDB [test]> INSERT INTO t values (1, 'a', NULL), (4, 'b', 1), (5, 'c', 4), (10, 'd', 1); Query OK, 4 rows affected (0.04 sec) Records: 4 Duplicates: 0 Warnings: 0 MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a') GROUP BY t1.id; +----+------+------+------+ | id | name | id | name | +----+------+------+------+ | 1 | a | NULL | NULL | | 4 | b | NULL | NULL | | 5 | c | NULL | NULL | | 10 | d | NULL | NULL | +----+------+------+------+ 4 rows in set (0.00 sec) MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a'); +----+------+------+------+ | id | name | id | name | +----+------+------+------+ | 1 | a | NULL | NULL | | 4 | b | NULL | NULL | | 5 | c | 4 | b | | 10 | d | NULL | NULL | +----+------+------+------+ 4 rows in set (0.00 sec) MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a') GROUP BY t1.id, t1.name; +----+------+------+------+ | id | name | id | name | +----+------+------+------+ | 1 | a | NULL | NULL | | 4 | b | NULL | NULL | | 5 | c | 4 | b | | 10 | d | NULL | NULL | +----+------+------+------+ 4 rows in set (0.00 sec) MariaDB [test]> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON (t1.parent = t2.id) GROUP BY t1.id; +----+------+------+------+ | id | name | id | name | +----+------+------+------+ | 1 | a | NULL | NULL | | 4 | b | 1 | a | | 5 | c | 4 | b | | 10 | d | 1 | a | +----+------+------+------+ 4 rows in set (0.00 sec) As you see the first query returns NULLs instead of real values which can be seen in second query without GROUP BY. And if I add another column to GROUP BY or remove one condition from JOIN I can see real values again. I've also tested with the table without PRIMARY KEY and it also returned real values in the first query. So is this a bug? Thank you, Pavel
Hi, On Thu, Feb 20, 2014 at 10:28:52AM -0800, Pavel Ivanov wrote:
I've noticed a difference in MariaDB's behavior compared to its documentation and compared to MySQL 5.1 (didn't check with MySQL 5.5 or 5.6 yet), and I wanted to know what's your opinion -- is it a bug or a conscious difference and documentation is just lagging.
This doc https://mariadb.com/kb/en/select/#group-by says: "If you select a non-grouped column or a value computed from a non-grouped column, it is undefined which row the returned value is taken from". I believe this implies that selected value must be returned from a row group, not just something random.
I would agree with this reasoning. MySQL 5.5, MariaDB 5.5, and MySQL 5.6 will all return (5,c,4,b). MariaDB 10.0 returns (5,c,NULL,NULL). I have investigated where the NULL values come from. They come from the next value group. === TL;DR: investigation result == Let's run this on 10.0 (I have added a WHERE to remove irrelevant rows): mysql> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a') where t1.id > 4 GROUP BY t1.id; +----+------+------+------+ | id | name | id | name | +----+------+------+------+ | 5 | c | NULL | NULL | | 10 | d | NULL | NULL | +----+------+------+------+ The query has GROUP BY, which is not optimized away (by either MySQL or MariaDB). GROUP BY is satisified by using "range" access on t1, which produces records in the order where groups follow one another. First, end_send_group() sees the row (5,c,...). This is the first row, so it starts a group. Then, end_send_group() is invoked for row (10,d,...). It sees that the GROUP BY expression value has changed, and tries to print the first group. The execution is here: #0 Field::is_null #1 Protocol_text::store #2 Item_field::send #3 Protocol::send_result_set_row #4 select_send::send_data #5 end_send_group #6 evaluate_null_complemented_join_record #7 sub_select #8 evaluate_join_record #9 sub_select The field object at #0 points to the "result field" of table t2 (a TABLE object has "current row" fields and "result fields". AFAIU, the latter are kind of buffers that are used e.g. for stashing away GROUP BY columns). Unfortunately, Field::is_null() checks table->null_row, which is 1, because row combination (10,d, ...) has a NULL-complemented row for t2. This is why we get NULLs instead of some value from (5,c,...) group. == Where does this come from == It seems, MySQL 5.6 had this bug in the past. We've got the code line where the problem happens in revid: monty@askmonty.org-20130325220313-r7oxaau0x0r5ead1, Temporary commit of 10.0-merge Since then, MySQL 5.6 has fixed it, but their related commits look scary (fix for another fix which doesn't fully fix it?): revid: sergey.glukhov@oracle.com-20130619102408-0a0bf6sl5d7tsys Bug#16620047 INCORRECT QUERY RESULT (AFTER SERVER UPGRADE) Regression is introduced in Bug11760517. According to the comment for the Bug11760517 fix, there is an additional cleanup in field.h(Field::is_null() function): Field::is_null() now checks table->null_row before checking the NULL bits. Order of NULLity checking is changed, firts it checks table->null_row and then null_ptr[row_offset]. In our case when we evaluate first result record, evaluate_null_complemented_join_record() function is called for outer table and it sets table->null_row to true. Then second result record field uses table->null_row to check if the field is null and result is NULL instead of real value. The fix is partial reversion of the field.h change (return original order of NULL check). Unfortunately this is not sufficient. A lot of test fails in the main suite after reversion. It happens because of another additional cleanup introduced Bug11760517 fix(see the comment): @ sql/table.h mark_as_null_row() no longer sets the NULL bits of the buffer Reversion of this change fixes all the problems except one, test case for Bug13541761 which is actually another regression of Bug11760517 fix. The fix of this problem is to not touch TABLE::null_flags during JOIN::cleanup. BR Sergei -- Sergei Petrunia, Software Developer MariaDB | Skype: sergefp | Blog: http://s.petrunia.net/blog
Sounds like a JIRA issue is in order here. I've created one: https://mariadb.atlassian.net/browse/MDEV-5719. On Fri, Feb 21, 2014 at 7:34 PM, Sergey Petrunia <sergey@mariadb.com> wrote:
Hi,
On Thu, Feb 20, 2014 at 10:28:52AM -0800, Pavel Ivanov wrote:
I've noticed a difference in MariaDB's behavior compared to its documentation and compared to MySQL 5.1 (didn't check with MySQL 5.5 or 5.6 yet), and I wanted to know what's your opinion -- is it a bug or a conscious difference and documentation is just lagging.
This doc https://mariadb.com/kb/en/select/#group-by says: "If you select a non-grouped column or a value computed from a non-grouped column, it is undefined which row the returned value is taken from". I believe this implies that selected value must be returned from a row group, not just something random.
I would agree with this reasoning.
MySQL 5.5, MariaDB 5.5, and MySQL 5.6 will all return (5,c,4,b).
MariaDB 10.0 returns (5,c,NULL,NULL).
I have investigated where the NULL values come from. They come from the next value group.
=== TL;DR: investigation result ==
Let's run this on 10.0 (I have added a WHERE to remove irrelevant rows):
mysql> SELECT t1.id, t1.name, t2.id, t2.name FROM t t1 LEFT OUTER JOIN t t2 ON (t1.parent = t2.id and t2.name <> 'a') where t1.id > 4 GROUP BY t1.id; +----+------+------+------+ | id | name | id | name | +----+------+------+------+ | 5 | c | NULL | NULL | | 10 | d | NULL | NULL | +----+------+------+------+
The query has GROUP BY, which is not optimized away (by either MySQL or MariaDB). GROUP BY is satisified by using "range" access on t1, which produces records in the order where groups follow one another.
First, end_send_group() sees the row (5,c,...). This is the first row, so it starts a group. Then, end_send_group() is invoked for row (10,d,...). It sees that the GROUP BY expression value has changed, and tries to print the first group. The execution is here:
#0 Field::is_null #1 Protocol_text::store #2 Item_field::send #3 Protocol::send_result_set_row #4 select_send::send_data #5 end_send_group #6 evaluate_null_complemented_join_record #7 sub_select #8 evaluate_join_record #9 sub_select
The field object at #0 points to the "result field" of table t2 (a TABLE object has "current row" fields and "result fields". AFAIU, the latter are kind of buffers that are used e.g. for stashing away GROUP BY columns).
Unfortunately, Field::is_null() checks table->null_row, which is 1, because row combination (10,d, ...) has a NULL-complemented row for t2. This is why we get NULLs instead of some value from (5,c,...) group.
== Where does this come from ==
It seems, MySQL 5.6 had this bug in the past. We've got the code line where the problem happens in
revid: monty@askmonty.org-20130325220313-r7oxaau0x0r5ead1, Temporary commit of 10.0-merge
Since then, MySQL 5.6 has fixed it, but their related commits look scary (fix for another fix which doesn't fully fix it?):
revid: sergey.glukhov@oracle.com-20130619102408-0a0bf6sl5d7tsys Bug#16620047 INCORRECT QUERY RESULT (AFTER SERVER UPGRADE) Regression is introduced in Bug11760517. According to the comment for the Bug11760517 fix, there is an additional cleanup in field.h(Field::is_null() function): Field::is_null() now checks table->null_row before checking the NULL bits.
Order of NULLity checking is changed, firts it checks table->null_row and then null_ptr[row_offset]. In our case when we evaluate first result record, evaluate_null_complemented_join_record() function is called for outer table and it sets table->null_row to true. Then second result record field uses table->null_row to check if the field is null and result is NULL instead of real value.
The fix is partial reversion of the field.h change (return original order of NULL check). Unfortunately this is not sufficient. A lot of test fails in the main suite after reversion. It happens because of another additional cleanup introduced Bug11760517 fix(see the comment): @ sql/table.h mark_as_null_row() no longer sets the NULL bits of the buffer
Reversion of this change fixes all the problems except one, test case for Bug13541761 which is actually another regression of Bug11760517 fix. The fix of this problem is to not touch TABLE::null_flags during JOIN::cleanup.
BR Sergei -- Sergei Petrunia, Software Developer MariaDB | Skype: sergefp | Blog: http://s.petrunia.net/blog
participants (2)
-
Pavel Ivanov
-
Sergey Petrunia