[Maria-discuss] query result inconsistency between MariaDB 10.0.x and Oracle MySQL 5.1.x
Greetings, I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo). A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)? The query is: SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20; The MariaDB result looks like this: +-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+ And the Oracle MySQL result looks like this: +-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+ My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?). Any assistance appreciated. Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
maybe you should use something like MIN() MAX(), since you are using a GROUP BY i don't know if this is well documented but i think it's 2014-05-23 15:18 GMT-03:00 Charles Cazabon < charlesc-web-register-launchpad.net@pyropus.ca>:
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html here, but must check if mariadb have something like it 2014-05-23 15:21 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
maybe you should use something like MIN() MAX(), since you are using a GROUP BY i don't know if this is well documented but i think it's
2014-05-23 15:18 GMT-03:00 Charles Cazabon < charlesc-web-register-launchpad.net@pyropus.ca>:
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
orry many mails " The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. " 2014-05-23 15:32 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html
here, but must check if mariadb have something like it
2014-05-23 15:21 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
maybe you should use something like MIN() MAX(), since you are using a
GROUP BY i don't know if this is well documented but i think it's
2014-05-23 15:18 GMT-03:00 Charles Cazabon < charlesc-web-register-launchpad.net@pyropus.ca>:
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
from mariadb https://mariadb.com/kb/en/select/ You can use any of the grouping functions in your select expression. Their values will be calculated based on all the rows that have been grouped together for each result row. *If you select a non-grouped column or a value computed from a non-grouped column, it is undefined which row the returned value is taken from. *This is not permitted if the ONLY_FULL_GROUP_BY SQL_MODE <https://mariadb.com/kb/en/sql_mode/> is used. 2014-05-23 15:32 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
orry many mails
" The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. "
2014-05-23 15:32 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html
here, but must check if mariadb have something like it
2014-05-23 15:21 GMT-03:00 Roberto Spadim <roberto@spadim.com.br>:
maybe you should use something like MIN() MAX(), since you are using a
GROUP BY i don't know if this is well documented but i think it's
2014-05-23 15:18 GMT-03:00 Charles Cazabon < charlesc-web-register-launchpad.net@pyropus.ca>:
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
This is actually a bug https://mariadb.atlassian.net/browse/MDEV-5719. Pavel On Fri, May 23, 2014 at 11:18 AM, Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> wrote:
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hi!
From MySQL documentation: http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html
"MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. (...) However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate." This is the reason why I think that the ONLY_FULL_GROUP_BY sql_mode flag should always be on. In your specific case, I note that album_id is not in GROUP BY and is not passed to any aggregating function (such as MAX()). Regards Federico -------------------------------------------- El vie, 23/5/14, Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> escribió: Asunto: [Maria-discuss] query result inconsistency between MariaDB 10.0.x and Oracle MySQL 5.1.x Para: "MariaDB discuss" <maria-discuss@lists.launchpad.net> Fecha: viernes, 23 de mayo, 2014 20:18 Greetings, I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo). A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)? The query is: SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20; The MariaDB result looks like this: +-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+ And the Oracle MySQL result looks like this: +-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+ My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?). Any assistance appreciated. Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------ _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Roberto, Federico, While I agree with your comments regarding use of ONLY_FULL_GROUP_BY, the specific case has primary keys defined on (id), on both tables. So, the GROUP BY t1.id has (or should have) no effect as every group will have exactly one row and the results should be the deterministic. I think Pavel is correct and this is a bug. Pantelis On Fri, May 23, 2014 at 8:16 PM, Federico Razzoli <federico_raz@yahoo.it>wrote:
Hi! From MySQL documentation: http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html
"MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. (...) However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate."
This is the reason why I think that the ONLY_FULL_GROUP_BY sql_mode flag should always be on.
In your specific case, I note that album_id is not in GROUP BY and is not passed to any aggregating function (such as MAX()).
Regards Federico
-------------------------------------------- El vie, 23/5/14, Charles Cazabon < charlesc-web-register-launchpad.net@pyropus.ca> escribió:
Asunto: [Maria-discuss] query result inconsistency between MariaDB 10.0.x and Oracle MySQL 5.1.x Para: "MariaDB discuss" <maria-discuss@lists.launchpad.net> Fecha: viernes, 23 de mayo, 2014 20:18
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
i think it's a bug since sergei commented that some test cases don't pass i don't think it's a bug, since it's well documented that this kind of column used with group by could not be deterministic but as a work around try to use MIN() MAX() and check if the query 'become' deterministic and solve the problem, while developers check what should/shouldn't be done, solve a bug, or not solve 2014-05-23 16:50 GMT-03:00 Pantelis Theodosiou <ypercube@gmail.com>:
Roberto, Federico,
While I agree with your comments regarding use of ONLY_FULL_GROUP_BY, the specific case has primary keys defined on (id), on both tables.
So, the GROUP BY t1.id has (or should have) no effect as every group will have exactly one row and the results should be the deterministic. I think Pavel is correct and this is a bug.
Pantelis
On Fri, May 23, 2014 at 8:16 PM, Federico Razzoli <federico_raz@yahoo.it>wrote:
Hi! From MySQL documentation: http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html
"MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. (...) However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate."
This is the reason why I think that the ONLY_FULL_GROUP_BY sql_mode flag should always be on.
In your specific case, I note that album_id is not in GROUP BY and is not passed to any aggregating function (such as MAX()).
Regards Federico
-------------------------------------------- El vie, 23/5/14, Charles Cazabon < charlesc-web-register-launchpad.net@pyropus.ca> escribió:
Asunto: [Maria-discuss] query result inconsistency between MariaDB 10.0.x and Oracle MySQL 5.1.x Para: "MariaDB discuss" <maria-discuss@lists.launchpad.net> Fecha: viernes, 23 de mayo, 2014 20:18
Greetings,
I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo).
A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)?
The query is:
SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20;
The MariaDB result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
And the Oracle MySQL result looks like this:
+-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+
My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY
My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql
We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?).
Any assistance appreciated.
Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
-- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
Pantelis, It is a left join and the ON doesn't include t1.id, so without GROUP BY, the id values are repeated. In fact, I get completely different results: MariaDB [test]> SELECT t1.id, t2.album_id -> FROM t1 -> LEFT OUTER JOIN t2 -> ON t1.data_id = t2.id -> AND t1.event_type IN (1002, 1001, 1000) -> WHERE -> t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) -> GROUP BY t1.id LIMIT 0, 20; +----+----------+ | id | album_id | +----+----------+ | 1 | NULL | | 2 | NULL | | 3 | 1 | | 4 | 1 | | 5 | NULL | | 6 | NULL | | 7 | NULL | | 8 | 2 | | 9 | 2 | | 10 | NULL | | 11 | NULL | | 12 | NULL | | 13 | 3 | | 14 | NULL | | 15 | NULL | | 16 | NULL | | 17 | 4 | | 18 | NULL | | 19 | NULL | | 20 | NULL | +----+----------+ 20 rows in set (0.00 sec) I'm not saying it is not a bug - I don't have an opinion about this. I'm just suggesting to drop the GROUP BY from this query. Regards Federico -------------------------------------------- El vie, 23/5/14, Pantelis Theodosiou <ypercube@gmail.com> escribió: Asunto: Re: [Maria-discuss] query result inconsistency between MariaDB 10.0.x and Oracle MySQL 5.1.x Para: "MariaDB discuss" <maria-discuss@lists.launchpad.net> Fecha: viernes, 23 de mayo, 2014 21:50 Roberto, Federico, While I agree with your comments regarding use of ONLY_FULL_GROUP_BY, the specific case has primary keys defined on (id), on both tables. So, the GROUP BY t1.id has (or should have) no effect as every group will have exactly one row and the results should be the deterministic. I think Pavel is correct and this is a bug. Pantelis On Fri, May 23, 2014 at 8:16 PM, Federico Razzoli <federico_raz@yahoo.it> wrote: Hi! From MySQL documentation: http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html "MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. (...) However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate." This is the reason why I think that the ONLY_FULL_GROUP_BY sql_mode flag should always be on. In your specific case, I note that album_id is not in GROUP BY and is not passed to any aggregating function (such as MAX()). Regards Federico -------------------------------------------- El vie, 23/5/14, Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> escribió: Asunto: [Maria-discuss] query result inconsistency between MariaDB 10.0.x and Oracle MySQL 5.1.x Para: "MariaDB discuss" <maria-discuss@lists.launchpad.net> Fecha: viernes, 23 de mayo, 2014 20:18 Greetings, I recently upgraded the db server behind an application from MySQL 5.1.73 (as shipped in Ubuntu 10.04 "Lucid") to MariaDB 10.0.11 (from the MariaDB repo). A colleague of mine found an inconsistency between the results produced by the two servers for a given query. What we don't know is, is this a bug (I gather Maria is aiming at 100% compatibility), or is this somehow due to the query relying on unspecified behaviour (that the two db servers are therefore free to optimize differently)? The query is: SELECT t1.id, t2.album_id FROM t1 LEFT OUTER JOIN t2 ON t1.data_id = t2.id AND t1.event_type IN (1002, 1001, 1000) WHERE t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203) GROUP BY t1.id ORDER BY t1.id DESC LIMIT 0, 20; The MariaDB result looks like this: +-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | NULL | | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | NULL | | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | NULL | | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+ And the Oracle MySQL result looks like this: +-----+----------+ | id | album_id | +-----+----------+ | 623 | NULL | | 622 | NULL | | 621 | NULL | | 620 | NULL | | 619 | NULL | | 618 | NULL | | 617 | NULL | | 616 | 196 |<-- different | 615 | NULL | | 614 | NULL | | 613 | NULL | | 612 | 194 | | 611 | 194 |<-- different | 610 | NULL | | 609 | NULL | | 608 | 193 | | 607 | 193 |<-- different | 606 | NULL | | 605 | NULL | | 604 | NULL | +-----+----------+ My colleague pointed out that if you EXPLAIN the queries, you can see that the two databases are interpreting the query differently -- see the "Extra" column. I can't paste the explain output here without using very long lines, so I've pastebinned it: http://pastebin.com/n2sbH0kY My colleague has made the data from these tables available here: https://dl.dropboxusercontent.com/u/7755033/fatdrop/test_case_data.sql We've found workarounds for this, but we're really wondering if we've found a problem (either in MariaDB-MySQL consistency, or in the query, or ... ?). Any assistance appreciated. Charles -- ------------------------------------------------------------------ Charles Cazabon <charlesc-web-register-launchpad.net@pyropus.ca> Software, consulting, and services available at http://pyropus.ca/ ------------------------------------------------------------------ _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp -----Adjunto en línea a continuación----- _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
participants (5)
-
Charles Cazabon
-
Federico Razzoli
-
Pantelis Theodosiou
-
Pavel Ivanov
-
Roberto Spadim