Hi all,

On Mon, Nov 13, 2017 at 7:16 PM, Roberto Spadim <roberto@spadim.com.br> wrote:
2017-11-12 22:37 GMT-02:00 Federico Razzoli <federico_raz@yahoo.it>:
I just noticed this old MySQL bug: https://bugs.mysql.com/bug.php?id=10132
In the comments (2005 to 2014) everyone seems to agree that current behaviour should be treated as a bug. The main commenter now works for MariaDB.
That is right. And I recently filed a corresponding MariaDB bug, even copying the title: https://jira.mariadb.org/browse/MDEV-13542 ("Crashing on a corrupted page is unhelpful")
What is currently your opinion?
if it execute
1: highly redundant systems. These places will tend to want an immediate total failure because their strategy tends to be to replace or reclone a slave.
ok... it takes some time to recover InnoDB, but
2: less highly redundant or shared hosting. These places will tend to want to continue limited service to the maximum extent possible.
is more interesting: we don't need to stop the database, just stop the tablespace/table and return an error at the engine level
Right. I have noticed these two schools of thought, also among Linux kernel developers. Some want to kill the system ASAP, others want to fail gracefully. On a redundant system, you might even want to disable redo logging to speed up some operations. If the system crashes, you just start over.

Will this behaviour change at some point? If so, what will be the consequences for Galera?
no idea, but it's a critical issue
Yes, I hope that it will be possible to fix this at some point, but unfortunately I cannot say when. It involves extensive changes to the InnoDB code base, to properly propagate errors up the call stack. It is not an easy task. Maybe it is feasible to improve the reliability piece by piece. I am afraid that it can only be done in a development branch before it reaches beta or GA status.

At Oracle there were no resources allocated to this. The MySQL bug 10132 was closed for some time until I reopened it (in 2014, according to a comment timestamp). If I remember correctly, the motivation to close the bug was that MySQL 5.5 introduced the ability to mark an index corrupted (in CHECK TABLE only, and potentially with more devastating results: Bug#19584379 "Reporting corruption may corrupt the innodb data dictionary", fixed by me in October 2014). IIRC, CHECK TABLE could also crash InnoDB due to a corrupted page.

When it comes to Galera, I have understood that there are other ways to cause inconsistency between the nodes. One example ought to be enabling the auto-recalculation of persistent statistics and then updating the mysql.innodb_index_stats or mysql.innodb_table_stats tables from SQL.

With best regards,

Marko
-- 
Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation
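
P.S. To make the two schools of thought a bit more concrete, here is a rough toy sketch in Python of what "propagating errors up the call stack" could mean. This is not InnoDB code and does not reflect its real structure; `ToyEngine`, `PageCorrupted`, `TableUnavailable` and the `fail_fast` flag are all made-up names for illustration. The lowest layer only reports the corrupted page, and a higher layer decides whether to kill everything or to take just the affected table out of service.

```python
from dataclasses import dataclass, field


class PageCorrupted(Exception):
    """Hypothetical low-level error: a page failed its checksum."""

    def __init__(self, table, page_no):
        super().__init__(f"corrupted page {page_no} in table {table}")
        self.table = table
        self.page_no = page_no


class TableUnavailable(Exception):
    """Hypothetical engine-level error returned to the SQL layer."""


@dataclass
class ToyEngine:
    """Toy stand-in for a storage engine (an assumption, not InnoDB)."""

    # table name -> {page number -> (payload, checksum_ok)}
    tables: dict
    # True: school 1 (immediate total failure); False: school 2 (limited service)
    fail_fast: bool = False
    corrupted: set = field(default_factory=set)

    def read_page(self, table, page_no):
        payload, checksum_ok = self.tables[table][page_no]
        if not checksum_ok:
            # The lowest layer reports the problem; it does not choose the policy.
            raise PageCorrupted(table, page_no)
        return payload

    def select(self, table):
        if table in self.corrupted:
            # Already marked corrupted: refuse without touching the pages again.
            raise TableUnavailable(table)
        try:
            return [self.read_page(table, n) for n in sorted(self.tables[table])]
        except PageCorrupted as err:
            if self.fail_fast:
                # School 1: stop the whole process and let a replica take over.
                raise SystemExit(f"fatal: {err}")
            # School 2: mark only this table and keep serving the others.
            self.corrupted.add(table)
            raise TableUnavailable(table) from err


def demo_tables():
    return {
        "t1": {0: ("a", True), 1: ("b", True)},
        "t2": {0: ("x", True), 1: ("y", False)},  # page 1 is "corrupted"
    }
```

With `fail_fast=False`, a `select("t2")` raises `TableUnavailable` while `select("t1")` keeps working; with `fail_fast=True`, the same read ends the process. The hard part in a real code base is the `try`/`except` plumbing: every caller between the page read and the SQL layer has to pass the error along instead of asserting.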