[MariaDB developers] Re: ea9a05f0821: This commit is about improving repair of tables

1 Mar 2024

Hi!

<cut>
'
...
So, I'd suggest the following logic instead:
* if USE_FRM - do USE_FRM stuff (or an warning)
* if marked corrupted or EXTENDED or doesn't need upgrade - run repair
        * if repair isn't supported = ALTER IGNORE unless needs upgrade
* if needs upgrade - run upgrade = ALTER IGNORE COPY
* if fails because of corruption - go to step 2 if repair wasn't run yet
I agree that a simplified repair logic is better than what we currently have.
However, I would prefer the following order if things for REPAIR

 if USE_FRM - do USE_FRM stuff (or an warning)
* if needs upgrade - run upgrade = ALTER IGNORE COPY. If fails run
'repair' if engine supports it.
 * if marked corrupted or FORCE or doesn't need upgrade - run repair
    * if repair isn't supported = ALTER IGNORE unless needs upgrade

There are a few reasons for this:
- Easier to document and understand
- 'need upgrade' only happens during mariadb-upgrade, which is the
first thing as user does when upgrading.
  In this case the table should not be crashed (if it is crashed, it
should be repaired on the old version).
-  With most engines, we cannot know if a table is corrupted
(especially during upgrade) , so it hard to both code and document
under which conditions
   the repair would be run.
- REPAIR EXTENDED is a bad way to force REPAIR as EXTENDED has well
documented meaning. Most users will not want to run REPAIR EXTENDED
  as it can take 100x longer than just REPAIR (as EXTENDED use less
memory (good) and does not use sorting to create indexes (safer but
bad performance).

The disadvantage in my suggestion is that if the table is both needing
upgrade and corrupted then ALTER TABLE FORCE may not be able to
recover as
many rows as REPAIR followed by ALTER TABLE FORCE.

It is a bit hard to now change the upgrade first tries ALTER TABLE
IGNORE and the repair. I would like to have this been part of another
MDEV
to get the current already improved ALTER/REPAIR code into 11.5
...
here USE_FRM is handled as before (except that it adds a warning if
USE_FRM isn't supported by the engine).
I agree with the warning.
...
After that the server runs
engine's repair if the table is marked corrupted or if EXTENDED was used
(so no need for a new FORCE syntax).
See above why EXTENDED is a bad option for this.
...
There is no error "engine does not
support REPAIR", the server falls back to ALTER automatically in that
case.
Agree. This is also what Elena has been suggesting and this is what I
had planned to do next on the repair task.
...
If the data need to be upgraded and ALTER fails, the server tries
repair and then ALTER again - just like MyISAM and Aria try different
repair methods.
Agree, but this is a bit tricky to get right. I suggest we leave this
part to another MDEV to be done later.
...
Also, note that ALTER should always use IGNORE to delete
duplicates in unique after collation of hash changes - that's what
repair in MyISAM/Aria does.
Good point. We had not thought about this one. For REPAIR converted to
AALER it is ok to run "ALTER TABLE FORCE IGNORE"
...
...
This commit is about improving repair of tables
now as there is an MDEV, it should be here in the subject line
It is already there in bb-11.4-timestamp

<cut>
...
...
Other things:
    - FORCE argument added to REPAIR to allow one to first run internal
      repair to fix damaged blocks and then follow it with ALTER TABLE.
    - REPAIR will not update frm_version if ha_check_for_upgrade() finds
      that table is still incompatible with current version. In this case the
      REPAIR will end with an error.
doesn't look like "the REPAIR will end with an error"
It should.  It is handled in the code:

    /*
      Give a warning if REPAIR TABLE was used but table still needs an
      ALTER TABLE. This can only happen for old type tables where REPAIR
      was using FORCE to recover old data.
    */
    if (operator_func == &handler::ha_repair && ! recreate_used &&
        (require_data_conversion || require_alter_table))
    {

There is even a test for this:
ctype_utf8_def_upgrade.result:

REPAIR TABLE t1 USE_FRM;
Table  Op      Msg_type        Msg_text
-test.t1        repair  status  OK
+test.t1        repair  note    Table data recovered
+t1     repair  error   Table rebuild required. Please do "ALTER TABLE `test.t1`
FORCE" or dump/reload to fix it!
+test.t1        repair  status  Operation failed

<cut>
...
...

diff --git a/include/my_handler_errors.h b/include/my_handler_errors.h
index df414888907..06dde5b3f74 100644
<cut>
...
...
+  "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You can try REPAIR TABLE ... USE_FRM possible followed by ALTER TABLE ... FORCE or dump and restore the table to fix this",
1. typo. "possibly"
2. please, don't recommend USE_FRM, it's a last resort lossy method,
   designed only for the case when MYI file is lost or corrupted.
This error ONLY is given in the case where there is a mismatch between
the .frm and the table definition.
There is NO other way to recover any data without using USE_FRM or
dump & restore.
The USE_FRM is designed to work for any engine that has data and index
in different files and the data file format
is compatible with MariaDB. In practice it is as safe or error prone
as dump and restore.
...
3. REPAIR USE_FRM is likey broken for CSV, but I couldn't test it,
   because REPAIR crashes on CSV tables with or without USE_FRM.
csv.test includes test for check and repair table. No crashes for me.
USE_FRM did not work, but that was a bug in CSV where it listed the
file extensions in the wrong order.
I fixed this and now USE_FRM also works for CSV.

<cut>
...
...
--- a/mysql-test/main/ctype_utf8_def_upgrade.result
+++ b/mysql-test/main/ctype_utf8_def_upgrade.result
@@ -10,24 +10,39 @@ SELECT @@character_set_database;
 latin1
 SET @@character_set_database="latin1";
 SELECT COUNT(*) FROM t1;
-ERROR HY000: Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You may have retry " from storage engine MyISAM
+ERROR HY000: Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You can try REPAIR TABLE ... USE_FRM possible followed by ALTER TABLE ... FORCE or dump and restore the table to fix this" from storage engine MyISAM
+ALTER TABLE t1 FORCE;
+ERROR HY000: Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You can try REPAIR TABLE ... USE_FRM possible followed by ALTER TABLE ... FORCE or dump and restore the table to fix this" from storage engine MyISAM
 CHECK TABLE t1;
 Table  Op  Msg_type  Msg_text
-test.t1  check Error Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You may have retry " from storage engine MyISAM
+test.t1  check Error Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You can try REPAIR TABLE ... USE_FRM possible followed by ALTER TABLE ... FORCE or dump and restore the table to fix this" from storage engine MyISAM
 test.t1  check error Corrupt
 REPAIR TABLE t1;
 Table  Op  Msg_type  Msg_text
-test.t1  repair  Error Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You may have retry " from storage engine MyISAM
+test.t1  repair  Error Got error 190 "Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You can try REPAIR TABLE ... USE_FRM possible followed by ALTER TABLE ... FORCE or dump and restore the table to fix this" from storage engine MyISAM
+test.t1  repair  error Corrupt
+REPAIR TABLE t1 FORCE;
+Table  Op  Msg_type  Msg_text
+test.t1  repair  Warning Can't open table
 test.t1  repair  error Corrupt
 REPAIR TABLE t1 USE_FRM;
 Table  Op  Msg_type  Msg_text
-test.t1  repair  status  OK
+test.t1  repair  note  Table data recovered
+t1 repair  error Table rebuild required. Please do "ALTER TABLE `test.t1` FORCE" or dump/reload to fix it!
better to avoid exclamation marks in the error messages!
This is an old error. Better not change it.  (And in this specific
case I think that ! is a good thing to use)
Note that this error is the one that you thought we didn't have.

<cut>
...
...
+++ b/sql/handler.h
@@ -2249,6 +2267,8 @@ struct Table_scope_and_contents_source_pod_st // For trivial members
   enum_stats_auto_recalc stats_auto_recalc;
   bool varchar;                         ///< 1 if table has a VARCHAR
   bool sequence;                        // If SEQUENCE=1 was used
+  /* True if we are using OPTIMIZE TABLE, REPAIR TABLE or ALTER TABLE FORCE */
it'd be good to clarify here
/* this is used to suppress certain warnings that are only used for CREATE/ALTER TABLE */
The above is not what the flag does.
Before this flag InnoDB could try, 'invisible for the user', change
the data storage format in ALTER TABLE which could lead to:
- ALTER TABLE FAILING
- Loose of InnoDB symlinks
- Change of file formats.

The flag is there to tell InnoDB that it is not allowed to do that!

I updated the comment to:

    True if we are using OPTIMIZE TABLE, REPAIR TABLE or ALTER TABLE FORCE
    in which case the 'new' table should have identical storage layout
    as the original.
...
...
+  bool recreate_identical_table;
...
...
+++ b/sql/sql_admin.cc
@@ -48,7 +48,8 @@ const LEX_CSTRING msg_optimize= { STRING_WITH_LEN("optimize") };
 /* Prepare, run and cleanup for mysql_recreate_table() */
static bool admin_recreate_table(THD *thd, TABLE_LIST *table_list,
-                                 Recreate_info *recreate_info)
+                                 Recreate_info *recreate_info,
+                                 bool partition_admin, bool table_copy)
you don't need to pass partition_admin as an argument, if it's
always thd->lex->alter_info.partition_flags & ALTER_PARTITION_ADMIN
I did not want to have this and the calling function dependent on LEX.
I also added the argument to make it clear that recreate needs and will use
the flag (and no other flags in thd->lex->alter_info.partition_flags).

I could remove it if you insist, but I think it makes the function call
a bit clear.
...
...
{
   bool result_code;
   DBUG_ENTER("admin_recreate_table");
@@ -859,20 +867,29 @@ static bool mysql_admin_table(THD* thd, TABLE_LIST* tables,
       /* purecov: end */
     }
-    if (operator_func == &handler::ha_repair &&
-        !(check_opt->sql_flags & TT_USEFRM))
+    if (operator_func == &handler::ha_repair)
     {
       handler *file= table->table->file;
       int check_old_types=   file->check_old_types();
       int check_for_upgrade= file->ha_check_for_upgrade(check_opt);
why check_old_types is not part of ha_check_for_upgrade?
they're always used together, as far as I can see.
Yes, they are always used together.
Old code, didn't want to do too many changes in this patch.
In any case, I have now moved call to check_old_types()  to
ha_check_for_upgrade() and
removed it from all other places.


<cut>
...
...
@@ -1229,7 +1277,10 @@ static bool mysql_admin_table(THD* thd, TABLE_LIST* tables,
                  *save_next_global= table->next_global;
       table->next_local= table->next_global= 0;
-      result_code= admin_recreate_table(thd, table, &recreate_info);
+      result_code= admin_recreate_table(thd, table, &recreate_info,
+                                        (lex->alter_info.partition_flags &
+                                         ALTER_PARTITION_ADMIN), 0);
+      recreate_used= 1;
that's a bit too late, you check recreate_used earlier
The flag is to tell future code that 'admin_recreate_table()' has been
called (see one line above).
We checked earlier if an earlier admin_recreate_table() had been
called (it that would have happened, we would never have come to this
part of the code).
We come to this code when we tried REPAIR and got the error TRY_ALTER,
in which we do recreate and set the flag.
...
...
@@ -1641,7 +1696,8 @@ bool Sql_cmd_repair_table::execute(THD *thd)
   WSREP_TO_ISOLATION_BEGIN_WRTCHK(NULL, NULL, first_table);
   res= mysql_admin_table(thd, first_table, &m_lex->check_opt, &msg_repair,
                          TL_WRITE, 1,
-                         MY_TEST(m_lex->check_opt.sql_flags & TT_USEFRM),
+                         MY_TEST(m_lex->check_opt.sql_flags &
+                                 (TT_USEFRM | TT_FORCE)),
better rename repair_table_use_frm argument of mysql_admin_table
and open_only_one_table, because it's not only TT_USEFRM anymore.
a good name could be no_errors_from_open
Done.
...
...
--- a/sql/sql_table.cc
+++ b/sql/sql_table.cc
@@ -10868,19 +10869,56 @@ do_continue:;
   }
 #endif /* WITH_WSREP */
+  /*
+    ALTER TABLE ... ENGINE to the same engine is a common way to
+    request table rebuild. Set ALTER_RECREATE flag to force table
+    rebuild.
+  */
+  if (new_db_type == old_db_type)
+  {
+    if (create_info->used_fields & HA_CREATE_USED_ENGINE)
+      alter_info->flags|= ALTER_RECREATE;
+
+    /*
+      Check if we are using ALTER TABLE FORCE without any other options
+      (except ENGINE == current_engine).
+      In this case we will try to recreate an identical to the original.
+      This code is also used with REPAIR and OPTIMIZE.
+    */
+    if (alter_info->flags == ALTER_RECREATE &&
+        ((create_info->used_fields & ~HA_CREATE_USED_ENGINE) == 0))
+      create_info->recreate_identical_table= 1;
+  }
+
   /*
     We can use only copy algorithm if one of the following is true:
+    - If the table is from an old MariaDB version and requires data
+      modifications. In this case we ignore --alter-algorithm as
eh, no, you cannot ignore it, if the user specified incompatible set
of options, ALTER TABLE has to fail with an error, saying
ALGORITHM=INPLACE (for example) is impossible here. The server cannot
just silently ignore an explicit user command.
Yes, we can and what we must do!
ALTER TABLE FORCE has a very special, documented meaning (it should
always be able to recreate the table).
If it would honor the alter_algorithm variable, you would get:
- Wrong data in table on master or slave.
- ALTER TABLE FORCE stop with error (if alter_algorithm is not copy).

In case of ALTER TABLE FORCE,  the user variable 'alter-algorithm'
MUST be ignored for thing to work smoothly
The reason for this is that mariadb-upgrade uses (and has always used)
ALTER TABLE FORCE for upgrading tables
and assume this will NEVER give an error.
The flag also causes slaves to possible stop or have bad data if ALTER
TABLE FORCE is used together with any
value for --alter-algorithm other than copy.

If the user would use --alter-algorithm=inplace, in a config file,
upgrades would not work.

I rather remove the flag --alter-algorithm than the above behavior.
...
...
@@ -12451,10 +12482,14 @@ bool mysql_recreate_table(THD *thd, TABLE_LIST *table_list,
   create_info.init();
   create_info.row_type=ROW_TYPE_NOT_USED;
   create_info.alter_info= &alter_info;
+  create_info.recreate_identical_table= 1;
   /* Force alter table to recreate table */
   alter_info.flags= (ALTER_CHANGE_COLUMN | ALTER_RECREATE);
+  alter_info.partition_flags= thd->lex->alter_info.partition_flags;
+  if (partition_admin)
+    alter_info.partition_flags= ALTER_PARTITION_ADMIN;
when is this needed? I couldn't find a single code path where
alter_info.partition_flags is not ALTER_PARTITION_ADMIN at this point.
Neither code analysys nor mtr have found anything. You can safely remove
this if() and add
alter_info is a local variable and does not have ALTER_PARTION_ADMIN SET.
...
DBUG_ASSERT(alter_info.partition_flags & ALTER_PARTITION_ADMIN);
This would always be false and abort.
...
...
+++ b/sql/table.cc
@@ -1846,6 +1846,8 @@ int TABLE_SHARE::init_from_binary_frm_image(THD *thd, bool write,
   */
   if (share->frm_version == FRM_VER_TRUE_VARCHAR -1 && frm_image[33] == 5)
     share->frm_version= FRM_VER_TRUE_VARCHAR;
+  if (share->frm_version <= FRM_VER_TRUE_VARCHAR)
+    share->keep_original_mysql_version= 1;
this is wrong. see prepare_frm_header(), frm_version is
*always* FRM_VER_TRUE_VARCHAR if there are no vcols.
...
you've basically just disabled update_frm_version() for almost everyone.
Good catch. I changed it to:
  if (share->frm_version < FRM_VER_TRUE_VARCHAR)

This makes me even more worried about this code:
  new_frm_ver= (frm_image[2] - FRM_VER);
  field_pack_length= new_frm_ver < 2 ? 11 : 17;

FRM_VER_TRUE_VARCHAR= 10
FRM_VER=6

This means that in FRM_VERSION 8 we changed something that made things
incompatible.
I don't understand why we are not using frm_image[2] <=
FRM_CHANGE_OF_PACK_LENGTH.
which would be much more clear than the above code.
To make things even more unclear, we are using new_frm_ver in the rest
of the code.
like (new_frm_ver >= 3)
It is not clear what this mean.
Would be much better to compared the stored frm_version variable
directly with the FRM_VER variables.

Regards,
Monty

    

[MariaDB developers] Re: ea9a05f0821: This commit is about improving repair of tables

Michael Widenius