[Maria-discuss] Unable to DROP table, Data Dict problem
Hi Folks, We run a very large infra on MariaDB/RHEL6.x/RHEV/NetApp; thousands of VMs, each running MariaDB and very busy. MariaDB is 5.5.32. We recently had some NFS problems which led to minor DB corruption. Luckily, the corruption was isolated to a table holding cache data, and in fact one that is empty in most of our instances. That's the good news. The bad news is that the data dict is really unhappy about that table. - Cannot drop it, we get a nasty stack trace at [1] - Stopping MariaDB, removing the table by hand, restarting MariaDB, attempting to re-create the table hits a data dict error [2]. On disk I can see it created an ibd file, but no frm file. Following the instructions at http://dev.mysql.com/doc/refman/5.0/en/innodb-troubleshooting-datadict.html does not help. I have straced the process, and it does not seem to even try to create the frm file, so it really looks like it's internal state (data dict), not errors from further down the stack. OS / kernel logs are clean. Is there any reasonable method to cure this "in place" with minimal disruption? (These are large DBs, dump restore is more than awkward.) Oh and careful with NFSv4 :-/ thank you! martin 1 - Stack trace when trying to drop table Version: '5.5.32-MariaDB-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server InnoDB: Error: tablespace id is 47073 in the data dictionary InnoDB: but in file ./moodle_sand/mdl_tag_correlation.ibd it is 48463! 150119 17:14:45 InnoDB: Assertion failure in thread 140045886629632 in file fil0fil.c line 778 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 150119 17:14:45 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see http://kb.askmonty.org/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 5.5.32-MariaDB-log key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=1 max_threads=187 thread_count=1 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 182265 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0xa92ee60 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f5ef9531d78 thread_stack 0x48000 ??:0(my_print_stacktrace)[0xa8631e] ??:0(handle_fatal_signal)[0x6cdd0b] ??:0(??)[0x30a7c0f500] ??:0(??)[0x30a78328a5] ??:0(??)[0x30a7834085] ??:0(fil_node_open_file)[0x8f3add] ??:0(fil_node_prepare_for_io)[0x8f3be5] ??:0(fil_space_get_flags)[0x8f7420] ??:0(fil_space_get_zip_size)[0x8f7529] ??:0(dict_drop_index_tree)[0x8d64e6] ??:0(row_upd_clust_step)[0x86d84d] ??:0(row_upd_step)[0x86e22e] ??:0(que_run_threads)[0x9577a8] ??:0(que_eval_sql)[0x957ea6] ??:0(row_drop_table_for_mysql)[0x85ae6a] ??:0(ha_innobase::delete_table(char const*))[0x8424e3] ??:0(ha_delete_table(THD*, handlerton*, char const*, char const*, char const*, bool))[0x6d5233] ??:0(mysql_rm_table_no_locks(THD*, TABLE_LIST*, bool, bool, bool, bool))[0x5ff928] ??:0(mysql_rm_table(THD*, TABLE_LIST*, char, char))[0x60001d] ??:0(mysql_execute_command(THD*))[0x58ad90] ??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x58e369] ??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x58f790] ??:0(do_handle_one_connection(THD*))[0x647cdf] ??:0(handle_one_connection)[0x647d7c] ??:0(??)[0x30a7c07851] ??:0(??)[0x30a78e811d] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f5db800afa8): is an invalid pointer Connection ID (thread ID): 2 Status: NOT_KILLED Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off 2 - Error when attempting to recreate table 150119 17:15:22 InnoDB: Operating system error number 2 in a file operation. InnoDB: The error means the system cannot find the path specified. InnoDB: If you are installing InnoDB, remember that you must create InnoDB: directories yourself, InnoDB does not create them. 150119 17:15:22 InnoDB: Error: trying to open a table, but could not InnoDB: open the tablespace file './moodle_sand/mdl_tag_correlation.ibd'! InnoDB: Have you moved InnoDB .ibd files around without using the InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE? InnoDB: It is also possible that this is a temporary table #sql..., InnoDB: and MySQL removed the .ibd file for this. InnoDB: Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html InnoDB: for how to resolve the issue. -- martin.langhoff@gmail.com - ask interesting questions - don't get distracted with shiny stuff - working code first ~ http://docs.moodle.org/en/User:Martin_Langhoff
More information. Two actions attempted: dropping the table, removing the table on disk and trying to recreate it. =DROP TABLE trace= Only the ibd file is in place. I stop mariadb, restore the frm file, start mariadb, try "drop table", I get MariaDB [moodle_sand]> drop table mdl_tag_correlation ; ERROR 2013 (HY000): Lost connection to MySQL server during query and strace of the "mysqld" process of my connection and attempt to drop looks like this. I note there's no open()/stat()/read()/write(), so it seems to be all about in-memory state... Process 10102 attached - interrupt to quit restart_syscall(<... resuming interrupted call ...>) = 1 fcntl(20, F_GETFL) = 0x2 (flags O_RDWR) fcntl(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0 accept(20, {sa_family=AF_FILE, NULL}, [2]) = 39 fcntl(20, F_SETFL, O_RDWR) = 0 getsockname(39, {sa_family=AF_FILE, path="/var/lib/mysql/mysql.sock"}, [28]) = 0 gettimeofday({1421768537, 624610}, NULL) = 0 fcntl(39, F_SETFL, O_RDONLY) = 0 fcntl(39, F_GETFL) = 0x2 (flags O_RDWR) setsockopt(39, SOL_SOCKET, SO_RCVTIMEO, "\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 setsockopt(39, SOL_SOCKET, SO_SNDTIMEO, "<\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 fcntl(39, F_SETFL, O_RDWR|O_NONBLOCK) = 0 setsockopt(39, SOL_IP, IP_TOS, [8], 4) = -1 EOPNOTSUPP (Operation not supported) clock_gettime(CLOCK_MONOTONIC, {64620, 470572093}) = 0 clone(child_stack=0x7f01346bcf10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f01346bd9d0, tls=0x7f01346bd700, child_tidptr=0x7f01346bd9d0) = 10138 poll([{fd=19, events=POLLIN}, {fd=20, events=POLLIN}], 2, -1PANIC: attached pid 10102 exited with 1 <unfinished ... exit status 1> Log says: Version: '5.5.32-MariaDB-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server InnoDB: Error: tablespace id is 47073 in the data dictionary InnoDB: but in file ./moodle_sand/mdl_tag_correlation.ibd it is 48918! 150120 10:42:22 InnoDB: Assertion failure in thread 139643151177472 in file fil0fil.c line 778 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 150120 10:42:22 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see http://kb.askmonty.org/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 5.5.32-MariaDB-log key_buffer_size=134217728 read_buffer_size=131072 max_used_connections=1 max_threads=187 thread_count=1 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 182265 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0xc2163b0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f01346bcd78 thread_stack 0x48000 ??:0(my_print_stacktrace)[0xa8631e] ??:0(handle_fatal_signal)[0x6cdd0b] ??:0(??)[0x30a7c0f500] ??:0(??)[0x30a78328a5] ??:0(??)[0x30a7834085] ??:0(fil_node_open_file)[0x8f3add] ??:0(fil_node_prepare_for_io)[0x8f3be5] ??:0(fil_space_get_flags)[0x8f7420] ??:0(fil_space_get_zip_size)[0x8f7529] ??:0(dict_drop_index_tree)[0x8d64e6] ??:0(row_upd_clust_step)[0x86d84d] ??:0(row_upd_step)[0x86e22e] ??:0(que_run_threads)[0x9577a8] ??:0(que_eval_sql)[0x957ea6] ??:0(row_drop_table_for_mysql)[0x85ae6a] ??:0(ha_innobase::delete_table(char const*))[0x8424e3] ??:0(ha_delete_table(THD*, handlerton*, char const*, char const*, char const*, bool))[0x6d5233] ??:0(mysql_rm_table_no_locks(THD*, TABLE_LIST*, bool, bool, bool, bool))[0x5ff928] ??:0(mysql_rm_table(THD*, TABLE_LIST*, char, char))[0x60001d] ??:0(mysql_execute_command(THD*))[0x58ad90] ??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x58e369] ??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x58f790] ??:0(do_handle_one_connection(THD*))[0x647cdf] ??:0(handle_one_connection)[0x647d7c] ??:0(??)[0x30a7c07851] ??:0(??)[0x30a78e811d] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7efff400afa8): is an invalid pointer Connection ID (thread ID): 2 Status: NOT_KILLED Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off =Recreating table= I stop MariaDB, remove .frm and .ibd tables, start MariaDB, try to recreate the (trivial!) table. After this action, only an ibd file is found. # mysql moodle_sand < /tmp/mdl_tag_correlation.sql ERROR 1050 (42S01) at line 25: Table '`moodle_sand`.`mdl_tag_correlation`' already exists Trace of mysqld: Process 10154 attached - interrupt to quit restart_syscall(<... resuming interrupted call ...>) = 1 fcntl(20, F_GETFL) = 0x2 (flags O_RDWR) fcntl(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0 accept(20, {sa_family=AF_FILE, NULL}, [2]) = 39 fcntl(20, F_SETFL, O_RDWR) = 0 getsockname(39, {sa_family=AF_FILE, path="/var/lib/mysql/mysql.sock"}, [28]) = 0 gettimeofday({1421768652, 111509}, NULL) = 0 fcntl(39, F_SETFL, O_RDONLY) = 0 fcntl(39, F_GETFL) = 0x2 (flags O_RDWR) setsockopt(39, SOL_SOCKET, SO_RCVTIMEO, "\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 setsockopt(39, SOL_SOCKET, SO_SNDTIMEO, "<\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 fcntl(39, F_SETFL, O_RDWR|O_NONBLOCK) = 0 setsockopt(39, SOL_IP, IP_TOS, [8], 4) = -1 EOPNOTSUPP (Operation not supported) futex(0x117cf84, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x117cf80, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 futex(0x117c2a0, FUTEX_WAKE_PRIVATE, 1) = 1 poll([{fd=19, events=POLLIN}, {fd=20, events=POLLIN}], 2, -1 <unfinished ...> Process 10154 detached Log: InnoDB: a tablespace 48960 of name './moodle_sand/mdl_tag_correlation.ibd', InnoDB: but a tablespace 47073 of the same name InnoDB: already exists in the tablespace memory cache! InnoDB: We assume that InnoDB did a crash recovery, and you had InnoDB: an .ibd file for which the table did not exist in the InnoDB: InnoDB internal data dictionary in the ibdata files. InnoDB: We assume that you later removed the .ibd and .frm files, InnoDB: and are now trying to recreate the table. We now remove the InnoDB: conflicting tablespace object from the memory cache and try InnoDB: the init again. And yet: MariaDB [moodle_sand]> select * from mdl_tag_correlation ; ERROR 1146 (42S02): Table 'moodle_sand.mdl_tag_correlation' doesn't exist MariaDB [moodle_sand]> describe mdl_tag_correlation ; ERROR 1146 (42S02): Table 'moodle_sand.mdl_tag_correlation' doesn't exist cheers, m -- martin.langhoff@gmail.com - ask interesting questions - don't get distracted with shiny stuff - working code first ~ http://docs.moodle.org/en/User:Martin_Langhoff
participants (1)
-
Martin Langhoff