23 Sep
2013
23 Sep
'13
11:42 a.m.
Hi, Full unedited log using --valgrind and --mysqld=--debug=+d at location: https://www.dropbox.com/s/dcufp7l1cch8dfs/mysql.err.gz In below lines I have: static int binlog_commit(handlerton *hton, THD *thd, bool all) { int error= 0; DBUG_ENTER("binlog_commit"); binlog_cache_mngr *const cache_mngr= (binlog_cache_mngr*) thd_get_ha_data(thd, binlog_hton); #ifdef WITH_WSREP if (!cache_mngr) DBUG_RETURN(0); #endif /* WITH_WSREP */ WSREP_ERROR("::jan_BINLOG_COMMIT:: TRX: Pending %p pos %llu, pos_in_file %llu current_pos %llu request_pos %llu write_pos %llu", cache_mngr->trx_cache.pending(), cache_mngr->trx_cache.cache_log.pos_in_file, cache_mngr->trx_cache.cache_log.current_pos, cache_mngr->trx_cache.cache_log.request_pos, cache_mngr->trx_cache.cache_log.write_pos); WSREP_ERROR("::jan_COMMIT:: STMT: STMT: Pending %p pos %llu, pos_in_file %llu current_pos %llu request_pos %llu write_pos %llu", cache_mngr->stmt_cache.pending(), cache_mngr->stmt_cache.cache_log.pos_in_file, cache_mngr->stmt_cache.cache_log.current_pos, cache_mngr->stmt_cache.cache_log.request_pos, cache_mngr->stmt_cache.cache_log.write_pos); And that macro is // MySQL logging functions don't seem to understand long long length modifer. // This is a workaround. It also prefixes all messages with "WSREP" #define WSREP_LOG(fun, ...) \ { \ char msg[1024] = {'\0'}; \ snprintf(msg, sizeof(msg) - 1, ## __VA_ARGS__); \ fun("WSREP: %s", msg); \ } #define WSREP_DEBUG(...) \ if (wsrep_debug) WSREP_LOG(sql_print_information, ##__VA_ARGS__) #define WSREP_INFO(...) WSREP_LOG(sql_print_information, ##__VA_ARGS__) #define WSREP_WARN(...) WSREP_LOG(sql_print_warning, ##__VA_ARGS__) #define WSREP_ERROR(...) WSREP_LOG(sql_print_error, ##__VA_ARGS__) Thus , it seems that trx_cache and stmt_cache are not initialized, where that should happen ? > Jan Lindström <jplindst@mariadb.org> writes: > >> Correct log file attached. >> ==00:00:02:33.708 25289== Use of uninitialised value of size 8 >> ==00:00:02:33.708 25289== at 0x5EF865B: _itoa_word (_itoa.c:179) >> ==00:00:02:33.708 25289== by 0x5EFCB91: vfprintf (vfprintf.c:1654) >> ==00:00:02:33.708 25289== by 0x5F22654: vsnprintf (vsnprintf.c:119) >> ==00:00:02:33.708 25289== by 0x5F03141: snprintf (snprintf.c:34) >> ==00:00:02:33.708 25289== by 0x936453: binlog_commit(handlerton*, THD*, bool) (log.cc:2054) >> ==00:00:02:33.708 25289== by 0x86B3DF: commit_one_phase_2(THD*, bool, THD_TRANS*, bool) (handler.cc:1516) >> ==00:00:02:33.708 25289== by 0x86B13B: ha_commit_trans(THD*, bool) (handler.cc:1436) >> ==00:00:02:33.709 25289== by 0x7A0DD9: trans_commit_stmt(THD*) (transaction.cc:363) >> ==00:00:02:33.709 25289== by 0x673DEB: mysql_execute_command(THD*) (sql_parse.cc:5373) >> ==00:00:02:33.709 25289== by 0x677D27: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6820) >> ==00:00:02:33.709 25289== by 0x6774A8: wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6651) >> ==00:00:02:33.709 25289== by 0x6688A3: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1454) >> ==00:00:02:33.709 25289== >> ==00:00:02:33.709 25289== Conditional jump or move depends on uninitialised value(s) >> ==00:00:02:33.709 25289== at 0x5EF8665: _itoa_word (_itoa.c:179) >> ==00:00:02:33.709 25289== by 0x5EFCB91: vfprintf (vfprintf.c:1654) >> ==00:00:02:33.709 25289== by 0x5F22654: vsnprintf (vsnprintf.c:119) >> ==00:00:02:33.709 25289== by 0x5F03141: snprintf (snprintf.c:34) >> ==00:00:02:33.709 25289== by 0x936453: binlog_commit(handlerton*, THD*, bool) (log.cc:2054) >> ==00:00:02:33.709 25289== by 0x86B3DF: commit_one_phase_2(THD*, bool, THD_TRANS*, bool) (handler.cc:1516) >> ==00:00:02:33.709 25289== by 0x86B13B: ha_commit_trans(THD*, bool) (handler.cc:1436) >> ==00:00:02:33.709 25289== by 0x7A0DD9: trans_commit_stmt(THD*) (transaction.cc:363) >> ==00:00:02:33.709 25289== by 0x673DEB: mysql_execute_command(THD*) (sql_parse.cc:5373) >> ==00:00:02:33.709 25289== by 0x677D27: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6820) >> ==00:00:02:33.709 25289== by 0x6774A8: wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6651) >> ==00:00:02:33.709 25289== by 0x6688A3: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1454) >> ==00:00:02:33.709 25289== >> 130920 13:53:04 [ERROR] WSREP: ::jan_BINLOG_COMMIT:: TRX: Pending (nil) pos 0, pos_in_file 892097568 current_pos 892132560 request_pos 892132664 write_pos 303074896 > So this is the explanation for the strange numbers in your printouts, like > 892097568. Those values are uninitialised memory (probably from uninitialised > stack variables?). > > Since the assertion you get is exactly about the values of those IO_CACHE > values being wrong, this looks like something to investigate. > > I am however curious about why there is only a single Valgrind error about > this? As there are many of those unexpected numbers in your output. > > However, most of the Valgrind errors seem to be in Galera code. It is hard to > say for me if they are related. Seppo, can you look into that from your side? > Normally, we strive _very_ hard to make MariaDB code clean in Valgrind... > >> It did produce the output, not yet found the reason, some of the output just >> before corruption >> >> MYSQL_BIN_LOG::flush_and_set_pending_rows_event(event): enter: event: >> 0x7f2a200bc800 >> MYSQL_BIN_LOG::flush_and_set_pending_rows_event(event): info: cache_mngr-> >> pending(): 0x0 >> Rows_log_event::do_add_row_data: enter: row_data: 0x7f2a200cb248 length: 10 >> Rows_log_event::do_add_row_data: row_data: Memory: 0x7f2a200cb248 Bytes: (10) >> F8 01 00 00 00 01 74 02 77 65 >> my_realloc: my: ptr: 0x0 size: 1024 my_flags: 80 >> my_malloc: my: size: 1024 my_flags: 80 >> my_malloc: exit: ptr: 0x7f2a200895e0 > I need to see the *complete* log to be able to help. Yes, I know it can be > hundreds of megabyte maybe, but it should compress well. Upload it somewhere > I can access it and I can take a look. > > Using this log file, we can deduce exactly which paths the code was taking as > it got to that assert. I'm wondering if the ROLLBACK happens early and ends up > executing some binlog code at a point where transaction start has not properly > initialised the necessary data structures. So we need the dbug output from the > entire execution since transaction start. > > (If you can get the error log/Valgrind output in the same log file as the > dbug, that would make things much easier. But if not, I can still try to > decipher it...) > > - Kristian. R: -- Jan Lindström Principal Engineer MariaDB | MaxScale | skype: jan_p_lindstrom www.skysql.com <http://www.skysql.com/> Twitter <http://twitter.com/skysql> Blog <http://www.skysql.com/blog/> Facebook <http://www.facebook.com/skysql> LinkedIn <http://www.linkedin.com/company/1214250> Google+ <https://plus.google.com/117544963211695643458/posts>