[Maria-developers] Rollback causes assertion with Galera 10.0

9 Sep 2013

      Hi,

Background: I merged 10.0 to 
lp:~maria-captains/maria/maria-10.0-galera/, merge result is pushed to 
the tree.

In testing I hit repeatable to following assertion

#0  0x00007f1ff987f037 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f1ff9882698 in __GI_abort () at abort.c:90
#2  0x00007f1ff9877e03 in __assert_fail_base (fmt=0x7f1ff99ce578 
"%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
assertion=assertion@entry=0x109b4ff "empty()", file=file@entry=0x109b4d8 
"/home/jan/mysql/galera-10.0/sql/log.cc", line=line@entry=294, 
function=function@entry=0x10a01e0 
<binlog_cache_data::reset()::__PRETTY_FUNCTION__> "void 
binlog_cache_data::reset()") at assert.c:92
#3  0x00007f1ff9877eb2 in __GI___assert_fail (assertion=0x109b4ff 
"empty()", file=0x109b4d8 "/home/jan/mysql/galera-10.0/sql/log.cc", 
line=294, function=0x10a01e0 
<binlog_cache_data::reset()::__PRETTY_FUNCTION__> "void 
binlog_cache_data::reset()") at assert.c:101
#4  0x00000000009497d4 in binlog_cache_data::reset (this=0x7f1f7403ad60) 
at /home/jan/mysql/galera-10.0/sql/log.cc:294
#5  0x0000000000949b6a in binlog_cache_mngr::reset (this=0x7f1f7403ab90, 
do_stmt=false, do_trx=true) at /home/jan/mysql/galera-10.0/sql/log.cc:462
#6  0x0000000000935a4c in binlog_truncate_trx_cache (thd=0x45b7d40, 
cache_mngr=0x7f1f7403ab90, all=true) at 
/home/jan/mysql/galera-10.0/sql/log.cc:1991
#7  0x0000000000936123 in binlog_rollback (hton=0x3891fb0, 
thd=0x45b7d40, all=true) at /home/jan/mysql/galera-10.0/sql/log.cc:2180
#8  0x000000000086b750 in ha_rollback_trans (thd=0x45b7d40, all=true) at 
/home/jan/mysql/galera-10.0/sql/handler.cc:1616
#9  0x00000000007a0c5f in trans_rollback (thd=0x45b7d40) at 
/home/jan/mysql/galera-10.0/sql/transaction.cc:321
#10 0x00000000006718b2 in mysql_execute_command (thd=0x45b7d40) at 
/home/jan/mysql/galera-10.0/sql/sql_parse.cc:4634
#11 0x0000000000677d52 in mysql_parse (thd=0x45b7d40, 
rawbuf=0x7f1f74005288 "ROLLBACK", length=8, parser_state=0x7f1fe83bb610) 
at /home/jan/mysql/galera-10.0/sql/sql_parse.cc:6816
#12 0x00000000006774d3 in wsrep_mysql_parse (thd=0x45b7d40, 
rawbuf=0x7f1f74005288 "ROLLBACK", length=8, parser_state=0x7f1fe83bb610) 
at /home/jan/mysql/galera-10.0/sql/sql_parse.cc:6647
#13 0x00000000006688cd in dispatch_command (command=COM_QUERY, 
thd=0x45b7d40, packet=0x44807c1 "ROLLBACK ", packet_length=9) at 
/home/jan/mysql/galera-10.0/sql/sql_parse.cc:1451
#14 0x00000000006678a3 in do_command (thd=0x45b7d40) at 
/home/jan/mysql/galera-10.0/sql/sql_parse.cc:1077
#15 0x000000000078ba4d in do_handle_one_connection (thd_arg=0x45b7d40) 
at /home/jan/mysql/galera-10.0/sql/sql_connect.cc:1404
#16 0x000000000078b78d in handle_one_connection (arg=0x45b7d40) at 
/home/jan/mysql/galera-10.0/sql/sql_connect.cc:1311
#17 0x0000000000b0c20a in pfs_spawn_thread (arg=0x45713a0) at 
/home/jan/mysql/galera-10.0/storage/perfschema/pfs.cc:1853
#18 0x00007f1ffa423f8e in start_thread (arg=0x7f1fe83bc700) at 
pthread_create.c:311
#19 0x00007f1ff9941e1d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)

We go from log.cc::binlog_rollback :

else if (ending_trans(thd, all) ||
              (!(thd->variables.option_bits & OPTION_KEEP_LOG) &&
               (!stmt_has_updated_non_trans_table(thd) ||
                WSREP_FORMAT(thd->variables.binlog_format) != 
BINLOG_FORMAT_STMT) &&
(!cache_mngr->trx_cache.changes_to_non_trans_temp_table() ||
                WSREP_FORMAT(thd->variables.binlog_format) != 
BINLOG_FORMAT_MIXED)))
       error= binlog_truncate_trx_cache(thd, cache_mngr, all);

end there to

  cache_mngr->reset(false, true);

and there is

  DBUG_ASSERT(empty());

  bool empty() const
   {
     return pending() == NULL && my_b_tell(&cache_log) == 0;
   }

At empty() this is what does not hold : my_b_tell(&cache_log) == 0; // 
not sure what this means

At test output there is at the end:

# 2013-09-09T10:03:21 [6620] Query: UPDATE LOW_PRIORITY IGNORE 
`table100_int_autoinc` AS X SET `col_char_12` = 220 WHERE X . 
`col_char_12_key` IN ( 2427781120 , 1 , 'i' , 'm' , 'o' ) ORDER BY 
`col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 8 
failed: 1213 Deadlock found when trying to get lock; try restarting 
transaction. Further errors of this kind will be suppressed.
# 2013-09-09T10:04:14 [6620] Query: DELETE LOW_PRIORITY  IGNORE FROM 
`table10_key_pk_parts_2_int_autoinc` WHERE `col_char_12_key` IN ( 
4200660992 , 1646198784 , 3888709632 , 32 , 'c' ) ORDER BY 
`col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1 
failed: 1205 Lock wait timeout exceeded; try restarting transaction. 
Further errors of this kind will be suppressed.

Unfortunately, general log contains so many statements from different 
connections that they are not really helpful for getting repeatable test 
case.

R: Jan

--

Jan Lindström
Principal Engineer

MariaDB | MaxScale | skype: jan_p_lindstrom

www.skysql.com <http://www.skysql.com/>

Twitter <http://twitter.com/skysql> Blog <http://www.skysql.com/blog/> 
Facebook <http://www.facebook.com/skysql> LinkedIn 
<http://www.linkedin.com/company/1214250> Google+ 
<https://plus.google.com/117544963211695643458/posts>

Jan Lindström

Kristian Nielsen

Jan Lindström

Kristian Nielsen

Jan Lindström

tags

participants (2)