Hi Kristian, On Sat, Jun 25, 2016 at 3:57 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:
Nirbhay Choubey <nirbhay@mariadb.com> writes:
Also, it seems reasonable that FTWRL in general could wait for checkpoint events so that other backup mechanisms similarly could avoid binlog files
That sound good to me. But, considering Percona's backup locks, it seems more logical to implement this in Backup locks instead, whenever they get ported/implemented in MariaDB.
Right. As I was thinking about the problem, it occured to me that this wasn't really a Galera-specific thing, my suggestion seemed a valid general wait-for-checkpoint mechanism.
So we should put the code that waits for checkpoint in its own function (as you already did, MYSQL_BIN_LOG::wait_for_last_checkpoint_event()). But I agree, we can wait with actually exposing it (in FTWRL, backup locks, whatever) until when/if that becomes relevant/priority.
I would just note that this wait does not really do anything unless there is something else (like FTWRL in your case) that prevents new commits, otherwise a new checkpoint could become pending at any time after wait_for_last_checkpoint_event() returns.
Also, in this particular case, the problem lies in reload_acl_and_cache(REFRESH_BINARY_LOG), (executed after FTWRL while preparing for SST) that rotates the binary log.
Hm, I see. So you're always copying an empty binlog file? I'm wondering why you don't simply don't copy any binlogs and just start the new server with --tc-heuristic-recover=ROLLBACK ... maybe copying binlogs was just considered easier? Anyway, I don't have the bigger picture, so can't have much of an informed opinion here.
The joiner node also picks up the GTID state from the binary log file it received.
Yes, it worked. But, to solve this issue in 10.1, I have added this wait to REFRESH_BINARY_LOG (as explained above) only when the server is acting as a Galera node.
That seems quite ugly, why not call it from the SST code, after it has called reload_acl_and_cache()? You're basically making FLUSH LOGS behave differently in Galera and non-Galera (if my understanding is correct), which might lead to subtle bugs?
I initially thought of adding the call after reload_acl_and_cache(), but there could still be a case when user performs a REFRESH_BINARY_LOG before LOCK_log is acquired.
But again, I don't have the bigger picture, and the whole wsrep patch is garbage all over the server anyway, so I suppose it doesn't matter much to me, as long as it's #ifdef WSREP.
and it also makes the extra lock/unlock of LOCK_log above redundant.
Not quite. The wait logic (that includes LOCK_log, as the snippet above) is to pause REFRESH_BINARY_LOG and an additional use of LOCK_log to block the RESET/ FLUSH commands while file transfer is in progress.
Sure, it's fine to have both, probably makes the code clearer anyway.
Right.
--- a/sql/log.cc +++ b/sql/log.cc @@ -3690,7 +3690,10 @@ bool MYSQL_BIN_LOG::open(const char *log_name, new_xid_list_entry->binlog_id= current_binlog_id; /* Remove any initial entries with no pending XIDs. */ while ((b= binlog_xid_count_list.head()) && b->xid_count == 0) + { my_free(binlog_xid_count_list.get()); + mysql_cond_broadcast(&COND_xid_list); + } binlog_xid_count_list.push_back(new_xid_list_entry); mysql_mutex_unlock(&LOCK_xid_list);
There is no need to mysql_cond_broadcast() multiple times. Use just a single broadcast outside the loop (before or after, doesn't make a difference).
Fixed. Best, Nirbhay
- Kristian.