Hi Kristian,

On Sat, Jun 25, 2016 at 3:57 AM, Kristian Nielsen <knielsen@knielsen-hq.org> wrote:

Nirbhay Choubey <nirbhay@mariadb.com> writes:

>> Also, it seems reasonable that FTWRL in general could wait for checkpoint
>> events so that other backup mechanisms similarly could avoid binlog files

> That sounds good to me. But, considering Percona's backup locks, it seems
> more logical to implement this in Backup locks instead, whenever they get
> ported/implemented in MariaDB.

Right. As I was thinking about the problem, it occurred to me that this
wasn't really a Galera-specific thing; my suggestion seemed a valid general
wait-for-checkpoint mechanism.

So we should put the code that waits for checkpoint in its own function
(as you already did, MYSQL_BIN_LOG::wait_for_last_checkpoint_event()). But I
agree, we can hold off on actually exposing it (in FTWRL, backup locks,
whatever) until that becomes relevant or a priority.

I would just note that this wait does not really do anything unless there is
something else (like FTWRL in your case) that prevents new commits;
otherwise a new checkpoint could become pending at any time after
wait_for_last_checkpoint_event() returns.
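
The point above can be illustrated with a minimal sketch. It is a toy model
built on the C++ standard library, not server code: CheckpointState,
try_commit(), complete_checkpoint() and wait_for_checkpoints() are
hypothetical names, and commits_blocked merely stands in for something like
FTWRL. A real server would make the committer wait rather than refuse it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical toy model of pending binlog checkpoints (not server code).
struct CheckpointState {
  std::mutex lock;
  std::condition_variable cond;
  int pending_checkpoints = 0;   // checkpoints requested but not yet done
  bool commits_blocked = false;  // stand-in for a global lock such as FTWRL
};

// A commit that makes a new checkpoint pending.
// Refused while commits are blocked (simplification, see lead-in).
bool try_commit(CheckpointState &s) {
  std::lock_guard<std::mutex> guard(s.lock);
  if (s.commits_blocked) return false;
  ++s.pending_checkpoints;
  return true;
}

// Marks one pending checkpoint as done and wakes any waiter.
void complete_checkpoint(CheckpointState &s) {
  std::lock_guard<std::mutex> guard(s.lock);
  assert(s.pending_checkpoints > 0);
  --s.pending_checkpoints;
  s.cond.notify_all();
}

// Returns once no checkpoint is pending. This says nothing about what
// happens after it returns: only commits_blocked keeps the count at zero.
void wait_for_checkpoints(CheckpointState &s) {
  std::unique_lock<std::mutex> guard(s.lock);
  s.cond.wait(guard, [&s] { return s.pending_checkpoints == 0; });
}
```

In this model, calling try_commit() right after wait_for_checkpoints()
succeeds and makes a checkpoint pending again unless commits_blocked was set
beforehand, which is why the wait is only useful together with a lock that
stops new commits.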

> Also, in this particular case, the problem lies
> in reload_acl_and_cache(REFRESH_BINARY_LOG),
> (executed after FTWRL while preparing for SST) that rotates the binary log.

Hm, I see. So you're always copying an empty binlog file? I'm wondering why
you don't simply skip copying any binlogs and just start the new server with
--tc-heuristic-recover=ROLLBACK ... maybe copying binlogs was just
considered easier? Anyway, I don't have the bigger picture, so I can't have
much of an informed opinion here.

The joiner node also picks up the GTID state from the binary log file it received.

> Yes, it worked. But, to solve this issue in 10.1, I have added this wait to
> REFRESH_BINARY_LOG (as explained above) only when the server is acting as a
> Galera node.

That seems quite ugly. Why not call it from the SST code, after it has
called reload_acl_and_cache()? You're basically making FLUSH LOGS behave
differently in Galera and non-Galera (if my understanding is correct), which
might lead to subtle bugs.

I initially thought of adding the call after reload_acl_and_cache(), but there
could still be a case where a user performs a REFRESH_BINARY_LOG before
LOCK_log is acquired.

But again, I don't have the bigger picture, and
the whole wsrep patch is garbage all over the server anyway, so I suppose it
doesn't matter much to me, as long as it's #ifdef WSREP.

>> and it also makes the extra lock/unlock of LOCK_log above redundant.
>
>
> Not quite. The wait logic (that includes LOCK_log, as in the snippet above)
> is to pause REFRESH_BINARY_LOG, and the additional use of LOCK_log is to
> block the RESET/FLUSH commands while file transfer is in progress.

Sure, it's fine to have both, probably makes the code clearer anyway.

Right.

> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -3690,7 +3690,10 @@ bool MYSQL_BIN_LOG::open(const char *log_name,
>      new_xid_list_entry->binlog_id= current_binlog_id;
>      /* Remove any initial entries with no pending XIDs. */
>      while ((b= binlog_xid_count_list.head()) && b->xid_count == 0)
> +    {
>        my_free(binlog_xid_count_list.get());
> +      mysql_cond_broadcast(&COND_xid_list);
> +    }
>      binlog_xid_count_list.push_back(new_xid_list_entry);
>      mysql_mutex_unlock(&LOCK_xid_list);

There is no need to call mysql_cond_broadcast() multiple times. Use just a
single broadcast outside the loop (before or after, doesn't make a difference).

Fixed.
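
For reference, the single-broadcast shape can be sketched as follows. This is
a standalone illustration using std::mutex and std::condition_variable rather
than the server's mysql_mutex_t/mysql_cond_t, and it is not the actual patch:
XidListEntry, XidList and open_new_binlog() are hypothetical stand-ins for
the binlog_xid_count_list handling in MYSQL_BIN_LOG::open().

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for one entry of binlog_xid_count_list.
struct XidListEntry {
  unsigned long binlog_id;
  long xid_count;  // number of pending XIDs in this binlog
};

// Hypothetical stand-in for the state guarded by LOCK_xid_list.
struct XidList {
  std::mutex lock;               // plays the role of LOCK_xid_list
  std::condition_variable cond;  // plays the role of COND_xid_list
  std::deque<XidListEntry> entries;
};

// Drops leading entries with no pending XIDs, appends the new entry and
// wakes waiters exactly once. Returns the number of entries removed.
std::size_t open_new_binlog(XidList &list, unsigned long new_binlog_id) {
  std::lock_guard<std::mutex> guard(list.lock);
  std::size_t removed = 0;
  while (!list.entries.empty() && list.entries.front().xid_count == 0) {
    list.entries.pop_front();
    ++removed;
  }
  // The head, if any, still has pending XIDs at this point.
  assert(list.entries.empty() || list.entries.front().xid_count != 0);
  // One broadcast covers all removals: waiters cannot re-check the list
  // until the mutex is released, so per-iteration broadcasts add nothing.
  list.cond.notify_all();
  list.entries.push_back(XidListEntry{new_binlog_id, 0});
  return removed;
}
```

Because the mutex is held for the whole function, placing the broadcast
before or after the loop is equivalent from a waiter's point of view.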

Best,

Nirbhay

- Kristian.