Hello Again,

Just wanted to say thank you again for this.

I restored a copy of the replica from the day before, but after some expected HA_ERR_FOUND_DUPP_KEY errors, it started throwing a bunch of HA_ERR_KEY_NOT_FOUND and that didn’t work.

Then I restored another copy from the same minute as the split, and after a few hundred HA_ERR_FOUND_DUPP_KEY skips, it started catching up. An hour later, it was 0 seconds behind and has stayed in sync.

We may have lost some data from the local non-replicated DBs, but nothing we can’t recover from the original server if we need to. Certainly a lot less pain than recreating the sync.

Again, thanks. It’s been a pretty crappy week because of this so it’s nice to end it on a high.

Cheers

S

From: Simon Avery via discuss <discuss@lists.mariadb.org>
Sent: 10 January 2025 09:31
To: Gordan Bobic <gordan.bobic@gmail.com>
Cc: sruli67--- via discuss <discuss@lists.mariadb.org>
Subject: [MariaDB discuss] Re: MariaDb Master/Replica. How to recover replication when reverting Master to an earlier backup?

Hi Gordan,

…Possibly. That’s worth exploring, even if it means restoring to a new vm and then transferring the local DBs across to it if they’ve changed since then, thank you.

Simon

From: Gordan Bobic <gordan.bobic@gmail.com>
Sent: 10 January 2025 08:24
To: Simon Avery <Simon.Avery@atass-sports.co.uk>
Cc: sruli67--- via discuss <discuss@lists.mariadb.org>
Subject: Re: [MariaDB discuss] MariaDb Master/Replica. How to recover replication when reverting Master to an earlier backup?

You don't often get email from gordan.bobic@gmail.com. Learn why this is important

Can you roll back the slave to a backup before the point you rolled the master back to? If so, assuming you are running with safe settings (sync_binlog=1, sync_master_info=1,innnodb_flush_log_at_trx_commit=1), it should come up and catch up from that earlier point, provided your binlog retention goes that far back.

On Fri, 10 Jan 2025, 09:56 Simon Avery via discuss, <discuss@lists.mariadb.org> wrote:

Hello

We have two Mariadb 10.11.10 servers acting as Master and Replica.

Both have unique local databases, plus seven that are replicated from Master to Replica with Replicate_Wild_Do_Table: DB1Name.%, DB2Name.% etc

The size of the replicated databases is just over 1Tb, with the single largest being almost 500Gb. Both machines are in vmware and on the same network.

Yesterday, we hit an issue with the Master which required that vm to be restored to a backup four hours previously. This got Master back in play, but obviously has broken Replication.

Traditional wisdom seems to suggest that I need to recreate this replication setup from scratch - ie, stop Master (from changing, ie, close firewall and block clients, flush logs), note the Log Position, and then mysqldump each database. However, due to the size of these databases, that is going to take many hours and we can't accept that downtime for Master.

It's occurred to us that we might speed this up by:

    Stop Master from changing.

    Note Log Position.

    Clone Master's vm to Master-Clone. (< 10 minutes)

    Restart Master.

Then we would be at relative leisure to

[On Clone]
Mysqldump the databases onto a temporary drive.

[On Replica]
DROP the seven databases.

Import the dumped databases from the temporary drive.

Update the log position in config and restart the slave user.

Then Replica should start syncing from Master again, even if Clone was several days old?

Does that sound sensible?

These databases have partitions – is that going to cause issues dumping and reimporting them or should I use another method?

Any pitfalls?

Any alternative ways?

Thank you

_______________________________________________
discuss mailing list -- discuss@lists.mariadb.org
To unsubscribe send an email to discuss-leave@lists.mariadb.org