Hello

 

We have two Mariadb 10.11.10 servers acting as Master and Replica.

 

Both have unique local databases, plus seven that are replicated from Master to Replica with Replicate_Wild_Do_Table: DB1Name.%, DB2Name.% etc

 

The size of the replicated databases is just over 1Tb, with the single largest being almost 500Gb. Both machines are in vmware and on the same network.

 

Yesterday, we hit an issue with the Master which required that vm to be restored to a backup four hours previously. This got Master back in play, but obviously has broken Replication.

 

Traditional wisdom seems to suggest that I need to recreate this replication setup from scratch - ie, stop Master (from changing, ie, close firewall and block clients, flush logs), note the Log Position, and then mysqldump each database. However, due to the size of these databases, that is going to take many hours and we can't accept that downtime for Master.

 

It's occurred to us that we might speed this up by:

 

    Stop Master from changing.

    Note Log Position.

    Clone Master's vm to Master-Clone. (< 10 minutes)

    Restart Master.

 

Then we would be at relative leisure to

[On Clone]
Mysqldump the databases onto a temporary drive.


[On Replica]
DROP the seven databases.

Import the dumped databases from the temporary drive.

Update the log position in config and restart the slave user.

 Then Replica should start syncing from Master again, even if Clone was several days old?

 

Does that sound sensible?

 

These databases have partitions – is that going to cause issues dumping and reimporting them or should I use another method?

 

Any pitfalls?

 

Any alternative ways?

 

Thank you