Yes, Percona Toolkit is great, but running a lot of those tools on a large dataset can take time. If you have already spent the time to set up "a couple of slaves", those can be transitioned into a Galera cluster. The way I use the cluster is to route all traffic through HAProxy (there are other ways, like ProxySQL), with one node live for read/write and the other two acting as members that also replicate down to slaves. The reason I route through HAProxy is that it handles automatic failover. One caveat: writes on the target node are locked pessimistically, but writes arriving from the other members use optimistic locking, so even a simple bash script running on multiple servers that reads and writes the same table can cause a deadlock. The same issue affects RDS clusters.
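A minimal sketch of the HAProxy routing I mean, assuming three Galera nodes and the common percona-clustercheck health check on port 9200 (the hostnames, IPs, and port are placeholders, not from the original post):

```
# haproxy.cfg fragment: all traffic goes to node1; node2/node3 are
# marked "backup" so they only receive connections when node1 fails
# its health check -- this is what gives the single read/write node
# plus automatic failover described above.
listen galera
    bind *:3306
    mode tcp
    option httpchk            # expects a clustercheck-style HTTP responder
    server node1 10.0.0.1:3306 check port 9200
    server node2 10.0.0.2:3306 check port 9200 backup
    server node3 10.0.0.3:3306 check port 9200 backup
```

Funneling all writes to one node this way also sidesteps most of the optimistic-locking deadlocks, since conflicting transactions are serialized on a single member instead of being certified across the cluster.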
Sorry for the long-winded answer, but I wanted to show how a member of a cluster doesn't have to deal with the lag of its peers, along with some of the risks. Nodes run with the same GTID and alert if they diverge. I had a similar setup at previous jobs, and if Galera had been as mature then as it is now built into MariaDB, I would have used it the same way. The non-read/write nodes are also backup-safe, since you can desync them during logical or physical backups and resync them afterward.
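For the desync/resync part, this is roughly what it looks like on a non-read/write node using Galera's wsrep_desync variable (the backup command here is just an illustrative placeholder):

```
-- Tell the cluster this node may fall behind without triggering
-- flow control; it keeps receiving writesets but applies them lazily.
SET GLOBAL wsrep_desync = ON;

-- ... run your logical or physical backup here, e.g. mysqldump
-- or xtrabackup, against this node only ...

-- Rejoin normal operation; the node catches up on queued writesets.
SET GLOBAL wsrep_desync = OFF;
```

Since HAProxy isn't sending this node any traffic, the temporary replication lag during the backup is invisible to applications.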
I wanted to offer an alternate solution based on my perceptions of your server setup.