Hi all,

Up until now we have only directly managed stand alone MySQL instances. For clusters (MySQL) we have used a contractor to build and administrate the clusters. Our current product development requirements and good business practice (high availability of critical databases) now dictate that we move to a multi-server setup.

Our goal is to setup at least two servers which will have the same data at all times and which are at separate sites so if the link between the sites goes down, the cell of servers at that site can keep operating. Our database stores text fields and blobs of audio recordings. We read data (SELECT with a WHERE) for 90-95% of our queries, some tables can be populated with hundreds of thousand of rows but normally far fewer. The software we are running which interacts with the database is smart enough to figure out if one of the database servers it is trying to reach is not responding and will hunt through a list of IP's until it successfully connects.

I have a couple of questions as this will be our first cluster. (I don't think multi-master will meet our requirements based on the examples available online so I have chosen MariaDB 10 with Galera.)

How often is complex management (manual intervention) required such as adding a node back into the cluster after communications between the nodes fails and then is restored? Is it normal, say 90% of the time, for syncing to re-establish automatically?

What type of monitoring tools are advised? (Or is it as simple as automatically parsing logs to find errors?)

The documented recommendation is 3 servers; we don't have the hardware at the moment so is it ok to start with 2 and add an additional database server at a later time?

Alternatively, we could use 2 dedicated servers and 1 virtual server but the documentation states that the cluster only works as quickly as the slowest node. Is that only if you are running queries against the slowest node? Or if you are writing to the slowest node? I am assuming we can run the arbiter (garbd) on one of the existing data nodes.

I am assuming we can use the SSL options in the configuration file to secure communication between the servers so if it was sniffed it would not be possible to decode the data. Does anyone have any comments on if this affects reliability?

Thank you for any input you have on these questions. Please feel free to ask if you want clarification on anything.