Hi,
On Jul 23, Otto Kekäläinen wrote:
Have you considered using AUTH_SWITCH_REQUEST for that purpose? That would allow the redirect to happen after the switch to TLS and client/server certificate validation.
One of the main use cases at the moment is to redirect clients to another server when this current server is being shut down.
For that we need to be able to send the redirect information in the middle of the session, not only when a connection is being established. Both session tracking and error message approach allow that.
Not strictly so - we don't *need* to send the redirect information in the middle of the session. The session can just be dropped, and when clients automatically reconnect they will get the error message (if old client software) or do an automatic redirect to the new server (if https://github.com/MariaDB/server/pull/2681 is merged).
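To make the reconnect path described above concrete, here is a minimal sketch of a client that reconnects and follows a redirect returned at connect time. Everything in it is an assumption made for illustration: the `Redirect` exception, the `SERVERS` table and both function names are hypothetical, and no real connector API or wire format is implied.

```python
class Redirect(Exception):
    """Hypothetical error a draining server returns at connect time."""

    def __init__(self, target):
        super().__init__(f"server is moving to {target}")
        self.target = target


# Toy stand-in for the network (assumption): host -> redirect target,
# or None if the host accepts connections normally.
SERVERS = {"server-a": "server-b", "server-b": None}


def connect(host):
    """Pretend to open a connection; a draining host raises Redirect."""
    target = SERVERS[host]
    if target is not None:
        raise Redirect(target)
    return f"session@{host}"


def connect_following_redirects(host, max_hops=3):
    """Reconnect, following redirects up to max_hops times."""
    for _ in range(max_hops + 1):
        try:
            return connect(host)
        except Redirect as redirect:
            host = redirect.target
    raise ConnectionError("too many redirects")
```

An old client in this model would simply surface the `Redirect` error message to the user, while a redirect-aware client calls `connect_following_redirects("server-a")` and ends up with a session on server-b. The hop limit is there so that a misconfigured redirect loop fails instead of spinning forever.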
First, this requires the client to perform one extra connection attempt, just to get the redirection information. This sounds like a waste of time. And also this is not guaranteed to work - if the server shuts down, the socket might be already closed, so the client simply won't be able to connect again.
It is not a waste of time, as the time it takes to attempt a connection and get an error+redirect reply is minimal. The actual delay, i.e. how much time is "wasted" in a switchover situation, comes from how quickly server A is able to drain all connections and server B is able to sync up and get promoted to primary, and that is the time the design should focus on optimizing. It is guaranteed to work as long as the DBA keeps the server running and redirecting - just like with HTTP 301 redirects.
Second, it aborts the session in the middle, where the client might not expect it, for example in the middle of a transaction. Some clients might be able to replay it, others might not. That's why I suggest letting the client decide when it can reconnect.
The switchover is initiated and decided by the DBA controlling the server. You cannot possibly expect it to be controlled by the client - and in particular to be decided by the *slowest* client. The server can allow some time for the draining of existing connections to happen, but eventually it needs to tell old client connections to stop, and only after the redirect has started for new connections + all existing connections have been dropped can the actual switchover proceed by promoting a new primary.
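The ordering argued for above (redirect new connections, allow a bounded drain, drop what is left, then promote) can be sketched as a small simulation. This is only an illustration under stated assumptions: `switchover`, the tick-based grace period and the session table are hypothetical and do not correspond to any existing server feature.

```python
def switchover(sessions, grace_ticks):
    """Simulate a DBA-initiated switchover.

    sessions: dict mapping a client name to the number of ticks it needs
              before it finishes its current work (hypothetical model).
    grace_ticks: how long the server waits before dropping connections.
    Returns the ordered list of actions the server takes.
    """
    log = ["redirect new connections"]
    for tick in range(grace_ticks):
        # Clients that have finished their work disconnect on their own.
        finished = [name for name, left in sessions.items() if left <= tick]
        for name in finished:
            del sessions[name]
            log.append(f"tick {tick}: {name} left on its own")
        if not sessions:
            break
    # The slowest clients do not get to hold up the switchover.
    for name in sorted(sessions):
        log.append(f"drop {name}")
    sessions.clear()
    # Only now, with no sessions left on the old server, promote.
    log.append("promote new primary")
    return log
```

For example, `switchover({"fast": 1, "slow": 100}, grace_ticks=3)` lets the fast client leave by itself, drops the slow one once the grace period ends, and promotes the new primary last. The point of the sketch is that the upper bound on the drain time is set by the server side, not by the slowest client.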
This is similar to how HTTP 301/302 redirects work, and pretty sweet as it makes it possible to gracefully drain the old server and lets clients fall back on tried-and-tested semantics for handling connection interrupts.
HTTP is stateless, what it does cannot always be blindly applied to the very much stateful MariaDB client-server protocol.
And, really, I don't understand how phrases "session can just be dropped" and "gracefully drain the old server" can be used in adjacent statements. There's nothing graceful about forcefully dropping the connection.
Perhaps not directly graceful, but at least all production grade clients have some mechanisms to cope with network interruptions and such. If you introduce a 'recommendation'-based redirect, it will lead clients to implement completely novel logic, which in the worst case leads to clients competing over which one is the last to use the old server to minimize per-client downtime, making the system-level total downtime in a switchover worse for all other clients. Thus introducing a redirection that is based on the server denying new connections will nicely align with existing error handling logic in clients.