Hi, Otto,

On Jul 23, Otto Kekäläinen wrote:
Have you considered using AUTH_SWITCH_REQUEST for that purpose? That would allow the redirect to happen after the switch to TLS and client/server certificate validation.
One of the main use cases at the moment is to redirect clients to another server when this current server is being shut down.
For that we need to be able to send the redirect information in the middle of the session, not only when a connection is being established. Both the session tracking and the error message approaches allow that.
Not strictly so - we don't *need* to send the redirect information in the middle of the session. The session can just be dropped, and when clients automatically reconnect they will get the error message (if old client software) or do an automatic redirect to the new server (if https://github.com/MariaDB/server/pull/2681 is merged).
First, this requires the client to make one extra connection attempt just to get the redirection information, which is a waste of time. It is also not guaranteed to work: if the server is shutting down, the socket might already be closed, so the client simply won't be able to connect again. Second, it aborts the session at a point where the client might not expect it, for example in the middle of a transaction. Some clients might be able to replay it, others might not. That's why I suggest letting the client decide when it can reconnect.
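To illustrate "let the client decide", here is a minimal sketch in Python. Everything in it is hypothetical: the `Session` class, the `on_redirect_notice` hook and the host labels are invented for illustration and are not part of any existing connector or of the server protocol. It only shows the idea that a redirect received in the middle of a transaction can be held back until the transaction ends.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Session:
    """Toy client session; not a real connector API."""
    host: str
    in_transaction: bool = False
    pending_redirect: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def on_redirect_notice(self, target: str) -> None:
        # Hypothetical hook: the server announced a new host mid-session.
        self.pending_redirect = target
        self._maybe_switch()

    def begin(self) -> None:
        self.in_transaction = True

    def commit(self) -> None:
        self.in_transaction = False
        self._maybe_switch()

    def _maybe_switch(self) -> None:
        # Reconnect only at a safe point, i.e. outside a transaction.
        if self.pending_redirect and not self.in_transaction:
            self.log.append(f"reconnect {self.host} -> {self.pending_redirect}")
            self.host = self.pending_redirect
            self.pending_redirect = None


session = Session(host="server-a")
session.begin()
session.on_redirect_notice("server-b")  # arrives mid-transaction, so deferred
session.commit()                        # safe point: the switch happens here
print(session.host, session.log)
```

An idle session in this sketch would switch immediately on receiving the notice, since there is no open transaction to protect.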
This is similar to how HTTP 301/302 redirects work, and pretty sweet, as it allows us to gracefully drain the old server and lets clients fall back on tried-and-tested semantics for handling connection interrupts.
HTTP is stateless; what it does cannot always be blindly applied to the very much stateful MariaDB client-server protocol. And, really, I don't understand how the phrases "session can just be dropped" and "gracefully drain the old server" can be used in adjacent statements. There's nothing graceful about forcefully dropping the connection.
Consider this scenario:

1. Server A is getting 1 new connection per second and has 100 existing active connections, of which 10 are actually doing a query and 2 are long-running transactions.
2. Admin prepares server B to replicate server A data, and once replica lag is low runs on server A: SET GLOBAL SERVER_REDIRECT_TARGET=sever-b.example.com
3. Server A redirects all new connections; server B gets 1 new connection per second but does not yet serve traffic, so clients retry connections/queries.
4. Server A drops 98 connections; 2 continue with their long-running transactions.
5. After a defined delay, server A also drops the long-running transaction connections to avoid being stuck on them.
6. All traffic is hitting server B, which is still catching up in replication; clients keep reconnecting.
7. Server B has caught up, gets promoted to primary, and starts accepting connections.
8. Server B at this point serves all traffic, and clients/applications continue to work without errors as long as the total switchover delay was shorter than the configured retry delay of the clients.
9. Server A serves no queries; all clients connecting to it get redirected to server B. After some delay server A can be completely shut down.
10. If any of the clients had an old version of the MariaDB connector, they will stop working, but error logs will clearly show that a redirect happened.
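As a rough back-of-the-envelope check of the scenario above, the following Python sketch counts the connection attempts made by the 98 dropped clients under each approach. It assumes, for illustration only, that a dropped client needs one attempt to server A to learn the redirect and one more to reach server B, while a client told about the redirect mid-session connects to server B directly. Retries made while server B is not yet accepting traffic are ignored, since they would apply to both approaches. The function name is hypothetical.

```python
def connection_attempts(clients: int, mid_session_redirect: bool) -> int:
    """Count connection attempts needed to move `clients` to the new server."""
    # Mid-session redirect: the client already knows the target, 1 attempt.
    # Drop-and-reconnect: 1 attempt to the old server to learn the redirect,
    # then 1 attempt to the new server.
    per_client = 1 if mid_session_redirect else 2
    return clients * per_client


dropped = connection_attempts(98, mid_session_redirect=False)
redirected = connection_attempts(98, mid_session_redirect=True)
print(dropped, redirected, dropped - redirected)  # 196 98 98
```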
Yes, sure. It's a fine scenario and either solution can do it. With redirects-in-the-middle there will be 98 fewer connection attempts, and _maybe_ the long-running transactions will redirect right away too (depending on the client implementations).

Regards,
Sergei
VP of MariaDB Server Engineering
and security@mariadb.org