Hi,
On 29 Dec 2016, at 10:38, Sachin Setiya <sachin.setiya@mariadb.com> wrote:
I am thinking of implementing true lock=none for slaves. Currently Alter command is only executed on slave when master commits it. I(thanks marko for suggesting this idea) am thinking of writing BEGIN_DDL_EVENT in binlog after master hits ha_prepare_inplace_alter_table. Then master will write QUERY_EVENT on binlog with actual alter query . On commit/rollback master will write COMMIT_DDL_EVENT/ROLLBACK_DDL_EVENT.
On slave there will be pool of threads(configurable global variable), which will apply these DDLs. On receiving BEGIN_DDL_EVENT slave thread will pass the QUERY_EVENT to one of the worker thread. Worker thread will execute until ha_inplace_alter_table. Actual commit_inplace_alter will be called by sql thread. If sql thread recieve some kind of rollback event , then it will somehow signal worker thread to stop executing alter. If none of the worker threads are available then event will be en-queued, then If we received rollback event the we will simply discard event from queue, If we received commit event then SQL thread will synchronously process DDL event.
Please let me know what you think. Credits for idea Marko and Lixun.
This idea has been discussed several times in the past (a way to trigger the action to be kicked off on all the slaves while it is running on the master still) with different people at MariaDB and elsewhere and it sounds very attractive. It’s currently one of the reasons that INLINE ALTER TABLE is rarely done where I work as it generates a replication lag on slaves so is not acceptable and running the task in isolation separately on each server in a replication chain requires careful management. From the operational perspective a couple of questions come up: * this behaviour should be configurable? - new global variable on the master to allow injection of this changed event stream? - a new session variable set in the session where the command is triggered ? - on slave a setting to indicate how many INPLACE ALTER TABLEs can run at once? * how does a DBA monitor what’s going on? - the progress and number of active DDL statements - please consider adding counters/metrics for: * number of “asynchronous DDLs” in progress / completed successfully / failed or rolled back * sizes of the DDL changes made so far * number of threads busy in this state (maybe can be implied from SHOW PROCESSLIST or equivalent but nicer to have an easy to query metric) * what happens if the commit/rollback ddl event never arrives? (the statement on the master could be skipped for one of several reasons) - I assume that users on the slave see the old structure all the time until it completes - would this have a storage or performance penalty on the slave if the commit/rollback DDL event never arrives? - can a “commit” / “rollback” be manually triggered by the DBA in such circumstances? * what happens if the server crashes while this process (or these processes) are ongoing when the server starts up again? Thanks for clarifying ideas on this as while the idea sounds very attractive it helps enormously if monitoring of status is clear and easy to achieve. Regards, Simon