27 Feb
2020
27 Feb
'20
6:25 a.m.
Hi Simon, On Tue, Feb 25, 2020 at 3:51 AM Simon Mudd <simon.mudd@booking.com> wrote: > > Hi there Sachin, > > Good to see you working on this! > > > > On 22 Jan 2020, at 13:29, Sachin Setiya <sachin.setiya@mariadb.com> wrote: > > … > > > Usage:- Using this feature is quite simple. > > 1. On master you have to turn on `BINLOG_SPLIT_ALTER` dynamic variable. > > 2. Slave must be using parallel replication. > > > > Advance Usage:- > > So alter is divided like this. > > 1. START identifier Actual_alter_stmt > > 2. COMMIT/ROLLBACK identifier Actual_alter_stmt. OR > > 2. COMMIT/ROLLBACK identifier ALTER > > > > identifier is thread_id. > > thread_id of the active connection? Yes > > So it’s only possible for a single thread to kick off a _single_ “background/asynchronous ALTER table” ? Right > Might it not be possible for me to kick off several at once? For that you need multiple client. This is same as current , on one client we can have just one alter we issue next alter when previous is finished. > > e.g. > > START 1 ALTER TABLE A add column a varchar(100) NOT NULL DEFAULT ‘A’; > START 2 ALTER TABLE B add column b varchar(100) NOT NULL DEFAULT ‘B’; > …. > START 26 ALTER TABLE Z add column z varchar(100) NOT NULL DEFAULT ‘Z’; > Actually client will not be able to write START ALTER , it will simple write alter table , which will be splitted automatically by server. > I would expect the identifier to be unique (globally). Being numeric is fine We dont need it , we just need it unique at any specific point of time. >. What happens > if I repeat the same or an existing identifier? I’d expect to get some sort of error message “background alter table with identifier 6 already running”, > but then the next question would be: “What ALTER TABLE is identifier 6?” leading maybe to a request to SHOW ALTER TABLE 6 or similar. You wont be able to specify identifier , because you will be writing normal alter table > > That’s unless I’ve misunderstood by what is expected to be provided by “identifier”. I’m guessing it’s just meant to be a unique reference > which somehow can be used or verified later? Yes unique reference (but any point of time, can be reused by later alter). Yes this will be used by matching COMMIT/ROLLBACK ALTER on slave. > > > Questions by Simon Mudd. > > … > > >>> * this behaviour should be configurable? > > Yes. > >>> - new global variable on the master to allow injection of this changed event stream? > > Right , `BINLOG_SPLIT_ALTER` > >>> - a new session variable set in the session where the command is triggered ? > > Right , `BINLOG_SPLIT_ALTER` > >>> - on slave a setting to indicate how many INPLACE ALTER TABLEs can run at once? > > No setting so far , but I am thinking of maximum no of CONCURRENT ALTER = > > slave_parallel_threads > >>> * how does a DBA monitor what’s going on? > >>> - the progress and number of active DDL statements > > So as we there is 2 part, So progress will go like this > > 1. Executing of start alter (this will take most of time) > > 2. Waiting for commit/rollback Signal > > 3. Commit/ Rollback Alter. > > Number of active ALTER , these will create new threads so DBA can know > > using this > > or I am thinking of adding variable in SHOW SLAVE INFO. which will > > show active DDL. > >>> - please consider adding counters/metrics for: > >>> * number of “asynchronous DDLs” in progress / completed successfully / failed or rolled back > > Okay, We can have counter for these metrics. If i get time to implement this. > >>> * sizes of the DDL changes made so far > > Not sure if we need this. > > Actually this is critical. Imagine I kick off an ALTER TABLE of a 1 TB table. > That is going to take a long time to run, so I need a reasonably easy way to figure out how far things have progressed. > I had anticipated something like a counter of bytes or rows affected / copied so far, > or an indication of the position copying data from the source to destination table. > > Perhaps that’s too complicated and I need to use something like I_S.TABLES and some combination of DATA_LENGTH / INDEX_LENGTH and DATA_FREE > for both the original and new versions of the table, but I suspect that during the ALTER TABLE this meta data may not yet exist anywhere. > So how do I find out how much work has been done and how much is left? > > Some of the OSCs I have done in production have taken over 3 months to run so having a progress meter of some sort > tends to be really useful. Simon this might be helpful https://mariadb.com/kb/en/alter-table/#progress-reporting, But I don't have much idea in progress reporting of alter. > > Other than that I’m following your work with a lot of interest. > > Regards, > > Simon > -- Regards Sachin Setiya Software Engineer at MariaDB