Hi Jan, On Wed, Jan 11, 2017 at 9:07 AM, Jan Lindström <jan.lindstrom@mariadb.com> wrote:
Number of the fields indeed would solve the ADD COLUMN default case but has some limitations
ADD COLUMN m default xxx; would be still slow, xxx = NULL maybe possible, in other cases could we store default on SYS_COLUMNS?
The instant ADD COLUMN described in MDEV-11369 is just a start. Changing the DEFAULT of an instantly added column would require a rebuild, but specifying the DEFAULT at the time of adding the column can be done instantly. Changing a previously NOT NULL column to allow NULLs would be in the realm of https://jira.mariadb.org/browse/MDEV-11424 Instant ALTER TABLE of failure-free record format changes. Currently the DEFAULT values of columns are stored into the *.frm files by the SQL layer, not at all in InnoDB. If we allow instant changes of the DEFAULT values of instantly added columns, we must make sure that any rows that were inserted before the change to the DEFAULT would use the old default value. I believe that for this, we would effectively have to store all default values that ever existed, and we would have to identify when each record or page was created with respect to the instant ALTER operations. MDEV-11424 does cover all that. Monty has expressed a long-term goal of allowing the server to discover InnoDB tables after *.frm and *.ibd files were copied to the data directory. I think that we should avoid extending the InnoDB SYS_* tables any further, because that would seem to move us further away from the goal. I think that any metadata related to the instant ALTER has to be stored in the data file itself, and the high-level data dictionary should only reflect the latest committed data definition. I hope that we can some day deliver a transactional data dictionary in MDEV-11655 and remove all SYS_* tables.
DROP COLUMN would be still slow
Yes, DROP COLUMN (as well as changing the order of columns, or adding a column somewhere in the middle) would require a table rebuild until MDEV-11424 is implemented.
Problem naturally is that we do not have suitable metadata on every page where to identify what columns are stored. I do not see way out of this problem without introducing yet a another file forma (we have too many of them already).
Yes, that is the case. I agree with you that we must be careful when introducing new file formats. Luckily we do have 8 previously unused bytes (always written as 0 until now) in each clustered index page, the PAGE_MAX_TRX_ID field. (MDEV-6076 in MariaDB 10.2.4 will repurpose the field in clustered index root pages for a persistent AUTO_INCREMENT field.) In MDEV-11369 we repurpose a subset 1..1023 of the values, and in MDEV-11424 we can repurpose the remaining values to indicate the format of the records in the page. MDEV-11424 would also require a metadata history (or record conversion recipes) to be stored somewhere in the data file, separately from the clustered index tree. Marko -- DON’T MISS M|17 April 11 - 12, 2017 The Conrad Hotel New York City https://m17.mariadb.com/ Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation