Hi Bar, Eugene, I think that the rdiff file needs to be removed altogether. I misread the test that it was converting from utf8, while it was actually converting from ascii. In my opinion, we do need several kinds of tests in this area: * Changing from utf8mb3 to utf8mb4, with ALGORITHM=INSTANT * Changing the length, ALGORITHM=INSTANT. For ROW_FORMAT=REDUNDANT, InnoDB can always enlarge VARCHAR. For others, not from 128..255 bytes to more than 255 bytes. These combinations should continue to be covered in some test. * Changing both the length and the collation, ALGORITHM=INSTANT. * Changing the encoding, such that violations will occur (say, ascii column containing byte values > 127). Both with and without IGNORE. This should currently take place with ALGORITHM=COPY, but I would not want to add ,ALGORITHM=COPY to test cases. Instead, use --enable_info and ensure that it reports more than "0 rows affected". For native ALTER, it would report 0 rows affected. I will take one more look at the entire patch tomorrow. Apart from possibly some of the my observations above, I think that it should be fine. Marko On Tue, Apr 23, 2019 at 4:49 PM Alexander Barkov <bar@mariadb.com> wrote:
Hi,
It seems I forgot to record the rdiff file.
On 4/23/19 4:55 PM, Eugene Kosov wrote:
Hello, Marko, Alexander.
23.04.2019, 15:37, "Marko Mäkelä" <marko.makela@mariadb.com>:
Bar, thank you for the revision.
I ran all tests and found a regression for ROW_FORMAT=REDUNDANT, in the test innodb.instant_alter_charset,redundant. We should allow any extension of VARCHAR maximum length (in bytes) for ROW_FORMAT=REDUNDANT. The test attempts to enlarge the column from 50*3 to 200*3 bytes. That cannot work for other ROW_FORMAT in InnoDB, but it does for ROW_FORMAT=REDUNDANT.
I think it's mostly broken test rather a code regression.
Now there is only one case when field length can be increased due to charset change and it's utf8mb3 -> utf8mb4.
I think the only test case we need now is CHAR(70) UTF8MB3 -> CHAR(70) UTF8MB4 It should work only fro ROW_FORMAT=REDUNDANT
Please find an incremental patch fixing tests.
It adds the block suggested by Eugene and updates the rdiff file accordingly.
Thanks.
Also, please check the spelling once more. I found "Uncode-1.1" in a comment.
See also the Description of MDEV-18584. I’d want to allow a few more instantaneous conversion of CHAR columns in InnoDB. I agree that the (table->file->ha_table_flags() & HA_EXTENDED_TYPES_CONVERSION) is a bit ugly and could be replaced with something more descriptive.
Marko
On Fri, Apr 19, 2019 at 2:24 PM Alexander Barkov <bar@mariadb.com> wrote:
Hi Marko, Eugene,
Thanks for good suggestions. A new version is attached.
See details inline:
On 4/19/19 2:46 PM, Marko Mäkelä wrote:
Hi Bar,
I see that these bugs were based on wrong assumptions that ASCII or UCS2 columns would only contain valid ASCII or UTF-16 data. We should have tested these assumptions during the development of MDEV-15564. Luckily these were found before the GA release of MariaDB 10.4.
That was my fault. When we had discussions on this topic with Eugene, I overlooked these problems. Sorry for this.
Here is just a quick note before I will return to work on Tuesday.
Please remove ,algorithm=copy from tests. Instead, use --enable_info alter table ... --disable_info and make sure that the table contains at least 1 row when the ALTER TABLE is executed. In that way, ALGORITHM=COPY will report that a nonzero amount of rows were affected, while the native ALTER would report that 0 rows were affected.
I added this into the new tests in the end of instant_alter_charset.test
Note, I didn't add this into original MDEV-15564 tests, as they mostly use empty tables.
I would also suggest testing with some bad data (non-ASCII chars in ASCII column, or Unicode surrogate pairs in UCS2) to see what kind of errors would be triggered (if any), or what the converted data would look like after ALTER IGNORE TABLE.
This is a very good idea. I've just added "ALTER IGNORE"s into the new tests.
Please run the commit comment through a spell-checker. I spotted the typos "correspoding" and "actuallt".
Done.
Also, please mention in the commit comment that these are regressions caused by MDEV-15564. That should make our lifes easier, should we ever decide to backport MDEV-15564 to earlier versions.
Done.
I will review the code and tests in detail next week. I hope that Eugene can review it as well.
Best regards,
Marko
-- Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation
-- Marko Mäkelä, Lead Developer InnoDB MariaDB Corporation