[Maria-developers] Review 2 of compressed columns
Hi! Here is a review for the patches marked with fixup! MDEV-11371 - column compression
commit f1a3cd40ecc4c144ac32d43ab5c1e88c8e4fae7e Author: Sergey Vojtovich <svoj@mariadb.org> Date: Mon Jul 31 16:07:39 2017 +0400
fixup! MDEV-11371 - column compression
In worst case we have to store data length + 1 for compressed data, we should increase field length by 1 internally so that data is not truncated.
diff --git a/mysql-test/t/column_compression.test b/mysql-test/t/column_compression.test index 256e96c..d3f8481 100644 --- a/mysql-test/t/column_compression.test +++ b/mysql-test/t/column_compression.test @@ -54,8 +54,8 @@ SHOW CREATE TABLE t1; --cat_file $MYSQLD_DATADIR/test/t1.CSV DROP TABLE t1;
---echo # Test fields with length equal to data length -CREATE TABLE t1(a VARCHAR(10) COMPRESSED); +--echo # Test fields that don't fit data +CREATE TABLE t1(a VARCHAR(9) COMPRESSED); --error ER_DATA_TOO_LONG INSERT INTO t1 VALUES(REPEAT('a', 10)); INSERT INTO t1 VALUES(REPEAT(' ', 10));
Please also insert a correct row, and not only edge cases. (You may do it already, but I can't see that in the diff)
@@ -70,3 +70,12 @@ INSERT INTO t1 VALUES(REPEAT(' ', 255)); SET column_compression_threshold=DEFAULT; SELECT a, LENGTH(a) FROM t1; DROP TABLE t1; + +--echo # Corner case: VARCHAR(255) COMPRESSED must have 2 bytes pack length +CREATE TABLE t1(a VARCHAR(255) COMPRESSED); +SHOW CREATE TABLE t1; +SET column_compression_threshold=300; +INSERT INTO t1 VALUES(REPEAT('a', 255)); +SET column_compression_threshold=DEFAULT; +SELECT a, LENGTH(a) FROM t1; +DROP TABLE t1;
Please insert also a row without column_compression_threshold, just to cover all cases in the test case.
diff --git a/sql/field.cc b/sql/field.cc index 216fb95..e1dc193 100644 --- a/sql/field.cc +++ b/sql/field.cc @@ -7555,7 +7555,8 @@ void Field_varstring::sql_type(String &res) const length= cs->cset->snprintf(cs,(char*) res.ptr(), res.alloced_length(), "%s(%d)", (has_charset() ? "varchar" : "varbinary"), - (int) field_length / charset()->mbmaxlen); + (int) field_length / charset()->mbmaxlen - + MY_TEST(compression_method())); res.length(length); if ((thd->variables.sql_mode & (MODE_MYSQL323 | MODE_MYSQL40)) && has_charset() && (charset()->state & MY_CS_BINSORT)) @@ -10015,6 +10016,8 @@ void Column_definition::create_length_to_internal_length(void) case MYSQL_TYPE_STRING: case MYSQL_TYPE_VARCHAR: length*= charset->mbmaxlen; + if (sql_type == MYSQL_TYPE_VARCHAR && compression_method()) + length++;
Add a comment: /* Compressed columns needs one extra byte to store the compression method. This byte is invisible to the end user, but not for the storage engine. */
key_length= length; pack_length= calc_pack_length(sql_type, length); break;
<cut>
commit 274ab73a89573554d07a6de1deb846ad47eb1004 Author: Sergey Vojtovich <svoj@mariadb.org> Date: Thu Jul 27 13:06:33 2017 +0400
fixup! MDEV-11371 - column compression
Fixed BLOB COMPRESSED -> BLOB COMPRESSED replication.
The problem was that Field_blob::unpack gets compressed data and calls Field_blob_compressed::store(), which compressed it again. Fixed by calling Field_blob::set_ptr() instead of ::store().
Note that now no data copy is performed and blob is pointing to data allocated by replication routines. It used to work this way before 06fb8c2d10, but still should be reviewed carefully to make sure no copy is actually needed.
patch ok.
commit 93e00970100503c3f365063d9261119f096bee48 Author: Sergey Vojtovich <svoj@mariadb.org> Date: Thu Jul 27 11:44:20 2017 +0400
Get rid of Field::do_save_field_metadata()
It doesn't serve any purpose, but generates extra virtual function call.
ok
commit e1247d4991bb4ebd3f98788c7bfe31a34faf7b69 Author: Sergey Vojtovich <svoj@mariadb.org> Date: Thu Jul 27 10:58:47 2017 +0400
fixup! MDEV-11371 - column compression
This is an addition to "Fixed compressed columns to work with partitioning".
Mroonga calls key_copy() on write/delete, when column is not in read_set. Adjust read_set for the duration of val_str().
diff --git a/sql/field.cc b/sql/field.cc index ec82984..ab96ee9 100644 --- a/sql/field.cc +++ b/sql/field.cc @@ -7659,8 +7659,12 @@ uint Field_varstring::get_key_image(uchar *buff, uint length, { String val; uint local_char_length; + my_bitmap_map *old_map;
+ old_map= dbug_tmp_use_all_columns(table, table->read_set); val_str(&val, &val); + dbug_tmp_restore_column_map(table->read_set, old_map); +
Shouldn't this be in mronga code and not in MariaDB? Looking at the original code, we didn't check for read_set before either for get_key_image, so I assume this is ok.
local_char_length= val.charpos(length / field_charset->mbmaxlen); if (local_char_length < val.length()) val.length(local_char_length);
commit 94040156d8007663beb6785ca653d41b26758301 Author: Sergey Vojtovich <svoj@mariadb.org> Date: Wed Jul 26 17:28:57 2017 +0400
fixup! MDEV-11371 - column compression
Get rid of hardcoded "8" (zlib compression method id) by moving towards support for multiple compression methods.
commit 6c8b1e563c7456eb0b8ca11a621333237de37ec4 Author: Sergey Vojtovich <svoj@mariadb.org> Date: Tue Jul 25 13:27:07 2017 +0400
fixup! MDEV-11371 - column compression
Fixed compressed columns to work with partitioning.
Reviewer note: this patch reuses get_key_image()/set_key_image() of Field_varstring to avoid code duplication. Cons: this adds virtual function calls val_str() and store().
ok
commit 9c85ad8d105e74baae0b6aba0db01b4d6aac7ba6 Author: Sergey Vojtovich <svoj@mariadb.org> Date: Thu Jul 20 16:57:46 2017 +0400
fixup! MDEV-11371 - column compression
ok. The only thing left is now to fix replication to handle compressed columns. We also have agreed to handle the case where the master and slave have different COMPRESSED column attributes for VARCHAR and BLOB. Regards, Monty
participants (1)
-
Michael Widenius