[Maria-discuss] columns compression
hi guys! anyone testing column compression and have problems when changing form one table format to other? i created a new table with blob compressed column, and i'm trying to insert select from old table, but i got a problem: MariaDB [test]> insert into new_table select * from old_table; ERROR 1406 (22001): Data too long for column 'data' at row 1 using aria or innodb engine give the same error, it's a BLOB columns tahnks -- Roberto Spadim
Hi Roberto, Could you share test case? Simplified if possible: we should only need 2 tables with compressed columns definition and first row from old table. Also please share session column_compression% settings. Thanks, Sergey On Fri, Mar 16, 2018 at 06:05:37PM -0300, Roberto Spadim wrote:
hi guys! anyone testing column compression and have problems when changing form one table format to other?
i created a new table with blob compressed column, and i'm trying to insert select from old table, but i got a problem:
MariaDB [test]> insert into new_table select * from old_table; ERROR 1406 (22001): Data too long for column 'data' at row 1
using aria or innodb engine give the same error, it's a BLOB columns
tahnks
-- Roberto Spadim
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Hi Sergey! I'm using a Azure VM (ubuntu), mariadb ubuntu package Server version: 10.3.5-MariaDB-10.3.5+maria~xenial mariadb.org binary distribution MariaDB [test]> *CREATE TABLE `table_without_compression` (* * `id` int(11) NOT NULL DEFAULT 0,* * `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * `data` blob NOT NULL,* * PRIMARY KEY (`id`,`gmtime`)* *) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.385 sec)* MariaDB [test]> *CREATE TABLE `table_compressed` (* * `id` int(11) NOT NULL DEFAULT 0,* * `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * `data` blob COMPRESSED NOT NULL DEFAULT '',* * PRIMARY KEY (`id`,`gmtime`)* *) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.385 sec)* MariaDB [test]> INSERT INTO *table_without_compression VALUES (1,0,repeat('a',65535));* *Query OK, 1 row affected (0.135 sec)* MariaDB [test]> * INSERT INTO table_without_compression VALUES (1,1,repeat('a',65536));* *ERROR 1406 (22001): Data too long for column 'data' at row 1 <- JUST TO CHECK IF WE GOT MAX SIZE OF BLOB COLUMN, maybe that's the problem, internally compression consider the from/to column instead of a bigger column (long blob for example when using blob/longblob, i will test with TEXT/LONG TEXT and VARCHAR too)* MariaDB [test]> *INSERT INTO table_compressed SELECT * FROM table_without_compression;* *ERROR 1406 (22001): Data too long for column 'data' at row 1 <- that's the error* Should I include at Jira?
Same to TEXT MariaDB [test]> *CREATE TABLE `table_compressed2` (* * -> `id` int(11) NOT NULL DEFAULT 0,* * -> `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * -> `data` text COMPRESSED NOT NULL DEFAULT '',* * -> PRIMARY KEY (`id`,`gmtime`)* * -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.231 sec)* MariaDB [test]> *CREATE TABLE `table_without_compression2` (* * -> `id` int(11) NOT NULL DEFAULT 0,* * -> `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * -> `data` text NOT NULL,* * -> PRIMARY KEY (`id`,`gmtime`)* * -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.253 sec)* MariaDB [test]> *INSERT INTO table_without_compression2 VALUES (1,0,repeat('a',65535));* *Query OK, 1 row affected (0.061 sec)* MariaDB [test]> *INSERT INTO table_without_compression2 VALUES (1,0,repeat('a',65536));* *ERROR 1406 (22001): Data too long for column 'data' at row 1 <- JUST TO CHECK IF WE GOT MAX SIZE OF BLOB COLUMN* MariaDB [test]> *INSERT INTO table_compressed2 SELECT * FROM table_without_compression2;* *ERROR 1406 (22001): Data too long for column 'data' at row 1*
mixing data types: MariaDB [test]> INSERT INTO table_compressed SELECT * FROM table_without_compression2; ERROR 1406 (22001): Data too long for column 'data' at row 1 MariaDB [test]> INSERT INTO table_compressed2 SELECT * FROM table_without_compression; ERROR 1406 (22001): Data too long for column 'data' at row 1 ---- creating now a LONGBLOB / LONGTEXT MariaDB [test]> *CREATE TABLE table_compressed_long_blob (* * -> `id` int(11) NOT NULL DEFAULT 0,* * -> `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * -> `data` LONGBLOB COMPRESSED NOT NULL,* * -> PRIMARY KEY (`id`,`gmtime`)* * -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.141 sec)* MariaDB [test]> *CREATE TABLE table_compressed_long_text (* * -> `id` int(11) NOT NULL DEFAULT 0,* * -> `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * -> `data` LONGTEXT COMPRESSED NOT NULL,* * -> PRIMARY KEY (`id`,`gmtime`)* * -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.161 sec)* MariaDB [test]> *INSERT INTO table_compressed_long_blob SELECT * FROM table_without_compression2;* *Query OK, 1 row affected (0.045 sec) <- NICE! * *Records: 1 Duplicates: 0 Warnings: 0* MariaDB [test]> *INSERT INTO table_compressed_long_text SELECT * FROM table_without_compression2;* *Query OK, 1 row affected (0.009 sec) <- NICE!* *Records: 1 Duplicates: 0 Warnings: 0*
at git commit: https://github.com/MariaDB/server/commit/fdc47792354c820aa4a8542d7c00d434424... instead of: + return n < 256 ? 1 : n < 65536 ? 2 : n < 16777216 ? 3 : 4; use: + return n < 256*-1* ? 1 : n < 65536*-1* ? 2 : n < 16777216*-1* ? 3 : 4; since length is reduced by one: https://github.com/MariaDB/server/commit/fdc47792354c820aa4a8542d7c00d434424...
Roberto, This error comes from Field_longstr::compress(), where we perform truncation if *to_length <= length. Regards, Sergey On Sat, Mar 17, 2018 at 09:31:29AM -0300, Roberto Spadim wrote:
at git commit: https://github.com/MariaDB/server/commit/fdc47792354c820aa4a8542d7c00d434424...
instead of: + return n < 256 ? 1 : n < 65536 ? 2 : n < 16777216 ? 3 : 4;
use: + return n < 256*-1* ? 1 : n < 65536*-1* ? 2 : n < 16777216*-1* ? 3 : 4;
since length is reduced by one: https://github.com/MariaDB/server/commit/fdc47792354c820aa4a8542d7c00d434424...
Roberto, If we simplify this even further, we'll end up with something like: CREATE TABLE t1(a BLOB COMPRESSED); INSERT INTO t1 VALUES(REPEAT('a', 65535)); We reserver 1 byte for compressed blobs to be able to store compression metadata. Unless I miss something, this is only required when we need to store data uncompressed, which is edge case. In normal case we could probably remove this restriction. So this is pretty valid bug report. If there was some other restriction, we should definitely document it. Regards, Sergey On Sat, Mar 17, 2018 at 08:43:36AM -0300, Roberto Spadim wrote:
Hi Sergey! I'm using a Azure VM (ubuntu), mariadb ubuntu package Server version: 10.3.5-MariaDB-10.3.5+maria~xenial mariadb.org binary distribution
MariaDB [test]> *CREATE TABLE `table_without_compression` (* * `id` int(11) NOT NULL DEFAULT 0,* * `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * `data` blob NOT NULL,* * PRIMARY KEY (`id`,`gmtime`)* *) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.385 sec)*
MariaDB [test]> *CREATE TABLE `table_compressed` (* * `id` int(11) NOT NULL DEFAULT 0,* * `gmtime` decimal(22,7) NOT NULL DEFAULT 0.0000000,* * `data` blob COMPRESSED NOT NULL DEFAULT '',* * PRIMARY KEY (`id`,`gmtime`)* *) ENGINE=InnoDB DEFAULT CHARSET=latin1;* *Query OK, 0 rows affected (0.385 sec)*
MariaDB [test]> INSERT INTO *table_without_compression VALUES (1,0,repeat('a',65535));* *Query OK, 1 row affected (0.135 sec)*
MariaDB [test]> * INSERT INTO table_without_compression VALUES (1,1,repeat('a',65536));* *ERROR 1406 (22001): Data too long for column 'data' at row 1 <- JUST TO CHECK IF WE GOT MAX SIZE OF BLOB COLUMN, maybe that's the problem, internally compression consider the from/to column instead of a bigger column (long blob for example when using blob/longblob, i will test with TEXT/LONG TEXT and VARCHAR too)*
MariaDB [test]> *INSERT INTO table_compressed SELECT * FROM table_without_compression;* *ERROR 1406 (22001): Data too long for column 'data' at row 1 <- that's the error*
Should I include at Jira?
participants (2)
-
Roberto Spadim
-
Sergey Vojtovich