Hello Sergei, Thanks for your review. I've pushed a new patch version to bb-10.7-bar-uuid. Please see comments inline: On 10/20/21 2:48 AM, Sergei Golubchik wrote:
20000000-0000-0000-0000-000000000002 +10000000-0000-0000-0000-000000000003 20000000-0000-0000-0000-000000000003
please, add a test for `ORDER BY CONCAT(a)` just to test the lexicographical ordering of UUIDs. Or may be `ORDER BY HEX(a)` ? Or `UNHEX` ? Whatever the recommended approach should be.
I propose to recommend using CAST(uuid AS BINARY(16)) for lexicographical order. This CAST only performs byte reordering (conversion from the in-record to in-memory format) While CONCAT and HEX additionally involve BINARY(16) -> VARCHAR(32) conversion through UUID::to_string(). The former should be faster.
DROP TABLE t1; # diff --git a/plugin/type_uuid/mysql-test/type_uuid/type_uuid_memory.result b/plugin/type_uuid/mysql-test/type_uuid/type_uuid_memory.result index 42bb74d4b01..eb7a51895b9 100644 --- a/plugin/type_uuid/mysql-test/type_uuid/type_uuid_memory.result +++ b/plugin/type_uuid/mysql-test/type_uuid/type_uuid_memory.result @@ -25,7 +25,7 @@ a 00000000-0000-0000-0000-0000000000ff EXPLAIN SELECT * FROM t1 WHERE a='00000000-0000-0000-0000-0000000000ff'; id select_type table type possible_keys key key_len ref rows Extra -1 SIMPLE t1 ref a a 17 const 2 Using where +1 SIMPLE t1 ref a a 17 const 4 Using where
why?
Hash collision now happens differently, this affects the optimizer estimation for certain values. Note, UUID::hash_record() is NOT used by the HEAP engine, as it might seem. UUID::hash_record() is only used only by partitioning. So partitioning distribution has not changed, because Field_fbt::hash() uses UUID::hash_record(), which hashes bytes in the "user visible" (in-memory) order. I made this for partition distribution compatibility with the BINARY(16) data type. But HEAP distribution has changed, because HEAP still uses my_charset_bin.hash_sort() directly on the entire 16 byte in-record value, which is now different. We could preserve the old HEAP distribution by introducing a special CHARSET_INFO based on my_charset_bin, but overriding hash_sort(). But I think it's not really necessary. I checked on a bigger table, and it looks fine. For some constants it returns "2 rows". For some constants it returns "4 rows". So it did not change to worse. Just a coincidence.
diff --git a/plugin/type_uuid/sql_type_uuid.h b/plugin/type_uuid/sql_type_uuid.h index 9076c874e98..be9fea8ebc9 100644 --- a/plugin/type_uuid/sql_type_uuid.h +++ b/plugin/type_uuid/sql_type_uuid.h @@ -24,8 +24,162 @@ class UUID: public FixedBinTypeStorage<MY_UUID_SIZE, MY_UUID_STRING_LENGTH> ... + // Compare two in-memory values + static int cmp(const LEX_CSTRING &a, const LEX_CSTRING &b) + {
this is way more LoC that I would've used. but ok, it's your code and it's a plugin, so as you like
diff --git a/sql/sql_string.h b/sql/sql_string.h index fe57c8153bb..795f80c3e08 100644 --- a/sql/sql_string.h +++ b/sql/sql_string.h @@ -484,6 +484,11 @@ class Binary_string: public Sql_alloc if (str.Alloced_length) Alloced_length= (uint32) (str.Alloced_length - offset); } + LEX_CSTRING to_lex_cstring() const + { + LEX_CSTRING tmp= {Ptr, str_length}; + return tmp; + } inline LEX_CSTRING *get_value(LEX_CSTRING *res)
may be you could remove get_value()? In a separate commit. the name is quite bad and the method makes rather little sense, especially if there's to_lex_cstring() now.
I am all for it. Hope Monty won't mind - he added get_value().
{ res->str= Ptr; diff --git a/sql/sql_type_fixedbin.h b/sql/sql_type_fixedbin.h index c6e3d20bcfa..5141cb9fad4 100644 --- a/sql/sql_type_fixedbin.h +++ b/sql/sql_type_fixedbin.h @@ -154,18 +160,13 @@ class FixedBinTypeBundle FbtImpl::max_char_length()+1)); return false; } - int cmp(const char *str, size_t length) const - { - DBUG_ASSERT(length == sizeof(m_buffer)); - return memcmp(m_buffer, str, length); - } int cmp(const Binary_string &other) const
what is this one used for?
It's just a convenience wrapper. It is used only in stored_field_cmp_to_item() for now.
+ static ulong KEY_pack_flags(uint column_nr) + { + /* + Return zero by default. A particular data type can override + this method return some flags, e.g. HA_PACK_KEY to enable + key prefix compression.
what if you change it later, will it make old tables corrupted?
I did not check. This method is used only in mysql_prepare_create_table(). Then flags are stored inside FRM. In theory the change should not make old tables corrupted. Should I try with different engines? Your question made me also think what should be the default. Probably we should do it the other way around: - enable compression by default - disable compression in INET6 What do you think?
+ */ + return 0; + }
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org