Re: [Maria-developers] 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
Hi, Nikita! On Nov 15, Nikita Malyavin wrote:
revision-id: 6ac19d09c66 (mariadb-10.4.4-280-g6ac19d09c66) parent(s): 67ddb6507d5 author: Nikita Malyavin <nikitamalyavin@gmail.com> committer: Nikita Malyavin <nikitamalyavin@gmail.com> timestamp: 2019-08-20 17:56:07 +1000 message:
MDEV-16978 Application-time periods: WITHOUT OVERLAPS
* The overlaps check is implemented on a handler level per row command. It creates a separate cursor (actually, another handler instance) and caches it inside the original handler, when ha_update_row or ha_insert_row is issued. Cursor closes on unlocking the handler.
* Containing the same key in index means unique constraint violation even in usual terms. So we fetch left and right neighbours and check that they have same key prefix, excluding from the key only the period part. If it doesnt match, then there's no such neighbour, and the check passes. Otherwise, we check if this neighbour intersects with the considered key.
* the check does introduce new error and fails with ER_DUPP_KEY error. This might break REPLACE workflow and should be fixed separately
diff --git a/sql/handler.cc b/sql/handler.cc index 94cffd69b75..560f6316602 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -6913,6 +6928,130 @@ void handler::set_lock_type(enum thr_lock_type lock) table->reginfo.lock_type= lock; }
+int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data) +{ + DBUG_ASSERT(new_data); + if (!table_share->period.unique_keys) + return 0; + if (table->versioned() && !table->vers_end_field()->is_max()) + return 0; + + bool is_update= old_data != NULL; + if (!check_overlaps_buffer) + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root, + table_share->max_unique_length + + table_share->reclength); + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length; + auto *handler= this; + if (handler->inited != NONE) + { + if (!check_overlaps_handler) + { + check_overlaps_handler= clone(table_share->normalized_path.str, + &table_share->mem_root);
wouldn't it be simpler and cheaper to use HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS, as we've discussed?
+ int error= -1; + if (check_overlaps_handler != NULL) + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK); + if (error) + return error; + } + handler= check_overlaps_handler; + }
why "if (handler->inited != NONE)" ? What happens if it is NONE?
+ + for (uint key_nr= 0; key_nr < table_share->keys; key_nr++) + { + const KEY *key_info= table->key_info + key_nr; + const uint key_parts= key_info->user_defined_key_parts; + if (!key_info->without_overlaps) + continue; + + key_copy(check_overlaps_buffer, new_data, key_info, 0); + if (is_update) + { + bool key_used= false; + for (uint k= 0; k < key_parts && !key_used; k++) + key_used= bitmap_is_set(table->write_set, + key_info->key_part[k].fieldnr - 1); + if (!key_used) + continue; + }
Good.
+ + int error= handler->ha_index_init(key_nr, 0); + if (error) + return error; + + for (int run= 0; run < 2; run++) + { + if (run == 0) + { + error = handler->ha_index_read_map(record_buffer, + check_overlaps_buffer, + key_part_map((1 << key_parts) - 1), + HA_READ_KEY_OR_PREV); + if (error == HA_ERR_KEY_NOT_FOUND) + continue; + } + else + { + error = handler->ha_index_next(record_buffer); + if (error == HA_ERR_END_OF_FILE) + continue; + } + + if (error) + { + handler->ha_index_end(); + return error; + }
I found this "for (int run= 0; run < 2; run++)" rather confusing. Particularly as you have an if() to have different code for the first and the second runs. it would be clearer to unroll the loop and move the common code (period comparison) into a function. Like handler->ha_index_read_map(...); compare_periods(); handler->ha_index_next(); compare_periods(); But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating). Also, why do you store both period ends in the index? It seems like it'd be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
+ + /* In case of update it could appear that the nearest neighbour is + * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer + table->s->null_bytes, + table->s->reclength - table->s->null_bytes) == 0) + { + continue; + }
I'd rather compare row positions here.
+ + uint period_key_part_nr= key_parts - 2; + int cmp_res= 0; + for (uint part_nr= 0; !cmp_res && part_nr < period_key_part_nr; part_nr++) + { + Field *f= key_info->key_part[part_nr].field; + cmp_res= f->cmp(f->ptr_in_record(new_data), + f->ptr_in_record(record_buffer)); + } + if (cmp_res) + continue; /* key is different => no overlaps */ + + int period_cmp[2][2]= {/* l1 > l2, l1 > r2, r1 > l2, r1 > r2 */}; + for (int i= 0; i < 2; i++) + { + for (int j= 0; j < 2; j++) + { + Field *lhs= key_info->key_part[period_key_part_nr + i].field; + Field *rhs= key_info->key_part[period_key_part_nr + j].field; + + period_cmp[i][j]= lhs->cmp(lhs->ptr_in_record(new_data), + rhs->ptr_in_record(record_buffer)); + } + }
this can be simplified too if you'll change the code as I suggested above to do only one index read - you'll only need one comparison, instead of four.
+ + if ((period_cmp[0][0] <= 0 && period_cmp[1][0] > 0) + || (period_cmp[0][0] >= 0 && period_cmp[0][1] < 0)) + { + handler->ha_index_end(); + return HA_ERR_FOUND_DUPP_KEY; + } + } + error= handler->ha_index_end(); + if (error) + return error; + } + return 0; +} + #ifdef WITH_WSREP /** @details diff --git a/sql/handler.h b/sql/handler.h index 63d0bf2215c..71debf9bab7 100644 --- a/sql/handler.h +++ b/sql/handler.h @@ -3237,6 +3243,8 @@ class handler :public Sql_alloc DBUG_ASSERT(cached_table_flags < (HA_LAST_TABLE_FLAG << 1)); return cached_table_flags; } + /** PRIMARY KEY WITHOUT OVERLAPS check is done globally */
what do you mean "globally"?
+ int ha_check_overlaps(const uchar *old_data, const uchar* new_data); /** These functions represent the public interface to *users* of the handler class, hence they are *not* virtual. For the inheritance diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 9bb1d98152b..5aba86003c6 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -4549,25 +4552,57 @@ static bool vers_prepare_keys(THD *thd, HA_CREATE_INFO *create_info, if (key->type != Key::PRIMARY && key->type != Key::UNIQUE) continue;
- Key_part_spec *key_part= NULL; - List_iterator<Key_part_spec> part_it(key->columns); - while ((key_part=part_it++)) + if (create_info->versioned()) { - if (!my_strcasecmp(system_charset_info, - row_start_field, - key_part->field_name.str) || - - !my_strcasecmp(system_charset_info, - row_end_field, - key_part->field_name.str)) - break; + Key_part_spec *key_part=NULL; + List_iterator<Key_part_spec> part_it(key->columns); + while ((key_part=part_it++)) + { + if (row_start_field.streq(key_part->field_name) || + row_end_field.streq(key_part->field_name)) + break; + } + if (!key_part) + key->columns.push_back(new Key_part_spec(&row_end_field, 0)); } - if (key_part) - continue; // Key already contains Sys_start or Sys_end + }
- Key_part_spec *key_part_sys_end_col= - new (thd->mem_root) Key_part_spec(&create_info->vers_info.as_row.end, 0); - key->columns.push_back(key_part_sys_end_col); + key_it.rewind(); + while ((key=key_it++)) + { + if (key->without_overlaps) + { + if (key->type != Key::PRIMARY && key->type != Key::UNIQUE) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_NON_UNIQUE, MYF(0), key->period.str); + return true; + } + if (!create_info->period_info.is_set() + || !key->period.streq(create_info->period_info.name)) + { + my_error(ER_PERIOD_NOT_FOUND, MYF(0), key->period.str); + return true; + } + if (thd->work_part_info) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED, MYF(0));
why?
+ return true; + } + const auto &period_start= create_info->period_info.period.start; + const auto &period_end= create_info->period_info.period.end; + List_iterator<Key_part_spec> part_it(key->columns); + while (Key_part_spec *key_part= part_it++) + { + if (period_start.streq(key_part->field_name) + || period_end.streq(key_part->field_name)) + { + my_error(ER_KEY_CONTAINS_PERIOD_FIELDS, MYF(0), key->name.str); + return true; + } + } + key->columns.push_back(new Key_part_spec(&period_start, 0)); + key->columns.push_back(new Key_part_spec(&period_end, 0)); + } }
return false; diff --git a/sql/table.h b/sql/table.h index 2b866159fe0..6511960488e 100644 --- a/sql/table.h +++ b/sql/table.h @@ -1730,10 +1731,18 @@ class IS_table_read_plan;
/** number of bytes used by field positional indexes in frm */ constexpr uint frm_fieldno_size= 2; +/** number of bytes used by key position number in frm */ +constexpr uint frm_keyno_size= 2; static inline uint16 read_frm_fieldno(const uchar *data) { return uint2korr(data); } -static inline void store_frm_fieldno(const uchar *data, uint16 fieldno) +static inline void store_frm_fieldno(uchar *data, uint16 fieldno) { int2store(data, fieldno); } +static inline uint16 read_frm_keyno(const uchar *data) +{ return uint2korr(data); } +static inline void store_frm_keyno(uchar *data, uint16 fieldno) +{ int2store(data, fieldno); } +static inline size_t frm_ident_stored_size(size_t len) +{ return len + (len > 255 ? 3 : 1); }
this is exactly extra2_str_size() function. Don't duplicate it, reuse. (and rename, if you'd like a more generic name)
class select_unit; class TMP_TABLE_PARAM; diff --git a/sql/unireg.cc b/sql/unireg.cc index 7130b3e5d8a..1e95242786e 100644 --- a/sql/unireg.cc +++ b/sql/unireg.cc @@ -485,6 +486,16 @@ LEX_CUSTRING build_frm_image(THD *thd, const LEX_CSTRING &table, store_frm_fieldno(pos, get_fieldno_by_name(create_info, create_fields, create_info->period_info.period.end)); pos+= frm_fieldno_size; + store_frm_keyno(pos, create_info->period_info.unique_keys); + pos+= frm_keyno_size; + for (uint key= 0; key < keys; key++) + { + if (key_info[key].without_overlaps) + { + store_frm_keyno(pos, key); + pos+= frm_keyno_size; + } + }
This will make 10.5 frms look corrupted in 10.4. Better use a new EXTRA2 flag for that. With a number > EXTRA2_ENGINE_IMPORTANT. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hello, Sergei! I have updated the code, please see - bb-10.5-MDEV-16978-without-overlaps <https://github.com/MariaDB/server/compare/bb-10.5-MDEV-16978-without-overlaps> branch. Among the changes I, having noticed that innodb combination is missing, have added it, and some merging down from my INSERT/REPLACE patch was required to make it run successfully. Below are the rest comments: On Tue, 19 Nov 2019 at 15:45, Sergei Golubchik <serg@mariadb.org> wrote:
diff --git a/sql/handler.cc b/sql/handler.cc index 94cffd69b75..560f6316602 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -6913,6 +6928,130 @@ void handler::set_lock_type(enum thr_lock_type lock) table->reginfo.lock_type= lock; }
+int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data) +{ + DBUG_ASSERT(new_data); + if (!table_share->period.unique_keys) + return 0; + if (table->versioned() && !table->vers_end_field()->is_max()) + return 0; + + bool is_update= old_data != NULL; + if (!check_overlaps_buffer) + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root, + table_share->max_unique_length + + table_share->reclength); + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length; + auto *handler= this; + if (handler->inited != NONE) + { + if (!check_overlaps_handler) + { + check_overlaps_handler= clone(table_share->normalized_path.str, + &table_share->mem_root);
wouldn't it be simpler and cheaper to use HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS, as we've discussed?
Can it stack (like, what'd be if HA_EXTRA_REMEMBER_POS is called several times)? I guess it can't, and somebody on upper layers could have already used it. This is for example done in TABLE::delete_row(). I see that you've removed it in latest 10.4+, but anyway, I think it's better to avoid remembering on lower layers
+ int error= -1; + if (check_overlaps_handler != NULL) + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK); + if (error) + return error; + } + handler= check_overlaps_handler; + }
why "if (handler->inited != NONE)" ? What happens if it is NONE?
then `this` is used
+ + for (uint key_nr= 0; key_nr < table_share->keys; key_nr++) + { + const KEY *key_info= table->key_info + key_nr; + const uint key_parts= key_info->user_defined_key_parts; + if (!key_info->without_overlaps) + continue; + + key_copy(check_overlaps_buffer, new_data, key_info, 0); + if (is_update) + { + bool key_used= false; + for (uint k= 0; k < key_parts && !key_used; k++) + key_used= bitmap_is_set(table->write_set, + key_info->key_part[k].fieldnr - 1); + if (!key_used) + continue; + }
Good.
That's midenok's idea:)
+ + int error= handler->ha_index_init(key_nr, 0); + if (error) + return error; + + for (int run= 0; run < 2; run++) + { + if (run == 0) + { + error = handler->ha_index_read_map(record_buffer, + check_overlaps_buffer, + key_part_map((1 << key_parts) - 1), + HA_READ_KEY_OR_PREV); + if (error == HA_ERR_KEY_NOT_FOUND) + continue; + } + else + { + error = handler->ha_index_next(record_buffer); + if (error == HA_ERR_END_OF_FILE) + continue; + } + + if (error) + { + handler->ha_index_end(); + return error; + }
I found this "for (int run= 0; run < 2; run++)" rather confusing. Particularly as you have an if() to have different code for the first and the second runs.
it would be clearer to unroll the loop and move the common code (period comparison) into a function. Like
handler->ha_index_read_map(...); compare_periods(); handler->ha_index_next(); compare_periods();
Have rewritten it that way, looks really better👍
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start *=* _period end_ of the new row, you should write even more checks, and then move cursor left. So even more amount of effort is required, but the advantage on the other hand is that moving the cursor will happen much less frequent. Also, why do you store both period ends in the index? It seems like it'd
be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I agree that using keyreads can be much more efficient. However, all of my latest code, namely FKs and REPLACE/IODKU, are not based on keyreads, and rewriting it now will require significant effort. Let's make it a separate task, maybe?
+ /* In case of update it could appear that the nearest neighbour is + * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer + table->s->null_bytes, + table->s->reclength -
+ table->s->null_bytes) == 0)
+ { + continue; + }
I'd rather compare row positions here.
What do you mean by that?
+ + uint period_key_part_nr= key_parts - 2; + int cmp_res= 0; + for (uint part_nr= 0; !cmp_res && part_nr < period_key_part_nr; part_nr++) + { + Field *f= key_info->key_part[part_nr].field; + cmp_res= f->cmp(f->ptr_in_record(new_data), + f->ptr_in_record(record_buffer)); + } + if (cmp_res) + continue; /* key is different => no overlaps */ + + int period_cmp[2][2]= {/* l1 > l2, l1 > r2, r1 > l2, r1 > r2 */}; + for (int i= 0; i < 2; i++) + { + for (int j= 0; j < 2; j++) + { + Field *lhs= key_info->key_part[period_key_part_nr + i].field; + Field *rhs= key_info->key_part[period_key_part_nr + j].field; + + period_cmp[i][j]= lhs->cmp(lhs->ptr_in_record(new_data), + rhs->ptr_in_record(record_buffer)); + } + }
this can be simplified too if you'll change the code as I suggested above to do only one index read - you'll only need one comparison, instead of four.
I have already moved it in a separate function in FK and REPLACE code. Have merged it down to WITHOUT OVERLAPS branch now.
+ + if ((period_cmp[0][0] <= 0 && period_cmp[1][0] > 0) + || (period_cmp[0][0] >= 0 && period_cmp[0][1] < 0)) + { + handler->ha_index_end(); + return HA_ERR_FOUND_DUPP_KEY; + } + } + error= handler->ha_index_end(); + if (error) + return error; + } + return 0; +} + #ifdef WITH_WSREP /** @details diff --git a/sql/handler.h b/sql/handler.h index 63d0bf2215c..71debf9bab7 100644 --- a/sql/handler.h +++ b/sql/handler.h @@ -3237,6 +3243,8 @@ class handler :public Sql_alloc DBUG_ASSERT(cached_table_flags < (HA_LAST_TABLE_FLAG << 1)); return cached_table_flags; } + /** PRIMARY KEY WITHOUT OVERLAPS check is done globally */
what do you mean "globally"?
Yeah, looks weird. Updated
+ int ha_check_overlaps(const uchar *old_data, const uchar* new_data);
/** These functions represent the public interface to *users* of the handler class, hence they are *not* virtual. For the inheritance diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 9bb1d98152b..5aba86003c6 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -4549,25 +4552,57 @@ static bool vers_prepare_keys(THD *thd, HA_CREATE_INFO *create_info, if (key->type != Key::PRIMARY && key->type != Key::UNIQUE) continue;
- Key_part_spec *key_part= NULL; - List_iterator<Key_part_spec> part_it(key->columns); - while ((key_part=part_it++)) + if (create_info->versioned()) { - if (!my_strcasecmp(system_charset_info, - row_start_field, - key_part->field_name.str) || - - !my_strcasecmp(system_charset_info, - row_end_field, - key_part->field_name.str)) - break; + Key_part_spec *key_part=NULL; + List_iterator<Key_part_spec> part_it(key->columns); + while ((key_part=part_it++)) + { + if (row_start_field.streq(key_part->field_name) || + row_end_field.streq(key_part->field_name)) + break; + } + if (!key_part) + key->columns.push_back(new Key_part_spec(&row_end_field, 0)); } - if (key_part) - continue; // Key already contains Sys_start or Sys_end + }
- Key_part_spec *key_part_sys_end_col= - new (thd->mem_root) Key_part_spec(&create_info->vers_info.as_row.end, 0); - key->columns.push_back(key_part_sys_end_col); + key_it.rewind(); + while ((key=key_it++)) + { + if (key->without_overlaps) + { + if (key->type != Key::PRIMARY && key->type != Key::UNIQUE) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_NON_UNIQUE, MYF(0), key->period.str); + return true; + } + if (!create_info->period_info.is_set() + || !key->period.streq(create_info->period_info.name)) + { + my_error(ER_PERIOD_NOT_FOUND, MYF(0), key->period.str); + return true; + } + if (thd->work_part_info) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED, MYF(0));
why?
Unfortunately partitions do not support searching upper/lower bounds (i.e. KEY_OR_PREV, KEY_OR_NEXT). But I think it is doable. For now it just doesn't work.
+ return true; + } + const auto &period_start= create_info->period_info.period.start; + const auto &period_end= create_info->period_info.period.end; + List_iterator<Key_part_spec> part_it(key->columns); + while (Key_part_spec *key_part= part_it++) + { + if (period_start.streq(key_part->field_name) + || period_end.streq(key_part->field_name)) + { + my_error(ER_KEY_CONTAINS_PERIOD_FIELDS, MYF(0), key->name.str); + return true; + } + } + key->columns.push_back(new Key_part_spec(&period_start, 0)); + key->columns.push_back(new Key_part_spec(&period_end, 0)); + } }
return false; diff --git a/sql/table.h b/sql/table.h index 2b866159fe0..6511960488e 100644 --- a/sql/table.h +++ b/sql/table.h @@ -1730,10 +1731,18 @@ class IS_table_read_plan;
/** number of bytes used by field positional indexes in frm */ constexpr uint frm_fieldno_size= 2; +/** number of bytes used by key position number in frm */ +constexpr uint frm_keyno_size= 2; static inline uint16 read_frm_fieldno(const uchar *data) { return uint2korr(data); } -static inline void store_frm_fieldno(const uchar *data, uint16 fieldno) +static inline void store_frm_fieldno(uchar *data, uint16 fieldno) { int2store(data, fieldno); } +static inline uint16 read_frm_keyno(const uchar *data) +{ return uint2korr(data); } +static inline void store_frm_keyno(uchar *data, uint16 fieldno) +{ int2store(data, fieldno); } +static inline size_t frm_ident_stored_size(size_t len) +{ return len + (len > 255 ? 3 : 1); }
this is exactly extra2_str_size() function. Don't duplicate it, reuse. (and rename, if you'd like a more generic name)
thanks, replaced with extra2_str_size()
class select_unit; class TMP_TABLE_PARAM; diff --git a/sql/unireg.cc b/sql/unireg.cc index 7130b3e5d8a..1e95242786e 100644 --- a/sql/unireg.cc +++ b/sql/unireg.cc @@ -485,6 +486,16 @@ LEX_CUSTRING build_frm_image(THD *thd, const
LEX_CSTRING &table,
store_frm_fieldno(pos, get_fieldno_by_name(create_info,
create_fields,
create_info->period_info.period.end));
pos+= frm_fieldno_size; + store_frm_keyno(pos, create_info->period_info.unique_keys); + pos+= frm_keyno_size; + for (uint key= 0; key < keys; key++) + { + if (key_info[key].without_overlaps) + { + store_frm_keyno(pos, key); + pos+= frm_keyno_size; + } + }
This will make 10.5 frms look corrupted in 10.4. Better use a new EXTRA2 flag for that. With a number > EXTRA2_ENGINE_IMPORTANT.
Another way is to update 10.4 code to handle additional entry, but ok,
this way is same good for me Regards,
Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
One more update: I have replaced `goto err` with `DBUG_RETURN(err())` everywhere in TABLE_SHARE::init_from_binary_frm_image, making err a callback with a closure. Making err a function helps to debug error handling a lot, so I think it's one of the good possible styles to move to. This commit is independent from the feature and can be pushed separately (all the others are just fixups) On Tue, 26 Nov 2019 at 19:28, Nikita Malyavin <nikitamalyavin@gmail.com> wrote:
Hello, Sergei! I have updated the code, please see
- bb-10.5-MDEV-16978-without-overlaps <https://github.com/MariaDB/server/compare/bb-10.5-MDEV-16978-without-overlaps>
branch.
Among the changes I, having noticed that innodb combination is missing, have added it, and some merging down from my INSERT/REPLACE patch was required to make it run successfully.
Below are the rest comments:
On Tue, 19 Nov 2019 at 15:45, Sergei Golubchik <serg@mariadb.org> wrote:
diff --git a/sql/handler.cc b/sql/handler.cc index 94cffd69b75..560f6316602 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -6913,6 +6928,130 @@ void handler::set_lock_type(enum thr_lock_type lock) table->reginfo.lock_type= lock; }
+int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data) +{ + DBUG_ASSERT(new_data); + if (!table_share->period.unique_keys) + return 0; + if (table->versioned() && !table->vers_end_field()->is_max()) + return 0; + + bool is_update= old_data != NULL; + if (!check_overlaps_buffer) + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root, + table_share->max_unique_length + + table_share->reclength); + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length; + auto *handler= this; + if (handler->inited != NONE) + { + if (!check_overlaps_handler) + { + check_overlaps_handler= clone(table_share->normalized_path.str, + &table_share->mem_root);
wouldn't it be simpler and cheaper to use HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS, as we've discussed?
Can it stack (like, what'd be if HA_EXTRA_REMEMBER_POS is called several times)? I guess it can't, and somebody on upper layers could have already used it. This is for example done in TABLE::delete_row(). I see that you've removed it in latest 10.4+, but anyway, I think it's better to avoid remembering on lower layers
+ int error= -1; + if (check_overlaps_handler != NULL) + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK); + if (error) + return error; + } + handler= check_overlaps_handler; + }
why "if (handler->inited != NONE)" ? What happens if it is NONE?
then `this` is used
+ + for (uint key_nr= 0; key_nr < table_share->keys; key_nr++) + { + const KEY *key_info= table->key_info + key_nr; + const uint key_parts= key_info->user_defined_key_parts; + if (!key_info->without_overlaps) + continue; + + key_copy(check_overlaps_buffer, new_data, key_info, 0); + if (is_update) + { + bool key_used= false; + for (uint k= 0; k < key_parts && !key_used; k++) + key_used= bitmap_is_set(table->write_set, + key_info->key_part[k].fieldnr - 1); + if (!key_used) + continue; + }
Good.
That's midenok's idea:)
+ + int error= handler->ha_index_init(key_nr, 0); + if (error) + return error; + + for (int run= 0; run < 2; run++) + { + if (run == 0) + { + error = handler->ha_index_read_map(record_buffer, + check_overlaps_buffer, + key_part_map((1 << key_parts) - 1), + HA_READ_KEY_OR_PREV); + if (error == HA_ERR_KEY_NOT_FOUND) + continue; + } + else + { + error = handler->ha_index_next(record_buffer); + if (error == HA_ERR_END_OF_FILE) + continue; + } + + if (error) + { + handler->ha_index_end(); + return error; + }
I found this "for (int run= 0; run < 2; run++)" rather confusing. Particularly as you have an if() to have different code for the first and the second runs.
it would be clearer to unroll the loop and move the common code (period comparison) into a function. Like
handler->ha_index_read_map(...); compare_periods(); handler->ha_index_next(); compare_periods();
Have rewritten it that way, looks really better👍
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start *=* _period end_ of the new row, you should write even more checks, and then move cursor left. So even more amount of effort is required, but the advantage on the other hand is that moving the cursor will happen much less frequent.
Also, why do you store both period ends in the index? It seems like it'd
be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I agree that using keyreads can be much more efficient. However, all of my latest code, namely FKs and REPLACE/IODKU, are not based on keyreads, and rewriting it now will require significant effort. Let's make it a separate task, maybe?
+ /* In case of update it could appear that the nearest neighbour is + * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer + table->s->null_bytes, + table->s->reclength -
+ table->s->null_bytes) == 0)
+ { + continue; + }
I'd rather compare row positions here.
What do you mean by that?
+ + uint period_key_part_nr= key_parts - 2; + int cmp_res= 0; + for (uint part_nr= 0; !cmp_res && part_nr < period_key_part_nr; part_nr++) + { + Field *f= key_info->key_part[part_nr].field; + cmp_res= f->cmp(f->ptr_in_record(new_data), + f->ptr_in_record(record_buffer)); + } + if (cmp_res) + continue; /* key is different => no overlaps */ + + int period_cmp[2][2]= {/* l1 > l2, l1 > r2, r1 > l2, r1 > r2 */}; + for (int i= 0; i < 2; i++) + { + for (int j= 0; j < 2; j++) + { + Field *lhs= key_info->key_part[period_key_part_nr + i].field; + Field *rhs= key_info->key_part[period_key_part_nr + j].field; + + period_cmp[i][j]= lhs->cmp(lhs->ptr_in_record(new_data), + rhs->ptr_in_record(record_buffer)); + } + }
this can be simplified too if you'll change the code as I suggested above to do only one index read - you'll only need one comparison, instead of four.
I have already moved it in a separate function in FK and REPLACE code. Have merged it down to WITHOUT OVERLAPS branch now.
+ + if ((period_cmp[0][0] <= 0 && period_cmp[1][0] > 0) + || (period_cmp[0][0] >= 0 && period_cmp[0][1] < 0)) + { + handler->ha_index_end(); + return HA_ERR_FOUND_DUPP_KEY; + } + } + error= handler->ha_index_end(); + if (error) + return error; + } + return 0; +} + #ifdef WITH_WSREP /** @details diff --git a/sql/handler.h b/sql/handler.h index 63d0bf2215c..71debf9bab7 100644 --- a/sql/handler.h +++ b/sql/handler.h @@ -3237,6 +3243,8 @@ class handler :public Sql_alloc DBUG_ASSERT(cached_table_flags < (HA_LAST_TABLE_FLAG << 1)); return cached_table_flags; } + /** PRIMARY KEY WITHOUT OVERLAPS check is done globally */
what do you mean "globally"?
Yeah, looks weird. Updated
+ int ha_check_overlaps(const uchar *old_data, const uchar* new_data);
/** These functions represent the public interface to *users* of the handler class, hence they are *not* virtual. For the inheritance diff --git a/sql/sql_table.cc b/sql/sql_table.cc index 9bb1d98152b..5aba86003c6 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -4549,25 +4552,57 @@ static bool vers_prepare_keys(THD *thd, HA_CREATE_INFO *create_info, if (key->type != Key::PRIMARY && key->type != Key::UNIQUE) continue;
- Key_part_spec *key_part= NULL; - List_iterator<Key_part_spec> part_it(key->columns); - while ((key_part=part_it++)) + if (create_info->versioned()) { - if (!my_strcasecmp(system_charset_info, - row_start_field, - key_part->field_name.str) || - - !my_strcasecmp(system_charset_info, - row_end_field, - key_part->field_name.str)) - break; + Key_part_spec *key_part=NULL; + List_iterator<Key_part_spec> part_it(key->columns); + while ((key_part=part_it++)) + { + if (row_start_field.streq(key_part->field_name) || + row_end_field.streq(key_part->field_name)) + break; + } + if (!key_part) + key->columns.push_back(new Key_part_spec(&row_end_field, 0)); } - if (key_part) - continue; // Key already contains Sys_start or Sys_end + }
- Key_part_spec *key_part_sys_end_col= - new (thd->mem_root) Key_part_spec(&create_info->vers_info.as_row.end, 0); - key->columns.push_back(key_part_sys_end_col); + key_it.rewind(); + while ((key=key_it++)) + { + if (key->without_overlaps) + { + if (key->type != Key::PRIMARY && key->type != Key::UNIQUE) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_NON_UNIQUE, MYF(0), key->period.str); + return true; + } + if (!create_info->period_info.is_set() + || !key->period.streq(create_info->period_info.name)) + { + my_error(ER_PERIOD_NOT_FOUND, MYF(0), key->period.str); + return true; + } + if (thd->work_part_info) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED, MYF(0));
why?
Unfortunately partitions do not support searching upper/lower bounds (i.e. KEY_OR_PREV, KEY_OR_NEXT). But I think it is doable. For now it just doesn't work.
+ return true; + } + const auto &period_start= create_info->period_info.period.start; + const auto &period_end= create_info->period_info.period.end; + List_iterator<Key_part_spec> part_it(key->columns); + while (Key_part_spec *key_part= part_it++) + { + if (period_start.streq(key_part->field_name) + || period_end.streq(key_part->field_name)) + { + my_error(ER_KEY_CONTAINS_PERIOD_FIELDS, MYF(0), key->name.str); + return true; + } + } + key->columns.push_back(new Key_part_spec(&period_start, 0)); + key->columns.push_back(new Key_part_spec(&period_end, 0)); + } }
return false; diff --git a/sql/table.h b/sql/table.h index 2b866159fe0..6511960488e 100644 --- a/sql/table.h +++ b/sql/table.h @@ -1730,10 +1731,18 @@ class IS_table_read_plan;
/** number of bytes used by field positional indexes in frm */ constexpr uint frm_fieldno_size= 2; +/** number of bytes used by key position number in frm */ +constexpr uint frm_keyno_size= 2; static inline uint16 read_frm_fieldno(const uchar *data) { return uint2korr(data); } -static inline void store_frm_fieldno(const uchar *data, uint16 fieldno) +static inline void store_frm_fieldno(uchar *data, uint16 fieldno) { int2store(data, fieldno); } +static inline uint16 read_frm_keyno(const uchar *data) +{ return uint2korr(data); } +static inline void store_frm_keyno(uchar *data, uint16 fieldno) +{ int2store(data, fieldno); } +static inline size_t frm_ident_stored_size(size_t len) +{ return len + (len > 255 ? 3 : 1); }
this is exactly extra2_str_size() function. Don't duplicate it, reuse. (and rename, if you'd like a more generic name)
thanks, replaced with extra2_str_size()
class select_unit; class TMP_TABLE_PARAM; diff --git a/sql/unireg.cc b/sql/unireg.cc index 7130b3e5d8a..1e95242786e 100644 --- a/sql/unireg.cc +++ b/sql/unireg.cc @@ -485,6 +486,16 @@ LEX_CUSTRING build_frm_image(THD *thd, const
LEX_CSTRING &table,
store_frm_fieldno(pos, get_fieldno_by_name(create_info,
create_fields,
create_info->period_info.period.end));
pos+= frm_fieldno_size; + store_frm_keyno(pos, create_info->period_info.unique_keys); + pos+= frm_keyno_size; + for (uint key= 0; key < keys; key++) + { + if (key_info[key].without_overlaps) + { + store_frm_keyno(pos, key); + pos+= frm_keyno_size; + } + }
This will make 10.5 frms look corrupted in 10.4. Better use a new EXTRA2 flag for that. With a number > EXTRA2_ENGINE_IMPORTANT.
Another way is to update 10.4 code to handle additional entry, but ok, this way is same good for me
Regards,
Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
-- Yours truly, Nikita Malyavin
Hi, Nikita! On Nov 26, Nikita Malyavin wrote:
Hello, Sergei! I have updated the code, please see
- bb-10.5-MDEV-16978-without-overlaps <https://github.com/MariaDB/server/compare/bb-10.5-MDEV-16978-without-overlaps>
branch.
Among the changes I, having noticed that innodb combination is missing, have added it, and some merging down from my INSERT/REPLACE patch was required to make it run successfully.
Below are the rest comments:
On Tue, 19 Nov 2019 at 15:45, Sergei Golubchik <serg@mariadb.org> wrote:
diff --git a/sql/handler.cc b/sql/handler.cc index 94cffd69b75..560f6316602 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -6913,6 +6928,130 @@ void handler::set_lock_type(enum thr_lock_type lock) table->reginfo.lock_type= lock; }
+int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data) +{ + DBUG_ASSERT(new_data); + if (!table_share->period.unique_keys) + return 0; + if (table->versioned() && !table->vers_end_field()->is_max()) + return 0; + + bool is_update= old_data != NULL; + if (!check_overlaps_buffer) + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root, + table_share->max_unique_length + + table_share->reclength); + auto *record_buffer= check_overlaps_buffer + table_share->max_unique_length; + auto *handler= this; + if (handler->inited != NONE) + { + if (!check_overlaps_handler) + { + check_overlaps_handler= clone(table_share->normalized_path.str, + &table_share->mem_root);
wouldn't it be simpler and cheaper to use HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS, as we've discussed?
Can it stack (like, what'd be if HA_EXTRA_REMEMBER_POS is called several times)? I guess it can't, and somebody on upper layers could have already used it. This is for example done in TABLE::delete_row(). I see that you've removed it in latest 10.4+, but anyway, I think it's better to avoid remembering on lower layers
strictly speaking, cloning is rather a high-leveloperation, not for lower layers. but I'm not as much interested in layer purity, as in the simpler and faster code. I thought that HA_EXTRA_REMEMBER_POS would be it - simpler and faster - but let me look and your new patch first, if I'll still think that I comment there again :)
+ int error= -1; + if (check_overlaps_handler != NULL) + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK); + if (error) + return error; + } + handler= check_overlaps_handler; + }
why "if (handler->inited != NONE)" ? What happens if it is NONE?
then `this` is used
right, but when handler->inited is NONE? How can that happen?
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start = _period end_ of the new row, you should write even more checks, and then move cursor left.
No, I don't see that. Can you show an example where you'd need even more checks?
Also, why do you store both period ends in the index? It seems like it'd be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I agree that using keyreads can be much more efficient. However, all of my latest code, namely FKs and REPLACE/IODKU, are not based on keyreads, and rewriting it now will require significant effort. Let's make it a separate task, maybe?
Why would it require significant effort? As far as I understand in your current code you only need to enable keyreads and that's all.
+ /* In case of update it could appear that the nearest neighbour is + * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer + table->s->null_bytes, + table->s->reclength - table->s->null_bytes) == 0) + { + continue; + }
I'd rather compare row positions here. What do you mean by that?
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare. after ha_index_read_map or ha_index_next you do handler->position(record_buffer) and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value. For UPDATE you can, I suppose, call this->position(old_data) to get the position.
+ if (thd->work_part_info) + { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED, MYF(0));
why?
Unfortunately partitions do not support searching upper/lower bounds (i.e. KEY_OR_PREV, KEY_OR_NEXT). But I think it is doable. For now it just doesn't work.
please add a comment about it here. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hello! On Tue, 3 Dec 2019 at 21:52, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita!
Hello, Sergei! I have updated the code, please see
- bb-10.5-MDEV-16978-without-overlaps < https://github.com/MariaDB/server/compare/bb-10.5-MDEV-16978-without-overlap...
branch.
Among the changes I, having noticed that innodb combination is missing, have added it, and some merging down from my INSERT/REPLACE patch was required to make it run successfully.
Below are the rest comments:
On Tue, 19 Nov 2019 at 15:45, Sergei Golubchik <serg@mariadb.org> wrote:
diff --git a/sql/handler.cc b/sql/handler.cc index 94cffd69b75..560f6316602 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -6913,6 +6928,130 @@ void handler::set_lock_type(enum
table->reginfo.lock_type= lock; }
+int handler::ha_check_overlaps(const uchar *old_data, const uchar* new_data) +{ + DBUG_ASSERT(new_data); + if (!table_share->period.unique_keys) + return 0; + if (table->versioned() && !table->vers_end_field()->is_max()) + return 0; + + bool is_update= old_data != NULL; + if (!check_overlaps_buffer) + check_overlaps_buffer= (uchar*)alloc_root(&table_share->mem_root, + table_share->max_unique_length + +
+ auto *record_buffer= check_overlaps_buffer +
On Nov 26, Nikita Malyavin wrote: thr_lock_type lock) table_share->reclength); table_share->max_unique_length;
+ auto *handler= this; + if (handler->inited != NONE) + { + if (!check_overlaps_handler) + { + check_overlaps_handler= clone(table_share->normalized_path.str, + &table_share->mem_root);
wouldn't it be simpler and cheaper to use HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS, as we've discussed?
Can it stack (like, what'd be if HA_EXTRA_REMEMBER_POS is called several times)? I guess it can't, and somebody on upper layers could have already used it. This is for example done in TABLE::delete_row(). I see that you've removed it in latest 10.4+, but anyway, I think it's better to avoid remembering on lower layers
strictly speaking, cloning is rather a high-leveloperation, not for lower layers.
but I'm not as much interested in layer purity, as in the simpler and faster code. I thought that HA_EXTRA_REMEMBER_POS would be it - simpler and faster - but let me look and your new patch first, if I'll still think that I comment there again :)
+ int error= -1; + if (check_overlaps_handler != NULL) + error= check_overlaps_handler->ha_external_lock(table->in_use, F_RDLCK); + if (error) + return error; + } + handler= check_overlaps_handler; + }
why "if (handler->inited != NONE)" ? What happens if it is NONE?
then `this` is used
right, but when handler->inited is NONE? How can that happen?
When you just insert the record, for example. Neither rnd, nor index can be inited.
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start = _period end_ of the new row, you should write even more checks, and then move cursor left.
No, I don't see that. Can you show an example where you'd need even more checks?
Ok, suppose you have to rows in table with periods: (b, c), (c, d).
Now you insert (a, c). ha_index_read_map will look for period_start <= c, so it will return (c, d). These rows do not overlap, but we yet do not know if (b, c) is there. So we need to go to the previous row. Now the algorithm looks same complex: read key, make some checks, got to the prev record.
Also, why do you store both period ends in the index? It seems like it'd be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I agree that using keyreads can be much more efficient. However, all of my latest code, namely FKs and REPLACE/IODKU, are not based on keyreads, and rewriting it now will require significant effort. Let's make it a separate task, maybe?
Why would it require significant effort? As far as I understand in your current code you only need to enable keyreads and that's all.
it'll return in different format, isn't it? You can't rely on key_part->field->ptr_in_record after that.
+ * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer + table->s->null_bytes, + table->s->reclength -
+ /* In case of update it could appear that the nearest neighbour is table->s->null_bytes) == 0)
+ { + continue; + }
I'd rather compare row positions here. What do you mean by that?
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare.
after ha_index_read_map or ha_index_next you do
handler->position(record_buffer)
and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value.
For UPDATE you can, I suppose, call this->position(old_data) to get the position.
It actually returns the ref for the last fetched row. the argument passed is not even used😐 . Only innodb uses it, and only for the primary key case.
+ if (thd->work_part_info)
+ { + my_error(ER_PERIOD_WITHOUT_OVERLAPS_PARTITIONED, MYF(0));
why?
Unfortunately partitions do not support searching upper/lower bounds (i.e. KEY_OR_PREV, KEY_OR_NEXT). But I think it is doable. For now it just doesn't work.
please add a comment about it here.
added
Regards,
Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
Hi, Nikita! On Dec 03, Nikita Malyavin wrote:
why "if (handler->inited != NONE)" ? What happens if it is NONE?
then `this` is used
right, but when handler->inited is NONE? How can that happen?
When you just insert the record, for example. Neither rnd, nor index can be inited.
Okay
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start = _period end_ of the new row, you should write even more checks, and then move cursor left.
No, I don't see that. Can you show an example where you'd need even more checks?
Ok, suppose you have to rows in table with periods: (b, c), (c, d).
Now you insert (a, c). ha_index_read_map will look for period_start <= c, so it will return (c, d). These rows do not overlap, but we yet do not know if (b, c) is there. So we need to go to the previous row.
Now the algorithm looks same complex: read key, make some checks, got to the prev record.
I see. Because intervals are inclusive we probably have to search with < not <=. HA_READ_BEFORE_KEY
Also, why do you store both period ends in the index? It seems like it'd be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I agree that using keyreads can be much more efficient. However, all of my latest code, namely FKs and REPLACE/IODKU, are not based on keyreads, and rewriting it now will require significant effort. Let's make it a separate task, maybe?
Why would it require significant effort? As far as I understand in your current code you only need to enable keyreads and that's all.
it'll return in different format, isn't it? You can't rely on key_part->field->ptr_in_record after that.
Same format, it'll unpack the key into the table->record[0] so you can use Field->val* methods normally.
+ /* In case of update it could appear that the nearest neighbour is
+ * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer + table->s->null_bytes, + table->s->reclength - table->s->null_bytes) == 0) + { + continue; + }
I'd rather compare row positions here. What do you mean by that?
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare.
after ha_index_read_map or ha_index_next you do
handler->position(record_buffer)
and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value.
For UPDATE you can, I suppose, call this->position(old_data) to get the position.
It actually returns the ref for the last fetched row. the argument passed is not even used😐 . Only innodb uses it, and only for the primary key case.
Right, but it doesn't matter what the engine internally uses, you have to pass the row - that's how the method is defined. But yes, it'll return positions for the last fetched row, that's why handler->position(record_buffer) and this->position(old_data) will return you two positions that you can compare. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
On Tue, 3 Dec 2019 at 23:59, Sergei Golubchik <serg@mariadb.org> wrote:
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start = _period end_ of the new row, you should write even more checks, and then move cursor left.
No, I don't see that. Can you show an example where you'd need even more checks?
Ok, suppose you have to rows in table with periods: (b, c), (c, d).
Now you insert (a, c). ha_index_read_map will look for period_start <= c, so it will return (c, d). These rows do not overlap, but we yet do not know if (b, c) is there. So we need to go to the previous row.
Now the algorithm looks same complex: read key, make some checks, got to the prev record.
I see. Because intervals are inclusive we probably have to search with < not <=. HA_READ_BEFORE_KEY
Yes, suddenly I didn't think about < (i.e., HA_READ_BEFORE_KEY), but that can work
Also, why do you store both period ends in the index? It seems
like it'd be enough to store only one end - only the start or only the end. Both ends help if you use keyreads, but you don't. On the second thought, perhaps, you should use keyreads, there's no need to fetch the whole row for this overlap check. On the yet another thought it might not work with HA_EXTRA_REMEMBER_POS.
I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I agree that using keyreads can be much more efficient. However, all of my latest code, namely FKs and REPLACE/IODKU, are not based on keyreads, and rewriting it now will require significant effort. Let's make it a separate task, maybe?
Why would it require significant effort? As far as I understand in your current code you only need to enable keyreads and that's all.
it'll return in different format, isn't it? You can't rely on key_part->field->ptr_in_record after that.
Same format, it'll unpack the key into the table->record[0] so you can use Field->val* methods normally.
Aah, now i see how it works. It just leaves the holes in record. OK then, I can try enablling it.
+ * a record we are updating. It means, that there are no overlaps + * from this side. */ + if (is_update && memcmp(old_data + table->s->null_bytes, + record_buffer +
+ table->s->reclength -
+ /* In case of update it could appear that the nearest neighbour is table->s->null_bytes, table->s->null_bytes) == 0)
+ { + continue; + }
I'd rather compare row positions here. What do you mean by that?
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare.
after ha_index_read_map or ha_index_next you do
handler->position(record_buffer)
and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value.
For UPDATE you can, I suppose, call this->position(old_data) to get the position.
It actually returns the ref for the last fetched row. the argument passed is not even used😐 . Only innodb uses it, and only for the primary key case.
Right, but it doesn't matter what the engine internally uses, you have to pass the row - that's how the method is defined.
But yes, it'll return positions for the last fetched row, that's why handler->position(record_buffer) and this->position(old_data) will return you two positions that you can compare.
old_data could have been even not fetched, and in some cases handler is not cloned, so it can't work in general. BTW I also wonder, would this memcpy work with KEYREAD? -- Yours truly, Nikita Malyavin
Hi, Nikita! On Dec 04, Nikita Malyavin wrote:
On Tue, 3 Dec 2019 at 23:59, Sergei Golubchik <serg@mariadb.org> wrote:
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare.
after ha_index_read_map or ha_index_next you do
handler->position(record_buffer)
and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value.
For UPDATE you can, I suppose, call this->position(old_data) to get the position.
old_data could have been even not fetched, and in some cases handler is not cloned, so it can't work in general.
In what cases? We're talking about UPDATE here, there's `this` handler that has already read old_data, so `this->position()` is well defined. `handler` handler by this time is cloned, and after `handler->ha_index_read_map()` you can read the position just fine. So both positions exist, can be read and compared. What's not general here, that it doesn't work for INSERT? It doesn't have to, this is special logic for UPDATE only.
BTW I also wonder, would this memcpy work with KEYREAD?
I presume - no, unread columns could contain some garbage. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
On Wed, 4 Dec 2019 at 05:04, Sergei Golubchik <serg@mariadb.org> wrote:
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare.
after ha_index_read_map or ha_index_next you do
handler->position(record_buffer)
and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value.
For UPDATE you can, I suppose, call this->position(old_data) to get the position.
old_data could have been even not fetched, and in some cases handler is not cloned, so it can't work in general.
In what cases? We're talking about UPDATE here, there's `this` handler that has already read old_data, so `this->position()` is well defined.
Hmm yes... I guess ha_update_row wouldn't even work correctly if old_data is not the last fetched row. I remember that was true for myisam at least -- ha_write_row was spoiling ref (when history row was inserted, or period row were split in FOR PORTION case), and that was worked around -- either by saving/restoring positions, or by having separate file with rowids
So both positions exist, can be read and compared. What's not general here, that it doesn't work for INSERT? It doesn't have to, this is special logic for UPDATE only.
Ok, I still think it is a little bit a hack but we can try and see if it'll work in long term. -- Yours truly, Nikita Malyavin
Heh, unfortunately comparing refs will not work in conjunction with KEYREAD: for innodb ref can be a primary key, if it has it, so index is not clustered. The index we're evaluating can be different from a primary key. In that case some values can be left unfetched with KEYREAD. All that is because if innodb has primary key, it just does key_copy On Wed, 4 Dec 2019 at 17:14, Nikita Malyavin <nikitamalyavin@gmail.com> wrote:
On Wed, 4 Dec 2019 at 05:04, Sergei Golubchik <serg@mariadb.org> wrote:
two rows are the same, if their "positions" are equal, not if their column values are equal. Also positions are much shorter to compare.
after ha_index_read_map or ha_index_next you do
handler->position(record_buffer)
and then you have a "position" stored in handler->ref, and it has the length of handler->ref_length bytes. For MyISAM it's usually the file offset, for InnoDB - PK value.
For UPDATE you can, I suppose, call this->position(old_data) to get the position.
old_data could have been even not fetched, and in some cases handler is not cloned, so it can't work in general.
In what cases? We're talking about UPDATE here, there's `this` handler that has already read old_data, so `this->position()` is well defined.
Hmm yes... I guess ha_update_row wouldn't even work correctly if old_data is not the last fetched row. I remember that was true for myisam at least -- ha_write_row was spoiling ref (when history row was inserted, or period row were split in FOR PORTION case), and that was worked around -- either by saving/restoring positions, or by having separate file with rowids
So both positions exist, can be read and compared. What's not general here, that it doesn't work for INSERT? It doesn't have to, this is special logic for UPDATE only.
Ok, I still think it is a little bit a hack but we can try and see if it'll work in long term.
-- Yours truly, Nikita Malyavin
-- Yours truly, Nikita Malyavin
Hi, Nikita! On Dec 05, Nikita Malyavin wrote:
Heh, unfortunately comparing refs will not work in conjunction with KEYREAD:
for innodb ref can be a primary key, if it has it, so index is not clustered. The index we're evaluating can be different from a primary key.
In that case some values can be left unfetched with KEYREAD.
All that is because if innodb has primary key, it just does key_copy
In InnoDB every secondary key automatically has primary key at the end, because after finding the key value in a secondary index, InnoDB needs to fetch the corresponding row, so it has to know the PK value for that secondary key. Generalizing, every secondary key has to have the "ref" for the corresponding row. In InnoDB it's PK, in MyISAM it's the row offset, in MEMORY it's pointer. But either way, in any engine, any secondary key knows the ref for the corresponding row. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi Sergei! I traced out the problem I faced with refs comparison -- It's in key_copy implementation, which uses field->get_key_image, which in turn copied data to the buffer passed, but from `field->ptr`. The latter is important -- it means, current key_copy implementation always copies data from table->record[0]. I have already came across it while implementing FKs, and have fixed the implementation in the separate commit (see key_copy: use from_record argument to copy the data from). That fixes 'innodb' test combination. I think now everything you wanted so far is in the branch, so am waiting now for your comments on the new code! Kind regards, Nikita Malyavin On Thu, 5 Dec 2019 at 02:36, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita!
On Dec 05, Nikita Malyavin wrote:
Heh, unfortunately comparing refs will not work in conjunction with KEYREAD:
for innodb ref can be a primary key, if it has it, so index is not clustered. The index we're evaluating can be different from a primary key.
In that case some values can be left unfetched with KEYREAD.
All that is because if innodb has primary key, it just does key_copy
In InnoDB every secondary key automatically has primary key at the end, because after finding the key value in a secondary index, InnoDB needs to fetch the corresponding row, so it has to know the PK value for that secondary key.
Generalizing, every secondary key has to have the "ref" for the corresponding row. In InnoDB it's PK, in MyISAM it's the row offset, in MEMORY it's pointer. But either way, in any engine, any secondary key knows the ref for the corresponding row.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
We've also had MDEV-19082 <https://jira.mariadb.org/browse/MDEV-19082> -- change application-time period tables to use HA_EXTRA_REMEMBER_POS. On Fri, 6 Dec 2019 at 18:33, Nikita Malyavin <nikitamalyavin@gmail.com> wrote:
Hi Sergei!
I traced out the problem I faced with refs comparison -- It's in key_copy implementation, which uses field->get_key_image, which in turn copied data to the buffer passed, but from `field->ptr`. The latter is important -- it means, current key_copy implementation always copies data from table->record[0].
I have already came across it while implementing FKs, and have fixed the implementation in the separate commit (see key_copy: use from_record argument to copy the data from). That fixes 'innodb' test combination.
I think now everything you wanted so far is in the branch, so am waiting now for your comments on the new code!
Kind regards, Nikita Malyavin
On Thu, 5 Dec 2019 at 02:36, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita!
On Dec 05, Nikita Malyavin wrote:
Heh, unfortunately comparing refs will not work in conjunction with KEYREAD:
for innodb ref can be a primary key, if it has it, so index is not clustered. The index we're evaluating can be different from a primary key.
In that case some values can be left unfetched with KEYREAD.
All that is because if innodb has primary key, it just does key_copy
In InnoDB every secondary key automatically has primary key at the end, because after finding the key value in a secondary index, InnoDB needs to fetch the corresponding row, so it has to know the PK value for that secondary key.
Generalizing, every secondary key has to have the "ref" for the corresponding row. In InnoDB it's PK, in MyISAM it's the row offset, in MEMORY it's pointer. But either way, in any engine, any secondary key knows the ref for the corresponding row.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
Hi, Nikita! On Dec 06, Nikita Malyavin wrote:
Hi Sergei!
I traced out the problem I faced with refs comparison -- It's in key_copy implementation, which uses field->get_key_image, which in turn copied data to the buffer passed, but from `field->ptr`. The latter is important -- it means, current key_copy implementation always copies data from table->record[0].
I see. It looks like so far all usages of key_copy() used record[0] indeed. But as far as I can see, you're also using key_copy() with record[0], so why would you need that fix? Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
It is used in innodb's implementation of handler::position, to copy the key from the record buffer, which is also used to be record[0] On Tue, 17 Dec 2019 at 01:20, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita!
On Dec 06, Nikita Malyavin wrote:
Hi Sergei!
I traced out the problem I faced with refs comparison -- It's in key_copy implementation, which uses field->get_key_image, which in turn copied data to the buffer passed, but from `field->ptr`. The latter is important -- it means, current key_copy implementation always copies data from table->record[0].
I see. It looks like so far all usages of key_copy() used record[0] indeed.
But as far as I can see, you're also using key_copy() with record[0], so why would you need that fix?
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
Hi, Nikita! On Dec 17, Nikita Malyavin wrote:
It is used in innodb's implementation of handler::position, to copy the key from the record buffer, which is also used to be record[0]
Okay. It seems that position() was always called with record[0] as an argument. But check_duplicate_long_entry_key() uses store_record()/restore_record() to call position(record[0]). Please see if you can remove them now.
On Tue, 17 Dec 2019 at 01:20, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita! On Dec 06, Nikita Malyavin wrote:
Hi Sergei!
I traced out the problem I faced with refs comparison -- It's in key_copy implementation, which uses field->get_key_image, which in turn copied data to the buffer passed, but from `field->ptr`. The latter is important -- it means, current key_copy implementation always copies data from table->record[0].
I see. It looks like so far all usages of key_copy() used record[0] indeed.
But as far as I can see, you're also using key_copy() with record[0], so why would you need that fix?
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Yes, Sergei, you are right, it works fine! Additionally, I see now that I implement something very similar to what Sachin does -- i.e., cloning handler. I think that we can share one handler among all similar stuff, when we need to make cascade index searches. But some refactoring is required -- TABLE::clone_handler_for_update is not actually "for update" but rather for read, and I think it's better to store the handler in parent handler (as well as clone function), and to call it smth like `internal_handler` (I call it now check_overlaps_handler, and Sachin calls it update_handler). Sergei, Sachin, agree? On Tue, 17 Dec 2019 at 03:50, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita!
On Dec 17, Nikita Malyavin wrote:
It is used in innodb's implementation of handler::position, to copy the key from the record buffer, which is also used to be record[0]
Okay.
It seems that position() was always called with record[0] as an argument.
But check_duplicate_long_entry_key() uses store_record()/restore_record() to call position(record[0]). Please see if you can remove them now.
On Tue, 17 Dec 2019 at 01:20, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita! On Dec 06, Nikita Malyavin wrote:
Hi Sergei!
I traced out the problem I faced with refs comparison -- It's in key_copy implementation, which uses field->get_key_image, which in turn copied data to the buffer passed, but from `field->ptr`. The latter is important -- it means, current key_copy implementation always copies data from table->record[0].
I see. It looks like so far all usages of key_copy() used record[0] indeed.
But as far as I can see, you're also using key_copy() with record[0], so why would you need that fix?
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
Hi, Nikita! I agree. In fact, I wrote the same in my review (should be finished in a couple of hours). I can think of one important case where in-TABLE vs in-handler makes a difference - this is partitioning. All partitions share the same TABLE, but every partition has its own handler. And in this case the "handler for cascading index searches" has to be per partition too, so it cannot be stored in the TABLE. On Dec 17, Nikita Malyavin wrote:
Yes, Sergei, you are right, it works fine!
Additionally, I see now that I implement something very similar to what Sachin does -- i.e., cloning handler.
I think that we can share one handler among all similar stuff, when we need to make cascade index searches. But some refactoring is required -- TABLE::clone_handler_for_update is not actually "for update" but rather for read, and I think it's better to store the handler in parent handler (as well as clone function), and to call it smth like `internal_handler` (I call it now check_overlaps_handler, and Sachin calls it update_handler).
Sergei, Sachin, agree?
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
On Wed, 4 Dec 2019 at 01:00, Nikita Malyavin <nikitamalyavin@gmail.com> wrote:
On Tue, 3 Dec 2019 at 23:59, Sergei Golubchik <serg@mariadb.org> wrote:
But it can be done better. You search for the key with the same value and a period start <= than the period start of new row. And then you have to check two rows for overlaps. If you'll search for a key with the period start <= than the _period end_ of the new row, then you'll only need to check one row (may be one more, if updating).
It can't work just that way: to handle case when period start = _period end_ of the new row, you should write even more checks, and then move cursor left.
No, I don't see that. Can you show an example where you'd need even more checks?
Ok, suppose you have to rows in table with periods: (b, c), (c, d).
Now you insert (a, c). ha_index_read_map will look for period_start <= c, so it will return (c, d). These rows do not overlap, but we yet do not know if (b, c) is there. So we need to go to the previous row.
Now the algorithm looks same complex: read key, make some checks, got to the prev record.
I see. Because intervals are inclusive we probably have to search with < not <=. HA_READ_BEFORE_KEY
Yes, suddenly I didn't think about < (i.e., HA_READ_BEFORE_KEY), but that can work
Another case: suppose you have to rows in table with periods (a<b<c<d): (a, c), (c, d). Let us update (c,d) --> (b,d). I call it "move left". ha_index_read_map with period_start < d will return (c,d). Again check is_update, and call ha_index_prev()
Hi, Nikita! On Dec 04, Nikita Malyavin wrote:
Another case: suppose you have to rows in table with periods (a<b<c<d): (a, c), (c, d). Let us update (c,d) --> (b,d). I call it "move left".
ha_index_read_map with period_start < d will return (c,d). Again check is_update, and call ha_index_prev()
Yes, for update you might need a second check, if the first search will find the old row. This will be always the case, no matter how you index. In your patch you also have extra checks for update. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Well, I've pushed that logic. Looks about 2 lines prettier ср, 4 дек. 2019 г., 19:19 Sergei Golubchik <serg@mariadb.org>:
Hi, Nikita!
On Dec 04, Nikita Malyavin wrote:
Another case: suppose you have to rows in table with periods (a<b<c<d): (a, c), (c, d). Let us update (c,d) --> (b,d). I call it "move left".
ha_index_read_map with period_start < d will return (c,d). Again check is_update, and call ha_index_prev()
Yes, for update you might need a second check, if the first search will find the old row. This will be always the case, no matter how you index. In your patch you also have extra checks for update.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
participants (2)
-
Nikita Malyavin
-
Sergei Golubchik