Hi! <cut>
I mean, you have tests, both with PERIOD, but one in the main suite, the other in the versioning suite. While they both should be in the period suite. Here, your test from the main/long_unique.test:
+CREATE OR REPLACE TABLE t1 (a INT, b BLOB, s DATE, e DATE, PERIOD FOR app(s,e), UNIQUE(b)) ENGINE=MyISAM PARTITION BY HASH(a) PARTITIONS 2; +INSERT INTO t1 VALUES (1,'foo','2022-01-01', '2025-01-01'); +--error ER_DUP_ENTRY +DELETE FROM t1 FOR PORTION OF app FROM '2023-01-01' TO '2024-01-01'; +DROP TABLE t1;
Your test from the suite/versioning/t/long_unique.test:
+CREATE TABLE t1 (f VARCHAR(4096), s DATE, e DATE, PERIOD FOR app(s,e), UNIQUE(f)) ENGINE=MyISAM; +INSERT INTO t1 VALUES ('foo', '2023-08-30', '2025-07-09'),('bar', '2021-01-01', '2021-12-31'); +DELETE FROM t1 FOR PORTION OF app FROM '2023-08-29' TO '2025-07-01'; +DROP TABLE t1;
The above are actual different, indendent tests not related to each other. But I agree that they are both related to period. I will move the PERIOD test from main/long_unique.test and suite/versioning/t/long_unique.test to suite/period/t/long_unique.test Sorry, I got confused about the suggestion with moving whole test files.
diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc index 73191907aac..c7e61864366 100644 --- a/sql/ha_partition.cc +++ b/sql/ha_partition.cc @@ -4340,6 +4339,10 @@ int ha_partition::write_row(const uchar * buf) } m_last_part= part_id; DBUG_PRINT("info", ("Insert in partition %u", part_id)); + if (table->s->long_unique_table && + m_file[part_id]->update_handler == m_file[part_id] && inited == RND) + m_file[part_id]->prepare_for_insert(0);
This looks quite incomprehensible. If the table has a long unique key and the update_handler is the handler itself you _prepare for insert_ ???
This is basically the original code we had in handler::write_row. We have to have this in this place as we don't want to call handler->prepare_for_insert() for all partitions, only those that where really used.
I mean not the condition, but the "prepare_for_insert" method name that you've introduced, it didn't exist in the original code.
If the name would've been, say, "prepare_update_handler" or "prepare_long_unique_handler" then the code would've looked fine and I wouldn't have asked.
if it's an insert it should've been prepared earlier, like, normally for an insert, and it don't have to do anything with long uniques.
See above. The above is to avoid creating new handlers when we don't need them! If you have 100 partitions, you really don't want to create 100 cloned table handlers.
Of course. I just mean the method name. If it is "preparing for insert" as the name suggests, one could've expected it to be done earlier. I'm saying, the name should not suggest it's a preparation for an insert, call it, for example, "prepare_update_handler"
See my comments. It's not only intended for update handler, so I prefer to keep the current name as that will be more future proof. Also, it should only be called when one does insert's, not because there exists an update handler. It's perfectly safe to call before any insert usage. The current code just optimized when it's called.
What you actually do here (as I've found out looking firther in the patch) you prepare update_handler. That is "prepare_for_insert" is VERY confusing here, it sounds like it prepares for a normal insert and should be called at the beginning of mysql_insert(). And it's nothing like that.
We are calling it as start of mysql_insert(), however we have an if around it as a small optimization. We can remove the if you want, things will work as before if we d.
It would remove half of the questions if you'd call it prepare_update_handler() or, better, prepare_auxiliary_handler() (because it'll be used not only for long uniques) or something like that.
As one ONLY have to call it for insert and it's safe to call if one has update handler or not, I don't think any of the above names are any better.
The idea with prepare_for_insert is that we can add special handler code that we want to do once in case of insert. We didn't have that before and I can see many use of that.
In any case, I will add a comment to the above partition code to clarify what we are doing:
"We have to call prepare_for_insert() if we have an update handler in the underlying table (to clone the handler). This is because for INSERT's prepare_for_insert() is only called for the main table, not for all partitions. This is to reduce the huge overhead of cloning a possible not needed handler if there are many partitions."
I don't think this is needed. The confusing part was not the condition, but the method name. If you rename the method, everything else will be fine.
I think it will be worse with a rename. I had originally plans to add more things to prepare_for_insert(), but this is for a future commit.
Still, some questions are left:
1. why partitioning needs special code for that?
The comment explains it all
Not quite. The long unique is only done for the main table, not for partitions. Individual partitions should never have update_handler initialized in the first place.
It must have as we call ha_write_row() with the partition handler and this must do with the unique handler if the row exists or not.
2. when a handler itself is its update_handler AND inited==RND? the handler itself can be the update_handler on INSERT, but then inited is NONE Is it UPDATE with system versioning?
No, this is needed for normal updates, just like in the old code.
For normal updates this->update_handler != this. For inserts inited!=RND. I'm asking when update_handler==this && inited==RND.
This is the code from the original code before any of my changes. I assumed you had already asked that as part of the orignal review. In case of insert with DUP_ERROR we call ha_rnd_init_with_error(). I thought it was possible also with update, but that could be mainly when versioning is involved.
start_part_bulk_insert(thd, part_id);
tmp_disable_binlog(thd); /* Do not replicate the low-level changes. */ diff --git a/sql/handler.cc b/sql/handler.cc index 0700b1571e5..1e3f987b4e5 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -2648,6 +2649,39 @@ handler *handler::clone(const char *name, MEM_ROOT *mem_root) return NULL; }
+/** + @brief clone of current handler.
you generally don't have to write @brief in these cases. Just
I usually don't write @brief. Strange, I must have copied the code from somewhere. Ok, the @brief came from the original code that I copied from table.cc and I didn't remove it.
I have now removed it. Someone else should fix all the other places where @brief is used wrongly!
It's not really wrong, just redundant. So, I'd say, does not justify global removal from everywhere, but better not to do it in the new code. Not a big deal if you use it, though.
I don't like it. I only got @brief becaues I copied two functions with function comments between files. When doing that, I prefer to do as little changes as possible to make it easier for people to find the new functions.
<cut>
+bool handler::clone_handler_for_update() +{ + handler *tmp; + DBUG_ASSERT(table->s->long_unique_table); + + if (update_handler != this) + return 0; // Already done + if (!(tmp= clone(table->s->normalized_path.str, table->in_use->mem_root))) + return 1; + update_handler= tmp; + update_handler->ha_external_lock(table->in_use, F_RDLCK);
Looks strange. Why F_RDLCK if it's an *update* handler? This definitely needs a comment, why you're using read lock for updates.
Because the update handler is ONLY used to check if there is an existing row. We never use it for doing changes. Note that this is old code. Didn't you catch this in your original review before it was pushed ?
Okay. It doesn't matter because it's renamed in the Nikita's patch anyway. (but only a few days ago, after I wrote that review).
I hope that Nikita will base his new work on my code. It would be better that he wait until I have pushed before doing new work.
@@ -6481,6 +6515,8 @@ int handler::ha_reset() DBUG_ASSERT(inited == NONE); /* reset the bitmaps to point to defaults */ table->default_column_bitmaps(); + if (update_handler != this) + delete_update_handler();
you already have an if() inside the delete_update_handler(). Either remove it from here or from there.
We don't check the handler in all calls. I just added the check to ha_reset() as this is a very common call and wanted to avoid an extra call. Keeping current code.
it's not virtual. And in the same file. And small. The compiler will inline it, you don't need to bother.
The compiler will also remove it. I think it's better to make things explicite for anything that is in commonly used functions.
+++ b/sql/sql_delete.cc @@ -752,6 +752,10 @@ bool mysql_delete(THD *thd, TABLE_LIST *table_list, COND *conds, && !table->versioned() && table->file->has_transactions();
+ if (table->versioned(VERS_TIMESTAMP) || + (table_list->has_period() && !portion_of_time_through_update)) + table->file->prepare_for_insert(1);
Ok, now I think that your idea of the default update_handler=this is rather fragile. If the default is NULL, than we can put in many places
DBUG_ASSERT(update_handler != NULL);
to make sure it was initialized when needed. Now we cannot do it.
The current is faster as we can always rely on the value. The code will work in most cases even if prepare_for_insert() is not set.
For an assert, on can always do (trough an inline handler function) DBUG_ASSERT(update_handler != table->file)
Exactly same thing.
Not at all, usually update_handler==table->file for inserts. It does not mean update_handler is not initialized. NULL was not initialized before your patch, and now one cannot detect whether update_handler was initialized or not.
Initialized means that it changes value. In old code it was only updated if there was an update handler, so asserts would not work 'as such'. Also it was only set after the first insert, so it was hard to have any asserts over the place. It still really trivial to add an assert that checks if we should need a clone and if the update handler was set. It's actually easier to do now as the update handler will be set BEFORE the ha_write_row call.
And there are many places in the code and non-trivial conditions when the update_handler has to be initialized and these places and conditions will change in the future. An assert would be helpful here.
Nothing stops us from having an assert with current code. The only different from my code is that we don't get a crash if we forget an assert and wrongly call update_handler.
see above. An assert is now impossible. If we wrongly call update_handler it might work, but more often than not it'll produce incorrect table content. So, no crash, but a hard to detect incorrect data corruption.
If ha_reset() is called, then things will work. If it's not called, most things will crash. This means that it's not likely that this will go wrong without having it detected.
+ THD_STAGE_INFO(thd, stage_updating); while (likely(!(error=info.read_record())) && likely(!thd->killed) && likely(!thd->is_error())) --- a/storage/myisam/ha_myisam.cc +++ b/storage/myisam/ha_myisam.cc @@ -1740,7 +1740,7 @@ void ha_myisam::start_bulk_insert(ha_rows rows, uint flags) enable_indexes() failed, thus it's essential that indexes are disabled ONLY for an empty table. */ - if (file->state->records == 0 && can_enable_indexes && + if (file->state->records == 0 && can_enable_indexes && !has_long_unique() && (!rows || rows >= MI_MIN_ROWS_TO_DISABLE_INDEXES))
why do you disable bulk inserts for long uniques?
Because they are incompatible. Didn't you read my excellent commit message? " Disable write cache in MyISAM and Aria when using update_handler as if cache is used, the row will not be inserted until end of statement and update_handler would not find conflicting rows"
Does that explain things?
No, sorry, it doesn't. I did read your comment before writing the question.
Here you didn't disable _write cache_. You disabled switching off indexes. Your comment doesn't apply to that, because no matter whether indexes are disabled or not, rows are still inserted at once.
Sorry, you right the comment is wrong. I orignally thought we have to switch of the row cache, but we didn't need to as we do flush the cache for any key read. I update the comment to the following: - Disable write cache in MyISAM and Aria when using update_handler as if cache is used, the long unique index will not be updated on insert because the index is not marked with HA_NOSAME A better version for future updates to long unique would be to mark the long unique index with a special flag (HA_EXTERNAL_NOSAME) so that the engine knows that this can't be disabled. Regards, Monty