Re: 22e2f082400: MDEV-25370 Update for portion changes autoincrement key in bi-temp table
Hi, Nikita, On Jan 06, Nikita Malyavin wrote:
revision-id: 22e2f082400 (mariadb-10.5.23-39-g22e2f082400) parent(s): e472b682e06 author: Nikita Malyavin committer: Nikita Malyavin timestamp: 2023-12-20 02:46:09 +0100 message:
MDEV-25370 Update for portion changes autoincrement key in bi-temp table
According to the standard, the autoincrement column (i.e. *identity column*) should be advanced each insert implicitly made by UPDATE/DELETE ... FOR PORTION.
When you refer to the standard, specify the exact reference so that someone could verify it. E.g. as I've commented in MDEV, "SQL:2016, Part 2, 15.13 Effect of replacing rows in base tables, paragraph 10) b) i)" Also, the standard doesn't have autoincrement, it'd be better to say "According to the standard, the identity column (which autoincrement is a functional analog of) should be advanced...."
This is very unconvenient use in several notable cases. Concider a WITHOUT OVERLAPS key with an autoinc column: id int auto_increment, unique(id, p without overlaps)
An update or delete with FOR PORTION creates a sense that id will remain unchanged in such case.
This commit makes an exception the autoinc columns that are part of WITHOUT OVERLAPS keys and infers the following rule for the newly inserted rows:
Better to say here not that we're making an exception and violating the standard (because we are not), but that in the standard IDENTITY uses its own independent sequence generator, which knows nothing about periods. While autoincrement always requires an index (that is, our "identity" column is a pair {autoincrement, index}, not autoincrement alone), and the index can be declared WITHOUT OVERLAPS, so our "identity" column is period-aware. This is an extension of the standard, and we can freely define how it behaves in the cases not covered by the standard, that is in the WITHOUT OVERLAPS case.
===== Let V_j a set of values for currently updating/deleting row. Let VI_j a set of values to be inserted. * Case: (a) if V_i is an identity column that is also a part of a unique constraint that includes <without overlap specification>, then let VI_i be V_i (b) otherwise, let VI_i be the value deduced as described by the rules of ISO/IEC 9075-2, clauses 15.7 Effect of deleting rows from base tables and 15.13 Effect of replacing rows in base tables. =====
That's confusing, I'm sure that anyone who doesn't remember the corresponding part of the sql standard will think it's a direct quote from it. You've imitated the style rather convincingly :) It's also rather unnecessary, we don't have identity columns, so nobody cares how they should behave in this case. Perhaps it'd be better if you simply write that if the key, that defines autoincrement values, is declared WITHOUT OVERLAPS, then new rows created by UPDATE/DELETE FOR PORTION OF will use old autoincrement values, as they belong to non-overlapping ranges. But INSERT will generate a new autoincrement value, it will not try to find the smallest value in the overlapping ranges.
Note that this also infers that if an identity column is not a part of any unique constraint that includes <without overlap specification>, then its value will be DEFAULT and, that is, auto-incremented.
Apart from WITHOUT OVERLAPS there is also another notable case, referred by the reporter - a unique key that has an autoincrement column and a field from the period specification: id int auto_increment, unique(id, s), period for p(s, e)
for this case, no exception is made, and the autoincrementing rules will be proceeded accordung to the standard (i.e. the value will be advanced on implicit inserts).
diff --git a/mysql-test/suite/period/r/overlaps.result b/mysql-test/suite/period/r/overlaps.result --- a/mysql-test/suite/period/r/overlaps.result +++ b/mysql-test/suite/period/r/overlaps.result @@ -449,3 +449,86 @@ VALUES ('2000-01-01 00:00:00.000000', '2001-01-01 00:00:00.000000', 'abc'); ERROR 23000: Duplicate entry 'abc-2001-01-01 00:00:00.000000-2000-01-01 00:00:00.000000' for key 'index_name' DROP TABLE t1; +create or replace table cars(id int auto_increment, +price int, s date, e date, +period for p(s,e), +primary key(id, p without overlaps)); +insert into cars(price, s, e) values (1000, '2018-01-01', '2020-01-01'); +select * from cars; +id price s e +1 1000 2018-01-01 2020-01-01 +update cars for portion of p from '2019-01-01' to '2019-12-01' set price= 1100; +select * from cars; +id price s e +1 1000 2018-01-01 2019-01-01 +1 1000 2019-12-01 2020-01-01 +1 1100 2019-01-01 2019-12-01 +delete from cars for portion of p from '2019-12-10' to '2019-12-20'; +select * from cars; +id price s e +1 1000 2018-01-01 2019-01-01 +1 1000 2019-12-01 2019-12-10 +1 1000 2019-12-20 2020-01-01 +1 1100 2019-01-01 2019-12-01
ok
+# AUTO_INCREMENT field is separate from WITHOUT OVERLAPS +create or replace table cars(id int primary key auto_increment, +car_id int, +price int, s date, e date, +period for p(s,e), +unique(car_id, p without overlaps)); +insert cars(car_id, price, s, e) values (1, 1000, '2018-01-01', '2020-01-01'); +select * from cars; +id car_id price s e +1 1 1000 2018-01-01 2020-01-01 +update cars for portion of p from '2019-01-01' to '2019-12-01' set price= 1100; +select * from cars; +id car_id price s e +1 1 1100 2019-01-01 2019-12-01 +2 1 1000 2018-01-01 2019-01-01 +3 1 1000 2019-12-01 2020-01-01 +delete from cars for portion of p from '2019-12-10' to '2019-12-20'; +select * from cars; +id car_id price s e +1 1 1100 2019-01-01 2019-12-01 +2 1 1000 2018-01-01 2019-01-01 +4 1 1000 2019-12-01 2019-12-10 +5 1 1000 2019-12-20 2020-01-01
right
+# AUTO_INCREMENT field is both standalone and in WITHOUT OVERLAPS +create or replace table cars(id int primary key auto_increment, +price int, s date, e date, +period for p(s,e), +unique(id, p without overlaps)); +insert cars(price, s, e) values (1000, '2018-01-01', '2020-01-01'); +select * from cars; +id price s e +1 1000 2018-01-01 2020-01-01 +update cars for portion of p from '2019-01-01' to '2019-12-01' set price= 1100; +ERROR 23000: Duplicate entry '1' for key 'PRIMARY'
uhm, I don't know. I'd expect that here PRIMARY would define autoincrement values.
+truncate cars; +insert cars(price, s, e) values (1000, '2018-01-01', '2020-01-01'); +delete from cars for portion of p from '2019-12-10' to '2019-12-20'; +ERROR 23000: Duplicate entry '1' for key 'PRIMARY' +# AUTO_INCREMENT field is non-unique
not sure here. I think here unique should define autoincrement values. That is, in both cases, the most restrictive index should define.
+create or replace table cars(id int auto_increment, key(id), +car_id int, +price int, s date, e date, +period for p(s,e), +unique(car_id, p without overlaps)); +insert cars(car_id, price, s, e) values (1, 1000, '2018-01-01', '2020-01-01'); +select * from cars; +id car_id price s e +1 1 1000 2018-01-01 2020-01-01 +update cars for portion of p from '2019-01-01' to '2019-12-01' set price= 1100; +select * from cars; +id car_id price s e +1 1 1100 2019-01-01 2019-12-01 +2 1 1000 2018-01-01 2019-01-01 +3 1 1000 2019-12-01 2020-01-01 +delete from cars for portion of p from '2019-12-10' to '2019-12-20'; +select * from cars; +id car_id price s e +1 1 1100 2019-01-01 2019-12-01 +2 1 1000 2018-01-01 2019-01-01 +4 1 1000 2019-12-01 2019-12-10 +5 1 1000 2019-12-20 2020-01-01
please also add test for INSERT. Just add a second insert like insert cars(price, s, e) values (1000, '2021-01-01', '2022-01-01'); to show that id will *not* be 1 here, even though 1 would not conflict with anything in the specified date range. Simply to have the expected behavior recorded as a test result.
+drop table cars; diff --git a/sql/table.cc b/sql/table.cc --- a/sql/table.cc +++ b/sql/table.cc @@ -8917,6 +8916,21 @@ int TABLE::update_generated_fields() return res; }
+void TABLE::period_prepare_autoinc() +{ + if (!found_next_number_field) + return; + + key_map::Iterator it(found_next_number_field->part_of_key_not_clustered); + for (uint i; (i= it++) != key_map::Iterator::BITMAP_END;) + { + KEY &key= key_info[i]; + if (key.without_overlaps) + return; + }
No, I don't think you need to check *all* indexes that the field is part of. I believe you should only check the "main" index, the one that defines autoincrement field behavior. That is only if (key_info[s->next_number_index].without_overlaps) return;
+ next_number_field= found_next_number_field; +} + int TABLE::period_make_insert(Item *src, Field *dst) { THD *thd= in_use;
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Hello Sergei!
+create or replace table cars(id int primary key auto_increment, +price int, s date, e date, +period for p(s,e), +unique(id, p without overlaps)); +insert cars(price, s, e) values (1000, '2018-01-01', '2020-01-01'); +select * from cars; +id price s e +1 1000 2018-01-01 2020-01-01 +update cars for portion of p from '2019-01-01' to '2019-12-01' set
+# AUTO_INCREMENT field is both standalone and in WITHOUT OVERLAPS price= 1100;
+ERROR 23000: Duplicate entry '1' for key 'PRIMARY'
uhm, I don't know. I'd expect that here PRIMARY would define autoincrement values.
+truncate cars; +insert cars(price, s, e) values (1000, '2018-01-01', '2020-01-01'); +delete from cars for portion of p from '2019-12-10' to '2019-12-20'; +ERROR 23000: Duplicate entry '1' for key 'PRIMARY' +# AUTO_INCREMENT field is non-unique
not sure here. I think here unique should define autoincrement values. That is, in both cases, the most restrictive index should define.
+void TABLE::period_prepare_autoinc() +{ + if (!found_next_number_field) + return; + + key_map::Iterator it(found_next_number_field->part_of_key_not_clustered); + for (uint i; (i= it++) != key_map::Iterator::BITMAP_END;) + { + KEY &key= key_info[i]; + if (key.without_overlaps) + return; + }
No, I don't think you need to check *all* indexes that the field is part of. I believe you should only check the "main" index, the one that defines autoincrement field behavior. That is only
if (key_info[s->next_number_index].without_overlaps) return;
I don't understand the notion of the "autoincrement index". Why any index is important at all for an autoincrement field? Other than that, if next_number_index means the most restrictive one, it may make sense. -- Yours truly, Nikita Malyavin
Hi, Nikita, On Jan 09, Nikita Malyavin wrote:
I don't understand the notion of the "autoincrement index". Why any index is important at all for an autoincrement field?
Three examples: ##################### create table t1 (a int, b int not null auto_increment, unique ub (b)); insert t1 (a) values (1),(2),(1),(3); select * from t1; prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 2 | 2 | | 1 | 3 | | 3 | 4 | +------+---+ ##################### create table t1 (a int, b int not null auto_increment, unique uab (a,b)); insert t1 (a) values (1),(2),(1),(3); select * from t1; prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 1 | 2 | | 2 | 1 | | 3 | 1 | +------+---+ ##################### create table t1 (a int, b int not null auto_increment, unique uab (a,b), unique ub (b)); insert t1 (a) values (1),(2),(1),(3); select * from t1; prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 1 | 3 | | 2 | 2 | | 3 | 4 | +------+---+ in the third example the auto-inc column takes part in two indexes. But only one of them - ub - defines how auto-inc values are generated. The "value generator" is a pair {b, ub} - a column and an index. Looking at the column declaration alone you cannot predict how values will be generated. Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
OMG! 😯 So, it's really not what's called *identity field* in the standard, since there's no connection with key, it's just always generated. And now your approach looks even better. One more question: create table t1 (a int, b int not null auto_increment, unique uab (a,b));
insert t1 (a) values (1),(2),(1),(3); select * from t1;
prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 1 | 2 | | 2 | 1 | | 3 | 1 | +------+---+
Why not 1 2 1 2 in column b? I understood the behavior as "generate a new value only if we have a conflict with the current value". And then in the insertion order the values for b would be 1, 1, 2, 2. -- Yours truly, Nikita Malyavin
Hi, Nikita, On Jan 09, Nikita Malyavin wrote:
So, it's really not what's called *identity field* in the standard, since there's no connection with key, it's just always generated.
Exactly. That's why I mean that IDENTITY in the standard has no knowledge of periods, so it cannot be "with" or "without" overlaps. And auto-inc is a pair of {column, key} so it can be. Thus "auto-inc without overlaps" is our extension, and we can select for it any behavior we think is better, it doesn't conflict with the standard.
One more question:
create table t1 (a int, b int not null auto_increment, unique uab (a,b));
insert t1 (a) values (1),(2),(1),(3); select * from t1;
prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 1 | 2 | | 2 | 1 | | 3 | 1 | +------+---+
Why not 1 2 1 2 in column b? I understood the behavior as "generate a new value only if we have a conflict with the current value". And then in the insertion order the values for b would be 1, 1, 2, 2.
no, why, the pair {3,1} does not conflict with {2,1}. Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
create table t1 (a int, b int not null auto_increment, unique uab (a,b));
insert t1 (a) values (1),(2),(1),(3); select * from t1;
prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 1 | 2 | | 2 | 1 | | 3 | 1 | +------+---+
Why not 1 2 1 2 in column b? I understood the behavior as "generate a new value only if we have a conflict with the current value". And then in the insertion order the values for b would be 1, 1, 2, 2.
no, why, the pair {3,1} does not conflict with {2,1}.
So, it always begins from 1 again, if no conflicts? -- Yours truly, Nikita Malyavin
Hi, Nikita, On Jan 11, Nikita Malyavin wrote:
create table t1 (a int, b int not null auto_increment, unique uab (a,b));
insert t1 (a) values (1),(2),(1),(3); select * from t1;
prints: +------+---+ | a | b | +------+---+ | 1 | 1 | | 1 | 2 | | 2 | 1 | | 3 | 1 | +------+---+
Why not 1 2 1 2 in column b? I understood the behavior as "generate a new value only if we have a conflict with the current value". And then in the insertion order the values for b would be 1, 1, 2, 2.
no, why, the pair {3,1} does not conflict with {2,1}.
So, it always begins from 1 again, if no conflicts?
Yes. In a way it's like a separate generator of b values for every distinct value of a. Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Sergei, here's another example: create or replace table t(a int not null auto_increment, b int, primary key (a, b)); insert t(b) values (1); insert t(b) values (2); select * from t; a b 1 1 2 2 However there's no collision! Switching key parts ordering to (b,a) changes the behavior. I am still struggling with this because create or replace table cars(id int unique auto_increment, price int, s date, e date, period for p(s,e), primary key(id, p without overlaps)); has primary key as an autoincrement index! And as a consequence, the following fails: insert cars(price, s, e) values (1000, '2018-01-01', '2020-01-01'); insert cars(price, s, e) values (1000, '2021-01-01', '2022-01-01'); update cars for portion of p from '2019-01-01' to '2019-12-01' set price= 1100; As you may guess, the same applies to the table t, if we'd add unique(a) - it doesn't become the autoincrement index in the code, but the behavior is the same... Should I maybe update the key determination rules for such case? -- Yours truly, Nikita Malyavin
The new behavior can be found in this commit, the line is highlighted https://github.com/MariaDB/server/commit/09bf45a89ee23adc7ce05c46ba859a74b13... -- Yours truly, Nikita Malyavin
Hi, Nikita, On Jan 12, Nikita Malyavin wrote:
The new behavior can be found in this commit, the line is highlighted https://github.com/MariaDB/server/commit/09bf45a89ee23adc7ce05c46ba859a74b13...
Why do you have ER_DUP_ENTRY? There are three non-overlapping ranges, they all can have id=1 without conflicting. Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
Because the first key is a single field unique пт, 12 янв. 2024 г., 22:38 Sergei Golubchik <serg@mariadb.org>:
Hi, Nikita,
On Jan 12, Nikita Malyavin wrote:
The new behavior can be found in this commit, the line is highlighted
https://github.com/MariaDB/server/commit/09bf45a89ee23adc7ce05c46ba859a74b13...
Why do you have ER_DUP_ENTRY? There are three non-overlapping ranges, they all can have id=1 without conflicting.
Regards, Sergei Chief Architect, MariaDB Server and security@mariadb.org
participants (2)
-
Nikita Malyavin
-
Sergei Golubchik