Re: [Maria-developers] da5a72f32d4: MDEV-31033 ER_KEY_NOT_FOUND upon online COPY ALTER on a partitioned table
Hi, Nikita, On May 05, Nikita Malyavin wrote:
revision-id: da5a72f32d4 (mariadb-11.0.1-114-gda5a72f32d4) parent(s): 0d6584c019c author: Nikita Malyavin committer: Nikita Malyavin timestamp: 2023-05-01 01:15:41 +0300 message:
MDEV-31033 ER_KEY_NOT_FOUND upon online COPY ALTER on a partitioned table
The row events were applied "twice": once for the ha_partition, and one more time for the underlying storage engine.
There's no such problem in binlog/rpl, because ha_partiton::row_logging is normally set to false.
s/ha_partiton/ha_partition/
The proposed fix adds table->skip_online_logging field, preventing the underlying engines from replicating online events. Only upper layer engine should do this.
table->skip_online_logging is initialized to 0 together with the table itself. Normally it is always 0, except for the intermediate overriding part, once it is checked in handler::binlog_log_row.
Note on MDEV-31040 (Server crashes in MYSQL_LOG::is_open upon ALTER on partitioned table): First the solution was to override table->s->online_alter_binlog, and set it to NULL during the partitioning operation, which was ridiculously wrong: no-one is supposed to modify that field outside ALTER! That field write access is protected by EXCLUSIVE metadata lock.
I don't think you need to list your different (non-working) attemps to fix the bug in the commit comment. Just explain the fix that works, not the fix that doesn't.
In comparison, table->skip_online_logging is a connection-local field.
The test is a non-deterministic one. It wasn't convenient to construct a deterministic test here, besides for the case that will never be repeated.
Is it still the case? Your test in alter_table_online_debug.test with debug_sync - is it non-deterministic?
diff --git a/sql/handler.cc b/sql/handler.cc index 3fcbb5171d6..a7e229e8f8d 100644 --- a/sql/handler.cc +++ b/sql/handler.cc @@ -7280,7 +7280,8 @@ int handler::binlog_log_row(const uchar *before_record, log_func, row_logging_has_trans);
#ifdef HAVE_REPLICATION - if (unlikely(!error && table->s->online_alter_binlog)) + if (unlikely(!error && table->s->online_alter_binlog && + !table->skip_online_logging))
this doesn't need any juggling with skip_online_logging. This problem is already solved for, say, long uniques. You can use this == table->file condition to filter out individual partitions and have your binlog_log_row_online_alter called only for the main ha_partition handler.
error= binlog_log_row_online_alter(table, before_record, after_record, log_func); #endif // HAVE_REPLICATION
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Hi, On Fri, 5 May 2023 at 20:46, Sergei Golubchik <serg@mariadb.org> wrote:
#ifdef HAVE_REPLICATION - if (unlikely(!error && table->s->online_alter_binlog)) + if (unlikely(!error && table->s->online_alter_binlog && + !table->skip_online_logging))
this doesn't need any juggling with skip_online_logging. This problem is already solved for, say, long uniques. You can use
this == table->file
condition to filter out individual partitions and have your binlog_log_row_online_alter called only for the main ha_partition handler.
This doesn't look reliable -- won't there be any cases when we make a change from another handler, not table->file? If that never happens, we could add DBUG_ASSERT(this == table->faile) in ha_innodb's/ha_myisam's ha_*_row methods. -- Yours truly, Nikita Malyavin
Hi, Nikita, On May 08, Nikita Malyavin wrote:
On Fri, 5 May 2023 at 20:46, Sergei Golubchik <serg@mariadb.org> wrote:
#ifdef HAVE_REPLICATION - if (unlikely(!error && table->s->online_alter_binlog)) + if (unlikely(!error && table->s->online_alter_binlog && + !table->skip_online_logging))
this doesn't need any juggling with skip_online_logging. This problem is already solved for, say, long uniques. You can use
this == table->file
condition to filter out individual partitions and have your binlog_log_row_online_alter called only for the main ha_partition handler.
This doesn't look reliable -- won't there be any cases when we make a change from another handler, not table->file?
So far, that's how it works - updates come through table->file
If that never happens, we could add DBUG_ASSERT(this == table->file) in ha_innodb's/ha_myisam's ha_*_row methods.
Of course, not. That's the whole point. The server calls ha_partition::ha_write_row, which calls, in turn, ha_innodb::ha_write_row. For individual partitions this != table->file, that's how we make sure that certain code (e.g. long unique checks) are only run once and not repeated per partition. Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
Sorry, the assertion should be DBUG_ASSERT(table->file == this || dynamic_cast<ha_innobase*>(table->file) == NULL); Well, the tests seem passing after adding this assertion (and a similar one for myisam): https://github.com/MariaDB/server/commit/f221bc46e6981c9e7c5d2806faefb1ed086... Okay, I think I can agree on this approach, but then I want these assertions to be committed as well. dynamic_cast can be replaced with hton comparison, of your choice. Maybe somebody later will need row changes from a separate handler, imagine for example cascade foreign key updates, which would have used a lookup handler for the cascade changes. For that case, I'll try to gather these table->file comparisons under a single method. Alternatively, we could introduce some bool handler::root_handler just now, without waiting for a demand, but I'm afraid it could become another source for bugs. So better later on demand. On Mon, 8 May 2023 at 22:49, Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Nikita,
On May 08, Nikita Malyavin wrote:
On Fri, 5 May 2023 at 20:46, Sergei Golubchik <serg@mariadb.org> wrote:
#ifdef HAVE_REPLICATION - if (unlikely(!error && table->s->online_alter_binlog)) + if (unlikely(!error && table->s->online_alter_binlog && + !table->skip_online_logging))
this doesn't need any juggling with skip_online_logging. This problem is already solved for, say, long uniques. You can use
this == table->file
condition to filter out individual partitions and have your binlog_log_row_online_alter called only for the main ha_partition handler.
This doesn't look reliable -- won't there be any cases when we make a change from another handler, not table->file?
So far, that's how it works - updates come through table->file
If that never happens, we could add DBUG_ASSERT(this == table->file) in ha_innodb's/ha_myisam's ha_*_row methods.
Of course, not. That's the whole point. The server calls ha_partition::ha_write_row, which calls, in turn, ha_innodb::ha_write_row. For individual partitions this != table->file, that's how we make sure that certain code (e.g. long unique checks) are only run once and not repeated per partition.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- Yours truly, Nikita Malyavin
Hi, Nikita, On May 08, Nikita Malyavin wrote:
Sorry, the assertion should be DBUG_ASSERT(table->file == this || dynamic_cast<ha_innobase*>(table->file) == NULL);
Well, the tests seem passing after adding this assertion (and a similar one for myisam): https://github.com/MariaDB/server/commit/f221bc46e6981c9e7c5d2806faefb1ed086...
Okay, I think I can agree on this approach, but then I want these assertions to be committed as well. dynamic_cast can be replaced with hton comparison, of your choice.
hton comparison, please, otherwise ok.
Maybe somebody later will need row changes from a separate handler, imagine for example cascade foreign key updates, which would have used a lookup handler for the cascade changes. For that case, I'll try to gather these table->file comparisons under a single method.
Alternatively, we could introduce some bool handler::root_handler just now, without waiting for a demand, but I'm afraid it could become another source for bugs. So better later on demand.
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
It's done: 56e2cd20ed3 <https://github.com/MariaDB/server/commit/56e2cd20ed3ea0f826973c0dcfaed933779fc038> -- Yours truly, Nikita Malyavin
participants (2)
-
Nikita Malyavin
-
Sergei Golubchik