
Hi Sergei! On Fri, Aug 27, 2021 at 3:23 PM Sergei Golubchik <serg@mariadb.org> wrote:
Hi, Aleksey!
On Aug 26, Aleksey Midenkov wrote:
Hi Sergei!
The updated branch is cabd0530 bb-10.7-midenok-MDEV-22166
Commits
cabd0530426 Parser refactoring db69d4eb025 Cleanups
are fine, thanks. The main change in the simplified diff is:
diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc --- a/sql/log_event_server.cc +++ b/sql/log_event_server.cc @@ -1299,7 +1299,7 @@ bool Query_log_event::write() if (thd && thd->binlog_xid) { *start++= Q_XID; - int8store(start, thd->query_id); + int8store(start, thd->binlog_xid);
I agree that it looks correct. How did you find it? Does your patch introduce a case where thd->binlog_xid != thd->query_id ?
I just tried to update xid via query_id and failed. The interface of updating xid is quite weird by having to copy query_id to binlog_xid and then clear binlog_xid back. I don't understand why this is needed.
start+= 8; }
diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc --- a/sql/sql_partition.cc +++ b/sql/sql_partition.cc @@ -7439,18 +7439,28 @@ uint fast_alter_partition_table(THD *thd, TABLE *table, alter_partition_extract(lpt) || (frm_install= TRUE, FALSE) || mysql_write_frm(lpt, WFRM_INSTALL_SHADOW|WFRM_BACKUP_ORIGINAL) || log_partition_alter_to_ddl_log(lpt) || (frm_install= FALSE, FALSE) || + ((!thd->lex->no_write_to_binlog) && + (thd->binlog_xid= thd->query_id, + ddl_log_update_xid(lpt->part_info, thd->binlog_xid), + write_bin_log(thd, false, thd->query(), thd->query_length()), + thd->binlog_xid= 0)) ||
do I understand it correctly - if the crash happens before the binlog write, ddl log on recovery will undo everything. If the crash happens after, it'll see the xid in binlog and will not undo the operation?
Yes. Don't forget about error handling. We rollback everything on error, that is important for atomicity too. And that was not done for the main code of mysql_alter_table() AFAIK.
+ /* + Note: this crash-point does not cleanup backup (see WFRM_DROP_BACKUP), + but this crash-point is low probability as ddl_log_complete() is + fast operation. + */
I still don't quite understand why did you introduce a new method with frm backup and a (low probability) window where it won't clean up.
Isn't it similar to normal ALTER TABLE? It also needs to replace frms and it is solved with ddl log - if a crash happens after the point of no return (after the frm was replaced with a new one), ddl log will binlog the query on recovery.
Could you do the same, like
case DDL_LOCK_MIGRATE_PARTITION_ACTION: if (shadow frm no longer exists && xid not in binlog) write_bin_log(...);
would that work? Or would partitioning-specific old ddl logging code interfere?
I believe ALTER TABLE atomicity is not the perfect one in respect of rollback on error so why should that be an example for me? Another issue with the scheme you proposed is worse complexity and reliability. Why do I have to do some vague logic about binlogging if it can be straightforward and simple? So either I have to expand frm handling or add another DDL log action: why are you asking me not to do the former and do the latter? I can create the second chain for cleaning up the leftover backups and that would be anyway better than binlogging in 2 places.
(ddl_log_complete(lpt->part_info), false) || - ((!thd->lex->no_write_to_binlog) && - (write_bin_log(thd, FALSE, - thd->query(), thd->query_length()), FALSE)) || mysql_write_frm(lpt, WFRM_DROP_BACKUP) ||
Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org
-- All the best, Aleksey Midenkov @midenok