[MariaDB developers] Re: [PATCH] MDEV-34481 optimize away waiting for owned by prepared xa non-unique index record

15 Aug 2024

      Andrei Elkin <andrei.elkin@mariadb.com> writes:
...
...
This is not the way to communicate back from engine to server. Instead,
simply return a different error code than HA_ERR_LOCK_WAIT_TIMEOUT from
InnoDB that you can check for here. You can maybe use one of the existing
HA_ERR_* from include/my_base.h, or add a new one if necessary.
...
However, adding a new error is clearly more intrusive.
The new error is completely handled inside the code you add for signalling
the condition inside InnoDB locking code and handling it in replication row
event apply. It will not be visible to users. So it is not intrusive to add
a new one. Just be careful to not pick a number that conflicts with another
entry in a later branch, if you do this in a GA release.

Alternatively, you can use the existing HA_ERR_ROW_NOT_VISIBLE error, which
is used in a somewhat similar way in storage/maria/ma_blockrec.c and nowhere
else. But I would just add a new error for clarity.
...
And at the same time I thought a some sort of runtime state the slave
applier had always missed. So the intent of this struct is future-minded
as well.
This is not the point. Even if the rgi->exec_flags already existed, it would
still be wrong to use it for this.

Here is the code where you are handling the condition:
...
+      if (error == HA_ERR_LOCK_WAIT_TIMEOUT && !is_index_unique &&
+          (rgi->exec_flags & (1 << rpl_group_info::HIT_BUSY_INDEX)))
+      {
+        rgi->exec_flags &= ~(1 << rpl_group_info::HIT_BUSY_INDEX);
+        skip_first_compare= true;
So you _already_ are using a specific error code to signal the condition
from InnoDB to replication, the error HA_ERR_LOCK_WAIT_TIMEOUT. But this
error can also be thrown in other situations, so you need to add these extra
conditions to distinguish.

So essentially, you are using the wrong error code to signal the condition.
Use a correct error code that uniquely identifies the condition that
occured, and then you can avoid these extra conditions and simplify the code.
...
I would be fine with either method.
Good, then please change it to return a proper error code.

The only reason that you are using HA_ERR_LOCK_WAIT_TIMEOUT is that then you
could do the quick change inside InnoDB to set its timeout value to 0, and
this happens to then propagate the HA_ERR_LOCK_WAIT_TIMEOUT up to the
callers without waiting. But that's not an appropriate way to design new
code and APIs. Yes, it is frustrating when conceptually simple changes need
extra effort, due to the need to work in an existing old and complex code
base like MariaDB. But that is just the way it is, it is the same for all of
us.

 - Kristian.

[MariaDB developers] Re: [PATCH] MDEV-34481 optimize away waiting for owned by prepared xa non-unique index record

Kristian Nielsen