Re: [PATCH] MDEV-32551: "Read semi-sync reply magic number error" warnings on master
Hi Monty,
From: Michael Widenius <monty@mariadb.com>
commit 7abf83f86829ed94de618af4749eb96d5379bd90 (origin/bb-10.6-monty) Author: Michael Widenius <monty@mariadb.com> Date: Wed Nov 8 16:57:58 2023 -0700
MDEV-32551: "Read semi-sync reply magic number error" warnings on master
I read through the patch, looks good. A few comments below, mostly minor things and typos that you can consider as you like. - Kristian.
- We should in 11.3 changed the default value for rpl-semi-sync-master-wait-no-slave from TRUE to FALSE as the TRUE
Sys_semisync_master_wait_no_slave( "rpl_semi_sync_master_wait_no_slave", - "Wait until timeout when no semi-synchronous replication slave " - "available (enabled by default).", + "Wait until timeout when no semi-synchronous replication slave is " + "available.", GLOBAL_VAR(rpl_semi_sync_master_wait_no_slave), CMD_LINE(OPT_ARG), DEFAULT(TRUE), - NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0), - ON_UPDATE(fix_rpl_semi_sync_master_wait_no_slave)); + NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0));
-my_bool rpl_semi_sync_master_wait_no_slave = 1; +my_bool rpl_semi_sync_master_wait_no_slave = 0;
There seems to be som inconsistency here: - The commit is based on 10.6, but the commit comment says 11.3 - The option help text is changed in the commit - The "DEFAULT(TRUE)" in the sysvar definition is _not_ changed in the commit - The C initialization values of the my_bool rpl_semi_sync_master_wait_no_slave _is_ changed in the commit. Should it be DEFAULT(FALSE) also in this patch, and the commit message say "10.6"? Or should these changes be removed from the 10.6 patch and put in 11.3 instead? At least this looks inconsistent.
diff --git a/mysql-test/include/wait_for_pattern_in_file.inc b/mysql-test/include/wait_for_pattern_in_file.inc new file mode 100644 index 00000000000..d088cb125c2 --- /dev/null +++ b/mysql-test/include/wait_for_pattern_in_file.inc @@ -0,0 +1,54 @@ +# ==== Purpose ==== +# +# Waits until pattern comes into log file or until a timeout is reached.
Nice addition, this will be useful for other things!
+--let $keep_include_silent= 1
Why do you set this in the .inc? (Normally this is set in the .test files that want to skip the printout). +let SEARCH_SILENT=1; Maybe the .inc should restore the old value at the end (in case a .test will at some point set it globally for its own calls to include/search_pattern_in_file.inc)
+--let $keep_include_silent= 0
Here, better restore the original value (in case some .test has set it to 1 for itself).
+# Test is binlog format indepdendent, so save resources
(Typo s/indepdendent/independent/)
+# The followowing command is likely to cause the slave master is not yet setup +# for semi-sync
(Typo s/followowing/following/)
- strcpy(m_reply_file_name, ""); - strcpy(m_wait_file_name, ""); + m_reply_file_name[0]= m_wait_file_name[0]= 0;
Interestingly, GCC actually creates identical code for these (it knows to optimize strcpy()). (I'm just mentioning this as I know we both are interested in low-level code details, not asking you to revert the change). Example code: ----------------------------------------------------------------------- #include <string.h> struct X { char m_a[100]; char m_b[10]; void init1(); void init2(); }; void X::init1() { strcpy(m_a, ""); strcpy(m_b, ""); } void X::init2() { m_a[0]= m_b[0]= 0; } ----------------------------------------------------------------------- Result from gcc -O3: ----------------------------------------------------------------------- .type _ZN1X5init1Ev, @function _ZN1X5init1Ev: movb $0, (%rdi) movb $0, 100(%rdi) ret .type _ZN1X5init2Ev, @function _ZN1X5init2Ev: movb $0, 100(%rdi) movb $0, (%rdi) ret ----------------------------------------------------------------------- Identical, except that we now initialize in the opposite order ;-)
+#ifndef DBUG_OFF +void dbug_verify_no_duplicate_slaves(Slave_ilist *m_slaves, THD *thd) +{
You could make this (debug-only) function "static".
+ Master is not responding (gone away?) or it has turned semi sync + off. Turning off semi-sync repsonces as there is no point in sending
(Typo s/repsonces/responses/)
Hi! Thanks for doing a review ! On Wed, Nov 22, 2023 at 10:28 PM Kristian Nielsen <knielsen@knielsen-hq.org> wrote: <cut>
- We should in 11.3 changed the default value for rpl-semi-sync-master-wait-no-slave from TRUE to FALSE as the TRUE
Sys_semisync_master_wait_no_slave( "rpl_semi_sync_master_wait_no_slave", - "Wait until timeout when no semi-synchronous replication slave " - "available (enabled by default).", + "Wait until timeout when no semi-synchronous replication slave is " + "available.", GLOBAL_VAR(rpl_semi_sync_master_wait_no_slave), CMD_LINE(OPT_ARG), DEFAULT(TRUE), - NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0), - ON_UPDATE(fix_rpl_semi_sync_master_wait_no_slave)); + NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0));
-my_bool rpl_semi_sync_master_wait_no_slave = 1; +my_bool rpl_semi_sync_master_wait_no_slave = 0;
There seems to be som inconsistency here:
- The commit is based on 10.6, but the commit comment says 11.3 - The option help text is changed in the commit - The "DEFAULT(TRUE)" in the sysvar definition is _not_ changed in the commit - The C initialization values of the my_bool rpl_semi_sync_master_wait_no_slave _is_ changed in the commit.
Should it be DEFAULT(FALSE) also in this patch, and the commit message say "10.6"? Or should these changes be removed from the 10.6 patch and put in 11.3 instead? At least this looks inconsistent.
This was more to prepare for future changes. Also, normally we initialize all sys variables to 0 as it us up to sys_vars to set the initial values. (We should not have to modify things in many places when we change defaults). removed the 'default' text as we do not do that for other options and I think it's wrong to promote something as 'default' that will change in the future This means that in 11.3 we only have to update the default value in sys_vars and update a few result files. For me saying 'default' for only one option in help, indicates that this is the best possible option, which it is not For completeness I added back the initial setting (which does nothing).
diff --git a/mysql-test/include/wait_for_pattern_in_file.inc b/mysql-test/include/wait_for_pattern_in_file.inc new file mode 100644 index 00000000000..d088cb125c2 --- /dev/null +++ b/mysql-test/include/wait_for_pattern_in_file.inc @@ -0,0 +1,54 @@ +# ==== Purpose ==== +# +# Waits until pattern comes into log file or until a timeout is reached.
Nice addition, this will be useful for other things!
+--let $keep_include_silent= 1
Why do you set this in the .inc? (Normally this is set in the .test files that want to skip the printout).
I do not want to clutter the output from the call to include/search_pattern_in_file.inc which can happen a random number of times.
+let SEARCH_SILENT=1;
Maybe the .inc should restore the old value at the end (in case a .test will at some point set it globally for its own calls to include/search_pattern_in_file.inc)
SEARCH_SILENT was added to be used only by wait_for_pattern_in_file.inc If someone needs the behavior, better to call wait_for_pattern_in_file.inc
+--let $keep_include_silent= 0
Here, better restore the original value (in case some .test has set it to 1 for itself).
Will do.
+# Test is binlog format indepdendent, so save resources
(Typo s/indepdendent/independent/)
+# The followowing command is likely to cause the slave master is not yet setup +# for semi-sync
(Typo s/followowing/following/)
both commands.
- strcpy(m_reply_file_name, ""); - strcpy(m_wait_file_name, ""); + m_reply_file_name[0]= m_wait_file_name[0]= 0;
Interestingly, GCC actually creates identical code for these (it knows to optimize strcpy()). (I'm just mentioning this as I know we both are interested in low-level code details, not asking you to revert the change).
I actually assumed gcc would do that. However I think the change makes the code more clean and precise (I also do not like strcpy() as in most cases the return value is unusable and gets people to use strcat(), which is bad). Example code: <cut> Thanks for looking deep into this. It is always interesting to get ones assumptions proved.
+#ifndef DBUG_OFF +void dbug_verify_no_duplicate_slaves(Slave_ilist *m_slaves, THD *thd) +{
You could make this (debug-only) function "static".
Good point. Done.
+ Master is not responding (gone away?) or it has turned semi sync + off. Turning off semi-sync repsonces as there is no point in sending
(Typo s/repsonces/responses/)
Fixed. Thanks for the review! Regards, Monty
participants (2)
-
Kristian Nielsen
-
Michael Widenius