Re: [Maria-developers] multi-source replication, first (untested) version
Michael Widenius <monty@askmonty.org> writes:
I have now got the first version of multi source patch ported to MariadB.
As requested, I did a quick read through of the patch. I took the patch from your 10.0-mdev253 tree, this revision: revid:monty@askmonty.org-20120930233044-e0m6u6t9q3jeqx03 Below are my comments. Overall, it was nice to see the relatively modest changes needed to get this working. - Kristian. -----------------------------------------------------------------------
=== modified file 'sql/sys_vars.cc' --- sql/sys_vars.cc 2012-09-22 12:30:24 +0000 +++ sql/sys_vars.cc 2012-10-01 09:55:23 +0000
+static Sys_var_multi_source_ulong +Sys_slave_skip_counter("sql_slave_skip_counter", + "Skip the next N events from the master log", + SESSION_VAR(slave_skip_counter), + NO_CMD_LINE, + offsetof(Master_info, rli.slave_skip_counter),
This use of offsetof() on a C++ class (non-POD) is incorrect. Even if it works for you with your current version of GCC and current compiler options, it is likely to break at some point on some future compiler. And it will be hell to track down. Look, I understand that you do not like some of the ways that the C++ language is defined, like here that offsetof() is not allowed on non-POD data structures. But we have to live with it. Surely, it is better to have a standard language that compiler writers and compiler users agree on how works - even if that language is less convenient than what we would like. So let us please change this to be correct C++. I can change it for you if you want.
=== modified file 'sql/slave.cc' --- sql/slave.cc 2012-09-13 12:31:29 +0000 +++ sql/slave.cc 2012-10-01 09:55:23 +0000
+bool show_all_master_info(THD* thd) +{ + uint i, elements; + Master_info **tmp; + DBUG_ENTER("show_master_info"); + + if (send_show_master_info_header(thd, active_mi, 1)) + DBUG_RETURN(TRUE); + + if (!(elements= master_info_index->master_info_hash.records)) + goto end; + + /* + Sort lines to get them into a predicted order + (needed for test cases and to not confuse users) + */ + if (!(tmp= (Master_info**) thd->alloc(sizeof(Master_info*) * elements))) + DBUG_RETURN(TRUE); + + for (i= 0; i < elements; i++) + { + tmp[i]= (Master_info *) my_hash_element(&master_info_index-> + master_info_hash, i); + }
Isn't there a race condition here? What happens if the number of elements changes between looking up number of elements and the loop? If there is a lock protecting this, please add a safe_mutex_assert_owner() and/or a comment. The call of show_all_master_info() is like this: mysql_mutex_lock(&LOCK_active_mi); if (lex->verbose) res= show_all_master_info(thd); Does LOCK_active_mi protect all changes to multi-master state? If so, I think it should be renamed - "active_mi" is rather confusing in this case.
=== modified file 'sql/sql_class.h' --- sql/sql_class.h 2012-09-22 14:11:40 +0000 +++ sql/sql_class.h 2012-10-01 09:55:23 +0000
+ /** + Place holders to store Multi-source variables in sys_var.cc during + update and show of variables. + */ + ulong slave_skip_counter; + ulong max_relay_log_size;
+ /* Names. These will be allocated in buffers in thd */ + LEX_STRING default_master_connection; +
+ LEX_STRING connection_name; /* If slave */ + char default_master_connection_buff[MAX_CONNECTION_NAME+1];
Is there not some other way to do this? THD seems to have become some kind of garbage dump for all kinds of special-purpose data. This is not good, it pollutes the CPU caches for all threads, and generally makes everything a mess :-/ I would like to see us move towards having less things in THD that are not relevant to all threads, not more. Could we add a single void * somewhere in THD for special-purpose threads? And then replication threads could use that for their special-purpose stuff. Or maybe sub-class THD. Then later we could start moving other stuff out of THD that is not relevant to all threads...
=== modified file 'sql/rpl_rli.h' --- sql/rpl_rli.h 2012-08-27 16:13:17 +0000 +++ sql/rpl_rli.h 2012-10-01 09:55:23 +0000
+ volatile ulong slave_skip_counter; /* Must be ulong */
Such comments are extremely unhelpful. Instead explain why it must be ulong.
=== modified file 'sql/item_create.cc' --- sql/item_create.cc 2012-01-13 14:50:02 +0000 +++ sql/item_create.cc 2012-10-01 09:55:23 +0000 @@ -4393,27 +4393,36 @@ Create_func_master_pos_wait::create_nati
+ thd->lex->safe_to_cache_query= 0;
case 3:
thd->lex->safe_to_cache_query= 0;
+ case 4:
+ thd->lex->safe_to_cache_query= 0;
Isn't it redundant to set safe_to_cache_query twice here?
=== modified file 'sql/rpl_mi.cc' --- sql/rpl_mi.cc 2012-03-27 23:04:46 +0000 +++ sql/rpl_mi.cc 2012-10-01 09:55:23 +0000
+ /* Store connection name and lower case connection name */ + connection_name.length= cmp_connection_name.length= + connection_name_arg->length; + if ((connection_name.str= (char*) my_malloc(connection_name_arg->length*2+2, + MYF(MY_WME)))) + { + cmp_connection_name.str= (connection_name.str + + connection_name_arg->length+1); + strmake(connection_name.str, connection_name_arg->str, + connection_name.length); + memcpy(cmp_connection_name.str, connection_name_arg->str, + connection_name.length+1); + my_casedn_str(system_charset_info, cmp_connection_name.str); + }
Is it really correct to ignore out-of-memory error here, and just do nothing? Ah, I think this is handled by checking Master_info::error() in other places? If so, then please add a comment here explaining this.
+/** + Create a log file with a signed suffix.
I do not know what is meant with the term "signed suffix". This is not something that people will understand. Does it mean "Create a log file name by appending a dash "-" followed by a suffix" ? Also rename the file create_signed_file_name() - again, the "signed" is confusing. For example "create_augmented_file_name", or "create_name_with_dash_and_suffix" or something.
+ It's valid if it's a valid system name, is less than + MAX_CONNECTION_NAME.
"It is valid if it is a valid system name of length less than MAX_CONNECTION_NAME".
=== modified file 'client/mysqltest.cc' --- client/mysqltest.cc 2012-08-31 21:54:54 +0000 +++ client/mysqltest.cc 2012-10-01 09:55:23 +0000
+ if (buff) + my_free(start);
Better use my_free(buff) here.
=== modified file 'sql/rpl_mi.h' --- sql/rpl_mi.h 2012-03-27 23:04:46 +0000 +++ sql/rpl_mi.h 2012-10-01 09:55:23 +0000
+ char master_log_name[FN_REFLEN+6]; /* Place for multi-*/
"place for" -> "room for".
=== modified file 'sql/mysqld.cc' --- sql/mysqld.cc 2012-09-22 14:11:40 +0000 +++ sql/mysqld.cc 2012-10-01 09:55:23 +0000
+ /* + We must have LOCK_open before LOCK_global_system_variables because + LOCK_open is hold while sql_plugin.c::intern_sys_var_ptr() is called.
"is hold" -> "is held"
=== modified file 'sql/item_create.cc' --- sql/item_create.cc 2012-01-13 14:50:02 +0000 +++ sql/item_create.cc 2012-10-01 09:55:23 +0000 @@ -4393,27 +4393,36 @@ Create_func_master_pos_wait::create_nati if (item_list != NULL) arg_count= item_list->elements;
+ if (arg_count < 2 || arg_count >4)
Missing space before "4"
Hi!
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Kristian> Michael Widenius <monty@askmonty.org> writes:
I have now got the first version of multi source patch ported to MariadB.
Kristian> As requested, I did a quick read through of the patch. I took the patch from Kristian> your 10.0-mdev253 tree, this revision: Kristian> revid:monty@askmonty.org-20120930233044-e0m6u6t9q3jeqx03 Kristian> Below are my comments. Overall, it was nice to see the relatively modest Kristian> changes needed to get this working. Kristian> - Kristian. Kristian> -----------------------------------------------------------------------
=== modified file 'sql/sys_vars.cc' --- sql/sys_vars.cc 2012-09-22 12:30:24 +0000 +++ sql/sys_vars.cc 2012-10-01 09:55:23 +0000
+static Sys_var_multi_source_ulong +Sys_slave_skip_counter("sql_slave_skip_counter", + "Skip the next N events from the master log", + SESSION_VAR(slave_skip_counter), + NO_CMD_LINE, + offsetof(Master_info, rli.slave_skip_counter),
Kristian> This use of offsetof() on a C++ class (non-POD) is incorrect. Even if it works Kristian> for you with your current version of GCC and current compiler options, it is Kristian> likely to break at some point on some future compiler. And it will be hell to Kristian> track down. Kristian> Look, I understand that you do not like some of the ways that the C++ language Kristian> is defined, like here that offsetof() is not allowed on non-POD data Kristian> structures. But we have to live with it. Surely, it is better to have a Kristian> standard language that compiler writers and compiler users agree on how works Kristian> - even if that language is less convenient than what we would like. Kristian> So let us please change this to be correct C++. I can change it for you if you Kristian> want. Yes, please suggest/push a fix. I tried but couldn't come up with anything efficient, short to write or that didn't require a function for every offset. I think this is safe as long as Sys_vars doesn't have any virtual functions. According to gcc manual it looks like future gcc version will support offsetof() on base C++ classes.
=== modified file 'sql/slave.cc' --- sql/slave.cc 2012-09-13 12:31:29 +0000 +++ sql/slave.cc 2012-10-01 09:55:23 +0000
+bool show_all_master_info(THD* thd) +{
<cut>
+ for (i= 0; i < elements; i++) + { + tmp[i]= (Master_info *) my_hash_element(&master_info_index-> + master_info_hash, i); + }
Kristian> Isn't there a race condition here? What happens if the number of elements Kristian> changes between looking up number of elements and the loop? It can't do that because it's protected by LOCK_active_mi. (It's only under this lock the hash can change). Kristian> If there is a lock protecting this, please add a safe_mutex_assert_owner() Kristian> and/or a comment. I have added safe_mutex and comment Kristian> The call of show_all_master_info() is like this: Kristian> mysql_mutex_lock(&LOCK_active_mi); Kristian> if (lex->verbose) Kristian> res= show_all_master_info(thd); Kristian> Does LOCK_active_mi protect all changes to multi-master state? If so, I think Kristian> it should be renamed - "active_mi" is rather confusing in this case. Yes, but it has to be done in a separate patch as some point.
=== modified file 'sql/sql_class.h' --- sql/sql_class.h 2012-09-22 14:11:40 +0000 +++ sql/sql_class.h 2012-10-01 09:55:23 +0000
+ /** + Place holders to store Multi-source variables in sys_var.cc during + update and show of variables. + */ + ulong slave_skip_counter; + ulong max_relay_log_size;
+ /* Names. These will be allocated in buffers in thd */ + LEX_STRING default_master_connection; +
+ LEX_STRING connection_name; /* If slave */ + char default_master_connection_buff[MAX_CONNECTION_NAME+1];
Kristian> Is there not some other way to do this? Kristian> THD seems to have become some kind of garbage dump for all kinds of Kristian> special-purpose data. This is not good, it pollutes the CPU caches for all Kristian> threads, and generally makes everything a mess :-/ There was no easier way I could to fix the issue with the variables. (because of how set_var.cc is constructed) default_master_connection is needed in THD, as this is specific for this THD. connection_name is just a pointer to make a lot of code easier and safer. Kristian> I would like to see us move towards having less things in THD that are not Kristian> relevant to all threads, not more. Kristian> Could we add a single void * somewhere in THD for special-purpose threads? Kristian> And then replication threads could use that for their special-purpose stuff. Kristian> Or maybe sub-class THD. Kristian> Then later we could start moving other stuff out of THD that is not relevant Kristian> to all threads... I am happy to look at patch from you for the above ;)
=== modified file 'sql/rpl_rli.h' --- sql/rpl_rli.h 2012-08-27 16:13:17 +0000 +++ sql/rpl_rli.h 2012-10-01 09:55:23 +0000
+ volatile ulong slave_skip_counter; /* Must be ulong */
Kristian> Such comments are extremely unhelpful. Instead explain why it must be ulong. Fixed.
=== modified file 'sql/item_create.cc' --- sql/item_create.cc 2012-01-13 14:50:02 +0000 +++ sql/item_create.cc 2012-10-01 09:55:23 +0000 @@ -4393,27 +4393,36 @@ Create_func_master_pos_wait::create_nati
+ thd->lex->safe_to_cache_query= 0;
case 3:
thd-> lex->safe_to_cache_query= 0;
+ case 4:
+ thd->lex->safe_to_cache_query= 0;
Kristian> Isn't it redundant to set safe_to_cache_query twice here? That come from me reorganziation the code. I forgot to remove the above that was needed in the old code.
=== modified file 'sql/rpl_mi.cc' --- sql/rpl_mi.cc 2012-03-27 23:04:46 +0000 +++ sql/rpl_mi.cc 2012-10-01 09:55:23 +0000
+ /* Store connection name and lower case connection name */ + connection_name.length= cmp_connection_name.length= + connection_name_arg->length; + if ((connection_name.str= (char*) my_malloc(connection_name_arg->length*2+2, + MYF(MY_WME)))) + { + cmp_connection_name.str= (connection_name.str + + connection_name_arg->length+1); + strmake(connection_name.str, connection_name_arg->str, + connection_name.length); + memcpy(cmp_connection_name.str, connection_name_arg->str, + connection_name.length+1); + my_casedn_str(system_charset_info, cmp_connection_name.str); + }
Kristian> Is it really correct to ignore out-of-memory error here, and just do nothing? Kristian> Ah, I think this is handled by checking Master_info::error() in other places? Kristian> If so, then please add a comment here explaining this. Yes, it's handled. Comment added.
+/** + Create a log file with a signed suffix.
Kristian> I do not know what is meant with the term "signed suffix". This is not Kristian> something that people will understand. 'signed' was a typo left from the old code. However the comment for the function should explain this fully. Kristian> Does it mean "Create a log file name by appending a dash "-" followed by a Kristian> suffix" ? Kristian> Also rename the file create_signed_file_name() - again, the "signed" is Kristian> confusing. For example "create_augmented_file_name", or Kristian> "create_name_with_dash_and_suffix" or something. augmented is even more confusing for me... I am going with 'create_logfile_name_with_suffix'
+ It's valid if it's a valid system name, is less than + MAX_CONNECTION_NAME.
Kristian> "It is valid if it is a valid system name of length less than Kristian> MAX_CONNECTION_NAME". fixed.
=== modified file 'client/mysqltest.cc' --- client/mysqltest.cc 2012-08-31 21:54:54 +0000 +++ client/mysqltest.cc 2012-10-01 09:55:23 +0000
+ if (buff) + my_free(start);
Kristian> Better use my_free(buff) here. Doesn't work. (different variables)
=== modified file 'sql/rpl_mi.h' --- sql/rpl_mi.h 2012-03-27 23:04:46 +0000 +++ sql/rpl_mi.h 2012-10-01 09:55:23 +0000
+ char master_log_name[FN_REFLEN+6]; /* Place for multi-*/
Kristian> "place for" -> "room for". Fixed
=== modified file 'sql/mysqld.cc' --- sql/mysqld.cc 2012-09-22 14:11:40 +0000 +++ sql/mysqld.cc 2012-10-01 09:55:23 +0000
+ /* + We must have LOCK_open before LOCK_global_system_variables because + LOCK_open is hold while sql_plugin.c::intern_sys_var_ptr() is called.
Kristian> "is hold" -> "is held" Fixed
=== modified file 'sql/item_create.cc' --- sql/item_create.cc 2012-01-13 14:50:02 +0000 +++ sql/item_create.cc 2012-10-01 09:55:23 +0000 @@ -4393,27 +4393,36 @@ Create_func_master_pos_wait::create_nati if (item_list != NULL) arg_count= item_list->elements;
+ if (arg_count < 2 || arg_count >4)
Kristian> Missing space before "4" Fixed. Thanks! Regards, Monty
participants (2)
-
Kristian Nielsen
-
Michael Widenius