[Maria-developers] Code syntax: questions on pointers, etc.
Recent refactorings that replace C strings with LEX_CSTRING, which is no doubt a good thing, raise some questions:

1. Is it still guaranteed that Field::field_name.str is NULL-terminated?

2. It is still passed as a pointer to functions. Why is that?

The main feature of C++ references is that they cannot be NULL, so we get a segfault at the top of the stack (closer to the cause), not at the bottom of it. I see that pointers are now widely used and mostly assumed to be non-NULL (i.e. dereferenced without an assertion). But placing such an implicit contract on data is not evident and is bug-prone. IMHO it's much better to use references whenever possible (and when there is no need for the cosy NULL semantics). What do you think?

But for such lightweight structs as LEX_CSTRING it is even better to pass by value, so we could have the convenience of type casts.

3. LEX_CSTRING and LEX_STRING are now non-convertible. Why not make:

template <typename char_t>
struct st_mysql_lex_string
{
  char_t *str;
  size_t length;
};

typedef st_mysql_lex_string<char *> LEX_STRING;
typedef st_mysql_lex_string<const char *> LEX_CSTRING;
typedef st_mysql_lex_string<const unsigned char *> LEX_CUSTRING;

?

4. There are some duplicate types: MYSQL_LEX_STRING, MYSQL_CONST_LEX_STRING. Why?

--
All the best,
Aleksey Midenkov
@midenok
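To illustrate why question 1 matters, here is a minimal sketch of code that uses only the stored length and therefore works whether or not the string is terminated. The names `demo_cstring`, `format_name` and `demo_slice` are hypothetical and are not part of the MariaDB sources; the struct merely mirrors the `str`/`length` layout mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

/* Hypothetical stand-in with the same two members as LEX_CSTRING. */
struct demo_cstring
{
  const char *str;
  size_t length;
};

/* Formats by explicit length, so it never reads past str[length - 1]. */
static int format_name(char *dst, size_t dst_size, demo_cstring name)
{
  return snprintf(dst, dst_size, "%.*s", (int) name.length, name.str);
}

static bool demo_slice()
{
  const char *text= "field_name";
  /* A slice of a longer string: the byte after it is '_', not '\0'. */
  demo_cstring prefix= { text, 5 };
  char buf[32];
  format_name(buf, sizeof(buf), prefix);
  /* strlen() on the slice would report the whole tail, not the slice. */
  return strcmp(buf, "field") == 0 && strlen(prefix.str) != prefix.length;
}
```

Any caller that instead hands `prefix.str` to a function expecting a C string would silently see "field_name", which is the kind of dependency the question is asking about.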
Hi, Aleksey!

On Oct 20, Aleksey Midenkov wrote:
> Recent refactorings that replace C strings with LEX_CSTRING, which is no doubt a good thing, raise some questions:
>
> 1. Is it still guaranteed that Field::field_name.str is NULL-terminated?

I think it's still the case, yes. What code relies on it?

> 2. It is still passed as a pointer to functions. Why is that?
>
> The main feature of C++ references is that they cannot be NULL, so we get a segfault at the top of the stack (closer to the cause), not at the bottom of it. I see that pointers are now widely used and mostly assumed to be non-NULL (i.e. dereferenced without an assertion). But placing such an implicit contract on data is not evident and is bug-prone. IMHO it's much better to use references whenever possible (and when there is no need for the cosy NULL semantics). What do you think?

This is something that gets raised over and over. Some prefer C-style pointers over references, so that when you look at the function call:

  func(a,&b);

you can immediately see that `a` is passed by value and cannot be modified by `func`, while `b` can.

Others prefer C++ references.

I personally reside on the middle ground, where one uses pointers to pass "out" parameters, and const references for "in" parameters.
> But for such lightweight structs as LEX_CSTRING it is even better to pass by value, so we could have the convenience of type casts.

What do you mean by that?

> 3. LEX_CSTRING and LEX_STRING are now non-convertible. Why not make:
>
> template <typename char_t>
> struct st_mysql_lex_string
> {
>   char_t *str;
>   size_t length;
> };
>
> typedef st_mysql_lex_string<char *> LEX_STRING;
> typedef st_mysql_lex_string<const char *> LEX_CSTRING;
> typedef st_mysql_lex_string<const unsigned char *> LEX_CUSTRING;
>
> ?

What would that change?

> 4. There are some duplicate types: MYSQL_LEX_STRING, MYSQL_CONST_LEX_STRING. Why?

These are names used in the plugin API. They start with MYSQL_* to avoid possible name clashes with third-party code.

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org
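The "middle ground" convention described in this reply (const references for "in" parameters, pointers for "out" parameters) could look roughly like the sketch below. The names `demo_cstring`, `count_char` and `demo_count` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

/* Hypothetical stand-in with the same two members as LEX_CSTRING. */
struct demo_cstring
{
  const char *str;
  size_t length;
};

/*
  "In" parameter by const reference, "out" parameter by pointer.
  At the call site the '&' marks the argument that may be modified.
*/
static bool count_char(const demo_cstring &in, char c, size_t *out)
{
  assert(out != NULL);      /* the pointer contract must be checked by hand */
  size_t n= 0;
  for (size_t i= 0; i < in.length; i++)
    if (in.str[i] == c)
      n++;
  *out= n;
  return n > 0;
}

static size_t demo_count()
{
  const char *text= "banana";
  demo_cstring name= { text, strlen(text) };
  size_t n= 0;
  count_char(name, 'a', &n);   /* name is read-only here, n is written */
  return n;
}
```

The reference parameter needs no NULL check, while the pointer parameter carries exactly the implicit non-NULL contract raised in question 2, here made explicit with an assertion.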
On Sun, Oct 22, 2017 at 10:27 PM, Sergei Golubchik <serg@mariadb.org> wrote:
> Hi, Aleksey!
>
> On Oct 20, Aleksey Midenkov wrote:
>> Recent refactorings that replace C strings with LEX_CSTRING, which is no doubt a good thing, raise some questions:
>>
>> 1. Is it still guaranteed that Field::field_name.str is NULL-terminated?
>
> I think it's still the case, yes. What code relies on it?

Sorry, I don't remember. Maybe none. That was the general question.

>> 2. It is still passed as a pointer to functions. Why is that?
>>
>> The main feature of C++ references is that they cannot be NULL, so we get a segfault at the top of the stack (closer to the cause), not at the bottom of it. I see that pointers are now widely used and mostly assumed to be non-NULL (i.e. dereferenced without an assertion). But placing such an implicit contract on data is not evident and is bug-prone. IMHO it's much better to use references whenever possible (and when there is no need for the cosy NULL semantics). What do you think?
>
> This is something that gets raised over and over. Some prefer C-style pointers over references, so that when you look at the function call:
>
>   func(a,&b);
>
> you can immediately see that `a` is passed by value and cannot be modified by `func`, while `b` can.
>
> Others prefer C++ references.

But are the reasons mentioned above not enough to resolve the dilemma once and for all?

> I personally reside on the middle ground, where one uses pointers to pass "out" parameters, and const references for "in" parameters.
>
>> But for such lightweight structs as LEX_CSTRING it is even better to pass by value, so we could have the convenience of type casts.
>
> What do you mean by that?
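#include <cstring>

struct LEX_CSTRING
{
  const char *str;
  unsigned long length;
  LEX_CSTRING() {}
  /* Implicit conversion from a C string. */
  LEX_CSTRING(const char *c_str)
  {
    str= c_str;
    length= strlen(c_str);
  }
};

/* Taking the argument by value lets the conversions below apply. */
void func(LEX_CSTRING arg)
{ }

class MyCunningString
{
public:
  operator LEX_CSTRING() { return LEX_CSTRING(); }
};

int main()
{
  MyCunningString str;
  func(str);                   /* via operator LEX_CSTRING() */
  const char *c_str= "abc";    /* initialized so that strlen() is safe */
  func(c_str);                 /* via LEX_CSTRING(const char *) */
  return 0;
}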
>> 3. LEX_CSTRING and LEX_STRING are now non-convertible. Why not make:
>>
>> template <typename char_t>
>> struct st_mysql_lex_string
>> {
>>   char_t *str;
>>   size_t length;
>> };
>>
>> typedef st_mysql_lex_string<char *> LEX_STRING;
>> typedef st_mysql_lex_string<const char *> LEX_CSTRING;
>> typedef st_mysql_lex_string<const unsigned char *> LEX_CUSTRING;
>>
>> ?
>
> What would that change?
#include <cstring>

template <typename char_t>
struct st_mysql_lex_string
{
  char_t *str;
  size_t length;
  st_mysql_lex_string<char_t>(){}
  template <typename charX_t>
  st_mysql_lex_string<char_t>(st_mysql_lex_string<charX_t> &from) :
    str ((char_t *) from.str),
    length (from.length)
  {}
};

typedef st_mysql_lex_string<char *> LEX_STRING;
typedef st_mysql_lex_string<const char *> LEX_CSTRING;
typedef st_mysql_lex_string<const unsigned char *> LEX_CUSTRING;

void func(LEX_CSTRING arg)
{ }

int main()
{
  LEX_STRING str;
  func(str);
  LEX_CUSTRING ustr;
  func(ustr);
  return 0;
}
>> 4. There are some duplicate types: MYSQL_LEX_STRING, MYSQL_CONST_LEX_STRING. Why?
>
> These are names used in the plugin API. They start with MYSQL_* to avoid possible name clashes with third-party code.
>
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@mariadb.org

--
All the best,
Aleksey Midenkov
@midenok
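One detail of the template proposal is worth spelling out: with the member declared as `char_t *str`, instantiating with `char *` makes `str` a `char **`, and the C-style cast in the converting constructor would also accept conversions that drop `const`. A possible variant, sketched here under hypothetical names (`demo_lex_string`, `DEMO_LEX_STRING`, `DEMO_LEX_CSTRING`), parameterizes on the element type and converts without a cast, so only conversions the language already allows implicitly will compile.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical variant: parameterized on the element type, so str is char_t*. */
template <typename char_t>
struct demo_lex_string
{
  char_t *str;
  size_t length;

  demo_lex_string() : str(NULL), length(0) {}
  demo_lex_string(char_t *s, size_t len) : str(s), length(len) {}

  /*
    Converting constructor without a cast: it compiles only when
    charX_t* converts implicitly to char_t*, e.g. char* to const char*.
  */
  template <typename charX_t>
  demo_lex_string(const demo_lex_string<charX_t> &from)
    : str(from.str), length(from.length) {}
};

typedef demo_lex_string<char> DEMO_LEX_STRING;
typedef demo_lex_string<const char> DEMO_LEX_CSTRING;

/* By-value parameter, so the conversion happens at the call. */
static size_t demo_len(DEMO_LEX_CSTRING arg) { return arg.length; }

static size_t demo_convert()
{
  char buf[]= "abcd";
  DEMO_LEX_STRING s(buf, 4);
  return demo_len(s);   /* implicit DEMO_LEX_STRING to DEMO_LEX_CSTRING */
}
```

Passing a `DEMO_LEX_CSTRING` where a `DEMO_LEX_STRING` is expected would fail to compile in this variant, which may or may not be the behaviour wanted for the real types.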
Hi, Aleksey!

On Oct 23, Aleksey Midenkov wrote:
> On Sun, Oct 22, 2017 at 10:27 PM, Sergei Golubchik <serg@mariadb.org> wrote:
>>> 2. It is still passed as a pointer to functions. Why is that?
>>>
>>> The main feature of C++ references is that they cannot be NULL, so we get a segfault at the top of the stack (closer to the cause), not at the bottom of it. I see that pointers are now widely used and mostly assumed to be non-NULL (i.e. dereferenced without an assertion). But placing such an implicit contract on data is not evident and is bug-prone. IMHO it's much better to use references whenever possible (and when there is no need for the cosy NULL semantics). What do you think?
>>
>> This is something that gets raised over and over. Some prefer C-style pointers over references, so that when you look at the function call:
>>
>>   func(a,&b);
>>
>> you can immediately see that `a` is passed by value and cannot be modified by `func`, while `b` can.
>>
>> Others prefer C++ references.
>
> But are the reasons mentioned above not enough to resolve the dilemma once and for all?

As we're talking about it (and were, many times) - apparently not :)

>> I personally reside on the middle ground, where one uses pointers to pass "out" parameters, and const references for "in" parameters.
>>
>>> But for such lightweight structs as LEX_CSTRING it is even better to pass by value, so we could have the convenience of type casts.
>>
>> What do you mean by that?
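> #include <cstring>
>
> struct LEX_CSTRING
> {
>   const char *str;
>   unsigned long length;
>   LEX_CSTRING() {}
>   /* Implicit conversion from a C string. */
>   LEX_CSTRING(const char *c_str)
>   {
>     str= c_str;
>     length= strlen(c_str);
>   }
> };
>
> /* Taking the argument by value lets the conversions below apply. */
> void func(LEX_CSTRING arg)
> { }
>
> class MyCunningString
> {
> public:
>   operator LEX_CSTRING() { return LEX_CSTRING(); }
> };
>
> int main()
> {
>   MyCunningString str;
>   func(str);                   /* via operator LEX_CSTRING() */
>   const char *c_str= "abc";    /* initialized so that strlen() is safe */
>   func(c_str);                 /* via LEX_CSTRING(const char *) */
>   return 0;
> }

Okay. Looks good.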
>>> 3. LEX_CSTRING and LEX_STRING are now non-convertible. Why not make:
>>>
>>> template <typename char_t>
>>> struct st_mysql_lex_string
>>> {
>>>   char_t *str;
>>>   size_t length;
>>> };
>>>
>>> typedef st_mysql_lex_string<char *> LEX_STRING;
>>> typedef st_mysql_lex_string<const char *> LEX_CSTRING;
>>> typedef st_mysql_lex_string<const unsigned char *> LEX_CUSTRING;
>>>
>>> ?
>>
>> What would that change?
> #include <cstring>
>
> template <typename char_t>
> struct st_mysql_lex_string
> {
>   char_t *str;
>   size_t length;
>   st_mysql_lex_string<char_t>(){}
>   template <typename charX_t>
>   st_mysql_lex_string<char_t>(st_mysql_lex_string<charX_t> &from) :
>     str ((char_t *) from.str),
>     length (from.length)
>   {}
> };
>
> typedef st_mysql_lex_string<char *> LEX_STRING;
> typedef st_mysql_lex_string<const char *> LEX_CSTRING;
> typedef st_mysql_lex_string<const unsigned char *> LEX_CUSTRING;
>
> void func(LEX_CSTRING arg)
> { }
>
> int main()
> {
>   LEX_STRING str;
>   func(str);
>   LEX_CUSTRING ustr;
>   func(ustr);
>   return 0;
> }
Right, so I thought. So template<> doesn't change anything. Constructors and passing by value do.

Anyway, type conversion is nice. It's rather annoying to cast between LEX_CSTRING and LEX_STRING, for example.

Templates - not sure we could use them here, because these types sometimes need to work in pure C too. LEX_STRING does, at least. We could do, say

struct LEX_STRING
{
  char *str;
  size_t length;
#ifdef __cplusplus
  LEX_STRING(...) { ... }
#endif
};

to have some extra C++ convenience. But with templates you'd need a completely separate LEX_STRING definition in C and C++.

Regards,
Sergei
Chief Architect MariaDB
and security@mariadb.org
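As a rough illustration of the `#ifdef __cplusplus` idea, the sketch below adds C++-only constructors to two plain structs. The constructor signatures are an assumption, since the message leaves them elided, and the names `demo_lex_string`, `demo_lex_cstring`, `demo_length` and `demo_usage` are hypothetical stand-ins rather than the real definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

/* Hypothetical stand-ins; a C compiler would see only the two data members. */
struct demo_lex_string
{
  char *str;
  size_t length;
#ifdef __cplusplus
  demo_lex_string() : str(NULL), length(0) {}
  demo_lex_string(char *s, size_t len) : str(s), length(len) {}
#endif
};

struct demo_lex_cstring
{
  const char *str;
  size_t length;
#ifdef __cplusplus
  demo_lex_cstring() : str(NULL), length(0) {}
  demo_lex_cstring(const char *s, size_t len) : str(s), length(len) {}
  /* Implicit conversion that only adds const. */
  demo_lex_cstring(const demo_lex_string &from)
    : str(from.str), length(from.length) {}
#endif
};

/* Takes the const flavour by value, so a demo_lex_string converts at the call. */
static size_t demo_length(demo_lex_cstring arg) { return arg.length; }

static size_t demo_usage()
{
  char buf[]= "abc";
  demo_lex_string s(buf, strlen(buf));
  return demo_length(s);   /* demo_lex_string to demo_lex_cstring, no cast */
}
```

One trade-off to keep in mind: once constructors are declared, the C++ view of the struct is no longer an aggregate, so existing brace initializers only keep working where a matching constructor exists and the language standard in use allows list-initialization through it.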