[Maria-developers] MDEV-13118 Wrong results with LOWER and UPPER and subquery
Hello Sergei, I sent a broken patch last time. Sending the patch again. Thanks! On 06/20/2017 12:45 PM, Alexander Barkov wrote:
Hello Sergei,
Please review a patch for MDEV-13118.
Thanks!
Hi, Alexander! On Jul 22, Alexander Barkov wrote:
On 06/20/2017 12:45 PM, Alexander Barkov wrote:
Hello Sergei,
Please review a patch for MDEV-13118.
Thanks!
commit dc7352fa2bc21fd9b66b12ec33cdacf88b478157 Author: Alexander Barkov
Date: Tue Jun 20 12:44:36 2017 +0400 MDEV-13118 Wrong results with LOWER and UPPER and subquery
This problem is similar to MDEV-10306.
1. Fixing Item_str_conv::val_str(String *str) to return the result in "str", and to use tmp_value only as a temporary buffer for args[0]->val_str(). The new code version now guarantees that the result is always returned in "str". The trick with copy_if_not_alloced() is not used any more.
2. Removing the optimization that *some* character sets used in casedn() and caseup(), which allowed (and required) to change the case in-place, overwriting the string passed as the "src" argument. Now all CHARSET_INFO's work in the same way: non of them change the source string in-place, all of them now convert case from the source string to the destination string, leaving the source string untouched.
I don't understand why it was needed. If the goal is to return the result in 'str' and never in 'tmp_value', it could've been achieved without doing 1. or 2., something like diff --git a/sql/item_strfunc.cc b/sql/item_strfunc.cc index 89e55234482..f1837a4477b 100644 --- a/sql/item_strfunc.cc +++ b/sql/item_strfunc.cc @@ -1560,30 +1560,28 @@ String *Item_str_conv::val_str(String *str) { DBUG_ASSERT(fixed == 1); String *res; - if (!(res=args[0]->val_str(str))) - { - null_value=1; /* purecov: inspected */ - return 0; /* purecov: inspected */ - } - null_value=0; if (multiply == 1) { uint len; - res= copy_if_not_alloced(&tmp_value, res, res->length()); + if ((null_value= !(res= args[0]->val_str(str)))) + return 0; + res= copy_if_not_alloced(str, res, res->length()); len= converter(collation.collation, (char*) res->ptr(), res->length(), (char*) res->ptr(), res->length()); DBUG_ASSERT(len <= res->length()); res->length(len); } else { + if ((null_value= !(res= args[0]->val_str(&tmp_value)))) + return 0; uint len= res->length() * multiply; - tmp_value.alloc(len); - tmp_value.set_charset(collation.collation); + str->alloc(len); + str->set_charset(collation.collation); len= converter(collation.collation, (char*) res->ptr(), res->length(), - (char*) tmp_value.ptr(), len); - tmp_value.length(len); - res= &tmp_value; + (char*) str->ptr(), len); + str->length(len); + res= str; } return res; } Regards, Sergei Chief Architect MariaDB and security@mariadb.org
Hello Sergei, Thanks for your reviews. Please find my comments below. On 10/24/2017 01:09 PM, Sergei Golubchik wrote:
Hi, Alexander!
On Jul 22, Alexander Barkov wrote:
On 06/20/2017 12:45 PM, Alexander Barkov wrote:
Hello Sergei,
Please review a patch for MDEV-13118.
Thanks!
commit dc7352fa2bc21fd9b66b12ec33cdacf88b478157 Author: Alexander Barkov
Date: Tue Jun 20 12:44:36 2017 +0400 MDEV-13118 Wrong results with LOWER and UPPER and subquery
This problem is similar to MDEV-10306.
1. Fixing Item_str_conv::val_str(String *str) to return the result in "str", and to use tmp_value only as a temporary buffer for args[0]->val_str(). The new code version now guarantees that the result is always returned in "str". The trick with copy_if_not_alloced() is not used any more.
2. Removing the optimization that *some* character sets used in casedn() and caseup(), which allowed (and required) to change the case in-place, overwriting the string passed as the "src" argument. Now all CHARSET_INFO's work in the same way: non of them change the source string in-place, all of them now convert case from the source string to the destination string, leaving the source string untouched.
I don't understand why it was needed. If the goal is to return the result in 'str' and never in 'tmp_value', it could've been achieved without doing 1. or 2., something like
There was a symmetry in the charset library in caseup() and casedn() implementations for various charsets: - charsets whose case conversion does not change octet length implemented "inplace" conversion. - charsets whose case conversion change octet length implemented "copying" conversion. That was done specifically for Item_str_conv. Now with Item_str_conv returning the value in "str" anyway: - This optimization is not needed any more. There is no sense to do case conversion inplace and then immediately copy the modified string to another String (or the other way around: copy to a new String and then convert inplace). - It's good to make the charset library symmetric for all charsets
diff --git a/sql/item_strfunc.cc b/sql/item_strfunc.cc index 89e55234482..f1837a4477b 100644 --- a/sql/item_strfunc.cc +++ b/sql/item_strfunc.cc @@ -1560,30 +1560,28 @@ String *Item_str_conv::val_str(String *str) { DBUG_ASSERT(fixed == 1); String *res; - if (!(res=args[0]->val_str(str))) - { - null_value=1; /* purecov: inspected */ - return 0; /* purecov: inspected */ - } - null_value=0; if (multiply == 1) { uint len; - res= copy_if_not_alloced(&tmp_value, res, res->length()); + if ((null_value= !(res= args[0]->val_str(str)))) + return 0; + res= copy_if_not_alloced(str, res, res->length()); len= converter(collation.collation, (char*) res->ptr(), res->length(), (char*) res->ptr(), res->length()); DBUG_ASSERT(len <= res->length()); res->length(len); } else { + if ((null_value= !(res= args[0]->val_str(&tmp_value)))) + return 0; uint len= res->length() * multiply; - tmp_value.alloc(len); - tmp_value.set_charset(collation.collation); + str->alloc(len); + str->set_charset(collation.collation); len= converter(collation.collation, (char*) res->ptr(), res->length(), - (char*) tmp_value.ptr(), len); - tmp_value.length(len); - res= &tmp_value; + (char*) str->ptr(), len); + str->length(len); + res= str; } return res; }
Regards, Sergei Chief Architect MariaDB and security@mariadb.org
Hi, Alexander! On Nov 02, Alexander Barkov wrote:
commit dc7352fa2bc21fd9b66b12ec33cdacf88b478157 Author: Alexander Barkov
Date: Tue Jun 20 12:44:36 2017 +0400 MDEV-13118 Wrong results with LOWER and UPPER and subquery
This problem is similar to MDEV-10306.
1. Fixing Item_str_conv::val_str(String *str) to return the result in "str", 2. Removing the optimization that *some* character sets used in casedn() and caseup(), which allowed (and required) to change the case in-place,
I don't understand why it was needed. If the goal is to return the result in 'str' and never in 'tmp_value', it could've been achieved without doing 1. or 2., something like
There was a symmetry in the charset library in caseup() and casedn() implementations for various charsets: That was done specifically for Item_str_conv. Now with Item_str_conv returning the value in "str" anyway:
- This optimization is not needed any more. There is no sense to do case conversion inplace and then immediately copy the modified string to another String (or the other way around: copy to a new String and then convert inplace).
- It's good to make the charset library symmetric for all charsets
I see. Could you split this commit in two, then? Should be as easy as using 'git citool' to move all strings/* and include/* changes into a separate changeset with the comment like Simplify caseup() and casedn() in charsets After the MDEV-13118 fix there's no code in the server that wants caseup/casedn to change the argument in place for simple charsets. Let's remove this logic and always return the result in a new string for all charsets, both simple and complex. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
participants (2)
-
Alexander Barkov
-
Sergei Golubchik