thanks alexander! :D i will read and test it as soon as possible :) 2013/9/23 Alexander Barkov <bar@mariadb.org>
Hi Roberto,
the patch has been pushed to the 10.0 tree:
lp:~maria-captains/maria/10.0
See here for more details about the 10.0 tree. https://launchpad.net/maria/**10.0 <https://launchpad.net/maria/10.0>
On 09/17/2013 07:42 PM, Roberto Spadim wrote:
nice =) after this patch commited, could report where i could get it? i will try to rewrite to base32 if possible :) and send at jira thanks!
2013/9/17 Alexander Barkov <bar@mariadb.org <mailto:bar@mariadb.org>>
Hello Roberto,
On 09/17/2013 06:38 PM, Roberto Spadim wrote:
hi guys! one question, as a mariadb user... base64 will be exposed to sql layer? could i use TO_BASE64() and FROM_BASE64() at mysql client? there's a MDEV in JIRA about it, if you could attach it to this patch could be nice: MDEV-4387 <https://mariadb.atlassian.__**net/browse/MDEV-4387
<https://mariadb.atlassian.**net/browse/MDEV-4387<https://mariadb.atlassian.net/browse/MDEV-4387>
and with time, implement base32 and base16: MDEV-4800 <https://mariadb.atlassian.__**net/browse/MDEV-4800
<https://mariadb.atlassian.**net/browse/MDEV-4800<https://mariadb.atlassian.net/browse/MDEV-4800>
base64 is very usefull, and many php users use it to "send/receive" files from database base32 from what i know is usefull only for OTP and other password apps base16 is a hexadecimal function HEX() and UNHEX()
Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level, so one case use for example things like this:
INSERT INTO t1 VALUES (FROM_BASE64('base64-string'))**__;
or
SELECT TO_BASE64(column) FROM t1;
There are no plans to implement base32.
thanks guys
2013/9/17 Alexander Barkov <bar@mariadb.org <mailto:bar@mariadb.org> <mailto:bar@mariadb.org
<mailto:bar@mariadb.org>>>
Hi Monty,
thanks for review.
I have addressed most of your suggestions. See the new version attached, and the detailed comments inline:
On 09/12/2013 04:32 PM, Michael Widenius wrote:
Hi!
Here is the review for the code that we should put into 10.0
First the base64:
=== modified file 'mysql-test/t/func_str.test' --- mysql-test/t/func_str.test 2013-05-07 11:05:09 +0000 +++ mysql-test/t/func_str.test 2013-08-28 13:14:24 +0000 @@ -1555,3 +1555,118 @@ drop table t1,t2; --echo # End of 5.5 tests --echo #
+ +--echo # +--echo # Start of 5.6 tests +--echo #
Shouldn't this be start of 10.0 tests ? (I know that this code is from MySQL 5.6, but still for us this is 10.0...)
This code is (almost) copy-and-paste from MySQL-5.6. I think it's a good idea when looking inside a test file to be able to see which tests are coming from MySQL-5.6, and which tests are coming from MariaDB-10.0.
I'd suggest to keep "Start of 5.6 tests" in this particular case, and also when merging the tests for the other merged MySQL-5.6 features.
=== modified file 'mysys/base64.c' --- mysys/base64.c 2011-06-30 15:46:53 +0000 +++ mysys/base64.c 2013-03-09 06:22:59 +0000 @@ -1,5 +1,4 @@ -/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun Microsystems, Inc. - Use is subject to license terms. +/* Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
Removed 'all rights reserved'. You can't have that for GPL code.
If there is any new code from us, please add a copyright message for the MariaDB foundation too!
Removed 'all rights reserved' and added "MariaDB", as there are some our own changes.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -25,6 +24,28 @@ static char base64_table[] = "ABCDEFGHIJ
"abcdefghijklmnopqrstuvwxyz" "0123456789+/";
+/** + * Maximum length base64_needed_encoded_length() + * can handle without overflow. + */ +int +base64_encode_max_arg_length(**____)
+{ +#if (SIZEOF_INT == 8) + /* + 6827690988321067803 -> 9223372036854775805 + 6827690988321067804 -> -9223372036854775807 + */
Please describe from where the following number comes from (so that anyone can recalculate and check the number if needed):
I added more comments where these numbers come from.
+ return 0x5EC0D4C77B03531BLL +#else + /* + 1589695686 -> 2147483646 <tel:2147483646> <tel:2147483646 <tel:2147483646>>
+ 1589695687 -> -2147483645 <tel:2147483645> <tel:2147483645 <tel:2147483645>>
+ */ + return 0x5EC0D4C6; +#endif +} +
int base64_needed_encoded_length(_**___int length_of_data) @@ -39,6 +60,21 @@ base64_needed_encoded_length(_**___int length_
}
+/** + * Maximum length base64_needed_decoded_length() + * can handle without overflow. + */ +int +base64_decode_max_arg_length(**____)
+{
The following also need a comment.
Also added comments.
+#if (SIZEOF_INT == 8) + return 0x2AAAAAAAAAAAAAAALL; +#else + return 0x2AAAAAAA; +#endif +} + +
<cut>
+/** + * Get the next character from a base64 encoded stream. + * Skip spaces, then scan the next base64 character or a pad character + * and collect bits into decoder->c. + * + * @param decoder Pointer to MY_BASE64_DECODER + * @return + * FALSE on success (a valid base64 encoding character found) + * TRUE on error (unexpected character or unexpected end-of-input found) + */
Add an empty line here.
Done.
It would also be good to have a good description of the base64 format in this file so that one can understand what the functions are supposed to do (or at least a link to the specification).
Some thing I figured out by asking bar on IRC:
- We never generate spaces - When decoding, we allow spaces anywhere. - The end of an 64 base decoded string is always padded with '=' so that the final length is dividable with 4.
I added a reference to http://en.wikipedia.org/wiki/_**___Base64<http://en.wikipedia.org/wiki/____Base64> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
<http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64> <http://en.wikipedia.org/wiki/**Base64<http://en.wikipedia.org/wiki/Base64>
,
as well all added these your comments in proper places of the code.
+static inline my_bool +my_base64_decoder_getch(MY___**__BASE64_DECODER
*decoder)
Remove inline from the above; It's likely to make the code slower, not faster as this is a big function with lot of if's.
Removed "inline".
{ - char b[3]; - size_t i= 0; - char *dst_base= (char *)dst; - char const *src= src_base; - char *d= dst_base; - size_t j; + if (my_base64_decoder_skip_____**spaces(decoder))
+ return TRUE; /* End-of-input */
- while (i < len) + if (!my_base64_add(decoder)) /* Valid base64 character found */ { - unsigned c= 0; - size_t mark= 0; + if (decoder->mark) + { + /* If we have scanned '=' already, then only '=' is valid */ + DBUG_ASSERT(decoder->state == 3); + decoder->error= 1; + decoder->src--; + return TRUE; /* expected '=', but encoding character found */ + } + decoder->state++; + return FALSE; + }
If you want to have the funtion inline, then move the following to another not inline function; We don't need to generate code for this for every function call:
+ + /* Process error */ + switch (decoder->state) + { + case 0: + case 1: + decoder->src--; + return TRUE; /* base64 character expected */ + break; + + case 2: + case 3: + if (decoder->src[-1] == '=') + { + decoder->error= 0; /* Not an error - it's a pad character */ + decoder->mark++; + } + else + { + decoder->src--; + return TRUE; /* base64 character or '=' expected */ + } + break; + default: + DBUG_ASSERT(0); + return TRUE; /* Wrong state, should not happen */ + }
<cut>
Add to the comment for base64_decode that the 'dst' buffer has to be at least 2 character longer than what is needed to decode src_base.
I added this comment:
* Note: 'dst' must have sufficient space to store the decoded data. * Use base64_needed_decoded_length() to calculate the correct space size.
+int +base64_decode(const char *src_base, size_t len, + void *dst, const char **end_ptr, int flags) +{ + if (my_base64_decoder_getch(&____**decoder) || + my_base64_decoder_getch(&____**decoder) || + my_base64_decoder_getch(&____**decoder) || + my_base64_decoder_getch(&____**decoder))
+ break;
Generating the error handling with a switch for every of the above is wasteful. See previous comment.
Right. my_base64_decoder_getch() is not inline anymore.
+ /* Return error if there are more non-space characters */ + decoder.state= 0; + if (!my_base64_decoder_skip_____** spaces(&decoder))
+ decoder.error= 1; + if (end_ptr != NULL)
Note that base64_needed_decoded_length() used ceil() when it's not needed.
Removed.
<cut>
=== modified file 'sql/item_strfunc.cc' @@ -451,6 +452,101 @@ void Item_func_aes_decrypt::fix____**_length_a set_persist_maybe_null(1); }
+ +void Item_func_to_base64::fix_____** length_and_dec()
+{ + maybe_null= args[0]->maybe_null; + collation.set(default_charset(**____),
DERIVATION_COERCIBLE,
MY_REPERTOIRE_ASCII);
Maybe better to cast both arguments to (ulonglong) as the rest of the code is depending on this.
Also, why is base64_encode_max_arg_length() int instead of uint or even better size_t?
This is because base64_encode() and base64_decode() are used in the replication code using "int" as return value.
When exposing TO_BASE64() and FROM_BASE64() to the SQL level, I did not want to touch the replication code.
+ if (args[0]->max_length > (uint) base64_encode_max_arg_length()**____) + { + maybe_null= 1; + fix_char_length_ulonglong((___**_ulonglong) base64_encode_max_arg_length()**____); + } + else + { + int length= base64_needed_encoded_length((** ____int)
args[0]->max_length); + DBUG_ASSERT(length > 0);
Don't think assert is needed as the function gurantees it already.
+ fix_char_length_ulonglong((___**_ulonglong)
length - 1); + } +}
<cut>
+String *Item_func_from_base64::val___**__str(String
*str)
+{ + String *res= args[0]->val_str_ascii(str); + bool too_long= false; + int length; + const char *end_ptr; + + if (!res || + res->length() > (uint) base64_decode_max_arg_length() || + (too_long= + ((uint) (length= base64_needed_decoded_length((**____int) res->length())) > + current_thd->variables.max____**_allowed_packet)) ||
+ tmp_value.alloc((uint) length) || + (length= base64_decode(res->ptr(), (int) res->length(), + (char *) tmp_value.ptr(), &end_ptr, 0)) < 0 || + end_ptr < res->ptr() + res->length()) + {
Shouldn't we get a readable error or warning if the string contained wrong characters? Now it looks like we will just return null.
Something like 'Malformed base64 string. Error at position %d" would be nice.
Good idea. I have added a warning. Now it looks clear what went wrong if FROM_BASE64() returns NULL.
+ null_value= 1; // NULL input, too long input, OOM, or badly formed input + if (too_long) + { + push_warning_printf(current___**__thd, Sql_condition::WARN_LEVEL_____**WARN, + ER_WARN_ALLOWED_PACKET_____**OVERFLOWED, + ER(ER_WARN_ALLOWED_PACKET_____**OVERFLOWED), func_name(), + current_thd->variables.max____**_allowed_packet);
+ } + return 0; + } + tmp_value.length((uint) length); + null_value= 0; + return &tmp_value; +}
<cut>
<cut>
______________________________**___________________ Mailing list: https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
Post to : maria-developers@lists.__launc**hpad.net<http://launchpad.net> <mailto:maria-developers@**lists.launchpad.net<maria-developers@lists.launchpad.net>
<mailto:maria-developers@__lis**ts.launchpad.net<http://lists.launchpad.net> <mailto:maria-developers@**lists.launchpad.net<maria-developers@lists.launchpad.net>
Unsubscribe : https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
More help : https://help.launchpad.net/__**ListHelp<https://help.launchpad.net/__ListHelp>
<https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp>
-- Roberto Spadim SPAEmpresarial
-- Roberto Spadim SPAEmpresarial
-- Roberto Spadim SPAEmpresarial