Guys, i think it's important to explain that we use RFC 2045 maybe at DOCS and maybe at SOURCE too just to make things more clear, php users and others users that only know about base64_encode/decode, don't know about many RFCs for each case... 2013/9/23 Roberto Spadim <roberto@spadim.com.br>
Just one information for everybody it's a veeeeery verrrrry old bug at mysql reported by me :)
http://bugs.mysql.com/bug.php?id=18861 Add base64_encode, base64_decode functions 6 Apr 2006 18:00
thanks mariadb team! =) 7 years old bug resolved =)
2013/9/23 Roberto Spadim <roberto@spadim.com.br>
thanks alexander! :D i will read and test it as soon as possible :)
2013/9/23 Alexander Barkov <bar@mariadb.org>
Hi Roberto,
the patch has been pushed to the 10.0 tree:
lp:~maria-captains/maria/10.0
See here for more details about the 10.0 tree. https://launchpad.net/maria/**10.0 <https://launchpad.net/maria/10.0>
On 09/17/2013 07:42 PM, Roberto Spadim wrote:
nice =) after this patch commited, could report where i could get it? i will try to rewrite to base32 if possible :) and send at jira thanks!
2013/9/17 Alexander Barkov <bar@mariadb.org <mailto:bar@mariadb.org>>
Hello Roberto,
On 09/17/2013 06:38 PM, Roberto Spadim wrote:
hi guys! one question, as a mariadb user... base64 will be exposed to sql layer? could i use TO_BASE64() and FROM_BASE64() at mysql client? there's a MDEV in JIRA about it, if you could attach it to this patch could be nice: MDEV-4387 <https://mariadb.atlassian.__**net/browse/MDEV-4387
<https://mariadb.atlassian.**net/browse/MDEV-4387<https://mariadb.atlassian.net/browse/MDEV-4387>
and with time, implement base32 and base16: MDEV-4800 <https://mariadb.atlassian.__**net/browse/MDEV-4800
<https://mariadb.atlassian.**net/browse/MDEV-4800<https://mariadb.atlassian.net/browse/MDEV-4800>
base64 is very usefull, and many php users use it to "send/receive" files from database base32 from what i know is usefull only for OTP and other password apps base16 is a hexadecimal function HEX() and UNHEX()
Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level, so one case use for example things like this:
INSERT INTO t1 VALUES (FROM_BASE64('base64-string'))**__;
or
SELECT TO_BASE64(column) FROM t1;
There are no plans to implement base32.
thanks guys
2013/9/17 Alexander Barkov <bar@mariadb.org <mailto:bar@mariadb.org> <mailto:bar@mariadb.org
<mailto:bar@mariadb.org>>>
Hi Monty,
thanks for review.
I have addressed most of your suggestions. See the new version attached, and the detailed comments inline:
On 09/12/2013 04:32 PM, Michael Widenius wrote:
Hi!
Here is the review for the code that we should put into 10.0
First the base64:
=== modified file 'mysql-test/t/func_str.test' --- mysql-test/t/func_str.test 2013-05-07 11:05:09 +0000 +++ mysql-test/t/func_str.test 2013-08-28 13:14:24 +0000 @@ -1555,3 +1555,118 @@ drop table t1,t2; --echo # End of 5.5 tests --echo #
+ +--echo # +--echo # Start of 5.6 tests +--echo #
Shouldn't this be start of 10.0 tests ? (I know that this code is from MySQL 5.6, but still for us this is 10.0...)
This code is (almost) copy-and-paste from MySQL-5.6. I think it's a good idea when looking inside a test file to be able to see which tests are coming from MySQL-5.6, and which tests are coming from MariaDB-10.0.
I'd suggest to keep "Start of 5.6 tests" in this particular case, and also when merging the tests for the other merged MySQL-5.6 features.
=== modified file 'mysys/base64.c' --- mysys/base64.c 2011-06-30 15:46:53 +0000 +++ mysys/base64.c 2013-03-09 06:22:59 +0000 @@ -1,5 +1,4 @@ -/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun Microsystems, Inc. - Use is subject to license terms. +/* Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
Removed 'all rights reserved'. You can't have that for GPL code.
If there is any new code from us, please add a copyright message for the MariaDB foundation too!
Removed 'all rights reserved' and added "MariaDB", as there are some our own changes.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -25,6 +24,28 @@ static char base64_table[] = "ABCDEFGHIJ
"abcdefghijklmnopqrstuvwxyz" "0123456789+/";
+/** + * Maximum length base64_needed_encoded_length() + * can handle without overflow. + */ +int +base64_encode_max_arg_length(**____)
+{ +#if (SIZEOF_INT == 8) + /* + 6827690988321067803 -> 9223372036854775805 + 6827690988321067804 -> -9223372036854775807 + */
Please describe from where the following number comes from (so that anyone can recalculate and check the number if needed):
I added more comments where these numbers come from.
+ return 0x5EC0D4C77B03531BLL +#else + /* + 1589695686 -> 2147483646 <tel:2147483646> <tel:2147483646 <tel:2147483646>>
+ 1589695687 -> -2147483645 <tel:2147483645> <tel:2147483645 <tel:2147483645>>
+ */ + return 0x5EC0D4C6; +#endif +} +
int base64_needed_encoded_length(_**___int length_of_data) @@ -39,6 +60,21 @@ base64_needed_encoded_length(_**___int length_
}
+/** + * Maximum length base64_needed_decoded_length() + * can handle without overflow. + */ +int +base64_decode_max_arg_length(**____)
+{
The following also need a comment.
Also added comments.
+#if (SIZEOF_INT == 8) + return 0x2AAAAAAAAAAAAAAALL; +#else + return 0x2AAAAAAA; +#endif +} + +
<cut>
+/** + * Get the next character from a base64 encoded stream. + * Skip spaces, then scan the next base64 character or a pad character + * and collect bits into decoder->c. + * + * @param decoder Pointer to MY_BASE64_DECODER + * @return + * FALSE on success (a valid base64 encoding character found) + * TRUE on error (unexpected character or unexpected end-of-input found) + */
Add an empty line here.
Done.
It would also be good to have a good description of the base64 format in this file so that one can understand what the functions are supposed to do (or at least a link to the specification).
Some thing I figured out by asking bar on IRC:
- We never generate spaces - When decoding, we allow spaces anywhere. - The end of an 64 base decoded string is always padded with '=' so that the final length is dividable with 4.
I added a reference to http://en.wikipedia.org/wiki/_**___Base64<http://en.wikipedia.org/wiki/____Base64> <http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64>
<http://en.wikipedia.org/wiki/**__Base64<http://en.wikipedia.org/wiki/__Base64> <http://en.wikipedia.org/wiki/**Base64<http://en.wikipedia.org/wiki/Base64>
,
as well all added these your comments in proper places of the code.
+static inline my_bool +my_base64_decoder_getch(MY___**__BASE64_DECODER
*decoder)
Remove inline from the above; It's likely to make the code slower, not faster as this is a big function with lot of if's.
Removed "inline".
{ - char b[3]; - size_t i= 0; - char *dst_base= (char *)dst; - char const *src= src_base; - char *d= dst_base; - size_t j; + if (my_base64_decoder_skip_____** spaces(decoder))
+ return TRUE; /* End-of-input */
- while (i < len) + if (!my_base64_add(decoder)) /* Valid base64 character found */ { - unsigned c= 0; - size_t mark= 0; + if (decoder->mark) + { + /* If we have scanned '=' already, then only '=' is valid */ + DBUG_ASSERT(decoder->state == 3); + decoder->error= 1; + decoder->src--; + return TRUE; /* expected '=', but encoding character found */ + } + decoder->state++; + return FALSE; + }
If you want to have the funtion inline, then move the following to another not inline function; We don't need to generate code for this for every function call:
+ + /* Process error */ + switch (decoder->state) + { + case 0: + case 1: + decoder->src--; + return TRUE; /* base64 character expected */ + break; + + case 2: + case 3: + if (decoder->src[-1] == '=') + { + decoder->error= 0; /* Not an error - it's a pad character */ + decoder->mark++; + } + else + { + decoder->src--; + return TRUE; /* base64 character or '=' expected */ + } + break; + default: + DBUG_ASSERT(0); + return TRUE; /* Wrong state, should not happen */ + }
<cut>
Add to the comment for base64_decode that the 'dst' buffer has to be at least 2 character longer than what is needed to decode src_base.
I added this comment:
* Note: 'dst' must have sufficient space to store the decoded data. * Use base64_needed_decoded_length() to calculate the correct space size.
+int +base64_decode(const char *src_base, size_t len, + void *dst, const char **end_ptr, int flags) +{ + if (my_base64_decoder_getch(&____**decoder) || + my_base64_decoder_getch(&____**decoder) || + my_base64_decoder_getch(&____**decoder) || + my_base64_decoder_getch(&____**decoder))
+ break;
Generating the error handling with a switch for every of the above is wasteful. See previous comment.
Right. my_base64_decoder_getch() is not inline anymore.
+ /* Return error if there are more non-space characters */ + decoder.state= 0; + if (!my_base64_decoder_skip_____** spaces(&decoder))
+ decoder.error= 1; + if (end_ptr != NULL)
Note that base64_needed_decoded_length() used ceil() when it's not needed.
Removed.
<cut>
=== modified file 'sql/item_strfunc.cc' @@ -451,6 +452,101 @@ void Item_func_aes_decrypt::fix____**_length_a set_persist_maybe_null(1); }
+ +void Item_func_to_base64::fix_____** length_and_dec()
+{ + maybe_null= args[0]->maybe_null; + collation.set(default_charset(**____),
DERIVATION_COERCIBLE,
MY_REPERTOIRE_ASCII);
Maybe better to cast both arguments to (ulonglong) as the rest of the code is depending on this.
Also, why is base64_encode_max_arg_length() int instead of uint or even better size_t?
This is because base64_encode() and base64_decode() are used in the replication code using "int" as return value.
When exposing TO_BASE64() and FROM_BASE64() to the SQL level, I did not want to touch the replication code.
+ if (args[0]->max_length > (uint) base64_encode_max_arg_length()**____) + { + maybe_null= 1; + fix_char_length_ulonglong((___**_ulonglong) base64_encode_max_arg_length()**____); + } + else + { + int length= base64_needed_encoded_length((** ____int)
args[0]->max_length); + DBUG_ASSERT(length > 0);
Don't think assert is needed as the function gurantees it already.
+ fix_char_length_ulonglong((___**_ulonglong)
length - 1); + } +}
<cut>
+String *Item_func_from_base64::val___** __str(String
*str)
+{ + String *res= args[0]->val_str_ascii(str); + bool too_long= false; + int length; + const char *end_ptr; + + if (!res || + res->length() > (uint) base64_decode_max_arg_length() || + (too_long= + ((uint) (length= base64_needed_decoded_length((**____int) res->length())) > + current_thd->variables.max____**_allowed_packet)) ||
+ tmp_value.alloc((uint) length) || + (length= base64_decode(res->ptr(), (int) res->length(), + (char *) tmp_value.ptr(), &end_ptr, 0)) < 0 || + end_ptr < res->ptr() + res->length()) + {
Shouldn't we get a readable error or warning if the string contained wrong characters? Now it looks like we will just return null.
Something like 'Malformed base64 string. Error at position %d" would be nice.
Good idea. I have added a warning. Now it looks clear what went wrong if FROM_BASE64() returns NULL.
+ null_value= 1; // NULL input, too long input, OOM, or badly formed input + if (too_long) + { + push_warning_printf(current___**__thd, Sql_condition::WARN_LEVEL_____**WARN, + ER_WARN_ALLOWED_PACKET_____**OVERFLOWED, + ER(ER_WARN_ALLOWED_PACKET_____**OVERFLOWED), func_name(), + current_thd->variables.max____** _allowed_packet);
+ } + return 0; + } + tmp_value.length((uint) length); + null_value= 0; + return &tmp_value; +}
<cut>
<cut>
______________________________**___________________ Mailing list: https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
Post to : maria-developers@lists.__launc**hpad.net<http://launchpad.net> <mailto:maria-developers@**lists.launchpad.net<maria-developers@lists.launchpad.net>
<mailto:maria-developers@__lis**ts.launchpad.net<http://lists.launchpad.net> <mailto:maria-developers@**lists.launchpad.net<maria-developers@lists.launchpad.net>
Unsubscribe : https://launchpad.net/~maria-_**_developers<https://launchpad.net/~maria-__developers> <https://launchpad.net/~maria-**developers<https://launchpad.net/~maria-developers>
More help : https://help.launchpad.net/__**ListHelp<https://help.launchpad.net/__ListHelp>
<https://help.launchpad.net/**ListHelp<https://help.launchpad.net/ListHelp>
-- Roberto Spadim SPAEmpresarial
-- Roberto Spadim SPAEmpresarial
-- Roberto Spadim SPAEmpresarial
-- Roberto Spadim SPAEmpresarial
-- Roberto Spadim SPAEmpresarial