2013/9/23 Roberto Spadim <roberto@spadim.com.br>

Just one information for everybody it's a veeeeery verrrrry old bug at mysql reported by me :)

http://bugs.mysql.com/bug.php?id=18861

Add base64_encode, base64_decode functions
6 Apr 2006 18:00

thanks mariadb team! =) 7 years old bug resolved =)

2013/9/23 Roberto Spadim <roberto@spadim.com.br>

thanks alexander! :D i will read and test it as soon as possible :)

2013/9/23 Alexander Barkov <bar@mariadb.org>

Hi Roberto,

the patch has been pushed to the 10.0 tree:

lp:~maria-captains/maria/10.0

See here for more details about the 10.0 tree.
https://launchpad.net/maria/10.0

On 09/17/2013 07:42 PM, Roberto Spadim wrote:

nice =)
after this patch commited, could report where i could get it? i will try
to rewrite to base32 if possible :) and send at jira
thanks!

2013/9/17 Alexander Barkov <bar@mariadb.org <mailto:bar@mariadb.org>>

Hello Roberto,

On 09/17/2013 06:38 PM, Roberto Spadim wrote:

hi guys!
one question, as a mariadb user...
base64 will be exposed to sql layer? could i use TO_BASE64() and
FROM_BASE64() at mysql client?
there's a MDEV in JIRA about it, if you could attach it to this
patch
could be nice: MDEV-4387
<https://mariadb.atlassian.__net/browse/MDEV-4387

<https://mariadb.atlassian.net/browse/MDEV-4387>>

and with time, implement base32 and base16: MDEV-4800
<https://mariadb.atlassian.__net/browse/MDEV-4800

<https://mariadb.atlassian.net/browse/MDEV-4800>>

base64 is very usefull, and many php users use it to "send/receive"
files from database
base32 from what i know is usefull only for OTP and other
password apps
base16 is a hexadecimal function HEX() and UNHEX()

Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level,
so one case use for example things like this:

INSERT INTO t1 VALUES (FROM_BASE64('base64-string'))__;

or

SELECT TO_BASE64(column) FROM t1;

There are no plans to implement base32.

thanks guys

2013/9/17 Alexander Barkov <bar@mariadb.org
<mailto:bar@mariadb.org> <mailto:bar@mariadb.org

<mailto:bar@mariadb.org>>>

Hi Monty,

thanks for review.

I have addressed most of your suggestions. See the new
version attached,
and the detailed comments inline:

On 09/12/2013 04:32 PM, Michael Widenius wrote:

Hi!

Here is the review for the code that we should put into
10.0

First the base64:

=== modified file 'mysql-test/t/func_str.test'
--- mysql-test/t/func_str.test 2013-05-07 11:05:09
+0000
+++ mysql-test/t/func_str.test 2013-08-28 13:14:24
+0000
@@ -1555,3 +1555,118 @@ drop table t1,t2;
--echo # End of 5.5 tests
--echo #

+
+--echo #
+--echo # Start of 5.6 tests
+--echo #

Shouldn't this be start of 10.0 tests ?
(I know that this code is from MySQL 5.6, but still for
us this
is 10.0...)

This code is (almost) copy-and-paste from MySQL-5.6.
I think it's a good idea when looking inside a test file
to be able to see which tests are coming from MySQL-5.6,
and which tests are coming from MariaDB-10.0.

I'd suggest to keep "Start of 5.6 tests" in this particular
case,
and also when merging the tests for the other merged
MySQL-5.6 features.

=== modified file 'mysys/base64.c'
--- mysys/base64.c 2011-06-30 15:46:53 +0000
+++ mysys/base64.c 2013-03-09 06:22:59 +0000
@@ -1,5 +1,4 @@
-/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun
Microsystems,
Inc.
- Use is subject to license terms.
+/* Copyright (c) 2003, 2010, Oracle and/or its
affiliates.
All rights reserved.

Removed 'all rights reserved'. You can't have that for
GPL code.

If there is any new code from us, please add a
copyright message
for the
MariaDB foundation too!

Removed 'all rights reserved' and added "MariaDB", as there
are some
our own changes.

This program is free software; you can
redistribute it
and/or modify
it under the terms of the GNU General Public
License
as published by
@@ -25,6 +24,28 @@ static char base64_table[] =
"ABCDEFGHIJ

"abcdefghijklmnopqrstuvwxyz"
"0123456789+/";

+/**
+ * Maximum length base64_needed_encoded_length()
+ * can handle without overflow.
+ */
+int
+base64_encode_max_arg_length(____)

+{
+#if (SIZEOF_INT == 8)
+ /*
+ 6827690988321067803 -> 9223372036854775805
+ 6827690988321067804 -> -9223372036854775807
+ */

Please describe from where the following number comes from
(so that anyone can recalculate and check the number if
needed):

I added more comments where these numbers come from.

+ return 0x5EC0D4C77B03531BLL
+#else
+ /*
+ 1589695686 -> 2147483646 <tel:2147483646>
<tel:2147483646 <tel:2147483646>>

+ 1589695687 -> -2147483645 <tel:2147483645>
<tel:2147483645 <tel:2147483645>>

+ */
+ return 0x5EC0D4C6;
+#endif
+}
+

int
base64_needed_encoded_length(____int length_of_data)
@@ -39,6 +60,21 @@
base64_needed_encoded_length(____int length_

}

+/**
+ * Maximum length base64_needed_decoded_length()
+ * can handle without overflow.
+ */
+int
+base64_decode_max_arg_length(____)

+{

The following also need a comment.

Also added comments.

+#if (SIZEOF_INT == 8)
+ return 0x2AAAAAAAAAAAAAAALL;
+#else
+ return 0x2AAAAAAA;
+#endif
+}
+
+

<cut>

+/**
+ * Get the next character from a base64 encoded
stream.
+ * Skip spaces, then scan the next base64
character or a
pad character
+ * and collect bits into decoder->c.
+ *
+ * @param decoder Pointer to MY_BASE64_DECODER
+ * @return
+ * FALSE on success (a valid base64 encoding
character found)
+ * TRUE on error (unexpected character or unexpected
end-of-input found)
+ */

Add an empty line here.

Done.

It would also be good to have a good description of the
base64
format
in this file so that one can understand what the
functions are
supposed to do (or at least a link to the specification).

Some thing I figured out by asking bar on IRC:

- We never generate spaces
- When decoding, we allow spaces anywhere.
- The end of an 64 base decoded string is always
padded with '='
so that the final length is dividable with 4.

I added a reference to
http://en.wikipedia.org/wiki/____Base64
<http://en.wikipedia.org/wiki/__Base64>

<http://en.wikipedia.org/wiki/__Base64
<http://en.wikipedia.org/wiki/Base64>>,

as well all added these your comments in proper places of
the code.

+static inline my_bool
+my_base64_decoder_getch(MY_____BASE64_DECODER

*decoder)

Remove inline from the above; It's likely to make the code
slower, not
faster as this is a big function with lot of if's.

Removed "inline".

{
- char b[3];
- size_t i= 0;
- char *dst_base= (char *)dst;
- char const *src= src_base;
- char *d= dst_base;
- size_t j;
+ if (my_base64_decoder_skip_____spaces(decoder))

+ return TRUE; /* End-of-input */

- while (i < len)
+ if (!my_base64_add(decoder)) /* Valid base64
character
found */
{
- unsigned c= 0;
- size_t mark= 0;
+ if (decoder->mark)
+ {
+ /* If we have scanned '=' already, then only
'=' is
valid */
+ DBUG_ASSERT(decoder->state == 3);
+ decoder->error= 1;
+ decoder->src--;
+ return TRUE; /* expected '=', but encoding
character
found */
+ }
+ decoder->state++;
+ return FALSE;
+ }

If you want to have the funtion inline, then move the
following to
another not inline function; We don't need to generate
code for
this
for every function call:

+
+ /* Process error */
+ switch (decoder->state)
+ {
+ case 0:
+ case 1:
+ decoder->src--;
+ return TRUE; /* base64 character expected */
+ break;
+
+ case 2:
+ case 3:
+ if (decoder->src[-1] == '=')
+ {
+ decoder->error= 0; /* Not an error - it's a pad
character */
+ decoder->mark++;
+ }
+ else
+ {
+ decoder->src--;
+ return TRUE; /* base64 character or '='
expected */
+ }
+ break;
+ default:
+ DBUG_ASSERT(0);
+ return TRUE; /* Wrong state, should not happen */
+ }

<cut>

Add to the comment for base64_decode that the 'dst'
buffer has to be
at least 2 character longer than what is needed to
decode src_base.

I added this comment:

* Note: 'dst' must have sufficient space to store the
decoded data.
* Use base64_needed_decoded_length() to calculate the
correct space
size.

+int
+base64_decode(const char *src_base, size_t len,
+ void *dst, const char **end_ptr, int
flags)
+{
+ if (my_base64_decoder_getch(&____decoder) ||
+ my_base64_decoder_getch(&____decoder) ||
+ my_base64_decoder_getch(&____decoder) ||
+ my_base64_decoder_getch(&____decoder))

+ break;

Generating the error handling with a switch for every
of the
above is wasteful. See previous comment.

Right. my_base64_decoder_getch() is not inline anymore.

+ /* Return error if there are more non-space
characters */
+ decoder.state= 0;
+ if (!my_base64_decoder_skip_____spaces(&decoder))

+ decoder.error= 1;
+
if (end_ptr != NULL)

Note that base64_needed_decoded_length() used ceil()
when it's
not needed.

Removed.

<cut>

=== modified file 'sql/item_strfunc.cc'
@@ -451,6 +452,101 @@ void
Item_func_aes_decrypt::fix_____length_a
set_persist_maybe_null(1);
}

+
+void Item_func_to_base64::fix_____length_and_dec()

+{
+ maybe_null= args[0]->maybe_null;
+ collation.set(default_charset(____),

DERIVATION_COERCIBLE,

MY_REPERTOIRE_ASCII);

Maybe better to cast both arguments to (ulonglong) as
the rest
of the code
is depending on this.

Also, why is base64_encode_max_arg_length() int instead
of uint or
even better size_t?

This is because base64_encode() and base64_decode() are used
in the replication code using "int" as return value.

When exposing TO_BASE64() and FROM_BASE64() to the SQL level,
I did not want to touch the replication code.

+ if (args[0]->max_length > (uint)
base64_encode_max_arg_length()____)
+ {
+ maybe_null= 1;
+ fix_char_length_ulonglong((____ulonglong)
base64_encode_max_arg_length()____);
+ }
+ else
+ {
+ int length= base64_needed_encoded_length((____int)

args[0]->max_length);
+ DBUG_ASSERT(length > 0);

Don't think assert is needed as the function gurantees
it already.

+ fix_char_length_ulonglong((____ulonglong)

length - 1);
+ }
+}

<cut>

+String *Item_func_from_base64::val_____str(String

*str)

+{
+ String *res= args[0]->val_str_ascii(str);
+ bool too_long= false;
+ int length;
+ const char *end_ptr;
+
+ if (!res ||
+ res->length() > (uint)
base64_decode_max_arg_length() ||
+ (too_long=
+ ((uint) (length=
base64_needed_decoded_length((____int)
res->length())) >
+
current_thd->variables.max_____allowed_packet)) ||

+ tmp_value.alloc((uint) length) ||
+ (length= base64_decode(res->ptr(), (int)
res->length(),
+ (char *) tmp_value.ptr(),
&end_ptr, 0)) < 0 ||
+ end_ptr < res->ptr() + res->length())
+ {

Shouldn't we get a readable error or warning if the string
contained wrong
characters?
Now it looks like we will just return null.

Something like 'Malformed base64 string. Error at
position %d" would
be nice.

Good idea. I have added a warning. Now it looks clear what
went wrong
if FROM_BASE64() returns NULL.

+ null_value= 1; // NULL input, too long input,
OOM, or
badly formed input
+ if (too_long)
+ {
+ push_warning_printf(current_____thd,
Sql_condition::WARN_LEVEL_____WARN,
+
ER_WARN_ALLOWED_PACKET_____OVERFLOWED,
+
ER(ER_WARN_ALLOWED_PACKET_____OVERFLOWED),
func_name(),
+
current_thd->variables.max_____allowed_packet);

+ }
+ return 0;
+ }
+ tmp_value.length((uint) length);
+ null_value= 0;
+ return &tmp_value;
+}

<cut>

<cut>

_________________________________________________
Mailing list: https://launchpad.net/~maria-__developers
<https://launchpad.net/~maria-developers>
Post to : maria-developers@lists.__launchpad.net
<mailto:maria-developers@lists.launchpad.net>
<mailto:maria-developers@__lists.launchpad.net
<mailto:maria-developers@lists.launchpad.net>>

Unsubscribe : https://launchpad.net/~maria-__developers
<https://launchpad.net/~maria-developers>
More help : https://help.launchpad.net/__ListHelp

<https://help.launchpad.net/ListHelp>

--
Roberto Spadim
SPAEmpresarial

--
Roberto Spadim
SPAEmpresarial

--
Roberto Spadim
SPAEmpresarial

--
Roberto Spadim
SPAEmpresarial

Roberto Spadim
SPAEmpresarial