On Thu, 2023-07-06 at 01:48 +0400, Nikita Malyavin via developers wrote:
On Thu, 6 Jul 2023 at 01:22, Sergei Golubchik <serg@mariadb.org> wrote:
LEX_CSTRING defines a string by a pointer and a length. You shouldn't use functions that stop at the first '\0'.
In some contexts (for table names, for example) it might be ok, but here you create a generic LEX_CSTRING concatenation, so let's not implicitly assume that there are no zero bytes inside.
Ouch! Right, there are more encodings than I have ever encountered. In fact, I don't even know if a 0 byte can occur in any sort of UTF.
UTF-16 certainly *does* include 0 bytes all over the place:

  $ echo -n "Hello" | iconv -f utf8 -t utf16 | hd
  00000000  00 48 00 65 00 6c 00 6c  00 6f                    |.H.e.l.l.o|
  0000000a

By its ASCII-compatible design, UTF-8 does not encode 0 bytes except when encoding the 0 code point (https://stackoverflow.com/a/6907327).
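
To make Sergei's point concrete, here is a rough sketch of a length-based concatenation that copies embedded zero bytes verbatim. It is not the patch under review: lex_cstring_sketch is a hypothetical stand-in for LEX_CSTRING, concat_sketch is a made-up name, and malloc() stands in for whatever allocator (a MEM_ROOT, presumably) the real code would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for LEX_CSTRING: a pointer plus an explicit length. */
typedef struct {
  const char *str;
  size_t length;
} lex_cstring_sketch;

/*
  Concatenate a and b using their stored lengths only.
  No strlen()/strcpy()/strcat(): those stop at the first '\0'.
  Returns {NULL, 0} if the allocation fails; the caller frees result.str.
*/
static lex_cstring_sketch concat_sketch(lex_cstring_sketch a,
                                        lex_cstring_sketch b)
{
  lex_cstring_sketch res = { NULL, 0 };
  char *buf;

  /* A non-empty part must come with a valid pointer. */
  assert(a.length == 0 || a.str != NULL);
  assert(b.length == 0 || b.str != NULL);

  buf = malloc(a.length + b.length + 1);
  if (!buf)
    return res;

  if (a.length)
    memcpy(buf, a.str, a.length);
  if (b.length)
    memcpy(buf + a.length, b.str, b.length);

  /* Convenience terminator only; it is not counted in length. */
  buf[a.length + b.length] = '\0';

  res.str = buf;
  res.length = a.length + b.length;
  return res;
}
```

With inputs such as {"ab\0cd", 5} and {"\0ef", 3} the result has length 8, whereas a strcat()-style version would silently stop after "ab".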