[Maria-developers] Bug in ENUM options handling?
While working on MariaDB on HP-UX we noticed an inconsistency involving the type of enum arguments. In init_variables and init_one_value (mysys/my_getopt.c), ENUM options are treated as unsigned longs but the enum arguments are actually declared as uint instead of ulong. While this doesn't matter in 32 bit mode when int's and longs's are the same size, it can cause problems in 64 bit mode when long's are 64 bits and int's 32 bits. We found 4 variables that we think should be ulong instead of uint: plugin_maturity plugin_maturity_map thread_handling delay_key_write_options We also changed the value field in the sys_var_enum class (sql/set_var.h) from a uint pointer to a ulong pointer and changed the sys_var_enum and sys_var_enum_const constructors to use a ulong pointer argument for value_arg to make everything consistent. I suppose init_variables and init_one_value could be changed to treat ENUM's as uint instead of ulong but the use of ulong seemed to be intentional so we didn't try making that change as an alternative fix. Is this a change that should be made in the official MariaDB sources? Steve Ellcey sje@cup.hp.com
-----Original Message----- From: maria-developers- bounces+wlad=montyprogram.com@lists.launchpad.net [mailto:maria- developers-bounces+wlad=montyprogram.com@lists.launchpad.net] On Behalf Of Steve Ellcey Sent: Freitag, 28. Oktober 2011 22:38 To: maria-developers@lists.launchpad.net Subject: [Maria-developers] Bug in ENUM options handling?
While working on MariaDB on HP-UX we noticed an inconsistency involving the type of enum arguments.
In init_variables and init_one_value (mysys/my_getopt.c), ENUM options are treated as unsigned longs but the enum arguments are actually declared as uint instead of ulong. While this doesn't matter in 32 bit mode when int's and longs's are the same size, it can cause problems in 64 bit mode when long's are 64 bits and int's 32 bits.
Hi, I would prefer not to use ulong at all. It is not uniformly 64 bit on 64 bit OSes. If one needs unsigned integer that 64 bit on 64 bit OSes, and 32 or 32 bit, size_t seems to be the right choice to me, at least I have not seen exceptions from this rule (but though someone mentioned that on Cray that would not work:). uintptr_t sounds ok too. If nothing else, there would be thousands less warnings on Win64 if people avoided ulong usage. It one does not want to spare bits, ulonglong seems to be right choice. Wlad
Hi!
"Vladislav" == Vladislav Vaintroub <wlad@montyprogram.com> writes:
-----Original Message----- From: maria-developers- bounces+wlad=montyprogram.com@lists.launchpad.net [mailto:maria- developers-bounces+wlad=montyprogram.com@lists.launchpad.net] On Behalf Of Steve Ellcey Sent: Freitag, 28. Oktober 2011 22:38 To: maria-developers@lists.launchpad.net Subject: [Maria-developers] Bug in ENUM options handling?
While working on MariaDB on HP-UX we noticed an inconsistency involving the type of enum arguments.
In init_variables and init_one_value (mysys/my_getopt.c), ENUM options are treated as unsigned longs but the enum arguments are actually declared as uint instead of ulong. While this doesn't matter in 32 bit mode when int's and longs's are the same size, it can cause problems in 64 bit mode when long's are 64 bits and int's 32 bits.
Vladislav> Hi, Vladislav> I would prefer not to use ulong at all. It is not uniformly 64 bit on 64 bit Vladislav> OSes. If one needs unsigned integer that 64 bit on 64 bit OSes, and 32 or 32 Vladislav> bit, size_t seems to be the right choice to me, at least I have not seen Vladislav> exceptions from this rule (but though someone mentioned that on Cray that Vladislav> would not work:). uintptr_t sounds ok too. size_t is not a good choice as it's signed on some platforms. uintptr_t is not a good thing either, as it has to do with pointers, not integer handling (will look strange in function prototypes). 'ulong' stands in MariaDB for 'the most efficient integer type that is may be longer than uint. I don't think it's a problem to use ulong as long as you don't mix it with uint / ulonglong without casting. That said, as this is fixed in 5.5 and there is no known problem with MariaDB 5.3 related to this, this shouldn't be a big issue. Vladislav> If nothing else, there would be thousands less warnings on Win64 if people Vladislav> avoided ulong usage. ulong should only be given to functions that takes ulong, in which case there is no be any warnings. I do agree that we need to ensure that types are used consistently so that we in the end get no warnings at all. Regards, Monty
-----Original Message----- From: Michael Widenius [mailto:monty@askmonty.org] Sent: Donnerstag, 3. November 2011 11:29 To: Vladislav Vaintroub Cc: sje@cup.hp.com; maria-developers@lists.launchpad.net Subject: Re: [Maria-developers] Bug in ENUM options handling?
<skip>
Vladislav> Hi, Vladislav> I would prefer not to use ulong at all. It is not uniformly 64 bit on 64 bit Vladislav> OSes. If one needs unsigned integer that 64 bit on 64 bit OSes, and 32 or 32 Vladislav> bit, size_t seems to be the right choice to me, at least I have not seen Vladislav> exceptions from this rule (but though someone mentioned that on Cray that Vladislav> would not work:). uintptr_t sounds ok too.
Hi Monty
size_t is not a good choice as it's signed on some platforms. uintptr_t is not a good thing either, as it has to do with pointers, not integer handling (will look strange in function prototypes).
Can you tell me on which platforms that we support it is signed? Wikipedia says it is unsigned http://en.wikipedia.org/wiki/Size_t , and personally I do not recall seeing anything else (I do not exclude any possibility it was so on some historical systems, but I'm interested in current ones we use)
'ulong' stands in MariaDB for 'the most efficient integer type that is may be longer than uint. I don't think it's a problem to use ulong as long as you don't mix it with uint / ulonglong without casting.
That said, as this is fixed in 5.5 and there is no known problem with MariaDB 5.3 related to this, this shouldn't be a big issue.
There *may* be plenty of latent hidden problems with it. This bug http://bugs.mysql.com/bug.php?id=43932 took me a week to debug . MyISAM used to do arithmetic on ulongs that resulted in the index corruption once key buffer size was larger than 4GB on Windows (ulong multiplication). i.e it took me a week to debug this problem, just because the startup of a debug server with 5GB buffer size took half an hour on a 4G machine (huge memsets in the debug version of malloc in C runtime), and to catch the error I also had to run 1 hour test. So potential data loss in truncation it is not only an annoying warning, it may well be a bug that results in data corruption on the user's site. Wlad
I believe the the debate ongoing below can be laid to rest at *MOST* platforms size_t will work correctly, I have found few exceptions, such as GCC prior to 2.4 size_t is defined as a signed integer, in sys/types.h to match the definitions located in this file for other nix related platforms, and stddef.h just includes this definition; this is a very old version of GCC and I believe (correct me if I am wrong) it would not compile MariaDB's source code anyway, not positive though haven't tried it. Also I found an interesting side-note, with a separate but obscure hardware related issue with this variable size; on m68k apparently size_t is defined as a 16bit unsigned int, when the processor is capable of handling objects larger as arguments to memcpy (cited in an eetimes.com article), not a very likely situation to run into but the macro SIZE_MAX if it exists for all MariaDB supported platforms may provide the method for evaluating its signed-ness. I am of the understanding that yes, size_t should be unsigned and ssize_t would be the signed version if needed for something. Jakob Lorberblatt
-----Original Message----- From: Michael Widenius [mailto:monty@askmonty.org] Sent: Donnerstag, 3. November 2011 11:29 To: Vladislav Vaintroub Cc: sje@cup.hp.com; maria-developers@lists.launchpad.net Subject: Re: [Maria-developers] Bug in ENUM options handling?
<skip>
Vladislav> Hi, Vladislav> I would prefer not to use ulong at all. It is not uniformly 64 bit on 64 bit Vladislav> OSes. If one needs unsigned integer that 64 bit on 64 bit OSes, and 32 or 32 Vladislav> bit, size_t seems to be the right choice to me, at least I have not seen Vladislav> exceptions from this rule (but though someone mentioned that on Cray that Vladislav> would not work:). uintptr_t sounds ok too.
Hi Monty
size_t is not a good choice as it's signed on some platforms. uintptr_t is not a good thing either, as it has to do with pointers, not integer handling (will look strange in function prototypes).
Can you tell me on which platforms that we support it is signed? Wikipedia says it is unsigned http://en.wikipedia.org/wiki/Size_t , and personally I do not recall seeing anything else (I do not exclude any possibility it was so on some historical systems, but I'm interested in current ones we use)
'ulong' stands in MariaDB for 'the most efficient integer type that is may be longer than uint. I don't think it's a problem to use ulong as long as you don't mix it with uint / ulonglong without casting.
That said, as this is fixed in 5.5 and there is no known problem with MariaDB 5.3 related to this, this shouldn't be a big issue.
There *may* be plenty of latent hidden problems with it. This bug http://bugs.mysql.com/bug.php?id=43932 took me a week to debug . MyISAM used to do arithmetic on ulongs that resulted in the index corruption once key buffer size was larger than 4GB on Windows (ulong multiplication). i.e it took me a week to debug this problem, just because the startup of a debug server with 5GB buffer size took half an hour on a 4G machine (huge memsets in the debug version of malloc in C runtime), and to catch the error I also had to run 1 hour test.
So potential data loss in truncation it is not only an annoying warning, it may well be a bug that results in data corruption on the user's site.
Wlad
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Hi!
"Jakob" == Jakob Lorberblatt <jakob@native-intelligence.net> writes:
Jakob> I believe the the debate ongoing below can be laid to rest at *MOST* Jakob> platforms size_t will work correctly, I have found few exceptions, such as Jakob> GCC prior to 2.4 size_t is defined as a signed integer, in sys/types.h to Jakob> match the definitions located in this file for other nix related Jakob> platforms, and stddef.h just includes this definition; this is a very old Jakob> version of GCC and I believe (correct me if I am wrong) it would not Jakob> compile MariaDB's source code anyway, not positive though haven't tried Jakob> it. I have compiled and used MySQL with very old versions of gcc and with systems where size_t is signed. I don't think it would be hard to get the current code to work on these systems. The issue I have mainly with size_t for enum is that in some people's mind it's not guaranteed to be unsigned and because of that it's not a perfect choice. Personally I prefer to use 'uint, ulong, ulonglong' or even uint32 and uint32 instead of size_t for some objects. For strings and objects sizes in memory I think that size_t is ok. Regards, Monty
-----Original Message----- From: maria-developers- bounces+wlad=montyprogram.com@lists.launchpad.net [mailto:maria- developers-bounces+wlad=montyprogram.com@lists.launchpad.net] On Behalf Of Michael Widenius Sent: Freitag, 4. November 2011 12:11 To: Jakob Lorberblatt Cc: maria-developers@lists.launchpad.net Subject: Re: [Maria-developers] Bug in ENUM options handling?
Hi!
"Jakob" == Jakob Lorberblatt <jakob@native-intelligence.net> writes:
Jakob> I believe the the debate ongoing below can be laid to rest at *MOST* Jakob> platforms size_t will work correctly, I have found few exceptions, such as Jakob> GCC prior to 2.4 size_t is defined as a signed integer, in sys/types.h to Jakob> match the definitions located in this file for other nix related Jakob> platforms, and stddef.h just includes this definition; this is a very old Jakob> version of GCC and I believe (correct me if I am wrong) it would not Jakob> compile MariaDB's source code anyway, not positive though haven't tried Jakob> it.
I have compiled and used MySQL with very old versions of gcc and with systems where size_t is signed. I don't think it would be hard to get the current code to work on these systems.
There is no urgent need to port MariaDB or MySQL to very old gcc.
The issue I have mainly with size_t for enum is that in some people's mind it's not guaranteed to be unsigned and because of that it's not a perfect choice.
For enum, signed or not should not be a big deal. As I see it, even signed char with would suffice.
Personally I prefer to use 'uint, ulong, ulonglong' or even uint32 and uint32 instead of size_t for some objects.
I understand. As long as everyone is using GCC with default setting, which does not warn about implicit conversion, one cannot see which of implicit conversions resulting in 12872 warnings here http://buildbot.askmonty.org/buildbot/builders/win2008r2-vs2010-amd64-packag es/builds/248 are noise, and which ones are not. I used to babysit the build looking for new warning after someone else's push, but I stopped doing that and the number of warnings has approximately doubled since. To fix that cleanly, it is not just enough to eliminate ulong. One would also need to enable -Wno-conversion on newer GCC (this at least catches conversions not involving ulong , e.g longlong to int, double to char), and as long as you prefer using smaller than natural types, write code full of casts , or just adopt size_t whenever appropriate. Wlad
Hi!
"Vladislav" == Vladislav Vaintroub <wlad@montyprogram.com> writes:
<cut>
size_t is not a good choice as it's signed on some platforms. uintptr_t is not a good thing either, as it has to do with pointers, not integer handling (will look strange in function prototypes).
Vladislav> Can you tell me on which platforms that we support it is signed? Wikipedia Vladislav> says it is unsigned http://en.wikipedia.org/wiki/Size_t , and personally I Vladislav> do not recall seeing anything else (I do not exclude any possibility it was Vladislav> so on some historical systems, but I'm interested in current ones we use)
From and older GCC doc:
---------------------- There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in stddef.h to be whatever type the system's sys/types.h defines it to be. Most Unix systems that define size_t in sys/types.h, define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed ---------------------- As MariaDB may need to be ported to older systems, it's good to at least keep the above in mind.
'ulong' stands in MariaDB for 'the most efficient integer type that is may be longer than uint. I don't think it's a problem to use ulong as long as you don't mix it with uint / ulonglong without casting.
That said, as this is fixed in 5.5 and there is no known problem with MariaDB 5.3 related to this, this shouldn't be a big issue.
Vladislav> There *may* be plenty of latent hidden problems with it. This bug Vladislav> http://bugs.mysql.com/bug.php?id=43932 took me a week to debug . MyISAM Vladislav> used to do arithmetic on ulongs that resulted in the index corruption once Vladislav> key buffer size was larger than 4GB on Windows (ulong multiplication). i.e Vladislav> it took me a week to debug this problem, just because the startup of a debug Vladislav> server with 5GB buffer size took half an hour on a 4G machine (huge memsets Vladislav> in the debug version of malloc in C runtime), and to catch the error I also Vladislav> had to run 1 hour test. The above may also happen on 32 bit systems. Yes, it's clear that things 'that may be big' should never by ulong. However, there are still cases when using int, uint and ulong make sence. I don't think it's a good idea to start to use ulonglong for everything, which would have been required to solve the above issue. Vladislav> So potential data loss in truncation it is not only an annoying warning, it Vladislav> may well be a bug that results in data corruption on the user's site. Regards, Monty
Hi, Steve! On Oct 28, Steve Ellcey wrote:
While working on MariaDB on HP-UX we noticed an inconsistency involving the type of enum arguments.
In init_variables and init_one_value (mysys/my_getopt.c), ENUM options are treated as unsigned longs but the enum arguments are actually declared as uint instead of ulong. While this doesn't matter in 32 bit mode when int's and longs's are the same size, it can cause problems in 64 bit mode when long's are 64 bits and int's 32 bits.
We found 4 variables that we think should be ulong instead of uint:
plugin_maturity plugin_maturity_map thread_handling delay_key_write_options
Yes. In 5.5 I've added an assert: DBUG_ASSERT(size == sizeof(ulong)); to Sys_var_enum class (5.5 has slightly different syntax for defining variables). So in 5.5 all these variables are already ulong. Regards, Sergei
participants (5)
-
Jakob Lorberblatt
-
Michael Widenius
-
Sergei Golubchik
-
Steve Ellcey
-
Vladislav Vaintroub