Re: [Maria-developers] a4be420c920: MDEV-19235 MariaDB Server compiled for 128 Indexes crashes at startup
Hi, Vladislav! Thanks. That's pretty much the solution I had in mind too :) A couple of comments on details below. I've only commented on sql_bitmap.h, everything else is fine. On May 08, Vladislav Vaintroub wrote:
revision-id: a4be420c920 (mariadb-10.4.4-25-ga4be420c920) parent(s): ee4a2fef181 author: Vladislav Vaintroub <wlad@mariadb.com> committer: Vladislav Vaintroub <wlad@mariadb.com> timestamp: 2019-04-17 17:17:06 +0100 message:
MDEV-19235 MariaDB Server compiled for 128 Indexes crashes at startup
With MAX_INDEXIES=64(default), key_map=Bitmap<64> is just a wrapper around ulonglong and thus "trivial" (can be bzero-ed, or memcpy-ed, and stays valid still)
With MAX_INDEXES=128, key_map = Bitmap<128> is not a "trivial" type anymore. The implementation uses MY_BITMAP, and MY_BITMAP contains pointers which make Bitmap invalid, when it is memcpy-ed/bzero-ed.
The problem in 10.4 is that there are many new key_map members, inside TABLE or KEY, and those are often memcopied and bzeroed
The fix makes Bitmap "trivial", by inlining most of MY_BITMAP functionality. pointers/heap allocations are not used anymore.
diff --git a/sql/sql_bitmap.h b/sql/sql_bitmap.h index 705a8d169e1..74866336a1a 100644 --- a/sql/sql_bitmap.h +++ b/sql/sql_bitmap.h @@ -25,73 +25,184 @@
#include <my_sys.h> #include <my_bitmap.h> +#include <my_bit.h>
template <uint width> class Bitmap { - MY_BITMAP map; +public: uint32 buffer[(width+31)/32];
Why is it public?
public: - Bitmap() { init(); } + Bitmap():buffer()
What does that do?
+ { + } - Bitmap(const Bitmap& from) { *this=from; } - explicit Bitmap(uint prefix_to_set) { init(prefix_to_set); } + explicit Bitmap(uint prefix_to_set):buffer() + { + set_prefix(prefix_to_set); + } - void init() { my_bitmap_init(&map, buffer, width, 0); } - void init(uint prefix_to_set) { init(); set_prefix(prefix_to_set); } uint length() const { return width; } - Bitmap& operator=(const Bitmap& map2) - { + void init() + { + clear_all(); + } + void init(uint prefix_to_set) + { init(); - memcpy(buffer, map2.buffer, sizeof(buffer)); - return *this; + set_prefix(prefix_to_set);
I wonder whether init() method is still needed. You've removed it from the constructor. And if there's any other code that uses it, I'd rather it'd be using clear_all().
} - void set_bit(uint n) { bitmap_set_bit(&map, n); } + void set_bit(uint n) + { + DBUG_ASSERT(n < width); + ((uchar*)buffer)[n / 8] |= (1 << (n & 7)); + } - void clear_bit(uint n) { bitmap_clear_bit(&map, n); } + void clear_bit(uint n) + { + DBUG_ASSERT(n < width); + ((uchar*)buffer)[n / 8] &= ~(1 << (n & 7)); + } - void set_prefix(uint n) { bitmap_set_prefix(&map, n); } + void set_prefix(uint prefix_size) + { + set_if_smaller(prefix_size, width); + uint prefix_bytes, prefix_bits, d; + uchar* m = (uchar*)buffer; + + if ((prefix_bytes = prefix_size / 8)) + memset(m, 0xff, prefix_bytes); + m += prefix_bytes; + if ((prefix_bits = prefix_size & 7)) + { + *(m++) = (1 << prefix_bits) - 1; + // As the prefix bits are set, lets count this byte too as a prefix byte. + prefix_bytes++; + } + if ((d = (width + 7) / 8 - prefix_bytes)) + memset(m, 0, d); + } - void set_all() { bitmap_set_all(&map); } + void set_all() + { + set_prefix(width);
may be better to optimize it? memset(buffer, 0xff, sizeof(buffer) - 1); ((uchar*)buffer)[sizeof(buffer)-1] = last_byte_mask(width);
+ } - void clear_all() { bitmap_clear_all(&map); } + void clear_all() + { + memset(buffer, 0x00, sizeof(buffer)); + } - void intersect(Bitmap& map2) { bitmap_intersect(&map, &map2.map); } + void intersect(Bitmap & map2) + { + for (uint i = 0; i < array_elements(buffer); i++) + buffer[i] &= map2.buffer[i]; + } - void intersect(ulonglong map2buff) - { - // Use a spearate temporary buffer, as bitmap_init() clears all the bits. - ulonglong buf2; - MY_BITMAP map2; - - my_bitmap_init(&map2, (uint32 *) &buf2, sizeof(ulonglong) * 8, 0); - - // Store the original bits. - if (sizeof(ulonglong) >= 8) - { - int8store(const_cast<uchar *>(static_cast<uchar *> - (static_cast<void *>(&buf2))), - map2buff); - } - else - { - DBUG_ASSERT(sizeof(buffer) >= 4); - int4store(const_cast<uchar *>(static_cast<uchar *> - (static_cast<void *>(&buf2))), - static_cast<uint32>(map2buff)); - } - bitmap_intersect(&map, &map2); - } + void intersect_and_pad(ulonglong map2buff, bool pad_with_ones)
make it private?
+ { + compile_time_assert(sizeof(ulonglong) == 8); + uint32 tmp[2]; + int8store(tmp, map2buff); + if (array_elements(buffer) > 2 && pad_with_ones) + set_all();
really? I'd expect that if pad_with_ones==true, you simply don't need to do anything to the rest of the buffer.
+ + buffer[0] &= tmp[0]; + if (array_elements(buffer) > 1) + buffer[1] &= tmp[1]; + + if (array_elements(buffer) > 2 && !pad_with_ones) + memset((char*)buffer + 8, 0 , sizeof(buffer) - 8); + } + void intersect(ulonglong map2buff) + { + intersect_and_pad(map2buff, 0); + } /* Use highest bit for all bits above sizeof(ulonglong)*8. */ void intersect_extended(ulonglong map2buff) { - intersect(map2buff); - if (map.n_bits > sizeof(ulonglong) * 8) - bitmap_set_above(&map, sizeof(ulonglong), - MY_TEST(map2buff & (1LL << (sizeof(ulonglong) * 8 - 1)))); + intersect_and_pad(map2buff, (map2buff & (1ULL << 63))); } - void subtract(Bitmap& map2) { bitmap_subtract(&map, &map2.map); } + void subtract(Bitmap & map2) + { + for (size_t i = 0; i < array_elements(buffer); i++) + buffer[i] &= ~(map2.buffer[i]); + } - void merge(Bitmap& map2) { bitmap_union(&map, &map2.map); } + void merge(Bitmap & map2) + { + for (size_t i = 0; i < array_elements(buffer); i++) + buffer[i] |= map2.buffer[i]; + } - bool is_set(uint n) const { return bitmap_is_set(&map, n); } + bool is_set(uint n) const + { + DBUG_ASSERT(n < width); + return ((uchar*)buffer)[n / 8] & (1 << (n & 7)); + } - bool is_prefix(uint n) const { return bitmap_is_prefix(&map, n); } + bool is_prefix(uint prefix_size) const + { + uint prefix_mask = last_byte_mask(prefix_size); + uchar* m = (uchar*)buffer; + uchar* end_prefix = m + (prefix_size - 1) / 8; + uchar* end; + DBUG_ASSERT(prefix_size <= width); + + /* Empty prefix is always true */ + if (!prefix_size) + return true; + + while (m < end_prefix) + if (*m++ != 0xff) + return false; + + end = ((uchar*)buffer) + (width + 7) / 8 - 1; + if (m == end) + return ((*m & last_byte_mask(width)) == prefix_mask); + + if (*m != prefix_mask) + return false; + + while (++m < end) + if (*m != 0) + return false; + return ((*m & last_byte_mask(width)) == 0); + } - bool is_clear_all() const { return bitmap_is_clear_all(&map); } + bool is_clear_all() const + { + for (size_t i= 0; i < array_elements(buffer); i++) + if (buffer[i]) + return false; + return true; + } - bool is_set_all() const { return bitmap_is_set_all(&map); } + bool is_set_all() const + { + if (width == sizeof(buffer) * 8) + { + for (size_t i = 0; i < array_elements(buffer); i++) + if (buffer[i] != 0xFFFFFFFFU) + return false; + return true; + } + else + return is_prefix(width);
this can also be optimized to not use complex is_prefix(), but I don't think is_set_all() is used as often as set_all(), so it shouldn't matter as much.
+ } - bool is_subset(const Bitmap& map2) const { return bitmap_is_subset(&map, &map2.map); } + bool is_subset(const Bitmap & map2) const + { + for (size_t i= 0; i < array_elements(buffer); i++) + if (buffer[i] & ~(map2.buffer[i])) + return false; + return true; + } - bool is_overlapping(const Bitmap& map2) const { return bitmap_is_overlapping(&map, &map2.map); } + bool is_overlapping(const Bitmap & map2) const + { + for (size_t i = 0; i < array_elements(buffer); i++) + if (buffer[i] & map2.buffer[i]) + return true; + return false; + } - bool operator==(const Bitmap& map2) const { return bitmap_cmp(&map, &map2.map); } + bool operator==(const Bitmap & map2) const + { + return memcmp(buffer, map2.buffer, sizeof(buffer)) == 0; + } - bool operator!=(const Bitmap& map2) const { return !(*this == map2); } + bool operator!=(const Bitmap & map2) const + { + return !(*this == map2); + } char *print(char *buf) const { char *s=buf;
Regards, Sergei Chief Architect MariaDB and security@mariadb.org
Hi Serg, Thanks for looking
+public: uint32 buffer[(width+31)/32]; Why is it public?
No reason, an oversight. Leftover from previous testing, I fixed it.
+ Bitmap():buffer() zero initialization of member variable ,here of the type array (s https://stackoverflow.com/a/23987926/547065) I removed it by replacing with clear_all(), so it is better understandable.
I wonder whether init() method is still needed. You've removed it from the constructor. And if there's any other code that uses it, I'd rather it'd be using clear_all().
Done.
+ void set_all() + { + set_prefix(width);
may be better to optimize it? memset(buffer, 0xff, sizeof(buffer) - 1); ((uchar*)buffer)[sizeof(buffer)-1] = last_byte_mask(width);
I’m not really convinced this would be an optimization 😊 My claim is that, since “width” is constant at compile time (128), and it is divisible by 32, any decent C++ compiler must be able to figure out, at compile time, that set_all() == set_prefix(width)==set_prefix(128) trivially reduces to a shorter memset(buffer, 0xff, 32)
+ { + compile_time_assert(sizeof(ulonglong) == 8); + uint32 tmp[2]; + int8store(tmp, map2buff); + if (array_elements(buffer) > 2 && pad_with_ones) + set_all();
really? I'd expect that if pad_with_ones==true, you simply don't need to do anything to the rest of the buffer.
The original implementation does padding with 0xff or 0x00 , depending on the highest order bit of map2buff, with bitmap_set_above(). retained that property
+ bool is_set_all() const + { + if (width == sizeof(buffer) * 8) + { + for (size_t i = 0; i < array_elements(buffer); i++) + if (buffer[i] != 0xFFFFFFFFU) + return false; + return true; + } + else + return is_prefix(width);
this can also be optimized to not use complex is_prefix(), but I don't think is_set_all() is used as often as set_all(), so it shouldn't matter as much.
In fact, it is optimized 😊 If this condition if (width == sizeof(buffer) * 8) us true, s optimized version is used. This condition can be evaluated at compile time, and is true for width which is divisible by 32. the code for is_prefix should not be even generated by the compile Thanks again for looking, Wlad From: Sergei Golubchik<mailto:serg@mariadb.org> Sent: Wednesday, 8 May 2019 22:49 Subject: Re: a4be420c920: MDEV-19235 MariaDB Server compiled for 128 Indexes crashes at startup Hi, Vladislav! Thanks. That's pretty much the solution I had in mind too :) A couple of comments on details below. I've only commented on sql_bitmap.h, everything else is fine. On May 08, Vladislav Vaintroub wrote:
revision-id: a4be420c920 (mariadb-10.4.4-25-ga4be420c920) parent(s): ee4a2fef181 author: Vladislav Vaintroub <wlad@mariadb.com> committer: Vladislav Vaintroub <wlad@mariadb.com> timestamp: 2019-04-17 17:17:06 +0100 message:
MDEV-19235 MariaDB Server compiled for 128 Indexes crashes at startup
With MAX_INDEXIES=64(default), key_map=Bitmap<64> is just a wrapper around ulonglong and thus "trivial" (can be bzero-ed, or memcpy-ed, and stays valid still)
With MAX_INDEXES=128, key_map = Bitmap<128> is not a "trivial" type anymore. The implementation uses MY_BITMAP, and MY_BITMAP contains pointers which make Bitmap invalid, when it is memcpy-ed/bzero-ed.
The problem in 10.4 is that there are many new key_map members, inside TABLE or KEY, and those are often memcopied and bzeroed
The fix makes Bitmap "trivial", by inlining most of MY_BITMAP functionality. pointers/heap allocations are not used anymore.
diff --git a/sql/sql_bitmap.h b/sql/sql_bitmap.h index 705a8d169e1..74866336a1a 100644 --- a/sql/sql_bitmap.h +++ b/sql/sql_bitmap.h @@ -25,73 +25,184 @@
#include <my_sys.h> #include <my_bitmap.h> +#include <my_bit.h>
template <uint width> class Bitmap { - MY_BITMAP map; +public: uint32 buffer[(width+31)/32];
Why is it public?
public: - Bitmap() { init(); } + Bitmap():buffer()
What does that do?
+ { + } - Bitmap(const Bitmap& from) { *this=from; } - explicit Bitmap(uint prefix_to_set) { init(prefix_to_set); } + explicit Bitmap(uint prefix_to_set):buffer() + { + set_prefix(prefix_to_set); + } - void init() { my_bitmap_init(&map, buffer, width, 0); } - void init(uint prefix_to_set) { init(); set_prefix(prefix_to_set); } uint length() const { return width; } - Bitmap& operator=(const Bitmap& map2) - { + void init() + { + clear_all(); + } + void init(uint prefix_to_set) + { init(); - memcpy(buffer, map2.buffer, sizeof(buffer)); - return *this; + set_prefix(prefix_to_set);
I wonder whether init() method is still needed. You've removed it from the constructor. And if there's any other code that uses it, I'd rather it'd be using clear_all().
} - void set_bit(uint n) { bitmap_set_bit(&map, n); } + void set_bit(uint n) + { + DBUG_ASSERT(n < width); + ((uchar*)buffer)[n / 8] |= (1 << (n & 7)); + } - void clear_bit(uint n) { bitmap_clear_bit(&map, n); } + void clear_bit(uint n) + { + DBUG_ASSERT(n < width); + ((uchar*)buffer)[n / 8] &= ~(1 << (n & 7)); + } - void set_prefix(uint n) { bitmap_set_prefix(&map, n); } + void set_prefix(uint prefix_size) + { + set_if_smaller(prefix_size, width); + uint prefix_bytes, prefix_bits, d; + uchar* m = (uchar*)buffer; + + if ((prefix_bytes = prefix_size / 8)) + memset(m, 0xff, prefix_bytes); + m += prefix_bytes; + if ((prefix_bits = prefix_size & 7)) + { + *(m++) = (1 << prefix_bits) - 1; + // As the prefix bits are set, lets count this byte too as a prefix byte. + prefix_bytes++; + } + if ((d = (width + 7) / 8 - prefix_bytes)) + memset(m, 0, d); + } - void set_all() { bitmap_set_all(&map); } + void set_all() + { + set_prefix(width);
may be better to optimize it? memset(buffer, 0xff, sizeof(buffer) - 1); ((uchar*)buffer)[sizeof(buffer)-1] = last_byte_mask(width);
+ } - void clear_all() { bitmap_clear_all(&map); } + void clear_all() + { + memset(buffer, 0x00, sizeof(buffer)); + } - void intersect(Bitmap& map2) { bitmap_intersect(&map, &map2.map); } + void intersect(Bitmap & map2) + { + for (uint i = 0; i < array_elements(buffer); i++) + buffer[i] &= map2.buffer[i]; + } - void intersect(ulonglong map2buff) - { - // Use a spearate temporary buffer, as bitmap_init() clears all the bits. - ulonglong buf2; - MY_BITMAP map2; - - my_bitmap_init(&map2, (uint32 *) &buf2, sizeof(ulonglong) * 8, 0); - - // Store the original bits. - if (sizeof(ulonglong) >= 8) - { - int8store(const_cast<uchar *>(static_cast<uchar *> - (static_cast<void *>(&buf2))), - map2buff); - } - else - { - DBUG_ASSERT(sizeof(buffer) >= 4); - int4store(const_cast<uchar *>(static_cast<uchar *> - (static_cast<void *>(&buf2))), - static_cast<uint32>(map2buff)); - } - bitmap_intersect(&map, &map2); - } + void intersect_and_pad(ulonglong map2buff, bool pad_with_ones)
make it private?
+ { + compile_time_assert(sizeof(ulonglong) == 8); + uint32 tmp[2]; + int8store(tmp, map2buff); + if (array_elements(buffer) > 2 && pad_with_ones) + set_all();
really? I'd expect that if pad_with_ones==true, you simply don't need to do anything to the rest of the buffer.
+ + buffer[0] &= tmp[0]; + if (array_elements(buffer) > 1) + buffer[1] &= tmp[1]; + + if (array_elements(buffer) > 2 && !pad_with_ones) + memset((char*)buffer + 8, 0 , sizeof(buffer) - 8); + } + void intersect(ulonglong map2buff) + { + intersect_and_pad(map2buff, 0); + } /* Use highest bit for all bits above sizeof(ulonglong)*8. */ void intersect_extended(ulonglong map2buff) { - intersect(map2buff); - if (map.n_bits > sizeof(ulonglong) * 8) - bitmap_set_above(&map, sizeof(ulonglong), - MY_TEST(map2buff & (1LL << (sizeof(ulonglong) * 8 - 1)))); + intersect_and_pad(map2buff, (map2buff & (1ULL << 63))); } - void subtract(Bitmap& map2) { bitmap_subtract(&map, &map2.map); } + void subtract(Bitmap & map2) + { + for (size_t i = 0; i < array_elements(buffer); i++) + buffer[i] &= ~(map2.buffer[i]); + } - void merge(Bitmap& map2) { bitmap_union(&map, &map2.map); } + void merge(Bitmap & map2) + { + for (size_t i = 0; i < array_elements(buffer); i++) + buffer[i] |= map2.buffer[i]; + } - bool is_set(uint n) const { return bitmap_is_set(&map, n); } + bool is_set(uint n) const + { + DBUG_ASSERT(n < width); + return ((uchar*)buffer)[n / 8] & (1 << (n & 7)); + } - bool is_prefix(uint n) const { return bitmap_is_prefix(&map, n); } + bool is_prefix(uint prefix_size) const + { + uint prefix_mask = last_byte_mask(prefix_size); + uchar* m = (uchar*)buffer; + uchar* end_prefix = m + (prefix_size - 1) / 8; + uchar* end; + DBUG_ASSERT(prefix_size <= width); + + /* Empty prefix is always true */ + if (!prefix_size) + return true; + + while (m < end_prefix) + if (*m++ != 0xff) + return false; + + end = ((uchar*)buffer) + (width + 7) / 8 - 1; + if (m == end) + return ((*m & last_byte_mask(width)) == prefix_mask); + + if (*m != prefix_mask) + return false; + + while (++m < end) + if (*m != 0) + return false; + return ((*m & last_byte_mask(width)) == 0); + } - bool is_clear_all() const { return bitmap_is_clear_all(&map); } + bool is_clear_all() const + { + for (size_t i= 0; i < array_elements(buffer); i++) + if (buffer[i]) + return false; + return true; + } - bool is_set_all() const { return bitmap_is_set_all(&map); } + bool is_set_all() const + { + if (width == sizeof(buffer) * 8) + { + for (size_t i = 0; i < array_elements(buffer); i++) + if (buffer[i] != 0xFFFFFFFFU) + return false; + return true; + } + else + return is_prefix(width);
this can also be optimized to not use complex is_prefix(), but I don't think is_set_all() is used as often as set_all(), so it shouldn't matter as much.
+ } - bool is_subset(const Bitmap& map2) const { return bitmap_is_subset(&map, &map2.map); } + bool is_subset(const Bitmap & map2) const + { + for (size_t i= 0; i < array_elements(buffer); i++) + if (buffer[i] & ~(map2.buffer[i])) + return false; + return true; + } - bool is_overlapping(const Bitmap& map2) const { return bitmap_is_overlapping(&map, &map2.map); } + bool is_overlapping(const Bitmap & map2) const + { + for (size_t i = 0; i < array_elements(buffer); i++) + if (buffer[i] & map2.buffer[i]) + return true; + return false; + } - bool operator==(const Bitmap& map2) const { return bitmap_cmp(&map, &map2.map); } + bool operator==(const Bitmap & map2) const + { + return memcmp(buffer, map2.buffer, sizeof(buffer)) == 0; + } - bool operator!=(const Bitmap& map2) const { return !(*this == map2); } + bool operator!=(const Bitmap & map2) const + { + return !(*this == map2); + } char *print(char *buf) const { char *s=buf;
Regards, Sergei Chief Architect MariaDB and security@mariadb.org
Hi, Vladislav! On May 09, Vladislav Vaintroub wrote:
+ void set_all() + { + set_prefix(width);
may be better to optimize it? memset(buffer, 0xff, sizeof(buffer) - 1); ((uchar*)buffer)[sizeof(buffer)-1] = last_byte_mask(width);
I’m not really convinced this would be an optimization 😊 My claim is that, since “width” is constant at compile time (128), and it is divisible by 32, any decent C++ compiler must be able to figure out, at compile time, that set_all() == set_prefix(width)==set_prefix(128) trivially reduces to a shorter
I hoped a compiler would optimize it, all constants are known at compile time, indeed. But I didn't quite trust it to do so :) But ok, whatever you prefer.
+ { + compile_time_assert(sizeof(ulonglong) == 8); + uint32 tmp[2]; + int8store(tmp, map2buff); + if (array_elements(buffer) > 2 && pad_with_ones) + set_all();
really? I'd expect that if pad_with_ones==true, you simply don't need to do anything to the rest of the buffer.
The original implementation does padding with 0xff or 0x00 , depending on the highest order bit of map2buff, with bitmap_set_above(). retained that property
Oh. Could you please add a comment about this method's semantics? Based just on the name, one can really expect that the Bitmap is intersected with "ulonglong map2buff argument, padded with 0's or 1's up to Bitmap's width". And even with that semantics set_all() looks wrong. It should only set the tail (bits above 64) to 1, not all bits. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
participants (2)
-
Sergei Golubchik
-
Vladislav Vaintroub