[Maria-discuss] Strange timestamp index behaviour / bug
Hello, before reporting it as bug I wanted to understand what's really going on or how to investigate it further: Because of the DST changes we experienced a strange issue where afterwards a query suddenly wouldn't use an index on a timestamp column anymore and would do a full tablescan and because of the table size literally brought down the db. The table itself is more complex but I could simplify it down to: CREATE TABLE `tmp` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `ts` timestamp NOT NULL DEFAULT current_timestamp(), PRIMARY KEY (`id`), KEY `ts_2` (`ts`) ) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; (the original table is TokuDB but the same behavior happens also on InnoDB / Server Mariadb 10.3.13): And the issue is: MariaDB [db]> explain select * from tmp where `ts` > '2019-03-31 03:00:00'; +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ | 1 | SIMPLE | tmp | index | ts_2 | ts_2 | 4 | NULL | 1000 | Using where; Using index | +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ While actually: MariaDB [db]> select count(*) from tmp where `ts` > '2019-03-31 03:00:00'; +----------+ | count(*) | +----------+ | 175 | +----------+ So either the sql optimizer or the engine does something strange. While for example: MariaDB [db]> explain select * from tmp where `ts` > '2019-03-31 04:00:00'; +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ | 1 | SIMPLE | tmp | range | ts_2 | ts_2 | 4 | NULL | 175 | Using where; Using index | +------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+ .. works as expected. The data is basically (just missing the 3:xx because of the clock spin): | 35458195 | 2019-03-31 02:55:44 | | 35458196 | 2019-03-31 02:55:45 | | 35458197 | 2019-03-31 02:56:37 | | 35458198 | 2019-03-31 02:59:16 | | 35458199 | 2019-03-31 02:59:39 | | 35458200 | 2019-03-31 04:01:42 | | 35458201 | 2019-03-31 04:02:59 | | 35458202 | 2019-03-31 04:04:02 | | 35458203 | 2019-03-31 04:04:26 | | 35458204 | 2019-03-31 04:04:38 | In the Innodb case the explain while showing that it will return all rows (the test table has 1000) also showing that it will use the 'ts_2' index on the original tokudb table it won't use the index even forced: MariaDB [db]> explain select * from twitt_ext_posts_url where `ts` > '2019-03-31 04:00:00'; +------+-------------+---------------------+-------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------------+-------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | twitt_ext_posts_url | range | ts_2 | ts_2 | 4 | NULL | 42444 | Using where | +------+-------------+---------------------+-------+---------------+------+---------+------+-------+-------------+ .. works again fine, but: MariaDB [db]> explain select * from twitt_ext_posts_url where `ts` > '2019-03-31 03:00:00'; +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ | 1 | SIMPLE | twitt_ext_posts_url | ALL | ts_2 | NULL | NULL | NULL | 28547845 | Using where | +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ MariaDB [db]> explain select * from twitt_ext_posts_url force index(ts_2) where `ts` > '2019-03-31 03:00:00'; +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ | 1 | SIMPLE | twitt_ext_posts_url | ALL | ts_2 | NULL | NULL | NULL | 28547872 | Using where | +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ Is this to be expected or a bug? Should a gap in index values drop the index use for range queries? Is there anything else I need to add to the report? rr
Hi, Reinis! On Apr 01, Reinis Rozitis wrote:
Hello, before reporting it as bug I wanted to understand what's really going on or how to investigate it further:
Because of the DST changes we experienced a strange issue where afterwards a query suddenly wouldn't use an index on a timestamp column anymore and would do a full tablescan and because of the table size literally brought down the db. ... MariaDB [db]> explain select * from tmp where `ts` > '2019-03-31 03:00:00';
This is intentional behavior. Perhaps it might be improved, but it's not a bug. For range access to work, the literal must have the same data type as the column or it should be possible to convert it to the column type. For example, this is fine, same data type: explain select * from int_column > 10 this is fine too, the data type differs, but it can be converted to an int: explain select * from int_column > 1e9 But this is not, the value cannot be converted to an int: explain select * from int_column > 1e10 In your case you have a timestamp column and a string literal that cannot be converted to a timestamp. Regards, Sergei Chief Architect MariaDB and security@mariadb.org
Hi, I was playing more with the issue of Mariadb not using a timestamp index for a particular time range - can someone explain why the following happens (or how the timestamps actually work?): db:~ # date Wed Apr 3 18:17:36 EEST 2019 MariaDB [db]> SELECT @@GLOBAL.time_zone, @@SESSION.time_zone; +--------------------+---------------------+ | @@GLOBAL.time_zone | @@SESSION.time_zone | +--------------------+---------------------+ | SYSTEM | SYSTEM | +--------------------+---------------------+ MariaDB [db]> explain select * from twitt_ext_posts_url where `ts` > '2019-03-31 03:00:00'; +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ | 1 | SIMPLE | twitt_ext_posts_url | ALL | ts_2 | NULL | NULL | NULL | 28468881 | Using where | +------+-------------+---------------------+------+---------------+------+---------+------+----------+-------------+ (same happens with SET TIME_ZONE = 'Europe/Helsinki';) then again: MariaDB [db]> SET TIME_ZONE='+03:00'; MariaDB [db]> SELECT @@GLOBAL.time_zone, @@SESSION.time_zone; +--------------------+---------------------+ | @@GLOBAL.time_zone | @@SESSION.time_zone | +--------------------+---------------------+ | SYSTEM | +03:00 | +--------------------+---------------------+ MariaDB [db]> explain select * from twitt_ext_posts_url where `ts` > '2019-03-31 03:00:00'; +------+-------------+---------------------+-------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------------+-------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | twitt_ext_posts_url | range | ts_2 | ts_2 | 4 | NULL | 55114 | Using where | +------+-------------+---------------------+-------+---------------+------+---------+------+-------+-------------+ So 28 milion rows vs just 55k. I understand that because of DST technically '2019-03-31 03:xx:xx' values for particular EEST timezone in real life can't happen but why is that a reason for the sql query planer to fail to use the index for the range and instead read the whole table?
From the programming perspective the application actually used something like 'ts > NOW() - 10 hours' so for one hour when it converted to 03:xx:xx the db server was burning down.
rr
Hi! Am 03.04.19 um 17:36 schrieb Reinis Rozitis:
Hi, I was playing more with the issue of Mariadb not using a timestamp index for a particular time range - can someone explain why the following happens (or how the timestamps actually work?):
you better say server version and show create table result, but first try compare a timestamp with a timestamp (if it was really timestamp field) try to compare it with number of seconds from epoch (instead of date which do not map to timestamps 1:1). Maybe here is the problem.
participants (3)
-
Oleksandr Byelkin
-
Reinis Rozitis
-
Sergei Golubchik