[Maria-developers] tmp-table-tests MariaDB 5.2.6 vs. 5.5.12 vs. 5.1.57
Moin, I did some tmp-table tests. As it is told MariaDB is faster on tmp-tables because of the Aria-Engine. Therefore Ive got 3 Severs running: onyx:/data/dbod # for i in 6051 6053 6060; do mysql -u erkan -h 10.255.128.4 -P $i -e 'select version()'; done +------------+ | version() | +------------+ | 5.5.12-log | +------------+ +-------------------+ | version() | +-------------------+ | 5.2.6-MariaDB-log | +-------------------+ +------------+ | version() | +------------+ | 5.1.57-log | key_buffer_size is 16MB. For MariaDB there is also: mysql> show global variables like 'aria_pagecache_buffer_size'; +----------------------------+----------+ | Variable_name | Value | +----------------------------+----------+ | aria_pagecache_buffer_size | 16777216 | +----------------------------+----------+ To provoke tmp-tables I did: onyx:/data/dbod # for i in 6051 6053 6060; do mysql -u erkan -h 10.255.128.4 -P $i -e 'show global variables like "tmp_table_size"'; done +----------------+-------+ | Variable_name | Value | +----------------+-------+ | tmp_table_size | 1024 | +----------------+-------+ +----------------+-------+ | Variable_name | Value | +----------------+-------+ | tmp_table_size | 1024 | +----------------+-------+ +----------------+-------+ | Variable_name | Value | +----------------+-------+ | tmp_table_size | 1024 | +----------------+-------+ We got two tables: mysql> desc sort_id; +-------+---------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+---------+------+-----+---------+-------+ | id | int(11) | YES | | NULL | | +-------+---------+------+-----+---------+-------+ 1 row in set (0.00 sec) mysql> desc sort_1; +-------+---------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+---------+------+-----+---------+-------+ | id | int(11) | YES | | NULL | | | tea | text | YES | | NULL | | +-------+---------+------+-----+---------+-------+ 2 rows in set (0.00 sec) With 10000 Rows each. Tests: onyx:/var/tmp # for i in 6051 6053 6060; do mysqlslap -u erkan -P $i -h 10.255.128.4 --create-schema=sbtest -q "SELECT * from sbtest.sort_1 group by id;" --number-of-queries=20 -i 3; done Benchmark [[5.5.12]] Average number of seconds to run all queries: 1.803 seconds Minimum number of seconds to run all queries: 1.791 seconds Maximum number of seconds to run all queries: 1.826 seconds Number of clients running queries: 1 Average number of queries per client: 20 Benchmark [[MariaDB]] Average number of seconds to run all queries: 5.547 seconds Minimum number of seconds to run all queries: 5.542 seconds Maximum number of seconds to run all queries: 5.556 seconds Number of clients running queries: 1 Average number of queries per client: 20 Benchmark [[5.1.57]] Average number of seconds to run all queries: 1.605 seconds Minimum number of seconds to run all queries: 1.603 seconds Maximum number of seconds to run all queries: 1.606 seconds Number of clients running queries: 1 Average number of queries per client: 20 onyx:/var/tmp # for i in 6051 6053 6060; do mysqlslap -u erkan -P $i -h 10.255.128.4 --create-schema=sbtest -q "SELECT * from sbtest.sort_id group by id;" --number-of-queries=20 -i 3; done Benchmark [[5.5.12]] Average number of seconds to run all queries: 1.146 seconds Minimum number of seconds to run all queries: 1.135 seconds Maximum number of seconds to run all queries: 1.170 seconds Number of clients running queries: 1 Average number of queries per client: 20 Benchmark [[MariaDB]] Average number of seconds to run all queries: 4.359 seconds Minimum number of seconds to run all queries: 4.352 seconds Maximum number of seconds to run all queries: 4.366 seconds Number of clients running queries: 1 Average number of queries per client: 20 Benchmark [[5.1.57]] Average number of seconds to run all queries: 1.049 seconds Minimum number of seconds to run all queries: 1.032 seconds Maximum number of seconds to run all queries: 1.068 seconds Number of clients running queries: 1 Average number of queries per client: 20 So MariaDb doesn't perform in this test. So I wonder I did something wrong ... Regards Erkan -- über den grenzen muß die freiheit wohl wolkenlos sein
Hi, erkan! On May 30, erkan yanar wrote:
Moin, I did some tmp-table tests. As it is told MariaDB is faster on tmp-tables because of the Aria-Engine.
I've tried that, but without mysqlslap. Your table structure, few times more rows, mysql-5.1, mariadb-5.2 with aria tmp tables, mariadb-5.2 with myisam tmp tables. I didn't try mysql-5.5. I used a query like SELECT id, tea from t1 group by left(id,1) order by null; where "group by left(id,1)" makes sure the number of groups is small (only ten), so the result set is small and the query doesn't spend most of its time sending thousands of result rows. And "order by null" removes filesort from the query plan. I've got almost identical numbers (which is a bit scary, considering different setup, and different queries). MariaDB with myisam tmp tables performed similar to MySQL (1.2-1.5 sec) MariaDB with aria tables was slower (5.1-5.7 sec). I remember that few months ago Monty fixed a slowdown caused by Aria temp tables. May be this fix didn't get into 5.2? Or you've found a different problem? I'll let Monty comment on it - he was already debugging this problem once. Regards, Sergei
Hi! >>>>> "Sergei" == Sergei Golubchikwrites: Sergei> Hi, erkan! Sergei> On May 30, erkan yanar wrote: >> >> Moin, >> I did some tmp-table tests. As it is told MariaDB is faster on >> tmp-tables because of the Aria-Engine. Sergei> I've tried that, but without mysqlslap. Your table structure, few times more rows, Sergei> mysql-5.1, mariadb-5.2 with aria tmp tables, mariadb-5.2 with myisam tmp Sergei> tables. I didn't try mysql-5.5. I used a query like Sergei> SELECT id, tea from t1 group by left(id,1) order by null; Sergei> where "group by left(id,1)" makes sure the number of groups is small Sergei> (only ten), so the result set is small and the query doesn't spend most Sergei> of its time sending thousands of result rows. And "order by null" Sergei> removes filesort from the query plan. Sergei> I've got almost identical numbers (which is a bit scary, considering Sergei> different setup, and different queries). Sergei> MariaDB with myisam tmp tables performed similar to MySQL (1.2-1.5 sec) Sergei> MariaDB with aria tables was slower (5.1-5.7 sec). Sergei> I remember that few months ago Monty fixed a slowdown caused by Aria Sergei> temp tables. May be this fix didn't get into 5.2? Or you've found a Sergei> different problem? I'll let Monty comment on it - he was already Sergei> debugging this problem once. I have now analysed this in detail. I used sergis test table: CREATE TABLE `t1` ( `id` int(11) DEFAULT NULL, `tea` text ) ENGINE=MyISAM DEFAULT CHARSET=latin1; I repeated then the following query until I got 2097152 rows. insert t1 select rand()*2e8, repeat(rand(), rand()*64) from t1; Then I issued the following queries. SELECT id, tea from t1 group by left(id,1) order by null; SELECT id, count(*), tea from t1 group by left(id,1) order by null The results I got was (for both queries): Aria(8K) row_format=page: 8.56 sec & 6.06 Aria(8K) row_format=dynamic: 3.19 & 13.32 Aria(1K) row_format=page: 6.40 & 4.88 Aria(1K) row_format=dynamic: 2.02 & 13.32 Aria(2K) row_format=page: 6.40 & 4.88 Aria(2K) row_format=dynamic: 2.26 & 13.51 MyISAM: 2.17, 12.89 The number after Aria is the block size (with mysqld --aria-block-size=). The 'dynamic' test was done by patching mysqld to use the dynamic page format instead of 'page' which is default. Dynamic is what MYISAM is using; If the page size would be the same then Aria and MyISAM should perform identically. >From the above you can see - That page size affects the query time notable in this case. This is because there is more data to move back/from the page cache. The bigger page cache however allows longer keys and fewer index levels so for bigger data sets the different should be smaller. It's possible to in the future optimize Aria to not copy pages from the page cache also for index writes and then this difference should disappear. - Aria is 2x faster than MyISAM when you add 'count(*)' to the query. (More about this later). I also tried the change the query to use left(id,2) to get more groups. Then the result was: Aria(8K) row_format=page: 10.21 & 7.96 Aria(8K) row_format=dynamic: 4.85 & 15.47 MyISAM: 4.04 & 15.14 Here one can see that compared to dynamic, the better code in Aria negates the page cache a bit for dynamic. row_format=page is still slower for the simpler query. I then tried by using group on an integer: SELECT id, tea from t1 group by id % 100 order by null; SELECT id, count(*), tea from t1 group by id % 100 order by null; Aria(8K) row_format=page: 9.83 & 10.23 Aria(8K) row_format=dynamic: 7.19 & 17.42 MyISAM: 6.28 & 17.00 Results is still similar. I examined the code and then found out why Aria was slower for the simpler query. MariaDB has an optimization that when you don't have any summary functions in the query it will just write to the group temp table and let duplicate key handling remove the duplicates. With the 'page' format however this is not a good idea as this is optimized for the case when there is few duplicates and it starts by writing the row, then the keys. What happens is that as almost all rows are duplicates we first insert the row, then notice the duplication and then delete the row. Fortunately it's not common that you have group by queries without a sum function and that's why we have not noticed this before. I have now fixed this in 5.2 so that if there is no summary function we are using the dynamic row format for aria. (It's not trivial to fix the way page format works by first writing the row and this is not something I would recommended to done in a stable release). After this (one line change) the results are: First queries LEFT(id,1) : 3.08 & 6.24 Second LEFT(id,2)): 4.87 & 8.20 Third (id % 10): 7.10 & 10.38 We are still a little bit slower than MyISAM for the simpler, not common queries, (which can be fixed by running with a block size of 2K if this would be critical for anyone) However for queries with sum functions we are about 50 % faster than MyISAM. Erkan, thanks for finding this issue! Regards, Monty
participants (3)
-
erkan yanar
-
Michael Widenius
-
Sergei Golubchik