[Maria-developers] Analysing MariaDB 5.5 sysbench performance regression
-----Original Message-----
From: maria-developers-bounces+wlad=montyprogram.com@lists.launchpad.net [mailto:maria-developers-bounces+wlad=montyprogram.com@lists.launchpad.net] On Behalf Of Kristian Nielsen
Sent: Tuesday, 20 March 2012 11:37
To: maria-developers@lists.launchpad.net
Subject: [Maria-developers] Analysing MariaDB 5.5 sysbench performance regression
I have been analysing the performance difference between MariaDB 5.5.21 and MySQL 5.5.21 using oltp.lua in sysbench 0.5. It turns out that MySQL has about 8% higher tps in this benchmark when running CPU bound. (See below for links to blog posts that started this).
I took the base sysbench run and tweaked it to get stable numbers - basically put the datadir in ramdisk and adjusted my.cnf a bit (more details below). Then I used OProfile to analyse where the difference comes from.
Oprofile is a really great tool for this kind of analysis, where CPU time spent is the measurement. It is very easy to use, and it gives detailed and accurate measurements of how much time is spent in each function. It is then easy to compare this against MySQL and find where we have performance regressions. Further detail is available by adding call graphs to the reports, which show from where the expensive parts are called, and by using the annotation facility, which annotates every function and puts execution times on the individual statements and assembler instructions.
Here is an overview of where the extra time is spent. Time is in units of samples (one sample per 100,000 clock cycles); the total was around 4.15M samples. Here are the top functions I found using more time in MariaDB, and the number of extra samples spent in MariaDB:
  39k  mysql_select
  20k  alloc_root
  16k  JOIN::choose_subquery_plan
  10k  thd_ha_data
  10k  make_sortkey()
   9k  get_best_combination
   9k  handler::read_range_next
   7k  handler::ha_thd
   7k  update_ref_and_keys
   7k  Item::cmp_type
   6k  Item::operator delete
   5k  ha_maria::implicit_commit
   4k  end_write
   4k  sort_and_filter_keyuse
   4k  my_interval_timer
   4k  st_select_lex::master_unit
   4k  st_select_lex::save_leaf_tables
   4k  sel_arg_range_seq_next
   4k  thd_test_options
   4k  Item_func::walk
   4k  TABLE::init
   4k  sub_select
   3k  strnmov
   3k  ut_free
   3k  vio_write
   3k  ut_malloc_low
   2k  setup_tables
   2k  create_ref_for_key
Some care is needed to interpret these data. GCC -O3 does heavy inlining. Oprofile is a statistical profiler, so the numbers are somewhat uncertain, more so for the smaller ones. And modern CPUs are superscalar, out-of-order, and hyper-threaded, which makes interpreting individual instruction costs highly challenging.
But the list can be used to drill down into the actual code; opannotate --source --assembler is very enlightening, in my experience (a few examples below).
From this list, it seems that - not surprisingly - the 5.3 optimiser features have added a bit of overhead. Much of this is spread out over many places in the code, though it could be that the root cause comes from a few inlined functions.
I checked mysql_select(), and most of the overhead is from lp:941889, expensive constructors in JOIN, which was inlined by GCC. I believe Sergey plans to fix this. More analysis of the individual functions is needed to find the cause there. It seems we are also using alloc_root() somewhat more.
thd_ha_data and maybe handler::ha_thd are likely due to using the dynamically loaded innodb_plugin rather than the built-in one.
I would like to do more analysis (and would probably need help from optimiser-knowledgeable people for most of it) - I think it is feasible to remove most of the overhead in a modest time frame.
But my main point is to present these results, to show that we have a very useful tool to pin-point these performance issues, and to start discussion in more detail (and I have plenty more of those).
-----------------------------------------------------------------------
More background:
These benchmarks started from these blog posts:
http://www.mysqlperformanceblog.com/2012/02/18/mariadb-5-3-4-benchmarks/
http://blog.montyprogram.com/benchmarking-mariadb-5-3-4/
The main difference in these tests seems to be caused by the different XtraDB buffer pool flush algorithm. So Axel tried using innodb_plugin in MariaDB instead, and got much smaller differences - but a difference was still there.
Coincidentally, I'm running some performance comparisons of 5.2 vs 5.3 with a sampling profiler (it is not oprofile, and it is not perf, and I do not think anyone but me will run it). I'm running the simplest possible benchmark here - sysbench 0.5 using a prepared SELECT 1 statement, 64 users. Its performance is 80% compared to the 5.2 baseline. The things I see here that can be simply avoided:

1. Initialization of large arrays of objects (setting up virtual function pointers): lp:941889, expensive constructors in JOIN, which was inlined by GCC. This one is not as conclusive as I hoped it would be. On some machines it brings as much as 15%, on others merely 1%.

2. Once the JOIN constructor is fixed to avoid virtual objects here, comparing the profiles shows differences in heap allocations. I think alloc_root in Kristian's profile has to do with it, too. The inclusive sample count has grown by a high number, but the exclusive samples are almost all in RtlAllocateHeap and friends (yes, I'm profiling on Windows). The call stacks point to the same JOIN allocator. My conclusion: JOIN has simply grown too large in 5.3, to the point where the malloc()s do not optimize anymore. Heap allocators (Windows or not) optimize well for small sizes, but not for huge things like JOIN. As I see it, this is also the main reason why Kristian sees a grown alloc_root proportion. To prove the theory, I can "fix" it by changing #define MAX_TABLES in unireg.h to 10 instead of 64.

3. A seemingly easy thing: this is not 5.3 specific, but it seems strange to me anyway - on each select, even on "SELECT 1", 8KB of memory is allocated and freed, in sql_base.cc, in open_tables():

init_sql_alloc(&new_frm_mem, 8024, 8024);

If someone knows why it is necessary, I'd like to hear an opinion. I can change it to init_sql_alloc(&new_frm_mem, 8024, 0); to avoid the allocations.
With all optimizations (and the hack of redefining MAX_TABLES to 10), I get the kind of performance we had in 5.2, with 1% less throughput on my "SELECT 1" test, not 25% as originally. Other things I noticed that are not so much performance relevant, but still worth noting: time measuring - THD::set_time / QueryPerformanceCounter can be seen in the 5.3 profile (not a big deal, 2%). They were not there in 5.2.

- Wlad
So for 5.5 I took Axel's benchmark, adjusted until I got stable, fully cpu-bound numbers, and analysed the results, as above. I have the actual scripts and my.cnf available if anyone is interested. The main points are the datadir in ramdisk, reducing the redo log size to 300MB total, and running for 30 seconds.
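For concreteness, the kind of configuration described might look like this. This is a sketch, not the actual my.cnf used; everything except the ramdisk datadir and the 300MB total redo log is an illustrative assumption:

```ini
[mysqld]
# datadir on a ramdisk so the benchmark is CPU bound, not I/O bound
datadir = /dev/shm/mysql-bench

# 300MB of redo log in total: 2 files x 150MB
innodb_log_files_in_group = 2
innodb_log_file_size      = 150M

# buffer pool large enough to keep the whole working set in memory
innodb_buffer_pool_size   = 1G
```

The 30-second run would then be driven by something like sysbench 0.5's --max-time=30 option.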
As examples, I analysed a handful of cases, and came up with two simple patches for specific issues.
One is update_virtual_fields(), which originally accounted for 0.5% of total time in sysbench. It seems to be called for every row, so just the push/pop of registers in the function prologue became significant. I fixed this with the patch below; it puts a NULL pointer into the table instead of an empty list of virtual fields, and guards a few frequent calls with a NULL check.
Another is my_hash_sort_simple(). This is not a regression, but Oprofile shows we spend 10% of total time in this function.
opannotate of the assembler showed the problem. The simple patch below speeds it up by 33%, by reducing 4 memory loads per loop iteration to 2. Still, it would be nice to understand why this is called so much - here is the call stack:
my_hash_sort_simple()
ha_heap::write_row()
evaluate_join_record()
sub_select()
do_select()
JOIN::exec()
mysql_select()
It would be nice to understand why this is needed - this is an expensive function; it needs to look up every character in the collation tables. If this could be eliminated, that alone would probably be enough to win over MySQL, even without fixing any of the regressions.
Some other top contenders in the profile are InnoDB functions, for example buf_calc_page_new_checksum(). This is expensive, as it needs to loop over every byte in each buffer pool page (16KB), computing a 7-deep dependency chain for each byte. This could be much faster if the checksum were done on 8-byte longwords, but AFAIK it is not trivial to change the checksum algorithm due to migration issues. Facebook may have done some work on this.
Another top contender is row_sel_store_mysql_rec() and similar functions - they show high costs in the opannotate output on loads from the records. This is probably cache misses from accessing pages in the buffer pool - not much to do about that, I think.
Hope this helps,
- Kristian.
----------------------------------------------------------------------- Patch for my_hash_sort_simple():
=== modified file 'strings/ctype-simple.c'
--- strings/ctype-simple.c	2012-01-13 14:50:02 +0000
+++ strings/ctype-simple.c	2012-03-16 16:34:31 +0000
@@ -306,19 +306,24 @@ void my_hash_sort_simple(CHARSET_INFO *c
 {
   register const uchar *sort_order=cs->sort_order;
   const uchar *end;
+  ulong n1, n2;

   /*
     Remove end space. We have to do this to be able to compare
     'A ' and 'A' as identical
   */
   end= skip_trailing_space(key, len);
-
+
+  n1= *nr1;
+  n2= *nr2;
   for (; key < (uchar*) end ; key++)
   {
-    nr1[0]^=(ulong) ((((uint) nr1[0] & 63)+nr2[0]) *
-             ((uint) sort_order[(uint) *key])) + (nr1[0] << 8);
-    nr2[0]+=3;
+    n1^=(ulong) ((((uint) n1 & 63)+n2) *
+        ((uint) sort_order[(uint) *key])) + (n1 << 8);
+    n2+=3;
   }
+  *nr1= n1;
+  *nr2= n2;
 }
----------------------------------------------------------------------- Patch for update_virtual_fields():
=== modified file 'sql/filesort.cc'
--- sql/filesort.cc	2012-03-11 22:45:18 +0000
+++ sql/filesort.cc	2012-03-16 19:17:58 +0000
@@ -589,7 +589,7 @@ static ha_rows find_all_keys(SORTPARAM *
     {
       if ((error= select->quick->get_next()))
         break;
-      if (!error)
+      if (!error && sort_form->vfield)
         update_virtual_fields(thd, sort_form);
       file->position(sort_form->record[0]);
       DBUG_EXECUTE_IF("debug_filesort", dbug_print_record(sort_form, TRUE););
@@ -608,7 +608,7 @@ static ha_rows find_all_keys(SORTPARAM *
     else
     {
       error=file->ha_rnd_next(sort_form->record[0]);
-      if (!error)
+      if (!error && sort_form->vfield)
         update_virtual_fields(thd, sort_form);
       if (!flag)
       {
=== modified file 'sql/records.cc'
--- sql/records.cc	2012-01-13 14:50:02 +0000
+++ sql/records.cc	2012-03-16 19:15:51 +0000
@@ -344,7 +344,8 @@ static int rr_quick(READ_RECORD *info)
       break;
     }
   }
-  update_virtual_fields(info->thd, info->table);
+  if (info->table->vfield)
+    update_virtual_fields(info->thd, info->table);
   return tmp;
 }
=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc	2012-03-11 22:45:18 +0000
+++ sql/sql_select.cc	2012-03-16 19:15:05 +0000
@@ -15824,7 +15824,8 @@ evaluate_join_record(JOIN *join, JOIN_TA
     DBUG_RETURN(NESTED_LOOP_KILLED);            /* purecov: inspected */
   }
-  update_virtual_fields(join->thd, join_tab->table);
+  if (join_tab->table->vfield)
+    update_virtual_fields(join->thd, join_tab->table);
   if (select_cond)
   {
=== modified file 'sql/table.cc'
--- sql/table.cc	2012-03-02 23:03:20 +0000
+++ sql/table.cc	2012-03-18 16:10:18 +0000
@@ -2466,30 +2466,35 @@ int open_table_from_share(THD *thd, TABL

   /* Process virtual columns, if any. */
-  if (!(vfield_ptr = (Field **) alloc_root(&outparam->mem_root,
-                                           (uint) ((share->vfields+1)*
-                                                   sizeof(Field*)))))
-    goto err;
-
-  outparam->vfield= vfield_ptr;
-
-  for (field_ptr= outparam->field; *field_ptr; field_ptr++)
+  if (!share->vfields)
+    outparam->vfield= NULL;
+  else
   {
-    if ((*field_ptr)->vcol_info)
+    if (!(vfield_ptr = (Field **) alloc_root(&outparam->mem_root,
+                                             (uint) ((share->vfields+1)*
+                                                     sizeof(Field*)))))
+      goto err;
+
+    outparam->vfield= vfield_ptr;
+
+    for (field_ptr= outparam->field; *field_ptr; field_ptr++)
     {
-      if (unpack_vcol_info_from_frm(thd,
-                                    outparam,
-                                    *field_ptr,
-                                    &(*field_ptr)->vcol_info->expr_str,
-                                    &error_reported))
+      if ((*field_ptr)->vcol_info)
       {
-        error= 4;               // in case no error is reported
-        goto err;
+        if (unpack_vcol_info_from_frm(thd,
+                                      outparam,
+                                      *field_ptr,
+                                      &(*field_ptr)->vcol_info->expr_str,
+                                      &error_reported))
+        {
+          error= 4;             // in case no error is reported
+          goto err;
+        }
+        *(vfield_ptr++)= *field_ptr;
       }
-      *(vfield_ptr++)= *field_ptr;
     }
+    *vfield_ptr= 0;             // End marker
   }
-  *vfield_ptr= 0;               // End marker

 #ifdef WITH_PARTITION_STORAGE_ENGINE
   if (share->partition_info_str_len && outparam->file)
-----------------------------------------------------------------------
Kristian Nielsen <knielsen@knielsen-hq.org> writes:

Some additional info found from the analysis so far:
> 39k mysql_select
This seems mostly caused by expensive JOIN constructor, as Wlad mentioned.
> 16k JOIN::choose_subquery_plan
This is caused by unnecessary C++ object construction (thanks, Timour!). Moving the declaration to after the early return taken when there are no subqueries fixes this.
> 10k make_sortkey()
False alarm, just caused by different inlining (in my MySQL compilation, this cost is attributed to filesort()).
> 9k get_best_combination
Also seems to be just different inlining, though there may be a small additional cost due to copying optimiser structures that are larger in MariaDB.
> 9k handler::read_range_next
Seems this is caused by new features in MariaDB - extra statistics collected (increment_statistics()) and LIMIT_ROWS_EXAMINED. These have a small cost, as they are done for each row.

- Kristian.
> But my main point is to present these results, to show that we have a very useful tool to pin-point these performance issues, and to start discussion in more detail (and I have plenty more of those).
Nice work.
> Another is my_hash_sort_simple(). This is not a regression, but Oprofile shows we spend 10% of total time in this function.
Does this do UTF-aware compares or memcmp?
> It would be nice to understand why this is needed - this is an expensive function, it needs to look up every character in the collation tables. If this could be eliminated, that alone would probably be enough to win over MySQL, even without fixing any of the regressions.

> Some other top contenders in the profile are InnoDB functions, for example buf_calc_page_new_checksum(). This is expensive, as it needs to loop over every byte in each buffer pool page (16KB), computing a 7-deep dependency chain for each byte. This could be much faster if the checksum were done on 8-byte longwords, but AFAIK it is not trivial to change the checksum algorithm due to migration issues. Facebook may have done some work on this.
For the old-style checksum that you describe above, "gcc -O3" was much faster than "gcc -O2" using an older version (maybe 4.2) of gcc. If you are willing to require a dump/reload and break InnoDB binary compatibility (I don't recommend this), then some results are at: http://mysqlha.blogspot.com/2009/05/innodb-checksum-performance.html

The Facebook patch added support for crc32 checksums that can use the crc32 instructions on Intel HW. Official MySQL then added something like that in one of their new releases.

-- Mark Callaghan mdcallag@gmail.com
MARK CALLAGHAN <mdcallag@gmail.com> writes:
> > Another is my_hash_sort_simple(). This is not a regression, but Oprofile shows we spend 10% of total time in this function.

> Does this do UTF-aware compares or memcmp?
This one uses collations - so not memcmp. It is used for SELECT DISTINCT and GROUP BY. This particular test was not utf8, though; it was using an 8-bit charset (probably latin1). The utf8 version of this looks significantly more expensive. Monty will fix the heap tables to remove this overhead.
> For the old-style checksum that you describe above, "gcc -O3" was much faster than "gcc -O2" using an older version (maybe 4.2) of gcc. If you are willing to require a dump/reload and break InnoDB binary compatibility (I don't recommend this), then some results are at: http://mysqlha.blogspot.com/2009/05/innodb-checksum-performance.html
I checked the assembler; the compilation was optimal, it is just an expensive algorithm - a minimum of 7 cycles per each of the 16K bytes in the page. 33.6% is a really high overhead! I suppose this comes from heavy buffer pool flushing, causing a new checksum calculation for every few rows inserted - and maybe from cache misses going over each buffer pool page.
> The Facebook patch added support for crc32 checksums that can use the crc32 instructions on Intel HW. Official MySQL then added something like that in one of their new releases.
Ok, that's good - best if data format changes are coordinated by InnoDB upstream. Thanks, - Kristian.
Kristian Nielsen wrote:
> I have been analysing the performance difference between MariaDB 5.5.21 and MySQL 5.5.21 using oltp.lua in sysbench 0.5.
I started running a new sysbench series last week. So far I have results for MariaDB-5.3-bzr (r3481), MariaDB-5.5-bzr (r3354) and MySQL-5.1. MySQL-5.5 is still running.

The sysbench RO results look very promising so far. Maria 5.3 is fastest now, and Maria 5.5 is very comparable to MySQL 5.1. For Maria 5.3 there is now only a very small difference between InnoDB and XtraDB; for Maria 5.5, InnoDB is significantly faster.

The results for sysbench RW do not look good :( I see significant stalls, and they seem to be worse with Maria 5.5. Those could however be caused by configuration (5.3/5.5 use different I/O parameter settings due to the different feature set). See for yourself in the attached plots.

XL
participants (4)
- Axel Schwenke
- Kristian Nielsen
- MARK CALLAGHAN
- Vladislav Vaintroub