
From this list, it seems that - not surprising - the 5.3 optimiser features seem to have added a bit of overhead. Much of this is spread out over many
I have been analysing the performance difference between MariaDB 5.5.21 and MySQL 5.5.21 using oltp.lua in sysbench 0.5. It turns out that MySQL has about 8% higher tps in this benchmark when running CPU bound. (See below for links to blog posts that started this). I took the base sysbench run and tweaked it to get stable numbers - basically put the datadir in ramdisk and adjusted my.cnf a bit (more details below). Then I used OProfile to analyse where the difference comes from. Oprofile is a really great tool for this kind of analysis, where CPU time spent is the measurement. It is very easy to use, and it gives detailed and accurate measurements on how much time is spent in each function. It is then easy to compare this against MySQL and find where we have performance regressions. Then further details is available by adding call graph to the reports, which shows from where the expensive parts are called; and by using the annotation facility which annotates every function and puts execution times on the individual statement and assembler functions. Here is an overview of where the extra time is spent. Time is in units of samples, which is 1/100000 of a clock cycle. Total was around 4.15M samples. Here are the top functions I found used more time in MariaDB, and the amount of extra samples spent in MariaDB: 39k mysql_select 20k alloc_root 16k JOIN::choose_subquery_plan 10k thd_ha_data 10k make_sortkey() 9k get_best_combination 9k handler::read_range_next 7k handler::ha_thd 7k update_ref_and_keys 7k Item::cmp_type 6k Item::operator delete 5k ha_maria::implicit_commit 4k end_write 4k sort_and_filter_keyuse 4k my_interval_timer 4k st_select_lex::master_unit 4k st_select_lex::save_leaf_tables 4k sel_arg_range_seq_next 4k thd_test_options 4k Item_func::walk 4k TABLE::init 4k sub_select 3k strnmov 3k ut_free 3k vio_write 3k ut_malloc_low 2k setup_tables 2k create_ref_for_key Some care is needed to interpret these data. GCC -O3 does heavy inlining. Oprofile is a statistical profiler, so numbers are somewhat uncertain, more so for smaller numbers. And modern CPUs are superscalar, out-of-order, and hyper-threaded, which makes interpreting individual instruction costs higly challenging. But the list can be used to drill down into the actual code; opannotate --source --assembler is very enlightening, in my experience (few examples below). places in the code, though it could be that the root cause comes from few inlined functions. I checked mysql_select(), and most of the overhead is from lp:941889, expensive constructors in JOIN, which was inlined by GCC. I believe Sergey plans to fix this. More analysis of the individual functions is needed to find the cause there. It seems we are also using alloc_root() somewhat more. thd_ha_data and maybe handler_thd are likely due to using dynamically loaded innodb_plugin rather than built-in. I would like to do more analysis (and would probably need help from optimiser-knowledgable people for most of it) - I think it is feasible to remove most of the overhead in a modest time frame. But my main point is to present these results, to show that we have a very useful tool to pin-point these performance issues, and to start discussion in more detail (and I have plenty more of those). ----------------------------------------------------------------------- More background: These benchmarks started from these blog posts: http://www.mysqlperformanceblog.com/2012/02/18/mariadb-5-3-4-benchmarks/ http://blog.montyprogram.com/benchmarking-mariadb-5-3-4/ The main difference in these tests seem to be caused by different XtraDB buffer pool flush algorithm. So Axel tried using innodb_pluing in MariaDB instead, and got much smaller differences - but a difference was still there. So for 5.5 I took Axel's benchmark, adjusted until I got stable, fully cpu-bound numbers, and analysed the results, as above. I have the actual scripts and my.cnf available if anyone is interested. Main point is datadir in ramdisk, reducing redo log size to 300MB total, and running for 30 seconds. As examples, I analysed a handful of cases, and came up with two simple patches for specific issues. One is update_virtual_fields(), which originally accounted for 0.5% of total time in sysbench. This seems to be called for every row, so just the push/pop of registers in function prologue became significant. I fixed this with below patch; it puts a NULL pointer into the table instead of an empty list of virtual fields, and guards a few frequent calls with an if (NULL) check. Another is my_hash_sort_simple(). This is not a regression, but Oprofile shows we spend 10% of total time in this function. opannotate of the assembler showed the problem. Below simple patch speeds it up by 33%, by reducing 4 memory loads per loop to 2. Still, it would be nice to understand why this is called that much - here is the call stack: my_hash_sort_simple() ha_heap::write_row(). evaluate_join_record() sub_select() do_select() JOIN::exec() mysql_select() It would be nice to understand why this is needed - this is an expensive function, it needs to lookup every character in the collation tables. If this could be eliminated, that alone would probably be enough to win over MySQL, even without fixing any of the regressions. Some other top contenders in the profile are innodb functions. For example buf_calc_page_new_checksum(). This is expensive, as it needs to loop over every byte in each buffer pool page (16K), computing a 7-long dependency chain for each byte. This could be much faster if the checksum was done on 8-byte longwords, but AFAIK, it's not trivial to change the checksum algorithm due to the migration issues. Facebook may have done some work on this. Another top contender is row_sel_store_mysql_rec() and similar - they show high costs in the opannotate on loads from the records, this is probably cache misses from accessing pages in the buffer pool - not too much to do about that I think. Hope this helps, - Kristian. ----------------------------------------------------------------------- Patch for my_hash_sort_simple(): === modified file 'strings/ctype-simple.c' --- strings/ctype-simple.c 2012-01-13 14:50:02 +0000 +++ strings/ctype-simple.c 2012-03-16 16:34:31 +0000 @@ -306,19 +306,24 @@ void my_hash_sort_simple(CHARSET_INFO *c { register const uchar *sort_order=cs->sort_order; const uchar *end; + ulong n1, n2; /* Remove end space. We have to do this to be able to compare 'A ' and 'A' as identical */ end= skip_trailing_space(key, len); - + + n1= *nr1; + n2= *nr2; for (; key < (uchar*) end ; key++) { - nr1[0]^=(ulong) ((((uint) nr1[0] & 63)+nr2[0]) * - ((uint) sort_order[(uint) *key])) + (nr1[0] << 8); - nr2[0]+=3; + n1^=(ulong) ((((uint) n1 & 63)+n2) * + ((uint) sort_order[(uint) *key])) + (n1 << 8); + n2+=3; } + *nr1= n1; + *nr2= n2; } ----------------------------------------------------------------------- Patch for update_virtual_fields(): === modified file 'sql/filesort.cc' --- sql/filesort.cc 2012-03-11 22:45:18 +0000 +++ sql/filesort.cc 2012-03-16 19:17:58 +0000 @@ -589,7 +589,7 @@ static ha_rows find_all_keys(SORTPARAM * { if ((error= select->quick->get_next())) break; - if (!error) + if (!error && sort_form->vfield) update_virtual_fields(thd, sort_form); file->position(sort_form->record[0]); DBUG_EXECUTE_IF("debug_filesort", dbug_print_record(sort_form, TRUE);); @@ -608,7 +608,7 @@ static ha_rows find_all_keys(SORTPARAM * else { error=file->ha_rnd_next(sort_form->record[0]); - if (!error) + if (!error && sort_form->vfield) update_virtual_fields(thd, sort_form); if (!flag) { === modified file 'sql/records.cc' --- sql/records.cc 2012-01-13 14:50:02 +0000 +++ sql/records.cc 2012-03-16 19:15:51 +0000 @@ -344,7 +344,8 @@ static int rr_quick(READ_RECORD *info) break; } } - update_virtual_fields(info->thd, info->table); + if (info->table->vfield) + update_virtual_fields(info->thd, info->table); return tmp; } === modified file 'sql/sql_select.cc' --- sql/sql_select.cc 2012-03-11 22:45:18 +0000 +++ sql/sql_select.cc 2012-03-16 19:15:05 +0000 @@ -15824,7 +15824,8 @@ evaluate_join_record(JOIN *join, JOIN_TA DBUG_RETURN(NESTED_LOOP_KILLED); /* purecov: inspected */ } - update_virtual_fields(join->thd, join_tab->table); + if (join_tab->table->vfield) + update_virtual_fields(join->thd, join_tab->table); if (select_cond) { === modified file 'sql/table.cc' --- sql/table.cc 2012-03-02 23:03:20 +0000 +++ sql/table.cc 2012-03-18 16:10:18 +0000 @@ -2466,30 +2466,35 @@ int open_table_from_share(THD *thd, TABL /* Process virtual columns, if any. */ - if (!(vfield_ptr = (Field **) alloc_root(&outparam->mem_root, - (uint) ((share->vfields+1)* - sizeof(Field*))))) - goto err; - - outparam->vfield= vfield_ptr; - - for (field_ptr= outparam->field; *field_ptr; field_ptr++) + if (!share->vfields) + outparam->vfield= NULL; + else { - if ((*field_ptr)->vcol_info) + if (!(vfield_ptr = (Field **) alloc_root(&outparam->mem_root, + (uint) ((share->vfields+1)* + sizeof(Field*))))) + goto err; + + outparam->vfield= vfield_ptr; + + for (field_ptr= outparam->field; *field_ptr; field_ptr++) { - if (unpack_vcol_info_from_frm(thd, - outparam, - *field_ptr, - &(*field_ptr)->vcol_info->expr_str, - &error_reported)) + if ((*field_ptr)->vcol_info) { - error= 4; // in case no error is reported - goto err; + if (unpack_vcol_info_from_frm(thd, + outparam, + *field_ptr, + &(*field_ptr)->vcol_info->expr_str, + &error_reported)) + { + error= 4; // in case no error is reported + goto err; + } + *(vfield_ptr++)= *field_ptr; } - *(vfield_ptr++)= *field_ptr; } + *vfield_ptr= 0; // End marker } - *vfield_ptr= 0; // End marker #ifdef WITH_PARTITION_STORAGE_ENGINE if (share->partition_info_str_len && outparam->file) -----------------------------------------------------------------------