
But my main point is to present these results, to show that we have a very useful tool to pinpoint these performance issues, and to start discussing them in more detail (and I have plenty more of those).
Nice work.
Another is my_hash_sort_simple(). This is not a regression, but OProfile shows we spend 10% of total time in this function.
Does this do UTF-aware compares or memcmp?
It would be nice to understand why this is needed. This is an expensive function: it needs to look up every character in the collation tables. If this could be eliminated, that alone would probably be enough to win over MySQL, even without fixing any of the regressions.
Some other top contenders in the profile are InnoDB functions. For example buf_calc_page_new_checksum(). This is expensive, as it needs to loop over every byte in each buffer pool page (16K), computing a 7-long dependency chain for each byte. This could be much faster if the checksum was done on 8-byte longwords, but AFAIK it's not trivial to change the checksum algorithm due to migration issues. Facebook may have done some work on this.
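As an illustration of the byte-versus-longword point, here is a minimal sketch in C. It is not InnoDB's checksum algorithm: checksum_bytewise() and checksum_wordwise() are hypothetical names, and the xor-and-multiply step is an assumed stand-in that only serves to show the loop-carried dependency.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time: each iteration needs the previous value of h, so a
   16K page costs 16384 dependent steps. */
uint32_t checksum_bytewise(const unsigned char *page, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = (h ^ page[i]) * 16777619u;     /* illustrative mixing step */
    return h;
}

/* Longword-at-a-time: the same kind of chain, but over 8-byte words,
   so a 16K page costs 2048 dependent steps instead of 16384. */
uint64_t checksum_wordwise(const unsigned char *page, size_t len)
{
    uint64_t h = 0;
    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, page + i, sizeof w);    /* alignment-safe 8-byte load */
        h = (h ^ w) * 1099511628211ULL;
    }
    for (; i < len; i++)                   /* tail bytes, if len % 8 != 0 */
        h = (h ^ page[i]) * 1099511628211ULL;
    return h;
}
```

The two functions produce different values for the same page, and the word-wise result depends on byte order. Both are reasons why changing the checksum on existing data files raises the migration issues mentioned above.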
For the old-style checksum that you describe above, "gcc -O3" was much faster than "gcc -O2" using an older version (maybe 4.2) of gcc. If you are willing to require a dump/reload and break InnoDB binary compatibility (I don't recommend this), then some results are at: http://mysqlha.blogspot.com/2009/05/innodb-checksum-performance.html
The Facebook patch added support for crc32 checksums that could use crc32 instructions on Intel HW. Official MySQL then added something like that to one of their new releases.
--
Mark Callaghan
mdcallag@gmail.com