
MARK CALLAGHAN <mdcallag@gmail.com> writes:
Another is my_hash_sort_simple(). This is not a regression, but Oprofile shows we spend 10% of total time in this function.
Does this do utf aware compares or memcmp?
This one uses collations - so not memcmp. It is used for SELECT DISTINCT and GROUP BY. This particular test was not utf8, though; it used an 8-bit charset (probably latin1). The utf8 version of this looks significantly more expensive. Monty will fix the heap tables to remove this overhead.
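For readers unfamiliar with the difference: a collation-aware hash maps each byte through the charset's sort-order table before mixing it in, so that e.g. 'A' and 'a' hash identically under a case-insensitive collation - which a plain memcmp-style hash cannot do. A minimal sketch in C, with toupper() standing in for the real sort-order table and the nr1/nr2 mixing loosely modeled on the MySQL-style hash loop (the name and signature here are illustrative, not the actual my_hash_sort_simple()):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch of a collation-aware hash: each byte is mapped
   through the collation's weight table (here just toupper()) before
   being mixed in, so "abc" and "ABC" produce the same hash under a
   case-insensitive 8-bit collation such as latin1. */
static void hash_sort_simple_sketch(const unsigned char *key, size_t len,
                                    uint32_t *nr1, uint32_t *nr2)
{
    for (size_t i = 0; i < len; i++) {
        /* stand-in for the collation's sort_order[] lookup */
        unsigned char w = (unsigned char)toupper(key[i]);
        *nr1 ^= (((*nr1 & 63) + *nr2) * w) + (*nr1 << 8);
        *nr2 += 3;
    }
}
```

This per-byte table lookup and mixing is already more work than memcmp; for utf8 the per-character cost grows further (multi-byte decoding plus weight lookups), which matches the observation that the utf8 version is significantly more expensive.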
For the old-style checksum that you describe above, "gcc -O3" was much faster than "gcc -O2" with an older version of gcc (maybe 4.2). If you are willing to require a dump/reload and break InnoDB binary compatibility (I don't recommend this), then some results are at: http://mysqlha.blogspot.com/2009/05/innodb-checksum-performance.html
I checked the assembler; compilation was optimal, it is just an expensive algorithm - a minimum of 7 cycles for each of the 16K bytes in the page. 33.6% is a really high overhead! I suppose this comes from heavy buffer pool flushing, causing a new checksum calculation for every few rows inserted. And maybe from cache misses while walking each buffer pool page.
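The cost structure described here - each step of the checksum depends on the previous result, so the CPU cannot overlap iterations no matter how well the compiler does - can be illustrated with a toy fold loop. This is only a stand-in with a made-up mixing step, not the actual InnoDB page-checksum code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative serial byte-folding checksum: the new sum depends on
   the old sum, forming a dependency chain the CPU cannot pipeline
   across iterations.  Even a handful of cycles per byte then adds up
   over a 16 KB page (16384 iterations per checksum). */
static uint32_t fold_page(const unsigned char *page, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = ((sum << 8) + sum) ^ page[i];  /* serial dependency */
    return sum;
}
```

At, say, 7 cycles per iteration this is over 100K cycles per 16 KB page, which is why checksumming shows up so prominently when the buffer pool is flushing heavily.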
The Facebook patch added support for crc32 checksums that can use the crc32 instructions on Intel hardware. Official MySQL then added something like that to one of their newer releases.
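The Intel instruction in question computes CRC-32C (Castagnoli polynomial, reflected form 0x82F63B78), exposed through the SSE4.2 intrinsics _mm_crc32_u8/_mm_crc32_u64. A portable software version of the same polynomial looks like this - a bitwise sketch for clarity, where real implementations use lookup tables or the hardware instruction:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), the same polynomial the SSE4.2 crc32
   instruction computes: reflected polynomial 0x82F63B78, initial
   value and final XOR of 0xFFFFFFFF. */
static uint32_t crc32c_sw(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc & 1) ? (crc >> 1) ^ 0x82F63B78u : crc >> 1;
    }
    return ~crc;
}
```

On hardware with SSE4.2, chaining _mm_crc32_u8() calls (with the same initial/final inversion) yields the identical result at a fraction of a cycle per byte, versus the multiple cycles per byte of the old-style serial fold.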
Ok, that's good - best if data format changes are coordinated by InnoDB upstream. Thanks, - Kristian.