Sergey Vojtovich <svoj@mariadb.org> writes:
> just out of curiosity: is it possible to find out which functions cause highest amount of icache misses?

Yes, see the second post, the profiles marked "Icache misses (ICACHE.MISSES), before PGO" and "Icache misses (ICACHE.MISSES), after PGO". These are level 1 instruction cache misses. You will see that the functions with a high cache miss rate are more or less the same as the functions that execute a lot of instructions. Note, however, that according to the Intel documentation there is a large skid on these events, so one should not rely too much on the precise location reported.

> Can it have anything to do with branch misprediction?
If you look at the same post, you will see profiles for BR_MISP_RETIRED.ALL_BRANCHES_PS. This is a precise event, so it points directly to the instruction after the mispredicted branch. We do get about 12% fewer mispredictions, so PGO has some effect there. In comparison, we get 23% fewer icache misses.

Note that the main sources of branch misprediction are calls to frequently used shared library functions (due to the indirect jump in the PLT) and virtual function calls. This suggests that the problem here is that the sheer number of branches executed causes eviction of otherwise correctly predicted branches. We are simply executing too much code per request for the CPU to handle efficiently, a common thing in server applications.

Another improvement that I noticed is in make_join_statistics(). With PGO, the compiler uses calls to optimised memset() and memcpy() functions for large structure memory writes, instead of byte-by-byte "rep movsb" sequences.

There are probably many small improvements spread out over the code that contribute to the overall speedup; it is hard to determine precisely which ones with such a large code base.

The reason I mention icache misses in particular is that:

1. The performance counter measurements pre-PGO clearly show that the icache miss rate is the main bottleneck in the CPU.
2. PGO is well suited to reducing icache misses.
3. Indeed, measurements post-PGO show a significant reduction in icache misses.

 - Kristian.