Hi Monty, [Following up on review on IRC] Kristian Nielsen via commits <commits@lists.mariadb.org> writes:
Replace Log_event::writer with just passing the writer object as a function parameter to Log_event::write().
We wondered how the compiled code will look when passing the writer object as a parameter to Log_event::write() vs. temporarily storing it in the Log_event object. Interestingly, the code is actually "better" after my patch despite an extra parameter being passed along. This is because the compiler can cache the function parameter in a register across nested calls. While with a class member, the compiler cannot determine that the member will not be changed by the nested calls, so the value needs to be repeatedly re-loaded from memory. Some more details on this finding below. My reason for the change was probably more general, based on this code:
- ev->writer= this; - int res= ev->write(); - IF_DBUG(ev->writer= 0,); // writer must be set before every Log_event::write + int res= ev->write(this);
It's clear that the intention with this code is to pass the Log_event_writer object `this` to the ev->write() method, and it's a strange way to do it to temporarily set it as a member in the object rather than using a function parameter. Needing an IF_DBUG() to enforce/check that this is done correctly is a warning sign, and I was worried that as I did further changes I would introduce subtle bugs this way. It is interesting that this makes the code clearer not only for the developer (me), but also for the compiler in this case. Function parameter passing is quite well optimized on amd64, but it was still good to see this in detail in this case. And as you predicted we learned something from it! As we discussed, I will extend the commit message to explain these reasons better. Thanks for the suggestion; it's always good in the reviews to question the reason for changes like this. To check the compiled code in detail, I did a release build (BUILD/compile-pentium64-max) with and without this patch, and disassembled the User_var_log_event::write() function, which has a number of calls now passing the writer object as a function parameter:
- return write_header(event_length) || - write_data(buf, sizeof(buf)) || - write_data(name, name_len) || - write_data(buf1, buf1_length) || - write_data(pos, val_len) || - write_data(&flags, unsigned_len) || - write_footer(); + return write_header(writer, event_length) || + write_data(writer, buf, sizeof(buf)) || + write_data(writer, name, name_len) || + write_data(writer, buf1, buf1_length) || + write_data(writer, pos, val_len) || + write_data(writer, &flags, unsigned_len) || + write_footer(writer);
It looks like this would need to pass the `writer` argument many times to called functions. However, write_data() and write_footer() are actually inline functions: bool write_data(const uchar *buf, size_t data_length) { return writer->write_data(buf, data_length); } bool write_footer() { return writer->write_footer(); } So it's actually only write_header() where we now pass an extra parameter. For write_data() and write_footer() it was already being passed as the (implicit) `this` parameter. With the patch, the compiler saves the `writer` parameter in register r12 and uses it for each call: movq %rsi, %r12 ... movq %r12, %rsi # `writer` pointer movq %rbx, %rdi # `this` pointer call _ZN9Log_event12write_headerEP16Log_event_writerm@PLT ... movl $4, %edx # `data_length` parameter movq %r12, %rdi # `writer` pointer call _ZN16Log_event_writer10write_dataEPKhm@PLT ... leaq 137(%rbx), %rsi # `buf` parameter movq %r14, %rdx # `data_length` parameter movq %r12, %rdi # `writer` pointer call _ZN16Log_event_writer10write_dataEPKhm@PLT ... movq %r12, %rdi # `writer` pointer call _ZN16Log_event_writer12write_footerEv@PLT Before the patch, the code for User_var_log_event::write() loads this->writer for each call except write_header(): movq %rbx, %rdi # Here we don't pass `writer`, write_header() loads it from Log_event::writer call _ZN9Log_event12write_headerEm@PLT ... movq 88(%rbx), %rdi # this->writer leaq -126(%rbp), %rsi # `buf` parameter movl $4, %edx # `data_length` parameter call _ZN16Log_event_writer10write_dataEPKhm@PLT ... movq 88(%rbx), %rdi # this->writer leaq 153(%rbx), %rsi # `buf` parameter movq %r13, %rdx # `data_length` parameter call _ZN16Log_event_writer10write_dataEPKhm@PLT ... movq 88(%rbx), %rdi # this->writer call _ZN16Log_event_writer12write_footerEv@PLT I think the reason for the repeated loads of this->writer from 88(%rbx) is because the compiler cannot be sure that this->writer will not be modified by calls to write_data(). So it needs to reload the pointer each time. With the patch the pointer is a function parameter which cannot be modified by nested calls, so it can be saved in a register. Thus, we save some memory loads with my patch for each call of write_data() / write_footer(), which should be a bit faster (though perhaps not noticeably so). In general, passing parameters to functions is cheap (on amd64), as they are passed in registers, and register-to-register moves are very fast on high-performance CPUs, while memory loads have a relative high latency, even when they hit the cache. And like we see here, using local variables helps the compiler do alias analysis and cache values in registers, while this is a lot harder for the compiler to do for class members. - Kristian.