Hi, Andrei, I still don't understand this. 1. Why rgi->options_to_bin_log is only set for GTID_EVENT? Is there a guarantee that there always be a GTID_EVENT before a QUERY_EVENT? What if gtids aren't enabled? 2. how do you guarantee that all query events for a previous value of options_written_to_bin_log have been applied before updating it in the rgi? 3. and if you do guarantee it, why is it atomic? No query events = nobody reads options_written_to_bin_log concurrently. And why not to do a simple patch, like the one I've attached? (note, it's an intentionally minimal patch to show the fix, practically I'd accompany it with a small cleanup patch) Regards, Sergei VP of MariaDB Server Engineering and security@mariadb.org On Sep 02, Andrei Elkin wrote:
Query_log_event cannot just randomly use some options_written_to_bin_log, it must use options_written_to_bin_log from the Format_description_log_event that came from the binlog file where the Query_log_event was read from.
[ Not to engage you into a separate discussion, still why 'randomly'? In the quoted description p.1 says the Query_log_event uses `options_written_to_bin_log` from the immediately preceding-in-binlog FD. ]
If rli->relay_log.description_event_for_exec is not that Format_description_log_event, and assuming that the correct Format_description_log_event is not stored anywhere, the Query_log_event has to remember options_written_to_bin_log itself,
It's an option, but the chosen one avoids such remembering per each Query event. Instead ...
there's not much we can do about it.
Or it can apply the mask in the constructor and store not flags2 but the new value of thd->variables.option_bits.
Or it can extend flags2 to cover all OPTIONS_WRITTEN_TO_BINLOG.
... of taking care of individual events the patch basically goes along the base to use a placeholder that A. lasts (unlike FD instance) the whole slave session time, and B. the stored `options_written_to_binlog` is guaranteed to be a copy of the most recent FD before the Query
To help you to understand the method, here is a sequence diagram, where on the top line the slave SQL thread, its Relay-log-info object, a similar to that Worker's instance, and a Worker thread.
SQL thread RLI/RGI Worker RGI Worker | . . . --->+ read FD . . i | . . . +--------------> stores options . --->+ read Query | . +-------------------------->+ finds Worker RGI to store in theree a copy FD.options | . +-------------------------------------------->+ schedules Query Worker | . | .<----------------+ reads options | | applies ... Query
It must be clear the A holds. To prove B let after Query a second FD^2 is read by the SQL thread and that FD^2 is from a different version server. Then the SQL thread always have a wait-for-workers-to-become-idle condition, so execution would go like this:
--->+ read Query . ... | . | +-------------------------->+ | finds Worker RGI | to store in theree a copy FD.options | | | --->+ read FD^2 | applying of | | Query ... | ... wait for Worker to complete Query | ... | +<============================================| | +-------------->. | stores FD^2 | options ...
So FD^2 can't affect queries from earlier binlogs.
I am all ears to hear about your feedback.
Cheers,
Andrei