[Maria-developers] Parallel replication MDEV-4506
Hi!

Just a quick status update on parallel replication in MariaDB 10.0.

Yesterday I pushed changes to the 10.0-knielsen tree that make many of
the replication variables thread safe. Today I will push a patch that
should fix the rest of the variables.

Here is the changelog entry for this:

Fixes for parallel slave:

- Made the slave's temporary tables multi-thread slave safe by adding a
  mutex around save_temporary_tables usage.
  - rli->save_temporary_tables is the active list of all used temporary
    tables.
  - This is copied to THD->temporary_tables when temporary tables are
    opened and updated when temporary tables are closed.
  - Added THD->lock_temporary_tables() and THD->unlock_temporary_tables()
    to simplify this.
- Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to
  avoid wrong usage for merged code.
- Added is_part_of_group() to mark functions that are part of the next
  function. This replaces setting IN_STMT when events are executed.
- Added is_begin(), is_commit() and is_rollback() functions to
  Query_log_event to simplify code.
- If slave_skip_counter is set, run things in single threaded mode. This
  simplifies code for skipping events.
- Updating state of relay log (IN_STMT and IN_TRANSACTION) is moved to
  one single function: update_state_of_relay_log().
  We can't use OPTION_BEGIN to check for the state anymore as the
  sql_driver and sql execution threads may be different.
  Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and
  Relay_log_info::cleanup_context() to ensure the flags don't survive
  slave restarts.
  is_in_group() is now independent of state of executed transaction.
- Reset thd->transaction.all.modified_non_trans_table() if we did set it
  for single table row events. This was mainly for keeping the flag as
  documented.
- Changed slave_open_temp_tables to uint32 to be able to use atomic
  operators on it.
- Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
- Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
- Changed some functions to take rpl_group_info instead of Relay_log_info
  to make them multi-slave safe and to simplify usage:
  - do_shall_skip()
  - continue_group()
  - sql_slave_killed()
  - next_event()
- Simplified arguments to io_slave_killed(), check_io_slave_killed() and
  sql_slave_killed(); no reason to supply THD as this is part of the given
  structure.
- set_thd_in_use_temporary_tables() removed as in_use is set on usage.
- Added information to thd_proc_info() about which thread is waiting for
  the slave mutex to exit.
- In open_table(), reuse code from find_temporary_table().

Other things:

- More DBUG statements
- Fixed rpl_incident.test so that it can be run with --debug
- More comments
- Disabled the unused function rpl_connect_master()

The TODO for parallel replication is documented at the top of
rpl_parallel.cc. Here is the comment:

-------

- Error handling. If we fail in one of multiple parallel executions, we
  need to make a best effort to complete prior transactions and roll back
  following transactions, so slave binlog position will be correct.
  And all the retry logic for temporary errors like deadlock.
- Stopping the slave needs to handle stopping all parallel executions. And
  the logic in sql_slave_killed() that waits for current event group to
  complete needs to be extended appropriately...
- Audit the use of Relay_log_info::data_lock. Make sure it is held
  correctly in all needed places also when using parallel replication.
- We need some user-configurable limit on how far ahead the SQL thread
  will fetch and queue events for parallel execution (otherwise if slave
  gets behind we will fill up memory with pending malloc()'ed events).
- Fix update of relay-log.info and master.info. In non-GTID replication,
  they must be serialised to preserve correctness. In GTID replication, we
  should not update them at all except at slave thread stop.
- All the waits (eg.
  in struct wait_for_commit and in rpl_parallel_thread_pool::get_thread())
  need to be killable. And on kill, everything needs to be correctly
  rolled back and stopped in all threads, to ensure a consistent slave
  replication state.
- Handle the case of a partial event group. This occurs when the master
  crashes in the middle of writing the event group to the binlog. The
  slave rolls back the transaction; parallel execution needs to be able
  to deal with this wrt. commit_orderer and such.
- We should notice if the master doesn't support GTID, and then run in
  single threaded mode against that master. This is needed to be able to
  support multi-master-replication with old and new masters.
- Retry of failed transactions is not yet implemented for the parallel
  case.

----------

What this means is: in theory things should work, as long as there are
no problems in the binary or relay log and we don't fill up memory with
too many events.

We will merge this branch to 10.0-base and then to 10.0 within the next
few days and start testing it. We also plan to release a beta of 10.0
with this code ASAP, as 10.0 is now feature complete. During the beta
phase we will fix the above outstanding issues.

Kristian will start working on the error handling this week. After that
we have to look at the memory consumption (i.e., not give the sql
execution threads more work than they can handle).

I will shortly update the following article with information about how
to use parallel replication:

https://mariadb.com/kb/en/parallel-replication/

The task itself is documented at:

https://mariadb.atlassian.net/browse/MDEV-4506

Regards,
Monty
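
A note on the TODO item about limiting how far ahead the SQL thread
queues events: the idea can be sketched roughly as below. This is only a
minimal illustration under stated assumptions, not code from
rpl_parallel.cc. The names QueuedEvent, BoundedEventQueue, try_push(),
try_pop() and max_bytes are all hypothetical, and a real implementation
would also need the waits to be killable, as the TODO says.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a queued replication event; only its
// in-memory size matters for this sketch.
struct QueuedEvent {
  std::size_t bytes;
};

// Hypothetical queue between the SQL driver thread (producer) and the
// worker threads (consumers). It refuses new events once the memory held
// by pending events would exceed max_bytes, so a lagging slave cannot
// fill up memory with queued events.
class BoundedEventQueue {
public:
  explicit BoundedEventQueue(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Called by the driver thread. Returns false when the event does not
  // fit; the driver should then stop reading ahead and try again later.
  bool try_push(const QueuedEvent &ev) {
    std::lock_guard<std::mutex> guard(mutex_);
    // An oversized event is still admitted into an empty queue, so one
    // event larger than the limit cannot stall replication forever.
    if (!events_.empty() && queued_bytes_ + ev.bytes > max_bytes_)
      return false;
    events_.push_back(ev);
    queued_bytes_ += ev.bytes;
    return true;
  }

  // Called by a worker thread. Returns false when nothing is queued.
  bool try_pop(QueuedEvent *out) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (events_.empty())
      return false;
    *out = events_.front();
    events_.pop_front();
    assert(queued_bytes_ >= out->bytes);
    queued_bytes_ -= out->bytes;
    return true;
  }

  // Bytes currently held by pending events.
  std::size_t queued_bytes() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return queued_bytes_;
  }

private:
  const std::size_t max_bytes_;     // the user-configurable limit
  mutable std::mutex mutex_;        // protects events_ and queued_bytes_
  std::deque<QueuedEvent> events_;  // pending events, oldest first
  std::size_t queued_bytes_ = 0;
};
```

In this sketch the driver thread would call try_push() for each event it
reads from the relay log and pause reading whenever it returns false.
Counting bytes rather than events is a design choice made here because
the concern in the TODO is memory use; a limit on the number of events
would work the same way.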
participants (1)
-
Michael Widenius