On Monday, May 6, 2013 at 15:55 , Kristian Nielsen wrote:
Kristian Nielsen <knielsen@knielsen-hq.org> writes:I will try to get some initial docs written up by the end of the week.Giuseppe, here (attached) is my first stab at documentation for MariaDBGTID. Thanks for prodding me to get this done, even if it was a very gentleprodding :-)Daniel, do you think you can get these formatted properly and included in someappropriate place in the Knowledgebase? And feel free to fix any problems youfind in the text on the way, of course.I will then make sure to keep this updated as the code progresses.This is just a first stab, I am sure there is much that is incomplete and Iwill work to extend it as needed (comments welcome if anything is foundmissing).- Kristian.A. MariaDB global transaction IDFrom version 10.0, MariaDB supports global transaction IDs for replication.MariaDB replication in general works as follows: On a master server, allupdates to the database (DML and DDL) are written into the binary log asbinlog events. A slave server connects to the master and reads the binlogevents, then applies the events locally to replicate the same changes as doneon the master. A server can be both a master and a slave at the same time, andit is thus possible for binlog events to replicated through multiple levels ofservers.A slave server keeps track of the position in the master's binlog of the lastevent applied on the slave. This allows the slave server to re-connect andresume from where it left off after replication has been temporarilystopped. It also allows to disconnect from the master server and connect to adifferent server to resume replication from a new master, as long as the newmaster has the proper binlog events, and the new master connection startsreplicationg at the appropriate point in the binlogs.Global transaction ID introduces a new event attached to each event group inthe binlog. (An event group is a collection of events that are always appliedas a unit. They are best thought of as a "transaction", though they alsoinclude non-transactional DML statements, as well as DDL). As an event groupis replicated from master server to slave server, the global transaction ID ispreserved. Since the ID is globally unique across the entire group of servers,this makes it easy to uniquely identify the same binlog events on differentservers that replicate each other (this was not easily possible before MariaDB10.0).Using global transaction ID provides two main benefits:1. Easy to change a slave server to connect to and replicate from a differentmaster server.The slave remembers the global transaction ID of the last event groupapplied from the old master. This makes it easy to know where to resumereplication on the new master, since the global transaction IDs are knowthroughout the entire replication hierarchy. This is not the case whenusing old-style replication; in this case the slave knows only thespecific file name and offset of the old master server of the last eventapplied. There is no simple way to guess from this the correct file nameand offset on a new master.2. The state of the slave is recorded in a crash-safe way.The slave keeps track of its current position (the global transaction IDof the last transaction applied) in a system table mysql.rpl_slave_state.If this table is using a transactional storage engine (such as InnoDB,which is the default), then updates to the state is done in the sametransaction as the updates to the data. This makes the state crash-safe;if the slave server crashes, crash recovery on restart will make sure thatthe recorded replication position matches what changes were actuallyreplicated. This is not the case for old-style replication, where thestate is recorded in a file relay-log.info, which is updated independentlyof the actual data changes and can easily get out of sync if the slaveserver crashes. (This works for DML to transactional tables;non-transactional tables and DDL in general are not crash-safe inMariaDB.)Because of these two benefits, it is generally recommended to use globaltransaction ID for any replication setups based on MariaDB 10.0 or later.However, old-style replication continues to work as always, so there is nopressing need to change existing setups. Global transaction ID integratessmoothly with old-style replication, and the two can be used freely togetherin the same replication hierarchy. There is no special configuration needed ofthe server to start using global transaction ID. However, it must beexplicitly set for a slave server with the appropriate CHANGE MASTER option;by default old-style replication is used by a replication slave, to maintainbackwards compatibility.B. The concept of global transaction IDA global transaction ID, or GTID for short, consists of three numbersseparated with dashes '-'. For example:0-1-10- The first number 0 is the domain ID, which is specific for globaltransaction ID (more on this below). It is a 32-bit unsigned integer.- The second number is the server ID, the same as is also used in old-stylereplication. It is a 32-bit unsigned integer.- The third number is the sequence number. This is a 64-bit unsigned integerthat is monotonically increasing for each new event group logged into thebinlog.The server ID is set to the server ID of the server where the event group isfirst logged into the binlog. The sequence number is increased on a server forevery event group logged. Since server IDs must be unique for every server,this makes the (server_id, sequence_number) pair, and hence the whole GTID,globally unique.B.1. The domain IDWhen events are replicated from a master server to a slave server, the eventsare always logged into the slave's binlog in the same order that they wereread from the master's binlog. Thus, if there is only ever a single masterserver receiving (non-replication) updates at a time, then the binlog orderwill be idential on every server in the replication hierarchy.This consistent binlog order is used by the slave to keep track of its currentposition in the replication. Basically, the slave remembers the GTID of thelast event group replicated from the master. When reconnecting to a master,whether the same one or a new one, it sends this GTID position to the master,and the master starts sending events from the first event after thecorresponding event group.However, if user updates are done independently on multiple servers at thesame time, then in general it is not possible for binlog order to be identicalacross all servers. This can happen when using multi-source replication, withmulti-master ring topologies, or just if manual updates are done on a slavethat is replicating from active master. If the binlog order is different onthe new master from the order on the old master, then it is not sufficient forthe slave to keep track of a single GTID to completely record the currentstate.The domain ID, the first component of the GTID, is used to handle this.In general, the binlog is not a single ordered stream. Rather, it consists ofa number of different streams, each one identified by its own domainID. Within each stream, GTIDs always have the same order in every serverbinlog. However, different streams can be interleaved in different ways ondifferent servers.A slave server then keeps track of its replication position by recording thelast GTID applied within each replication stream. When connecting to a newmaster, the slave can start replication from a different point in the binlogfor each domain ID.For more details on using multi-master setups and multiple domain IDs, seesection "F. Using global transaction with multi-source replication and othermulti-master setups".Simple replication setups only have a single master being updated by theapplication at any one time. In such setups, there is only a singlereplication stream needed. Then domain ID can be ignored, and left as thedefault of 0 on all servers.C. Using global transaction IDIn MariaDB 10.0, global transaction ID is enabled automaticall. Each eventgroup logged to the binlog receives a GTID event, as can be seen withmysqlbinlog or SHOW BINLOG EVENTS.The slave automatically keeps track of the GTID of the last applied eventgroup, as can be seen from the gtid_pos variable:SELECT @@GLOBAL.gtid_pos0-1-1When a slave connects to a master, it can use either global transaction ID orold-style filename/offset to decide where in the master binlogs to startreplicating from. To use global transaction ID, use the master_use_gtid optionof CHANGE MASTER:CHANGE MASTER TO master_use_gtid = 1, master_host = 'my_master', ...When the slave is then later started with START SLAVE, it will send the valueof @@GLOBAL.gtid_pos to the master and start replication from thecorresponding point in the master binlogs.Even when a slave is configured to connect with the old-style binlog filenameand offset (CHANGE MASTER TO master_log_file=..., master_log_pos=...), it willstill keep track of the current GTID position in @@GLOBAL.gtid_pos. This meansthat an existing slave previously configured and running can be changed toconnect with GTID (to the same or a new master) simply with:CHANGE MASTER TO master_use_gtid = 1The slave remembers that master_use_gtid=1 was specified and will use it alsofor subsequent connects, until it is explicitly changed by specifyingmaster_log_file/pos=... or master_use_gtid=0. The current value can be seen asthe field Using_Gtid of SHOW SLAVE STATUS:SHOW SLAVE STATUSUsing_Gtid: 1The slave server internally uses the table mysql.rpl_slave_state to store theGTID position (and so preserve the value of @@GLOBAL.gtid_pos across serverrestarts). After upgrading a server to 10.0, it is necessary to runmysql_upgrade_db (as always) to get the table created.In order to be crash-safe, this table must use a transactional storage enginesuch as InnoDB. When MariaDB is first installed (or upgraded to 10.0), thetable is created using the default storage engine - which itself defaults toInnoDB. If there is a need to change the storage engine for this table (tomake it transactional on a system configured with MyISAM as the defaultstorage engine, for example), use ALTER TABLE:ALTER TABLE mysql.rpl_slave_state ENGINE = InnoDBThe table mysql.rpl_slave_state should not be modified in any other way. Inparticular, do not try to update the rows in the table to change the slave'sidea of the current GTID position; instead useSET GLOBAL gtid_pos = '0-1-1'The actual slave GTID position, and thus the value of @@GLOBAL.gtid_pos, isthe result of a combination of the contents of the mysql.rpl_slave_state andthe contents of the slave binlog (if any, eg. if --log-slave-updates isenabled). This allows to use the same CHANGE MASTER TO ... MASTER_USE_GTID=1command to connect a server as slave to a new master, regardless of whetherthe server was acting as a slave or a master before.D. Setting up a new slave server with global transaction IDSetting up a new replication slave server with global transaction ID is notmuch different from setting up an old-style slave. The basic steps are:1. Setup the new server and load it with the initial data.2. Start the slave replicating from the appropriate point in the master'sbinlog.D.1. Start with empty server, replicate all binary logsThe simplest way for testing purposes is probably to setup a new, empty slaveserver and replicate all of the master's binlogs from the start (this isusually not feasible in a realistic production setup, as the initial binlogfiles will probably have been purged or take too long to apply).The slave server is installed in the normal way. By default, the GTID positionfor a newly installed server is empty, which makes the slave replicate fromthe start of the master's binlogs. But if the slave was used for otherpurposes before, the initial position can be explicitly set to empty first:SET GLOBAL gtid_pos = "";Next, point the slave to the master with CHANGE MASTER. Specify master_hostetc. as usual. But instead of specifying master_log_file and master_log_posmanually, use master_use_gtid=1 to have GTID do it automatically:CHANGE MASTER TO master_host="127.0.0.1", master_port=3310,master_user="root", master_use_gtid=1;START SLAVE;D.2. Setting up a new slave from a backupThe normal way to set up a new replication slave is to restore a backup froman existing server (whether master or slave) as the new server, then point itto start replication from the appropriate position in the master's binlog.It is important that the position at which replication is started correspondsexactly to the state of the data at the point in time that the backup wastaken. Otherwise, the slave can end up with different data than the masterbecause of transactions missing or duplicated.Two common ways to take a backup are XtraBackup and mysqldump. Both of thesecan provide the current binlog position of the backup in a non-blockingway. Of course, if there are no writes to the server being backed up duringthe backup process, then a simple SHOW MASTER STATUS will give the correctposition.Once the current binlog position for the backup has been obtained, in the formof a binlog file name and offset, the corresponding GTID position can beobtained from BINLOG_GTID_POS() on the server that was backed up:SELECT BINLOG_GTID_POS("master-bin.000001", 600);The new slave can now be started replicating by setting the correct@@gtid_pos, issuing CHANGE MASTER to point to the master server, and startingthe slave threads:SET GLOBAL gtid_pos = "0-1-2";CHANGE MASTER TO master_host="127.0.0.1", master_port=3310,master_user="root", master_use_gtid=1;START SLAVE;This method is particularly useful when setting up a new slave from a backupof the master. Remember to ensure that the value of server_id for the newserver is different from that of any other server (this is set in my.cnf).If the backup was taken of an existing slave server, then it already has thecorrect GTID position stored in the table mysql.rpl_slave_state (provided thatthe backup includes this table and is consistent with changes to other tables,of course). In this case, there is no need to explicitly look up the GTIDposition on the old server and set it on the new slave - it will be alreadycorrectly loaded from mysql.rpl_slave_state. This however does not work if thebackup was taken of the master - because then the current GTID position iscontained in the binlog, not in mysql.rpl_slave_state.D.3. Switching an existing old-style slave to use GTID.If there is already an existing slave running using old-style binlogfilename/offset position, then this can be changed to use GTID directly. Thiscan be useful for upgrades for example, or where there are already tools tosetup new slaves using old-style binlog positions.When a slave connects to a master using old-style binlog positions, and themaster supports GTID (ie. is MariaDB 10.0.2 or bigger), then the slaveautomatically downloads the GTID position at connect and updates it duringreplication. Thus, once a slave has connected to the GTID-aware master atleast once, it can be switched to using GTID without any other actions needed;STOP SLAVE;CHANGE MASTER TO master_host="127.0.0.1", master_port=3310,master_user="root", master_use_gtid=1;START SLAVE;(A later version will probably add a way to setup the slave so that it willconnect with old-style binlog file/offset the first time, and automaticallyswitch to using GTID on subsequent connects.)E. Changing a slave to replicate from a different masterOnce replication is running with GTID (master_use_gtid=1), the slave can bepointed to a new master simply by specifying in CHANGE MASTER the newmaster_host (and if required master_port, master_user, and master_password):STOP SLAVE;CHANGE MASTER TO master_host='127.0.0.1', master_port=3312;START SLAVE;The slave has a record of the GTID of the last applied transaction from theold master, and since GTIDs are identical across all servers in a replicationhierarchy, the slave will just continue from the appropriate point in the newmaster's binlog.It is important to understand how this change of masters work. The binlog isan ordered stream of events (or multiple streams, one per replication domain,see section "F. Using global transaction with multi-source replication andother multi-master setups"). Events within the stream are always applied inthe same order on every slave that replicates it. The MariaDB GTID relies onthis ordering, so that it is sufficient to remember just a single point withinthe stream. Since event order is the same on every server, switching to thepoint of the same GTID in the binlog of another server will give the sameresult.This translates into some responsibility for the user. The MariaDB GTIDreplication is fully asynchronous, and fully flexible in how it can beconfigured. This makes it possible to use it in ways where the assumption thatbinlog sequence is the same on all servers is violated. In such cases, whenchanging master, GTID will still attempt to continue at the point of currentGTID in the new binlog.The most common way that binlog sequence gets different between servers iswhen the user/DBA does updates directly on a slave server (and these updatesare written into the slaves binlog). This results in events in the slavesbinlog that are not present on the master or any other slaves. This can beavoided by setting the session variable sql_log_bin false while doing suchupdates, so they do not go into the binlog.It is normally best to avoid any differences in binlogs between servers. Thatbeing said, MariaDB replication is designed for maximum flexibility, and therecan be valid reasons for introducing such differences from time to time. Itthis case, it just needs to be understood that the GTID position is a singlepoint in each binlog stream (one per replication domain), and how this affectsthe users particular setup.Differences can also occur when two masters are active at the same time in areplication hierarchy. This happens when using a multi-master ring. But it canalso occur in a simple master-slave setup, during switch to a new master, ifchanges on the old master is not allowed to fully replicate to all slaveservers before switching master. Normally, to switch master, first writes tothe old master should be stopped, then one should wait for all changes to bereplicated to the new master, and only then should writes begin on the newmaster. Deliberately using multiple active masters is also supported, this isdescribed in the next section.F. Using global transaction with multi-source replication and othermulti-master setupsMariaDB global transaction ID supports having multiple masters active at thesame time. Typically this happens with either multi-source replication ormulti-master ring setups.In such setups, each active master must be configured with its own distinctreplication domain ID, gtid_domain_id. The binlog will then in effect consistsof multiple independent streams, one per active master. Within one replicationdomain, binlog order is always the same on every server. But two differentstreams can be interleaved differently in different server binlogs.The GTID position of a given slave is then not a single GTID. Rather, itbecomes the GTID of the last event group applied for each value of domain ID,in effect the position reached in each binlog stream. When the slave connectsto a master, it can continue from one stream in a different binlog positionthan another stream. Since order within one stream is consistent across allservers, this is sufficient to always be able to continue replicationat thecorrect point in any new master server(s).Domain IDs are assigned by the DBA, according to the need of the application.The default value of @@GLOBAL.gtid_domain_id is 0. This is appropriate formost replication setups, where only a single master is active at a time. TheMariaDB server will never by itself introduce new domain_id values into thebinlog.When using multi-source replication, where a single slave connects to multiplemasters at the same time, each such master should be configured with its owndistict domain ID.Similarly, in a multi-master ring topology, where all master in the ring areupdated by the application concurrently (with some mechanism to avoidconflicts), a distict domain ID should be configured for each server (In amulti-master ring where the application is careful to only do updates on onemaster at a time, a single domain ID is sufficient).Normally, a slave server should not receive direct updates (as this createsbinlog differences compared to the master). Thus it does not matter what valueof gtid_domain_id is set on a slave, though it may make sense to make it thesame as the master (if not using multi-master) to make it easy to promote theslave as a new master. Of course, if a slave is itself an active master, as ina multi-master ring topology, the domain ID should be set according to theserver's role as active master.Note that domain ID and server ID are distinct concepts. It is possible to usea different domain ID on each server, but this is normally not desirable. Itmakes the current GTID position (@@global.gtid_pos) more complicated tounderstand and work with, and looses the concept of a single ordered binlogstream across all servers. It is recommended only to configure as many domainIDs as there are master servers actively being updated by the application atthe same time.It is not an error in itself to configure domain IDs incorrectly (for example,not configuring them at all). For example, this will be typical in an upgradescenario where a multi-master ring using 5.5 is upgraded to 10.0. The ringwill continue to work as before even though everything is configured to usethe default domain ID 0. It is even possible to use GTID for replicationbetween the servers. However, case must be taken when switching a slave to adifferent master. If the binlog order between the old and the new masterdiffers, then a single GTID position to start replication from in the newmaster's binlog may not be sufficient.G. New syntax for global transaction IDG.1. CHANGE MASTERCHANGE MASTER has a new option, master_use_gtid=[0|1]. When enabled (set to1), the slave will connect to the master using the GTID position. Whendisabled, the old-style binlog filename/offset position is used to decidewhere to start replicating when connecting.The value of master_use_gtid is saved across server restarts (inmaster.info). The current value can be seen as the field Using_Gtid in theoutput of SHOW SLAVE STATUS.G.2. BINLOG_GTID_POS().The BINLOG_GTID_POS() function takes as input an old-style binlog position inthe form of a file name and a file offset. It looks up the position in thecurrent binlog, and returns a string representation of the corresponding GTIDposition. If the position is not found in the current binlog, NULL isreturned.H. New system variables for global transaction IDH.1. gtid_posThis variable is the current GTID position of a slave server.It can be set by the user to change the current replication position. Thisrequires all slave threads to be stopped first. Note that the position isshared among all slave connections when using multi-source replication. To setposition for two masters, one using replication domain 1 and anotherreplication domain 2, set a GTID for both domains, for example:SET GLOBAL gtid_pos = "1-10-100,2-20-500";The variable value is updated whenever an event group is replicated on aslave, and whenever something is logged to the binlog on the master.Note that the value of the variable is the result of whatever event happenedlast, either slave replication or master binlogging, per replicationdomain. It is an error to set it to something that conflicts with what is inthe binlog. This means that to completely reset a slave server (RESET SLAVEand delete all tables), it is also necessary to RESET MASTER before@@GLOBAL.gtid_pos can be cleared (if binlogging is enabled on the slave). Thisis in any case necessary to avoid incorrect binlog on the slave.Name: gtid_posType: StringScope: globalPrivileged: yesDynamic: yesH.2. gtid_domain_idThis variable is used to decide which replication domain new GTIDs are loggedin for a master server. See section "F. Using global transaction withmulti-source replication and other multi-master setups" for details.This variable can also be set on the session level. This is used bymysqlbinlog to preserve the domain ID of GTID events.Name: gtid_domain_idType: 32-bit unsigned integerScope: global and sessionPrivileged: yesDynamic: yesH.3. server_idServer_id can be set on the session level to change which server_id value islogged in binlog events (both GTID and other events). This is used bymysqlbinlog to preserve the server ID of GTID events.Name: server_idType: 32-bit unsigned integerScope: global and sessionPrivileged: yesDynamic: yesH.3. gtid_seq_nogtid_seq_no can be set on the session level to change which sequence number islogged in the following GTID event. This is used by mysqlbinlog to preservethe sequence number of GTID events.Name: gtid_seq_noType: 64-bit unsigned integerScope: session onlyPrivileged: yesDynamic: yes