[Maria-discuss] MariaDB 10.1.14 failure to initiate SST after RSU schema upgrade
Hi,

We have a repeatable failure to initiate IST with MariaDB 10.1.14 after performing a schema upgrade on a single node in RSU mode. The error condition is when there is a delete query in the format:

delete from <table> where id >= <n>

on the non-RSU cluster nodes while the node is disconnected from the cluster. On rejoining, the node determines that it is in sync with the other cluster nodes and no IST is performed, despite the rows that were deleted in the cluster. If we then delete the rows manually from the joining node, mysqld immediately crashes on the other nodes because they can't execute the new write transaction.

The process we followed is:

1. Set up a 3-node cluster, nodes 0, 1, 2.
2. Enable RSU on node 0: SET GLOBAL wsrep_OSU_method='RSU';
3. Isolate node 0 from the cluster: SET GLOBAL wsrep_cluster_address="gcomm://";
4. Perform a backward-compatible schema change, since this is the point of this process. In our test we added a single column to a table with a default value of null. Additionally we deleted some rows from a table on nodes 1 and 2, with: delete from <table> where id >= 100; which affected around 20 rows.
5. Rejoin the node to the cluster: SET GLOBAL wsrep_cluster_address="<gcomm string from config file>";

At this point the node immediately rejoins without doing IST and believes it is in sync, yet the rows are deleted on nodes 1 and 2 but not on node 0.

Interestingly, if the delete query is: delete from <table> where id = <n>; there is no problem. Also we have not had any issue with syncing INSERT and UPDATE statements. A combination of INSERT, UPDATE and DELETE where id >= resulted in the insert/update statements being synced but the deletes not synced. It is as if the quorum somehow doesn't recognise delete where id >= as an event.

Our next test cases are:

1. Switching node 0 back to TOI mode before rejoining the cluster, although I can't really see how this would make a difference.
2. Upgrading to MariaDB 10.1.16, which was released a couple of days ago.
3. Testing whether regular IST is affected, i.e. IST that should occur normally without switching to RSU mode or dropping a node out of the cluster.

This seems like a pretty basic failure and I'm concerned that it may also affect regular IST, i.e. a node falling behind the cluster for normal reasons without any involvement of RSU mode, which would effectively make the whole system useless if it could randomly drop delete statements.

If anyone can shed any light on why this may be happening we would be very grateful!

Thanks, Mark
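To make the repro easier to follow, the whole sequence boils down to roughly the following. This is a sketch only: the table name t, the added column, and the node addresses are placeholders for our real schema and config.

```sql
-- Repro sketch against MariaDB Galera 10.1.14; 't' is a placeholder.

-- Node 0: enable RSU and detach from the cluster.
SET GLOBAL wsrep_OSU_method = 'RSU';
SET GLOBAL wsrep_cluster_address = 'gcomm://';

-- Node 0: backward-compatible schema change while detached.
ALTER TABLE t ADD COLUMN new_col INT DEFAULT NULL;

-- Node 1 (replicated to node 2): the delete that later goes missing.
DELETE FROM t WHERE id >= 100;

-- Node 0: rejoin using the real gcomm string from the config file.
SET GLOBAL wsrep_cluster_address = 'gcomm://node1_addr,node2_addr';

-- Expected: IST replays the delete on node 0.
-- Observed: node 0 reports Synced but still has the deleted rows.
```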
1. Switching node 0 back to TOI mode before rejoining the cluster, although I can't really see how this would make a difference.
Made no difference.
2. Upgrading to MariaDB 10.1.16 which was released a couple of days ago.
Also made no difference; the issue is still reproducible.
3. Testing whether regular IST is affected, ie IST that should occur normally without switching to RSU mode or dropping a node out of the cluster.
Will reply again when we've tested this.
On 20 Jul 2016, at 16:34, Mark Wadham wrote:
3. Testing whether regular IST is affected, ie IST that should occur normally without switching to RSU mode or dropping a node out of the cluster.
Will reply again when we've tested this.
Haven't tested this yet, but we just tested the process without making a schema change, and this resulted in the delete statements being synced correctly via IST when the node rejoins. So it seems that making the schema change is what triggers the bug. I have raised a Jira issue for this: https://jira.mariadb.org/browse/MDEV-10406 Since IST is not disturbed when we don't make a schema change, it seems likely that regular IST will work correctly, but we will test it anyway.
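A simple way to spot the missed transactions after a rejoin would be to compare the Galera commit counter and the affected rows on each node; t here is a placeholder for the real table.

```sql
-- Run on every node after the rejoin. If the cluster is genuinely in
-- sync, wsrep_last_committed should match across nodes and the row
-- count should agree; in the failing case node 0 still sees the rows.
SHOW STATUS LIKE 'wsrep_last_committed';
SELECT COUNT(*) FROM t WHERE id >= 100;
```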
Hi Mark, On Wed, Jul 20, 2016 at 10:38 AM, Mark Wadham <ubuntu@rkw.io> wrote:
Hi,
We have a repeatable failure to initiate IST with MariaDB 10.1.14 after performing a schema upgrade on a single node in RSU mode. The error condition is when there is a delete query in the format:
delete from <table> where id >= <n>
on the non-RSU cluster nodes while the node is disconnected from the cluster. On rejoining the node determines that it is in sync with the other cluster nodes and no IST is performed, despite the rows that were deleted in the cluster. If we then delete the rows manually from the joining node, mysqld immediately crashes on the other nodes because they can't execute the new write transaction.
The process we followed is:
1. Set up a 3-node cluster, nodes 0, 1, 2.
2. Enable RSU on node 0:
SET GLOBAL wsrep_OSU_method='RSU';
3. Isolate node 0 from the cluster:
SET GLOBAL wsrep_cluster_address="gcomm://";
4. Perform a backward-compatible schema change, since this is the point of this process. In our test we added a single column to a table with a default value of null.
As discussed on IRC #mariadb, you do not really need to take the node off the cluster (step 3). Just set wsrep_osu_method's session value to RSU and perform the schema change. With RSU mode enabled, the node automatically desyncs itself from the cluster before executing any DDL, and thus other nodes in the cluster are not impacted.

Best, Nirbhay
Additionally we deleted some rows from a table on nodes 1 and 2, with:
delete from <table> where id >= 100;
which affected around 20 rows.
5. Rejoin the node to the cluster:
SET GLOBAL wsrep_cluster_address="<gcomm string from config file>";
At this point the node immediately rejoins without doing IST and believes it is in sync, yet the rows are deleted on nodes 1 and 2 but not node 0.
Interestingly if the delete query is:
delete from <table> where id = <n>;
there is no problem. Also we have not had any issue with syncing INSERT and UPDATE statements. A combination of INSERT, UPDATE and DELETE where id >= resulted in the insert/update statements being synced but the deletes not synced. It is as if the quorum somehow doesn't recognise delete where id >= as an event.
Our next test cases are:
1. Switching node 0 back to TOI mode before rejoining the cluster, although I can't really see how this would make a difference.
2. Upgrading to MariaDB 10.1.16 which was released a couple of days ago.
3. Testing whether regular IST is affected, ie IST that should occur normally without switching to RSU mode or dropping a node out of the cluster.
This seems like a pretty basic failure and I'm concerned that it may also affect regular IST, i.e. a node falling behind the cluster for normal reasons without any involvement of RSU mode, which would effectively make the whole system useless if it could randomly drop delete statements.
If anyone can shed any light on why this may be happening we would be very grateful!
Thanks, Mark
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to: maria-discuss@lists.launchpad.net
Unsubscribe: https://launchpad.net/~maria-discuss
More help: https://help.launchpad.net/ListHelp
Hi Nirbhay,
As discussed on IRC #mariadb, you do not really need to take the node off the cluster (step 3). Just set wsrep_osu_method's session value to RSU and perform the schema change. With RSU mode enabled, the node automatically desyncs itself from the cluster before executing any DDL, and thus other nodes in the cluster are not impacted.
I don't think this was the case during our testing, but I'll test it again anyway. If I understand RSU mode correctly, all it does is ensure that DDL statements aren't replicated to the other nodes. AFAIK it won't stop the write events from the other nodes from being applied on the node in RSU mode, meaning that if we have writes to the table being altered we could run into problems. We have some very large tables, and one that we tested took around 7h 45min to add a column. Mark
On 21 Jul 2016, at 9:05, Mark Wadham wrote:
I don't think this was the case during our testing, but I'll test it again anyway. If I understand RSU mode correctly all it does is ensure that DDL statements aren't replicated to the other nodes. AFAIK it won't stop the write events from the other nodes from being synced to the node in RSU mode, meaning that if we have writes to the table being altered we could run into problems. We have some very large tables and one of them we tested took around 7h 45min to add a column.
Confirmed this. On node 0:

SET GLOBAL wsrep_OSU_method='RSU';
show status like 'wsrep%';

still shows "Synced". If we then alter a table on node 0 and try to insert into the same table on node 1, we get a deadlock error. Mark
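In other words, with GLOBAL RSU set but no desync happening, a sequence like the following produces the conflict; t is a placeholder table and the deadlock detail is as we observed it on our setup.

```sql
-- Node 0: GLOBAL RSU alone does not desync the node.
SET GLOBAL wsrep_OSU_method = 'RSU';
SHOW STATUS LIKE 'wsrep_local_state_comment';  -- still shows 'Synced'
ALTER TABLE t ADD COLUMN c1 INT DEFAULT NULL;  -- long-running on big tables

-- Node 1, while the ALTER is still running on node 0:
INSERT INTO t (id) VALUES (1000);
-- fails with a deadlock error in our tests
```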
On 21 Jul 2016, at 9:39, Mark Wadham wrote:
Confirmed this
Oh wait, it seems that if we do: SET wsrep_OSU_method='RSU'; (with GLOBAL omitted) then it sort of works. The node doesn't desync immediately, but as soon as we execute some DDL it desyncs, and then immediately re-syncs as soon as the alter statement completes. I'll do some more testing, as I'm not sure whether we can execute multiple DDL statements (or, more likely, a sequence of migrations for one of our webapps) and be sure that it will stay in RSU mode until they've all been executed. Mark
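Assuming the session-scoped behaviour holds, a migration run would look something like the sketch below (t and the column names are placeholders). Whether the desync/resync cycle around each individual statement is safe for a whole sequence of migrations is exactly what still needs verifying.

```sql
-- Session scope only: the node desyncs for each DDL and resyncs after.
SET SESSION wsrep_OSU_method = 'RSU';
ALTER TABLE t ADD COLUMN c1 INT DEFAULT NULL;  -- desync, alter, resync
ALTER TABLE t ADD COLUMN c2 INT DEFAULT NULL;  -- desyncs again for this one
SET SESSION wsrep_OSU_method = 'TOI';          -- back to the default method
```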
participants (2):
- Mark Wadham
- Nirbhay Choubey