[Maria-discuss] Galera partitioning question
Greetings all,

I am running MariaDB 10.2.16 on CentOS in AWS and am seeing sporadic cluster partitioning and rejoining with no apparent cause.

* I have nodes in 3 different AWS availability zones in a single Galera cluster.
* Monitoring the logs, I see this message:
  Jul 29 05:33:53 server01 mysqld: 2018-07-29 5:33:53 139633883080448 [Note] WSREP: (eabb848a, 'tcp://0.0.0.0:4567') connection to peer 392b9516 with addr tcp://172.31.17.60:4567 timed out, no messages seen in PT3S
* I have tried forcing a 1500-byte MTU, as some other sources mentioned that jumbo frames could negatively impact Galera replication.
* Running prolonged packet captures between nodes, I cannot find anything else wrong: network connectivity isn't interrupted and no service restarts occur.
* These partition events happen multiple times per day.

Has anyone seen this sporadic cluster disconnect and rejoin issue in a similar environment? I did not see this behavior previously on 10.1.

Any help is much appreciated.

-Ryan
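To check whether the timeouts cluster around one peer or one availability zone, the log notes can be tallied per peer. Below is a minimal sketch, assuming the syslog layout of the single message quoted above; the regex and the function name `count_peer_timeouts` are illustrative and not part of Galera or MariaDB.

```python
import re
from collections import Counter

# Pattern derived from the one sample line quoted above (an assumption:
# other Galera versions or log formats may word this note differently).
TIMEOUT_RE = re.compile(
    r"WSREP: \((?P<local>[0-9a-f]+), '(?P<listen>[^']+)'\) "
    r"connection to peer (?P<peer>[0-9a-f]+) "
    r"with addr (?P<addr>\S+) timed out, "
    r"no messages seen in (?P<period>PT[0-9.]+S)"
)


def count_peer_timeouts(lines):
    """Count timeout notes per (peer id, peer address) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("peer"), match.group("addr"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Jul 29 05:33:53 server01 mysqld: 2018-07-29 5:33:53 139633883080448 "
        "[Note] WSREP: (eabb848a, 'tcp://0.0.0.0:4567') connection to peer "
        "392b9516 with addr tcp://172.31.17.60:4567 timed out, "
        "no messages seen in PT3S",
    ]
    # Print peers ordered by how often they timed out.
    for (peer, addr), n in count_peer_timeouts(sample).most_common():
        print(peer, addr, n)
```

Feeding it the full syslog from each node (for example, by iterating over an open file handle) would show whether one address accounts for most of the events.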
participants (1)
Ryan Delgrosso