-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 25/06/2015 14:42, Guillaume Lefranc wrote:
Well, if you had said that your shell was busybox in the first place, that would have saved us a lot of time... :-)
It's not my shell, my shell is regular bash, but many other tools, yes. Connecting to mysql daemon still fails with the same error, but at least lsof is now working as expected. That's progress :)
2015-06-25 14:41 GMT+02:00 Sylvain Raybaud <sylvain.raybaud@green-communications.fr>:
Guillaume,
On 25/06/2015 13:40, Guillaume Lefranc wrote:
Just a suggestion, you can try adding "set -x" to /usr/bin/wsrep_sst_rsync, so the script will dump its output in the log. You should be able to know where it hangs precisely then.
It gives me the following sequence, repeating forever:

+ check_pid_and_port /var/lib/mysql//rsync_sst.pid 20189 4444
+ local pid_file=/var/lib/mysql//rsync_sst.pid
+ local rsync_pid=20189
+ local rsync_port=4444
+ which lsof
++ lsof -i :4444 -Pn
++ grep '(LISTEN)'
+ local port_info=
++ echo
++ grep -w '^rsync[[:space:]]\+20189'
+ local is_rsync=
+ '[' -n '' -a -z '' ']'
It seems to correspond to lines 281--284:

until check_pid_and_port $RSYNC_PID $RSYNC_REAL_PID $RSYNC_PORT
do
    sleep 0.2
done
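For illustration, the same polling pattern can be given an upper bound, so that a check which never succeeds shows up as an error instead of an endless loop. This is only a sketch: wait_until and its max_tries argument are hypothetical names, not part of wsrep_sst_rsync.

```shell
#!/bin/bash
# Hypothetical helper, not taken from wsrep_sst_rsync.
# Usage: wait_until MAX_TRIES COMMAND [ARGS...]
# Re-runs COMMAND every 0.2 s until it succeeds or MAX_TRIES attempts have failed.
wait_until() {
    local max_tries=$1
    shift
    local tries=0
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1    # give up instead of looping forever
        fi
        sleep 0.2
    done
}
```

Something like "wait_until 50 check_pid_and_port ..." would then give up after roughly ten seconds; the limit of 50 is an arbitrary choice for the example.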
check_pid_and_port seems to be checking that rsync is running and listening, basically. Strange thing is: lsof -i :4444 -Pn doesn't return anything although ps shows that rsync was invoked correctly: rsync --daemon --no-detach --port 4444 --config /var/lib/mysql//rsync_sst.conf
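Going only by the trace above, the check seems to boil down to the following. This is a sketch reconstructed from the "set -x" output, not the actual script source: port_state is a hypothetical helper that takes the lsof output as text, so the logic can be tried without a running rsync daemon.

```shell
#!/bin/bash
# Sketch reconstructed from the trace; NOT the real check_pid_and_port.
# Usage: port_state LSOF_OUTPUT RSYNC_PID
port_state() {
    local lsof_output=$1
    local rsync_pid=$2
    local port_info is_rsync

    # Keep only listening sockets, as the traced pipeline does.
    port_info=$(printf '%s\n' "$lsof_output" | grep '(LISTEN)' || true)
    # Look for a listening socket owned by "rsync <pid>".
    is_rsync=$(printf '%s\n' "$port_info" | grep -w "^rsync[[:space:]]\+$rsync_pid" || true)

    if [ -z "$port_info" ]; then
        echo "nothing-listening"      # what an empty lsof output looks like
    elif [ -z "$is_rsync" ]; then
        echo "port-taken-by-other"
    else
        echo "rsync-listening"
    fi
}
```

With an empty first argument, which is what lsof returns here, the function reports "nothing-listening" even though rsync is running. That matches the empty port_info and is_rsync values in the trace.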
Actually, lsof -i :4444 -Pn seems to behave rather differently on my laptop (ubuntu) and on buildroot. Indeed, lsof in buildroot is provided by busybox by default. This sometimes leads to significant differences. I'm going to rebuild my system with the real lsof package and see if it gets better. I'll let you know.
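A quick way to see which tools come from busybox is to resolve where the command actually points. The sketch below assumes the applets are installed as symlinks to a binary named busybox (other install styles, such as hard links, would not be detected); tool_provider and is_busybox_path are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helpers; assume busybox applets are symlinks to a "busybox" binary.

# True if the given resolved path points at a busybox binary.
is_busybox_path() {
    case "$(basename "$1")" in
        busybox*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints "busybox", "standalone" or "missing" for a tool name.
tool_provider() {
    local tool=$1 path resolved
    path=$(command -v "$tool") || { echo "missing"; return; }
    resolved=$(readlink -f "$path")   # follow symlinks to the real file
    if is_busybox_path "$resolved"; then
        echo "busybox"
    else
        echo "standalone"
    fi
}
```

Running "tool_provider lsof" on both machines should show whether they use the same implementation.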
You can also try to run the SST command manually on the nodes, and see what it does. You can get the full command output in ps so you're free to start a donor on one node and a joiner on another node and follow the script output.
I did, and it fails because some variables are unbound. I think this is specific to manual invocation.
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp
- -- 
Sylvain Raybaud
www.green-communications.fr
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJVi/9wAAoJEEkkwl4JtJ9yvKIQAKFRN2S1PVMi4WpRkAy+y3Sx
V8SkQdI2ngOmwmC+QoEIZYkqeASHamVItMtHM+t7OrPrVHDCTgOjrcksHt8LPSu8
EH/e4nScB5x12aCSxrgiet2FA4xmLVl24dUWRwtlR3Cm0H2bkUO2hJBw6YEP+FcD
ATVrqfzr/kcnFAIW0X//6PtH562lc0FpMqkLiez/03iiZPEhf7LlblNbjr9jwHpv
F2p6VSJZF+gdGVES/CKGWI0mSwzqrAjXV77ExxDR6zJ8WMVDrzcuF/IjEfjG/3Wu
lu+GQBnNu3c+F7ROXp0R7HJYXdv7yrphp5v3JkZJ48m245gYvpz/d/R915ZPj1g5
8UbkKkWVJq1MwRQSoXf3vT1nbwOfghk5H9/YP5F/dgLXMHfSuQmRk2w2XOs0eYtt
Y0pfQNT5KY3WGM4t3H+lss+c1tVYg/kA5ZFytCICbfJJJrbzSg/t0C6FcQ8JNeoY
HIYuM/K3K/Nngdivi9RCNOxWiB0JwhWqUOAy4eTrI1LDXv1ySYBSllGgKZF77iS/
PI+MUEbUSfFyh/aFQRHGLzFY6qFrwGXf/n3PcmxrnGcBW3jqV8jZsBf0jvmUQhAw
H+F8QImrpfYcaANo52ufi7bsUzjFSuo0jeib+FHoLeQXkqc10SXBZb+LVeuPKR7K
fnhLWNkhcUF5YQ3IJK9F
=q2jK
-----END PGP SIGNATURE-----