Hi Nirbhay, I believe this patch is acceptable. Still I didn't completely understand the need for all this complexity and your answer to Daniel's question. Some minor comments inline. On Fri, May 20, 2016 at 11:04:24PM -0400, Nirbhay Choubey wrote:
revision-id: 454a9dc53a5b02af316dd713c577082eb8cf79c4 (mariadb-10.1.14-4-g454a9dc) parent(s): 9a1c4e900b98fdb9940aab57c895753f175c2bd8 author: Nirbhay Choubey committer: Nirbhay Choubey timestamp: 2016-05-20 23:04:20 -0400 message:
MDEV-10004: Galera's pc.recovery process fails in 10.1 with systemd
Galera recovery process works in two phases. In the first phase, mysqld is started as non-daemon with --wsrep-recover to recover and fetch the last logged global transaction ID. This ID is then used in second phase as the start position (--wsrep-start-position=XX) to start mysqld as daemon.
As this process was implemented in mysqld_safe script, the recovery did not work when server was started using systemd.
Fixed by introducing a shell script (wsrep_recovery.sh) that mimics the first phase of the recovery process.
--- cmake/systemd.cmake | 3 +- scripts/galera_recovery.sh | 94 ++++++++++++++++++++++++++++++++++++++++ support-files/mariadb.service.in | 11 ++++- 3 files changed, 106 insertions(+), 2 deletions(-)
diff --git a/cmake/systemd.cmake b/cmake/systemd.cmake index b0161cf..17f44f7 100644 --- a/cmake/systemd.cmake +++ b/cmake/systemd.cmake @@ -55,9 +55,10 @@ MACRO(CHECK_SYSTEMD) IF(HAVE_SYSTEMD AND HAVE_SYSTEMD_SD_DAEMON_H AND HAVE_SYSTEMD_SD_LISTEN_FDS AND HAVE_SYSTEMD_SD_NOTIFY AND HAVE_SYSTEMD_SD_NOTIFYF) ADD_DEFINITIONS(-DHAVE_SYSTEMD) - SET(SYSTEMD_SCRIPTS mariadb-service-convert galera_new_cluster) + SET(SYSTEMD_SCRIPTS mariadb-service-convert galera_new_cluster galera_recovery) SET(SYSTEMD_DEB_FILES "usr/bin/mariadb-service-convert usr/bin/galera_new_cluster + usr/bin/galera_recovery ${INSTALL_SYSTEMD_UNITDIR}/mariadb.service ${INSTALL_SYSTEMD_UNITDIR}/mariadb@.service ${INSTALL_SYSTEMD_UNITDIR}/mariadb@bootstrap.service.d/use_galera_new_cluster.conf") diff --git a/scripts/galera_recovery.sh b/scripts/galera_recovery.sh new file mode 100755 index 0000000..6d4f1d5 --- /dev/null +++ b/scripts/galera_recovery.sh @@ -0,0 +1,94 @@ +#!/bin/sh + +# Copyright (c) 2016 MariaDB Corporation +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; version 2 of the License. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ + + +# This script is intended to be executed by systemd. It starts mysqld with +# --wsrep-recover to recover from a non-graceful shutdown, determines the +# last stored global transaction ID and echoes it in --wsrep-start-position=XX +# format. The output is then captured and used by systemd to start mysqld. +# If the server was configured to start without wsrep, nothing is echoed. + +user='@MYSQLD_USER@' +log_file=$(mktemp /tmp/wsrep_recovery.XXXXXX) +euid=$(id -u) +recovered_pos="" +skipped="" +start_pos="" +start_pos_opt="" +ret=0 + +log () +{ + local msg="$1" + # Print all messages to stderr as we reserve stdout for printing + # --wsrep-start-position=XXXX. + echo "$msg" >&2 +} + +finish() +{ + rm -f "$log_file" +} + +trap finish EXIT + +# Safety checks +if [ -z "$log_file" ]; then + log "WSREP: mktemp failed" + exit 1 +fi + +if [ -f "$log_file" ]; then + [ "$euid" = "0" ] && chown $user $log_file + chmod 600 $log_file +else + log "WSREP: Failed to change the permissions of $log_file." I'd say it's rather mktemp failed than change permissions failed.
+ exit 1 +fi + +# Redirect server's error log to the log file. +eval /usr/sbin/mysqld --user=$user --wsrep_recover 2> "$log_file" +ret=$? +if [ $ret -ne 0 ]; then + # Something went wrong, let us also print the error log so that it + # shows up in systemctl status output as a hint to the user. + log "WSREP: Failed to start mysqld for wsrep recovery: '`cat $log_file`'" + exit 1 +fi Since this script is executed unconditionally, what shall happen if wsrep is not enabled at compile time (--wsrep_recover is not known)? Fail the whole service startup?
+ +# Parse server's error log for recovered position. The server prints +# "..skipping position recovery.." if started without wsrep. + +recovered_pos="$(grep 'WSREP: Recovered position:' $log_file)" + +if [ -z "$recovered_pos" ]; then + skipped="$(grep WSREP $log_file | grep 'skipping position recovery')" + if [ -z "$skipped" ]; then + log "WSREP: Failed to recover position: '`cat $log_file`'" + exit 1 Can we come here with wsrep disabled?
+ else + log "WSREP: Position recovery skipped." + fi +else + start_pos="$(echo $recovered_pos | sed 's/.*WSREP\:\ Recovered\ position://' \ + | sed 's/^[ \t]*//')" + log "WSREP: Recovered position $start_pos" + start_pos_opt="--wsrep_start_position=$start_pos" +fi + +echo "$start_pos_opt" + diff --git a/support-files/mariadb.service.in b/support-files/mariadb.service.in index b18674b..00162d4 100644 --- a/support-files/mariadb.service.in +++ b/support-files/mariadb.service.in @@ -48,6 +48,12 @@ CapabilityBoundingSet=CAP_IPC_LOCK # Execute pre and post scripts as root, otherwise it does it as User= PermissionsStartOnly=true
+# Perform automatic wsrep recovery. When server is started without wsrep, +# galera_recovery simply returns an empty string. In any case, however, +# the script is not expected to return with a non-zero status. +ExecStartPre=/bin/sh -c "VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && \ + systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1" I believe _WSREP_START_POSITION won't work with multi instance version. Not sure if it works but we may need to use something like _WSREP_START_POSITION%I here.
+ # Needed to create system tables etc. # ExecStartPre=/usr/bin/mysql_install_db -u mysql
@@ -57,9 +63,12 @@ PermissionsStartOnly=true # This isn't a replacement for my.cnf. # _WSREP_NEW_CLUSTER is for the exclusive use of the script galera_new_cluster @SYSTEMD_EXECSTARTPRE@ -ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER +ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION @SYSTEMD_EXECSTARTPOST@
+# Unset _WSREP_START_POSITION environment variable. +ExecStartPost=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION" + KillMode=process KillSignal=SIGTERM
_______________________________________________ commits mailing list commits@mariadb.org https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits
Regards, Sergey