#!/bin/bash
# Service watchdog script
# Put in crontab to automatically restart services (and optionally email you) if they die for some reason.
# Note: You need to run this as root, otherwise you won't be able to restart services.
#
# Example crontab usage:
#
# Strict check for the apache2 service every 5 minutes, pipe results to /dev/null
# */5 * * * * bash /root/watchdog.sh apache2 "" > /dev/null
#
# "Loose" check for mysqld every 5 minutes; the second parameter is the name of the service
# to restart, in case the application and service names differ. Also emails a report to [email protected]
# about the restart.
# */5 * * * * bash /root/watchdog.sh mysqld mysql [email protected] > /dev/null
#
# Common daemon names:
# Apache:
#   apache2 - Debian/Ubuntu
#   httpd - RHEL/CentOS/Fedora
# ---
# MySQL:
#   mysql - Debian/Ubuntu
#   mysqld - RHEL/CentOS/Fedora
# ---
# Timestamp used in log and email messages
DATE=$(date +%Y-%m-%d--%H-%M-%S)
# Process name to look for
SERVICE_NAME="$1"
# Service name to restart, if it differs from the process name
SERVICE_RESTARTNAME="$2"
EXTRA_PGREP_PARAMS="-x" # Extra parameters to pgrep; -x requires an exact name match
MAIL_TO="$3" # Email address to send restart notifications to (optional)
# Path to the pgrep command, for example /usr/bin/pgrep
PGREP="pgrep"
# Check whether we have a second param
if [ -z "$SERVICE_RESTARTNAME" ]
then
    RESTART="/sbin/service ${SERVICE_NAME} restart" # No second param
else
    RESTART="/sbin/service ${SERVICE_RESTARTNAME} restart" # Second param
fi
# EXTRA_PGREP_PARAMS is left unquoted on purpose so several flags split into separate words
pids=$($PGREP ${EXTRA_PGREP_PARAMS} "${SERVICE_NAME}")
# If we get no pids, the service is not running
if [ "$pids" = "" ]
then
    $RESTART
    if [ -z "$MAIL_TO" ]
    then
        echo "$DATE : ${SERVICE_NAME} restarted - no email report configured."
    else
        echo "$DATE : Performing restart of ${SERVICE_NAME}" | mail -s "Service failure: ${SERVICE_NAME}" "${MAIL_TO}"
    fi
else
    echo "$DATE : Service ${SERVICE_NAME} is still working!"
fi
# copylefted from https://gist.github.com/vodolaz095/5073080
#!/bin/bash
mailto="mymail@mydomain"
/bin/bash /root/watchdog.sh mysqld mysqld "$mailto"
/bin/bash /root/watchdog.sh httpd httpd "$mailto"
/bin/bash /root/watchdog.sh pound pound "$mailto"
/bin/bash /root/watchdog.sh redis-server redis "$mailto"
/bin/bash /root/watchdog.sh memcached memcached "$mailto"
/bin/bash /root/watchdog.sh searchd searchd "$mailto"
Used this script on my server, but had to execute it with "bash", not "sh".
POSIX "sh" doesn't understand == for string equality, as that is a bash-ism. Use = instead.
Other people saying that brackets aren't supported by "sh" are wrong, btw.
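To illustrate the point about = versus ==, here is a minimal sketch of a comparison that should behave the same under bash and a POSIX sh such as dash. The function name service_state is hypothetical and not part of the watchdog script; it only stands in for the "did pgrep return anything" check.

```shell
#!/bin/sh
# service_state PIDS: hypothetical helper, not part of the original script.
# Prints "running" if PIDS (the output of pgrep) is non-empty, else "not running".
service_state() {
    # A single "=" is the POSIX string comparison; "==" is a bash extension
    # that dash rejects with "unexpected operator".
    if [ "$1" = "" ]; then
        echo "not running"
    else
        echo "running"
    fi
}

service_state ""        # empty pgrep output
service_state "812 813" # two matching pids
```

Quoting "$1" matters as much as the operator: an unquoted empty variable leaves the test with a missing operand.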
If you see a "/root/watchdog.sh: 46: [: unexpected operator" message, here is the solution: use
*/5 * * * * bash /root/watchdog.sh mysqld mysql [email protected] > /dev/null
for your cron job (I am on Ubuntu 13.10).
Very useful!
Had to change the "service" path. Consider using a variable for it to increase portability:
SERVICE=`which service`
Thanks for posting!
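One possible variation on this suggestion, as a sketch only: cron often runs jobs with a reduced PATH, so a lookup through PATH may not find the binary there. The helper below, first_executable, is a hypothetical name and not part of the original script; it probes a list of candidate paths instead. The two candidates are the locations mentioned in this thread, and the fallback to a bare "service" is an assumption.

```shell
#!/bin/sh
# first_executable PATH...: hypothetical helper, not part of the original script.
# Prints the first argument that is an executable file; fails if none is.
first_executable() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate locations taken from this thread; extend the list for other distros.
SERVICE=$(first_executable /usr/sbin/service /sbin/service) || SERVICE="service"
RESTART="$SERVICE apache2 restart"
# Only print the command here; the real script would execute $RESTART.
echo "Would run: $RESTART"
```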
How do I disable this? I removed it from cron and killed the watchdog process, but I see it still working...
@zednight, I'm surprised there are so many comments on a script of unknown authorship or origin that I managed to dig out of the HDD of a server that stopped working in 2005...
Are you sure that ps -e shows exactly the command for watchdog.sh?
My PC already has a few watchdogs that are not related to this script:
[vodolaz095@steel ~]$ ps -e | grep watchdog
12 ? 00:00:00 watchdog/0
13 ? 00:00:00 watchdog/1
20 ? 00:00:00 watchdog/2
27 ? 00:00:00 watchdog/3
For Ubuntu 16.04, change
RESTART="/sbin/service
to
RESTART="/usr/sbin/service
and use
*/5 * * * * bash /root/watchdog.sh mysqld mysql [email protected] > /dev/null
*/5 * * * * bash /root/watchdog.sh apache2 "" [email protected] > /dev/null
Thanks!
Just one more question: if I send a STOP signal to the process ID, the STAT of the process changes from S to T. How can I check for that?
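A rough sketch of one way to detect that state, assuming a procps-style ps that accepts "-o stat=" (this is not in the original script, and both helper names are hypothetical). pgrep still lists a stopped process, so the watchdog as written would report it as working; checking the STAT value separately is one way to tell the difference.

```shell
#!/bin/sh
# Both helpers are hypothetical and not part of the original script.

# stat_is_stopped STAT: succeeds if a ps STAT value denotes a stopped process.
stat_is_stopped() {
    case "$1" in
        T*|t*) return 0 ;; # T: stopped by a job-control signal, t: stopped by a debugger
        *)     return 1 ;;
    esac
}

# proc_stat PID: prints the STAT column for PID (assumes a procps-style ps).
proc_stat() {
    ps -o stat= -p "$1" 2>/dev/null | tr -d ' '
}

# Example with literal STAT values, so it runs without touching any process
for s in S Ss T Tl; do
    if stat_is_stopped "$s"; then
        echo "$s: stopped"
    else
        echo "$s: not stopped"
    fi
done
```

In the watchdog, one could loop over $pids and call stat_is_stopped "$(proc_stat "$pid")", then decide whether to send SIGCONT or restart the service.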
Thanks for sharing @vodolaz!
I use this script to restart the memcached service if for some reason this service has failed.
Mine is a bit shorter...
#!/bin/sh
# pidof looks processes up by program name, so this is the name, not the full start command
app='[name of your program]'
if [ -z "$(pidof "$app")" ]
then
    [some command to start your stuff]
fi
Tested on Fedora 17 / CentOS 6.3.