-
-
Save ssimpson89/7207165 to your computer and use it in GitHub Desktop.
#!/bin/bash | |
### VARIABLES ### \ | |
EMAIL="" | |
SERVER=$(hostname) | |
MYSQL_CHECK=$(mysql -e "SHOW VARIABLES LIKE '%version%';" || echo 1) | |
LAST_ERRNO=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Last_Errno" | awk '{ print $2 }') | |
SECONDS_BEHIND_MASTER=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G"| grep "Seconds_Behind_Master" | awk '{ print $2 }') | |
IO_IS_RUNNING=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Slave_IO_Running" | awk '{ print $2 }') | |
SQL_IS_RUNNING=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Slave_SQL_Running" | awk '{ print $2 }') | |
ERRORS=() | |
### Run Some Checks ### | |
## Check if I can connect to Mysql ## | |
if [ "$MYSQL_CHECK" == 1 ] | |
then | |
ERRORS=("${ERRORS[@]}" "Can't connect to MySQL (Check Pass)") | |
fi | |
## Check For Last Error ## | |
if [ "$LAST_ERRNO" != 0 ] | |
then | |
ERRORS=("${ERRORS[@]}" "Error when processing relay log (Last_Errno)") | |
fi | |
## Check if IO thread is running ## | |
if [ "$IO_IS_RUNNING" != "Yes" ] | |
then | |
ERRORS=("${ERRORS[@]}" "I/O thread for reading the master's binary log is not running (Slave_IO_Running)") | |
fi | |
## Check for SQL thread ## | |
if [ "$SQL_IS_RUNNING" != "Yes" ] | |
then | |
ERRORS=("${ERRORS[@]}" "SQL thread for executing events in the relay log is not running (Slave_SQL_Running)") | |
fi | |
## Check how slow the slave is ## | |
if [ "$SECONDS_BEHIND_MASTER" == "NULL" ] | |
then | |
ERRORS=("${ERRORS[@]}" "The Slave is reporting 'NULL' (Seconds_Behind_Master)") | |
elif [ "$SECONDS_BEHIND_MASTER" > 60 ] | |
then | |
ERRORS=("${ERRORS[@]}" "The Slave is at least 60 seconds behind the master (Seconds_Behind_Master)") | |
fi | |
### Send and Email if there is an error ### | |
if [ "${#ERRORS[@]}" -gt 0 ] | |
then | |
MESSAGE="An error has been detected on ${SERVER} involving the mysql replciation. Below is a list of the reported errors:\n\n | |
$(for i in $(seq 1 ${#ERRORS[@]}) ; do echo "\t${ERRORS[$i]}\n" ; done) | |
Please correct this ASAP | |
" | |
echo -e $MESSAGE | mail -s "Mysql Replication for $SERVER is reporting Error" ${EMAIL} | |
fi |
@roadst4r Im not sure why I didnt just store the output in a variable. Your right, its much better to make one call and then parse it from there.
@@ -1,13 +1,15 @@
#!/bin/bash
### VARIABLES ### \
-EMAIL=""
+EMAIL="[email protected]"
SERVER=$(hostname)
-MYSQL_CHECK=$(mysql -e "SHOW VARIABLES LIKE '%version%';" || echo 1)
-LAST_ERRNO=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Last_Errno" | awk '{ print $2 }')
-SECONDS_BEHIND_MASTER=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G"| grep "Seconds_Behind_Master" | awk '{ print $2 }')
-IO_IS_RUNNING=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Slave_IO_Running" | awk '{ print $2 }')
-SQL_IS_RUNNING=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Slave_SQL_Running" | awk '{ print $2 }')
+MYSQL='/usr/bin/mysql -u mysql_user -pyourpassword'
+MYSQL_CHECK=$($MYSQL -e "SHOW VARIABLES LIKE '%version%';" || echo 1)
+LAST_ERRNO=$($MYSQL -e "SHOW SLAVE STATUS\G" | grep "Last_Errno" | awk '{ print $2 }')
+SECONDS_BEHIND_MASTER=$($MYSQL -e "SHOW SLAVE STATUS\G"| grep "Seconds_Behind_Master" | awk '{ print $2 }')
+IO_IS_RUNNING=$($MYSQL -e "SHOW SLAVE STATUS\G" | grep "Slave_IO_Running" | awk '{ print $2 }')
+SQL_IS_RUNNING=$($MYSQL -e "SHOW SLAVE STATUS\G" | grep "Slave_SQL_Running" | awk '{ print $2 }')
+ERR_FLAG="/tmp/mysql_replication_err_flag.dat"
ERRORS=()
### Run Some Checks ###
@@ -40,7 +42,7 @@
if [ "$SECONDS_BEHIND_MASTER" == "NULL" ]
then
ERRORS=("${ERRORS[@]}" "The Slave is reporting 'NULL' (Seconds_Behind_Master)")
-elif [ "$SECONDS_BEHIND_MASTER" > 60 ]
+elif [ "$SECONDS_BEHIND_MASTER" -gt 60 ]
then
ERRORS=("${ERRORS[@]}" "The Slave is at least 60 seconds behind the master (Seconds_Behind_Master)")
fi
@@ -48,9 +50,16 @@
### Send and Email if there is an error ###
if [ "${#ERRORS[@]}" -gt 0 ]
then
- MESSAGE="An error has been detected on ${SERVER} involving the mysql replciation. Below is a list of the reported errors:\n\n
- $(for i in $(seq 1 ${#ERRORS[@]}) ; do echo "\t${ERRORS[$i]}\n" ; done)
- Please correct this ASAP
- "
- echo -e $MESSAGE | mail -s "Mysql Replication for $SERVER is reporting Error" ${EMAIL}
+
+ if [ ! -f ${ERR_FLAG} ]
+ then
+ MESSAGE="An error has been detected on ${SERVER} involving the mysql replciation. Below is a list of the reported errors:\n\n
+ $(for i in $(seq 1 ${#ERRORS[@]}) ; do echo "\t${ERRORS[$i]}\n" ; done)
+ Please correct this ASAP\n\n
+ Please remove ${ERR_FLAG} after settle this issue.
+ "
+ echo -e $MESSAGE | mail -s "Mysql Replication for $SERVER is reporting Error" ${EMAIL}
+ touch $ERR_FLAG
+ fi
+
fi
I modified as:
- select a mysql login user
- avoid repeating sending email during error situation but still not yet settle. (in my case, sometimes one mysql LAN is broken(unstable) and need to wait ISP recover it)
- [ "$SECONDS_BEHIND_MASTER" > 60 ] to [ "$SECONDS_BEHIND_MASTER" -gt 60 ]
Thanks for the script! I corrected the output of the messages, since the first error was missing, caused by the index starting with 1.
MESSAGE="An error has been detected on ${SERVER} involving the mysql replication. Below is a list of the reported errors:\n\n
$(for i in "${ERRORS[@]}"; do
echo "\t$i\n"
done)
Please correct this ASAP"
Stephen, I also am grateful for your script. Thank you for posting it.
I made the following corrections:
Added a ":" to exclude "Slave_SQL_Running_State".
SQL_IS_RUNNING=$(/usr/bin/mysql -e "SHOW SLAVE STATUS\G" | grep "Slave_SQL_Running:" | awk '{ print $2 }')
Changed the 1 to 0 to get the entire error message list.
$(for i in $(seq 0 ${#ERRORS[@]}) ; do echo "\t${ERRORS[$i]}\n" ; done)
I also tweaked a few things to my liking:
Added a sending user for the email
[email protected]
echo -e $MESSAGE | mail -s "Mysql Replication for $SERVER is reporting Error" -r ${SENDER} ${EMAIL}
Added more detail if ERRNO is set.
## Check For Last Error ##
if [ "$LAST_ERRNO" != 0 ]
then
LAST_ERROR=$(mysql -e "SHOW SLAVE STATUS\G" | grep "Last_Error" | awk '{ print $2 }')
ERRORS=("${ERRORS[@]}" "Error when processing relay log (Last_Errno = $LAST_ERRNO)")
ERRORS=("${ERRORS[@]}" "(Last_Error = $LAST_ERROR)")
fi
Added more detail for seconds_behind_master and set the threshold to 1 hour.
## Check how slow the slave is ##
if [ "$SECONDS_BEHIND_MASTER" == "NULL" ]
then
ERRORS=("${ERRORS[@]}" "The Slave is reporting 'NULL' (Seconds_Behind_Master)")
elif [ "$SECONDS_BEHIND_MASTER" -gt 3600 ]
then
BEHINDHOURS=$(expr $SECONDS_BEHIND_MASTER / 3600)
ERRORS=("${ERRORS[@]}" "The Slave is more than an hour behind the master (Seconds_Behind_Master = $SECONDS_BEHIND_MASTER [$BEHINDHOURS+ hours])")
fi
I hope others find these tweaks useful.
I another improvement. rather than perform SHOW SLAVE STATUS multiple times for each variable do this
MYSQL_CHECK=$(mysql --defaults-file=/var/lib/mysql/repl_chk.cnf -e "SHOW VARIABLES LIKE '%version%';" || echo 1)
STATUS_LINE=$(mysql --defaults-file=/var/lib/mysql/repl_chk.cnf -e "SHOW SLAVE STATUS\G")"1"
LAST_ERRNO=$(grep "Last_Errno" <<< "$STATUS_LINE" | awk '{ print $2 }')
SECONDS_BEHIND_MASTER=$( grep "Seconds_Behind_Master" <<< "$STATUS_LINE" | awk '{ print $2 }')
IO_IS_RUNNING=$( grep "Slave_IO_Running" <<< "$STATUS_LINE" | awk '{ print $2 }')
SQL_IS_RUNNING=$(grep "Slave_SQL_Running" <<< "$STATUS_LINE" | awk '{ print $2 }')
MASTER_LOG_FILE=$(grep " Master_Log_File" <<< "$STATUS_LINE" | awk '{ print $2 }')
RELAY_MASTER_LOG_FILE=$(grep "Relay_Master_Log_File" <<< "$STATUS_LINE" | awk '{ print $2 }')
Just a note that (with Ubuntu 16.0-4) I had to wrap the 'greater than' comparisions in an extra set of square brackets for them to work properly.
For example:
elif [[ "$SECONDS_BEHIND_MASTER" > 60 ]]
and
if [[ "${#ERRORS[@]}" > 0 ]]
Also, there are two different lines found where returning the query for 'Slave_SQL_Running' (because of the additoinal 'Slave_SQL_Running_State' message, so I recommend adding a colon after 'Slave_SQL_Running' grep line (as IT-Juggler recommended), and I opted to add this to all grep queries except the first one (LAST_ERRNO, SECONDS_BEHIND_MASTER, IO_IS_RUNNING, and SQL_IS_RUNNING)
I have also opted to remove the 'mysql: [Warning] Using a password on the command line interface can be insecure.' message that returns when passing a sql password to the commands via the script, by appending 2>&1 | grep -v "Warning: Using a password" after the mysql command.
For example:
LAST_ERRNO=$(/usr/bin/mysql -u user -ppassword -e "SHOW SLAVE STATUS\G" 2>&1 | grep -v "Warning: Using a password" | grep "Last_Errno:" | awk '{ print $2 }')
Hi, this is great. Any chance you could add a license? :)
For the line:
MYSQL='/usr/bin/mysql -u mysql_user -pyourpassword'
You may want to set up the /var/lib/mysql/repl_chk.cnf file with the MySQL root user ID password and make sure that it's only readable by the root and/or mysql user and then use this:
MYSQL='/usr/bin/mysql --defaults-file=/var/lib/repl_chk.cnf'
This way your root password isn't readable in your script. You can also use this for other monitoring scripts as well.
I took a different approach to parsing. I used sed to delete the first row, then remove leading spaces, and replace colon space with colon (to strip the leading space from $value"):
mysql ${connect_options} -e "SHOW SLAVE STATUS \G" | sed '1d;s/^ *//;s/: /:/;' > $slave_status
Then reading through my file I split the keys and values with a colon delimiter.
while read line; do
key=$(echo $line | cut -f1 -d:)
value=$(echo $line | cut -f2 -d:)
case $key in
Slave_*|SQL_*|Last_*)
#Use Indirect Variable Assigment to assign "Value" to all "Key"s
eval "$key=\$value"
;;
esac
done < $slave_status
When the case statement matches a key, then using eval for indirect variable assignment. For example, I end up with variables $Last_IO_Errno, $Slave_IO_Running, etc. If you prefer UPPERCASE variable names, use tr.
key=$(echo $line | cut -f1 -d: | tr [:lower:] [:upper:])
I'm working on adding threshold for cases where Slave is stopped gracefully - so some short maintenance is possible without getting alerted.
ubuntu 16.04 MariaDB-10.2.10 - needed to add the exit in awk to stop after the first line
SQL_IS_RUNNING=$(grep "Slave_SQL_Running" <<< "$STATUS_LINE" | awk '{ print $2; exit }')
Checkout for the login-path
option in MySQL client so you can sleep at night knowing your MySQL root user is not in pure plain-text, as suggested previously here.
- Generate a login-path using
mysql_config_editor
as explained here. - Add the
--login-path
option:MYSQL_CHECK=$(mysql --login-path=login_path_generated -e "SHOW VARIABLES LIKE '%version%';" || echo 1)
in other statements as well.
The file encryption can be cracked, so ensure no one beside root
have access to it.
Nice script.
Forked here if anyone wants to see the revision
Added some extra checks.
Used --login-path=mysql_login
instead of using a mysql -u -p
via the mysql_config_editor
Added SSMTP sending of email instead of something like postfix.
Added AWS S3 backup granted the check goes through ok.
Changed it to utilise the $STATUS_LINE
variable instead of calling multiple times to SHOW SLAVE STATUS
Few other changes as per the comments above.
Rather than running mysql multiple times, wouldnt it be better to run it just once and parse the output.?
also, if you are using pipefail (http://redsymbol.net/articles/unofficial-bash-strict-mode/) you dont specifically need to check if the connection to mysql succeeded: the script will abort if unable to connect to mysql.
https://gist.github.com/roadst4r/2cb42793c0a7f9a7237a#file-mysql_repl_check-sh