When I am on call and an alert fires, I follow these incident management steps:
- Verify the issue
- Triage
- Communicate and escalate if needed
- Mitigate
I hereby claim:
To claim this, I am signing this object:
# Checking for Shellshock attempts in web server logs
grep -E "};|}\s*;" /var/log/apache2/access.log
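A possible follow-up to the check above is to group the matching requests by client, so the noisiest sources stand out. The sketch below is only an illustration: `shellshock_by_ip` is a hypothetical helper name, it reuses the same `}` followed by `;` pattern, and it assumes the client IP is the first whitespace-separated field, as in Apache's common and combined log formats.

```shell
#!/bin/bash
# Hypothetical helper: count Shellshock-looking requests per client IP.
# Assumption: the client IP is the first field of each log line.
shellshock_by_ip() {
    local logfile="$1"
    # Same idea as the one-liner above: a closing brace followed by a semicolon
    grep -E '}[[:space:]]*;' "$logfile" \
        | awk '{print $1}' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

Calling `shellshock_by_ip /var/log/apache2/access.log` would print one line per client, with the hit count first and the most active client at the top.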
# tail log file in browser
# go to server.example.com:777
pip install tailon
tailon -f /var/log/example.log -b 0.0.0.0:777 &
#!/bin/bash
# check if mysql slave is running
log=/var/log/mysqlslave.log
[email protected]
date >> "$log"
res=$(mysql -u root -pPassword -h db.example.com -N -B -e "show status like 'Slave_running'" | cut -f2)
if [ "$res" = 'ON' ]
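The script above is cut off at the `if`, so what it does with the result is not shown. As a rough sketch of one way the decision could be handled, the hypothetical helper below (`report_slave_status` is not part of the original) takes the status value and a log path, writes a timestamped entry, and returns non-zero when the slave is not running so the caller can decide how to alert.

```shell
#!/bin/bash
# Hypothetical helper: log the Slave_running value and signal failure
# through the exit status. The log messages are illustrative only.
report_slave_status() {
    local status="$1" log="$2"
    date >> "$log"
    if [ "$status" = 'ON' ]; then
        echo "slave running" >> "$log"
        return 0
    fi
    # An empty value usually means the query itself failed
    echo "slave NOT running (status: ${status:-empty})" >> "$log"
    return 1
}

# Possible usage, with $res and $log set as in the script above:
#   report_slave_status "$res" "$log" || echo "replication problem, see $log"
```

Keeping the MySQL query separate from the decision makes the check easy to exercise by hand with values such as `ON`, `OFF`, or an empty string.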
# calculate mysql schemas disk size
mysql -h dbhost.example.com -u theuser -pThePassword -e 'select table_schema as DB, round(sum((data_length+index_length)/1024/1024),1) as MB from information_schema.tables group by table_schema order by MB desc'
# WP "Error establishing database connection"
# substitute with site's index static page (when it's up, not with the error duh)
wget http://example.com -O [/path/to/wp]/wp-content/db-error.php
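Since the snapshot should only be taken while the site is healthy, the download could be guarded so that a failed fetch or an error page never overwrites the existing file. The following is a minimal sketch under assumptions: `refresh_db_error_page` is a hypothetical helper, and it assumes the error page contains the text "Error establishing a database connection".

```shell
#!/bin/bash
# Hypothetical helper: refresh db-error.php only from a healthy page.
refresh_db_error_page() {
    local url="$1" target="$2" tmp
    tmp=$(mktemp) || return 1
    # Fetch to a temp file first so a failed download never clobbers the target
    if ! wget -q "$url" -O "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Reject empty downloads and pages that already show the database error
    if [ ! -s "$tmp" ] || grep -qi 'Error establishing a database connection' "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Write through the existing file so its owner and permissions are kept
    cat "$tmp" > "$target"
    rm -f "$tmp"
}
```

It would be called with the site URL and the full path to `wp-content/db-error.php`, for example from a cron job, and it returns non-zero whenever the file was left untouched.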
# see http://mysqltuner.com/
wget https://raw.githubusercontent.com/major/MySQLTuner-perl/master/mysqltuner.pl
perl mysqltuner.pl