Created May 24, 2012 18:53
Resque: automatically kill stuck workers and retry failed jobs
#!/bin/bash
# Also see: http://vitobotta.com/resque-automatically-kill-stuck-workers-retry-failed-jobs/

# Flag file: created below when at least one stuck worker is killed in this run.
[[ -f /tmp/retry-failed-resque-jobs ]] && rm /tmp/retry-failed-resque-jobs

ps -eo pid,command |
grep '[r]esque' |
grep "Processing" |
while read -r PID COMMAND; do
  if [[ -d /proc/$PID ]]; then
    # Process age in seconds: system uptime minus the process start time.
    # Field 22 of /proc/PID/stat is the start time in clock ticks; 100 ticks per second is assumed.
    # ELAPSED is used rather than SECONDS, which is a special variable in bash.
    ELAPSED=$(( $(awk -F. '{print $1}' /proc/uptime) - $(awk '{print $22}' "/proc/${PID}/stat") / 100 ))

    if [ "$ELAPSED" -gt 50 ]; then
      kill -9 "$PID"
      touch /tmp/retry-failed-resque-jobs

      # The third word of the command line is the queue name.
      QUEUE=$(echo "$COMMAND" | cut -d ' ' -f 3)

      echo "
The forked child with pid #$PID (queue: $QUEUE) was found stuck for longer than 50 seconds.
It has now been killed and job(s) flagged as failed as a result have been re-enqueued.
You may still want to check the Resque UI and the status of the workers for problems.
" | mail -s "Killed stuck Resque job on $(hostname) PID $PID" "[email protected]"
    fi
  fi
done

# Retry the failed jobs only if something was killed above.
if [[ -f /tmp/retry-failed-resque-jobs ]]; then
  /bin/bash -c 'export rvm_path=/usr/local/rvm && export HOME=/home/deploy && . $rvm_path/scripts/rvm && cd /var/www/sites/dashboard/current/ && /usr/local/bin/rvm rvmrc load && RAILS_ENV=production bundle exec rake resque:retry-failed-jobs'
fi
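
The script above computes a process's age as system uptime minus its start time, dividing by a fixed 100 clock ticks per second. The following is a minimal sketch of the same calculation as a standalone function. It assumes a Linux /proc filesystem, reads the tick rate from `getconf CLK_TCK` instead of hardcoding it, and strips the `(comm)` field first so a process name containing spaces cannot shift the field positions. The name `process_age_seconds` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch: age of a process in whole seconds, read from /proc (Linux only).
# process_age_seconds is a hypothetical helper name, not part of the original script.
process_age_seconds() {
  local pid=$1 ticks stat rest start uptime
  [[ -r /proc/${pid}/stat ]] || return 1         # no such process, or /proc unavailable
  ticks=$(getconf CLK_TCK)                       # clock ticks per second (commonly 100)
  stat=$(< "/proc/${pid}/stat")
  rest=${stat##*) }                              # drop "pid (comm) " so spaces in comm cannot shift fields
  start=$(awk '{print $20}' <<< "$rest")         # field 22 of the full line: start time in ticks since boot
  uptime=$(awk -F. '{print $1}' /proc/uptime)    # whole seconds since boot
  echo $(( uptime - start / ticks ))
}

# Usage: print the age of the current shell.
process_age_seconds "$$"
```

In the loop above, this would stand in for the inline `awk`/`expr` expression, for example `ELAPSED=$(process_age_seconds "$PID")`.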
require "resque/tasks"
require "resque/pool/tasks"

# this task will get called before resque:pool:setup
# and preload the rails environment in the pool manager
task "resque:setup" => :environment do
  # generic worker setup, e.g. Hoptoad for failed jobs
end

task "resque:pool:setup" do
  # close any sockets or files in pool manager
  ActiveRecord::Base.connection.disconnect!

  # and re-open them in the resque worker parent
  Resque::Pool.after_prefork do |job|
    ActiveRecord::Base.establish_connection
  end
end

desc "Retries the failed jobs and clears the current failed jobs queue at the same time"
task "resque:retry-failed-jobs" => :environment do
  # Requeue from the highest index down to 0, then clear the failed queue.
  (Resque::Failure.count - 1).downto(0).each { |i| Resque::Failure.requeue(i) }
  Resque::Failure.clear
end
I've used it and updated it for my needs, without using /proc, so that it works on OS X as well :)
https://gist.github.com/jobwat/5712437
Thanks for sharing!
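
One way to avoid /proc, as the comment above describes, is to read the elapsed time from `ps` itself. The following is a minimal sketch, not the code from the linked gist. It assumes the local `ps` supports the `etime` output keyword, which prints elapsed time as `[[dd-]hh:]mm:ss`, and converts that string to seconds. The name `etime_to_seconds` is hypothetical.

```shell
#!/bin/bash
# Sketch: convert ps "etime" output ([[dd-]hh:]mm:ss) to seconds, without /proc.
# etime_to_seconds is a hypothetical helper, not taken from the linked gist.
etime_to_seconds() {
  local etime=${1//[[:space:]]/}   # ps pads the column with spaces
  local days=0 hours=0 minutes seconds
  local -a parts

  if [[ $etime == *-* ]]; then     # optional "dd-" prefix
    days=${etime%%-*}
    etime=${etime#*-}
  fi

  IFS=: read -r -a parts <<< "$etime"
  case ${#parts[@]} in
    3) hours=${parts[0]}; minutes=${parts[1]}; seconds=${parts[2]} ;;
    2) minutes=${parts[0]}; seconds=${parts[1]} ;;
    *) return 1 ;;
  esac

  # The 10# prefix forces base 10 so values such as "08" are not read as octal.
  echo $(( 10#$days * 86400 + 10#$hours * 3600 + 10#$minutes * 60 + 10#$seconds ))
}

# Usage: elapsed seconds of the current shell.
etime_to_seconds "$(ps -o etime= -p "$$")"
```

The result can be compared against the 50-second threshold in the same way as the /proc-based value.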