@tehprofessor
Last active December 18, 2015 17:18
Parallelism w/ Use Case and Example
# Let's say you have a collection of users you want to email once a week.
# Now using Sidekiq, it's easy to move this off the main thread, so that you're not blocking
# incoming requests to the Rails application.
#
# Chances are you have a worker that looks like the EmailUsersWorker below.
#
class EmailUsersWorker
  include Sidekiq::Worker

  def perform
    @users = User.all
    @users.each do |user|
      # Each email is sent in turn, so the job's run time grows with the number of users.
      UserMailer.weekly_email(user).deliver
    end
  end
end
# This is a real improvement: the main thread (the Rails application) is now free to keep
# serving requests no matter how long the background job of emailing all your users takes.
# The job itself can still take a very long time, though, because it emails users
# synchronously, one after another.
#
# ... So how can we speed this up? See the next file, parallelized.rb
# Let's break this job into smaller jobs, which can be further parallelized
# (by running multiple workers at the same time). Please see
# By breaking down our job of sending all the users an email into smaller jobs of
# sending an individual user an email, we're better able to parallelize our work and speed things up.
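# A minimal sketch of the "one small job per user" idea, using only the Ruby standard library.
# Everything here is a hypothetical stand-in: the Queue plays the role of the job queue,
# plain threads play the role of workers, and `fan_out` is an illustrative name, not Sidekiq API.

```ruby
# Split the work into one job per user id and let a small pool of workers drain the queue.
def fan_out(user_ids, worker_count: 4)
  jobs = Queue.new
  user_ids.each { |id| jobs << id } # one small job per user
  jobs.close                        # no more jobs; pop returns nil once the queue is drained

  sent = Queue.new                  # thread-safe collection of finished work
  workers = Array.new(worker_count) do
    Thread.new do
      while (id = jobs.pop)
        sleep(0.01)                 # pretend to talk to the mail server
        sent << id
      end
    end
  end
  workers.each(&:join)

  Array.new(sent.size) { sent.pop }.sort
end

p fan_out((1..10).to_a)
```

# Each worker takes the next id as soon as it finishes the previous one, so a single slow
# email delays only one worker rather than the whole batch.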
# Caveats:
# 1.) MRI, the default Ruby interpreter, has threads, but its global interpreter lock lets only one of them
# run Ruby code at a time; Rubinius and JRuby do not have this limit. So you're not truly parallelizing
# tasks unless you're using one of those interpreters. Though, if you read #2 below, you'll see why this
# isn't strictly necessary (if you're familiar with asynchronous I/O).
# 2.) When attempting to parallelize tasks, the easiest and best-suited ones
# are I/O bound, meaning you don't need to spend much time on the CPU; instead, most of the
# time is spent waiting (for network or disk).
# 3.) It's very important not to pass instances around in Sidekiq. Because it's async and, more importantly,
# does not necessarily process jobs immediately, you risk using a stale or invalid instance. So it's best to
# pass an id (in the case of an ActiveRecord instance) and load a fresh instance to work with.
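# Caveats 1 and 2 can be seen in a small timing sketch. It assumes that sending an email is
# dominated by waiting, which is simulated here with `sleep`; `fake_deliver` and the two
# `deliver_*` helpers are hypothetical names, not part of Sidekiq or ActionMailer.

```ruby
require "benchmark"

# Stand-in for a mailer call: the sleep simulates waiting on the network.
def fake_deliver(user_id, latency = 0.05)
  sleep(latency)
  user_id
end

# Deliver one at a time, the way the single big job does.
def deliver_sequentially(user_ids)
  user_ids.map { |id| fake_deliver(id) }
end

# Deliver with one thread per id; the threads overlap while they wait.
def deliver_concurrently(user_ids)
  user_ids.map { |id| Thread.new { fake_deliver(id) } }.map(&:value)
end

user_ids = (1..8).to_a
sequential = Benchmark.realtime { deliver_sequentially(user_ids) }
concurrent = Benchmark.realtime { deliver_concurrently(user_ids) }
puts format("sequential: %.2fs, concurrent: %.2fs", sequential, concurrent)
```

# Even on MRI the concurrent run should finish much sooner, because a sleeping (or waiting)
# thread releases the interpreter lock. A CPU-bound task would not see the same benefit there.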
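class EmailUsersWorker
  include Sidekiq::Worker

  def perform
    @users = User.all
    @users.each do |user|
      # Enqueue one small job per user; perform_async returns without sending anything.
      EmailUserWorker.perform_async(user.id)
    end
  end
end

class EmailUserWorker
  include Sidekiq::Worker

  def perform(user_id)
    # Look the user up by id so the job works with a fresh record.
    @user = User.find(user_id)
    UserMailer.weekly_email(@user).deliver
  end
end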
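# Why pass an id rather than an instance? A rough sketch, using in-memory stand-ins only:
# `FakeUser`, the `table` hash, and `stale_vs_fresh` are hypothetical and are not ActiveRecord
# or Sidekiq API. The copy made with `dup` models a job argument that was captured at enqueue time.

```ruby
FakeUser = Struct.new(:id, :email)

# Enqueue one job carrying a copy of the record and one carrying only the id, change the
# record before either job runs, then report which email address each job ends up using.
def stale_vs_fresh
  table = { 1 => FakeUser.new(1, "old@example.com") } # pretend users table

  jobs = Queue.new
  jobs << [:by_instance, table[1].dup] # snapshot taken when the job was enqueued
  jobs << [:by_id, 1]                  # only the id travels with the job

  # The user updates their address before any worker picks the jobs up.
  table[1].email = "new@example.com"

  seen = {}
  until jobs.empty?
    kind, payload = jobs.pop
    user = kind == :by_id ? table.fetch(payload) : payload
    seen[kind] = user.email
  end
  seen
end

p stale_vs_fresh
```

# The job that carried the instance still sees the old address, while the job that carried
# the id looks the record up when it runs and sees the new one.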
# Lastly, Sidekiq has a built-in mechanism that makes `background.rb` function exactly like `parallelized.rb`.
# Can you figure it out? (It's simpler than you think, but you'll need to check the Sidekiq wiki.)