Skip to content

Instantly share code, notes, and snippets.

@richmolj
Last active December 18, 2015 14:08
Show Gist options
  • Save richmolj/5794539 to your computer and use it in GitHub Desktop.
Save richmolj/5794539 to your computer and use it in GitHub Desktop.
Dependency Injection
# Everyone should like some amount of Dependency Injection in Ruby.
# How much you like usually depends on how much importance you place on
# two competing values:
#
# 1. Build extensibly, loosely coupled components so you can properly
# adapt as your application scales. And things that start simple usually
# end up complex, so keep a solid foundation (read: solid OOP foundation
# in these concepts early on).
#
# and
#
# 2. Build the simplest thing possible, because YAGNI.
#
# I tend to agree with Jeff Atwood's perspective:
# http://www.codinghorror.com/blog/2004/10/kiss-and-yagni.html
# And with probably the seminal post on DI in ruby by Jamis Buck:
# http://weblog.jamisbuck.org/2008/11/9/legos-play-doh-and-programming
# And follow-up http://weblog.jamisbuck.org/2007/7/29/net-ssh-revisited
#
# DI isn't without benefits, but I feel proponents tend to turn a blind
# eye to the tradeoffs.
#
# DI is an obtrusive level of indirection
# Example from Objects on Rails:
class Post
def publish(clock = DateTime)
self.pubdate = clock.now
blog.add_entry(self)
end
end
# Set the published date on the post when adding to a blog.
# This is great because if we want to add an alternate strategy
# we just sub-out the injected class with a different one. For instance,
# publish after a review period:
class DelayClock
def now
DateTime.now + 24.hours
end
end
@post.publish(DelayClock.new)
# So that's the upside. The downside is our mental model goes from
# "A post becomes published in 24 hours after it is added" to
# "A blog entry could be a post, or something else. In the event it's
# a post, it will be published some period after saving. Look up the
# injected strategy to figure out when".
#
# This is only going to get more complex as we keep adding classes and
# treat everything as an abstraction. Abstraction == level of indirection
# == added complexity. I recommend looking at Net::SSH 1.1.4 (pointed
# out in the blog post above) for a larger, real-world example
#
# When this is a good thing
class Post
def initialize(backend)
@backend = backend
end
def all
@backend.all
end
end
Post.new(DB::Post).new
Post.new(Service::Post).new
# This is great because we have a subbable backend we can inject
# *and we actually need to use one or the other*. It's the difference
# between "we need to ways of finding Posts" and "we MIGHT need two ways
# of finding posts". Not only that, but the obtrusiveness of DI is actually a
# benefit: the code is screaming, "HEY! This is specially configured to support
# multiple strategies!"
#
# Service Locator Pattern
# I was disapponted Objects on Rails didn't touch on this, because a) it's the
# alternative to DI in acheiving the D in SOLID and b) it's more common in the
# ruby community.
#
# It's also usually less heavy-handed than DI. This post has a great
# example that progresses from DI to Service Locator:
# http://kresimirbojcic.com/2011/11/19/dependency-injection-in-ruby.html
#
# Let's say we want all Net::HTTP requests to have a configurable timeout. We could
# inject this:
class LongTimeout
def self.setup
http = Net::HTTP.new
http.read_timeout = 99999
http
end
end
SomeClass.new(LongTimeout.method(:setup))
# But we have to change the constructor to support this injection, and
# it's a little clunky. We can get the same modularily from
class Configuration
@settings = {}
def self.[](key)
@settings[key]
end
def self.[]=(key, value)
@settings[key] = value
end
end
Configuration.settings[:http] = proc {
http = Net::HTTP.new
http.read_timeout = 99999
http
}
class SomeClass
def query_google
Configuration[:http].get 'http://google.com'
end
end
# Which is basically the same thing as DI but less obtrusive and more commonly seen.
# This will also help when we want to use 3rd-party libraries that aren't expecting
# you to dependency inject everything. Carrierwave suggests overriding store_dir
# within an Uploader:
class MyUploader < CarrierWave::Uploader::Base
def store_dir
Configuration[:store_dir]
end
end
# Instead of subclassing MyUploader, modifying the constructor, and passing it
# in (or using a factory) we just reference Configuration. In addition,
# CarrierWave associates an Uploader with an ActiveRecord class by:
mount_uploader :avatar, AvatarUploader
# We don't have to look at mount_uploader and make sure our custom instance will
# work correctly. We just use Carrierwave like normal.
# This pattern is so popular you'll often see this abstracted in many gems:
# http://robots.thoughtbot.com/post/344833329/mygem-configure-block
#
# Service Locator vs DI
#
# Once again, I use DI when I WANT its obtrusiveness. If I really do have an
# abstracted domain where things are being subbed in and out, I want to call
# attention to this. If I just want a damn timeout or per_page setting, I'll
# use Service Locator.
#
# The Problem With Service Locator
#
# Configuration is a mutable global state, so if we want this value dependent
# on the request, we'll have clashes.
#
# Sidekiq (think multithreaded Resque) has this problem. It starts a bunch of
# workers in the same process. Each worker should be able to write to a common log,
# but it would be useful to include some metadata about the worker
# and about the job it's working on, so we can get something like this:
"worker 1 on job foo : Hello World!"
# It'd be pretty heavy-handed to pass around a custom logger to every class that would
# like to call Sidekiq.logger.info. This is how it's done (altered for brevity):
module Sidekiq
module Logging
class Pretty < Logger::Formatter
def call(message)
"#{Thread.current[:sidekiq_context]} : #{message}"
end
end
def self.with_context(msg)
begin
Thread.current[:sidekiq_context] = msg
yield
ensure
Thread.current[:sidekiq_context] = nil
end
end
end
end
Sidekiq::Logging.with_context("#{worker.class.to_s} JobID #{item['jid']}") do
Sidekiq.logger "Hello world!"
end
# with_context sets a thread local, a value only accessible to the current thread.
# we no longer have global mutable state, but we do have configuration we can reference
# anywhere. Take the same concept and apply to our Carrierwave example:
class ConfigurationMiddleware
def call(env)
begin
Thread.current[:__configuration__] = {:store_dir = something_via_request}
@app.call(env)
ensure
Thread.current[:__configuration__] = nil
end
end
end
class Configuration
def self.[](key)
Thread.current[:__configuration__][key]
end
end
class MyUploader < CarrierWave::Uploader::Base
def store_dir
Configuration[:store_dir]
end
end
# So now we have a Service Locator that *does not have mutable global state*.
# We have created a request local.
#
# Time Zones
#
# Rails takes the Thread.current idea and uses it without either pattern. This
# is a great example of how run-time configuration can be so simple you don't
# even have to think about it. Let's say we want time zones to be specific to
# the user. Straight from the Rails docs:
around_filter :set_time_zone
def set_time_zone
begin
old_time_zone = Time.zone
Time.zone = current_user.time_zone if logged_in?
yield
ensure
Time.zone = old_time_zone
end
end
# Underneath the hood you'll find (altered for simplicity):
def self.zone=(val)
Thread.current[:time_zone] = val
end
def self.zone
Thread.current[:time_zone]
end
# DI is often useful. But I don't usually find that the problem with a piece
# of legacy code is that it's well-written and understandable but tightly coupled.
# I do often find hard-to-navigate levels of indirection and abstraction (often
# without even bringing DI into the mix). That's why my experience has me value
# the KISS principle over dogmatic OOP. While both have their problems, I personally
# have less with the format than the latter, which is why I value it more.
@richmolj
Copy link
Author

@ljsc - I'll try to address this more in-depth, but I think

http://therealadam.com/2013/01/03/design-for-test-vs-design-for-api

sums it my thoughts. I don't care too much about his test-vs-api point, but I do strongly agree with his examples that DI adds complexity (side point: we have different definitions of 'indirection', this illustrates what mine is) and is therefore often not worth it.

I expect you disagree with the same example, which is fine as this is pretty subjective. How strongly you agree/disagree with his point likely has a high correlation with each point above.

@ljsc
Copy link

ljsc commented Jun 18, 2013

Sure, I think that's right. I guess it's all what you optimize for, and my preference is to aim for composability wherever reasonably possible. But I think that post offers up some nice points for enriching the discussion. Namely, what do we think of the descriptions the author offers of his own API examples? Personally I think he botched it, and here's why:

  1. This one is definitely arguable, but I think that he's digging too much into implementation details. Here:

Publishes the current post. The created_at timestamp is set to the current time. Returns the created_at timestamp.

should setting the created_at timestamp be documented here? I'm skeptical. If our goal is to document the api, we shouldn't be describing the guts of the method itself. What happens when we change the implementation? This mindset leads to the abstraction hole-poking I was describing above.
2. Failure to identify and document types (abstractions):

Publishes the current post. By default, created_at is set to the current time. Optionally, callers may pass in a Time object, or any object that returns a Time object when sent the now message. The created_at column is set according to that Time value. Returns the value of the created_at timestamp.

This is the wrong place to be describing the interface for the parameter. That's why you have types. We should just say it takes a Timeish and be done with it. Elsewhere in the documentation, you say what a Timeish does. This might seem like splitting hairs, but I think it's actually a really important distinction.

It speaks to I think the problem with Ruby's dynamic nature---yes it's convenient and can help get things done, but it also tends to lure programmers into the trap that there are no types at all. Even though there aren't any annotations, the types are still there, and not thinking about them leads to ignoring the contracts, and thus abstraction Swiss cheese.

This is normally the point where somebody says, hey duck-types to the rescue. This never really prevents things from running off the tracks though, b/c rather than documenting unambiguously somewhere in the system that a duck #quacks, and it #swims, we instead just throw the duck in the water and see if it drowns. See:

There’s some further potential trouble lurking in this API. What if a caller passes in the wrong kind of Time object? What if sending the now message raises an exception?

When I'm looking at any level of abstraction, I want to know what the inputs are to the thing, what the outputs are, and most importantly the invariants. I suggest that those are the most important items for being able to efficiently reason about code. Having DI makes us think about our dependencies, where they're used, and what we expect from them, explicitly.

Gregory Brown gives a really good look into some of these issues in Practicing Ruby article. It's a really good read, so I'll finish by offering that up and then step back and let some others offer thoughts.

@richmolj
Copy link
Author

@AndrewO - I had a response I accidentally deleted, because the iPhone has taught me to immediately click OK whenever I get any sort of confirmation pop-up. In any case:

The Kitten example is true, but I can also have conflicting middleware that both set the same key on env, or conflicting gem modules that both add the same method name. In any case, gems like Rails, Sunspot, Authlogic, Paper Trail, Sidekiq, etc are all using this same pattern. It's not stopping them from being widely used.

For the Carrierwave example, I feel like we're talking about different things. Yes, I can add middleware/initialization, use the env hash. And also add some fake env setup/handling for when we're not running through Rack (console, cron job, etc). My point was: I just want to use Carrierwave, not reconstruct my app and learn the innards of Carrierwave.

I believe you note this when referring to how 'model' comes out of nowhere. Whether we like Carrierwave's implementation seems irrelevant to me. It's an example of a popular gem we may want to use, and Joe Developer now needs to open up the source code and figure out how to rejigger the system, instead of just referring to Configuration and moving on.

@richmolj
Copy link
Author

I had a side-discussion with Lou, which I think is the best, most concise summary of each viewpoint. Please see https://gist.github.com/richmolj/5815196, I think it really distills the debate to its core concepts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment