There exist several DI frameworks and libraries in the Scala ecosystem. But the more functional code you write, the more you'll realize there's no need to use any of them.
The most commonly claimed benefits are the following:
- Dependency Injection.
- Life cycle management.
- Dependency graph rewriting.
So I'm going to try to demystify each of these assumptions and show you a functional approach instead.
The goal of this document is to explain why the concept of DI in Scala is something that has been brought over from the OOP world and is not needed at all in FP applications, not to discuss how good or bad any particular DI library might be. That is why I won't mention any specific library here.
Before getting started I would like to quote a Q&A from a Stack Overflow thread:
What is the idiomatic Haskell solution for dependency injection?
I think the proper answer here is, and I will probably receive a few downvotes just for saying this: forget the term dependency injection. Just forget it. It's a trendy buzzword from the OO world, but nothing more.
Let's solve the real problem. Keep in mind that you are solving a problem, and that problem is the particular programming task at hand. Don't make your problem "implementing dependency injection".
Most libraries provide a kind of DSL to define how dependencies are created, and perhaps some annotations. For example:
import mylibrary.di._
case class Config(host: Host, port: Port)
trait Repository
@Inject
class PostgreSQLRepository(config: Config) extends Repository
class InMemoryRepository extends Repository
@Inject
class A(repo: Repository)
@Inject
class B(repo: Repository)
@Inject
class C(a: A, b: B)
trait Service
class ProdService extends Service
class TestService extends Service
val repo: Repository = bind[Repository].to[PostgreSQLRepository].singleton
val a: A = bind[A]
val b: B = bind[B]
val c: C = bind[C]
val service: Service = bind[Service].to[ProdService]
So, what exactly does "Dependency Injection" mean? In simple terms: providing an input in order to produce an output. Does that sound familiar?
f: A => B
A pure function from A to B is equivalent to dependency injection in OOP. Given the previous example, we can define it as:
val config: Config = ??? // Eg. Read config with `pureConfig`
val repo: Repository = new PostgreSQLRepository(config)
val a: A = new A(repo)
val b: B = new B(repo)
val c: C = new C(a, b)
val service: Service = new ProdService()
Just plain and basic class argument passing. We can think of constructors as plain functions:
val mkRepository: Config => Repository = config =>
new PostgreSQLRepository(config)
val makeA: Repository => A = repo => new A(repo)
val makeB: Repository => B = repo => new B(repo)
val makeC: A => B => C = a => b => new C(a, b)
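With constructors seen as plain functions, wiring the whole graph is just function application. A minimal runnable sketch, using stand-ins for the article's hypothetical Config, Repository, A, B and C:

```scala
// Minimal stand-ins for the article's hypothetical types
case class Config(host: String, port: Int)
trait Repository
class PostgreSQLRepository(config: Config) extends Repository
class A(repo: Repository)
class B(repo: Repository)
class C(val a: A, val b: B)

val mkRepository: Config => Repository = config => new PostgreSQLRepository(config)
val makeA: Repository => A = repo => new A(repo)
val makeB: Repository => B = repo => new B(repo)
val makeC: A => B => C = a => b => new C(a, b)

// "Injecting" dependencies is just applying functions
val wire: Config => C = config => {
  val repo = mkRepository(config)
  makeC(makeA(repo))(makeB(repo))
}

val c: C = wire(Config("localhost", 5432))
```

No framework, no container: the "dependency graph" is an expression, and the compiler checks it for you.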
Another option we have in Scala is to take advantage of the implicit mechanism. For example:
class A(implicit repo: Repository)
class B(implicit repo: Repository)
class C(a: A, b: B)
implicit val repo: Repository = new PostgreSQLRepository(config)
val a: A = new A() // or explicitly new A(repo)
val b: B = new B()
val c: C = new C(a, b)
In my experience the implicit approach plays well only in certain cases; I'll show one of them in a later section.
Last but not least we have ReaderT, called Kleisli in the Cats ecosystem. It has some benefits, but also the downside of leaking into all your type signatures unless used in an MTL style.
E.g. this:
class A(repo: Repository) {
def foo: Id[String] = repo.getFoo
}
Becomes this:
class A() {
def foo: Kleisli[Id, Repository, String] = Kleisli { repo =>
repo.getFoo
}
}
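To make the shape concrete without pulling in Cats, here is a minimal hand-rolled Reader standing in for Kleisli[Id, Repository, *] (the Repository and getFoo names are the article's hypothetical ones):

```scala
// A tiny Reader: a computation that needs an R to produce an A
case class Reader[R, A](run: R => A) {
  def map[B](f: A => B): Reader[R, B] = Reader(r => f(run(r)))
  def flatMap[B](f: A => Reader[R, B]): Reader[R, B] =
    Reader(r => f(run(r)).run(r))
}

trait Repository { def getFoo: String }

class A {
  // The dependency shows up in the return type instead of the constructor
  def foo: Reader[Repository, String] = Reader(_.getFoo)
}

val repo: Repository = new Repository { def getFoo: String = "foo" }
// The dependency is supplied once, at the edge of the program
val result: String = new A().foo.run(repo)
```

Notice that every method touching the repository now mentions it in its type, which is exactly the "leaking" trade-off described above.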
A few libraries come with this feature, but none of them provide the right abstraction; rather, they are side-effectful.
For example, they provide a few methods to manage all the resources in your application:
val pool: ExecutorService = ???
val db: Repository = ???
val es: ElasticSearch = ???
def onStart() = {
db.connect()
es.connect()
}
def onStop() = {
db.close()
es.shutdown()
pool.shutdown()
}
And this breaks the nice composability you gain when writing applications with fs2 or cats-effect, since you need to extract the value, which means evaluating the effects to access the inner instance.
In Haskell there are pure functional abstractions that exist for this purpose:
- MonadMask (bracket)
- Managed
- Resource
In Scala we have had Stream.bracket present for a long time in the fs2 library, and now we also have Bracket and Resource in the Cats Effect library, which are the right abstractions to manage resources.
NOTE: Bracket is the only primitive; Resource builds on top of Bracket.
For example, this is how you would safely manage a database connection:
import cats.effect._
def myProgram(connection: Connection): IO[Unit] = ???
val releaseDBConnection: Connection => IO[Unit] = conn => IO(conn.close())
IO(db.connect).bracket(myProgram)(releaseDBConnection)
It consists of three parts: acquire, use and release. And it's composable.
Some libraries give you access to a DependencyGraph that contains all the dependencies. For example:
val prodDeps = DependencyGraph.instance()
Now if you want to keep the same dependency graph and just swap the Repository instance for an in-memory implementation (for testing purposes), some of these libraries will allow you to do the following:
val testDeps = prodDeps.bind[Repository].to[InMemoryRepository]
This takes the existing dependency graph and rewrites the instance for Repository, which is quite convenient since you don't need to rebuild your entire dependency graph.
However, in practice I've found that in a big project you might need to rewrite up to five dependencies per test suite, though normally you only need to rewrite one or two. And that's easy to implement, as I show in the next section.
There's no pure abstraction for this feature in FP as far as I know, but there are alternatives.
My default choice is to define all the dependencies in a single place that I usually call Module, plus a Rewritable case class that represents all the dependencies that can be rewritten in the dependency graph.
For this very simple case we might only need to change the Repository and/or the Service instances (for testing purposes), so we can define it as follows:
case class Rewritable(
repo: Option[Repository] = None,
service: Option[Service] = None
)
And a Module with an empty instance of Rewritable by default:
class Module(config: Config)(implicit D: Rewritable = Rewritable()) {
val repo: Repository = D.repo.getOrElse(new PostgreSQLRepository(config))
val a: A = new A(repo)
val b: B = new B(repo)
val c: C = new C(a, b)
val service: Service = D.service.getOrElse(new ProdService())
}
val module: Module = new Module(config)
implicit val deps = Rewritable(
repo = Some(new InMemoryRepository),
service = Some(new TestService)
)
val testModule = new Module(config)
That's it! We were able to change only the two instances needed for testing; the rest of the dependencies remain the same.
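Pulling the pieces of this pattern into one self-contained, runnable sketch (the types are the article's hypothetical ones, modeled minimally here):

```scala
case class Config(host: String, port: Int)

trait Repository
class PostgreSQLRepository(config: Config) extends Repository
class InMemoryRepository extends Repository

trait Service
class ProdService extends Service
class TestService extends Service

case class Rewritable(
  repo: Option[Repository] = None,
  service: Option[Service] = None
)

// The implicit default means callers that don't care get the prod wiring
class Module(config: Config)(implicit D: Rewritable = Rewritable()) {
  val repo: Repository = D.repo.getOrElse(new PostgreSQLRepository(config))
  val service: Service = D.service.getOrElse(new ProdService)
}

val config = Config("localhost", 5432)

val prodModule = new Module(config)

val testModule = {
  // Bringing a Rewritable into implicit scope overrides the defaults
  implicit val deps: Rewritable = Rewritable(
    repo = Some(new InMemoryRepository),
    service = Some(new TestService)
  )
  new Module(config)
}
```

The "graph rewriting" is nothing more than Option.getOrElse plus Scala's rule that an implicit in scope wins over a default argument.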
The examples above are quite simple. In a real-world application you might have hundreds of dependencies, and managing them becomes a non-trivial task.
One of the designs I've been working with successfully is MTL style, or tagless final, in which every component of your application is defined as an abstract algebra.
For example, let's consider an application that needs to manage users and also be able to get the current exchange rate for a given currency:
trait UserRepository[F[_]] {
def find(email: Email): F[Option[User]]
def save(user: User): F[Unit]
}
trait ExchangeRateAlg[F[_]] {
def rate(baseCurrency: Currency, foreignCurrency: Currency): F[Option[Rate]]
}
We will have a simple program that requests a user and, if it exists, requests the current exchange rate:
class MyProgram[F[_]: Monad](repo: UserRepository[F], exchangeRate: ExchangeRateAlg[F]) {
def exchangeRateForUser(email: Email, baseCurrency: Currency, foreignCurrency: Currency): F[Option[Rate]] =
for {
maybeUser <- repo.find(email)
rate <- maybeUser.fold((None: Option[Rate]).pure[F]) { _ =>
exchangeRate.rate(baseCurrency, foreignCurrency)
}
} yield rate
}
We also want to have logging and metrics capabilities in the interpreters:
class PostgreSQLRepository[F[_]: Sync](config: Config)(
implicit L: Log[F],
M: Metrics[F]
) extends UserRepository[F] { ... }
class HttpExchangeRate[F[_]: Async](implicit L: Log[F]) extends ExchangeRateAlg[F] { ... }
Here implicits are the best choice IMO.
And finally we assemble the entire application in a Module with our Rewritable instance:
case class Rewritable[F[_]](
repo: Option[UserRepository[F]] = None,
ex: Option[ExchangeRateAlg[F]] = None
)
class Module[F[_]: Effect](config: Config)(implicit D: Rewritable[F] = Rewritable[F]()) {
implicit val log: Log[F] = ??? // eg. a default instance derived from Sync[F]
implicit val metrics: Metrics[F] = ??? // eg. a default instance derived from Sync[F]
val userRepo: UserRepository[F] = D.repo.getOrElse(new PostgreSQLRepository[F](config))
val exchangeRate: ExchangeRateAlg[F] = D.ex.getOrElse(new HttpExchangeRate[F])
val program = new MyProgram[F](userRepo, exchangeRate)
}
For example, for a test case we might just want to replace the UserRepository
:
implicit val deps = Rewritable(repo = Some(UserInMemoryRepository[F]))
val module = new Module[Id](config)
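Putting the pieces together, here is a self-contained sketch of the tagless-final wiring with F fixed to Id and hand-rolled in-memory interpreters (no Cats dependency; Email, User, Currency and Rate are the article's hypothetical types, modeled minimally):

```scala
type Id[A] = A

case class Email(value: String)
case class User(email: Email)
case class Currency(code: String)
case class Rate(value: BigDecimal)

trait UserRepository[F[_]] {
  def find(email: Email): F[Option[User]]
  def save(user: User): F[Unit]
}
trait ExchangeRateAlg[F[_]] {
  def rate(baseCurrency: Currency, foreignCurrency: Currency): F[Option[Rate]]
}

// Test interpreters with F = Id: plain values, no effects
class UserInMemoryRepository extends UserRepository[Id] {
  private var users = Map(Email("jo@ex.com") -> User(Email("jo@ex.com")))
  def find(email: Email): Id[Option[User]] = users.get(email)
  def save(user: User): Id[Unit] = users += (user.email -> user)
}
class ConstExchangeRate extends ExchangeRateAlg[Id] {
  def rate(base: Currency, foreign: Currency): Id[Option[Rate]] =
    Some(Rate(BigDecimal("1.23")))
}

// With F = Id the program is direct-style; the real version abstracts over Monad[F]
class MyProgram(repo: UserRepository[Id], exchangeRate: ExchangeRateAlg[Id]) {
  def exchangeRateForUser(email: Email, base: Currency, foreign: Currency): Option[Rate] =
    repo.find(email).flatMap(_ => exchangeRate.rate(base, foreign))
}

val program = new MyProgram(new UserInMemoryRepository, new ConstExchangeRate)
val found = program.exchangeRateForUser(Email("jo@ex.com"), Currency("EUR"), Currency("USD"))
val missing = program.exchangeRateForUser(Email("no@ex.com"), Currency("EUR"), Currency("USD"))
```

Swapping the production interpreters for these test ones is, again, just passing different constructor arguments.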
I believe that Functional Programming is the way forward, since one can rely on immutability, referential transparency and great abstractions created by very smart people. In return, you'll benefit from local reasoning and composability, among other things. Now... the choice is completely YOURS!