The Curious Case of the `IO` Monad

by Bruce Pascoe - 30 April, 2019

Haskell's IO monad has been a tremendous source of confusion and misconceptions over the years. At its core, it exists to solve the problem of How do we properly sequence function calls which may have side effects in an environment that assumes all functions are pure? Unfortunately, it is also many people's first exposure to the concept of monads, leading them to assume monads are all about isolating side effects. This is not the case at all; the problem IO solves is specific to IO alone. It just so happens that that problem is solvable within the monadic framework.

What is a monad?

I suppose, in the general case, this question is a little out of my depth. My grasp on category theory is tenuous at best, and any answer I tried to give would likely be wildly inaccurate. However, I am reasonably sure of what monads provide in a programming context, so let's pretend I asked that instead.

A monad is a specific kind of functor, so let's start there. Intuitively, a functor defines what it means to evaluate functions within a particular problem space; more formally, it maps functions over the problem space. This idea is encapsulated by the function fmap in Haskell. If the problem space is not more complex than "there's a value and I want to apply a function to it", we can still imagine a functor which simply applies the mapped function to the value: the identity functor. However, this fact is not represented in Haskell's type system; it is not possible, for example, to evaluate fmap (+ 3) 2 without producing a typing error. So we can see already that Haskell treats functors as second-class citizens; only if a type has an explicit Functor instance can we use fmap with it.

A monad takes the idea of a functor a bit further. Whereas a standard functor simply allows us to map a function over the contents of the problem space (that is, the values it contains) and transform them, a monad adds two additional capabilities: 1. return, which allows us to rearrange the problem space at will, and 2. >>= (usually referred to as bind or flatMap), which performs a similar purpose to fmap but additionally allows us to create a new monadic "box" in the process. This is necessary whenever an operation may mutate the environment surrounding the values in addition to the values themselves. For example, we may need to change Just 9001 (there is one value inside the box) to Nothing (now the box is empty). In effect, within the monadic framework we can treat the combination of the value(s) and the ancillary data surrounding them as one atomic context, represented in the program by a single value m a. Here again, we can easily imagine return and >>= being the identity function and identity functor respectively, so the entire value space forms a monad (as well as a comonad, but let's not get ahead of ourselves!), but, as before, >>= is only valid when applied to types with an explicit Monad instance. So Haskell treats monads, too, as second-class citizens when there's no mathematical reason to do so.¹

Fantastic effects and where to find them

Here's where things get interesting. Any function mapped with >>= is expected to have the type signature a -> m b, where m is the monadic type in question (representing the "world" the value lives in), a is the type of the value received as input and b is the type of the value(s) produced as output. What this means is that all functions in a monadic chain receive a pure value as input while producing the entire world as output; in fact, the bind operator >>=--as well as the fish operators >=> and <=<--exist precisely to deal with this impedance mismatch! If we have two functions a -> m b and b -> m c, the only reason we can't glue these together in the usual way (i.e. by using .) is because the types don't match up!

Since Haskell functions are (supposed to be) pure, what would a type signature of a -> m b represent? Intuitively, this means the mapped function can't see the existing environment--because that would make the function impure!--it only receives a context-free value as input and must therefore find a way to recreate the entire world out of whole cloth². The upshot of this is that, so long as we remain within the monadic contract, we can indeed produce as many side effects as we want, but we can't actually make any decisions based on prior mutations.

Of course, mathematicians love duality. So besides our familiar monad, there's also a complementary concept called a comonad, which provides us a framework within which to compose functions of type w a -> b--which is to say, each function takes the entire world as input and produces one new context-free value from that. Within this framework, it's impossible to produce any side effects--because, again, this would make the function impure--but we can indeed use as much or as little of the environment as we need to at each step to compute our result.

`IO` is not actually pure

The implications should be clear--that the impure I/O process, with its many interactions with the outside world, naturally forms both a monad and a comonad. As you chain such computations together through the IO scaffolding, functions will not only produce effects on the outside world, but may in turn depend on the effects of previous computations! In fact, this is the only reason to enforce sequencing of I/O calls in the first place!

However, for whatever reason, the designers of Haskell have chosen to give IO only a Monad instance. This implies functions can be chained off of it using >>=; however doing so means that these functions are not actually pure. They implicitly rely on information which is not represented in the monadic function signature a -> IO b!

It didn't have to be this way

We already know Haskell treats functors, monads and comonads as second-class citizens even as the entire value space represents all three. We also know that IO values must first be passed back to the runtime in order to kickstart the computation they represent, meaning that process is naturally asynchronous (like JavaScript's Promise); they can't be invoked directly by Haskell code. So why should IO have to be a second-class Monad if the only goal was to isolate the impure code via the type system? Would it not make more sense, given a value IO a, to allow a to be extracted directly by the code that cares about it (with the IO representing the outside world and the a being the actual value we need), and then we can have functions IO a -> IO b which could be freely composed without the monadic rigamarole?

IO seems to be so close to making I/O actions appear 100% pure (from the perspective of the program); instead it represents itself as an impure monad while simultaneously making composition of I/O operations more difficult. How did this happen?

Truly, mathematics permeates everything. Whether we want it to or not. ↩
There's a reason bind is sometimes called the God function! ↩

fatcerberus/io.md

The Curious Case of the IO Monad

What is a monad?

Fantastic effects and where to find them

IO is not actually pure

It didn't have to be this way

Footnotes

The Curious Case of the `IO` Monad

`IO` is not actually pure