Examples to show the benefits of the IO monad

Simple Example 1:

Let's say you have a function getVal like this:

import Data.IORef

getVal :: IO Int
getVal = do
    valRef <- newIORef 2
    modifyIORef valRef (+ 1)
    val <- readIORef valRef
    if val == 3
        then do
            val <- readIORef valRef
            putStrLn $ "A Val: " ++ show val
            return 3
        else do
            val <- readIORef valRef
            putStrLn $ "B Val: " ++ show val
            return 4

and you want to do a simple refactor: move the code in the then and else branches out of the if expression. In Haskell you can do this by binding the branch code to the names a and b with let, and then replacing the code in each branch with a and b.

getVal :: IO Int
getVal = do
    valRef <- newIORef 2
    -- a and b only describe the branch effects; binding them runs nothing.
    let a = do
            val <- readIORef valRef
            putStrLn $ "A Val: " ++ show val
            return 3
        b = do
            val <- readIORef valRef
            putStrLn $ "B Val: " ++ show val
            return 4
    modifyIORef valRef (+ 1)
    val <- readIORef valRef
    if val == 3
        then a
        else b

Both the original code and the refactored code give the same result when run from the main function: they print "A Val: 3" and return 3.

This refactor is possible because IO is referentially transparent: when you bind a name to an expression, you can replace every use of that expression with the name (and vice versa) without changing what the program does. But doesn't this work in other languages too? Let's try this with Scala.
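Before switching languages, here is a smaller Haskell sketch of the same property. The names inlined, named, and bump are made up for this illustration. Both definitions should behave identically, because binding bump with let only names the action; it does not run it.

```haskell
import Data.IORef

-- Inline version: the action is written out at each use site.
inlined :: IO Int
inlined = do
    counter <- newIORef (0 :: Int)
    modifyIORef counter (+ 1)
    modifyIORef counter (+ 1)
    readIORef counter

-- Named version: the same action is bound once and reused.
named :: IO Int
named = do
    counter <- newIORef (0 :: Int)
    let bump = modifyIORef counter (+ 1)  -- a description; nothing runs yet
    bump
    bump
    readIORef counter

main :: IO ()
main = do
    x <- inlined
    y <- named
    print (x, y)  -- prints (2,2)
```

Each use of bump runs the effect once, exactly as if modifyIORef counter (+ 1) had been written out in its place.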

Here's the translation of the original Haskell code in Scala:

def getVal: Int = {
  var v = 2
  v += 1

  if (v == 3) {
    println("A Val: " + v)
    3
  } else {
    println("B Val: " + v)
    4
  }
}

Now we want to move out the branch code into a and b, similar to what we did in the Haskell code:

def getVal: Int = {
  var v = 2
  val a = {
    println("A Val: " + v)
    3
  }
  val b = {
    println("B Val: " + v)
    4
  }
  v += 1

  if (v == 3) {
    a
  } else {
    b
  }
}

Is this code equivalent to the original version? No it isn't! The original code prints out "A Val: 3" and returns 3, but the refactored code prints out "A Val: 2", then "B Val: 2", and returns 3. This is because the side effects run immediately when a and b are declared, while v is still 2. An IO value, by contrast, does nothing when it is defined; it runs only when it is sequenced into the action that main ultimately executes.

So we can see that for an arbitrary expression e, you cannot write val a = e, substitute a for e everywhere, and be guaranteed to get the same result. That means binding with val isn't referentially transparent when e has side effects, and the simple refactor shown in the Haskell code isn't possible with it.

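There is a solution to this specific example in Scala that doesn't require the IO monad. You can replace val with def or lazy val and you will get the same result as the Haskell version (def reruns its body on every use, while lazy val runs it once on first use and caches the result; here each name is used at most once, so either works):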

def getVal: Int = {
  var v = 2
  def a = {
    println("A Val: " + v)
    3
  }
  def b = {
    println("B Val: " + v)
    4
  }
  v += 1

  if (v == 3) {
    a
  } else {
    b
  }
}

So if def solves this problem, what makes IO better? One problem with def and lazy val is that they do not prevent side effects; they just defer the side effect until the name is used. Let's look at another example to show this.

Simple Example 2:

In this example we define a Person data type with an embedded effectful value, printAndGetName, which prints the person's name and returns it when run. We then make a Person and print it to the console.

data Person = Person
    { _name            :: String
    , _age             :: Int
    , _printAndGetName :: IO String
    }

instance Show Person where
    -- Return the name itself; show name would wrap it in quotes ("\"Bob\"").
    show (Person name _ _) = name

makePerson :: String -> Int -> Person
makePerson name age = Person
    { _name            = name
    , _age             = age
    , _printAndGetName = printAndGet
    }
  where
    printAndGet = do
        putStrLn $ "Name: " ++ name
        return name

main :: IO ()
main = putStrLn $ "Person: " ++ show (makePerson "Bob" 30)

This code prints "Person: Bob". Note that makePerson isn't an IO function but it makes an IO value printAndGet. In a pure function, it is easy to pass around and manipulate IO values without being able to accidentally run them. Now let's look at the Scala version:
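To actually run the stored action, it has to be bound inside an IO block. The following is a trimmed-down sketch of the example above (the age field and the Show instance are dropped for brevity, so this Person is a simplified stand-in rather than the exact type):

```haskell
-- Simplified stand-in for the Person type above.
data Person = Person
    { _name            :: String
    , _printAndGetName :: IO String
    }

-- Pure function: it builds the IO value but cannot run it.
makePerson :: String -> Person
makePerson name = Person
    { _name            = name
    , _printAndGetName = putStrLn ("Name: " ++ name) >> return name
    }

main :: IO ()
main = do
    let bob = makePerson "Bob"      -- nothing is printed here
    putStrLn "Person built"
    n1 <- _printAndGetName bob      -- prints "Name: Bob"
    n2 <- _printAndGetName bob      -- prints "Name: Bob" again
    putStrLn (n1 ++ " " ++ n2)      -- prints "Bob Bob"
```

The effect happens only at the two <- lines, and it happens once per line; constructing or inspecting the Person never triggers it.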

object Main {
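  case class Person(name: String, age: Int, printAndGetName: String) {
    // Mirror the Haskell Show instance: display only the name.
    override def toString: String = name
  }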

  def makePerson(name: String, age: Int): Person = {
    def printAndGet = {
      println("Name: " + name)
      name
    }

    Person(name, age, printAndGet)
  }

  def main(args: Array[String]): Unit = {
    println("Person: " + makePerson("Bob", 30))
  }
}

This code behaves differently from the Haskell version: it prints an extra line, "Name: Bob", before the person itself is printed. The field printAndGetName has type String, so in the Person(name, age, printAndGet) line the printAndGet method is called to produce that String, and its println side effect happens immediately. Because methods and lazy values only defer a side effect, it is easy to trigger one by accident just by mentioning its name. In Haskell the only way to get similar behavior is to reach for something like unsafePerformIO, which is much easier to catch in a code review than a function call or the evaluation of a lazy variable.
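For comparison, this is roughly what the Haskell escape hatch looks like. It is a sketch of what not to do; leakyName is a hypothetical helper invented for this illustration. Exactly when, and how often, the hidden print fires depends on evaluation order and compiler optimizations, which is part of why unsafePerformIO is discouraged.

```haskell
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper: the type claims a pure String -> String,
-- but evaluating the result also prints to the console.
leakyName :: String -> String
leakyName name = unsafePerformIO $ do
    putStrLn ("Name: " ++ name)
    return name

main :: IO ()
main = putStrLn ("Person: " ++ leakyName "Bob")
```

The import of System.IO.Unsafe and the call to unsafePerformIO are both easy to search for, whereas the Scala version looks like an ordinary method reference.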

Note: I used Scala code for the side effect examples, but Scala also has multiple implementations of IO: https://github.com/typelevel/cats-effect https://github.com/scalaz/scalaz-zio

DarinM223/referential_transparency_examples.md
