Let's say you have a function getVal like this:
import Data.IORef

getVal :: IO Int
getVal = do
  valRef <- newIORef 2
  modifyIORef valRef (+ 1)
  val <- readIORef valRef
  if val == 3
    then do
      val <- readIORef valRef
      putStrLn $ "A Val: " ++ show val
      return 3
    else do
      val <- readIORef valRef
      putStrLn $ "B Val: " ++ show val
      return 4
and you want to do a simple refactor: moving the code inside the then and else branches out of the if expression. In Haskell you can do this by binding the code in the branches to the names a and b, and then replacing the branch code with a and b.
getVal :: IO Int
getVal = do
  valRef <- newIORef 2
  let a = do
        val <- readIORef valRef
        putStrLn $ "A Val: " ++ show val
        return 3
      b = do
        val <- readIORef valRef
        putStrLn $ "B Val: " ++ show val
        return 4
  modifyIORef valRef (+ 1)
  val <- readIORef valRef
  if val == 3
    then a
    else b
Both the original code and the refactored code have the same result when run from the main function: each prints "A Val: 3" and returns 3.
This refactor is possible because IO is referentially transparent: an IO value only describes an action, so when you bind an expression to a name you can replace every use of that expression with the name without changing what the program does. But doesn't this work in other languages too? Let's try this with Scala.
Here's the translation of the original Haskell code into Scala:
def getVal: Int = {
  var v = 2
  v += 1
  if (v == 3) {
    println("A Val: " + v)
    3
  } else {
    println("B Val: " + v)
    4
  }
}
Now we want to move the branch code out into a and b, similar to what we did in the Haskell code:
def getVal: Int = {
  var v = 2
  val a = {
    println("A Val: " + v)
    3
  }
  val b = {
    println("B Val: " + v)
    4
  }
  v += 1
  if (v == 3) {
    a
  } else {
    b
  }
}
Is this code equivalent to the original version? No, it isn't! The original code prints "A Val: 3" and returns 3, but the refactored code prints "A Val: 2" and "B Val: 2" and returns 3. This is because the side effects run immediately when a and b are declared, whereas with IO only the IO value that is returned and bound inside the main function is run.
So we can see that for an expression v, you cannot write val a = v, substitute a for v everywhere, and be guaranteed to get the same result. That means binding with val isn't referentially transparent when the expression has side effects, and the simple refactor shown in the Haskell code isn't possible with it.
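The substitution rule can also be checked in isolation. The following is a minimal sketch, not part of the original example: Counter and its next() method are hypothetical names introduced only to have an effectful expression whose result depends on how many times it has run.

```scala
object SubstitutionDemo {
  // Hypothetical effectful expression: bumps a counter and returns the new value.
  final class Counter {
    private var n = 0
    def next(): Int = { n += 1; n }
  }

  // The expression is written out at each use site, so it runs twice.
  def inlined(): (Int, Int) = {
    val c = new Counter
    (c.next(), c.next())
  }

  // The same expression is bound once with val and the name is used twice,
  // so it runs only once, at the point of the binding.
  def bound(): (Int, Int) = {
    val c = new Counter
    val a = c.next()
    (a, a)
  }

  def main(args: Array[String]): Unit = {
    println(inlined()) // (1,2)
    println(bound())   // (1,1)
  }
}
```

If val were referentially transparent for this expression, both functions would return the same pair.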
There is a solution to this specific example in Scala without having to use the IO monad. You can replace val with def or lazy val and you will get the same result as the Haskell version:
def getVal: Int = {
  var v = 2
  def a = {
    println("A Val: " + v)
    3
  }
  def b = {
    println("B Val: " + v)
    4
  }
  v += 1
  if (v == 3) {
    a
  } else {
    b
  }
}
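To make the difference between the three kinds of binding concrete, here is a small sketch that records when each body is evaluated. The names eager, onDemand, everyTime, and record are made up for this illustration, and a log list stands in for println so the order is easy to inspect.

```scala
import scala.collection.mutable.ListBuffer

object EvaluationDemo {
  def run(): List[String] = {
    val log = ListBuffer.empty[String]
    def record(msg: String): Unit = { log += msg }

    val eager = { record("val evaluated"); 1 }            // runs here, once
    lazy val onDemand = { record("lazy val evaluated"); 2 } // runs on first use, once
    def everyTime = { record("def evaluated"); 3 }        // runs on every use

    record("definitions done")
    val sum = eager + eager + onDemand + onDemand + everyTime + everyTime
    record("sum = " + sum)
    log.toList
  }

  def main(args: Array[String]): Unit =
    run().foreach(println)
    // val evaluated
    // definitions done
    // lazy val evaluated
    // def evaluated
    // def evaluated
    // sum = 12
}
```

Only the val body runs before "definitions done"; the lazy val and def bodies wait until they are used, and the def body runs again on each use.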
So if def a = v solves this problem, what makes IO better? One problem with def and lazy val is that they do not prevent side effects; they only defer the side effect so that it runs later. Let's look at another example to show this.
In this example we are defining a Person data type with an embedded effectful function printAndGetName, which prints the person's name and returns it. We then make a Person and print it to the console.
import Data.IORef

data Person = Person
  { _name :: String
  , _age :: Int
  , _printAndGetName :: IO String
  }

instance Show Person where
  -- Return the name as-is; show name would wrap it in quotes.
  show (Person name _ _) = name

makePerson :: String -> Int -> Person
makePerson name age = Person
  { _name = name
  , _age = age
  , _printAndGetName = printAndGet
  }
  where
    printAndGet = do
      putStrLn $ "Name: " ++ name
      return name

main :: IO ()
main = putStrLn $ "Person: " ++ show (makePerson "Bob" 30)
This code prints "Person: Bob". Note that makePerson isn't an IO function, but it creates an IO value, printAndGet. In a pure function it is easy to pass around and manipulate IO values without being able to run them accidentally. Now let's look at the Scala version:
object Main {
  case class Person(name: String, age: Int, printAndGetName: String) {
    // Mirror the Haskell Show instance: display only the name.
    override def toString: String = name
  }

  def makePerson(name: String, age: Int): Person = {
    def printAndGet = {
      println("Name: " + name)
      name
    }
    Person(name, age, printAndGet)
  }

  def main(args: Array[String]): Unit = {
    println("Person: " + makePerson("Bob", 30))
  }
}
This code behaves differently from the Haskell version: it prints "Name: Bob" and then "Person: Bob". On the Person(name, age, printAndGet) line, printAndGet is accidentally called, so the "Name: Bob" side effect happens immediately. Because functions and lazy values simply defer the side effect until later, it is easy to run the side effect by accident. In Haskell the only way to get similar behavior is to use unsafePerformIO, which is much easier to catch in a code review than a function call or the evaluation of a lazy value.
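For comparison, here is a rough sketch of how the Person example could look in Scala with an IO-like wrapper. The IO class and its unsafeRun method below are hypothetical, hand-rolled stand-ins written for this illustration; they are not the API of any real library. The point is only that the effect becomes a value that has to be run explicitly.

```scala
object IoSketch {
  // Hypothetical minimal IO: wraps a thunk; nothing runs until unsafeRun() is called.
  final class IO[A](thunk: () => A) {
    def unsafeRun(): A = thunk()
  }

  object IO {
    // By-name parameter, so the body is captured instead of evaluated.
    def apply[A](body: => A): IO[A] = new IO(() => body)
  }

  final case class Person(name: String, age: Int, printAndGetName: IO[String]) {
    override def toString: String = name
  }

  def makePerson(name: String, age: Int): Person =
    Person(name, age, IO { println("Name: " + name); name })

  def main(args: Array[String]): Unit = {
    val bob = makePerson("Bob", 30)
    println("Person: " + bob)                  // prints "Person: Bob" only
    val name = bob.printAndGetName.unsafeRun() // "Name: Bob" is printed here
    println("Returned: " + name)
  }
}
```

With the field typed as IO[String], passing a plain String-returning def such as the earlier printAndGet would no longer type-check, so the accidental call is caught at compile time, and the one place where the effect actually runs is marked by an explicit unsafeRun() call.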
Note: I used Scala code for the side effect examples, but Scala also has multiple implementations of IO, for example https://github.com/typelevel/cats-effect and https://github.com/scalaz/scalaz-zio