Explaining some of the mechanics of interpretation of Free programs

Balaji Sivaraman @balajisivaraman_twitter

Hi all, I need some help understanding a piece of Doobie code from the examples. It is the StreamingCopy one: (https://github.com/tpolecat/doobie/blob/series/0.4.x/yax/example/src/main/scala/example/StreamingCopy.scala). I am using a modified version of the fuseMap2 example from that file. Here’s how I’ve modified it for my requirements:

  def fuseMap[F[_]: Catchable: Monad, A, B](
      source: Process[ConnectionIO, A],
      sink: Vector[A] => ConnectionIO[B],
      delete: ConnectionIO[Unit]
  )(
      sourceXA: Transactor[F],
      sinkXA: Transactor[F]
  ): F[Unit] =
    sinkXA.exec.apply(
      source
        .transact(sourceXA)
        .chunk(10000)
        .translate(λ[F ~> Kleisli[F, Connection, ?]](a => Kleisli(_ => a)))
        .evalMap(sink(_).foldMap(sinkXA.interpret))
        .append(Process.eval_(delete.foldMap(sinkXA.interpret)))
        .run
    )

Basically, I needed to use PostgresCopyManager to do Copy, instead of insert as in the example. That’s the reason for the chunking. Also I had to run a delete query as part of the same transaction. You can see the changes for this in the above code. I was able to make modifications purely by looking at the types and functions and using a bit of intuition. It works perfectly. The only problem is I have no idea why.

My knowledge of typed FP is basic. I can grok HKTs, understand why we do Catchable: Monad etc? But type lambdas and kind projector is where my brain becomes mush. I tried splitting up the above piece of code, but that didn’t help much, because I didn’t know how to break down the entire block that is passed to apply.

I'm still having difficulty wrapping my head around the types:

val exec: ~>[Kleisli[F, Connection, ?], F] = sinkXA.exec ->

This is the type of the exec expression. So my understanding is that we need to pass something of type Kleisli[F, Connection, A] to apply or something like that. Is that the type of the expression passed in to apply? For me, it shows the type is F[Unit]. My IntelliJ is also screwing up with the types. I also don't understand this line: translate(λ[F ~> Kleisli[F, Connection, ?]](a => Kleisli(_ => a))). Would appreciate help on that?

I read this post (http://underscore.io/blog/posts/2016/12/05/type-lambdas.html) to get a basic understanding of type lambdas, but it's not of much help here.

Fabio Labella @SystemFw

are you familiar with ~>?

Balaji Sivaraman @balajisivaraman

@SystemFw Apart from following the code and being taken to the natural transformation source code, not really. I also read this post (http://eed3si9n.com/learning-scalaz/Natural-Transformation.html), and got a very rudimentary idea, but the last part of that post with the Cat Theory was lost on me.

Fabio Labella @SystemFw

right, let's skip Cat Theory. Are you familiar with Free at all? (in code, not CT)

Balaji Sivaraman @balajisivaraman

Have seen both Rob's presentation (https://www.youtube.com/watch?v=M5MF6M7FHPo) and Runar's (https://www.youtube.com/watch?v=M258zVn4m2M) , but this is my first real experience using a library with those concepts. I guess that's why I'm struggling.

Fabio Labella @SystemFw

so, basically you have an ADT describing the instructions of your language

trait KVS[A]
case class Get(k: String) extends KVS[Option[String]]
case class Put(k: String, v: String) extends KVS[Unit]

note that KVS has higher kind, i.e. KVS[_], like Option and List but unlike Int. Free represent a description of a program, and it takes two type parameters: Free[Instr[_], A].

Instr[_] is your ADT, it represents the individual instructions in your mini language. It needs to be [_] so that we can express the fact that each instruction returns a different type (e.g. Option[String] for Get, Unit for Put).A represents the return type of your mini program.

So Free[Instr[_], A] can be seen as a program written in the mini language whose primitives are described by Instr[_] which, when run, will eventually result in a value of type A. Are you with me so far?

Balaji Sivaraman @balajisivaraman

Yes. I'm able to follow it.

Fabio Labella @SystemFw

Cool, now you have this data structure that represents a program. The way we run it is by interpreting it into another, more powerful data structure, e.g. IO, or Task, which you can look at as a Free with an unlimited set of instructions.

Once we have this more powerful type, we can keep composing it with other Task (let's pick Task for now) and finally, when we're done, we call unsafePerformSync at the end of the world, and things actually happen. Is this also clear?

Balaji Sivaraman @balajisivaraman

Yeah. Except the part with "unlimited set of instructions", I'm not sure what unlimited means here.

Fabio Labella @SystemFw

it means that in IO you can do everything since IO encapsulates arbitrary side effects from the impure subset of Scala, whereas in your own Free you can only do Get or Put. It's a fairly poor metaphor, apologies.

Balaji Sivaraman @balajisivaraman

Ok. But still limited by what IO and Task can actually do right?

Fabio Labella @SystemFw

yes, but IO and Task can do anything Scala can do:

Task.delay { // all scala code can fit here, e.g. println()}

Balaji Sivaraman @balajisivaraman

In terms of side-effects? Now I get it.

Fabio Labella @SystemFw

yes, albeit they are not side effects, because Task and IO encapsulate them, so nothing is happening until you explicitly run them

Balaji Sivaraman @balajisivaraman

Ah, I'm with you so far.

Fabio Labella @SystemFw

right, so interpreting a Free means going from a less powerful, but more precise language, to a more powerful but less precise language. Natural Transformations are involved in the process of interpretation: I'm going to try to explain this interpretation in english, then we can move to code.

let's say Instr[_] is your ADT, and Target[_] is any Monad (we will pick Task). Interpreting Free means:

if you give me a way to translate each instruction in Instr[_] into an instruction in Target[_], I will translate each program in Free[Instr, A] into a program in Target[A]

does this sentence make sense?

Balaji Sivaraman @balajisivaraman

Okay. Normally (as per the YT presentations and my understanding) when we interpret a Free program, we get something of type A (Get interpreted would give A, Put interpreted would give Unit). Now instead of that, what we're saying is we will interpret it into another target Monad (Task) and then use Task to gain additional benefits.

Fabio Labella @SystemFw

when we interpret a Free program, we get something of type A

No, that's incorrect. A Free monad is always translated into another Monad; the other Monad then does what's needed to produce an A. The only exception is when some examples use Id, with side effects:that's wrong, and I don't like there are so many examples of this, even on the cats website

Balaji Sivaraman @balajisivaraman

Okay. Thanks for the clarification. Now I get it. Ah, that's what threw me off. I'll keep it in mind going forward.

Fabio Labella @SystemFw

yeah, but without getting too technical, that sentence in quotes above is actually what being free means.

Cool, now let's see how the interpretation function looks in code; the most common way to interpret a Free program (there are others) is through foldMap:

 trait Free[F[_], A] {
 //...
 def foldMap[G](translator: F ~> G)(implicit gmonad: Monad[G]): G[A] = ???
}

which you should be able to reconcile with the sentence above

Balaji Sivaraman @balajisivaraman

The F here is Instr and the G here is Task.

Fabio Labella @SystemFw

Yes.

if you give me a way to translate each instruction in Instr[] into an instruction in Target[]

that's F ~> G, I'll explain more about it in a second

I will translate each program in Free[Instr, A]

that's why foldMap is on Free

(implicit gmonad: Monad[G])

as long as the target is a monad

Balaji Sivaraman @balajisivaraman

Ok, that makes sense.

Fabio Labella @SystemFw

now the implementation of foldMap is not important for the end user,what's important is why we need ~>. This might be a bit tricky to understand, but it's doable.

So what you want is pass to foldMap something that can translate each instruction in Instr[_] to something in Task[_], which is normally written as a pattern match. How would you write that? Any ideas? I'm only interested in the signature, the implementation doesn't really matter

Balaji Sivaraman @balajisivaraman

Just taking a shot, def translator[A](instr: Instr[A): Task[A]

Fabio Labella @SystemFw

That's the most reasonable assumption, but it doesn't work. translator takes a type parameter B (let's rename it to B to avoid confusing it with the A in Free), so if you do foldMap(translator) you will need to specify the type B (e.g. foldMap(translator[Int]). What type do you assign to B, instead of the bogus Int I put there?

Balaji Sivaraman @balajisivaraman

Hmmm, my guess is we don't have one yet. We want it to work for all possible Bs, now and in the future.

Fabio Labella @SystemFw

wow, that's the correct answer! We want to pass in a polymorphic function, so that when we pass translator, we want to keep its polymorphism, because we will only specialise once we have a specific instruction.

This difference can be made specific in types, but scala doesn't give you a nice notation for it. I'm going to write it in Haskell for a second, but it should be ok:

theTranslatorYouHave :: forall f g b. (f b -> g b) -- f b = Instr[B], g b = Task[B]

theTranslatorYouWant: forall f g. (forall b. f b -> g b)

which basically means:

we know what F[_] and G[_] are when we pass translator to foldMap (Instr[_] and Task[_] respectively), so we can specialise them, but we want to still be polymorphic in B, because only the body of foldMap knows how to specialise B

Does this make any sense? (btw, very impressive that you came up with "Hmmm, my guess is we don't have one yet. We want it to work for all possible Bs, now and in the future.", this is a deep concept that normally takes time to sink in)

So it should be clear I hope that just a normal function can't work as an argument to foldMap, because we want it to work for all possible B, now and in the future

Balaji Sivaraman @balajisivaraman

Thank you. Yes, because when we call foldMap, we won't have a specialized B. That will only come in higher up the chain when someone calls us. So when we write it, we want it to work for anything that could be passed in. Hopefully that is right.

Fabio Labella @SystemFw

Yes, it's roughly right. The ability to pass polymorphic functions around, without losing their polymorphism, is called higher-rank polymorphism. Scala does not have such a feature, but we can work our way around it.

Let's create a trait:

 trait Translator[F[_], G[_]] {
  def apply[B](in: F[B]): G[B]
}

Now we can write:

def translator: Translator[Instr, Task] = new Translator[Instr, Task] {
  def apply[B](in: Instr[B]): Task[B] = in match {
      case Put(k, v) => Task.delay (DB.put(k, v))
      case Get(k) => Task.delay(DB.get(k))
}

and then say ourFree.foldMap(translator).

Notice that the type of translator doesn't mention B, but only Instr and Task. The apply method is still polymorphic in every B, so the body of foldMap can call it, and it'll work.

Does this make sense?

Balaji Sivaraman @balajisivaraman

Yes. I'm with you so far. Just to clarify, within the body of foldMap, we have an A, which is basically the A of the Free. And this is the A that is applied when we call Translator.apply. Is that right?

Fabio Labella @SystemFw

Not quite. Within the body of foldMap, we pattern match on Free, which is a program containing instructions. So apply is called on each of the instructions: Get is translated to Task[Option[String]], Put is translated to Task[Unit], and so on.

After doing this translation (which is the map, in foldMap), we combine the resulting instructions in the same way they were combined in the original program.

Balaji Sivaraman @balajisivaraman

Okay. That makes a whole lot of sense. Thanks a lot. And the translator in our case is the Natural Transformation.

Fabio Labella @SystemFw

Exactly, A is the return type of the whole program. B is the return type of each individual instructions, that's why we need the polymorphism. And, for Category Theory reasons that are (as I hope I've proved) not necessary to understand how Free works to an usable extent, Translator is the natural transformation:

trait ~>[F[_], G[_]] {
  def apply[A](f: F[A]): G[A]
}

What is there left that confuses you in your original question?

Balaji Sivaraman @balajisivaraman

Nothing. I got it. (I think.) Thanks a lot. And the interpreter for Doobie is the one in transactor: ConnectionOp ~> Kleisli[M, Connection, ?]. This will translate any program in ConnectionIO to a Kleisli program. In my case, it will be a Kleisli[Task, Connection, Unit]. And when I run that, I will get a Task[Unit], which is the final output type.

Fabio Labella @SystemFw

Yes. You need the type lambda because G[_] is of higher kind in ~>: you still need to be polymorphic in B

Balaji Sivaraman @balajisivaraman

Ok. So we need that to mean Kleisli[M, Connection, _], substitute whatever type you want for the underscore. I assume this is also because of a limitation of Scala.

Fabio Labella @SystemFw

so Kleisli[M, Connection, Unit] is wrong, you need type Target[A] = Kleisli[M, Connection, A] and ConnectionOp ~> Target (note Target, not Target[Unit]). The type lambda is a shorthand for that.

Well, it's more like we put an ?, instead of an _, so it's inconsistent (_ means several different things in the scala type system, which is unfortunate), but we can do it: Kleisli[M, Connection, _] is written Kleisli[M, Connection, ?], and there's your type lambda

Balaji Sivaraman @balajisivaraman

Okay. I got it. Just to clarify, based on my reading, that would be the same astype SomeType[A] = Kleisli[M, Connection, A], which falls neatly into the "shape" G[_].

Fabio Labella @SystemFw

Yes, exactly

Balaji Sivaraman @balajisivaraman

Actually, your entire explanation from my question has made a whole lot of sense. Thank you for taking the time to do it. 😀 I now feel I'm in a position to dig deeper into not only the fuseMap function, but also the surface-level of Doobie.

Fabio Labella @SystemFw

Yes, translate is another variation of this pattern, when you want to go from Free[Lang1, A] to Free[Lang2, A], so basically just changing the instruction set.

ConnectionIO is Free. Fs2 Stream has (a lot of) extra machinery, but the basic idea is the same: we parameterise over the effect F, and then in the end interpret to an F[A]

Balaji Sivaraman @balajisivaraman

Okay. I got it. Thank you so much.

Fabio Labella @SystemFw

No worries :)

Balaji Sivaraman @balajisivaraman

I will revisit both fuseMap and also poke around Doobie and Scalaz-Stream.

Fabio Labella @SystemFw

Yes, the important thing is to grasp the key concepts:

A Free monad is always translated to another monad, which does the actual work
The process of interpretation takes a translation between instructions, and gives a translation between programs
You need higher-rank polymorphism (the trait trick)
everything follows by having things fit into the right shape

Zizheng Tai @ZizhengTai

Fabio you are a great teacher.

Balaji Sivaraman @balajisivaraman

Okay. 👍 @ZizhengTai Indeed.

Fabio Labella @SystemFw

@ZizhengTai Thank you! :-) I will put this in a gist for future consultation, I think conversations are a great way to learn

SystemFw/Free conversation.md

karlroberts commented Jul 12, 2018

Yaneeve commented Nov 29, 2018

balanka commented Jun 9, 2020

nmcb commented Sep 25, 2020

bphenriques commented Oct 17, 2023