Balaji Sivaraman @balajisivaraman_twitter
Hi all, I need some help understanding a piece of Doobie code from the examples. It is the StreamingCopy one: (https://github.com/tpolecat/doobie/blob/series/0.4.x/yax/example/src/main/scala/example/StreamingCopy.scala). I am using a modified version of the fuseMap2 example from that file. Here’s how I’ve modified it for my requirements:
def fuseMap[F[_]: Catchable: Monad, A, B](
source: Process[ConnectionIO, A],
sink: Vector[A] => ConnectionIO[B],
delete: ConnectionIO[Unit]
)(
sourceXA: Transactor[F],
sinkXA: Transactor[F]
): F[Unit] =
sinkXA.exec.apply(
source
.transact(sourceXA)
.chunk(10000)
.translate(λ[F ~> Kleisli[F, Connection, ?]](a => Kleisli(_ => a)))
.evalMap(sink(_).foldMap(sinkXA.interpret))
.append(Process.eval_(delete.foldMap(sinkXA.interpret)))
.run
)
Basically, I needed to use PostgresCopyManager to do Copy, instead of insert as in the example. That’s the reason for the chunking. Also I had to run a delete query as part of the same transaction. You can see the changes for this in the above code. I was able to make modifications purely by looking at the types and functions and using a bit of intuition. It works perfectly. The only problem is I have no idea why.
My knowledge of typed FP is basic. I can grok HKTs, understand why we do Catchable: Monad etc? But type lambdas and kind projector is where my brain becomes mush. I tried splitting up the above piece of code, but that didn’t help much, because I didn’t know how to break down the entire block that is passed to apply.
I'm still having difficulty wrapping my head around the types:
val exec: ~>[Kleisli[F, Connection, ?], F] = sinkXA.exec ->
This is the type of the exec expression. So my understanding is that we need to pass something of type Kleisli[F, Connection, A] to apply or something like that. Is that the type of the expression passed in to apply? For me, it shows the type is F[Unit]. My IntelliJ is also screwing up with the types. I also don't understand this line: translate(λ[F ~> Kleisli[F, Connection, ?]](a => Kleisli(_ => a)))
. Would appreciate help on that?
I read this post (http://underscore.io/blog/posts/2016/12/05/type-lambdas.html) to get a basic understanding of type lambdas, but it's not of much help here.
Fabio Labella @SystemFw
are you familiar with ~>?
Balaji Sivaraman @balajisivaraman
@SystemFw Apart from following the code and being taken to the natural transformation source code, not really. I also read this post (http://eed3si9n.com/learning-scalaz/Natural-Transformation.html), and got a very rudimentary idea, but the last part of that post with the Cat Theory was lost on me.
Fabio Labella @SystemFw
right, let's skip Cat Theory. Are you familiar with Free at all? (in code, not CT)
Balaji Sivaraman @balajisivaraman
Have seen both Rob's presentation (https://www.youtube.com/watch?v=M5MF6M7FHPo) and Runar's (https://www.youtube.com/watch?v=M258zVn4m2M) , but this is my first real experience using a library with those concepts. I guess that's why I'm struggling.
Fabio Labella @SystemFw
so, basically you have an ADT describing the instructions of your language
trait KVS[A]
case class Get(k: String) extends KVS[Option[String]]
case class Put(k: String, v: String) extends KVS[Unit]
note that KVS
has higher kind, i.e. KVS[_]
, like Option
and List
but unlike Int
. Free
represent a description of a program, and it takes two type parameters: Free[Instr[_], A]
.
Instr[_]
is your ADT, it represents the individual instructions in your mini language. It needs to be [_]
so that we can express the fact that each instruction returns a different type (e.g. Option[String]
for Get
, Unit
for Put
).A
represents the return type of your mini program.
So Free[Instr[_], A]
can be seen as a program written in the mini language whose primitives are described by Instr[_]
which, when run, will eventually result in a value of type A
. Are you with me so far?
Balaji Sivaraman @balajisivaraman
Yes. I'm able to follow it.
Fabio Labella @SystemFw
Cool, now you have this data structure that represents a program. The way we run it is by interpreting it into another, more powerful data structure, e.g. IO
, or Task
, which you can look at as a Free
with an unlimited set of instructions.
Once we have this more powerful type, we can keep composing it with other Task
(let's pick Task
for now) and finally, when we're done, we call unsafePerformSync
at the end of the world, and things actually happen. Is this also clear?
Balaji Sivaraman @balajisivaraman
Yeah. Except the part with "unlimited set of instructions", I'm not sure what unlimited means here.
Fabio Labella @SystemFw
it means that in IO
you can do everything since IO
encapsulates arbitrary side effects from the impure subset of Scala,
whereas in your own Free
you can only do Get
or Put
. It's a fairly poor metaphor, apologies.
Balaji Sivaraman @balajisivaraman
Ok. But still limited by what IO
and Task
can actually do right?
Fabio Labella @SystemFw
yes, but IO
and Task
can do anything Scala can do:
Task.delay { // all scala code can fit here, e.g. println()}
Balaji Sivaraman @balajisivaraman
In terms of side-effects? Now I get it.
Fabio Labella @SystemFw
yes, albeit they are not side effects, because Task
and IO
encapsulate them, so nothing is happening until you explicitly run them
Balaji Sivaraman @balajisivaraman
Ah, I'm with you so far.
Fabio Labella @SystemFw
right, so interpreting a Free
means going from a less powerful, but more precise language, to a more powerful but less precise language. Natural Transformations are involved in the process of interpretation: I'm going to try to explain this interpretation in english, then we can move to code.
let's say Instr[_]
is your ADT, and Target[_]
is any Monad
(we will pick Task). Interpreting Free
means:
if you give me a way to translate each instruction in
Instr[_]
into an instruction inTarget[_]
, I will translate each program inFree[Instr, A]
into a program inTarget[A]
does this sentence make sense?
Balaji Sivaraman @balajisivaraman
Okay. Normally (as per the YT presentations and my understanding) when we interpret a Free program, we get something of type A (Get interpreted would give A, Put interpreted would give Unit). Now instead of that, what we're saying is we will interpret it into another target Monad (Task) and then use Task to gain additional benefits.
Fabio Labella @SystemFw
when we interpret a Free program, we get something of type A
No, that's incorrect. A Free monad
is always translated into another Monad
; the other Monad
then does what's needed to produce an A
. The only exception is when some examples use Id
, with side effects:that's wrong, and I don't like there are so many examples of this, even on the cats website
Balaji Sivaraman @balajisivaraman
Okay. Thanks for the clarification. Now I get it. Ah, that's what threw me off. I'll keep it in mind going forward.
Fabio Labella @SystemFw
yeah, but without getting too technical, that sentence in quotes above is actually what being free
means.
Cool, now let's see how the interpretation function looks in code; the most common way to interpret a Free
program (there are others) is through foldMap
:
trait Free[F[_], A] {
//...
def foldMap[G](translator: F ~> G)(implicit gmonad: Monad[G]): G[A] = ???
}
which you should be able to reconcile with the sentence above
Balaji Sivaraman @balajisivaraman
The F here is Instr and the G here is Task.
Fabio Labella @SystemFw
Yes.
if you give me a way to translate each instruction in
Instr[]
into an instruction inTarget[]
that's F ~> G
, I'll explain more about it in a second
I will translate each program in
Free[Instr, A]
that's why foldMap
is on Free
(implicit gmonad: Monad[G])
as long as the target is a monad
Balaji Sivaraman @balajisivaraman
Ok, that makes sense.
Fabio Labella @SystemFw
now the implementation of foldMap
is not important for the end user,what's important is why we need ~>
. This might be a bit tricky to understand, but it's doable.
So what you want is pass to foldMap
something that can translate each instruction in Instr[_]
to something in Task[_]
,
which is normally written as a pattern match. How would you write that? Any ideas?
I'm only interested in the signature, the implementation doesn't really matter
Balaji Sivaraman @balajisivaraman
Just taking a shot, def translator[A](instr: Instr[A): Task[A]
Fabio Labella @SystemFw
That's the most reasonable assumption, but it doesn't work.
translator
takes a type parameter B
(let's rename it to B
to avoid confusing it with the A
in Free
), so if you do foldMap(translator)
you will need to specify the type B
(e.g. foldMap(translator[Int]
).
What type do you assign to B
, instead of the bogus Int
I put there?
Balaji Sivaraman @balajisivaraman
Hmmm, my guess is we don't have one yet. We want it to work for all possible Bs, now and in the future.
Fabio Labella @SystemFw
wow, that's the correct answer! We want to pass in a polymorphic function, so that when we pass translator
, we want to keep its polymorphism, because we will only specialise once we have a specific instruction.
This difference can be made specific in types, but scala doesn't give you a nice notation for it. I'm going to write it in Haskell for a second, but it should be ok:
theTranslatorYouHave :: forall f g b. (f b -> g b) -- f b = Instr[B], g b = Task[B]
theTranslatorYouWant: forall f g. (forall b. f b -> g b)
which basically means:
we know what
F[_]
andG[_]
are when we passtranslator
tofoldMap
(Instr[_]
andTask[_]
respectively), so we can specialise them, but we want to still be polymorphic inB
, because only the body offoldMap
knows how to specialiseB
Does this make any sense? (btw, very impressive that you came up with "Hmmm, my guess is we don't have one yet. We want it to work for all possible Bs, now and in the future.", this is a deep concept that normally takes time to sink in)
So it should be clear I hope that just a normal function can't work as an argument to foldMap
, because we want it to work for all possible B
, now and in the future
Balaji Sivaraman @balajisivaraman
Thank you. Yes, because when we call foldMap, we won't have a specialized B. That will only come in higher up the chain when someone calls us. So when we write it, we want it to work for anything that could be passed in. Hopefully that is right.
Fabio Labella @SystemFw
Yes, it's roughly right. The ability to pass polymorphic functions around, without losing their polymorphism, is called higher-rank polymorphism. Scala does not have such a feature, but we can work our way around it.
Let's create a trait:
trait Translator[F[_], G[_]] {
def apply[B](in: F[B]): G[B]
}
Now we can write:
def translator: Translator[Instr, Task] = new Translator[Instr, Task] {
def apply[B](in: Instr[B]): Task[B] = in match {
case Put(k, v) => Task.delay (DB.put(k, v))
case Get(k) => Task.delay(DB.get(k))
}
and then say ourFree.foldMap(translator)
.
Notice that the type of translator doesn't mention B
, but only Instr
and Task
. The apply method is still polymorphic in every B
, so the body of foldMap
can call it, and it'll work.
Does this make sense?
Balaji Sivaraman @balajisivaraman
Yes. I'm with you so far. Just to clarify, within the body of foldMap, we have an A, which is basically the A of the Free. And this is the A that is applied when we call Translator.apply. Is that right?
Fabio Labella @SystemFw
Not quite. Within the body of foldMap
, we pattern match on Free
, which is a program containing instructions.
So apply
is called on each of the instructions: Get
is translated to Task[Option[String]]
, Put
is translated to Task[Unit]
, and so on.
After doing this translation (which is the map
, in foldMap
), we combine the resulting instructions in the same way they were combined in the original program.
Balaji Sivaraman @balajisivaraman
Okay. That makes a whole lot of sense. Thanks a lot. And the translator in our case is the Natural Transformation.
Fabio Labella @SystemFw
Exactly, A
is the return type of the whole program. B
is the return type of each individual instructions, that's why we need the polymorphism. And, for Category Theory reasons that are (as I hope I've proved) not necessary to understand how Free
works to an usable extent, Translator
is the natural transformation:
trait ~>[F[_], G[_]] {
def apply[A](f: F[A]): G[A]
}
What is there left that confuses you in your original question?
Balaji Sivaraman @balajisivaraman
Nothing. I got it. (I think.) Thanks a lot. And the interpreter for Doobie is the one in transactor: ConnectionOp ~> Kleisli[M, Connection, ?]. This will translate any program in ConnectionIO to a Kleisli program. In my case, it will be a Kleisli[Task, Connection, Unit]. And when I run that, I will get a Task[Unit], which is the final output type.
Fabio Labella @SystemFw
Yes. You need the type lambda because G[_]
is of higher kind in ~>
: you still need to be polymorphic in B
Balaji Sivaraman @balajisivaraman
Ok. So we need that to mean Kleisli[M, Connection, _], substitute whatever type you want for the underscore. I assume this is also because of a limitation of Scala.
Fabio Labella @SystemFw
so Kleisli[M, Connection, Unit]
is wrong, you need type Target[A] = Kleisli[M, Connection, A]
and ConnectionOp ~> Target
(note Target
, not Target[Unit]
). The type lambda is a shorthand for that.
Well, it's more like we put an ?
, instead of an _
, so it's inconsistent (_
means several different things in the scala type system, which is unfortunate), but we can do it: Kleisli[M, Connection, _]
is written Kleisli[M, Connection, ?]
, and there's your type lambda
Balaji Sivaraman @balajisivaraman
Okay. I got it. Just to clarify, based on my reading, that would be the same astype SomeType[A] = Kleisli[M, Connection, A], which falls neatly into the "shape" G[_].
Fabio Labella @SystemFw
Yes, exactly
Balaji Sivaraman @balajisivaraman
Actually, your entire explanation from my question has made a whole lot of sense. Thank you for taking the time to do it. 😀 I now feel I'm in a position to dig deeper into not only the fuseMap function, but also the surface-level of Doobie.
Fabio Labella @SystemFw
Yes, translate
is another variation of this pattern, when you want to go from Free[Lang1, A]
to Free[Lang2, A]
, so basically just changing the instruction set.
ConnectionIO
is Free
. Fs2 Stream
has (a lot of) extra machinery, but the basic idea is the same: we parameterise over the effect F
, and then in the end interpret to an F[A]
Balaji Sivaraman @balajisivaraman
Okay. I got it. Thank you so much.
Fabio Labella @SystemFw
No worries :)
Balaji Sivaraman @balajisivaraman
I will revisit both fuseMap and also poke around Doobie and Scalaz-Stream.
Fabio Labella @SystemFw
Yes, the important thing is to grasp the key concepts:
- A Free monad is always translated to another monad, which does the actual work
- The process of interpretation takes a translation between instructions, and gives a translation between programs
- You need higher-rank polymorphism (the trait trick)
- everything follows by having things fit into the right shape
Zizheng Tai @ZizhengTai
Fabio you are a great teacher.
Balaji Sivaraman @balajisivaraman
Okay. 👍 @ZizhengTai Indeed.
Fabio Labella @SystemFw
@ZizhengTai Thank you! :-) I will put this in a gist for future consultation, I think conversations are a great way to learn
Truly a great community! Thank you for helping us mortals grapple with these concepts 💪