Skip to content

Instantly share code, notes, and snippets.

@patrickjcurran
Last active July 1, 2024 15:01
Show Gist options
  • Save patrickjcurran/eaf0e165ff0f0432fde9f9cd852c58f5 to your computer and use it in GitHub Desktop.
Save patrickjcurran/eaf0e165ff0f0432fde9f9cd852c58f5 to your computer and use it in GitHub Desktop.
Vistors, F-Algebras, and Transducers

Vistors, F-Algebras, and Transducers

So Doug made an interesting PR, which uses the visitor pattern described in Haoyi's post.

As I was trying to understand the vistor pattern, I wanted to see how it compared to using F-algebras.

Haoyi describes the benefits of the vistor pattern to include:

  • Composability
  • Not creating intermediate datastructures
  • Streaming

I've found that using F-algebras will give composability without having to create intermediate data structures, but unfortunately there's not a great way to do streaming. (I'm pretty confident that you can write to an output stream, but that you need to start with a full JSON object in memory, so no streaming on the read side.)

This is subjective, but I do think using F-algebras is a bit more elegant, so if you just need the composability, it might be a good choice. In any case it's a neat tool and I thought I'd share since I went through the examples in Haoyi's post and translated them.

Note: I strongly recommend you familiarize yourself with Haoyi's post on the visitor pattern. As all the examples are copied from there, I will simply refer to is as "the post".

Scala Fiddle if you want to go strait to the code.

F is for Functor

The post introduces a "Json-lite" data stucture for the sake of having an example.

abstract class Json
case class Str(value: String) extends Json
case class Num(value: Int) extends Json
case class Dict(pairs: (String, Json)*) extends Json

This data structure is recursive because the Dict values are again Json.

F-algebras can come up when dealing with such recursive data structures, and one of the main things they give us is a generic way to fold over the data structure.

(Side note: Lists are recursive since they are either the empty list or a head and a tail, which is again a list. Lists, of course, have the fold method that you know and love.)

The key insight it to realize that we need to take a step back, and instead of defining Json directly we will define a functor by replacing the Json self reference with a type parameter:

sealed trait JsonF[+T]
case class StrF(value: String) extends JsonF[Nothing]
case class NumF(value: Int) extends JsonF[Nothing]
case class DictF[T](pairs: (String, T)*) extends JsonF[T]

Note that the values of the Dict are now T instead of Json.

(Exercise: define a Functor[JsonF] instance by implementing map.)

I want to emphasize that we don't actually have Json yet. Json is a fixed point of the JsonF functor, and so we'd like to do something like: type Json = JsonF[Json], but the compiler complains "illegal cyclic reference involving type Json".

No worries, we'll define Json later. In the meantime we can actually do quite a bit.

F-Algebras

In the context of what we are doing, an F-algebra is the type of the argument that would be passed to fold. For example, on List[A] the foldRight method takes a B as well as an (A,B) => B. (Note: if you want to combine those into a single argument the type could be Nil + (A,B) => B, where + is some coproducty thing like Either or a sealed trait.)

It's not at all obvious, but that general type that needs to be passed to a fold is:

type Algebra[F[_], A] = F[A] => A

(Note: While the type alias is always defined, it would be in poor taste to use it if F was not a functor.)

Terminating Visitors are JsonF-algebras

In the post, some visitors can be composed with other visitors, while others are "terminating". The terminating visitors are JsonF-algebras:

val summationAlgebra: Algebra[JsonF, Int] = {
 case StrF(x) => 0
 case NumF(n) => n
 case d: DictF[Int] => d.pairs.foldLeft(0)( _ + _._2)
}

private def quote(string: String) = "\"" + string + "\""
val stringifyAlgebra: Algebra[JsonF, String] = {
 case StrF(x) => quote(x)
 case NumF(n) => n.toString
 case d: DictF[String] => s"{${d.pairs.map {case (key, value) => s"${quote(key)}:$value"}.mkString(",")}}"
}

The first sums all the numbers in a Json tree and the second serializes it to a string.

(Note: We still haven't defined Json yet.)

Transducers

Rich Hickey coined the term transducer for "reducing function transformers". (Side note: I think transducers are really cool, but haven't seen the idea much outside of Clojure.) In any case, if you read the post you'll see

;;reducing function signature
whatever, input -> whatever

;;transducer signature
(whatever, input -> whatever) -> (whatever, input -> whatever)

Those "reducing functions" are the kinds of things that one would pass to reduce or fold. For us, those are F-algebras, and so, we can define a transducer as:

type Transducer[F[_], G[_], A, B] = Algebra[F, A] => Algebra[G,B]

(Side Note: I'm letting both the functor (F and G) as well as the carriers (A and B) be different, but I suspect that in practice A == B.)

Composable Visitors are Transducers.

def toIntTreeTransducer[A]: Transducer[JsonF, JsonF, A, A] = alg => {
  case StrF(value) => if (value.forall(_.isDigit))
                        alg(NumF(value.toInt))
                      else
                        alg(StrF(value))
  case x => alg(x)
}

def redactTreeTransducer[A]: Transducer[JsonF, JsonF, A, A] = alg => {
  case d: DictF[A] => alg(DictF(d.pairs.filterNot(_._1 == "hello"): _*))
  case x => alg(x)
}

Since transducers are really just functions they compose with ordinary function composition.

(Note: We still haven't defined Json yet.)

Json: Ad-Hoc Approach

We are going to define Json two ways. The first way is to just add fold to the definition in the post. We know what the signature to fold should be, so the implementation more or less just falls out:

  abstract class Json {
    def fold[B](alg: Algebra[JsonF, B]): B
  }
  case class Str(value: String) extends Json {
    override def fold[B](alg: Algebra[JsonF, B]): B = alg(StrF(value))
  }
  case class Num(value: Int) extends Json {
    override def fold[B](alg: Algebra[JsonF, B]): B = alg(NumF(value))
  }
  case class Dict(pairs: (String, Json)*) extends Json {
    override def fold[B](alg: Algebra[JsonF, B]): B = 
      alg(DictF[B](pairs.map {case (k, v) => k -> v.fold(alg)}: _*))
  }

  val tree: Json = Dict(
    "hello" -> Dict(
      "i am" -> Dict("cow" -> Num(1)),
      "you are" -> Dict("cow" -> Num(2))
    ),
    "world" -> Num(31337),
    "bye" -> Str("314")
  )

  tree.fold(stringifyAlgebra)
  // {"hello":{"i am":{"cow":1},"you are":{"cow":2}},"world":31337,"bye":"314"}
  tree.fold(summationAlgebra)
  // 31340
  tree.fold(redactTreeTransducer(summationAlgebra))
  // 31337
  tree.fold(redactTreeTransducer(stringifyAlgebra))
  // {"world":31337,"bye":"314"}
  tree.fold(redactTreeTransducer(toIntTreeTransducer(summationAlgebra)))
  // 31651

Json: Fixed Point of JsonF

We should be able to get both Json and fold for free based on the definition of JsonF. Json is the fixed point of JsonF, and the fixed point of F is often related to an initial algebra, which means there exists a unique morphism from said initial algebra to every other F-algebra, and fold is part of what makes up that unique morphism. Credit goes to this other post for translating the math to Scala.

We want the fixed point of any F, as well as the generalized fold (called cata below, for catamophism).

final case class Fix[F[_]](unfix: F[Fix[F]])

def cata[F[_]: Functor, A](alg: Algebra[F, A])(e: Fix[F]): A =
  alg(e.unfix.map(cata(alg)))

So then for us we have:

  type Json = Fix[JsonF]
  
  object Json {
    def apply(x: JsonF[Json]): Json = Fix[JsonF](x)
  }

  object Dict {
    def apply(pairs: (String, Json)*): Json = Json(DictF(pairs: _*))
  }

  object Num {
    def apply(value: Int): Json = Json(NumF(value))
  }

  object Str {
    def apply(value: String): Json = Json(StrF(value))
  }
  
  val tree: Json = Dict(
    "hello" -> Dict(
      "i am" -> Dict("cow" -> Num(1)),
      "you are" -> Dict("cow" -> Num(2))
    ),
    "world" -> Num(31337),
    "bye" -> Str("314")
  )

  implicit object JsonFFunctor extends Functor[JsonF] {

    override def map[A, B](fa: JsonF[A])
      (f: A => B): JsonF[B] = fa match {
        case x: StrF => x
        case n: NumF => n
        case d: DictF[A] => DictF(d.pairs.map(x => x._1 -> f(x._2)): _*)
    }
  }
  
  cata(stringifyAlgebra)(tree)
  // {"hello":{"i am":{"cow":1},"you are":{"cow":2}},"world":31337,"bye":"314"}
  cata(summationAlgebra)(tree)
  // 31340
  cata(redactTreeTransducer(summationAlgebra))(tree)
  // 31337
  cata(redactTreeTransducer(stringifyAlgebra))(tree)
  // {"world":31337,"bye":"314"}
  cata(redactTreeTransducer(toIntTreeTransducer(summationAlgebra)))(tree)
  // 31651

One More Algebra

  val initialAlgebra: Algebra[JsonF, Json] = Json.apply

We mentioned the "initial algebra" above. This is the same tree constructing visitor in the post.

  cata(redactTreeTransducer(initialAlgebra))(tree)
  // Fix(DictF(ArrayBuffer((world,Fix(NumF(31337))), (bye,Fix(StrF(314))))))

Or with the Ad-Hoc approach:

val initialAlgebra: Algebra[JsonF, Json] = {
  case StrF(x) => Str(x)
  case NumF(n) => Num(n)
  case d: DictF[Json] => Dict(d.pairs: _*)
}
tree.fold(redactTreeTransducer(initialAlgebra))
// Dict(ArrayBuffer((world,Num(31337)), (bye,Str(314))))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment