So Doug made an interesting PR, which uses the visitor pattern described in Haoyi's post.
As I was trying to understand the vistor pattern, I wanted to see how it compared to using F-algebras.
Haoyi describes the benefits of the vistor pattern to include:
- Composability
- Not creating intermediate datastructures
- Streaming
I've found that using F-algebras will give composability without having to create intermediate data structures, but unfortunately there's not a great way to do streaming. (I'm pretty confident that you can write to an output stream, but that you need to start with a full JSON object in memory, so no streaming on the read side.)
This is subjective, but I do think using F-algebras is a bit more elegant, so if you just need the composability, it might be a good choice. In any case it's a neat tool and I thought I'd share since I went through the examples in Haoyi's post and translated them.
Note: I strongly recommend you familiarize yourself with Haoyi's post on the visitor pattern. As all the examples are copied from there, I will simply refer to is as "the post".
Scala Fiddle if you want to go strait to the code.
The post introduces a "Json-lite" data stucture for the sake of having an example.
abstract class Json
case class Str(value: String) extends Json
case class Num(value: Int) extends Json
case class Dict(pairs: (String, Json)*) extends Json
This data structure is recursive because the Dict
values are again Json
.
F-algebras can come up when dealing with such recursive data structures, and one of the main things they give us is a
generic way to fold
over the data structure.
(Side note: Lists are recursive since they are either the empty list or a head and a tail, which is again a list. Lists, of course, have the fold method that you know and love.)
The key insight it to realize that we need to take a step back, and instead of defining
Json
directly we will define a functor by replacing the Json
self reference with a type parameter:
sealed trait JsonF[+T]
case class StrF(value: String) extends JsonF[Nothing]
case class NumF(value: Int) extends JsonF[Nothing]
case class DictF[T](pairs: (String, T)*) extends JsonF[T]
Note that the values of the Dict
are now T
instead of Json
.
(Exercise: define a Functor[JsonF]
instance by implementing map
.)
I want to emphasize that we don't actually have Json
yet. Json
is a fixed point of the JsonF
functor, and so
we'd like to do something like: type Json = JsonF[Json]
, but the compiler complains "illegal cyclic reference involving type Json
".
No worries, we'll define Json
later. In the meantime we can actually do quite a bit.
In the context of what we are doing, an F-algebra is the type of the argument that would be passed to fold
. For example,
on List[A]
the foldRight
method takes a B
as well as an (A,B) => B
. (Note: if you want to combine those into a single
argument the type could be Nil + (A,B) => B
, where +
is some coproducty thing like Either
or a sealed trait.)
It's not at all obvious, but that general type that needs to be passed to a fold
is:
type Algebra[F[_], A] = F[A] => A
(Note: While the type alias is always defined, it would be in poor taste to use it if F
was not a functor.)
In the post, some visitors can be composed with other visitors, while others are "terminating". The terminating visitors are JsonF-algebras:
val summationAlgebra: Algebra[JsonF, Int] = {
case StrF(x) => 0
case NumF(n) => n
case d: DictF[Int] => d.pairs.foldLeft(0)( _ + _._2)
}
private def quote(string: String) = "\"" + string + "\""
val stringifyAlgebra: Algebra[JsonF, String] = {
case StrF(x) => quote(x)
case NumF(n) => n.toString
case d: DictF[String] => s"{${d.pairs.map {case (key, value) => s"${quote(key)}:$value"}.mkString(",")}}"
}
The first sums all the numbers in a Json
tree and the second serializes it to a string.
(Note: We still haven't defined Json
yet.)
Rich Hickey coined the term transducer for "reducing function transformers". (Side note: I think transducers are really cool, but haven't seen the idea much outside of Clojure.) In any case, if you read the post you'll see
;;reducing function signature
whatever, input -> whatever
;;transducer signature
(whatever, input -> whatever) -> (whatever, input -> whatever)
Those "reducing functions" are the kinds of things that one would pass to reduce
or fold
. For us, those are F-algebras,
and so, we can define a transducer as:
type Transducer[F[_], G[_], A, B] = Algebra[F, A] => Algebra[G,B]
(Side Note: I'm letting both the functor (F
and G
) as well as the carriers (A
and B
) be different, but I suspect that in practice A == B
.)
def toIntTreeTransducer[A]: Transducer[JsonF, JsonF, A, A] = alg => {
case StrF(value) => if (value.forall(_.isDigit))
alg(NumF(value.toInt))
else
alg(StrF(value))
case x => alg(x)
}
def redactTreeTransducer[A]: Transducer[JsonF, JsonF, A, A] = alg => {
case d: DictF[A] => alg(DictF(d.pairs.filterNot(_._1 == "hello"): _*))
case x => alg(x)
}
Since transducers are really just functions they compose with ordinary function composition.
(Note: We still haven't defined Json
yet.)
We are going to define Json
two ways. The first way is to just add fold
to the definition in the post. We know what the signature to fold
should be, so the implementation more or less just falls out:
abstract class Json {
def fold[B](alg: Algebra[JsonF, B]): B
}
case class Str(value: String) extends Json {
override def fold[B](alg: Algebra[JsonF, B]): B = alg(StrF(value))
}
case class Num(value: Int) extends Json {
override def fold[B](alg: Algebra[JsonF, B]): B = alg(NumF(value))
}
case class Dict(pairs: (String, Json)*) extends Json {
override def fold[B](alg: Algebra[JsonF, B]): B =
alg(DictF[B](pairs.map {case (k, v) => k -> v.fold(alg)}: _*))
}
val tree: Json = Dict(
"hello" -> Dict(
"i am" -> Dict("cow" -> Num(1)),
"you are" -> Dict("cow" -> Num(2))
),
"world" -> Num(31337),
"bye" -> Str("314")
)
tree.fold(stringifyAlgebra)
// {"hello":{"i am":{"cow":1},"you are":{"cow":2}},"world":31337,"bye":"314"}
tree.fold(summationAlgebra)
// 31340
tree.fold(redactTreeTransducer(summationAlgebra))
// 31337
tree.fold(redactTreeTransducer(stringifyAlgebra))
// {"world":31337,"bye":"314"}
tree.fold(redactTreeTransducer(toIntTreeTransducer(summationAlgebra)))
// 31651
We should be able to get both Json
and fold
for free based on the definition of JsonF
. Json
is the fixed point
of JsonF
, and the fixed point of F is often related to an initial algebra,
which means there exists a unique morphism from said initial algebra to every other F-algebra, and fold
is part
of what makes up that unique morphism. Credit goes to this other post for translating the math to Scala.
We want the fixed point of any F, as well as the generalized fold (called cata
below, for catamophism).
final case class Fix[F[_]](unfix: F[Fix[F]])
def cata[F[_]: Functor, A](alg: Algebra[F, A])(e: Fix[F]): A =
alg(e.unfix.map(cata(alg)))
So then for us we have:
type Json = Fix[JsonF]
object Json {
def apply(x: JsonF[Json]): Json = Fix[JsonF](x)
}
object Dict {
def apply(pairs: (String, Json)*): Json = Json(DictF(pairs: _*))
}
object Num {
def apply(value: Int): Json = Json(NumF(value))
}
object Str {
def apply(value: String): Json = Json(StrF(value))
}
val tree: Json = Dict(
"hello" -> Dict(
"i am" -> Dict("cow" -> Num(1)),
"you are" -> Dict("cow" -> Num(2))
),
"world" -> Num(31337),
"bye" -> Str("314")
)
implicit object JsonFFunctor extends Functor[JsonF] {
override def map[A, B](fa: JsonF[A])
(f: A => B): JsonF[B] = fa match {
case x: StrF => x
case n: NumF => n
case d: DictF[A] => DictF(d.pairs.map(x => x._1 -> f(x._2)): _*)
}
}
cata(stringifyAlgebra)(tree)
// {"hello":{"i am":{"cow":1},"you are":{"cow":2}},"world":31337,"bye":"314"}
cata(summationAlgebra)(tree)
// 31340
cata(redactTreeTransducer(summationAlgebra))(tree)
// 31337
cata(redactTreeTransducer(stringifyAlgebra))(tree)
// {"world":31337,"bye":"314"}
cata(redactTreeTransducer(toIntTreeTransducer(summationAlgebra)))(tree)
// 31651
val initialAlgebra: Algebra[JsonF, Json] = Json.apply
We mentioned the "initial algebra" above. This is the same tree constructing visitor in the post.
cata(redactTreeTransducer(initialAlgebra))(tree)
// Fix(DictF(ArrayBuffer((world,Fix(NumF(31337))), (bye,Fix(StrF(314))))))
Or with the Ad-Hoc approach:
val initialAlgebra: Algebra[JsonF, Json] = {
case StrF(x) => Str(x)
case NumF(n) => Num(n)
case d: DictF[Json] => Dict(d.pairs: _*)
}
tree.fold(redactTreeTransducer(initialAlgebra))
// Dict(ArrayBuffer((world,Num(31337)), (bye,Str(314))))