@ClaireNeveu
Last active August 29, 2015 14:11
Type-Safe Resolution

Instead of simple case classes we make use of a system of Extensible Virtual Models to describe our models. This has the practical advantage of making these models run-time extensible while maintaining type-safety. This document serves as a brief introduction to this system and explains how it relates to normal case classes.

Consider this simplified version of a Post and a function which operates on it:

case class Post(id : Id[Post], authorId : Id[Author], blogId : Id[Blog], title : String, body : Html)

def escapeBody(post : Post) : Post = ...

The model contains the Post's id, title, and body as well as foreign keys to its author and blog. This serves us well for many cases but sometimes we want to access properties of the author or blog of this post. Because retrieving these models requires API calls to the relevant service, we'd like to pass these models around with the post so we only retrieve them once.

To this end we want to add author : Author and blog : Blog fields to this model. However, we don't always want to load these models since doing so is expensive. To accomplish this goal we'll convert Post to an Extensible Virtual Model.

Each field (or collection of fields) in our model has an interface and a concrete implementation. For the most part you don't have to worry about how the concrete implementation is created, just which interfaces to use. To translate our example above we'll need the IsPost interface which exposes the fields in our original Post model. escapeBody now looks like this:

def escapeBody[Post : IsPost](post : Post) : Post = ...

The Post here is now a type variable local to escapeBody. We could have equivalently written def escapeBody[A : IsPost](post : A) : A but we use the Post name for clarity.

Now suppose we want to write a function authorsName which gets the name of the author of the post. We can achieve this with the HasAuthor interface which exposes the author property.

def authorsName[Post : IsPost : HasAuthor](post : Post) : String = post.author.name

The interface HasBlog allows us to access the blog property.

case class Author(id: Long)
case class Blog(id: Long)
case class Post(title: String, body: String, authorId: Long, defaultBlogId: Long)
object Post {
  implicit object PostLikePost extends PostLike[Post] {
    def title(p: Post): String = p.title
    def body(p: Post): String = p.body
    def authorId(p: Post): Long = p.authorId
    def defaultBlogId(p: Post): Long = p.defaultBlogId
  }
}
trait Extension[A] {
  def underlying: A
}
trait PostLike[-P] {
  def title(p: P): String
  def body(p: P): String
  def authorId(p: P): Long
  def defaultBlogId(p: P): Long
}
object PostLike {
  implicit class PostLikeOps[P](p: P) {
    def title(implicit ev: PostLike[P]): String = ev.title(p)
    def body(implicit ev: PostLike[P]): String = ev.body(p)
    def authorId(implicit ev: PostLike[P]): Long = ev.authorId(p)
    def defaultBlogId(implicit ev: PostLike[P]): Long = ev.defaultBlogId(p)
  }
  implicit val postPostLike =
    new PostLike[Post] {
      def title(p: Post) = p.title
      def body(p: Post) = p.body
      def authorId(p: Post) = p.authorId
      def defaultBlogId(p: Post) = p.defaultBlogId
    }
  implicit def PostLike_extensionInstance[A: PostLike] =
    new PostLike[Extension[A]] {
      def title(p: Extension[A]) = implicitly[PostLike[A]].title(p.underlying)
      def body(p: Extension[A]) = implicitly[PostLike[A]].body(p.underlying)
      def authorId(p: Extension[A]) = implicitly[PostLike[A]].authorId(p.underlying)
      def defaultBlogId(p: Extension[A]) = implicitly[PostLike[A]].defaultBlogId(p.underlying)
    }
}
case class WithAuthor[A](underlying: A, author: Author) extends Extension[A]
trait WithAuthorLike[WA] {
  def author(wa: WA): Author
}
object WithAuthorLike {
  implicit def WithAuthorLike_extensionInstance[A: WithAuthorLike] =
    new WithAuthorLike[Extension[A]] {
      def author(wa: Extension[A]) =
        implicitly[WithAuthorLike[A]].author(wa.underlying)
    }
  implicit def withBlogLikeInstance[A] = new WithBlogLike[WithAuthor[WithBlog[A]]] {
    def blog(wa: WithAuthor[WithBlog[A]]) = wa.underlying.blog
  }
  implicit class WithAuthorLikeOps[WA](wa: WA) {
    def author(implicit ev: WithAuthorLike[WA]): Author = ev.author(wa)
  }
}
case class WithBlog[A](underlying: A, blog: Blog) extends Extension[A]
trait WithBlogLike[WB] {
  def blog(wb: WB): Blog
}
object WithBlogLike {
  implicit def WithBlogLike_extensionInstance[A: WithBlogLike] =
    new WithBlogLike[Extension[A]] {
      def blog(wb: Extension[A]) =
        implicitly[WithBlogLike[A]].blog(wb.underlying)
    }
  implicit def withAuthorLikeInstance[A] = new WithAuthorLike[WithBlog[WithAuthor[A]]] {
    def author(wa: WithBlog[WithAuthor[A]]) = wa.underlying.author
  }
  implicit class WithBlogLikeOps[WB](wb: WB) {
    def blog(implicit ev: WithBlogLike[WB]): Blog = ev.blog(wb)
  }
}
object Test {
  import Post._
  import PostLike._
  import WithAuthorLike._
  import WithBlogLike._
  val post = Post("test", "body", 5, 12)
  val author = Author(5)
  val blog = Blog(12)
  println(s"Title: ${post.title}")
  val postWithAuthor = WithAuthor(post, author)
  println(s"Title: ${postWithAuthor.title} Author: ${postWithAuthor.author}")
  val postWithBlog = WithBlog(post, blog)
  println(s"Title: ${postWithBlog.title} Blog: ${postWithBlog.blog}")
  val postWithAuthorAndBlog = WithBlog(postWithAuthor, blog)
  println(s"Title: ${postWithAuthorAndBlog.title} Blog: ${postWithAuthorAndBlog.blog} Author: ${postWithAuthorAndBlog.author}")
  val postWithBlogAndAuthor = WithAuthor(postWithBlog, author)
  println(s"Title: ${postWithBlogAndAuthor.title} Blog: ${postWithBlogAndAuthor.blog} Author: ${postWithBlogAndAuthor.author}")
}

Type-Safe Model Resolution Through Type-Classes

This document is a proposal to fix the "resolution problem" that currently exists within our code.

The Resolution Problem

In our code base we need to model "has a" relations between various objects. Consider the relation "Post has an Author". It is convenient to provide a method on Post called author which returns the Author of that Post; this matches the way we think about this relation. This solution creates a problem in how we store and retrieve these objects from the database.

In order to construct a post with a field author : Author we must retrieve the Author from the database when we retrieve the post. In the simple case this solution seems good, but it breaks down in more complicated cases. These objects might themselves have "has a" relationships with other objects, and those objects would themselves be resolved, incurring unnecessary overhead in most cases. For the sake of convenience, we might want to model circular relationships: an Author has a Blog, but that Blog also has an Author (both Authors being the same). In this situation our solution is intractable.

Solutions

A simple variant of this system is to make these fields Optional and query the database to set them when necessary. This solution sacrifices type-safety: we can no longer guarantee that a given Post has an Author. We have traded one problem for another. This is the solution we use today.
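To make the loss of type-safety concrete, here is a minimal sketch of the Optional-field approach. The model shapes, the name field, and the function names are assumptions made for illustration; they are not taken from our actual models.

```scala
object OptionalFieldSketch {
  // Hypothetical, simplified models; the field names are illustrative only.
  final case class Author(id: Long, name: String)
  final case class Post(id: Long, title: String, authorId: Long, author: Option[Author] = None)

  // The signature cannot require a resolved author, so a caller that forgot
  // to resolve it is only discovered at run time, as a None.
  def authorsName(post: Post): Option[String] = post.author.map(_.name)

  val unresolved: Post = Post(1L, "Hello", authorId = 5L)
  val resolved: Post = unresolved.copy(author = Some(Author(5L, "Ada")))
}
```

Both values have the same type Post, so the compiler cannot tell a function that needs the author apart from one that does not.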

If we are willing to sacrifice the convenience of methods, a simple solution presents itself: product types. If we want to work with a Post and its Author, we request a (Post, Author). This approach provides the guarantees we want but it comes with large sacrifices. The relationship between the models is no longer clear; this is especially troubling in Author<->Blog relationships. It also hurts composability, making it difficult to combine functions which care about different fields.
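The composability cost can be sketched as follows. The models and the functions byline, location, and summary are hypothetical names used only for this illustration.

```scala
object TupleSketch {
  // Hypothetical, simplified models.
  final case class Author(id: Long, name: String)
  final case class Blog(id: Long, name: String)
  final case class Post(id: Long, title: String, authorId: Long, blogId: Long)

  // Each function fixes its own tuple shape.
  def byline(p: (Post, Author)): String = s"${p._1.title} by ${p._2.name}"
  def location(p: (Post, Blog)): String = s"${p._1.title} on ${p._2.name}"

  // Combining them needs a third shape and manual re-packing of the tuples.
  def summary(p: (Post, Author, Blog)): String =
    byline((p._1, p._2)) + "; " + location((p._1, p._3))
}
```

Every new combination of resolved models adds another tuple shape, and nothing in the types says which Author belongs to which Post.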

Shapeless provides extensible records that could be used for this purpose. Shapeless records are very flexible and quite robust; unfortunately, they come with some very large drawbacks. Shapeless has an unfamiliar (but learnable) syntax, using object("property") instead of object.property. Enabling a simple syntax for functions which operate on Shapeless records requires a prodigious amount of boilerplate, more than we would realistically want to maintain by hand. Consider this simplified version of Post with three fields and no extensions:

case class IsPost[L <: HList](implicit
   s1: Selector[L, id.T] { type Out = Long },
   s2: Selector[L, title.T] { type Out = String },
   s3: Selector[L, body.T] { type Out = String }
)

implicit def make[L <: HList](implicit
   s1: Selector[L, id.T] { type Out = Long },
   s2: Selector[L, title.T] { type Out = String },
   s3: Selector[L, body.T] { type Out = String }
) = IsPost[L]

implicit def unmake1[L <: HList](implicit
   s: IsPost[L]
) = s.s1

implicit def unmake2[L <: HList](implicit
   s: IsPost[L]
) = s.s2

implicit def unmake3[L <: HList](implicit
   s: IsPost[L]
) = s.s3

def test[P <: HList : IsPost](post : P) = {
   post("id")
}

Further, compile-time performance for records with many fields is very poor (see "Shapeless performance").

I'll now present an approach which combines the type safety of the product approach with the ease of the field approach, with minimal syntactic overhead.

Extensible Virtual Models

In order to accomplish these goals we will introduce a level of indirection. At its heart our system is the tuple approach (although we'll be using case classes in order to encode our intent on the type level) but instead of operating on the tuples directly we operate on interfaces that expose the fields we want. Each model X is renamed to BasicX (to free the name up) and a type-class IsX is introduced which exposes the fields of BasicX.

For each resolved model Y we have a case class BasicHasY[A](a : A, y : Y) and a type-class HasY which exposes a field y : Y (as well as Ops classes for these).

Beyond this we need only write a few instances. First we'll introduce an additional trait to simplify this process:

trait Extension[A] {
	def underlying : A 
}

Extension will be extended by each of our BasicHasZ classes. This trait could have been implemented as a type-class but its usage will be limited to the files where each BasicHasZ is implemented and thus the expression problem is irrelevant.

Now, for each HasZ and IsX we introduce an instance HasZ[Extension[A : HasZ]] and an instance for our concrete type HasZ[BasicHasZ] (and the equivalent for IsX). Instances for all combinations of BasicModels and Extensions are thus created via induction. Our interface now looks like this:

def foo[X : IsX : HasY : HasZ ...](x : X) = {
   ...
   x.y
   ...
   x.z
   ...
}

Our function body remains unchanged from the field-based approach (although notably the fields are no longer Optional). The main difference is our function signature. We have replaced our concrete type X with a type variable (which we have also named X). We then use context bounds to describe the fields our function needs to operate on.
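The inductive construction can be sketched end to end for one model and one resolved field. This is a minimal sketch under stated assumptions, not the definitive implementation: the field names, the name property on Author, and the byline function are hypothetical, and the type-classes are declared contravariant (an assumption of this sketch) so that the instance for Extension[A] also applies to the case classes that extend it.

```scala
object VirtualModelSketch {
  // Hypothetical, simplified models.
  final case class Author(id: Long, name: String)
  final case class Blog(id: Long, name: String)

  trait Extension[A] { def underlying: A }

  final case class BasicPost(id: Long, title: String)
  final case class BasicHasAuthor[A](underlying: A, author: Author) extends Extension[A]
  final case class BasicHasBlog[A](underlying: A, blog: Blog) extends Extension[A]

  // Contravariant so that an instance for Extension[A] also serves its subclasses.
  trait IsPost[-P] { def title(p: P): String }
  trait HasAuthor[-P] { def author(p: P): Author }

  // Base cases: instances for the concrete types.
  implicit val basicIsPost: IsPost[BasicPost] = new IsPost[BasicPost] {
    def title(p: BasicPost): String = p.title
  }
  implicit def basicHasAuthor[A]: HasAuthor[BasicHasAuthor[A]] =
    new HasAuthor[BasicHasAuthor[A]] {
      def author(p: BasicHasAuthor[A]): Author = p.author
    }

  // Inductive steps: any Extension of something with the field also has the field.
  implicit def extensionIsPost[A](implicit ev: IsPost[A]): IsPost[Extension[A]] =
    new IsPost[Extension[A]] {
      def title(p: Extension[A]): String = ev.title(p.underlying)
    }
  implicit def extensionHasAuthor[A](implicit ev: HasAuthor[A]): HasAuthor[Extension[A]] =
    new HasAuthor[Extension[A]] {
      def author(p: Extension[A]): Author = ev.author(p.underlying)
    }

  // Ops classes provide the field-access syntax on any type with an instance.
  implicit class IsPostOps[P](p: P) {
    def title(implicit ev: IsPost[P]): String = ev.title(p)
  }
  implicit class HasAuthorOps[P](p: P) {
    def author(implicit ev: HasAuthor[P]): Author = ev.author(p)
  }

  // The context bounds state exactly which fields the function needs.
  def byline[X: IsPost: HasAuthor](x: X): String = s"${x.title} by ${x.author.name}"
}
```

With these definitions, byline accepts a BasicHasAuthor wrapped around a BasicPost, or that value wrapped again in a BasicHasBlog, while byline applied to a bare BasicPost is rejected at compile time because no HasAuthor instance can be derived for it.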

Performance

At compile-time, each access to a field on our virtual model is replaced with a method call on the relevant type-class instance. These methods can be inlined. Consider the foo function above; if it is passed a BasicHasY(BasicHasZ(BasicPost(...), ...), ...) then, after implicit resolution, the method invocations expand as follows:

x.propertyOfX → x.underlying.underlying.propertyOfX
x.z → x.underlying.z
x.y → x.y

Performance costs are thus limited to additional accessor applications, which are considered negligible.

Additional compilation time is incurred due to the need for additional implicit resolution, but this is considered an acceptable cost for increased compile-time guarantees.

Boilerplate

The boilerplate required for this solution is non-negligible. Specifically, each extensible model requires an additional type-class and two instances for that type-class; each extended field requires a class and a type-class (again with two instances); and each field on the base model must be defined four times (once in the BasicModel, once in the IsModel type-class, and once for each instance).

This boilerplate has two important properties. First, the amount of boilerplate grows linearly with the number of models/extensible fields, meaning that it is tractable for all code-base sizes. Second, the generation of boilerplate is completely mechanical, which opens the possibility of automatic generation.

Migration

Rename and Conquer

As a first step, each model is renamed Model → BasicModel, and imports of these models are now qualified as import package.{ BasicModel => Model }.

Next, each resolved field is renamed field → oldField so that it does not conflict with the new implicitly granted fields. At the end of the migration we will remove these fields entirely.
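A minimal sketch of both renames together. The models object, the Option-typed oldAuthor field, and LegacyCaller are hypothetical stand-ins for a real package, a real resolved field, and an existing call site.

```scala
object models {
  final case class Author(id: Long, name: String)
  // `Post` became `BasicPost`; its resolved field `author` became `oldAuthor`.
  final case class BasicPost(id: Long, title: String, oldAuthor: Option[Author])
}

object LegacyCaller {
  // The renaming import lets untouched code keep using the old model name.
  import models.{Author, BasicPost => Post}

  def authorsName(post: Post): Option[String] = post.oldAuthor.map(_.name)

  val example: Post = Post(1L, "Hello", Some(Author(5L, "Ada")))
}
```

The renaming import covers both the type and its companion, so existing constructor calls and type annotations keep compiling; only accesses to the renamed field have to change.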

Incremental Migration

We'll examine the program structure as a directed acyclic multigraph where each node is a function and the edges are function calls from caller to callee. At the top of our graph we have controllers, and at the bottom we have primitive functions. All paths through the graph which terminate in a resolver function are paths that need to be migrated. All nodes lying outside of these paths can remain untouched.

For the purposes of the migration we will introduce a partial function for each resolver of the form unsafeResolveY : BasicPost => BasicHasY[BasicPost]. Using this function we can migrate the affected nodes incrementally from the top (controllers) down. Each node can be migrated once every node leading to it has been.

As each path is migrated, the unsafeResolveY partial functions will eventually bubble up into their respective resolveY functions, at which point that logic can be made internal to resolveY and that part of the migration is finished.
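One possible shape for such a partial function, taking Y to be Author. This sketch assumes that the legacy optional field (here called oldAuthor, a hypothetical name) is still present during the migration and that the function fails with an exception when the field was never populated; the proposal itself only fixes the signature.

```scala
object MigrationSketch {
  // Hypothetical, simplified models.
  final case class Author(id: Long, name: String)
  trait Extension[A] { def underlying: A }
  final case class BasicPost(id: Long, title: String, oldAuthor: Option[Author])
  final case class BasicHasAuthor[A](underlying: A, author: Author) extends Extension[A]

  // Partial: throws when the legacy optional field was never populated.
  def unsafeResolveAuthor(post: BasicPost): BasicHasAuthor[BasicPost] =
    post.oldAuthor match {
      case Some(author) => BasicHasAuthor(post, author)
      case None =>
        throw new IllegalStateException(s"Post ${post.id} has no resolved author")
    }
}
```

A migrated node calls unsafeResolveAuthor at its boundary with unmigrated callers, which keeps the unsafe step visible and easy to search for until it reaches the resolver.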
