Instead of simple case classes we make use of a system of Extensible Virtual Models to describe our models. This has the practical advantage of making these models run-time extensible while maintaining type-safety. This document serves as a brief introduction to this system and explains how it relates to normal case classes.
Consider this simplified version of a Post and a function which operates on it:
case class Post(id : Id[Post], authorId : Id[Author], blogId : Id[Blog], title : String, body : Html)
def escapeBody(post : Post) : Post = ...
The model contains the Post's id, title, and body as well as foreign keys to its author and blog. This serves us well for many cases but sometimes we want to access properties of the author or blog of this post. Because retrieving these models requires API calls to the relevant service, we'd like to pass these models around with the post so we only retrieve them once.
To this end we want to add author : Author and blog : Blog fields to this model. However, we don't always want to load these models since doing so is expensive. To accomplish this goal we'll convert Post to an Extensible Virtual Model.
Each field (or collection of fields) in our model has an interface and a concrete implementation. For the most part you don't have to worry about how the concrete implementation is created, just which interfaces to use. To translate our example above we'll need the IsPost interface which exposes the fields in our original Post model. escapeBody now looks like this:
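A minimal sketch of what such a signature could look like. The shape of IsPost shown here (a body accessor plus a hypothetical withBody updater), the trimmed-down BasicPost, and the Html stand-in are assumptions made for illustration, not definitions from the actual code base:

```scala
object EscapeBodySketch {
  // Hypothetical stand-in for the document's Html type.
  final case class Html(value: String)

  // Hypothetical concrete model; only the fields needed here are shown.
  final case class BasicPost(title: String, body: Html)

  // Assumed shape of the IsPost interface: read the body and replace it.
  trait IsPost[A] {
    def body(a: A): Html
    def withBody(a: A, body: Html): A
  }

  object IsPost {
    // Concrete implementation for the basic model.
    implicit val basicPost: IsPost[BasicPost] = new IsPost[BasicPost] {
      def body(a: BasicPost): Html = a.body
      def withBody(a: BasicPost, body: Html): BasicPost = a.copy(body = body)
    }
  }

  // Post is a type variable here, constrained by the IsPost context bound.
  def escapeBody[Post: IsPost](post: Post): Post = {
    val ev = implicitly[IsPost[Post]]
    val escaped = ev.body(post).value
      .replace("&", "&amp;")
      .replace("<", "&lt;")
      .replace(">", "&gt;")
    ev.withBody(post, Html(escaped))
  }
}
```

Because the function only asks for IsPost evidence, it would accept any extended form of the post for which such an instance exists, and it returns the same type it was given.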
The Post here is now a type variable local to escapeBody. We could have equivalently written def escapeBody[A : IsPost](post : A) : A but we use the Post name for clarity.
Now suppose we want to write a function authorsName which gets the name of the author of the post. We can achieve this with the HasAuthor interface which exposes the author property.
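As a rough illustration, authorsName might be written as follows. The BasicHasAuthor wrapper, the single-method HasAuthor trait, and the reduced Author and BasicPost classes are hypothetical simplifications:

```scala
object AuthorsNameSketch {
  // Hypothetical, reduced models.
  final case class Author(name: String)
  final case class BasicPost(title: String)

  // A value of type A paired with its already-resolved author.
  final case class BasicHasAuthor[A](underlying: A, author: Author)

  // Assumed shape of the HasAuthor interface.
  trait HasAuthor[A] {
    def author(a: A): Author
  }

  object HasAuthor {
    // Anything wrapped in BasicHasAuthor exposes an author.
    implicit def extended[A]: HasAuthor[BasicHasAuthor[A]] =
      new HasAuthor[BasicHasAuthor[A]] {
        def author(a: BasicHasAuthor[A]): Author = a.author
      }
  }

  // Compiles only for types that are known to carry a resolved author.
  def authorsName[Post: HasAuthor](post: Post): String =
    implicitly[HasAuthor[Post]].author(post).name
}
```

Calling authorsName on a plain BasicPost would be rejected at compile time in this sketch, since no HasAuthor instance exists for it.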
This document is a proposal to fix the "resolution problem" that currently exists within our code.
The Resolution Problem
In our code base we need to model "has a" relations between various objects. Consider the relation "Post has an Author". It is convenient to provide a method on Post called author which returns the Author of that Post; this matches the way we think about this relation. This solution creates a problem in how we store and retrieve these objects from the database.
In order to construct a post with a field author : Author we must retrieve the Author from the database when we retrieve the post. In the simple case this solution seems good, but it breaks down in more complicated cases. These objects might themselves have "has a" relationships with other objects, and those objects would in turn be resolved, incurring unnecessary overhead in most cases. For the sake of convenience, we might want to model circular relationships: an Author has a Blog, but that Blog also has an Author (both Authors being the same). In this situation our solution is intractable.
Solutions
A simple variant of this system is to make these fields Optional and query the database to set them when necessary. This solution sacrifices type-safety: we can no longer guarantee that a given Post has an Author. We have traded one problem for another. This is the solution we use today.
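To make the loss of type-safety concrete, here is a small sketch of the optional-field variant, using Scala's Option and hypothetical, reduced Post and Author classes:

```scala
object OptionalFieldSketch {
  final case class Author(name: String)

  // The author is only present if someone remembered to resolve it.
  final case class Post(title: String, author: Option[Author])

  // The signature cannot demand a resolved author, so every caller
  // has to deal with the possibility that it is missing.
  def authorsName(post: Post): Option[String] =
    post.author.map(_.name)
}
```

Nothing in the type of Post distinguishes a resolved post from an unresolved one, so the check moves from the compiler to run time.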
If we are willing to sacrifice the convenience of methods, a simple solution presents itself: product types. If we want to work with a Post and its Author, we request a (Post, Author). This approach provides the guarantees we want but it comes with large sacrifices. The relationship between the models is no longer clear, which is especially troubling in Author<->Blog relationships. It also hurts composability, making it difficult to combine functions which care about different fields.
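The composability cost can be sketched as follows. The function names byline, header, and render and the reduced model classes are made up for this example:

```scala
object TupleSketch {
  final case class Author(name: String)
  final case class Blog(title: String)
  final case class Post(title: String)

  // Each function names exactly the product it needs.
  def byline(pa: (Post, Author)): String = s"${pa._1.title} by ${pa._2.name}"
  def header(pb: (Post, Blog)): String = s"${pb._2.title}: ${pb._1.title}"

  // A caller holding all three values has to re-pack a tuple per callee.
  def render(post: Post, author: Author, blog: Blog): String =
    header((post, blog)) + " / " + byline((post, author))
}
```

The guarantees hold (byline cannot be called without an Author), but every combination of fields is a different tuple type, and nothing in (Post, Author) says that the Author belongs to that Post.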
Shapeless provides extensible records that could be used for this purpose. Shapeless records are very flexible and quite robust; unfortunately, they come with some very large drawbacks. Shapeless has an unfamiliar (but learnable) syntax using object("property") instead of object.property. Enabling a simple syntax for functions which operate on Shapeless records requires a prodigious amount of boilerplate, more than we would realistically want to maintain by hand. Consider this simplified version of Post with three fields and no extensions:
Further, compile-time performance for records with many fields is very poor.
I'll now present an approach which combines the type safety of the product approach with the ease of the field approach, at minimal syntactic overhead.
Extensible Virtual Models
In order to accomplish these goals we will introduce a level of indirection. At its heart our system is the tuple approach (although we'll be using case classes in order to encode our intent on the type level) but instead of operating on the tuples directly we operate on interfaces that expose the fields we want. Each model X is renamed to BasicX (to free the name up) and a type-class IsX is introduced which exposes the fields of BasicX.
For each resolved model Y we have a case class BasicHasY[A](a : A, y : Y) and a type-class HasY which exposes a field y : Y (as well as Ops classes for these).
Beyond this we need only write a few instances. First we'll introduce an additional trait to simplify this process:
trait Extension[A] {
  def underlying : A
}
Extension will be extended by each of our BasicHasZ classes. This trait could have been implemented as a type-class but its usage will be limited to the files where each BasicHasZ is implemented and thus the expression problem is irrelevant.
Now, for each HasZ and IsX we introduce an instance HasZ[Extension[A : HasZ]] and an instance for our concrete type HasZ[BasicHasZ] (and the equivalent for IsX). Instances for all combinations of BasicModels and Extensions are thus created via induction.
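The inductive construction can be sketched as below, under several assumptions: the models are reduced to one field each, the inductive instance is expressed with a higher-kinded parameter E[X] <: Extension[X] (one of several possible encodings), and a low-priority trait is used so the concrete instance wins when both could apply. None of these names beyond Extension, BasicPost, IsPost, HasAuthor, and the BasicHasZ pattern come from the text above:

```scala
object InductionSketch {
  // Hypothetical, reduced models.
  final case class Author(name: String)
  final case class Blog(name: String)
  final case class BasicPost(title: String)

  trait Extension[A] { def underlying: A }

  // Each BasicHasZ wraps an underlying value and adds one resolved field.
  final case class BasicHasAuthor[A](underlying: A, author: Author) extends Extension[A]
  final case class BasicHasBlog[A](underlying: A, blog: Blog) extends Extension[A]

  trait IsPost[A] { def title(a: A): String }
  object IsPost {
    // Base case: the concrete model.
    implicit val basic: IsPost[BasicPost] = new IsPost[BasicPost] {
      def title(a: BasicPost): String = a.title
    }
    // Inductive step: an Extension of something that IsPost is itself IsPost.
    implicit def extension[E[X] <: Extension[X], A](implicit ev: IsPost[A]): IsPost[E[A]] =
      new IsPost[E[A]] {
        def title(e: E[A]): String = ev.title(e.underlying)
      }
  }

  trait HasAuthor[A] { def author(a: A): Author }
  trait HasAuthorLowPriority {
    // Inductive step: look through any Extension for an author further down.
    implicit def extension[E[X] <: Extension[X], A](implicit ev: HasAuthor[A]): HasAuthor[E[A]] =
      new HasAuthor[E[A]] {
        def author(e: E[A]): Author = ev.author(e.underlying)
      }
  }
  object HasAuthor extends HasAuthorLowPriority {
    // Base case: the wrapper that actually carries the author.
    implicit def basic[A]: HasAuthor[BasicHasAuthor[A]] =
      new HasAuthor[BasicHasAuthor[A]] {
        def author(a: BasicHasAuthor[A]): Author = a.author
      }
  }
}
```

With these two instances per type-class, the compiler can derive evidence for any nesting order, for example IsPost and HasAuthor for BasicHasBlog[BasicHasAuthor[BasicPost]], without any instance being written for that specific combination.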
Our interface now looks like this:
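A minimal sketch of a function written against these interfaces, assuming hypothetical Ops classes named IsPostOps and HasAuthorOps that restore field syntax, reduced models, and an illustrative function called byline:

```scala
object InterfaceSketch {
  // Hypothetical, reduced models.
  final case class Author(name: String)
  final case class BasicPost(title: String)
  final case class BasicHasAuthor[A](underlying: A, author: Author)

  trait IsPost[A] { def title(a: A): String }
  trait HasAuthor[A] { def author(a: A): Author }

  object IsPost {
    implicit val basic: IsPost[BasicPost] = new IsPost[BasicPost] {
      def title(a: BasicPost): String = a.title
    }
    implicit def withAuthor[A](implicit ev: IsPost[A]): IsPost[BasicHasAuthor[A]] =
      new IsPost[BasicHasAuthor[A]] {
        def title(a: BasicHasAuthor[A]): String = ev.title(a.underlying)
      }
  }
  object HasAuthor {
    implicit def basic[A]: HasAuthor[BasicHasAuthor[A]] =
      new HasAuthor[BasicHasAuthor[A]] {
        def author(a: BasicHasAuthor[A]): Author = a.author
      }
  }

  // Ops classes: give any A with the right evidence ordinary field syntax.
  implicit class IsPostOps[A](private val a: A)(implicit ev: IsPost[A]) {
    def title: String = ev.title(a)
  }
  implicit class HasAuthorOps[A](private val a: A)(implicit ev: HasAuthor[A]) {
    def author: Author = ev.author(a)
  }

  // Post is a type variable; the context bounds list the fields we rely on.
  def byline[Post: IsPost: HasAuthor](post: Post): String =
    s"${post.title} by ${post.author.name}"
}
```

Inside byline the body reads exactly as it would against a concrete class with title and author fields; only the signature reveals the indirection.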
Our function body remains unchanged from the field-based approach (although notably the fields are no longer Optional). The main difference is our function signature. We have replaced our concrete type X with a type variable (which we have also named X). We then use context bounds to describe the fields our function needs to operate on.
Performance
At compile time, each access to a field on our virtual model is replaced with a method call on the relevant type-class instance. These methods can be inlined. Consider the foo function above; if it is passed a BasicHasY(BasicHasZ(BasicPost(...), ...), ...) then the expansion of the method invocations after implicit resolution is as follows:
Performance costs are thus limited to additional accessor application, which is considered negligible.
Additional compilation time is incurred due to the need for additional implicit resolution, but additional compilation time is considered an acceptable cost for increased compile-time guarantees.
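The following sketch illustrates the idea on a smaller, hypothetical example (a single HasAuthor extension rather than the nested HasY/HasZ case): the same call is shown as written, with the implicit evidence spelled out, and as the plain field access that remains once the accessor is inlined. The context bound is written as an explicit implicit parameter so the evidence can be passed by hand:

```scala
object ExpansionSketch {
  // Hypothetical, reduced models.
  final case class Author(name: String)
  final case class BasicPost(title: String)
  final case class BasicHasAuthor[A](underlying: A, author: Author)

  trait HasAuthor[A] { def author(a: A): Author }
  object HasAuthor {
    implicit def basic[A]: HasAuthor[BasicHasAuthor[A]] =
      new HasAuthor[BasicHasAuthor[A]] {
        def author(a: BasicHasAuthor[A]): Author = a.author
      }
  }

  // Equivalent to the context-bound form [Post : HasAuthor].
  def authorsName[Post](post: Post)(implicit ev: HasAuthor[Post]): String =
    ev.author(post).name

  val post: BasicHasAuthor[BasicPost] =
    BasicHasAuthor(BasicPost("Hello"), Author("Ada"))

  // 1. As written by the caller.
  val asWritten: String = authorsName(post)

  // 2. After implicit resolution: the instance is an ordinary argument.
  val resolved: String =
    authorsName[BasicHasAuthor[BasicPost]](post)(HasAuthor.basic[BasicPost])

  // 3. What the accessor amounts to once inlined: a plain field access.
  val inlined: String = post.author.name
}
```

All three values are computed from the same field; the second and third forms differ only in whether the accessor call has been replaced by its body.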
Boilerplate
The boilerplate required for this solution is non-negligible. Specifically, each extensible model requires an additional type-class and two instances for that type-class; each extended field requires a class and a type-class (again with two instances); and each field on the base model must be defined four times (once in the BasicModel, once in the IsModel type-class, and once for each instance).
This boilerplate has two important properties. First, the amount of boilerplate grows linearly with the number of models/extensible fields, meaning that it is tractable for all code-base sizes. Second, the generation of boilerplate is completely mechanical, which opens the possibility of automatic generation.
Migration
Rename and Conquer
As a first step, each model is renamed Model → BasicModel, and every import of these models renames it back at the import site: import package.{ BasicModel => Model }.
Next each resolved field is renamed field → oldField so that it does not conflict with the new implicitly granted fields. At the end of the migration we will remove these fields entirely.
Incremental Migration
We'll examine the program structure as a directed acyclic multigraph where each node is a function and the edges are function calls from caller to callee. At the top of our graph we have controllers, and at the bottom we have primitive functions. All paths through the graph which terminate in a resolver function are paths that need to be migrated. All nodes lying outside of these paths can remain untouched.
For the purposes of the migration we will introduce a partial function for each resolver of the form unsafeResolveY : BasicPost => BasicHasY[BasicPost]. Using this function we can migrate the affected nodes incrementally from the top (controllers) down. Each node can be migrated once every node leading to it has been.
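One way such a function might look for the author case is sketched below. It assumes the legacy field has been renamed to oldAuthor and is an Option, and that an unresolved post is signalled by throwing; the actual failure mode and field names are not specified above:

```scala
object MigrationSketch {
  // Hypothetical, reduced models.
  final case class Author(name: String)

  // Legacy optional field, renamed field → oldField as described earlier.
  final case class BasicPost(title: String, oldAuthor: Option[Author])

  final case class BasicHasAuthor[A](underlying: A, author: Author)

  // Partial: only defined for posts whose legacy field was populated.
  def unsafeResolveAuthor(post: BasicPost): BasicHasAuthor[BasicPost] =
    post.oldAuthor match {
      case Some(author) => BasicHasAuthor(post, author)
      case None =>
        throw new IllegalStateException(s"author of '${post.title}' was not resolved")
    }
}
```

A migrated caller would invoke unsafeResolveAuthor once at its boundary and pass the resulting BasicHasAuthor[BasicPost] down to callees that declare a HasAuthor bound, so the unsafe step stays in one visible place until it reaches the resolver.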
As each path is migrated, the unsafeResolveY partial functions will eventually bubble up into their respective resolveY functions, at which point that logic can be made internal to resolveY and that part of the migration is finished.