Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save kpmeen/1f4fe3319462a2ad3eb72314a8523c16 to your computer and use it in GitHub Desktop.
Save kpmeen/1f4fe3319462a2ad3eb72314a8523c16 to your computer and use it in GitHub Desktop.
Scala's path dependent types can be used to prove that a serialized value can be deserialized without having to resort to Try/Either/Option. This puts the serialized value into the type, so we can be sure we won't fail. This is very useful for distributed compute settings such as scalding or spark.
import scala.util.Try
object PathSerializer {
trait SerDe[A] {
// By using a path dependent type, we can be sure can deserialize without wrapping in Try
type Serialized
def ser(a: A): Serialized
def deser(s: Serialized): A
// If we convert to a generic type, in this case String, we forget if we can really deserialize
def toString(s: Serialized): String
def fromString(s: String): Try[A]
}
val intSer: SerDe[Int] =
new SerDe[Int] {
type Serialized = String
def ser(a: Int) = a.toString
def deser(s: Serialized) = s.toInt // since this was serialized with this SerDe this is safe
def toString(s: Serialized): String = s
def fromString(s: String): Try[Int] = Try(s.toInt /* this can fail, because we only know it is a String */)
}
def example0 = {
val x = 42
val ser: intSer.Serialized = intSer.ser(x)
val y = intSer.deser(ser)
assert(x == y)
}
def example1 = {
// In a case like spark or scalding, we can remember that deserialization can't fail:
// pretend the list below is an RDD or TypedPipe
val inputs: List[Int] = (0 to 1000).toList
val serialized: List[intSer.Serialized] = inputs.map(intSer.ser(_))
// intSer.Serialized is distinct from String, note the following error
// if we pretend they are the same:
//
// val strings: List[String] = serialized
//
// ts_ser.scala:41: error: type mismatch;
// found : List[PathSerializer.intSer.Serialized]
// required: List[String]
// Look, ma! no Trys!
val deserialized: List[Int] = serialized.map(intSer.deser(_))
assert(inputs == deserialized)
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment