@serras
Created August 31, 2020 19:37
Variants: the ultimate frontier

Most data serialization formats, like JSON, YAML, and EDN, feature a similar set of basic building blocks, namely:

  • Some primitive values, like numbers, strings, and booleans;
  • Key-value pairs, also known as maps, dictionaries, or objects;
  • Sequences, usually in the form of lists or arrays, and sometimes also sets.

I completely agree that these are basic building blocks for data inside any information system. However, as a Haskeller I always miss one additional part of my toolbox: variants. Variants are essentially tagged values, which may contain further values inside.

Let's fix the syntax #tag to describe one such tag, since at the end of the day we need some concrete syntax. Here are some examples of variants which are heavily used.

  • With support for variants we no longer need booleans to be built in. We can use #true and #false to represent them.
  • Optional values can be defined as #none or #some 3. The same holds for results which can be successful, #success { name: "Alejandro", age: 32 }, or raise an error, #error "non existing person".
  • Actions in Redux are essentially big variants, describing each possible update to the current state. In other frameworks like Elm and Reductive they are defined in that way.
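
Until a format offers such syntax, these values have to be smuggled through plain objects. Here is a minimal TypeScript sketch, assuming a hypothetical two-field encoding (the field names `tag` and `value` and the helper `variant` are invented for this illustration, not taken from any standard), of how the variants above might travel as JSON today:

```typescript
// Hypothetical encoding: a variant is an object with a tag and an optional payload.
// The field names `tag` and `value` are assumptions made for this sketch.
type Variant = { tag: string; value?: unknown };

// Build a variant; the payload is left out for bare tags such as #none.
function variant(tag: string, value?: unknown): Variant {
  return value === undefined ? { tag } : { tag, value };
}

const none = variant("none");                            // #none
const some3 = variant("some", 3);                        // #some 3
const failure = variant("error", "non existing person"); // #error "non existing person"
const success = variant("success", { name: "Alejandro", age: 32 });

// The value survives a JSON round trip, but only as an ordinary object.
const wire = JSON.stringify(some3);
const decoded = JSON.parse(wire) as Variant;
```

The round trip works, but nothing in the JSON itself distinguishes this variant from a map that merely happens to have those two keys. That is the gap the `#tag` syntax would close.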

My point is that variants are as important as the rest of the building blocks, and that we should make them part of existing data serialization formats. Many data manipulations would become much clearer if expressed as what they are: variants.

Static typing?

You might be wondering: are variants not available only in Haskell, ML, and languages heavily influenced by them like Rust, Swift, or F#? That's quite true; they all use algebraic data types as their data building blocks. Then your next thought may be that variants are somehow tied to their strong typing discipline. However, this doesn't have to be the case.

OCaml and F#, two languages in the ML strand, feature record and object types. Those can be seen as "anonymous" key-value pairs, in which only the names and the types of the keys matter. OCaml also supports polymorphic variants, which introduce this "anonymity" into variants. Here are a couple of examples:

# [`On; `Off];;
- : [> `Off | `On ] list = [`On; `Off]
# `Number 1;;
- : [> `Number of int ] = `Number 1

OCaml actually introduces typing for these expressions, but this is irrelevant here. The key point is that variants make sense on their own, without reference to a particular type, in an untyped setting like JSON or EDN.
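
To illustrate that claim outside OCaml, here is a small TypeScript sketch that dispatches on a tag without ever declaring the set of possible tags. The pair encoding `[tag, payload]` and the helper `matchTag` are hypothetical names chosen for this example.

```typescript
// A variant as a bare [tag, payload] pair; no closed set of tags is declared anywhere.
type Tagged = [string, unknown?];

// Dispatch on the tag alone, falling back to `otherwise` for tags without a handler.
function matchTag<R>(
  v: Tagged,
  handlers: Record<string, (payload: unknown) => R>,
  otherwise: (tag: string) => R
): R {
  const [tag, payload] = v;
  // Only consult own properties, so inherited names like "toString" are not handlers.
  const handler = Object.prototype.hasOwnProperty.call(handlers, tag)
    ? handlers[tag]
    : undefined;
  return handler ? handler(payload) : otherwise(tag);
}

// Counterparts of the OCaml values `On, `Off and `Number 1.
const values: Tagged[] = [["On"], ["Off"], ["Number", 1]];

const described = values.map((v) =>
  matchTag(
    v,
    {
      On: () => "switched on",
      Off: () => "switched off",
    },
    (tag) => `unhandled tag ${tag}`
  )
);
```

The data carries its tags with it, and the consumer decides which ones it cares about. No type describing "all the tags" is needed for the values to be meaningful.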

Unions?

TypeScript features discriminated unions as a way to simulate real variants. They follow a common pattern in Redux in which variants are encoded by an object, with a particular key taking the role of tag. GraphQL's union types follow a similar pattern, where __typename takes the role of the tag.

type NetworkFailedState = {
  state: "failed";
  code: number;
};

type NetworkState =
  | NetworkLoadingState
  | NetworkFailedState
  | NetworkSuccessState;

Notice that here we are conflating the data itself, which is nothing but an object, with the typing superimposed by TypeScript. Once again, my point is that variants should be part of the basic blocks for building data:

let state = #failed { code: 3 }
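
For comparison, this is roughly what the same value looks like in today's TypeScript. Only `NetworkFailedState` comes from the text above. The shapes of the loading and success states, and in particular the `string` payload for `response`, are assumptions made so that the sketch compiles, and `describeState` is a hypothetical helper.

```typescript
// NetworkFailedState is from the text above; the other two shapes are assumed here.
type NetworkLoadingState = { state: "loading" };
type NetworkFailedState = { state: "failed"; code: number };
type NetworkSuccessState = { state: "success"; response: string }; // placeholder payload type

type NetworkState =
  | NetworkLoadingState
  | NetworkFailedState
  | NetworkSuccessState;

// The value that `#failed { code: 3 }` would denote, in today's object encoding.
const state: NetworkState = { state: "failed", code: 3 };

// Exhaustive match: the `never` assignment stops compiling if a case is missing.
function describeState(s: NetworkState): string {
  switch (s.state) {
    case "loading":
      return "loading";
    case "failed":
      return `failed with code ${s.code}`;
    case "success":
      return `succeeded: ${s.response}`;
    default: {
      const unreachable: never = s;
      return unreachable;
    }
  }
}

const message = describeState(state);
```

At runtime `state` is just the object `{ state: "failed", code: 3 }`. The tag is an ordinary key, and only the type layer knows that it plays a special role.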

Of course, nobody would complain if TypeScript supported them. That would require less typing acrobatics than the current discriminated unions.

type NetworkState =
  | #loading
  | #failed { code: number }
  | #success { response: ... }

Where to go from here?

Well, I don't know. My dream would be to have variants in JavaScript, but that's a very long shot. In the short term, I would love to hear your thoughts.

@ivenmarquardt

ivenmarquardt commented Sep 2, 2020

Here is my rather simple status quo regarding tagged unions, which I want to improve on:

const union = type => (tag, o) => (
  o[type] = type, // circumvents structural typing
  o.tag = tag.name || tag,
  o);

const match = (tx, o) =>
  o[tx.tag] (tx); // immediately throws an error instead of silently returning undefined during duck typing

const Option = union("Option");

const None = Option("None", {});
const Some = some => Option(Some, {some});

// example

const cata = def => tx =>
  match(tx, {
    None: () => def,
    Some: ({some}) => some
  });

cata(0) (Some(123)); // 123
cata(0) (None); // 0
cata(0) ({tag: "Foo", some: 999}); // type error

The downside is that without TS there is no exhaustiveness check. If I used Church encoding instead, there would be a partial check, because a Church-encoded match that doesn't provide all cases always fails, no matter which case is actually expected. This doesn't apply to my simple object encoding. Church encoding is CPS and thus kinda sucks. Bottom line: I don't know how to overcome this limitation, assuming an untyped setting.
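
One possible direction, offered as a sketch under assumptions and not as part of the encoding above: let the union know all of its tags up front, and have `match` verify the handler object against that list before dispatching. That would give the same "fails no matter which case is expected" behaviour as Church encoding, at the cost of a runtime check on every match. The names `defineUnion`, `make`, and `OptionUnion` are hypothetical, and the example uses deliberately loose TypeScript types to mimic the untyped setting.

```typescript
type Value = { type: string; tag: string; [key: string]: unknown };
type Handlers = Record<string, (value: Value) => unknown>;

// Hypothetical helper: a union that remembers the full list of its tags.
function defineUnion(type: string, tags: string[]) {
  return {
    // Construct a tagged value, rejecting tags the union does not know.
    make(tag: string, fields: Record<string, unknown> = {}): Value {
      if (!tags.includes(tag)) throw new TypeError(`unknown tag ${tag} for ${type}`);
      return { ...fields, type, tag };
    },
    // Check that every declared tag has a handler before looking at the value.
    match(value: Value, handlers: Handlers): unknown {
      const missing = tags.filter((t) => typeof handlers[t] !== "function");
      if (missing.length > 0) {
        throw new TypeError(`non-exhaustive match on ${type}: missing ${missing.join(", ")}`);
      }
      if (value.type !== type || !tags.includes(value.tag)) {
        throw new TypeError(`value is not a ${type}`);
      }
      return handlers[value.tag]!(value);
    },
  };
}

const OptionUnion = defineUnion("Option", ["None", "Some"]);
const some = OptionUnion.make("Some", { some: 123 });

const result = OptionUnion.match(some, { None: () => 0, Some: (v) => v.some });

let failedEarly = false;
try {
  // The Some handler is missing, so this throws even though the value is a None.
  OptionUnion.match(OptionUnion.make("None"), { None: () => 0 });
} catch (e) {
  failedEarly = e instanceof TypeError;
}
```

The check still happens only when the match runs, so it is weaker than a compile-time guarantee. It does surface a forgotten case on the first call instead of on the first value that happens to hit it.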
