Most data serialization formats, like JSON, YAML, and EDN, feature a similar set of basic building blocks, namely:
- Some primitive values, like numbers, strings, and booleans;
- Key-value pairs, also known as maps, dictionaries, or objects;
- Sequences, usually in the form of lists or arrays, and sometimes also sets.
I completely agree that those are basic building blocks for data inside any information system. However, as a Haskeller I always miss one additional part of my toolbox: variants. Variants are essentially tagged values which contain further values inside.
Let's fix the syntax #tag to describe one such tag, since at the end of the day we need some concrete syntax. Here are some examples of variants which are heavily used.
- With support for variants we no longer need booleans to be built in. We can use #true and #false to represent them.
- Optional values can be defined as #none or #some 3. The same holds for results, which can be successful, #success { name: "Alejandro", age: 32 }, or raise an error, #error "non existing person".
- Actions in Redux are essentially big variants, describing each possible update to the current state. In other frameworks like Elm and Reductive they are defined in that way.
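Until a format supports such tags natively, the optional and result examples above can be approximated in TypeScript. What follows is only a minimal sketch: the `tag`/`value` object layout, the helper names (`some`, `none`, `success`, `error`, `withDefault`), and the `findPerson` function are all assumptions made for illustration, not part of any existing format.

```typescript
// Assumed encoding: every variant is an object with a `tag` and, optionally, a `value`.
type Option<T> = { tag: "none" } | { tag: "some"; value: T };
type Result<T, E> = { tag: "success"; value: T } | { tag: "error"; value: E };

// Constructor helpers standing in for #none, #some, #success, and #error.
const none: Option<never> = { tag: "none" };
const some = <T>(value: T): Option<T> => ({ tag: "some", value });
const success = <T>(value: T): Result<T, never> => ({ tag: "success", value });
const error = <E>(value: E): Result<never, E> => ({ tag: "error", value });

type Person = { name: string; age: number };

// Hypothetical lookup: the tag tells the caller which payload to expect.
function findPerson(name: string): Result<Person, string> {
  return name === "Alejandro"
    ? success({ name: "Alejandro", age: 32 })
    : error("non existing person");
}

// Unwrap an Option, falling back to a default when the tag is "none".
function withDefault<T>(option: Option<T>, fallback: T): T {
  return option.tag === "some" ? option.value : fallback;
}
```

With these helpers, `withDefault(some(3), 0)` evaluates to `3` and `withDefault(none, 0)` to `0`. Note how much of the code only exists to rebuild what a built-in `#tag` would give for free.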
My point is that variants are as important as the rest of the building blocks, and that we should make them part of existing data serialization formats. Many data manipulations would become much clearer if expressed as what they are: variants.
You might be wondering: aren't variants available only in Haskell, ML, and languages heavily influenced by them like Rust, Swift, or F#? That's quite true; they all use algebraic data types as their data building blocks. Then your next thought may be that variants are somehow tied to their strong typing discipline. However, this doesn't have to be the case.
OCaml and F#, two languages in the ML family, feature record and object types. Those can be seen as "anonymous" key-value pairs, in which only the names and the types of the keys matter. OCaml also supports polymorphic variants, which introduce this "anonymity" into variants. Here are a couple of examples:
# [`On; `Off];;
- : [> `Off | `On ] list = [`On; `Off]
# `Number 1;;
- : [> `Number of int ] = `Number 1
OCaml actually introduces typing for these expressions, but this is irrelevant here. The key point is that variants make sense on their own, without reference to a particular type, in an untyped setting like JSON or EDN.
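To make the untyped reading concrete, here is one way a variant could be smuggled into plain JSON today. This is a sketch under a made-up convention (a JSON object with exactly one key that starts with `#`, and `null` meaning "no payload"); the names `Variant`, `encodeVariant`, and `decodeVariant` are hypothetical and no existing format prescribes this.

```typescript
// A variant in memory: a tag plus an optional payload.
type Variant = { tag: string; value?: unknown };

// Assumed wire convention: { "#tag": payload }, with null for "no payload".
function encodeVariant(v: Variant): string {
  return JSON.stringify({ ["#" + v.tag]: v.value ?? null });
}

// Returns undefined when the JSON does not follow the convention.
function decodeVariant(json: string): Variant | undefined {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return undefined;
  }
  const keys = Object.keys(data);
  const key = keys[0];
  if (keys.length !== 1 || key === undefined || !key.startsWith("#")) {
    return undefined;
  }
  const payload = (data as Record<string, unknown>)[key];
  const tag = key.slice(1); // drop the leading "#"
  return payload === null ? { tag } : { tag, value: payload };
}
```

Under this convention `encodeVariant({ tag: "some", value: 3 })` produces `{"#some":3}`. The obvious weakness is that a reader cannot tell such an object apart from an ordinary map that happens to have one `#`-prefixed key, which is exactly why first-class support in the format would be preferable.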
TypeScript features discriminated unions as a way to simulate real variants. They follow a common pattern in Redux in which variants are encoded by an object, with a particular key taking the role of tag. GraphQL's union types follow a similar pattern, where __typename takes the role of the tag.
type NetworkFailedState = {
  state: "failed";
  code: number;
};
type NetworkState =
  | NetworkLoadingState
  | NetworkFailedState
  | NetworkSuccessState;
Notice that here we are conflating the data itself, which is nothing but an object, with the typing superimposed by TypeScript. Once again, my point is that variants should be part of the basic blocks for building data:
let state = #failed { code: 3 }
Of course, nobody would complain if TypeScript supported them. That would require fewer typing acrobatics than the current discriminated unions.
type NetworkState =
  | #loading
  | #failed { code: number }
  | #success { response: ... }
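For comparison, this is roughly what the same idea costs in TypeScript as it exists today: one object type per case, a shared `state` key, and a `never` trick to get an exhaustiveness check. It is a sketch only. The payload of the success case is elided in the text, so the `response: string` field below is a placeholder assumption, and `describeState` is a hypothetical function.

```typescript
type NetworkLoadingState = { state: "loading" };
type NetworkFailedState = { state: "failed"; code: number };
// Placeholder payload: the real shape of `response` is not given in the text.
type NetworkSuccessState = { state: "success"; response: string };

type NetworkState =
  | NetworkLoadingState
  | NetworkFailedState
  | NetworkSuccessState;

function describeState(s: NetworkState): string {
  switch (s.state) {
    case "loading":
      return "Loading...";
    case "failed":
      // Narrowed to NetworkFailedState, so `code` is available.
      return `Error ${s.code}`;
    case "success":
      return `Got ${s.response}`;
    default: {
      // If a new case is added to NetworkState and not handled above,
      // this assignment stops compiling.
      const unreachable: never = s;
      return unreachable;
    }
  }
}
```

The check lives entirely in the type system: the runtime values are still plain objects, which is the conflation of data and typing described above.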
Well, I don't know. My dream would be to have variants in JavaScript, but that's a very long shot. In the short term, I would love to hear your thoughts.
Here is my rather simple status quo regarding tagged unions, which I want to improve on:
The downside is that without TS there is no exhaustiveness check. If I used Church encoding instead, there would be a partial check: if you don't provide all cases with Church, the call always fails, no matter which case is actually expected. This doesn't apply to my simple object encoding. On the other hand, Church encoding is CPS and thus kinda sucks. Bottom line: I don't know how to overcome this limitation in an untyped setting.
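One possible way to recover that "fails no matter which case is expected" behaviour without CPS is to check the handlers at runtime, before any value is inspected. The sketch below is written in TypeScript to match the other examples, but the check itself uses no type information and works the same in plain JavaScript. The names `Tagged`, `Handlers`, `matcher`, and `showOption`, as well as the `tag`/`value` layout, are assumptions made for illustration.

```typescript
type Tagged = { tag: string; value?: unknown };
type Handlers<R> = Record<string, (value: unknown) => R>;

// True only for handlers defined directly on the object, not inherited ones.
const hasCase = (handlers: object, tag: string): boolean =>
  Object.prototype.hasOwnProperty.call(handlers, tag);

// Builds a matcher for a fixed list of tags. All handlers are checked
// up front, so a missing case fails immediately, whichever value arrives later.
function matcher<R>(
  tags: readonly string[],
  handlers: Handlers<R>
): (v: Tagged) => R {
  for (const tag of tags) {
    if (!hasCase(handlers, tag)) {
      throw new Error(`missing case: ${tag}`);
    }
  }
  return (v) => {
    if (!hasCase(handlers, v.tag)) {
      throw new Error(`unknown tag: ${v.tag}`);
    }
    const handler = handlers[v.tag] as (value: unknown) => R;
    return handler(v.value);
  };
}

const optionTags = ["none", "some"] as const;

const showOption = matcher<string>(optionTags, {
  none: () => "nothing",
  some: (x) => `just ${String(x)}`,
});
```

Here `showOption({ tag: "some", value: 3 })` returns `"just 3"`, while building a matcher with only a `none` handler throws at construction time. This is still a runtime check rather than a static one, so it only helps on code paths that actually run.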