Skip to content

Instantly share code, notes, and snippets.

@dead-claudia
Last active July 22, 2017 06:04
Show Gist options
  • Save dead-claudia/02182617afb4f55272ff47bf230ff8af to your computer and use it in GitHub Desktop.
Save dead-claudia/02182617afb4f55272ff47bf230ff8af to your computer and use it in GitHub Desktop.
JS pattern matching strawman

Note: if you wish to comment on this proposal, tell me here, so I can actually be notified.


JS Pattern Matching Strawman

Just an attempt to come up with a sane pattern matching proposal that doesn't hit any of the following pitfalls:

  • Unidiomatic - if it doesn't look at all like JS, not very many will want to try it.
  • Prohibitively slow - if it can't be made fast, implementors are going to question it and users will complain.
  • Overly complex - if it isn't at least doable, implementors won't consider it.

Proposed features

  • Highly optimizable
  • Easy to parse, both by computer and person (avoid ambiguities)
  • Simple to write, expression-oriented
  • Flexible and extensible out of the gate
  • Easily desugared and transpiled
  • New well-known symbols: @@test and @@unapply
  • A few new additions to existing built-in types

Overview

See this for an overview of how the various checks work. Also, see a comparison based on part of one of Redux's examples, featuring one using current syntax and one using this pattern matching proposal.

  • It all starts with a single case expression, something like case (arg) { /* patterns... */ }

  • You can have as many cases as you would like.

    • Each case is separated by patterns.
    • The else or fallthrough case, if present, must be the last.
    • At least one case is required, to avoid ambiguity with the sloppy with statement at the top level.
  • Each case is something like this: pattern => result

    • If you want to match a literal value, you use = someValue => result.
      • In objects, you can use the shorthand {foo = value} instead of foo: foo = value
    • If you want to match according to some arbitrary condition, you use if cond => result.
      • You can even guard entire patterns, like with [...values] if cond: result.
    • If you want to match some interface, you use {then: is "function"} => result or whatever.
      • You can even match iterables: [...values] => result.
      • You can even nest them: {foo: {bar: is Array}} => result.
    • If you want to match an iterable, you use [...values] => result.
    • If you want to match via typeof, you use is "string" => result and similar.
    • If you want to match via instanceof, you use is SomeType => result.
    • If you want to match in a range of values, you use is [...values] => result (elisions allowed).
    • If you want to match a type and the object's values, you can use is Type of [...values] => result
    • If you want to just check if something exists, you use {prop: ?} => result and array elisions.
    • The catch-all pattern is else => result, defaulting to else => undefined if missing.
  • If you want to get the object matched in a pattern, you use foo = pattern, foo is "string", foo if cond, etc.

    • Bindings are defined even in the pattern and applicable guard, so you have full flexibility here.
    • You only need = if the next part is something other than is "string", if cond, etc.
    • In objects, you can omit the foo: in foo: foo, so it's easier to type.
  • If you want a nested value that was matched, you just use that name.

    • It even works with rest patterns [...rest] and spread patterns {...rest}.
    • It also checks for != null if used to get a property or iterable value.
    • The only catch is that it can't shadow anything. Otherwise, it matches a value equal to what's in the binding.
  • It's possible to modify how patterns are matched by defining @@test and/or @@unapply

    • See the proposed semantics for more details.

Useful utilities

Just a few examples of useful custom is targets

const within = {
    [Symbol.test]: (arg, min, max) => min <= arg && arg < max,
}

case (x) {
    is within(5, 10) => console.log(`5 <= ${x} < 10`),
    else => console.log(`${x} is something else`),
}

const match = {
    [Symbol.test]: (arg, re) => re.exec(arg),
    [Symbol.unapply]: (arg, exec) => ({
        get: key => { let v = exec[key]; if (v != null) return {value: v} },
        match: (key, value) => exec[key] === value,
    }),
}

case (input.value) {
    is match(/^\d+$/) => console.log("is a number"),
    is match(/^\w+$/) => console.log("is a word"),
    else => console.log("is something else"),
}

const Iterable = {[Symbol.test]: arg => typeof arg[Symbol.iterable] === "function"}

case (value) {
    is Iterable => { for (const item of value) ... },
    else => console.log("not an iterable"),
}

Potential FAQs

  • What's the inspiration? I drew some from a mixture of Scala's match operator (the case syntax and delimiters, unapply), OCaml's data type syntax (the Type of ... syntax), and Clojure's core.match (for some of the semantics). I also drew some inspiration from Ruby's metaprogramming in how I spec'd the new built-ins. Most of the rest was original, since there really isn't much else to build from.

Open questions

  • Better syntax? It doesn't quite look very JavaScript-ey, but coming up with a realistic pattern matching syntax for a dynamic language is hard enough, not to mention one for a statement-oriented language that actually looks like it fits in natively.

Proposed specc'y stuff

Note that implementations could optimize all of these fairly well with type feedback:

  • Function.p[@@test] and String.p[@@test] could be optimized similarly to Function.p[@@hasInstance].
  • Array.p[@@test] could be optimized into a series of equality checks for literals.

New well-known symbol

  • The well-known symbol @@test ([[Description]]: "Symbol.test") is used control how types are matched when checking values.
  • The well-known symbol @@unapply ([[Description]]: "Symbol.unapply") is used control how values are destructured when matching.

New built-ins

  • Function.prototype[@@test](target) - Return target instanceof this.

  • String.prototype[@@test](target) - Return typeof target === this.

  • Array.prototype[@@test](elem) - Return this.includes(elem).

  • BuiltinType[@@test](value) - Return true if this has all of the internal slots of BuiltinType, false otherwise.

    • BuiltinType is each built-in constructor (e.g. Map) or factory (e.g. Symbol) in the spec.
    • This is specified in such a way that it can match objects cross-realm safely.
    • This does match primitives as well as their wrapper objects.
  • Map[@@unapply](map, _, isIterable), WeakMap[@@unapply](map, _, isIterable) - Return an object with the following:

    • result.done - Map-only: true if all set entries have been checked, false otherwise.
    • result.get(key) - Return {value: map.get(key)} if that call doesn't return undefined.
    • result.match(key, value) - Return map.get(key) === value.
  • Set[@@unapply](set, _, isIterable), WeakSet[@@unapply](set, _, isIterable) - If it's not an iterable, throw an error. Otherwise, return an object with the following:

    • result.done - Set-only: true if all set entries have been checked, false otherwise.
    • result.match(index, value) - Return set.has(value).

Syntax

At least one pattern case is required, to avoid the ambiguity with an empty with statement when used as an expression statement. (This is only observable at parse time, and not to ECMAScript code.)

AssignmentExpression :
    CaseExpression

CoverCaseHead :
    `case` `(` Expression `)`

CaseClause :
    `case` [loohahead ∉ `(`] Expression `:` StatementList[opt]
    CoverCaseHead `:` StatementList[opt]

CaseExpression :
    CoverCaseHead `{` CaseClauseList `}`

CaseExpressionClauseList :
    CaseExpressionClauseFinalItem
    CaseExpressionClauseFinalItem `,`
    CaseExpressionClauseItem `,` CaseClauseExpressionList

CaseExpressionClauseItem :
    CaseCondition CaseExpressionClauseResult

CaseExpressionClauseFinalItem :
    CaseExpressionClauseItem
    `else` CaseExpressionClauseResult
    BindingIdentifier CaseExpressionClauseResult

CaseExpressionClauseResult :
    `=>` [loohahead ∉ `{`] Expression
    `=>` BlockStatement

CaseCondition :
    CasePattern[~Assign, ~Ref] `if` Expression
    CasePattern[~Assign, ~Ref]

CasePattern[?Assign, ?Ref] :
    [~Assign] BindingIdentifier
    [~Assign] BindingIdentifier CasePatternPart[+Assign, ?Ref]
    [loohahead ∉ BindingIdentifer] CasePatternPart[?Assign, ?Ref]

CasePatternSimple[?Ref] :
    [+Ref] `?`
    [loohahead ∉ { `[`, `{` }] Expression
    CaseArrayPattern
    CaseObjectPattern

CasePatternPart[?Assign, ?Ref] :
    [+Assign] `=` CasePatternSimple[?Ref]
    [~Assign] CasePatternSimple[?Ref]
    `is` CaseTestPattern
    `is` CaseTestPattern `of` CaseArrayPattern
    `is` CaseTestPattern `of` CaseObjectPattern

CaseTestPattern :
    Expression Arguments
    CaseTestCallExpressionNoArguments [loohahead ∉ `(`]

CaseTestCallExpressionNoArguments :
    `(` Expression `)`
    NewExpression
    CallExpression `[` Expression `]`
    CallExpression `.` IdentifierName
    CallExpression TemplateLiteral

CaseArrayPattern :
    `[` Elision[opt] `]`
    `[` CaseElementList Elision[opt] `]`
    `[` CaseElementList `,` Elision[opt] `]`
    `[` CaseElementList `,` CaseSpreadElement `]`
    `[` CaseElementList `,` `...` `]`

CaseElementList :
    Elision[opt] CaseElement
    CaseElementList `,` Elision[opt] CaseElement

CaseElement :
    CasePattern[~Assign, +Ref]

CaseSpreadElement :
    `...` BindingIdentifier
    `...` [loohahead ∉ BindingIdentifer] AssignmentExpression

CaseObjectPattern :
    `{` CaseObjectPropertyList `}`
    `{` CaseObjectPropertyList `,` `}`
    `{` CaseObjectPropertyList `,` CaseSpreadElement `}`

CaseObjectPropertyList :
    CaseObjectProperty
    CaseObjectPropertyList `,` CaseObjectProperty

CaseObjectProperty :
    PropertyName `:` `?`
    PropertyName `:` CasePattern[~Assign, +Ref]
    BindingIdentifier
    BindingIdentifier CasePattern[+Assign, +Ref]
  • As a statement in switch bodies, it becomes different from a case with a { when a : is otherwise expected, avoiding ambiguity:

    // switch case
    case ( expr ) :
    
    // case expression
    case ( expr ) {
    

    The CoverCaseHead is a near-trivial cover grammar to address this.

Semantics

  • Equality in pattern matching is checked via SameValueZero (strict equality, but with NaNs equal).

  • Conditions guarding patterns are evaluated and checked after their pattern is.

  • Things like ? and binding placeholders check for existence.

    • Properties like prop: ? are checked for in existence.
    • Iterable elisions are checked for merely having something there.
    • [?] is equivalent to [,] (single elision), but just more explicit
  • Bindings are scoped to the pattern itself, not the case block.

    • There's a TDZ to guard access to uninitialized bindings.
  • @@test and/or @@unapply specifics:

    • is Foo(...rest) calls Foo[@@test](arg, ...rest) to test whether it matches.
    • If the return value is falsy, the match is considered to fail.
    • If the return value is truthy and Foo[@@unapply] is present:
      • is Foo of ... calls Foo[@@unapply] with the argument, the @@test result, and a boolean, true if it's matching an iterable pattern.
      • The result must be an object with the following properties/methods (match required):
        • result.match(key/index, value) - Return a boolean, true if the key's value matches value. Called on constants.
        • result.get(key/index) - Optional: Return a {value} object with the value to destructure or undefined if none exists. Called on all other properties/entries.
        • result.done - Optional: A boolean, true if the result has checked everything, false if it hasn't, and null/undefined if it's unsized.
      • If the arg is an iterable and result.done === true before all entries are checked, the match fails.
      • If the arg is an iterable, result.done === false after checking all enries, and no final ... was used, the match fails.
    • The proper way to pass data from @@test to @@unapply is via an object (always-truthy) return value. If you need to return a potentially falsy primitive, you should wrap it in an object.
      • This is useful in cases like the match utility above.
  • Destructuring could be optimized in the spec kind of like what's done in Clojure's core.match and explained in its linked paper.

    • Object properties are always only accessed/tested once. (via [[GetOwnProperty]], but following the prototype)
    • Iterables are only iterated once, but may be closed early.
    • Only a minimum number of steps should be taken to differentiate two patterns.
// Mostly copied from http://redux.js.org/docs/advanced/ExampleRedditAPI.html#reducers
import { combineReducers } from 'redux'
import {
SELECT_SUBREDDIT,
INVALIDATE_SUBREDDIT,
REQUEST_POSTS,
RECEIVE_POSTS
} from './actions'
function selectedSubreddit(state = 'reactjs', action) {
switch (action.type) {
case SELECT_SUBREDDIT:
return action.subreddit
default:
return state
}
}
function posts(
state = {
isFetching: false,
didInvalidate: false,
items: []
},
action
) {
switch (action.type) {
case INVALIDATE_SUBREDDIT:
return {
...state,
didInvalidate: true
}
case REQUEST_POSTS:
return {
...state,
isFetching: true,
didInvalidate: false
}
case RECEIVE_POSTS:
return {
...state,
isFetching: false,
didInvalidate: false,
items: action.posts,
lastUpdated: action.receivedAt
}
default:
return state
}
}
function postsBySubreddit(state = {}, action) {
switch (action.type) {
case INVALIDATE_SUBREDDIT:
case RECEIVE_POSTS:
case REQUEST_POSTS:
return {
...state,
[action.subreddit]: posts(state[action.subreddit], action)
}
default:
return state
}
}
export default combineReducers({
postsBySubreddit,
selectedSubreddit
})
// Adapted from http://redux.js.org/docs/advanced/ExampleRedditAPI.html#reducers
import { combineReducers } from 'redux'
import {
SELECT_SUBREDDIT,
INVALIDATE_SUBREDDIT,
REQUEST_POSTS,
RECEIVE_POSTS
} from './actions'
function selectedSubreddit(state = 'reactjs', action) {
return case (action) {
{type = SELECT_SUBREDDIT, subreddit} => subreddit,
else => state
}
}
function posts(
state = {
isFetching: false,
didInvalidate: false,
items: []
},
action
) {
return {...state, ...case (action) {
{type = INVALIDATE_SUBREDDIT} => ({didInvalidate: true}),
{type = REQUEST_POSTS} => ({
isFetching: true,
didInvalidate: false
}),
{type = RECEIVE_POSTS, posts, receivedAt} => ({
isFetching: false,
didInvalidate: false,
items: posts,
lastUpdated: receivedAt
}),
else => state
}}
}
function postsBySubreddit(state = {}, action) {
return case (action) {
{type = INVALIDATE_SUBREDDIT, subreddit} =>
({...state, [subreddit]: posts(state[subreddit], action)}),
{type = RECEIVE_POSTS, subreddit} =>
({...state, [subreddit]: posts(state[subreddit], action)}),
{type = REQUEST_POSTS, subreddit} =>
({...state, [subreddit]: posts(state[subreddit], action)}),
else => state
}
}
export default combineReducers({
postsBySubreddit,
selectedSubreddit
})
// Just a quick overview of how the syntax would look, with various permutations
// `let binding in` optional, declares local `binding === arg` for matched case
//
// Each `expr(...)` call is made with all the declared bindings for that case
let result = case (arg) {
// equality
binding = value => expr(binding),
= value => expr(),
= func() => expr(),
// boolean test
binding if test => expr(binding),
if test => expr(),
// literal `undefined`/`null`
binding = undefined => expr(binding),
binding = null => expr(binding),
= undefined => expr(),
null => expr(),
// other primitive types
binding is "string" => expr(binding),
is "string" => expr(),
is "string"() => expr(), // equivalent, *technically* valid
// instanceof
binding is ClassType => expr(binding),
is ClassType => expr(),
is ClassType() => expr(), // equivalent
// call `func[Symbol.test](arg, ...explicit) -> destructured`
binding is match(/re/) => expr(binding),
is match(/re/) => expr(), // calls `match[Symbol.test](arg, /re/)`
is (computed()) => expr(),
is (computed())() => expr(), // equivalent
// destructuring object
binding = {foo: value} => expr(),
{foo: value} => expr(),
{foo: foo} => expr(foo),
{foo} => expr(foo), // shorthand for {foo: foo}
{foo: ?} => expr(), // arg.foo != null
{foo: is "string"} => expr(),
{foo is "string"} => expr(foo),
{foo: {nested}} => expr(nested),
{foo: is "string", ...rest} => expr(rest)
// destructuring iterable
// calls `Symbol.iterator` on arg
binding = [= value] => expr(),
[= value] => expr(),
[bar] => expr(bar),
[is "string"] => expr(),
[bar is "string"] => expr(bar),
[is "string", ...] => expr() // ignore rest (iterator closed)
[is "string", ...rest] => expr(rest) // capture rest
// destructuring match
// customize with `Symbol.test` and `Symbol.unapply`
binding is ClassType of {foo} => expr(binding, foo),
is ClassType of {foo} => expr(foo),
is ClassType of [bar] => expr(bar),
// guarded cases
binding = [foo] if test => expr(foo, binding),
is Bar of [bar] if test => expr(bar, binding),
// nesting is very much allowed
[{nested is "number"}] if nested < 10 => expr(nested),
// catch-all (only one is actually allowed)
binding => expr(binding),
else => expr(),
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment