Skip to content

Instantly share code, notes, and snippets.

@rbuckton
Last active September 9, 2022 19:45
Show Gist options
  • Save rbuckton/df6ade207eecad4fc94cedc3aae79ceb to your computer and use it in GitHub Desktop.
Save rbuckton/df6ade207eecad4fc94cedc3aae79ceb to your computer and use it in GitHub Desktop.
Pattern Matching Exploration

Pattern Matching Exploration

The following sections explore various ways we could consider introducing pattern matching to ECMAScript, and can be used to inform a discussion on how best to layer the proposal. The various explorations in this document are not intended to be taken as a whole, as some sections directly conflict with others. Any exploration will specifically call out other explorations that are related.

This list is by no means complete. Suggestions for additions or changes are welcome.

Extending Existing Syntax

One way to introduce pattern matching is to extend or inject patterns into existing syntax, allowing it to be gradually introduced to an existing codebase.

Extending Assignment

This could be similar to Rust's and Scala's pattern matching mechanism, where any assignment is a pattern. As per both Rust's and Scala's semantics, destructuring patterns must be exhaustive. Failing to match should result in an Error.

Irrefutable Patterns

A normal assignment or variable declaration would be an irrefutable pattern (it always matches successfully).

let x = 1;
x = 2;

Refutable Patterns

An assignment pattern or binding pattern would become a refutable pattern (it can fail to match).

const { x } = obj;          // obj must be non-nullish
const [y] = ar;             // ar must be non-nullish and have a [Symbol.iterator]

In the same vein, we could employ other patterns that don't introduce bindings:

const { x, y: 0 } = obj;    // bind x, match y, may have other properties
const { x: 0, y } = obj;    // match x, bind y, may have other properties
const { x: 0, y: 0 } = obj; // match x, match y, may have other properties

const [x, 1] = ar;          // bind x, match 1, may have other elements
const [0, y] = ar;          // match 0, bind y, may have other elements
const [0, 1] = ar;          // match 0, match 1, may have other elements

This has several drawbacks:

  • Patterns are not strict (may have other properties/elements), so we'd need to either:
    • Introduce syntax to enforce strictness in patterns.
    • Have patterns make the destructuring strict, which could become confusing.
  • Need a way to both match and bind.
  • Need a way to match special identifiers like Infinity, NaN, undefined, and custom matchers:
    • Treat those identifiers specially, but this is a breaking change:
      const { x: Infinity, y: Point, z } = ...;       // already legal: bind Infinity, bind Point, bind z
    • Require pattern parens to escape out of binding:
      const { x: (Infinity), y: (Point), z } = ...;   // match x, match y, bind z (use ${})
    • Introduce interpolation syntax to break out of the pattern:
      const { x: ${Infinity}, y: ^{Point}, z } = ...; // match x, match y, bind z (use ${})
      const { x: ^Infinity, y: ^Point, z } = ...;     // match x, match y, bind z
    • Leverage @ to match but discard the binding.
      const { x: @Infinity, y: @Point, z } = ...;     // match x, match y, bind z (use prefix-@ like above)

This would also work reasonably well with extractors:

const Option.Some({ x: 0, y }) = opt;

Strict patterns using {| |} and [| |]

Since destructuring is an existing part of the language, we can't change the semantics of existing syntax without introducing breaking changes. One way to address this would be to introduce a "strict" form of the object and array patterns:

const { x: 0, y: 1 }        = { x: 0, y: 1, z: 2 }; // ok
const {| x: 0, y: 1 |}      = { x: 0, y: 1, z: 2 }; // error
const {| x: 0, y: 1, ... |} = { x: 0, y: 1, z: 2 }; // ok

const [0, 1]        = [0, 1, 2];                    // ok
const [|0, 1|]      = [0, 1, 2];                    // error
const [|0, 1, ...|] = [0, 1, 2];                    // ok

Match and Bind using @

Rust uses @ as a mechanism to indicate a pattern should both match and introduce a binding. This might looks something like the following:

const { x: 0, y: 1 }     = { x: 0, y: 1 };  // matches, but does not bind
const { x, y }           = { x: 0, y: 1 };  // binds, but does not match (other than that input is non-nullish)
const { x: x@0, y: y@1 } = { x: 0, y: 1 };  // matches and binds

const [0, 1]     = [0, 1];                  // matches, but does not bind
const [x, y]     = [0, 1];                  // binds, but does not match (other than that input is iterable)
const [x@0, y@1] = [0, 1];                  // matches and binds

In addition, @ could be used to help disambiguate whether a pattern identifier should introduce a binding or reference a binding in scope to use as a custom matcher:

const { x: Infinity, y: Point, z } = ...;       // already legal: bind Infinity, bind Point, bind z
const { x: @Infinity, y: @Point, z } = ...;     // match `Infinity`, match `Point`, bind z

void Patterns

This was actually discussed in the Explicit Resource Management proposal, #88 to make the proposed using void = ... syntax more general to support void as a BindingIdentifier in any binding pattern.

In essence, void would be an irrefutable pattern that essentially means "I don't care about the value". With strict pattern matching, property or element testing must be exaustive. A void pattern could allow you to be exhaustive in an object or array pattern, but discard the value when matching the particular element:

// using a form of strict patterns
const {| x: 0, y: void, z |} = ...; // must have `x`, `y`, and `z`. match `x`, discard `y`, bind `z`

// discarding values in an normal object binding pattern:
const { x, y: void, ...rest } = { x: 0, y: 1, z: 2 }; // `y` is discarded
x; // 0
rest; // { z: 2 }

Extending if

Pattern matching would also be a useful addition to if statements. In this case, patterns need not be exhaustive as failing to match would result in chosing the else branch instead.

if (let pat = expr) ...

This is similar to Rust's if let statement. Here, we use let (or const) in the head of an if statement to to test a pattern and introduce bindings scoped to the "true" branch of the statement, which would by necessity be a new block scope.

This would lean heavily on the idea that any assignment is a pattern as discussed in the Extending Assignment section.

if (let { x: 0, y } = expr) ...; // match `0`, bind `y`, execute statement
if (let [x, 1] = expr) ...; // match `1`, bind `x`, execute statement

This could also be used for pattern matching even if no bindings are introduced:

if (let 0 or 1 = expr) ...; // match `0 or 1`, no bindings, execute statement

Note that an irrefutable pattern in the if condition will always match:

if (let x = foo()) ...; // bind `x`, execute statement

But we could also consider expanding this to something like C++ 17's if-with-initializer statement:

if (let x = foo(); x.bar) ...; // bind `x`, test `x.bar`, execute statement

What's interesting about this syntax is that it's not unlike a for-statement without an incrementer:

if  (let id = expr; condition             ) Statement
for (let id = expr; condition; incrementer) Statement

if (expr is pat) ...

Another approach would be to leverage an infix is pattern matching expression, similar to C#. In this case, we wouldn't need to do anything specific for if aside from ensuring any bindings introduced in the is expression's pattern are available inside of the "true" branch.

Extending while

Much like if, we could extend while in a similar fashion.

while (let pat = expr) ...

This is similar to Rust's while-let statement, and leans on the same concepts introduced in if (let pat = expr) ... and Extending Assignment. In this case, the condition of the while would match the pattern on each iteration. Any bindings introduced by the pattern would be scoped to the body of the while, which would necessitate introducing a new per-iteration block scope. As with if, a while-let would also support const (and potentially var).

while (let { x: 0, y } = expr) ...; // per-iteration: match `0`, bind `y`, execute statement
while (let [x, 1] = expr) ...;      // per-iteration: match `1`, bind `x`, execute statement
while (let 0 or 1 = expr) ...;      // per-iteration: match `0 or 1`, execute statement

We could also introduce a variant of while similar to the if-with-initializer example above:

while (let x = foo(); x.bar) ...;

This would be very similar to a for statement, except that the let clause would be reevaulated each iteration, making the semantics similar to the following:

for (let x = foo(); x.bar; x = foo()) ...;

Except that the while vesion could be made const, but the for version cannot:

for (let x = foo(); x.bar; x = foo()) ...;  // `let` ok, but cannot be `const`
while (const x = foo(); x.bar) ...;         // can be `const`

The basic while-let form would not be supported in a do-while statement as any bindings introduced would be unreachable in the first iteration, though the while-let-condition form could be adopted such that the bindings introduced by the let are only scoped to the condition, e.g.:

do {
    // x is unreachable
}
while (let x = foo(); x.bar);

while (expr is pat) ...

As with if (expr is pat) ... above, this variant would instead leverage an infix is expression to match a pattern. As well, we wouldn't need to do anything specific for while except to ensure that bindings introduced by the pattern would be available in the body of the while.

Unlike while-let, the is expression could be used in a do-while statement, though its bindings would not be reachable.

Extending switch

While it is intended that pattern matching be exhaustive, we could consider allowing non-exhaustive pattern matching in a switch statement to support gradual adoption of pattern matching. The following sections explore the potential of extending switch to support pattern matching functionality.

Use switch as is

switch (input) {
    case 1: break; // ok
    case { x: 1 }: break; // no, compares input to object reference
    case NaN: break; // no, NaN !== NaN
    default: // ok
}

It is not possible to use a switch statement as-is since many patterns can also be parsed as legal expressions.

switch match (expr) { ... }

switch match (expr) { ... }

While it is feasible to introduce a keyphrase like switch match, it is a bit wordy compared to alternatives like match (expr).

switch (expr) is { ... }

switch (expr) is { ... }

This suffers from the same drawbacks as switch-match.

switch? (expr) { ... } and case? pat: ...

We could introduce a postfix token to the switch or case keywords to toggle between pattern or expression matching. This could allow you to gradually introduce pattern matching to an existing codebase.

For example, here I use case? in a normal switch statement to indicate that case clause selector is a pattern and should use pattern matching semantics:

// Normal `switch`, so `case` clause selectors parse expressions by default. Use `case?` to opt-in to pattern matching.
switch (expr) {
  case 0: break;            // selector is expression, match using equality
  case? { x: 0 }): break;   // selector is pattern, match using pattern matching
  case? [0, 1]: break;      // selector is pattern, match using pattern matching
}

Using switch? instead would change the default behavior of case to parse patterns instead, and require something like case! to opt back into parsing expressions for selectors:

// Semantically equivalent to the previous example, except `case` selectors are patterns by default. Use `case!`
// to opt-out of pattern matching.
switch? (expr) {
  case! 0: break;           // selector is expression, match using equality
  case { x: 0 }): break;    // selector is pattern, match using pattern matching
  case [0, 1]: break;       // selector is pattern, match using pattern matching
}

If switch were modified in such a way it would not be equivalent to the proposed match expression, since a switch can be non-exhaustive and permits fall-through. We could require switch? to be exhaustive while a normal switch is not, though that would mean the two examples above would not be semantically equivalent.

We could choose to allow patterns to introduce bindings in a case? clause, but we would have to decide on how bindings are scoped. Would the bindings be scoped to the switch or the case? clause? If bindings are scoped to the switch, the user would have to work around duplicate bindings. If the bindings are scoped to the case? clause, we would probably need to disallow fall-through and introduce a new block scope to the case? clause statements. We also would need to determine if the bindings introduced in a case? where mutable or immutable bindings.

case let pat: ...

Another way to introduce gradual pattern matching to switch would be to borrow from if-let and while-let above and add a case let (or case const) pattern. This would also lean on the idea that any assignment is a pattern as discussed in the Extending Assignment section:

switch (expr) {
    case 0: break;              // ok, regular equality
    case let 1 or 2: break;     // ok, matches pattern but no bindings are introduced
    case let { x: 0 }: break;   // ok, matches pattern but no bindings are introduced
    case let [x, 1]: break;     // ok, matches pattern, binds x
    case let y: break;          // ok, irrefutable pattern, binds `y`
}

As with the case? example above, we'd need to determine the scope for bindings and whether that affects fall-through and block scope. The upside is that we can use const instead of let to choose where bindings are immutable or mutable, and even support var if necessary.

We could also potentially reuse the if-let-condition and while-let-condition syntax for pattern guards (i.e., ;):

switch (expr) {
    case let { x }; x.bar: break; // match, bind x, test x.bar, execute statements
}

Though its also likely we could just continue to leverage if for pattern guards here:

switch (expr) {
    case let { x }: ...             // pattern only
    case let { x } if x > 0: ...    // pattern with guard
    case if y: ...                  // guard only
}

when pat: ... instead of case expr: ...

switch (expr) {
    when 1: break; // ok
    when { x: 1 }: break; // ok
    when NaN: break; // ok
    default: // ok
}

This has potential, but you would not be able to mix when and case, which could be confusing and would not allow for gradual adoption. Since ordering matters for pattern matching, you could not have a default: clause come first since it wouldn't be clear whether this is a normal switch or a pattern-matching switch. Fallthrough also could be confusing, especially in relation to bindings introduced in the pattern.

case is pat: ...

I've been considering an infix is expression for pattern matching, which could potentially apply to case in a normal switch, i.e.:

switch (expr) {
    case is 1: break; // ok
    case is { x: 1 }: break; // ok
    case is ({ x: 1 }): break; // no, looks like a function call
}

However, it is too easy to end up needing a parenthesized pattern, which turns the is into a function call and would be a breaking change to expression evaluation semantics.

Extending function

A number of pattern matching languages (F#, Scala, Haskell, various Lisp variants, etc.) allow you to define overloaded functions based on patterns. If we leveraged the idea that any assignment is a pattern from Extending Assignment, this might look something like the following:

function move({ direction: "north", steps }) { ... }
function move({ direction: "south", steps }) { ... }
function move({ direction: "east", steps }) { ... }
function move({ direction: "west", steps }) { ... }

This is an extremely powerful capability, but there are a number of concerns we would need to address:

  • Is this a normal function, with each independent function nested within?
  • How would this be affected by function or parameter decorators?
    @dec // can replace just this one or all?
    function move({ direction: "north", steps }) { ... }
    
    // does `@dec` know this is overloaded?
    @dec // what does this one see?
    function move(@dec { direction: "south", steps }) { ... }
    function move({ direction: "east", steps }) { ... }
    function move({ direction: "west", steps }) { ... }
  • Would this become an impedement to the Type Annotations proposal?
  • How is this any better than a match expression?
    const move = input => input match {
      when { direction: "north", steps }: ...;
      when { direction: "south", steps }: ...;
      when { direction: "east", steps }: ...;
      when { direction: "west", steps }: ...;
    }

Extending catch

It has often been requested that we introduce some form of catch guard to allow for multiple, independent catch clauses in a try statement. There are multiple ways this could be achieved.

catch (pat) { ... }

This approach would leverage the idea that any assignment is a pattern from Extending Assignment, allowing you to leverage custom matchers and extractors:

try {
    ...
}
catch (e@{ name: "FileNotFoundError" }) { ... }   // using `@` to bind and match
catch (e@FileNotFoundError) { ... }               // using `@` to bind and match
catch (@FileNotFoundError) { ... }                // using `@` to match
catch (Error{ cause: e@ReferenceError }) { ... }  // using extractors

Introducing New Syntax

In addition to extending existing syntax, we can also chose to introduce new syntax to perform pattern matching.

match Expressions

match expressions, per the current proposal, offer a much improved, expression-level version of switch that is specific to pattern matching. A match expression matches an input to one of several alternative branches until a match is found. Unlike switch, a match expression is exhaustive. If no branch is successfully matched, the expression will throw an error. Also, unlike switch, a match expression has exclusive branches — there is no fall-through.

match (input) {
    when 0: expr1;
    when 1 or 2: expr2;
    default: expr3;
}

is Expressions

An is expression is an infix operator that matches an expression on the left, to a pattern on the right. Unlike match, is is not exhaustive — If the pattern on the right matches, it results in true; if it does not match, it results in false.

Extractors

Extractors are a pattern mechanism that allows you to execute user-defined code during destructuring or pattern matching. There are two types of extractors: array extractors and object extractors.

Array Extractors

An Array Extractor is a pattern consisting of a qualified name (i.e., foo or foo.bar), and a parenthesized list of pattern elements:

const Option.Some(value) = opt;
const Node(left, right) = node;

The qualified name is resolved to an expression in scope, and the input is matched via a [Symbol.matcher] method that returns a result object indicating if the match was successful, and what the match value should be. The match value is then further destructured using array destructuring.

Object Extractors

An Object Extractor is a pattern consisting of a qualified name (i.e., foo or foo.bar), and a curly-bracketed list of pattern properties:

const Message.Move{ x, y } = opt;

The qualified name is resolved to an expression in scope, and the input is matched via a [Symbol.matcher] method that returns a result object indicating if the match was successful, and what the match value should be. The match value is then further destructured using object destructuring.

let and const patterns

If we decide not to consider the idea that any assignment is a pattern, we might instead want to have a way to declare explicit bindings in match and is expressions. One mechanism to do this would be to use something like C#'s variable patterns.

A variable pattern is an irrefutable pattern (it always matches) that introduces a lexical binding that is initialized to the match subject:

if (expr is { x: let x }) { ... }   // `let` pattern in `is`

match (expr) {
    when { x: let x }: ...;         // `let` pattern in `match`
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment