- Patterns should be concise
- Patterns should be expressive
- Patterns should be explicit
- Patterns should be extensible
- Pattern matching should be exhaustive
The primary goal of pattern matching is to reduce the overhead incurred by non-pattern based control flow. This means simplifying multi-statement or multi-expression comparisons to improve readability, as well as to cut down on the need to introduce temporary variables used only to avoid recomputing potentially side-effectful or expensive code.
Consider the following ECMAScript code sample:
if (obj.dimensions.height === 10 && obj.dimensions.width === 10) { … }
Example 1 — A multi-step conditional. (ECMAScript)
In this example, the dimensions
property could potentially be an accessor that is side-effectful, or may
perform an expensive computation. As such, we may want to avoid re-evaluating obj.dimensions
and might instead use a
variable to store the partial computation:
const dimensions = obj.dimensions;
if (dimensions.height === 10 && dimensions.width === 10) { … }
Example 2 — A multi-step conditional with a temporary variable. (ECMAScript)
However, this further increases the complexity of the condition by adding an additional statement, and requires the naming of an otherwise unused variable.
Now consider the same condition using the pattern matching syntax from C#:
if (obj is { dimensions: { height: 10, width: 10 } }) { … }
Example 3 — Nested pattern matching. (C#)
The pattern matching example avoids re-execution of the property access from example 1, as well as the need to introduce a temporary variable like in example 2.
Patterns should be concise. However, for a pattern to be adequately concise, the pattern syntax must also be expressive.
In order for a pattern to be concise, the pattern syntax must be expressive enough to represent the most common cases of multi-step conditionals. There are a number of ways patterns can support such expressivity:
- Literal constant patterns that express
===
or SameValue/SameValueZero equality. - Nested property patterns that can reduce the need for temporary variables and repeated use of
&&
. - List patterns that can reduce the need for complex iteration and caching logic, as well as avoid opaque index accesses
like
x[0][1]
. - Type-family patterns that test whether a value is an instance of a specific type.
- Relational patterns that can express other relational operators (i.e.,
>
,>=
,<
,<=
), or even ranges. - Logical patterns that allow for conjection, disjunction, negation, and grouping.
Most languages with pattern matching support many of the above capabilities:
input switch {
0 => …, // literal constant
{ x: 0, y: 0 } => …, // nested properties
(1, 2, 3) => …, // tuples
[1, 2, 3, ..] => …, // lists
Rect => …, // type family
>1 => …, // relational
not null => …, // logical negation
1 or 2 or 3 => …, // logical disjunction
>0 and <3 => …, // logical conjunction
(>0 or >0F) and (<100 or <100F) => … // grouping
}
Example 4 — Expressivity in C# switch
expressions. (C#)
match input with
| 0 -> … // literal constant
| { x = 0; y = 0; } -> … // nested properties
| (1, 2, 3) -> … // tuples
| [1; 2; 3] -> … // lists
| [|1; 2; 3 |] -> … // arrays
| p : Point -> … // type family
| :? Point -> … // type family
| 1 | 2 | 3 -> … // logical disjunction
| (a, b) & (_, "test") -> … // logical conjunction
| (1 | 2) & (2 | 3) -> … // grouping
| …
Example 5 — Expressivity in F# match
expressions. (F#)
match input {
0 => …, // literal constant
Point { x: 0, y: 0 } => …, // nested properties
(1, 2, 3) => …, // lists/tuples
Point { .. } => …, // type family
0..=5 => …, // range
1 | 2 | 3 => …, // logical disjunction
…
}
Example 6 — Expressivity in Rust match
expressions. (Rust)
input match
case 0 => … // literal constant
case Some(value) => … // nested properties (case classes and extractors)
case (1, 2, 3) => … // tuple pattern
case 1::2::3 => … // list pattern (infix operator and extractors)
case p: Point => … // type family
case 1 | 2 | 3 => … // logical disjunction
…
Example 7 — Expressivity in Scala match
expressions. (Scala)
Patterns should be expressive. However, there are often either complex or niche cases where patterns cannot be sufficiently expressive. In such cases, patterns should also be extensible. And while patterns should be sufficiently expressive to handle the majority cases, their syntax should also be unambiguous. For this reason, patterns must also be explicit.
Because patterns should be concise and expressive, we must take care that we do not overload the meaning of various syntactic constructs. This would introduce confusion to readers and result in potentially unexpected behavior. To avoid this, patterns should be explicit — there should be no ambiguity in the meaning of a particular pattern.
For example, consider a list pattern like [A, B]
(or (A, B)
, depending on the language). Should this pattern
introduce new variables A
and B
corresponding to the first and second elements of the input? Or should it instead
resolve the values stored in the existing A
and B
variables and match them against the corresponding elements of the
input?
In C#, the pattern (A, B)
is unambiguous and can only indicate references to existing types. The syntax related to
introducing new variables requires a type prefix or var
keyword. (A, B)
matches a tuple whose elements are the
types A
and B
, respectively. (A a, B b)
matches a tuple with elements of type A
and B
and stores those values
into the variables a
and b
, respectively. (var a, var b)
matches a tuple regardless of the types of the two
elements, and introduces the variables a
and b
.
Depending on the underlying syntax, such collisions can be avoided. For ECMAScript, there are a number of options to explore:
- Leveraging the existing
let
,const
, andvar
keywords (and potentially the proposedusing
keyword), similar to C#'svar
and type binding patterns. - Introducing novel syntax for explicitly named bindings, such as Rust's
@
binding operator. - Dislocating the bindings introduced from the pattern itself.
We must take care that whatever syntax we choose aligns as closely as possible with our preceding core tenets, namely:
- Being explicit should not unduly prevent us from remaining concise.
- Being explicit should not unduly prevent us from remaining expressive.
While patterns should ideally be sufficiently expressive, there are often more complex or niche cases that cannot be readily handled by pattern syntax. To address this, patterns should be extensible — providing escape hatches to address these cases.
There are several mechanisms for providing extensibility:
- Pattern guards, which introduce additional conditions to evaluate to match a specific case.
- Custom matchers, which augment pattern matching behavior either through custom operators or a dynamic API.
- Interpolation, which escapes from the pattern syntax to access runtime values or custom matchers.
In cases where the pattern syntax is insufficiently expressive, many languages with pattern matching also introduce pattern guards which provide additional expressivity at the cost of dislocation of the guard from the pattern itself:
input switch {
(var a, var b) when a > 0 && b > 0 => …, // pattern guard using 'when'
…
}
Example 8 — when
pattern guard in C# switch
expressions. (C#)
match input with
| (a, b) when a > 0 && b > 0 -> … // pattern guard using 'when'
| …
Example 9 — when
pattern guard in F# match
expressions. (F#)
input match {
(a, b) if a > 0 && b > 0 => … // pattern guard using 'if'
}
Example 10 — if
pattern guard in Rust match
expressions. (Rust)
input match
case (a, b) if a > 0 && b > 0 => … // pattern guard using 'if'
…
Example 11 — if
pattern guard in Scala match
expressions. (Scala)
While pattern guards are useful for handling niche cases, a truly expressive pattern syntax primarily favors a more comprehensive set of patterns that promote locality of conditions to improve readability.
In addition to pattern guards, pattern matching languages often introduce extensibility mechanisms
through a dispatch mechanism. In C#, this is handled via the
Deconstruct
method, which allows
for the evaulation of user-defined behavior during destructuring and matching. In F#, this is addressed by
Active Patterns, which allow you
to reference a user-defined function to customize matching behavior. In Scala, this is supported via
Extractor Objects
and infix operator patterns.
class Person {
public string GivenName;
public string FamilyName;
public Person(string givenName, string familyName) {
GiveName = givenName;
FamilyName = familyName;
}
public void Deconstruct(out string givenName, out string familyName) {
givenName = GivenName;
familyName = familyName;
}
}
input match {
Person(var givenName, var familyName) => … // checks type and calls Deconstruct
}
Example 12 — Pattern extensibility via Deconstruct
. (C#)
let (|RGB|) (col : System.Drawing.Color) = ( col.R, col.G, col.B )
match col with
| RGB(r, g, b) -> … // invokes RGB with input and destructures resulting tuple
| …
Example 13 — Pattern extensibility via Active Patterns. (F#)
object CustomerID {
def unapply(customerID: String): Option[String] = {
val stringArray: Array[String] = customerID.split("--")
if (stringArray.tail.nonEmpty) Some(stringArray.head) else None
}
}
input match
case CustomerID(name) => … // Invokes CustomerID.unapply
Example 14 — Pattern extensibility via Extractor Objects. (Scala)
// NOTE: exists in standard library
object :: {
def unapply[A](ls: List[A]): Option[(A, A)] = {
if (ls.empty) None
else Some((ls.head, ls.tail))
}
}
input match
case h::t => … // Looks up `::` operator, invokes `unapply` and destructures result into `h` and `t`
Example 14 — Pattern extensibility via Extractor Objects and infix operator patterns. (Scala)
One last extensibility mechanism to consider is interpoloation. This is an escape hatch from the pattern syntax that allows you to substitute the result of evaluating an expression into a pattern. Such a mechanism is not widely employed in any language, however.
Statement-based control flow in ECMAScript often allows you to elide branches. You can have an if
without an else
and a switch
without a default
and this is acceptable within the language itself because these statements can
simply produce an empty completion record for unhandled cases. However, unhandled branches are frequently a source of
bugs in software and many static analysis tools often implement control flow analysis to find and surface such issues.
Expression-based control flow in ECMAScript differs in that it does not allow you to elide branches. A conditional
expression cannot elide the "falsy" branch, and operators like &&
, ||
, and ??
do not allow you to elide the
right-hand of the expression. Even operators like ?.
are essentially exhaustive as they are often just a shorthand
for a conditional operation whose "falsy" branch is just undefined
.
Exhaustiveness is achieved in pattern matching through the use of refutable patterns and irrefutable patterns. A refutable pattern is one that conditonally matches a value. If such a pattern does not match, the next alternative or branch should be attempted. An irrefutable pattern is one that will always match. For a pattern matching construct to be considered exhaustive it must either cover the input domain with sufficient refutable patterns, or it must also have an irrefutable pattern.
For example, in Rust every variable declaration is essentially a pattern. The x
in the statement let x = 5
is an
irrefutable pattern, as there is no input value that would not match. However, the Some(x)
in the statement
let Some(x) = opt
is a refutable pattern: if opt
is not a Some
, the pattern will not match and an error will
be reported.
This is not unlike destructuring in ECMAScript today. The statement let { x } = input
is an example of a refutable
pattern: if input
is not an object, the statement will throw. In the same vein, let [x] = input
is also a
refutable pattern: if input
does not have a [Symbol.iterator]
method, the statement will throw.
One way we could implement exhaustiveness in ECMAScript pattern matching would be to also throw if some branch of the
pattern fails to match. For example, if we were to replace the simple conditional a ? b : c
with a switch
expression in C#, it might look something like this:
a switch {
true => b,
false => c
}
Example 15 — Exhaustiveness in a switch
expression. (C#)
If we were to elide either branch, C#'s control flow analysis would fail, not unlike if we'd written a ? b :
or
a ? : c
.
Event without static typing we can still support exhaustiveness in ECMAScript at runtime similar to how we support it
for destructuring, by throwing a TypeError
. In addition, third-party static analysis tools such as TypeScript and Flow
would be able to recognize non-exhaustive pattern matching expressions and provide early warnings of potential errors.
The one instance where exhaustiveness should not be required is when the intent is to test whether the pattern matched
rather than what was matched. In C#, the is
expression can be used to conditionally test a pattern and returns
either true
if the pattern was exhaustive, or false
if it was not:
if (obj is int) { ... }
Example 16 — Non-exhaustive pattern matching via the is
expression. (C#)
Similarly, in Rust you can use if let
and while let
to conditionally match a non-exhaustive pattern:
if let Some(x) = opt { ... }
Example 17 — Non-exhaustive pattern matching via the if let
statement. (Rust)
Pattern matching should be exhaustive, as exhaustiveness helps to avoid defects.