Created
February 2, 2026 07:49
-
-
Save kmicinski/456e65ec3cafaa750348df305e8a2466 to your computer and use it in GitHub Desktop.
Pattern matching
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| #lang racket | |
| ;; CIS352 -- Feb 3, 2026 | |
| ;; Pattern matching: | |
| ;; □ -- Trees, manually | |
| ;; □ -- Recursion over trees | |
| ;; □ -- Introducing pattern matching | |
| ;; □ -- Matching list patterns ("quasipatterns") | |
| ;; □ -- Matching predicate patterns | |
| ;; □ -- Trees, and algebraic data types generally | |
| ;; Quickanswer | |
| ;; Warmup: let's write (interleave l0 l1), a function that takes two lists | |
| ;; l0/l1 and returns the elements of l0 interleaved with the elements of l1 | |
| ;; Assume l0/l1 have the same length | |
| ;; (interleave '(0 1 2) '(x y z)) => '(0 x 1 y 2 z) | |
| (define (interleave l0 l1) | |
| 'todo) | |
| ;; Introduction: a big idea in CIS352 is being able to explain complex | |
| ;; programming language features in terms of simpler ones. A very | |
| ;; reasonable question is: "why Racket? Why not Haskell, Rust, etc." | |
| ;; The answer is that all of those languages have a *large* set of | |
| ;; core features (e.g., static type system), which makes it harder to | |
| ;; understand a very tiny core. There is an answer about Racket | |
| ;; specifically in the next lecture--but for now, we will study just | |
| ;; one more feature. | |
| ;; | |
| ;; Later on in the course we will study the λ-calculus: a language with only | |
| ;; three forms. We will show that in fact it is possible to encode every | |
| ;; single language feature in terms of the λ-calculus. While CIS352 covers | |
| ;; Racket, a main reason for doing this is that it is a language that has | |
| ;; a very direct connection to the untyped λ-calculus. | |
| ;; | |
| ;; Today we will talk about pattern matching, an increasingly-common language | |
| ;; feature which originated in functional programming (SML, Haskell, etc.) | |
| ;; but now exists in Python, Rust, and other general-purpose multi-paradigm | |
| ;; languages. | |
| ;; | |
| ;; We are learning the absolutely minimal number of features necessary to build | |
| ;; a complete, general-purpose programming language that allows us to get real | |
| ;; work done. Because we want to fully understand *everything*, we will take a | |
| ;; minimal number of things as fundamental. In the case of Racket, we will take | |
| ;; only a *single* data structure as fundamental: lists. Nearly *everything* | |
| ;; we build in class--from a data structures point of view--will be done *only* | |
| ;; using lists. | |
| ;; | |
| ;; Next week, we'll use algebraic data types for their main purpose in CIS352: | |
| ;; representing the *syntax* of another programming language, so that we can | |
| ;; start writing interpreters. | |
| ;; □ -- Trees, manually | |
| ;; In Racket, we will define an algebraic data type via a "type predicate:" | |
| (define (tree? t) | |
| (cond [(equal? t 'empty) #t] | |
| [(and (list? t) | |
| (equal? (length t) 4) ;; node's length is exactly 4 | |
| (equal? (first t) 'node) | |
| (tree? (third t)) | |
| (tree? (fourth t))) | |
| #t] | |
| [else #f])) | |
| ;; A binary tree is either: | |
| ;; - The symbol 'empty | |
| ;; - The list `(node ,v ,l0 ,l1) where: | |
| ;; -> l's first element is 'node | |
| ;; -> l's second element is any value v | |
| ;; -> l's third element l0 is also a tree? | |
| ;; -> l's fourth element l1 is also a tree? | |
| ;; - A tree is *nothing else* | |
| ;; Notice how the second `cond` case is extremely ugly: we have to | |
| ;; spell out the shape of the list very precisely. The point of | |
| ;; pattern matching is to avoid that pain. | |
| ;; A few example trees: | |
| 'empty | |
| '(node 5 empty empty) | |
| '(node 5 | |
| (node 0 empty empty) | |
| (node 10 (node 7 empty empty) (node 12 empty empty))) | |
| ;; QuickAnswer | |
| ;; Write the tree: | |
| ;; 20 | |
| ;; / \ | |
| ;; 10 30 | |
| ;; in a manner that defines tree? | |
| ;; (node ...) | |
| ;; One fact about trees in Racket: they are *just lists*. We can build | |
| ;; them entirely out of cons cells. We don't need any fancy machinery | |
| ;; to define trees, other than recursion and lists. Once we have | |
| ;; recursion and lists, we will see we can build the rest of | |
| ;; computation. | |
| ;; □ -- Recursion over trees | |
| ;; Now we want to write a recursive function over trees. Let's start: | |
| ;; A trees size is the number of *nodes* it contains | |
| (define (tree-size t) | |
| (if (equal? t 'empty) | |
| ;; (tree-size 'empty) => 0 | |
| 0 | |
| ;; it's a node: | |
| ;; the left subtree is (third t), the right subtree is (fourth t) | |
| (let ([t0-size (tree-size (third t))] | |
| [t1-size (tree-size (fourth t))]) | |
| (+ 1 t0-size t1-size)))) | |
| ;; As we mentioned, every algebraic data type will tell us what the | |
| ;; "shape" of our recursion should look like. In the case of trees, we | |
| ;; get the following recursion principle: | |
| ;; | |
| ;; To define a recursive function f that works on all trees, define: | |
| ;; - (f 'empty) | |
| ;; - (f `(node ,v ,t0 ,t1)) | |
| ;; -> *Using* the value of (f t0), (f t1). I.e., the result of | |
| ;; applying f to the left / right subtree | |
| ;; | |
| ;; In the above definition, we used (third t) and (fourth t) to | |
| ;; manually extract t0/t1. Today, we'll learn about pattern matching, | |
| ;; which will make this much easier. But for now, let's work another | |
| ;; example in the same style as the previous: | |
| ;; QuickAnswer | |
| ;; Define (tree-sum t), assuming t satisfies tree?: | |
| ;; - If t is 'empty, return 0 | |
| ;; - If t is a node, return (second t) + tree-sum applied to | |
| ;; the left/right subtree (use (third t) and (fourth t) to get them). | |
| ;; (tree-sum '(node 5 (node 10 empty empty) (node -5 empty empty))) => 10 | |
| (define (tree-sum t) | |
| 'todo) | |
| ;; □ -- Introducing pattern matching | |
| ;; We will introduce a new construct: (match e [pattern body] ...) | |
| ;; This says: "match an expression e against a number of patterns." | |
| ;; | |
| ;; Here is a brief example: | |
| (define (fib x) | |
| (match x | |
| [0 1] | |
| [1 1] | |
| [n (+ (fib (- x 1)) (fib (- x 2)))])) | |
| ;; A match expression consists of a: | |
| ;; - Discriminee, an expression *being matched on / observed* | |
| ;; - A number of pattern / body pairs, surrounded by [ ... ] | |
| ;; | |
| ;; The semantics is this: | |
| ;; 1. First, evaluate the discriminee down to a value, v | |
| ;; 2. Try to match v against the first pattern: | |
| ;; -> If the pattern matches, bind pattern variables and evaluate the body | |
| ;; 3. If the first pattern fails to match, try the next pattern, ... | |
| ;; ... | |
| ;; - If no pattern matches, raise an exception. | |
| ;; | |
| ;; Trivia: we call a match *total* or *exhaustive* when it matches all | |
| ;; possible values that v could be. In this case, we can rest assured | |
| ;; that we will never face an error simply due to the fact that we | |
| ;; didn't have the right match case. Later on in the class, we will | |
| ;; study tools for how we could reason about this (and related | |
| ;; errors). | |
| ;; So far, we've seen two examples of patterns: | |
| ;; - Every literal value (like 0, "hello", 'foo) can be used as a pattern | |
| ;; -> Matches exactly that literal | |
| ;; - A variable forms a pattern and binds the variable | |
| ;; - The wildcard pattern _ matches anything | |
| ;; The two are equivalent | |
| (define (foo x) | |
| (cond [(equal? x "hello") 1] | |
| [(equal? x 15) 2] | |
| [else 3])) | |
| (define (foo-using-match x) | |
| (match x | |
| ["hello" 1] | |
| [15 2] | |
| [_ 3])) | |
| ;; QuickAnswer | |
| ;; Translate the following function, bar, to use match | |
| (define (bar x) | |
| (cond [(equal? x 1) 1] | |
| [(equal? x -1) -1] | |
| [else (bar (- (- x 1)))])) | |
| (define (bar-match x) | |
| 'todo) | |
| ;; □ -- Matching list patterns ("quasipatterns") | |
| ;; The next pattern we'll learn is the *most important for CIS352.* | |
| ;; | |
| ;; Let's look at the function match-list, which takes a list l | |
| ;; and returns: | |
| ;; - the first element if it is of length 1 | |
| ;; - the sum of the two elements if it is of length 2 | |
| ;; - the third elements onward if it is of length > 2 | |
| (define (match-list l) | |
| (match l | |
| [`(,x0) x0] | |
| [`(,x0 ,x1) (+ x0 x1)] | |
| [`(,x0 ,x1 ,rest ...) rest])) | |
| ;; IMPORTANT: in a pattern, you *MUST* use `, not ' | |
| ;; | |
| ;; IMPORTANT: when we use "..." in a pattern, it matches *zero or more* | |
| ;; occurrences of the pattern immediately previous to it. | |
| ;; If that pattern is a variable, then it is bound to a list. | |
| ;; | |
| ;; IMPORTANT: just to repeat this, in the following example, y is bound | |
| ;; to a *list*, because we have `...` following y in the pattern. | |
| (define (f+ l) | |
| (match l | |
| [`(1 ...) "matches zero or more 1s"] | |
| [`(x ,y ...) "matches any list starting with 'x, followed by y (matches as a list)"])) | |
| ;; (match-list '(5)) => 5 | |
| ;; (match-list '(1 2)) => 3 | |
| ;; (match-list '(1 2 3)) => '(3) | |
| ;; (match-list '(1 2 3 4)) => '(3 4) | |
| ;; The `(,x0 ,x1 ,x2 ...) is called a *quasipattern*, which sounds complex, but | |
| ;; it is actually quite simple once you get the gist of it: | |
| ;; - `(,x0 ,x1 ,x2) matches a list of size 3 | |
| ;; -> x0, x1, and x2 are bound as pattern variables | |
| ;; - `(,x0 ,x-rest ...) matches a list of size >1 | |
| ;; -> x-rest can be '(), or any length >0 | |
| ;; - We will only ever use `...` at the *end* of a quasipattern (pattern that | |
| ;; matches a list). | |
| ;; | |
| ;; It turns out that x0, x1, and x2 are just *patterns*, which happen | |
| ;; to be (in this case) variable patterns. We can use literals too, or | |
| ;; any other pattern: | |
| (match '(1 2 3) | |
| [`(1 2 ,x) x]) ;; returns 3 | |
| ;; Quickanswer | |
| ;; Use match to define a function, (every-other l), which returns every | |
| ;; other element in l. | |
| ;; (every-other '()) => '() | |
| ;; (every-other '(1)) => '(1) | |
| ;; (every-other '(1 2)) => '(1) | |
| ;; (every-other '(1 2 3)) => '(1 3) | |
| ;; hints: three patterns to use | |
| ;; `(,x0 ,x1 ,x-rest ...), `(,x0), and '() | |
| (define (every-other l) | |
| 'todo) | |
| ;; □ -- Matching predicate patterns | |
| ;; Sometimes we want to say: "match this, but only if it satisfies a | |
| ;; certain predicate." To do that, we can use predicate patterns: | |
| (define (bar++ x) | |
| (match x | |
| [(? even? y) (bar++ (- y 1))] ;; no reason I had to bind y | |
| [(? odd?) (* x 3)])) ;; because y = x in this case! | |
| ;; The syntax for a predicate pattern is one of two things: | |
| ;; - (? predicate?) Match if predicate? (a function returning #t/#f) holds | |
| ;; - (? predicate? x) Match if predicate? holds, and *also* bind x | |
| ;; Now, remember how list patterns work: `(,x ...)` matches a *list* | |
| ;; and binds it to `x`. Previously however, we mentioned that x need | |
| ;; not be a name: it can be *any* pattern. It is often a name, in | |
| ;; which case it matches *any* list as x. But we can also constrain | |
| ;; it, to match a list of elements, all of which satisfy some | |
| ;; property: | |
| (define (evens-or-odds? l) | |
| (match l | |
| ['() "We got nothing."] | |
| [`(,(? even? evens) ...) "We got all evens."] | |
| [`(,(? odd? odds) ...) "We got all odds."] | |
| [_ "We got a mix."])) | |
| (evens-or-odds? '(1 3 5)) | |
| (evens-or-odds? '(2 4 6)) | |
| ;; QuickAnswer | |
| ;; Describe the behavior of mystery-function. What does it return? | |
| ;; Consider all possible inputs, but describe its behavior as | |
| ;; precisely as you can. | |
| (define (mystery-function l) | |
| (match l | |
| [`(1 ,(? zero? xs) ...) (length xs)] | |
| [_ "error"])) | |
| ;; □ -- Trees, and algebraic data types generally | |
| ;; We can use pattern matching to implement "type predicates" for | |
| ;; algebraic data types: types defined via constructors of associated | |
| ;; arities. We rewrite the tree? predicate using `match` as follows. | |
| (define (match-tree? t) | |
| (match t | |
| ['empty #t] | |
| [`(node ,v ,(? match-tree? t0) ,(? match-tree? t1)) #t] | |
| [_ #f])) | |
| ;; Now, let's use matching to sum up all of the elements in the tree | |
| ;; sum-tree : match-tree? -> number? (assume tree of numbers) | |
| (define (sum-tree t) | |
| (match t ;; match is total by assumption that t satisfies match-tree? | |
| ['empty 0] | |
| [`(node ,v ,t0 ,t1) (+ v (sum-tree t0) (sum-tree t1))])) | |
| ;; Now, we can define a function to calculate the maximum element in a | |
| ;; tree of numbers. Make 'empty return -inf.0 | |
| (define (max-tree t) | |
| (match t | |
| ['empty -inf.0] ;; make the maximum negative infinity | |
| [`(node ,v ,t0 ,t1) (max v (max-tree t0) (max-tree t1))])) | |
| ;; QuickAnswer | |
| ;; A tree satisfies the BST (binary search tree) property whenever | |
| ;; every subtree `(node ,v ,t0 ,t1) has that (max t0) < v < (max t1). | |
| ;; | |
| ;; Basic idea: match t | |
| ;; - If it's 'empty, return #t | |
| ;; - If it's `(node ,v ,t0 ,t1), call `(max-tree t)` (above) to | |
| ;; calculate the maximum of t0 and t1, then check the necessary | |
| ;; properties. | |
| (define (bst-property? t) | |
| 'todo) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment