Last active
April 26, 2019 06:04
-
-
Save spitz-dan-l/8ac450008121dfb89bac3c9c0e26a9d0 to your computer and use it in GitHub Desktop.
Python Atrocity: Structural Pattern Matching built on Exception Handling Syntax
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import sys | |
import functools | |
""" | |
Python Atrocity: Structural Pattern Matching built on Exception Handling Syntax | |
Daniel Spitz | |
(Python 3.7+ only) | |
The goal of this Python Atrocity is to implement extensible, composable pattern | |
matching in a way that is syntactically remeniscent of the pattern matching in | |
functional languages. | |
We target exception handling syntax for this purpose. | |
(See the very bottom for 3 demos of it in use.) | |
Exception handling in python allows for multiple exception types to be checked | |
sequentially, with a different code block getting executed according to which | |
exception type matches first. It also allows for the optional binding of a new | |
name within the scope of the handling code block (e.g. "except Exception as e:"), | |
although binding multiple names is not supported. | |
So with the right massaging, we ought to be able to change this: | |
try: | |
... | |
except IndexError: | |
... # handle case 1 | |
except TypeError: | |
... # handle case 2 | |
to this: | |
try: | |
raise obj | |
except Pattern1 as p1: | |
... # do stuff with p1 | |
except Pattern2 as p2: | |
... # do stuff with p2 | |
Unfortunately, such a direct expression of pattern matching is not actually possible. | |
Here are the stumbling blocks: | |
1. We cannot raise any old object as an exception. Only instances of BaseException can be raised. | |
2. We cannot use any old class in an except clause. Only subclasses of BaseException can be used. | |
3. The class in an except clause is determined to be a match via a direct check on the raised | |
object's __mro__, not via the more generic and overrideable isinstance() machinery. | |
4. We can only bind a single new name in a given except clause, making concise destructuring | |
assignment, a key feature of pattern matching, difficult. | |
In this python atrocity, we attempt to address all 4 stumbling blocks, to varying levels of | |
success (if you can really call it that). | |
""" | |
""" | |
Exception handling hacking implementation. | |
Lifted from an old python-ideas thread[0] with far less nefarious purposes in mind | |
originally[1]. | |
Thanks to Random832 for the key insight about using sys.exc_info() *inside the except clause*! | |
"This is, of course, a *profoundly* ugly way to do this." - Random832 | |
The idea here is that we can call this new function _isinstance_except() inside of | |
each except clause, returning its result rather than the original exception class in | |
the clause. | |
So, this: | |
except TypeError: | |
pass | |
would become | |
except _isinstance_except(TypeError): | |
pass | |
The *effect* of this change would be roughly the same as if python used | |
isinstance() to check if an exception clause is a match, rather than directly | |
checking the __mro__. Thus do we address stumbling block #3. | |
To address the syntactic ugliness, we can hide this call in an attribute access on | |
the pattern class being checked: | |
except MyPatternType.M: | |
pass | |
The ".M" implicitly calls _isinstance_except(), and a naive end user might | |
simply shrug it off as standing for "Match". Ignorance is bliss. | |
[0] https://mail.python.org/pipermail/python-ideas/2015-November/037121.html | |
[1] https://mail.python.org/pipermail/python-ideas/2015-November/037104.html | |
""" | |
class _NeverThrown(BaseException): | |
""" | |
Exception class which is expected never to be thrown. | |
Used to disqualify certain `except` blocks from ever | |
matching a raised exception. | |
""" | |
def _isinstance_except(type): | |
""" | |
Normally python checks if an exception is of a certain type | |
by looking for the type in the exception's class' __mro___. | |
By wrapping your exception types in a call to this function we | |
simulate what it'd be like if the core exception handling logic | |
used isinstance() instead of explicitly checking the __mro__. | |
This allows us to attain good, useful, worthwhile power by | |
arbitrarily overriding __instancecheck__ on the metaclass of a | |
new BaseException subclass. | |
""" | |
etype, e = sys.exc_info()[:2] | |
if isinstance(e, type): | |
return etype | |
return _NeverThrown | |
""" | |
Structural Pattern Matching Implementation | |
We define a collection of classes and decorators for the purposes of | |
encapsulating Structural Patterns and the mechanisms by which they can be hooked | |
into python's exception handling machinery. | |
Importantly, any object raised must be an instance of BaseException, | |
and any class appearing in an except clause must be a subclass of | |
BaseException. | |
We therefore define a MatchWrapper class to wrap arbitrary objects for | |
matching. Since MatchWrapper is a subclass of BaseException, it can be | |
raised, and can then carry along the object being matched to any of the | |
patterns appearing in the except clauses. | |
We likewise define a Pattern base class for all patterns intended to | |
appear in an except clause. In order to gain control of instance-checking | |
against Pattern subclasses, we give Pattern a metaclass with an overridden | |
__instancecheck__ method. Together, the MatchWrapper and Pattern classes | |
help us address stumbling blocks #1 and #2. | |
Our overridden __instancecheck__ will dispatch to an overrideable classmethod | |
match() to determine if a given object is an instance of the given pattern. It | |
will also specially recognize MatchWrapper objects. If a MatchWrapper is passed | |
in, then the wrapped object will be passed along to the match() call, rather than | |
the wrapper itself. | |
Additionally, if the call to match() returns something, then | |
that result is set to the wrapper's .bindings attribute, for usage within | |
the exception handling code block. While a bit disappointing, this was the best | |
way I could find to address stumbling block #4. While this does not let us | |
do destructuring assignment within the "except" line itself, it lets us do it | |
in the first line of the handling block, via e.g.: | |
except Pair.M as p: | |
first, second = p.bindings | |
... | |
The Pattern base class is intended to be extended by end users to | |
define their own patterns. A function decorator, @pattern, is provided for | |
defining simple patterns with less boilerplate than writing a class would | |
require. | |
On the other side of things, the match() function provides the means of | |
pushing an object through the matching machinery, and returning the result. | |
The notion of a "matcher" is defined as a generator that behaves in a certain | |
way, and end users can use the @matcher decorator to bundle up their | |
matcher generator functions with an implicit call to match(). | |
""" | |
class MatchWrapper(BaseException): | |
""" | |
Wrap any object to make it throwable. This in turn lets us | |
"catch" any object according to arbitrary predicates, allowing | |
us to achieve structural pattern matching with exception handling | |
syntax! | |
""" | |
def __init__(self, object): | |
self.object = object | |
self.bindings = None | |
class NoMatch(Exception): | |
""" | |
Raised by pattern matchers when they find no match. | |
""" | |
class PatternMeta(type): | |
""" | |
Metaclass for Pattern classes. Overrides | |
__instancecheck__ to call the class' .match() method. | |
""" | |
def __instancecheck__(cls, instance): | |
if isinstance(instance, MatchWrapper): | |
try: | |
result = cls.match(instance.object) | |
except NoMatch: | |
return False | |
else: | |
instance.bindings = result | |
return True | |
else: | |
try: | |
result = cls.match(instance, match) | |
except NoMatch: | |
return False | |
else: | |
return True | |
@property | |
def M(cls): | |
""" | |
Just syntax sugar to concisely wrap any given exception class | |
with a call to _isinstance_except(). | |
""" | |
return _isinstance_except(cls) | |
class Pattern(BaseException, metaclass=PatternMeta): | |
""" | |
Base class for patterns. A pattern subclass must implement | |
match(). You can then use the pattern class in an except clause. | |
""" | |
@classmethod | |
def match(cls, obj): | |
""" | |
Attempt to match obj, returning a bindings object (which can be anything) | |
if successful. If the object fails to match, raise a NoMatch exception. | |
""" | |
raise NotImplementedError() | |
def pattern(func): | |
""" | |
Convenience decorator to create simple Pattern subclasses. | |
The decorated function becomes the match() implementation on | |
the new subclass. It is assumed to be a static method, so must not | |
accept a class as the first argument. | |
""" | |
return PatternMeta( | |
func.__name__, | |
(Pattern,), | |
{'match': staticmethod(func)}) | |
def match(m_gen, obj): | |
""" | |
Actually match an object against a "matcher". | |
A matcher is a python generator which is expected to | |
yield once, then the object being matched is wrapped in a | |
MatchWrapper instance and thrown into the generator. | |
The generator is then expected to handle the exception and optionally | |
return some result. | |
Because of all the hacks, the "matcher" generator's code can | |
look eerily like pattern matching syntax seen in some functional | |
languages, but with strange python boilerplate. | |
By putting matchers into a no-arg generator with a "try: yield" at the | |
beginning, we "abstract out" the actual process of wrapping the | |
input object in a MatchWrapper and raising it for the user. This minimizes | |
the needed boilerplate down to "try: yield", leaving more room for the | |
actual beef of the logic, the matching code. Put another way, this helps | |
further obscure the horror that's actually happening from the user, for their | |
own psychological well-being. | |
""" | |
try: | |
next(m_gen) | |
except StopIteration as e: | |
raise RuntimeError("Matcher raised StopIteration too early") from e | |
try: | |
m_gen.throw(MatchWrapper(obj)) | |
except StopIteration as e: | |
return e.value | |
except MatchWrapper: | |
raise RuntimeError('Matcher failed to match object: {}'.format(obj)) | |
else: | |
raise RuntimeError('Matcher did not return or raise') | |
def matcher(genfunc): | |
""" | |
Decorator for matcher generator functions. This will convert | |
the generator function into a single-argument function which | |
invokes match() on the passed-in argument against the decorated | |
generator function. | |
The generator function is always called with no arguments. | |
""" | |
@functools.wraps(genfunc) | |
def wrapped(x): | |
return match(genfunc(), x) | |
return wrapped | |
""" | |
Core Pattern Types | |
Here we define a few core pattern types that permit writing a wide | |
range of pattern matching code. | |
Included here are: | |
- Destructuring Tuple patterns | |
- Dictionary patterns | |
- Wildcard (Any) patterns | |
- Non-destructuring type-based patterns (Instance) | |
""" | |
class Tuple(Pattern): | |
""" | |
Tuple pattern. Matches a tuple of values against one | |
pattern for each index into the tuple. | |
""" | |
def __class_getitem__(cls, patterns): | |
if not isinstance(patterns, tuple): | |
raise TypeError('Tuple[] input must be a tuple.') | |
return PatternMeta('Tuple', (Tuple,), {'patterns': patterns}) | |
@classmethod | |
def match(cls, obj): | |
if not isinstance(obj, tuple) or len(obj) != len(cls.patterns): | |
raise NoMatch() | |
return tuple(p.match(o) for p, o in zip(cls.patterns, obj)) | |
class Dict(Pattern): | |
""" | |
Dictionary pattern. Matches the values of a dict against the | |
corresponding pattern for each value's key. | |
""" | |
def __class_getitem__(cls, patterns): | |
if not isinstance(patterns, dict): | |
raise TypeError('Dict[] input must be a dict.') | |
return PatternMeta('Dict', (Dict,), {'patterns': patterns}) | |
@classmethod | |
def match(cls, obj): | |
if not isinstance(obj, dict) or not obj.keys() >= cls.patterns.keys(): | |
raise NoMatch() | |
return {k: p.match(obj[k]) for k, p in cls.patterns.items()} | |
def D(*args, **kwds): | |
""" | |
Syntax sugar for Dict patterns (especially useful for string keys): | |
D(a=Any, b=Any) -> Dict[{'a': Any, 'b': Any}] | |
""" | |
result = {} | |
for a in args: | |
result.update(a) | |
result.update(kwds) | |
return Dict[result] | |
@pattern | |
def Any(obj): | |
""" | |
Matches any object, and binds it to the match's .bindings attribute. | |
""" | |
return obj | |
class Instance(Pattern): | |
""" | |
Just matches anything that isinstance() of a given type. | |
""" | |
def __class_getitem__(cls, t): | |
return PatternMeta('Instance', (Instance,), {'type': t}) | |
@classmethod | |
def match(cls, obj): | |
if not isinstance(obj, cls.type): | |
raise NoMatch() | |
return obj | |
""" | |
Demo: Type-based dispatch! | |
""" | |
@matcher | |
def say_type(): | |
try: yield | |
except Instance[str].M as s: | |
print('got a string:', s.bindings) | |
except Instance[int].M as i: | |
print('got an int:', i.bindings) | |
say_type('horse') # got a string: horse | |
say_type(35904) # got an int: 35904 | |
""" | |
Demo: Add two things! | |
""" | |
from numbers import Number | |
@matcher | |
def add(): | |
""" | |
A matcher can only receive a single argument as input, so | |
by convention if we are adding two arguments then we receive | |
a 2-tuple as input. | |
Tuple[Any, Any] ~> A 2-tuple whose contents can both be anything | |
""" | |
try: yield | |
except Tuple[Any, Any].M as t: | |
x, y = t.bindings | |
return x + y | |
print(add((6, 18))) # 24 | |
print(add(('af', 'sdfg'))) # 'afsdfg' | |
# error: | |
# print(add((3,5,9))) # error: input is length 3, not 2 | |
""" | |
Demo: Destructuring python lists into linked lists! | |
""" | |
class Cons(Pattern): | |
head_pattern = Any | |
tail_pattern = Any | |
def __class_getitem__(cls, item): | |
head_pattern, tail_pattern = item | |
return PatternMeta('Cons', (Cons,), { | |
'head_pattern': head_pattern, | |
'tail_pattern': tail_pattern}) | |
@classmethod | |
def match(cls, obj): | |
if isinstance(obj, list) and len(obj) > 0: | |
# return the bindings for list head and tail | |
return cls.head_pattern.match(obj[0]), cls.tail_pattern.match(obj[1:]) | |
raise NoMatch() | |
@pattern | |
def Nil(obj): | |
if not (isinstance(obj, list) and len(obj) == 0): | |
raise NoMatch() | |
@matcher | |
def sum(): | |
""" | |
Recursive sum() function straight outta CS101. | |
Demonstrates how bindings can be used for destructuring rather | |
than just returning the original object. | |
""" | |
try: yield | |
except Cons[Instance[Number], Instance[list]].M as c: | |
head, tail = c.bindings | |
return add((head, sum(tail))) | |
except Nil.M: | |
return 0 | |
print(sum([1,2,3,4])) # 10 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Its beautiful.