Lambda and the nature of data

The Three Warnings (50s)

Before we begin, a few warnings.

This talk contains code. It contains a lot of code, and you are encouraged to type it in and run it yourself. If you are allergic to code, congratulations on getting this far into a two-day Python conference, and I hope you have one more epi pin handy.
This is not a practical talk. I promise you it contains nothing that will make your code faster, more correct, or easier to work with. This is a 1:30 on a Sunday talk. This one's just for fun.
This talk contains a lot of code that will be transformed, as if by eldritch magic, into incomprehensible gibberish the instant you stop paying attention. Try and follow along. I think it'll be worth it.

What is `lambda`? (1m20s)

How many of you have used lambda yourself, in a real program?

OK, that's most of us. For the rest of you: I'm sorry, I guess you will actually learn something useful today.

lambda is just inline shorthand for a function.

Like, a normal function looks like this in Python,

def square(x):
    """ Return the argument squared. """
    return x * x

map(square, [1, 2, 3, 4])

and if I wanted to write this more compactly, instead of actually naming and documenting this function, which is boring, I can just jam the function right into the line of code where it's used, and that looks like this:

map(lambda x: x * x, [1, 2, 3, 4])

See, I've taken the two indispensible parts of the square function, the input and the output, and chucked everything else.

So that's lambda. It's useful with tools that take function arguments, like map, filter, reduce, list.sort, assertRaises, anything having to do with callbacks.

Also if you want to write obfuscated Python code, basically all you have to work with is lambdas and list comprehensions. The rest of the language is just way too sensible.

A little history

Python didn't always have lambda. It used to be that if you wanted a function, you used def and you gave it a name.

But then Mark Lutz figured out the following despicable hack:

def genfunc(args, expr):
    exec('def f(' + args + '): return ' + expr)
    return eval('f')

See what this does?

map(genfunc('x', 'x*x'), [1, 2, 3, 4])

This introduced a shorthand notation for functions.

This was so ludicrous that Guido capitulated and lambda was added to the language.

However, the scoping rules were not the same as the lambda calculus until Python 2.3, many years later.

What is the lambda calculus and why would anybody care? (2:15)

This of course is Alan Turing. I saw this online the other day.

1936 - Alan Turing invents every programming language that will ever be but is shanghaied by British Intelligence to be 007 before he can patent them.

...which is a joke, but it's basically true. In 1936 Alan Turing publishes a paper demonstrating this clever technique of proving that two computers, or two programming languages, are effectively equivalent, that one can compute everything the other can. It turns out all programming languages and all the computers you've ever used are equally powerful in this sense. The rest is performance and convenience. Turing basically relaunches the field of computer science which has been dormant since Babbage.

And then by 1938, Turing's working with Dilly Knox at Bletchley, cracking the Enigma cipher so England can beat the Nazis.

Less well known is what Turing did in those two years in between. He moved to America. Went to Princeton. To study under Alonzo Church.

1936 - Alonzo Church also invents every language that will ever be but does it better. His lambda calculus is ignored because it is insufficiently C-like. This criticism occurs in spite of the fact that C has not yet been invented.

Church and Turing independently published papers proving the same thing, using similar techniques, but Turing's was more readable. Turing machines were, well, they were machines, right? They operated step by step, kind of like a Python program operates statement by statement. It was tangible, easy to follow.

Church's basic model of programming was based on functions, functions calling functions, function all the things. You'll see how it is in a second. It's crazy.

But a funny thing happened.

It's 78 years later, Turing machines are a historical curiosity. They were never intended to be practical. Neither was lambda calculus.

But the lambda calculus is embedded in Python.

>>> &lambda; in python
True

Python didn't start out with the lambda calculus in it. It evolved it. Guido van Rossum, whose language design sense I respect as much as any human's, even if I don't always understand what he's thinking, started out with no lambdas and slowly ended up turning his language into a superset of Church's lambda calculus.

Isn't that weird?

What the lambda calculus is for: extreme simplicity

Church invented the lambda calculus to reason about programs. Specifically to prove things about what programs can and can't do.

As engineers, we like simplicity, right? Simple code is more maintainable. It's more likely to be correct.

Simple is better than complex.
Complex is better than complicated. --Tim Peters

Well, mathematicians love simplicity even more than you do. Simplicity means you can prove things. Simplicity leads to unexpected insight.

(blank slide)

But whether you're a mathematician or just a hacker, it's fun to try to reduce a programming language to the smallest possible subset that can still do stuff.

Here's an example of the kind of thing I mean: for loops.

When you see this:

for VARIABLE in ITERABLE:
    BLOCK

What Python actually does is something like this:

_iterator = iter(ITERABLE)
while True:
    _next = _iterator.__next__
    try:
        _value = _next()
    except StopIteration as _:
        break
    VARIABLE = _value
    BLOCK

So you see, a for loop is terribly convenient syntax. But that's all it is. If you have attribute lookups, function calls, exception handling, and while, you don't need for.

Well, to save us a bit of time, I'm just going to jump straight to the end here. I claim for is not the only bit of Python that's merely convenient. You can throw away every part of python except lambda and function calls.

The rest of this talk is going to try to hint at what you can do with just functions. Namely, everything --- except for interaction and storage. We're talking only about pure computation here. That is, answering questions.

Numbers

Let's try to create functions that represent the numbers. Maybe that is too big a task to begin with. Puzzle #0: Let's just try to create a function, a lambda, that represents the number 2.

Two

It has to be somehow different from all the other numbers, and ideally it somehow represents the essence of two-ness. So you could imagine:

two = lambda x: x + x

But this isn't allowed, because remember there's no addition operator.

two = lambda x: add(x, x)

No, there's no add function either. Let's try and invent at least one number before we try and define addition.

two = lambda x: [x, x]

And there are no lists.

What if we say that two is a function that takes both a function and another value as arguments, and calls the function twice?

two = lambda f, x: (f(x), f(x))

Not bad. But there are no tuples in the lambda calculus. There must be some way to call a function twice.

Any ideas?

two = lambda f, x: f(f(x))

There is just one more problem with this. Python lambdas can have multiple arguments, but in the lambda calculus that is forbidden. Every lambda has exactly one argument.

We can hack our way around this using a very clever trick called currying:

two = lambda f: lambda x: f(f(x))

Instead of two(f, x), we'll say two(f)(x). The first lambda will return the second lambda, which we immediately call, so that

two(f)(x) === f(f(x))

So this is the function I'd like you to actually type in. Put it in a file or a notebook. We'll be running this program lots of times.

two = lambda f: lambda x: f(f(x))

Two in action

Let's see this function two in action. Type this into your program or your Python shell.

import os

def spam(x):
    print("SPAM \a")
    os.system("say -r 300 SPAM")
    return x

The command to make your computer say something is say on a Mac. If you're running Linux, you want espeak.

Now this spam function is not a pure lambda. The function spam causes a side effect. Which isn't strictly allowed in the lambda calculus. But it's a nice hack because it lets us know how many times a function is being called.

two(spam)(dummy)
===> spam(spam(dummy))

So you see this will simply call spam twice. Try it out and see if it works.

Interesting. We have a function which behaves a little bit like a number. It's called two and it does something twice.

The other numbers

Now what do you think 3 would look like?

three = ...

Sure, it would be a function that takes two arguments, just like before:

three = lambda f: lambda x: ...

and it just calls f three times instead of two.

three = lambda f: lambda x: f(f(f(x)))

Try it out:

three(spam)(dummy)
===>  spam(spam(spam(dummy)))

You can see the pattern:

one   = lambda f: lambda x:       f(x)
two   = lambda f: lambda x:     f(f(x))
three = lambda f: lambda x:   f(f(f(x)))
four  = lambda f: lambda x: f(f(f(f(x))))

Type all these in, wont you?

These are called the Church numerals. They're a way to encode the natural numbers into the lambda calculus. Each number is a function, they all share the same API, and each one behaves differently, in a way that tells us which number it is.

Puzzle #0: What is the Church numeral for zero?

zero = ...

Try to guess what it might be. If you're doing it right I should hear the beautiful sound of zero spam.

(pause)

zero  = lambda f: lambda x:         x

Arithmetic

Well, this is cute and all; with a hack we can use our Church numerals for saying "spam"; that's nice.

But can these numbers actually do the things numbers do? Can we do arithmetic with them?

add1

Puzzle #1: How can we write the Church version of n+1?

add1 = ...
add1 = lambda n: ...
add1 = lambda n: (lambda f: lambda x: ...)
add1 = lambda n: (lambda f: lambda x: ...n(f)(x)...)
add1 = lambda n: lambda f: lambda x: f(n(f)(x))

add

Puzzle #2: Can we add Church numerals?

add = ...

What would this look like? It has to be something that takes two Church numeral arguments.

add = lambda a: lambda b: ...

And then it has to return a Church numeral somehow. The easiest way to return a Church numeral is just to build it on the spot:

add = lambda a: lambda b: (lambda ...)

What does a Church numeral look like again? It's a function with two arguments, f and x:

add = lambda a: lambda b: (lambda f: lambda x: ...)

and the result of this has to be applying f to x, a+b times. Figuring that bit out, that's the hardest part. Well, we know that a(f) returns a function, and that function takes any value and just applies f to it a times in a row.

          ---a times--
a(f)(x) = f(f(f(f(...(x)...))))

Likewise, b(f)(x) applies f to x b times in a row.

          ----b times-----
b(f)(x) = f(f(f(f(f(f(...(x)...))))))

So all we really have to do is apply both a(f) and b(f), one after the other.

Maybe you can figure out what that looks like.

add = lambda a: lambda b: (lambda f: lambda x: ...)

Start with x.

add = lambda a: lambda b: (lambda f: lambda x: ...x...)

First apply b(f) to it.

add = lambda a: lambda b: (lambda f: lambda x: ...b(f)(x)...)

Then apply a(f) to that.

add = lambda a: lambda b: (lambda f: lambda x: a(f)(b(f)(x)))

As it happens the outermost parentheses here are not required. We can drop them.

add = lambda a: lambda b: lambda f: lambda x: a(f)(b(f)(x))

Go ahead and type this in.

Of course this looks like gibberish. Maybe it made sense in bits and pieces, little flashes of insight as we worked through it, but it's so easy to forget all that once the work is done.

The good news is that lambda is the ultimate abstraction boundary. Having written this once, we can go straight into using it, and we never have to think of how it's implemented again. We know what it does when we call it. That's all that matters.

five = add(two)(three)
five(spam)(dummy)

Don't try to imagine the values of all the variables and what's captured what; just keep in your head what the final effect of each thing is.

mul

Can we multiply Church numerals?

mul = lambda a: lambda b: ...
mul = lambda a: lambda b: (lambda f: lambda x: ...)

                                               ---(a*b)times---
mul = lambda a: lambda b: (lambda f: lambda x: f(f(f(f(f(f(f(f(...x...)))))))))

mul = lambda a: lambda b: (lambda f: lambda x: (a*b)(f)(x))  # begging the question

mul = lambda a: lambda b: (lambda f: lambda x: a(f)(x))  # that would be a times

                                               -----b-times--------
mul = lambda a: lambda b: (lambda f: lambda x: a(f)(a(f)(a(f)(a(f)(...x...))))

mul = lambda a: lambda b: (lambda f: lambda x: b(a(f))(x))
mul = lambda a: lambda b: (lambda f: lambda x: a(b(f))(x))

Aside: lambda f: (blah)(f) is a function that does to f whatever (blah) would do to f. So the whole shebang, lambda and all, can be simplified to: lambda f: (blah)(f)

def tweedledee(x):
    return tweedledum(x)

is the same as:

tweedledee = tweedledum

So in the case of mul:

mul = lambda a: lambda b: lambda f: lambda x: a(b(f))(x)
mul = lambda a: lambda b: lambda f:           a(b(f))

Again, let's test it:

mul(three)(three)(spam)(dummy)

pow

Square:

square = lambda a: mul(a)(a)
square = lambda a: lambda f:   a(a(f))

Cube:

cube   = lambda a: lambda f: a(a(a(f)))

This is starting to look familiar. Can we generalize this pattern? Can we write the function pow() for Church numerals?

pow = lambda a: lambda exponent: lambda f: exponent(a)(f)
pow = lambda a: lambda exponent:           exponent(a)
pow = lambda a: lambda b:                  b(a)

Let's test it just one last time.

sixteen = pow(two)(four)
spam(sixteen)(dummy)

Aside: It's strange that the functions get simpler as they get higher-order.

add = lambda a: lambda b: lambda f: lambda x: a(f)(b(f)(x))
mul = lambda a: lambda b: lambda f:           a(b(f))
pow = lambda a: lambda b:                     b(a)

I don't understand why that is the case. At least, not well enough to attempt to explain it.

The effect is all that matters

The only thing we ever do with functions is apply them, so we have no idea what they're like "inside".

Tighter representations, more like binary than Peano arithmetic, are possible (with different interfaces).

Subtraction

Subtract 1: This one is kind of confusing.

sub1 = lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)

Subtraction: a-b: Just compose sub1 b times, and apply to a!

sub = lambda a: lambda b: b(sub1)(a)

At this point, let's leave numbers for the time being.

Pairs

What about data structures? We said that the lambda calculus doesn't have tuples. But can you build tuples out of functions?

What are the operations on tuples?

makepair; pairleft; pairright

| thing      | with Python tuples | with functions alone                   |
|------------+--------------------+----------------------------------------|
| make_pair  | lambda x, y: (x,y) | lambda x: lambda y: lambda s: s(x)(y)  |
| pair_left  | lambda p: p[0]     | lambda p: p(lambda l: lambda r: l)     |
| pair_right | lambda p: p[1]     | lambda p: p(lambda l: lambda r: r)     |

make_pair = lambda x: lambda y: lambda s: s(x)(y)

Of course, once you've got pairs, you can make larger tuples.

If you don't know how to do that, pick up a language like Scheme sometime.

Conditionals

API: 'if_else(cond)(then)(else)' returns either 'then' or 'else', depending on whether 'cond' is true or false.

if_else = lambda cond: lambda then: lambda els: (make_pair(then)(els))(cond)
true  = lambda l: lambda r: l
false = lambda l: lambda r: r

Can we test whether a Church numeral is zero?

is_zero = lambda a: a(lambda x: false)(true)

answer = if_else(is_zero(sub(three)(two)))(four)(seven)
answer(spam)(dummy)

Loops

So far, all the lambdas we've made finish running in a finite amount of time.

Can we write loops using only lambdas?

No fair using global variables

We need to be careful not to cheat, here. We've been using Python's = to assign names to things, so we can certainly write things like this:

factorial = lambda n: if_else(is_zero(n))
                             (lambda dummy: one)
                             (lambda dummy: mul(n)(factorial(sub1(n))))
                             (dummy)

because it's understood in Python that 'factorial' is in scope within the definition of 'factorial'.

But global variable binding is not part of the lambda calculus. As long as we're just using 'x = y' to mean, "Where I say 'x', I mean 'y'", then we're fine; that's just an abbreviation. But if you write:

factorial = lambda n: mumble ... factorial ... grumble

then that's not something you can just treat like an abbreviation, because it's infinite in length! It works in Python because Python only fetches variables when you get there: it waits until we actually call factorial before it fetches the value of factorial.

But, my claim is that 'lambda v: ...', '...(...)', and 'v' are the only kinds of expressions you need. I don't want to use Python global variables.

Stay awhile, stay forever

Indulge me for a moment. What does this expression evaluate to?

(lambda x: x(x))(lambda x: x(x))

It's a call; substitute the latter into the former, yielding:

(lambda x: x(x))(lambda x: x(x))

THE SAME THING. Which we then call again. Stack overflow.

What if we slip an f into the mix?

(lambda x: f(x(x)))(lambda x: f(x(x)))

the repetitive structure is the same; substitute the argument into the function:

(lambda x: f(x                  (x                  )))(lambda x: f(x(x)))
           f((lambda x: f(x(x))) (lambda x: f(x(x))) )
         f(f((lambda x: f(x(x))) (lambda x: f(x(x))) ))
       f(f(f((lambda x: f(x(x))) (lambda x: f(x(x))) )))

It still loops forever. We can try it out:

Y = lambda f: (lambda x: f(x(x)))(lambda x: f(x(x)))
Y(add1)  # RuntimeError: maximum recursion depth exceeded

Darn it! We can bump up the recursion limit:

import sys
sys.setrecursionlimit(2000)
Y(add1)  # RuntimeError: maximum recursion depth exceeded

Maybe we're really determined:

import sys
sys.setrecursionlimit(200000)  # so there!
Y(add1)  # Segmentation fault

Hmm.

The eager Y combinator

So we're succeeding at making lambdas that loop, but they loop in a way that overflows the stack and crashes python.

Do you think maybe it's impossible to make practical loops in the lambda calculus?

What was the bug here?

loop = lambda f: (lambda x: f(x(x)))(lambda x: f(x(x)))

This expands to:

loop(f) = f(f(f(f(f(f(f(f(...))))))))

The problem is that we never actually get to call f, because this chain of function calls is infinite. Python never gets to the center of it.

But if we can make it a bit less aggressive, it may get us somewhere.

What if we could call the first f here, and instead of passing it the result of this whole chain of calling f forever, which we definitely can't do, we instead pass it a function representing all the rest of that? That is, what if we can arrange for loop to expand to this:

loop(f) = f(lambda _: f(f(f(f(f(f(f(...))))))))

That way, at least f actually gets to run. It can decide for itself whether or not to call the rest. Unfortunately, if it does, that will be a RuntimeError again. But we could use the same trick when calling that f:

loop(f) = f(lambda _: f(lambda _: f(lambda _: f(...))))

Because we are running short on time I will skip to the answer:

Y = (lambda f: (lambda x: f(lambda _: x(x)))
               (lambda x: f(lambda _: x(x))))

Here we're working around an annoying problem brought on by the sheer simplicity of the lambda calculus. There's no way to write a self-referential function without using a trick like this.

I should give you one example of a self-referential function.

factorial_body = (lambda self: lambda n: if_else(is_zero(n))
                                                (lambda _: one)
                                                (lambda _: mul(n)(self(dummy)(sub1(n))))
                                                (dummy))

factorial = Y(factorial_body)

factorial(three)(spam)(dummy)

Just A Dash Of Set Theory

Now this is the part where it gets a little complicated.

Remember how the reals are larger than the integers? There are different sizes of infinities, and they're incomparable.
- "same size" for infinite sets means there exists a full pairing of the two sets.
- integers are the same size as the even integers
- integers are the same size as the naturals
- integers are the same size as the rationals (for each i, list all the fractions in least terms that, say, have n and d that add up to i; now number them. The numbering pairs integers with fractions.)
- Integers are smaller than the reals.
  Alice says, "The integers are smaller than the reals; no pairing exists."
  Joe says, "No, they're the same size."
  Alice says, "full pairing or it didn't happen."
  Joe provides a pairing.
  Alice constructs a real number that appears nowhere in Joe's list.
- Is a lambda really a function?
- In math, a function is a set, possibly infinite, of pairs with unique left sides.
- What's the domain? The range? They're both the set of all lambdas.
Let's call the set of all functions whose range is some set A and whose domain is some set B A->B.
- For example, we could talk about the set {0..7} -> {0,1}: all possible functions mapping three bits onto one bit.
- How large is that set? draw chart of all eight possible inputs, with a..h as the outputs; each assignment of bits to a..h gives us a distinct function.
- There are 2⁸ possible assignments of bits to a..h; thus, there are only 256 distinct functions from {0..7} -> {0,1}.
So, the set of all lambdas, call it F.
- F's domain is F; and its range is F.
- F = F->F
What's the size of F? sizeof(F)**sizeof(F)

There's a problem, here. There's no such size. If sizeof(A) > 1, then A**A is always larger than A. Even when A is infinite, sizeof(A->A) is always a larger infinity than sizeof(A); you can prove it with a diagonalization argument. Thus, the set F = F->F doesn't exist.
This made mathematicians suspicious of the lambda calculus for a while. But interpreted correctly, this result is actually trying to tell us something really interesting. It's true that these two sets are not equal.
```
F =/= F -> F
```
Yet every element of F is in fact one of these functions, right? So F is a subset of F->F.
```
F  <  F -> F
```
That means there are some functions in this set on the right that are not in the set of lambdas, this set of the left.

There are functions which are so strange in their behavior that they can't be encoded in a lambda expression of finite size. Functions which are not lambdas. These functions are called non-computable.

The computable functions are those that can be written out as lambdas of finite size, and that set is the same size as the integers.

Outro

What does this actually tell us?

I don't have a grand point, it's now 2:30 on a Sunday, and this is all just for fun.

jorendorff/lambda.md