...A thing or idea seems meaningful only when we have several different ways to represent it — different perspectives and different associations. ...Then we can turn it around in our minds, so to speak: however it seems at the moment, we can see it another way and we never come to a full stop. In other words, we can 'think' about it. If there were only one way to represent this thing or idea, we would not call this representation thinking. - Marvin Minsky
Today I'd like to show you a way to model functional programming concepts as operations on a single infinite table, reducing complex n-dimensional concepts into a more familiar two-dimensional space.
Remember that this is just a modeling exercise - I'm not necessarily making ontological claims about programming, I'm simply trying to explore a new vocabulary and model for reasoning about computation.
A tuple
is a collection of values organized into a data structure. For instance, ["myk", 34, 0x0000FF]
is a tuple - it says "these three values are somehow related". The exact nature of that relationship is unknown without more context, but the important thing is that we can model any set of values in a fixed order as a tuple
.
A relation
is perhaps more commonly known as a table
- it's a set of tuples where each tuple has the same structure, and that structure reflects some specific semantics. For instance, our example tuple above may be a row from a table that looks like this:
Name | Age | Favorite Color |
---|---|---|
"myk" | 34 | 0x0000FF |
"han" | 29 | 0x00FF00 |
A function
is defined as a relation
between some set of inputs (defined as some subset of the columns in our table) and some set of outputs (defined as some other subset of the columns in our table). In the simplest case, we can have an identity function - its output is its input.
Input | Output |
---|---|
1 | 1 |
"abc" | "abc" |
Each row in our "identity function" table is a single call to the function with a specific input argument. It returns a specific output argument. To flesh this out a bit, a slightly more complex function may look something like this -
input1 | input2 | output1 |
---|---|---|
1 | 1 | 2 |
7 | 3 | 10 |
It looks like we may be looking at a 'Sum' function here, where input1
is the augend, input2
is the addend and output
is the sum. We'll get into more details about this later, but I want to make sure that it makes sense that you can represent a function as a table, aka as a relation of tuples.
A data type
can be understood as the set of all values that satisfy some constraint. It's really a predicate function - given some input X, is X in the set of acceptable values? Some common types are things like Integer
or String
or Boolean
, and applications will often introduce new types like User
or Session
.
In our functions-as-tables metaphor, data types
are used to constrain the size of our tables by restricting the number of valid values for each column. For instance, if we specify that the three columns in our sum
table above all have the Integer
data type then we won't have any rows where we're adding null
to "Hello world!"
- those are simply invalid rows, and logically cannot exist in our (still infinite) table.
Abstraction
, in this model, has a very specific definition: it's when you add a new column to your table. For instance, we can create a more abstract version of the sum
function by adding an operator
column. The resulting math
function looks like this:
input1 | input2 | operator | output |
---|---|---|---|
1 | 1 | + | 2 |
1 | 1 | - | 0 |
1 | 2 | + | 3 |
Concretion
is what we'll call the dual of Abstraction, in this model - rather than adding a column, we're removing a column. We do that by "partially applying" a value to the function - restricting the size of the function, much as we did with data types, such that one of our columns has a data type that has only a single value in it. So for instance, if we wanted to create a multiply
function we could partially apply the *
operator into the Math function to create a new function:
input1 | input2 | output |
---|---|---|
1 | 1 | 1 |
7 | 3 | 21 |
By now you should be thinking "Huh... that sure seems like sum
, multiply
and math
are all just different views into the same abstract space..." and you'd be 100% correct! There are interesting consequences of this that we'll get to shortly!
Finally, a higher order function
is a function where one or more of the columns has the data type function
. In other words, every value for one of our columns becomes a whole table! As an admittedly forced example, consider this greetUser()
function which takes as inputs a user's first name, a user's last name, and a function used to get the appropriate greeting for the current time of day.
First Name | Last Name | getGreeting() | Output | ||||||||||
Myk | Bilokonsky |
|
??!! |
At first glance it seems like there's a problem here - if each row of one of our tables is an invocation of the represented function, what do we do if our row contains a nested table? How do we represent that idea? It seems like our output
column can't have a value until we step into this nested space?
But, thanks to the magic of math - I think, can anyone give me some links to formal references to this kind of thing? - we can simply change how we represent the problem. A critical rule here is that the columns of a nested table can be pulled into the parent table. The function above can be rewritten as follows:
First Name | Last Name | Current Time | Greeting | Output |
---|---|---|---|---|
Myk | Bilokonsky | 5-12 | "Good Morning" | "Good Morning Myk Bilokonsky" |
Myk | Bilokonsky | 12-6 | "Good Afternoon" | "Good Afternoon Myk Bilokonsky" |
Myk | Bilokonsky | 6-9 | "Good Evening" | "Good Evening Myk Bilokonsky" |
Myk | Bilokonsky | 9-5 | "Good Night" | "Good Night Myk Bilokonsky" |
etc and so on. Now our specific invocation of the function maps cleanly to a row in our relation.
Inputs
vs Arguments
, Outputs
vs Return Value
- when we model functions in our code we generally think about our function's arguments as its inputs, and its return value as its output. But as anyone who has ever tried to shoe-horn formal logic into an impure (read: existing) codebase, our functions often live within the scope of some existing state and they often mutate that state as a side-effect when invoked.
This is fine - but it means that we need to think of things like "current time" as an input, even when it's not an argument. It's simply a property of the environment at execution time. Similarly, a function can, say, write some values to a database without actually returning anything. That change of database state must be included when we discuss the "output" of the function.
So far we've seen how functions can be modeled as relations, where each column can be constrained by a data type and where we can ultimately treat every function call as a lookup in an infinite table. add(1, 7)
can be expressed as something like SELECT output FROM sum WHERE input1=1 AND input2=7
, right?
(Interestingly, it means we can also do something like SELECT input1 FROM sum WHERE input2=7 AND output=8
- we get, at least conceptually, the full implications of using immutable relations.)
To summarize:
- Functions can be modeled as relations.
- Data Types can be used to restrict valid rows in our relations.
- Abstraction and Concretion can be treated as formal processes whereby columns are added or removed from our relations.
- Higher Order Functions work when you realize that they are simply a mechanism to express abstraction, adding columns to our table.
- Inputs are not limited to arguments, outputs are not limited to return values. We can easily model side-effects here.
Subsequent essays on this topic will explore complex, abstract ideas from functional programming in this simplified but expressive space. We'll look at what functors and monads and applicatives are, we'll explore currying and some of the ramifications of abstraction, and we'll look at how once you go down this rabbit hole of modeling functions as relations you quickly run into the much larger rabbit hole of modeling applications as relations - and then, just maybe, we'll explore what it means to think of the universe itself as a single enormous function.
Finally turning this into a course for egghead.io, for anyone following along at home. This is gonna be fun! :)