I first met APL (A Programming Language) in 1968.
I've always used it a lot, and now I use it every day.
Many developers will never have heard of the language. Many of those who have either love it (as I do) or hate it.
I think now is the time to explain what APL is, why I use it so often, and why I think that developers should take a look at it.
I regard software as stored knowledge. I'm not sure where that idea came from, but it seems so obvious to me that I take it for granted.
Software is useful because it can be executed, and because well-written software is a concise and unambiguous way of capturing the knowledge it contains.
What knowledge do I want to capture?
I'm interested in AI, and in the human brain, human learning and human thought.
Much of what I do involves transforming arrays of data. That's where APL excels.
Virtually all Artificial Neural Networks deal with arrays of inputs, outputs and neurons. In many cases these arrays are tensors - multi-dimensional arrays which APL transforms with unmatched ease. When I'm creating or discovering knowledge I use APL whenever I can.
When I want to communicate my work to the widest possible audience, I might not use APL. I'll often use Python with numpy, TensorFlow or PyTorch instead.
If you're a software developer and want to expand your skills, you will want to learn new languages. They can broaden your mind and give you new insight into the problems that you need to solve. If you feel that way, think about adding APL to your list of languages to study.
If you're working with tensors in numpy, PyTorch or TensorFlow, you will almost certainly benefit from learning a little APL. You'll see some APL code that manipulates tensors below.
If you've met APL before and used to like it, now's a good time to take another look. It's easier to use and even more powerful than it was.
If you used to hate APL, it might still be worth having a second look. It might be that you've only seen badly written APL. I'd hate for you to miss out on something that might be of value to you!
I've seen people take against languages for all sorts of reasons, and when they do it can be hard for them to change their opinion. (One of the brightest people I know refused to try Python because it used indentation for semantic purposes!)
Some people are wary of APL because of its special symbols. If you've learned one language you may be able to guess what a program in another language does, and that's useful. You can only do that with APL if you're willing to invest a little time in learning the basics of the language. Not everyone is willing to do that.
Does that mean everyone should use APL? No. You should use the most appropriate tool given your background and the context in which you're working. I have no idea what those constraints are, but you do.
Some people hate APL because they think that APL programmers deliberately write unreadable code.
People may have heard of APL 'one-liners', which can be very cryptic, without understanding why we sometimes write them.
One-liners started because the first interactive implementation of APL (APL\1130) had very tight memory constraints. As a result, programmers were told that one-line programs in APL ran in memory. That allegedly made them run much faster than programs that spanned several lines.
That issue went away in the 1960s!
These days, one-liners are a great way of flexing coding muscles. One way of motivating that exercise is to practice code golf. Code golfers try to find very short solutions to problems. APL excels at that.
That doesn't mean that we normally write code that way!
Here's a neural network layer in a few short lines of APL.
relu ← {⍵⌈0}
relu implements a Rectified Linear activation function. It returns zero if its argument is less than zero, and returns the value of the argument if it is zero or positive. It works for an array of any shape.
relu ¯1 ¯0.5 0 0.5 1 2
0 0 0 0.5 1 2
rweights ← {1-2×?⍵⍴0}
Given a shape, rweights creates an array of random weights between minus one and one. The array could be a vector, a matrix or a tensor depending on the number of dimensions provided in the shape. The example below creates weights for a network layer that has two neurons and three inputs.
⊢weights ← rweights 2 3
¯0.4888232213 0.7919346464 0.4248547319
0.1003400358 0.4756984886 0.8823578991
activate ← {⍺⍺ ⍺+.×⍵}
activate takes an activation function (like relu) and applies it to the matrix product of the weights on its left with the inputs on its right.
Given an array of three inputs, compute the activation of each neuron:
weights relu activate 0.1 0.5 1
0.7719397329 1.130241147
We've just implemented the setup and forward pass of a Rectified Linear Layer of a fully-connected feed-forward neural network. Adding back-propagation involves a little more work but not a lot.
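If you're coming from the numpy world, here's a rough sketch of the same forward pass in Python. The names relu, rweights and activate simply mirror the APL definitions above; treat this as my illustrative translation rather than a polished implementation.

import numpy as np

def relu(x):
    # zero where the argument is negative, the argument itself otherwise
    return np.maximum(x, 0)

def rweights(shape):
    # random weights between minus one and one, in the given shape
    return np.random.uniform(-1, 1, shape)

def activate(f, weights, inputs):
    # apply the activation function f to the matrix product of weights and inputs
    return f(weights @ inputs)

weights = rweights((2, 3))                                 # two neurons, three inputs
print(activate(relu, weights, np.array([0.1, 0.5, 1.0])))

The APL above says the same thing in three short definitions.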
Rodrigo Girão Serrão has created an interesting YouTube series which covers these ideas in more detail.
Some network architectures don't use back-propagation.
The Hopfield network has been around for quite a while. It does interesting things, and variants are still [actively researched](https://www.semanticscholar.org/paper/How-Memory-Conforms-to-Brain-Development-Mill%C3%A1n-Torres/ebaf956b02247815b1eb1425fd6eec32d9a025b0).
Hopfield networks use two-valued inputs and outputs. The code below represents them as -1 and 1.
The state of a Hopfield network is defined by a symmetric matrix of weights which are zero along the leading diagonal. These weights are adjusted when the network is trained to recognise a new vector. When a subset of the ones in that vector is presented later, the outputs should recreate the complete input that the network learned.
Here's the Python code for the training and recall phases, slightly adapted from https://github.com/tomstafford/emerge
def train(patterns):
    # This trains a network to remember the patterns it is given
    from numpy import zeros, outer, diag_indices  # import functions for vector calculus
    r, c = patterns.shape   # take the patterns and make them vectors. There is a neuron for each pixel in the patterns
    W = zeros((c, c))       # there is a weight between each neuron in the network
    for p in patterns:      # for each pattern
        W = W + outer(p, p) # change the weights to reflect the correlation between pixels
    W[diag_indices(c)] = 0  # neurons are not connected to themselves (ie the weight is 0)
    return W / r            # send back the normalised weights
def recall(W, patterns):
    # This tests the network. You give it a pattern and see what it produces
    from numpy import vectorize, dot  # vector calculus functions
    sgn = vectorize(lambda x: -1 if x < 0 else +1)  # convert input pattern into a -1/+1 pattern
    for _ in range(5):                    # over a number of iterations
        patterns = sgn(dot(patterns, W))  # adjust the neuron activity to reflect the weights
    return patterns                       # return the final pattern
Here's the same code in APL:
op ← {(⍵∘.×⍵)×(i∘.≠i←⍳⍴⍵)}   ⍝ correlation of a pattern with itself, zeroed on the diagonal
train ← {(+⌿(op⍤1⊢⍵))÷1↑⍴⍵}  ⍝ sum the correlations over all the patterns and normalise
match ← {¯1 1[0≤⍵+.×⍺]}      ⍝ one update step: a new ¯1/1 pattern from the weights (assumes ⎕IO←0)
recall ← {⍺ match⍣5 ⊢ ⍵}     ⍝ apply the update five times, as the Python loop does
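To see the idea working, here's a small sketch of how the Python train and recall functions might be exercised. The two eight-pixel patterns and the variable names are ones I've made up purely for illustration: we store the patterns, then present the first one with a single pixel flipped, and the network should hand back the complete stored pattern.

import numpy as np

# two made-up eight-pixel patterns, stored as -1/+1 vectors
patterns = np.array([[ 1,  1,  1,  1, -1, -1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1,  1, -1]])

W = train(patterns)                            # learn the weights

probe = np.array([1, 1, 1, 1, -1, -1, -1, 1])  # the first pattern with its last pixel flipped
print(recall(W, probe))                        # should recover the complete first pattern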
You can start to explore APL tutorials at the TryAPL website without installing any software.
If you want to dig deeper, the TryAPL website will tell you how to do so. In particular, you can install Dyalog APL on your own computer. It's free for non-commercial use.
It runs under Windows, Mac OS X, Linux on Intel/AMD and on the Raspberry Pi.
Point your browser at TryAPL.