Learning programming in APL taught me many new things about Python and this article is an account of that.
In the past, I've written an article where I share some areas in which my Python programming was heavily influenced by learning APL. You can read this article here: “Why APL is a language worth knowing”.
I've given it further thought and now I have a better understanding of how APL influenced my Python code. It is likely that I am not fully aware of all the repercussions that APL had on me, but I have new things to tell you about and that is what I want to focus on in this article.
So, without further ado, let me tell you what APL taught me about Python.
There is one line of code (LOC) that changed everything! Figuratively speaking.
Some time ago I wrote a piece of Python code pretty naturally. You know when you are so focused on what you are doing that things just flow? That's the state I was in. Then, suddenly, I looked at the code I had written and saw this:
sum(age > 17 for age in ages)
When I looked at it, I was surprised by it. I was surprised because that is not the type of Python code that I usually see written in the books or tutorials I learn from. And it's also not a pattern I had seen in Python code from other projects... Until it hit me: I wrote this Python code because of the influence that APL had on the way I think about programming.
But first, let us make sure we understand this code!
What does sum(age > 17 for age in ages) do?
If ages is a list with the ages of a bunch of different people, then the line of code we saw above will count how many people are aged 18 or older:
    >>> ages = [18, 15, 42, 73, 5, 6]
    >>> sum(age > 17 for age in ages)
    3
That's what the code does. That's it. It is not that magical. But, curiously enough, it encapsulates plenty of things that APL taught me about Python, so let me tell you all about them.
Now, the remainder of this article might seem like I will be stating the obvious over and over and over and over. That may be the case. But the truth of the matter is, before I stated these things to myself, they hadn't clicked. Or, put another way, it was when I stated these things out loud and wrote about them that everything finally made sense. After all, just because something is obvious, it doesn't mean it isn't worth saying out loud!
A prime example of something obvious that we can benefit from by saying it out loud – and even giving it a name – is the pigeonhole principle. It is like a mathematical theorem, but excruciatingly obvious. And yet, it allows you to prove non-obvious things, like the fact that there cannot be a perfect compression algorithm. (If you're intrigued by that, take a stab at this problem about imperfect compression.)
The first thing I want to look at is the built-in sum.
What if I told you that sum is very closely related to six other Python built-ins and functions?
sum is related to all these other functions, but in what way?
Long story short, all these seven functions are specialised versions of functools.reduce.
If you are not familiar with functools.reduce, you can learn about it in this article.
I also explain this concept in a 5-minute talk.
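To make the idea of "specialised reductions" concrete, here is a minimal sketch that rebuilds sum and a few related functions on top of functools.reduce. The functions picked here (math.prod, max, all, any) are my own choice for illustration and may not match the six mentioned above, and the helper names ending in _via_reduce are hypothetical.

```python
from functools import reduce
import math
import operator

numbers = [18, 15, 42, 73, 5, 6]

# Hypothetical helpers: each one is `reduce` with a fixed binary function
# and, where one exists, a fixed starting value.
def sum_via_reduce(iterable):
    return reduce(operator.add, iterable, 0)

def prod_via_reduce(iterable):
    return reduce(operator.mul, iterable, 1)

def max_via_reduce(iterable):
    # There is no sensible starting value, so an empty iterable raises
    # TypeError here (the built-in `max` raises ValueError instead).
    return reduce(lambda a, b: a if a >= b else b, iterable)

def all_via_reduce(iterable):
    return reduce(lambda a, b: a and bool(b), iterable, True)

def any_via_reduce(iterable):
    return reduce(lambda a, b: a or bool(b), iterable, False)

print(sum_via_reduce(numbers) == sum(numbers))
print(prod_via_reduce(numbers) == math.prod(numbers))
print(max_via_reduce(numbers) == max(numbers))
print(all_via_reduce(numbers) == all(numbers))
print(any_via_reduce(numbers) == any(numbers))
```

Unlike the built-ins all and any, these versions do not stop early: reduce always walks the whole iterable.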
Another big thing that APL made me realise is that the Boolean values True and False and the integers 1 and 0 are tightly connected.
Python hints at this because the type bool is a subclass of the type int:
    >>> issubclass(bool, int)
    True
This means that we can use the Boolean values True and False as the integers 1 and 0:
    >>> True + True
    2
    >>> 3 * False
    0
This may look terrible, but the truth is that this connection is extremely meaningful.
For one, it is related to how hardware works.
What's more, the fact that True and False can be used as the integers 1 and 0 enables data-driven conditionals in Python.
For example, going back to the expression sum(age > 17 for age in ages), if you look closely, the subexpression that is being summed up is age > 17.
The expression age > 17 is either True or False, depending on the value of the age:
    >>> age = 18
    >>> age > 17
    True
    >>> age = 5
    >>> age > 17
    False
If ages is a list with ages, then what we are doing is computing a series of Boolean values that indicate whether each value is greater than 17 or not:
    >>> ages = [18, 15, 42, 73, 5, 6]
    >>> bools = [age > 17 for age in ages]
    >>> bools
    [True, False, True, True, False, False]
What the built-in sum does is sum all those Boolean values up, interpreting the True as 1 and the False as 0:
    >>> sum(bools)
    3
But this seeming interchangeability – because 1 and 0 can also be used as True and False – unlocked even more things for me.
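The interchangeability goes both ways, and a short sketch can show it. The variable names and labels below are made up for illustration, and the indexing trick at the end is only a curiosity, not a recommendation.

```python
# 1 and 0 compare equal to True and False...
print(True == 1, False == 0)  # True True

# ...and they behave like True and False when used as conditions.
labels = ["adult" if flag else "minor" for flag in [1, 0, 1]]
print(labels)  # ['adult', 'minor', 'adult']

# Because a Boolean is an integer, it can even index a two-element
# sequence, picking the item at position 0 or 1.
age = 18
label = ("minor", "adult")[age > 17]
print(label)  # adult
```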
Using Booleans as integers helps you write data-driven conditionals.
I've also written about data-driven conditionals before, but I can also show you an example here. Consider this standard Python code:
    count = 0
    for age in ages:
        if age > 17:
            count += 1
This code uses an if statement to determine whether or not to add 1 to the variable count.
So, depending on the value of the condition age > 17, we perform an addition or not.
In other words, we use an if to determine whether we do an addition or not.
The code above is equivalent to this more symmetric version, even though it may be more verbose:
    count = 0
    for age in ages:
        if age > 17:
            count += 1
        else:
            count += 0
Rewriting the code in this way exposes a pattern: we always want to add something to count, in both branches, but the condition age > 17 changes the value we want to add.
This is even more obvious if we use a conditional expression:
    count = 0
    for age in ages:
        count += 1 if age > 17 else 0
In this code, we used a conditional expression to determine what value to add, as opposed to determining whether we wanted to do an addition or not.
That's the main idea behind data-driven conditionals!
Instead of trying to decide whether to take an action or not, we just take that action and instead compute the appropriate parameters.
In this example, that was picking between 1 and 0.
So, a conditional expression is kind of similar to a data-driven conditional. But there are differences:
In our simple case, we can go from count += 1 if age > 17 else 0 to count += age > 17, which would be a “purer” data-driven conditional.
This type of code in Python will likely be frowned upon, but in APL we use data-driven conditionals commonly.
You can read another example of a data-driven conditional here.
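The three versions of the counting loop discussed above can be put side by side to check that they agree. This is just a sketch, and the function names are hypothetical.

```python
ages = [18, 15, 42, 73, 5, 6]

def count_with_if(ages):
    # The `if` decides WHETHER we add.
    count = 0
    for age in ages:
        if age > 17:
            count += 1
    return count

def count_with_conditional_expression(ages):
    # We always add; the conditional expression decides WHAT we add.
    count = 0
    for age in ages:
        count += 1 if age > 17 else 0
    return count

def count_data_driven(ages):
    # We always add, and the Boolean itself is the value we add.
    count = 0
    for age in ages:
        count += age > 17
    return count

print(count_with_if(ages))                      # 3
print(count_with_conditional_expression(ages))  # 3
print(count_data_driven(ages))                  # 3
```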
And while data-driven conditionals don't always translate directly into idiomatic Python code, the concept of a data-driven conditional helped me appreciate the situations in which I can write my code in a more symmetric way.
We saw this above.
I had an if statement and I made it more symmetric by including the else branch that was implicit (and also a bit redundant).
And what happens often, at least for me, is that adding the redundant branch(es) that are missing helps me spot patterns and rewrite the whole if statement in a cleaner way.
This, in turn, reduces nesting in my Python code, which is a good thing.
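As a sketch of how adding the missing branches can lead to flatter code, consider counting adults who are also members. The data, the is_member flag (assumed to be a Boolean), and the function names are all hypothetical.

```python
# Hypothetical data: (age, is_member) pairs, where is_member is a bool.
people = [(18, True), (15, True), (42, False), (73, True)]

def count_nested(people):
    count = 0
    for age, is_member in people:
        if age > 17:
            if is_member:
                count += 1
    return count

def count_symmetric(people):
    # Same logic with every implicit `else` branch written out.
    # Every path now adds something to `count`.
    count = 0
    for age, is_member in people:
        if age > 17:
            if is_member:
                count += 1
            else:
                count += 0
        else:
            count += 0
    return count

def count_flat(people):
    # Since we always add, only the value changes, and the two
    # conditions collapse into a single Boolean expression.
    return sum(age > 17 and is_member for age, is_member in people)

print(count_nested(people), count_symmetric(people), count_flat(people))  # 2 2 2
```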
All in all, the way APL handles Boolean values helped me understand the relationship between Booleans and the integers 1 and 0, and it made me aware of some patterns in Python code that I now know can be simplified. If you are interested, go ahead and learn more about APL and Boolean values.
In case you haven't noticed, I am partially obsessed with list comprehensions. I even wrote a book with 200+ exercises about list comprehensions and related concepts, like set/dictionary comprehensions and generator expressions. And the reason I am so obsessed with list comprehensions is that I am convinced that there is a big portion of the Python community that doesn't give list comprehensions their due credit.
List comprehensions are insanely useful and most people are unaware of what the main advantage of list comprehensions is.
It is not speed, nor is it the fact that they are shorter to type than the corresponding for loops.
The true advantage of list comprehensions is something that I can only justify after having learned about how APL handles scalar functions – more on that in a second!
The main advantage of list comprehensions is that they tend to be more readable than their for loop counterparts.
That's something many people will tout.
But why are they “more readable”..?
I claim it is because they highlight the data transformation.
What does this mean..?
Again, let us look at two pieces of code to compare.
First, take a look at this for loop:
    is_adult = []
    for age in ages:
        is_adult.append(age > 17)
We only have three lines of code, but the thing that matters the most is hidden at the bottom right of the code: the expression age > 17.
That is the expression that determines what we fill the list is_adult with, and it is at the bottom right of the code.
If I rewrite the code as a list comprehension, this is what I end up with:
is_adult = [age > 17 for age in ages]
Notice that the expression that matters, age > 17, is now much closer to the top left, which is where we start reading the code.
The loop itself, the for age in ages, was moved to after the expression because the loop itself matters less than the expression!
And this is what makes list comprehensions typically more readable than the explicit loops!
Now, if you haven't learned list comprehensions yet, you may say that list comprehensions are more complicated. But you say that because you haven't learned them yet. Just read this “list comprehensions 101” or my book “Comprehending Comprehensions” and soon you'll agree with me!
I mentioned scalar functions in APL before.
Many functions in APL are scalar functions.
Loosely speaking, this means that handling one single value or multiple values is done in the exact same way, without needing loops or list comprehensions.
For example, age>17 determines whether the value age is greater than 17.
If ages is a list of ages, like 18 15 42 73 5 6, then ages>17 just works!
This is what APL would return as a result:
          18 15 42 73 5 6 > 17
    1 0 1 1 0 0
Notice how each number in the list 1 0 1 1 0 0 corresponds to a single comparison.
For example, the last 0 is the result of computing 6 > 17.
The APL code would be roughly equivalent to this Python code, if it worked:
    >>> ages = [18, 15, 42, 73, 5, 6]
    >>> ages > 17  # If `>` were a scalar function like in APL:
    [True, False, True, True, False, False]
In Python, you can't use > to compare a list with an integer, but in APL you can!
APL only cares about the computation that you are doing; it doesn't care about the looping.
In Python, you have to write some code to do the looping, but you can “hide” it by using a list comprehension and putting it after the expression that matters.
So, you can't write only ages > 17, but you can write [age > 17 ...] to get the same result.
If you are curious about APL's scalar functions and how they relate to list comprehensions, you can learn more by reading this.
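To get a feel for what a scalar > would look like in Python, here is a toy sketch. ScalarList is a hypothetical class made up for this illustration, not something from the standard library, and it gives up the usual list-to-list comparison in exchange.

```python
class ScalarList(list):
    """Hypothetical list subclass whose `>` applies element by element."""

    def __gt__(self, other):
        # The loop still exists; it is just hidden inside the operator.
        return ScalarList(item > other for item in self)

ages = ScalarList([18, 15, 42, 73, 5, 6])
print(ages > 17)       # [True, False, True, True, False, False]
print(sum(ages > 17))  # 3
```

This shows that the looping has to live somewhere in Python: either in a comprehension you write, or inside an operator that someone else wrote for you.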
In practice, I talked about scalar functions and list comprehensions, but the rationale applies equally to set and dictionary comprehensions, as well as generator expressions, which is what we'll be using in the next section.
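As a quick sketch of that point, the same expression age > 17 can sit at the front of each kind of comprehension. The variable names are made up for illustration.

```python
ages = [18, 15, 42, 73, 5, 6]

# List comprehension: one Boolean per age, in order.
is_adult = [age > 17 for age in ages]

# Set comprehension: the distinct results that occur.
distinct_results = {age > 17 for age in ages}

# Dictionary comprehension: maps each age to its result.
is_adult_by_age = {age: age > 17 for age in ages}

# Generator expression: results produced one at a time and consumed by sum.
adult_count = sum(age > 17 for age in ages)

print(is_adult)          # [True, False, True, True, False, False]
print(is_adult_by_age)   # {18: True, 15: False, 42: True, 73: True, 5: False, 6: False}
print(adult_count)       # 3
```

In every case the data transformation comes first and the looping comes after it.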
So, I realised that APL led me to understand that sum and many other built-ins are just specialised reductions.
This let me establish connections between functions and algorithms that I didn't know were connected.
APL also taught me about data-driven conditionals and how to look for ways to make my code more symmetric.
Finally, comprehensions started making much more sense to me after I realised that the point of all comprehensions is to highlight the data transformation.
When everything converged at the same time, I wrote this:
sum(age > 17 for age in ages)
In hindsight, this is a translation of the APL code that does the same thing, which is just +/ages>17.
+/ in APL is equivalent to sum in Python and we've seen that ages>17 in APL needs to be written as a comprehension in Python.
I dissected this Python code and studied each of its parts. Again, the reductions, the Booleans/data-driven conditionals, and the scalar functions/comprehensions. And when I took all of those things and put them together, I learned one other thing!
The code I've written exhibits this pattern:
sum(predicate(element) for element in iterable)
If predicate is a function that returns True or False, then the code above is an idiom that counts how many elements in iterable satisfy the function predicate.
In our previous example, we had this correspondence:
    >>> iterable = [18, 15, 42, 73, 5, 6]
    >>> predicate = lambda age: age > 17
    >>> sum(predicate(element) for element in iterable)
    3
Suppose we have a list of words and want to count how many words have the letter "a" in them.
We can use the same idiom:
    >>> words = ["cat", "dog", "parrot"]
    >>> has_a = lambda word: "a" in word
    >>> sum(has_a(word) for word in words)
    2
How cool is this? I like this idiom a lot and I use it often in my own Python code!
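If you want to reuse the idiom, it can be wrapped in a small helper. This is only a sketch: count_where is a hypothetical name, and the call to bool is my own addition to cover predicates that return something other than True or False.

```python
def count_where(predicate, iterable):
    """Count the elements of `iterable` for which `predicate` is truthy."""
    return sum(bool(predicate(element)) for element in iterable)

words = ["cat", "dog", "parrot", ""]

print(count_where(lambda word: "a" in word, words))  # 2

# `len` returns an integer, not a Boolean. Without `bool`, the idiom
# would add up the lengths instead of counting the non-empty words.
print(sum(len(word) for word in words))  # 12
print(count_where(len, words))           # 3
```

When the predicate already returns True or False, as in the examples above, the plain idiom without bool is enough.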
All in all, learning APL was a terrific experience because it forced me to think about programming in new ways and, as a result, it also influenced the way I think about programming in the other languages that I already knew, namely Python. If you have similar stories of how learning one language changed the way you wrote code in another, share them with me! Drop a comment below or tag me elsewhere on the Internet!