Kris Nuttycombe asks:
> I genuinely wish I understood the appeal of unityped languages better. Can someone who really knows both well-typed and unityped explain?
I think the terms well-typed and unityped are a bit of question-begging here (you might as well say good-typed versus bad-typed), so instead I will say statically-typed and dynamically-typed.
I'm going to approach this article using Scala to stand in for static typing and Python for dynamic typing. I feel like I am credibly proficient in both languages: I don't currently write a lot of Python, but I still have affection for the language, and have probably written hundreds of thousands of lines of Python code over the years.
Obviously the biggest problem with writing Python compared to Scala is that you have many fewer static guarantees about what the program does. I'm not going to sugarcoat this -- it's a big disadvantage.
Most of Python's advantages have to be understood in terms of this disadvantage. If you value compile-time guarantees, you may be tempted not to acknowledge the advantages at all. I think this is a mistake. If you really want to understand what makes writing Python appealing (or even fun), you have to be willing to suspend disbelief.
At least when I was writing Python, the right way to think about Python's types was structural: types are strong (a string can't become a number, for example), but best understood as a collection of capabilities. Rather than asserting that `x` is a `Collator`, and then calling `.collate()` once that fact is established, we just call `.collate()` directly.
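A minimal sketch of that style, reusing the `Collator` example (the class and helper here are invented):

```python
# Duck typing: no isinstance() check, no declared interface. Anything
# with a .collate() method is acceptable.
def collate_all(collator, groups):
    return [collator.collate(group) for group in groups]

class SimpleCollator:
    def collate(self, group):
        return sorted(group)

collate_all(SimpleCollator(), [["b", "a"], ["d", "c"]])  # [['a','b'], ['c','d']]
```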
Because types are not statically enforced, you don't tend to use them to guide API decisions. Compared to languages like Scala or Java, Python strongly encourages APIs that don't require exotic types, many parameters (unless sensible defaults can be provided for almost all of them), or deeply-nested structures.
In an interesting way, it keeps you from going all-in on object-oriented programming. You only want to create a class when a casual API user will understand how it works and what they would use it for. Otherwise, you tend to prefer using static methods (or similar) that can act on simpler data types. Similarly, there is strong pressure to use the standard collection types (lists, sets, dictionaries) in almost all cases.
This has a number of consequences:
- You rarely have to wade through huge class hierarchies with poor documentation
- APIs tend to be strongly-focused on what you want to do
- You get much more mileage out of learning the default collections' APIs
- Custom collections feel stronger pressure to conform to default APIs
- You can assume most data has a useful string representation
- You rarely have to worry about baked-in limitations of your types
To address the last point: you rarely have to worry about someone baking in the wrong collection type, or numeric type. If you have a type that behaves similarly, you can use that instead.
(A corollary here is that someone who comes to Python from e.g. Java may tend to produce APIs that are very hard to use.)
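As a concrete sketch of that substitutability (the function and data here are invented):

```python
import collections

# Written with plain dicts in mind, but nothing pins it to dict:
# any object with a compatible .items() will do.
def top_scores(scores):
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

top_scores({"a": 3, "b": 5})                      # a plain dict
top_scores(collections.Counter("abracadabra"))    # a different type, same capabilities
```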
In Scala or Java, you often end up with tons of classes that are essentially data containers. Case classes do a great job of minimizing the boilerplate of these. But in Python, all of these classes are just tuples. You don't have to come up with names for them, or anything else. It is really liberating to be able to build modules that are much smaller, by virtue of not having to give things names or decide which class to use.
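For instance (a sketch with invented names):

```python
# No Span data-container class to define or name: return a tuple and
# destructure it at the call site.
def find_span(text, word):
    start = text.index(word)
    return (start, start + len(word))

start, end = find_span("hello world", "world")  # (6, 11)
```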
Abstracting over things like arity becomes much simpler. It's interesting that many Python programmers see the major difference between lists and tuples as immutability (tuples are immutable), but this makes sense when you consider that both can be iterated over, indexed by number, and have no limitations on their size. Compare this to the difficulty of correctly expressing and abstracting over product types in Scala. Even with Shapeless' help, it is a lot of work.
More generally, finding abstractions in Python feels much more like pattern recognition. If you see two stanzas of code that are essentially the same, it is trivial to abstract over their differences and write a common method. This is true even when the differences are down to:
- Field or method names used
- Arity of functions or tuples
- Classes instantiated
- Class or package names
- Imports needed
With static typing, these sorts of abstractions involve figuring out how to relate the types of the two stanzas as well as the shape of the AST itself. It doesn't feel as much like a pure abstraction or compression problem as it does in a dynamic language.
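To make that concrete, here is a sketch in Python where the differences (method name, arity) are just runtime values (all names invented):

```python
# Two near-identical stanzas collapse into one helper because the
# method name and argument list are ordinary values.
def apply_to_all(objects, method_name, *args):
    return [getattr(obj, method_name)(*args) for obj in objects]

apply_to_all(["a b", "c d"], "upper")        # differs in method name...
apply_to_all(["a b", "c d"], "split", " ")   # ...and in arity; same helper
```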
John De Goes made a point about fixing dynamic programs one-error-at-a-time, versus hundreds of compiler errors at once. I think he's right about that, but I don't think it does justice to why the approach sometimes feels better.
One of the first things we have to learn as programmers is how to emulate what the "machine" (a computer, interpreter, VM, whatever) is going to do. We learn to trace through programs imagining counters incrementing, data being allocated, functions being called, etc. We can use `print` to view intermediate values, we can raise exceptions to halt the process at an intermediate point, etc.
There are arguments that this is the wrong way to do programming. I find some of them convincing. But even for most people using static types, this is how they determine what their program will do, how they assemble it out of other programs, how they debug it, etc. Even those of us who wish programming was more like writing a proof do this.
One advantage Python has is that this same faculty that you are using to create a program is also used to test it and debug it. When you hit a confusing error, you are learning how the runtime is executing your code based on its state, which feels broadly useful (after all, you were trying to imagine what it would do when you wrote the code).
By contrast, writing Scala, you have to have a grasp on how two different systems work. You still have a runtime (the JVM) which is allocating memory, calling methods, doing I/O, and possibly throwing exceptions, just like Python. But you also have the compiler, which is creating (and inferring) types, checking your invariants, and doing a whole host of other things. There's no good way to peek inside that process and see what it is doing. Most people probably never develop great intuitions around how typing works, how complex types are encoded and used by the compiler, etc. (Although in Scala we are fortunate to have a lot of folks like Stephen Compall, Miles Sabin, and Jason Zaugg who do and are happy to talk about it.)
Not having to learn (or think about) this whole parallel system of constraints and proofs is really nice. I think it's easy for those of us who have learned both systems to ignore the intellectual cost to someone who is getting started.
An obvious question is why we have to mentally emulate a machine at all? In the long run I'm not sure we do. But with the current offering of statically-typed languages most folks are likely to use, I think we still do.
People are often confused that many scientists seem to love Python. But I think it makes sense.
Static typing is most useful in large, shared codebases where many of the main risks are misusing someone else's API, failing to refactor something correctly, or dealing with long-lived codebases full of deeply-nested interacting structures.
By contrast, a scientist's main concerns are probably mathematical errors (most of which the type system won't catch), methodological problems (even less likely to be caught) and overall code complexity. They are also unlikely to maintain code for very long periods of time or share codebases. This is someone for whom an empirical (and dynamic) runtime debugging process probably seems more pleasant than trying to understand what the type system and compiler are complaining about. (Even after their program compiles they will probably need to do the runtime testing anyway.)
I don't plan to stop writing code in Scala, Haskell, or Rust (or even C). And when I write Python these days, I do find that I miss the static guarantees and type-driven development. But I don't hate writing Python, and when I'm writing Scala I still find things to envy.
I had a small encounter with Rust the other day that reminded me of this whole debate. Not too interesting, really: not some case where a dynamic language avoids massive horrid ugliness through exquisite cleverness, but a paper cut, and paper cuts matter. I had a parser which took a bunch of fields in a certain format from a binary file, interpreted them, and copied them to corresponding fields in a data structure. It was pretty straightforward; each field got a line like this:
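Something in this spirit, presumably (a reconstruction sketch: the helper name is invented, while `MachO` and `dyld_weak_bind` appear again below):

```rust
// one per-field line: interpret a field from the binary format and
// copy it into the corresponding field of the in-memory struct
macho.dyld_weak_bind = parse_dyld_info(&raw.dyld_weak_bind)?;
```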
But I decided I wanted to extend it to support writing as well as reading. I started to write another function with some similar-looking code going in the opposite direction, but since there are a bunch of fields and some complexity to how the fields are nested (which I'll gloss over), I decided the result would be better and more readable if I declared a list of fields in one place and used it in both the reading and writing functions, in a more abstract fashion. To do this I'd need to specify, in some fashion, which struct field and which field from the binary format (which is also just structs) should be used. Rust, like most languages, does not have any built-in concept of a 'first-class field'*, i.e. something like `&Struct::field` which you could then apply to multiple objects. I knew I could simulate it with a lambda going from a pointer to the object to a pointer to the field: `|x: &mut MachO| &mut x.dyld_weak_bind`,
but of course I didn't want to write that out for every field (that is, twice for each native-struct to format-struct mapping, since each side has a field), so I needed to write a macro...
But hold on, let's take stock for a second of how the situation I'm facing would differ in other languages.
- Macros or `method_missing`-type stuff, especially in languages such as Rust where macros have weird limitations and aren't just text substitution. (Of course, C-style text substitution has many, many problems of its own, so.)
- A `Map<String, Foo>` (on one side at least), but that just makes the rest of the code feel weird and out of place. (This is the fallacy that people who call dynamic typing "unityping" tend to fall into [not that the term implies the fallacy, but the blog post that started it rather strongly does]: you can treat dynamic typing as a special case of static typing, and dynamically typed objects as a special case of hash maps, but unless you actually go the whole hog and use that subset of the language for your code [which is probably syntactically ugly at the least], instances where it could make some bit of code nicer are nearly impossible to actually realize.)

You can certainly do it in any of the semi-problematic ways above, but if it's uglier than just doing reading and writing in the dumb non-abstract way, you probably won't; you'll do the easy thing and it will make the code "better", within the confines of that language, but worse compared to a theoretical optimum.
And in most dynamically typed languages this is super easy. There is no need to decide whether to do it the easier[?] way (boilerplate) or the elegant way (abstract field access) because the elegant way is easy, just two function calls:
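In Python, say, those two calls are the built-ins `getattr` and `setattr` (the object and field names below are stand-ins):

```python
value = getattr(obj, "dyld_weak_bind")    # read a field, by name
setattr(obj, "dyld_weak_bind", value)     # write a field, by name
```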
And it's not just field access; you get analogous, rarer but more dramatic comparisons with other things like boilerplate object shapes or functions (in the many cases that generics don't cover). With a dynamic language I can say: "I will never do something gross just because it pleases the compiler; I will take a hard line on boilerplate." Sure, you're giving up the myriad ways compilers can help you if you follow their wishes. But you can say that.
Another way of saying this: Dynamic languages have a more expressive type system than static languages. It's just that you have to do all the type checking in your head. :)
Anyway, I did end up using a macro; I'm not saying they're that arcane or evil (though I've seen a few different people on the Rust mailing list clamor for their removal on roughly those grounds), just a bit steeper of a hill than I'd like. In my particular case it wouldn't have been that bad except for some Rust-specific issues which aren't really pertinent. Simple enough looking:
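A sketch of the shape it took (the macro name is invented; the closure is the one from earlier):

```rust
// Wrap the per-field closure so it doesn't have to be written out by
// hand for every native-struct/format-struct mapping.
macro_rules! field {
    ($f:ident) => {
        |x: &mut MachO| &mut x.$f
    };
}

// field!(dyld_weak_bind) expands to |x: &mut MachO| &mut x.dyld_weak_bind
```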
For the record, I'm not saying this issue somehow makes the advantages of static typing irrelevant. It has various advantages and neat features even in the small, and I find it credible (though I don't have much personal experience) that when coding in the large, type checking's assurances of (at least basic) consistency are indispensable. (This conflict is one reason I find gradual type systems interesting.) I am merely stating one aspect of the appeal of dynamic typing; there are a few others.
* Or Haskell lenses, though AFAIK they currently require the comparatively unpopular Template Haskell to actually autogenerate a lens for each record field. I think lenses are pretty neat.