type systems.md

There are two goals we often see in programming that I consider almost synonymous: writing readable, clean, maintainable code, and writing correct code. This is because unmaintainable and hard-to-understand code leads to bugs: even if the original author didn't leave any bugs, eventually a maintainer will misunderstand the code and introduce one. And in this situation the original author is as much to blame for the bug as the maintainer.

Type systems are the single most powerful weapon for writing readable, clean, maintainable code. This is the principle of abstraction: types are what give meaning to bytes. If all programmers had room-sized brains, they wouldn't need types, because they'd be able to hold all details about all bytes in their heads at once.

I, for one, make a point of forgetting code as soon as I write it: if I need to keep context in my head all the time, then I'm not doing a good job at writing code. The trouble is that I also make a point of forgetting code as soon as I read it. (The upside of this is that I keep my head free of information overload.) (@britttttk this is why I seemed a bit lost in the settlement system, despite having worked with it for months: I forget these things immediately and often intentionally.)

Dynamic type systems serve the purpose of interpreting bytes well enough. But static type systems have an additional property: because types are generally way easier to describe than implementation details, they double as very lightweight proof systems, further helping developers avoid bugs.

To illustrate what I mean by "types are generally way easier to describe than implementation details", let's say I'm implementing a function for hashing the inverse of a string. The implementation of such a function can be extremely complex, especially if I'm writing it from scratch. But I can immediately say: this function needs to accept a string and return a byte array. The types are trivial, and yet they give a very useful contract: that function cannot accept anything other than a string, and it cannot return anything other than a byte array. So I can prove that a whole class of bugs simply cannot happen. (So long as the type system is safe, which is a topic for another rant 😄.)
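Here is roughly what that looks like in TypeScript. This is only a sketch: the name `hashReversed` is made up, reading "inverse" as "reversed" is my assumption, and the djb2 hash is a stand-in for whatever real hash you would use. The point is the first line, not the body.

```typescript
// Hypothetical sketch: "inverse" is read as "reversed", and djb2 is only a
// stand-in for whatever hash the real function would use.
function hashReversed(input: string): Uint8Array {
  // Reverses UTF-16 code units, which is good enough for a sketch.
  const reversed = input.split("").reverse().join("");

  // djb2: hash = hash * 33 + code, kept within 32 bits.
  let hash = 5381;
  for (let i = 0; i < reversed.length; i++) {
    hash = ((hash << 5) + hash + reversed.charCodeAt(i)) >>> 0;
  }

  // Pack the 32-bit result into four big-endian bytes.
  return new Uint8Array([
    (hash >>> 24) & 0xff,
    (hash >>> 16) & 0xff,
    (hash >>> 8) & 0xff,
    hash & 0xff,
  ]);
}

const digest: Uint8Array = hashReversed("hello");

// Both of these are rejected at compile time, before anything runs:
// hashReversed(42);                         // a number is not a string
// const text: string = hashReversed("hi");  // a Uint8Array is not a string
```

I could swap the body for something ten times more complicated and the signature, along with everything it guarantees to callers, would stay exactly the same.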

However, there is yet another property of static type systems, one that is to me even more important. In fact, I'd call it the most important element of modern software development. Type systems serve as documentation in code.

Most commonly, documentation is written in the form of comments. Doc comments are good: I think they're important and they help people understand your code. But there is one big problem with comments: they are not code. Nothing forces a comment to stay in sync with the code it describes, so the code can say one thing while the comment says something completely different. At that point the comment is not just useless: it's actively harmful, because it further confuses the reader. That's not good.

Type systems are the way to write self-documenting code: they serve as documentation, but unlike comments, they are code. So they present unbreachable contracts: within a sound static type system, it's impossible for a variable's type signature to say string and for the variable to hold an int. Such code is refused by the compiler: it simply cannot happen.
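A tiny TypeScript sketch of the difference, with a made-up `estimateFee` function. The doc comment is deliberately stale, and nothing complains about it; the signature is checked on every call.

```typescript
// Hypothetical example: the doc comment below is wrong on purpose.

/** Returns the fee in satoshis as a string. */
function estimateFee(sizeInBytes: number, satsPerByte: number): number {
  return sizeInBytes * satsPerByte;
}

// The signature, unlike the comment, is enforced by the compiler:
const fee: number = estimateFee(250, 4);

// const label: string = estimateFee(250, 4);
// ^ rejected: a number is not assignable to a string.
```

The comment can rot for years without anyone noticing. The return type cannot: change it, and every caller that relied on the old one stops compiling.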

I suppose that the difference between a string and an int is obvious enough. But with comments, you can describe way more than whether something is simply a string or an int. And I agree: type systems can never entirely replace comments. However, I believe they can get way further in that direction than most people think.

The ILightningNode, UnfundedPsbt, and FundedPsbt types are a perfect example of this. I could have simply had a Psbt type, and with a few comments, I could have explained that the FundPsbt method must be called with an unfunded PSBT, while PublishPsbt and ReleasePsbt must be called with a funded PSBT. However, I made such comments completely unnecessary by encoding these facts in code, as part of the type system. Due to constructor contracts, an UnfundedPsbt cannot hold a funded PSBT, and a FundedPsbt cannot hold an unfunded PSBT. And the relevant part of ILightningNode's interface is now entirely self-documenting.
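To make the idea concrete, here is a rough TypeScript sketch of that pattern. It is not the actual code: only the type and method names come from the description above (with methods in camelCase, as is usual in TypeScript), while `RawPsbt`, the `funded` flag, the `from` factories, `unwrap`, the return types, and `FakeLightningNode` are all hypothetical.

```typescript
// Sketch only. The text names ILightningNode, UnfundedPsbt, FundedPsbt and the
// three methods; every field, factory, and return type below is hypothetical.

// Stand-in for a decoded PSBT. The flag replaces a real inspection of its contents.
interface RawPsbt {
  readonly base64: string;
  readonly funded: boolean;
}

interface OnchainTransaction {
  readonly txid: string;
}

class UnfundedPsbt {
  // The private field keeps TypeScript from treating the two wrappers as
  // interchangeable, even though both hold a RawPsbt.
  private constructor(private readonly inner: RawPsbt) {}

  // Constructor contract: this check is the only way to obtain an instance.
  static from(raw: RawPsbt): UnfundedPsbt {
    if (raw.funded) throw new Error("UnfundedPsbt cannot hold a funded PSBT");
    return new UnfundedPsbt(raw);
  }

  unwrap(): RawPsbt {
    return this.inner;
  }
}

class FundedPsbt {
  private constructor(private readonly inner: RawPsbt) {}

  static from(raw: RawPsbt): FundedPsbt {
    if (!raw.funded) throw new Error("FundedPsbt cannot hold an unfunded PSBT");
    return new FundedPsbt(raw);
  }

  unwrap(): RawPsbt {
    return this.inner;
  }
}

// The signatures carry what the comments would otherwise have to explain.
interface ILightningNode {
  fundPsbt(psbt: UnfundedPsbt): FundedPsbt;
  publishPsbt(psbt: FundedPsbt): OnchainTransaction;
  releasePsbt(psbt: FundedPsbt): void;
}

// In-memory fake so the sketch runs; a real node would talk to a wallet.
class FakeLightningNode implements ILightningNode {
  fundPsbt(psbt: UnfundedPsbt): FundedPsbt {
    return FundedPsbt.from({ base64: psbt.unwrap().base64, funded: true });
  }

  publishPsbt(psbt: FundedPsbt): OnchainTransaction {
    return { txid: "fake-txid-for-" + psbt.unwrap().base64 };
  }

  releasePsbt(_psbt: FundedPsbt): void {
    // Intentionally empty: what releasing involves is not described in the text.
  }
}

const lightningNode: ILightningNode = new FakeLightningNode();
const unfunded = UnfundedPsbt.from({ base64: "placeholder-psbt", funded: false });
const funded = lightningNode.fundPsbt(unfunded);
const tx = lightningNode.publishPsbt(funded);

// lightningNode.publishPsbt(unfunded);
// ^ rejected by the compiler: an UnfundedPsbt is not a FundedPsbt.
```

The runtime check lives in exactly one place per type, the factory. Everywhere else, holding a `FundedPsbt` is itself the proof that the check already passed.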

There is one final benefit of static types, which is often brought up by the functional programming people: when you see a function's type signature, it helps you make reasonable assumptions about the function's implementation, i.e., what the function might do. For example, if you know the problem domain, and you see a function that takes an UnfundedPsbt and returns a FundedPsbt, you can make a reasonable assumption about what that function does: it funds the PSBT. Same for a function that takes a FundedPsbt and returns an OnchainTransaction: this function probably broadcasts the PSBT to the Bitcoin network. When paired with function names, type signatures leave little uncertainty as to what the implementation of the function might be: just like the name, they serve as further documentation of a function's behavior, allowing you to understand the function without having to read its implementation in detail.
