Interfaces only allow you to model based on method set and they are, by design, open. Any type that satisfies the interface, satisfies the interface. They share some similarities with sum types but, insofar as they do, they are essentially infinite sums. While this is often desired, there are times when you need to limit the options to a closed set of types.
There's no direct way in Go to say "these types, even though they share no methods in common". You have to use an interface{}, which says nothing, even though you know exactly what you want to say. You have to handle an invalid case at runtime.
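As a minimal sketch of the interface{} situation described above (the function name describe and its two accepted types are made up for illustration), the signature cannot express "int or string only", so the invalid case is only discovered at run time:

```go
package main

import (
	"errors"
	"fmt"
)

// describe is a hypothetical function that is only meant to accept int and
// string. The signature cannot say so: interface{} admits every type.
func describe(v interface{}) (string, error) {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x), nil
	case string:
		return fmt.Sprintf("string %q", x), nil
	default:
		// The compiler cannot rule this branch out, so it must be handled here.
		return "", errors.New("describe: unsupported type")
	}
}

func main() {
	// The float64 argument compiles fine and fails only at run time.
	for _, v := range []interface{}{42, "hi", 3.14} {
		s, err := describe(v)
		fmt.Println(s, err)
	}
}
```

The default branch is the "invalid case" mentioned above: every caller and every test has to account for it, even though the author knows exactly which types are intended.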
There's no direct way to specify only the types in this package. Using an interface with an unexported "tag" method gets you part of the way. There's still nil and embedding and, at every point, those need to be dealt with—or, all too often, ignored.
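The tag-method approach and its two gaps can be sketched as follows. All names here (Shape, isShape, Wrapper) are hypothetical, and everything lives in one file for brevity, whereas in practice Wrapper could be declared in any other package:

```go
package main

import "fmt"

// Shape is "sealed" by the unexported tag method isShape: other packages
// cannot declare a method with that name on their own types.
type Shape interface{ isShape() }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (Circle) isShape() {}
func (Square) isShape() {}

// Wrapper shows the embedding gap: embedding Shape promotes isShape, so
// Wrapper satisfies Shape without being one of the intended types.
type Wrapper struct{ Shape }

// name must account for nil and for unexpected implementations, because
// the type checker enforces neither.
func name(s Shape) string {
	switch s.(type) {
	case Circle:
		return "circle"
	case Square:
		return "square"
	case nil:
		return "nil" // a nil interface value is always a valid Shape
	default:
		return "unknown" // reached by Wrapper
	}
}

func main() {
	fmt.Println(name(Circle{R: 1}))
	fmt.Println(name(nil))
	fmt.Println(name(Wrapper{}))
}
```

Both the nil case and the default case are exactly the points that need to be dealt with, or end up ignored, at every type switch over Shape.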
These cases can lead to trivial errors that could be caught by the compiler, but instead need to be handled by defensive coding and extensive testing. Defensive coding and extensive testing will always be needed, of course, but it would be better if that effort could be focused on more pertinent, higher-level concerns instead of having to buttress against an inability of the type system to understand "I can only handle these specific types". The larger a project gets, the more opportunity there is for the invariants to be violated, and it is too easy to violate them.
Without a way to define a closed set of types, we end up doing the type checker's job whenever the need arises. Linting can help somewhat. I do not believe it can be done with 100% accuracy except in simple cases and there are enough variations within the "closed set of types" space to require multiple different kinds of linters.
The unnecessary boxing can increase allocations and be an optimization fence, leading to inefficient data structures and algorithms. With a way to model a closed set of types, the compiler has more information to work with because it has fewer possibilities to consider.
The lack of a closed set of types wastes programmer time and machine spacetime.
These issues do not come up at every turn—unless you're writing a compiler/interpreter where they are pretty much the only turn. But they do still come up, and, when they do, there are no great options in the language.
When dealing with an FFI, there are often times when you need to specify "only these types" and that can be outside of your control. Similar situations can arise with serialization, json.Token being an example.
One of the more vexing issues I run into, outside scenarios like the above, is needing to have multiple channels in a select when it would be cleaner to have one or two with sum types representing the set of allowable messages and their respective disjoint values. That would better model the structure of the communication, and I could return only one or two channels from the code that started the server instead of a handful.
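A rough sketch of the single-channel shape, approximated with today's tag-method workaround. The message types and the startServer and drain functions are hypothetical stand-ins, not taken from any real server:

```go
package main

import "fmt"

// Message approximates the sum of everything the server may report.
type Message interface{ isMessage() }

type Started struct{ Addr string }
type Failed struct{ Err string }
type Stopped struct{}

func (Started) isMessage() {}
func (Failed) isMessage()  {}
func (Stopped) isMessage() {}

// startServer returns one channel instead of one channel per message kind.
// It only replays a fixed sequence so that the example is deterministic.
func startServer() <-chan Message {
	ch := make(chan Message, 3)
	ch <- Started{Addr: "localhost:0"}
	ch <- Failed{Err: "boom"}
	ch <- Stopped{}
	close(ch)
	return ch
}

// drain consumes the channel with a type switch over the message kinds.
func drain(ch <-chan Message) []string {
	var log []string
	for m := range ch {
		switch m := m.(type) {
		case Started:
			log = append(log, "started "+m.Addr)
		case Failed:
			log = append(log, "failed "+m.Err)
		case Stopped:
			log = append(log, "stopped")
		default:
			// Nothing enforces that the cases above are exhaustive.
			log = append(log, "unhandled")
		}
	}
	return log
}

func main() {
	for _, line := range drain(startServer()) {
		fmt.Println(line)
	}
}
```

The structure is what a sum type would give, but the default branch remains necessary because the compiler cannot confirm that Started, Failed, and Stopped are the only possibilities.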
Being able to say "every type like this" is extremely useful in general, but sometimes you need to be able to say "only these types".
For a specific use case, while developing a Go library for interacting with LLVM IR (https://github.com/llir/llvm), several work-arounds had to be used since the notion of "only these types" is lacking in Go.
For instance, LLVM IR instructions were modelled using an ir.Instruction interface, which, while functional (the instructions could easily be modelled using an "every type like this" representation), comes with a few drawbacks.
Firstly, it requires that the set of instructions implementing the interface be documented, rather than inferred from code. Extract from http://godoc.org/github.com/llir/llvm/ir#Instruction
Note, a common use case is to use type switches to perform type assertions on values implementing ir.Instruction.
For this reason, it becomes paramount to keep the documentation of the instructions implementing ir.Instruction up to date, so that it is usable by users. When the documentation is not kept in sync with the code, users may miss adding a case to handle a specific type of instruction (e.g. *ir.InstTrunc), resulting in incorrect interpretations of the LLVM IR input or run-time panics.
This brings us to another issue caused by using interfaces when wanting to represent "only these types". Often, dummy methods have to be added to interface definitions to mimic sum types, as was the case in the representation of LLVM IR constants.
The Immutable method is a dummy method used to distinguish a set of types as constants, without having any method distinct from the value.Value interface to separate constant values from non-constant values. Needless to say, treating a constant as a non-constant would be incorrect, and the API should thus prevent such use cases.
The problem with the Immutable dummy method is that, since it is exported, it makes it impossible to successfully lint a source file to ensure that each case in a type switch has been implemented; i.e. the linter method described by Jimmy Frasché falls short here, as it fails to identify all types implementing the constant.Constant interface, and thus requires users to rely on (potentially incorrect or out-of-date) documentation to handle each type.
To clarify that this is not a hypothetical issue but one that does arise: the type switches in the implementation of an LLVM IR traversal walk function have often missed handling types which do implement these work-around set type representations (e.g. llir/llvm@a1472af), which has caused run-time panics that occur only when the LLVM IR input assembly manages to trigger that specific code path.
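The missed-case failure mode can be sketched as below. The Instruction interface and the Inst* types are simplified, hypothetical stand-ins and not the actual llir/llvm API; the point is only that the forgotten case compiles cleanly and fails when that path is hit:

```go
package main

import "fmt"

// Instruction is a simplified stand-in for an open instruction interface.
type Instruction interface{ Name() string }

type InstAdd struct{}
type InstMul struct{}
type InstTrunc struct{}

func (InstAdd) Name() string   { return "add" }
func (InstMul) Name() string   { return "mul" }
func (InstTrunc) Name() string { return "trunc" }

// visit handles the instructions its author remembered. InstTrunc was
// forgotten, and the compiler has no way to point that out.
func visit(inst Instruction) string {
	switch inst.(type) {
	case InstAdd:
		return "visited add"
	case InstMul:
		return "visited mul"
	default:
		panic(fmt.Sprintf("visit: unhandled instruction %T", inst))
	}
}

// tryVisit converts the panic into an error so the failure can be observed
// without crashing the program.
func tryVisit(inst Instruction) (out string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return visit(inst), nil
}

func main() {
	for _, inst := range []Instruction{InstAdd{}, InstMul{}, InstTrunc{}} {
		out, err := tryVisit(inst)
		fmt.Println(inst.Name(), out, err)
	}
}
```

Only input that happens to contain the forgotten instruction exercises the default branch, which is why such bugs tend to surface late.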
Note, an issue I haven't even touched upon yet is when a library is updated to include new types which implement the work-around representation of set types, as those type additions are likely to go unnoticed by users of the library until they run into run-time panics or incorrect interpretation of input by not handling all instructions. This, too, is an issue which is not hypothetical, but rather likely to happen. In the case of the LLVM IR library for Go, when the LLVM project adds new instructions to the LLVM IR language, those instructions will be modelled in the LLVM IR library, and users of the library are very likely to miss handling those added instructions in some part of their code, as the compiler does not help to enforce the "only these types" representation, and linters fall short because it is impossible to catch all sum type implementations when using exported dummy methods. And, to keep the package boundaries, it may not always be possible to unexport those dummy methods.
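Why an exported dummy method defeats any attempt to enumerate the implementations can be sketched like this. The Constant interface and its types below are hypothetical and only loosely modelled on the description above; Rogue stands in for a type that would normally be declared in some other package:

```go
package main

import "fmt"

// Constant mimics a sum type using an exported dummy method.
type Constant interface {
	Immutable()
}

// Int and Null are the implementations the library author intended.
type Int struct{ V int64 }
type Null struct{}

func (Int) Immutable()  {}
func (Null) Immutable() {}

// Rogue represents a type from another package. Because Immutable is
// exported, nothing prevents it from implementing Constant, so a tool that
// only inspects the defining package cannot list every implementation.
type Rogue struct{}

func (Rogue) Immutable() {}

// kind covers the intended implementations and needs a fallback for the rest.
func kind(c Constant) string {
	switch c.(type) {
	case Int:
		return "int"
	case Null:
		return "null"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(kind(Int{V: 1}))
	fmt.Println(kind(Null{}))
	fmt.Println(kind(Rogue{}))
}
```

With an unexported method the set of direct implementations would at least be confined to one package, but as noted above, package boundaries may make that impossible.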
To sum up, not being able to represent "only these types" leads to work-arounds which introduce issues (such as run-time panics) that at times cannot be caught by linters, which put extended pressure on the documentation to always be kept up to date, and which force users to do grunt work that the compiler should be able to relieve them of, given a type system which understands the notion of "I can only handle these specific types".