In my previous post, I talked about some of the underlying runtime concepts behind generics in Go.
Here I'll go a bit deeper into how we might, in principle, be able to compile a generic function once for many instantiated types.
Before we can start compiling functions, we'd like to know what the type parameters to those functions might be. If a generic function is only ever called with a single type, there wouldn't be much point in compiling it with generic code that could work with any type at all.
There are two crucial restrictions on generic types that make it possible to determine the set of all possible type parameters at compile time:
- there is no way to take a type from a dynamic value and turn it into a type parameter. In an interface value, the type lives inside the value, but there's no way to get at that type without naming an existing type that is compatible with it (see the sketch after this list).
- it is not possible to have dynamic values or closures that are generic. Generic methods are not allowed, and there's no way to have a value (as opposed to a type or a top-level function) that has a type parameter.
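As a quick sketch of the first restriction, suppose we have a generic function like the foo defined in the program below. The only way to call it with the dynamic type held in an interface value is to name a concrete type explicitly, so the compiler still sees every possible type argument:

func process(x interface{}) {
	switch v := x.(type) {
	case int:
		foo(v) // instantiates foo(int)
	case string:
		foo(v) // instantiates foo(string)
	default:
		// There is no way to write foo(v) for an unknown dynamic type.
	}
}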
For example, consider this program:
package main

import "fmt"

func main() {
	foo(43)
	foo("hello")
}

func foo(type T)(x T) {
	bar(&x)
	bar(make(chan T))
}

func bar(type T)(x T) {
	fmt.Println(x)
}
The arcs of the call graph look something like this:
main() -> foo(int)
main() -> foo(string)
foo(T) -> bar(*T)
foo(T) -> bar(chan T)
By traversing the call graph from the root (main) and keeping track of the types we end up with, we can determine all the possible type parameters for each function. Pseudo-code for a naive algorithm might look something like this:
addInstantiations(function, types) {
	add (function, types) to set of instantiations
	for each generic call in function body {
		addInstantiations(target of call, types passed to call)
	}
}
We'd obtain the following set of instantiations:
foo(int)
foo(string)
bar(*int)
bar(chan int)
bar(*string)
bar(chan string)
One problem with this is that it falls down when functions are recursive, because the algorithm will never terminate. In fact, as is pointed out in the "unfortunate" BigArray example in the Go generics overview, the final set of types might be infinite.
That's reasonably straightforward to solve, though. We can make it illegal to create cycles like this that can result in an arbitrarily large set of types. That is, if we find a recursive call where the types aren't the same as the types passed to the original call, we can give a compiler error. A modified algorithm might look something like this:
addInstantiations(function, types, stack) {
	add (function, types) to set of instantiations
	if function is in stack {
		if the types recorded for function in the stack differ from types {
			compiler error: "infinite type cycle"
		}
		return
	}
	push (function, types) onto stack
	for each generic call in function body {
		addInstantiations(target of call, types passed to call, stack)
	}
	pop stack
}
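As a sketch only, the same algorithm might look like this in Go. The funcDecl, call, instantiation and typeList types here are made-up stand-ins for whatever representation the compiler would actually use:

type typeList string // canonical string form of a list of type arguments

type call struct {
	target *funcDecl
	// args gives the callee's type arguments for this call site with the
	// caller's bindings substituted in; a real compiler would derive this
	// from the call expression.
	args func(callerTypes typeList) typeList
}

type funcDecl struct {
	name  string
	calls []call
}

type instantiation struct {
	fn    *funcDecl
	types typeList
}

// addInstantiations records (fn, types) and walks the generic calls in fn's
// body; it returns false on a cycle that keeps producing new types, which the
// compiler would report as an "infinite type cycle" error.
func addInstantiations(insts map[instantiation]bool, fn *funcDecl, types typeList, stack []instantiation) bool {
	insts[instantiation{fn, types}] = true
	for _, fr := range stack {
		if fr.fn == fn {
			// Recursing: fine if the types repeat, an error otherwise.
			return fr.types == types
		}
	}
	stack = append(stack, instantiation{fn, types})
	for _, c := range fn.calls {
		if !addInstantiations(insts, c.target, c.args(types), stack) {
			return false
		}
	}
	return true
}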
Now that we've determined all the types that a generic function can be bound to, we have to decide what code to generate.
There is one easy option: each item in the set of instantiations gets compiled separately, so in the previous example, instead of a single compiled instance of main, foo and bar, we'd have main, two copies of foo and four copies of bar.
In a large program with many type combinations, this approach can result in huge binaries and runtime inefficiency when instruction caches overflow. So we'd like to avoid it, if possible.
Another possibility is that we could merge some generated implementations for types where we can generate identical code, with small runtime cost.
Specifically, it should be possible to merge two instantiations if all their argument types have the same size and pointer layout.
For example, let's look at our earlier example with this in mind. On a 64-bit architecture, the sizes and pointer maps look like this:
Instantiation | Size (bytes) | Pointer map
---|---|---
foo(int) | 8 | [false]
foo(string) | 16 | [true, false]
bar(*int) | 8 | [true]
bar(chan int) | 8 | [true]
bar(*string) | 8 | [true]
bar(chan string) | 8 | [true]
We can see that although bar is invoked with four different type parameters, they all have the same size and pointer map. As long as bar only copies the value around or creates new zero instances of it (and that's all the contract of T allows), we can generate only one instance of it.
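As a quick check on the table, here's a small sketch that computes the size and pointer map of each of bar's argument types with the reflect package. The pointer-map logic is deliberately simplified and only handles the kinds that appear in this example:

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// shape returns the size in bytes and a per-word pointer map for t.
// Only the kinds used in the example (int, string, pointer, chan) are handled.
func shape(t reflect.Type) (uintptr, []bool) {
	size := t.Size()
	ptrs := make([]bool, size/unsafe.Sizeof(uintptr(0)))
	switch t.Kind() {
	case reflect.Ptr, reflect.Chan:
		ptrs[0] = true // a single pointer-sized word holding a pointer
	case reflect.String:
		ptrs[0] = true // data pointer followed by a non-pointer length word
	}
	return size, ptrs
}

func main() {
	for _, t := range []reflect.Type{
		reflect.TypeOf((*int)(nil)),
		reflect.TypeOf((chan int)(nil)),
		reflect.TypeOf((*string)(nil)),
		reflect.TypeOf((chan string)(nil)),
	} {
		size, ptrs := shape(t)
		fmt.Println(t, size, ptrs) // e.g. "*int 8 [true]"
	}
}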
The generated instance might look like this:
func bar(x unsafe.Pointer) {
	fmt.Println(interfaceValue{
		word: x,
	})
}
However, this has omitted one important detail - bar calls fmt.Println, which requires that we take the generically typed x value and turn it into an interface value. For that we need its type, which the above implementation does not have access to.
Naively, one might think that the caller of a generic function could pass any generic types as explicit parameters. So the generated bar code might look like this:
func bar(T reflect.Type, x unsafe.Pointer) {
	fmt.Println(interfaceValue{
		typ:  T,
		word: x,
	})
}
There's a problem with that though, which is that functions have access to more types than just those in their type parameters. For example, the above foo function is passed the single type T but uses it to create both *T and chan T. Theoretically a function could use the reflect package's functions (SliceOf, StructOf, etc.) to dynamically construct the types, but this would be prohibitively inefficient.
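To see why, here's a sketch of the kind of work foo would have to do on every call under that scheme, just to build the types it passes on to bar. reflect.PtrTo and reflect.ChanOf are real functions; paying for them (plus reflect.MakeChan to actually create the channel) on every call is the problem. The deriveFooTypes name is made up for illustration:

// deriveFooTypes builds the *T and chan T types that foo needs for its two
// calls to bar, given only T at run time.
func deriveFooTypes(T reflect.Type) (ptrT, chanT reflect.Type) {
	ptrT = reflect.PtrTo(T)                    // *T
	chanT = reflect.ChanOf(reflect.BothDir, T) // chan T
	return ptrT, chanT
}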
But there's a neat alternative. As we've already generated a table of all the possible instantiations of each function, we can store some representation of that table in a global variable. Then all the static information about a particular instantiation can be obtained with just one integer index into the table.
The whole program might now look something like this. Note that when we've generated separate versions of a function, there's no need to pass it an instance index because each version "knows" which set of types it was instantiated with.
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	foo__int(43)
	foo__string("hello")
}

func foo__int(x int) {
	bar(0, &x)
	bar(1, make(chan int))
}

func foo__string(x string) {
	bar(2, &x)
	bar(3, make(chan string))
}
type instance struct {
	types []reflect.Type
}
var _barInstances = []instance{{
	types: []reflect.Type{
		reflect.TypeOf((*int)(nil)),
	},
}, {
	types: []reflect.Type{
		reflect.TypeOf((chan int)(nil)),
	},
}, {
	types: []reflect.Type{
		reflect.TypeOf((*string)(nil)),
	},
}, {
	types: []reflect.Type{
		reflect.TypeOf((chan string)(nil)),
	},
}}
func bar(_instance int, x unsafe.Pointer) {
	fmt.Println(interfaceValue{
		typ:  _barInstances[_instance].types[0],
		word: x,
	})
}
What happens if bar calls another generic function that has a shared implementation? We can add more information to the instance type above. Given an instance and a specific call site in the function, we know which function is being called, which tells us which instance of that function we require, and hence the instance index to use when calling it.
type instance struct {
	types []reflect.Type
	// calls holds the instance index for each call site.
	calls []int
}
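For example, continuing the generated code above (baz here is a hypothetical shared generic function that bar's body also calls with its own type parameter T), the call site's entry in calls supplies the instance index to pass along:

func bar(_instance int, x unsafe.Pointer) {
	inst := _barInstances[_instance]
	fmt.Println(interfaceValue{
		typ:  inst.types[0],
		word: x,
	})
	// Call site 0 in bar's body: the instance table tells us which
	// instance of baz corresponds to this instance of bar.
	baz(inst.calls[0], x)
}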
When a function has both shared and non-shared instances, we'd need to generate a stub for the non-shared instance to absorb the unneeded instance index parameter.
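As a sketch with made-up names: if bar(*int) had been compiled as a dedicated instance bar__ptr_int, a stub with the shared signature could discard the index and forward the argument:

// Stub with the shared calling convention, forwarding to the dedicated
// instance; the instance index is unneeded because bar__ptr_int already
// knows exactly which types it was compiled for.
func bar__ptr_int__stub(_instance int, x unsafe.Pointer) {
	bar__ptr_int((*int)(x))
}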
Being able to have shared instances of generic functions is good, but there's one important omission so far. We can't do anything with the generic values.
Luckily, the instance table global variable can be used here too. Inside the instance struct we can store a set of function pointers, one for each operation that the function's contracts allow. Consider the following code, for example:
contract comparer(t T) {
	t == t
}

func indexOf(type T comparer)(xs []T, v T) int {
	for i, x := range xs {
		if x == v {
			return i
		}
	}
	return -1
}
The only operation allowed by the contract is ==, and that's easily represented with a function; for example:
func intEqual(a, b int) bool {
	return a == b
}
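Putting that together, a shared instance of indexOf might look something like the following sketch, where _indexOfInstances is a hypothetical table like _barInstances above, extended with the element size and an equality function that works on untyped pointers:

type indexOfInstance struct {
	size  uintptr                        // size of the element type T
	equal func(a, b unsafe.Pointer) bool // the == operation the contract allows
}

var _indexOfInstances = []indexOfInstance{{
	// Instance 0: T = int.
	size: unsafe.Sizeof(int(0)),
	equal: func(a, b unsafe.Pointer) bool {
		return *(*int)(a) == *(*int)(b)
	},
}}

// indexOf's shared instance: the slice arrives as a base pointer and length,
// and every use of T goes through the instance table.
func indexOf(_instance int, xs unsafe.Pointer, n int, v unsafe.Pointer) int {
	inst := _indexOfInstances[_instance]
	for i := 0; i < n; i++ {
		x := unsafe.Pointer(uintptr(xs) + uintptr(i)*inst.size)
		if inst.equal(x, v) {
			return i
		}
	}
	return -1
}

With a table like that in place, intEqual above is exactly the kind of function the compiler would store in the equal field of the int instance.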
When the contract allows an interface type, there's no need to generate a new function - we already know how to generate runtime call tables for interfaces and the generated call can just invoke that.
For more complex operations permitted by the new generics design, such as range and field indexing, it may not be possible to express them as a function call. In this case it's always possible to fall back to generating one instance per set of type parameters. I believe they should be disallowed, but that's a matter for further discussion.
In summary, I believe that we can use shared implementations for Go generic functions, solving Russ's generic dilemma, and that the runtime cost can be very small.
I don't believe this thinking works. It is at odds with compiling packages in isolation, bottom-up (i.e. leaves in the dependency graph first). What you present here is a whole-program analysis, so it can only really happen at link time. I'm not sure exactly how much work the Go toolchain is deferring to link time, but this approach can only really work if you essentially defer all the compilation work for generic functions to link time, at which point you reduce the amount of work that can be cached. I think in essence you have a separate link-time optimization pass, where generically compiled code is heuristically specialized for some subset of functions.
TBH, I don't think you're really solving the generic dilemma. I think you are still simply opting for slow compilers here.