Skip to content

Instantly share code, notes, and snippets.

@jimmyfrasche
Created September 4, 2018 02:15
Show Gist options
  • Save jimmyfrasche/656f3f47f2496e6b49e041cd8ac716e4 to your computer and use it in GitHub Desktop.
Save jimmyfrasche/656f3f47f2496e6b49e041cd8ac716e4 to your computer and use it in GitHub Desktop.
generics proposal

Note

This proposal relies on other proposals:

  • golang/go#8082 - consider two defined interfaces with identical definition to be identical
  • golang/go#19642 - define a universal, untyped zero value; written as zero in this proposal, pending choice of a syntax.
  • golang/go#26842 - allow all incomparable types to be compared against zero
  • golang/go#27481 allow interfaces to specify that they are only satisfied by comparable types. This document uses the variant that achieves this by using or embedding the predeclared comparable interface.

Abstract

This is a draft proposal for generics. It uses interfaces as the primary means to constrain instantiation of polymorphic types.

Definitions and syntax sketch

In

type [T interface{}] Ts []T
type Ints = Ts[int]
func [S, T interface{} | T(S)] Convert(s []S) []T {
  // ...
}
  • Ts is a generic type.
  • [T interface{}] is the type constraint on Ts.
  • T is a type parameter.
  • interface{} is the metatype of T.
  • Ts[int] is an instantiation of Ts and int is the type argument for T.
  • Convert is a generic func.
  • The | T(S) portion of the type constraint on Covert is a convertibility requirement. The portion before the | is its type parameterization.
  • A specific type/func is a non-generic type/func or the instantiation of a generic type/func.
  • A concrete type is a specific type that is not an interface.
  • In the below, when something is said to be "illegal", that means that it results in a compile-time error.

Proposal

Only package-level types and funcs may be generic.

A generic type or func cannot be used unless instantiated.

An instantiated type or func can be used as if it had been written as the equivalent specific type or func.

All instantiations of a defined generic type exist in the package that defined the generic type, regardless of which package instantiated the generic type. Multiple instantiations of the same defined generic type parameterized with identical types are identical.

A metatype can be any specific interface. All type parameters must have a metatype. [S, T interface{}] introduces two type parameters with the same metatype but they are distinct types parameters.

A type argument can be any specific type satisfying the metatype of the type parameter. In the absence of convertibility requirements, the metatype is the only contract the type must satisfy.

A convertibility requirement, T(S), adds the additional constraint that values of S must be convertible to values of T.

Type parameters are not required to be used in the type/func body. Unused type parameters are still taken into account for the identity of an instantiated defined type.

Generic interfaces

type [T interface{}] Ordered interface {
  Equal(T) bool
  Less(T) bool
}

type SatisfiesOrderedInt int
func (a SatisfiesOrderedInt) Equal(b int) bool { return a == b }
func (a SatisfiesOrderedInt) Less(b int) bool { return a < b }

var _ Ordered[int] = SatisfiesOrderedInt(0)

No type implements uninstantiated Ordered even if there is some T that would make everything work. A type can only implement a specific instantiation of Ordered such as Ordered[int] in the example.

Generic interfaces used as metatypes may be instantiated by other type parameters in the type constraint, including themselves, such as [T Ordered[T]]. Other than self-loops, cycles are illegal.

Generic interfaces may recur

type [T interface{}] Recursive interface {
  Recur() Recursive[T]
}

Instantiating Recursive with, for example, int expands to

interface {
  Recur() Recursive[int]
}

and, as that is itself the definition of Recursive[int], the expansion halts.

The same applies to mutually recursive interfaces and those that permute their type parameters. However, the implementation is free to put an upper bound to limit the expansion to prevent adversarial code from creating a finite but ridiculously large interface.

Given

type [T interface{}] Ier interface {
  I() T
}
type [T interface{}] UsesIerInterface interface {
  A() Ier[T]
}
type [T Ier[T]] UsesIerT interface {
  B() T
}
  • In UsesIerInterface, the return of A is the interface Ier[T]
  • In UsesIerT, the return of B is a type that satisfies the interface Ier[T].

Type assertions for generic types in an interface value require an instantiation i.(F[S, T])

Generic funcs

Because type parameters instantiate to arbitrary types that satisfy the type constraint, only operations described by the type constraint are allowed. This applies even if they could be inferred from a particular type constraint. Required operations must be described specifically and if they cannot be described by the type constraint they are not allowed.

This is illegal

func [T interface{}] IsNil(t T) bool {
  return t == nil
}

T may or may not be a type that can be compared to the untyped nil. The untyped zero must be used:

func [T interface{}] IsZero(t T) bool {
  return t == zero
}

This is illegal

func [T interface{}] Equal(a, b T) bool {
  return a == b
}

The constraint on T is too weak to ensure that T is comparable. The comparable interface, or an interface that embeds comparable must be used.

The builtin map type acts as if it had the type constraint [K comparable, V interface{}] when it is declared using type arguments.

The value of a type argument is assignable to any interface that its metatype is assignable to. Otherwise, a specific conversion matching a stated convertibility requirement is needed.

If the metatype has methods, these may be used as normal.

func [S fmt.Stringer] str(s S) string {
  return s.String()
}

While the type arguments do not necessarily instantiate to interfaces, type switches and type assertions are legal. However, an unconditional type assertion is illegal unless in a branch where the type has been determined by a previous conditional type assertion or switch. That is

func [T interface{}] f(a, b T) {
  // Illegal
  _ = a.(int)

  if _, ok := a.(int); ok {
    // Legal since T = int here
    _ = b.(int)
  }
}

Unlike ordinary type assertions/switches, these can be determined entirely at compile time when the type argument is a concrete type, allowing the compiler the option to create a specialized version where the failing cases are removed as dead code.

If the type argument is an interface, they are normal type switches and assertions. Specifically, the legal unconditional type assertion in the previous example may still panic if T is instantiated by an interface and a and b have different dynamic types.

In the body of a generic function with convertibility requirement S(T), when the type argument T is an interface, the conversion S(t) is equivalent to t.(S).

Since the metatypes of the type parameters used in a convertibility requirement do not themselves need to satisfy the convertibility requirement, interface type arguments can lead to a runtime failure when the dynamic types are not convertible. This can only happen when both type arguments are interfaces. If only one is an interface, instantiation fails at compile time as the type assertion is impossible.

Generic concrete types

Given

type [T interface{}] S struct {
  t T
  i int
}
var s = S[bool]

s is convertible to struct{ t bool; i int } and unsafe.OffsetOf(s.i) is dependent on the result of unsafe.SizeOf for the T used to instantiate S.

Regular methods of a generic type mirror the syntax for instantiation on the receiver:

func (g *G[S, T]) M(s S) T {
  return T(s)
}

The parameters may differ from the names given in the type definition† but must be the same number. These type arguments may be used in the signature and body the same as if they were defined on the func. Convertibility requirements are not repeated but G.M fails to compile if G does not specify that T(S).

† not recommended!

This is illegal

type [T interface{}] Recur struct {
  field Recur[T]
}

as Recur will be infinitely sized if T is not a pointer, but this is legal

type [T interface{}] Recur struct {
  field *Recur[T]
}

Because type parameters not used in the declaration of the type are part of the type identity and available to methods, so-called associated types can be specified during instantiation:

type [Node interface{}] Edge interface {
  Nodes() (from, to Node)
}

type [Node comparable, Edge Edge[Node]] Graph map[Node]Set[Node]

func (g G[Node, Edge]) EdgesOf(n Node) []Edge

func (g G[Node, Edge]) ShortestPath(src, dst Node) []Edge

Embedding generic types

This applies to both interfaces and structs.

A type parameter may not be embedded directly, though it may be used to instantiate an embedded generic type.

When an instantiation of a generic type is embedded, the field name is the name of the generic type. All parameters are ignored in the name of the field.

Generic methods

On a specific type, a generic method is similar to a generic func:

func (s *S) [T interface{}] F(T)

Calling new(S).F(0) can use an int-specialized F as the types are known at compile time.

A generic method on a generic type can use the type arguments of the receiver in the type constraint of the method:

func (f *G[K, S]) [T I[K] | T(S), K(T)] F(T)

Interfaces may specify generic methods

interface {
  [T interface{}] F(T)
}

These are only satisfied by methods with the same type constraint (in addition to all preexisting rules). A type with an F(int) method does not satisfy this interface even though int would satisfy the type constraint of F.

Invoking a generic method through on an interface value cannot use a specialization since the types are not known in general so the generic implementation of a generic method must always exist, though as shown above specializations can be created.

Generic interfaces may themselves have generic methods

type [T interface{}] X interface {
  [S interface{} | S(T)] Y() S
}

Generic type aliases

A generic type may be aliased without instantiating it, type A = G, but to use A it must be instantiated the same as G.

The declaration

type short = Generic[with, lots, of, parameters]

works as any other type alias, allowing an alternate name for the RHS type.

Type aliases may themselves have a type constraint.

type [S, T interface{}] Pair = struct{
  Left S
  Right T
}

The instantiation Pair[int, string] is in every way identical to the type

struct {
  Left int
  Right string
}

Aliases may also be used to partially instantiate a generic or to enforce a stricter constraint.

Given

type [K comparable, V interface{}] CustomMap map[K]V

the alias

type [T interface{}] Names = CustomMap[string, T]

defines a shorthand for CustomMaps with key type string. The type constraint T on Names does not need to match the constraint on CustomMap exactly. It may be stronger but not weaker. So this is allowed

type [T fmt.Stringer] NamedStringers = CustomMap[string, T]

but this is not

type [T interface{}] Ints = CustomMap[T, int]

Similarly, an alias may add additional convertibility constraints

type [K, V comparable | K(V)] CvtMap = CustomMap[K, V]

Additional constraints do not effect the RHS of the type alias. They are only taken into account when using the aliased name directly in code.

Reflection

Reflecting an instantiation of a generic type works as usual but additionally provides a means to access reflection info about the generic type itself and the current parameterization.

The generic type can be instantiated with new parameters as long as the new parameters satisfy the type constraint. Instantiations that do not exist in the program can be created.

Type checking

Type checking involves checking the type arguments against their metatypes and checking the convertibility requirements. If either fail, instantiation fails, but the checks can be done independently.

Checking the type parameters

Given the constraint [T M], type checking succeeds if the type argument for T satisfies its metatype M using the normal rules for interface satisfaction: the type argument must have the appropriate methods and, if M requires comparability, T must be a comparable type.

For [T M1[int]], T must satisfy M1 instantiated by int.

For [T M1[M2[S]], S M0]

  1. S must satisfy M0.
  2. S must satisfy the constraint for M2
  3. M2 instantiated by S must satisfy the constraint for M1

For [T M[T]], T must be a type that satisfies M instantiated by T.

Since the only allowed cycles are self loops, the above is sufficient.

Checking the convertibility requirements

Multiple convertibility requirements can be checked independently.

A convertibility requirement applies to two specific types and holds if the inner type is convertible to the outer type. For type arguments, the metatypes of the type parameters do not factor in. This can only be checked for a specific instantiation.

In [S, T interface{} | T(S)], T(S) holds when S = int and T = float64 but not S = string and T = fmt.Stringer.

[T interface{} | int(int)] always holds. (As does int(int8) and T(T)).

[T interface{} | int(struct{})] fails to instantiate for any T.

[R interface{} | R(io.Reader)] only holds if the type argument for R is an interface with an appropriate Read method.

[R interface{} | io.Reader(R)] is a lot like the simpler constraint [R io.Reader]. But they are different. With [R interface{} | io.Reader(R)], a value of R would not have access to the Read method and would not be assignable to io.Reader though an explicit conversion to io.Reader would succeed.

Function argument type deduction

Same as the contracts draft proposal, with one addition. If all the values for a single type argument are untyped numeric literals with different default types, do a third pass and take the "largest" default type used (defined by int <: float64 <: complex128) if the other values are so assignable. That way F(1, 3.14) instantiates to float64 instead of failing to compile.

Potential simplifications

The most dramatic simplification would be to disallow interface types as type arguments. This would make it easier to reason about the code in generic func/method bodies and easier to generate efficient non-specialized versions. It would also disallow useful constructs such as CopyAndConvertSlice[interface{}, string](a, b) and make it hard to mix interface code and generic code.

Removing generic methods would be a minor simplification. The hardest part of adding those are having non-specialized implementations of the methods but those are going to exist regardless.

Removing the ability for reflection to create new instantiations is okay for a v1 but, like generic methods, all the hard parts are taken care of already (at least on this end, reflect still needs better support for methods sets and defined types).

Various examples

func [T interface{}] Filter(s []T, func(S) bool) []T {
  acc := make([]T, 0, len(s))
  for _, v := range s {
    if f(v) {
      acc = append(acc, v)
    }
  }
  return acc
}

ppl := []person{
  {Name: "Bolaetzio", Title: "Magistrate of the Fountains"},
  {Name: "Grutapsious", Title: "Office scamp"},
}
Filter(ppl, func(p person) bool {
  return strings.HasPrefix(p.Title, "Office ")
})


func [T comparable] SliceEquals(a, b []T) bool {
  if len(a) != len(b) {
    return false
  }
  for i := range a {
    if a[i] != b[i] {
      return false
    }
  }
  return true
}

SliceEquals([]int{0, 1, 2}, []int{0, 1, 2})


func [T interface{}] SortSlice(s []T, func(a, b T) bool) // impl. omitted

SortSlice(times, func(a, b time.Time) bool {
  return a.Before(b)
})


func [S, T interface{} | S(T)] SliceConvert(src []T) []S {
  dst := make([]S, len(src))
  for i := range dst {
    dst[i] = S(src[i])
  }
  return dst
}

si := SliceConvert[interface{}, string](s)


func [K comparable, T interface{}] Keys(m map[K]V) []K {
  acc := make([]K, len(m))
  for k := range m {
    acc = append(acc, k)
  }
  return acc
}


type [T comparable] Set map[T]struct{}

func (set Set[T]) Has(e T) bool {
  _, ok := set[e]
  return ok
}

func (set Set[T]) Add(e T) {
  set[e] = struct{}{}
}

func (set Set[T]) Diff(other Set[T]) {
  for e := range other {
    delete(set, e)
  }
}

func [T comparable] Uniq(in <-chan T) <-chan T {
  out := make(chan T)
  go func() {
    seen := Set[T]{}
    for {
      if m := <-in; !seen.Has(m) {
        seen.Add(m)
        out <- m
      }
    }
  }()
  return out
}
@jba
Copy link

jba commented Sep 10, 2018

It looks like this proposal gives up on writing generic math functions, since there's no way to say that the type parameter supports addition or other arithmetic operators.

@jimmyfrasche
Copy link
Author

Sorry I didn't notice this earlier. Gists really need to send notifications.

Without operator overloading there are very few types that have operators (aside from ==), so you end up with a generic system that has this fundamental split between types with operators and types with methods.

It's not useless by any means, but it ends up being very complex and it must overlap with interfaces.

Types with methods are far more flexible. I can add a Less method to any type. I can only reuse existing < operators. If I need to, I can always create a type like

type Int int
func (a Int) Less(b int) bool { return a < b }

or pass in a func(a, b int) bool { return a < b }. I could even use code generation to make a package that has all of these predefined.

That's annoying, but it keeps everything simple and easy to work with when you're creating something more complex than Min.

Let's say we go with a generic system that allows operators. If you're making a very complex data structure that requires knowing if two values are less than another, you either need to do it twice—once for < and once for Less—or you're going to do it once for Less and require wrapper types anyway even though you have the option to use <. At best you could create a generic wrapper for types with a < operator, but you still need wrappers.

IMO: Allowing operators is a false sense of convenience that does not scale. The additional complexity of operators is not worth it. I'd rather have a simple generic system. You can still use code generation to work with types with operators. There aren't that many of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment