Since the draft generics design was published in late August, there has been a great deal of feedback and counter-proposals.
This document re-examines the draft design in the light of this feedback and proposes that some changes be made. It also discusses some of the major criticisms that have been levelled but which I think should nevertheless be dismissed.
It is often forgotten that the Go team have spent years thinking about how generics could be implemented and have probably rejected many of the alternative ideas which are now being floated either because they fall short of the desired goals in some respects or are simply unworkable. I think this previous experience should be respected and that one should therefore only depart from the draft design where there is a very good reason for doing so.
It is also worth bearing in mind that the draft design has answers for almost everything (other than stated omissions) and, even if one doesn't like those answers, that is more than can be said for many of the counter-proposals.
So, let's examine the various aspects of the design.
Although some people don't like the proposed syntax at all, I think it is fair to say that the majority of commentators are fairly comfortable with it, though a common criticism is that there is not enough visual discrimination between a function's type parameters and its ordinary parameters - the "too many parentheses" syndrome.
To meet this criticism, I propose that the parentheses should be replaced by square brackets. So:
func someFunction(type T)(param T) T { ... }
would become:
func someFunction[type T](param T) T { ... }
and, at the call-site (ignoring type inference):
var t T = someFunction(T)(param)
would become:
var t T = someFunction[T](param)
As the type keyword is being retained, I think I'm right in saying that this change wouldn't introduce any ambiguities (or any more than there already are) or parsing difficulties.
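The same change applied to a generic type declaration and its instantiation might look like the following sketch (Stack is an illustrative name, not taken from the draft):

type Stack[type T] struct {
    items []T
}

var s Stack[int]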
Incidentally, some people have suggested that type parameters need not be declared at all but can be deduced from the call-site. However, in the draft design it says:
"As you can see, the first decision to make is: how should the type parameter T be declared? In a language like Go, we expect every identifier to be declared in some way.
Here we make a design decision: type parameters are similar to ordinary non-type function parameters, and as such should be listed along with other parameters. However, type parameters are not the same as non-type parameters, so although they appear in the parameters we want to distinguish them. That leads to our next design decision: we define an additional, optional, parameter list, describing type parameters. This parameter list appears before the regular parameters. It starts with the keyword type and lists type parameters."
I believe this is an important principle that would meet the expectations of the majority of Go users and should therefore be maintained.
Many people have expressed disappointment that the draft design makes no provision for writing a single generic function that covers both operators and equivalent methods, particularly in relation to sorting. The problem of course is that only the built-in types (and types based on them) support operators; other types have to rely on equivalent methods.
One suggestion for dealing with this is to introduce named methods for the built-ins corresponding to each operator and then use these named methods when coding unified generic functions. The problems with this are that it would require a language change (the built-ins don't currently have methods) and that a long list of method names (and their signatures) would be needed, which people would have to learn.
Another suggestion is to go one step further than this and allow operators (or a restricted subset of them) to be overloaded. If this were done, then the overloaded operators themselves could be used to code the unified generic functions. This suggestion clearly has the same problems as the first.
Also the Go team has never (to my knowledge) shown any enthusiasm for operator overloading nor, for that matter, has the community at large. It is dismissed in the Go FAQ with this sentence:
"Regarding operator overloading, it seems more a convenience than an absolute requirement. Again, things are simpler without it."
Given what a can of worms operator overloading can be, I'm not really surprised at this attitude. Overloading the equality operator is a particularly thorny problem which could lead to a lot of confusion and there's little doubt that operator overloading in general (apart from some mathematical types) makes code harder to read.
So, I certainly don't think that the Go team will want to introduce operator overloading just to alleviate some aspects of the generic design.
Stepping back a little, is it really such a big deal to have two functions for anything involving operators - one for the relevant built-ins and another for user-defined types which satisfy an equivalent interface?
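To make the question concrete, here is a rough sketch of the two-function situation, using the bracket syntax proposed above (all names here are illustrative, not taken from the draft):

contract ordered(T) {
    T < T
}

// One version for the built-ins (and types based on them):
func Min[type T ordered](a, b T) T {
    if a < b {
        return a
    }
    return b
}

// ...and another for user-defined types satisfying an equivalent interface:
type Lesser interface {
    Less(other Lesser) bool
}

func MinLesser(a, b Lesser) Lesser {
    if a.Less(b) {
        return a
    }
    return b
}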
I don't think this was even highlighted as a problem in the draft design though it was mentioned that operator methods had been omitted. Moreover, as was shown in another recent proposal, it is still possible to code unified functions using basic generics with adaptors even if the code to do this is rather verbose.
So my conclusion on this is that the proposed cures are worse than the disease and we should simply accept that there is (and always will be) an intrinsic difference between the built-ins and user-defined types.
Many people see no need for contracts when we already have what appears to be a similar concept with interfaces.
However, the problem is that contracts can do a lot more than interfaces currently can, and it is not easy to extend the latter to make up the shortfall.
Another important principle enunciated in the design document is this:
"This ability for type parameters to refer to other type parameters illustrates an important point: it should be a requirement for any attempt to add generics to Go that it be possible to instantiate generic code with multiple type arguments that refer to each other in ways that the compiler can check."
This of course is referring to the (in)famous Graph example which is awkward to express using two interfaces but easy to express as one contract.
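To see why two plain interfaces are awkward here, consider this sketch (legal Go today; the names are illustrative, not the draft's exact code). Each interface has to refer to the other interface rather than to the caller's own Node or Edge type, so concrete graph types whose methods return concrete types won't satisfy them:

type NodeIface interface {
    Edges() []EdgeIface
}

type EdgeIface interface {
    Nodes() (from, to NodeIface)
}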
Having wrestled with this problem in my own proposals, my conclusion is that it's better to retain contracts and leave interfaces as they are other than now allowing them to take type parameters.
We now come to probably the most contentious issue of all - what exactly a contract should be allowed to contain.
It is difficult to find anybody who is happy with this as it stands, as it is felt to be too opaque and complex. Most commentators would be much more comfortable with restricting the contents in some way and having a simple, standard way of expressing the various constraints. I've therefore concluded that this aspect needs to be re-thought.
It seems to me that a contract should ideally be able to express the following in relation to each of its type parameters:
- The operators it permits.
- The conversions it allows.
- The interfaces it implements.
- Other embedded contracts it satisfies.
- In the case of a struct, what fields it has and the types of those fields.
- The signature of any other method it has which is not included in an interface.
- Whether it could be used as a condition in an if or for statement.
- The exclusion of certain types which it would otherwise encompass.
There may be others I haven't thought of but these are certainly the most important.
I also think it is unnecessary for the type parameters to be accompanied by a variable of that type, i.e. instead of mycontract(t T), just mycontract(T) will suffice.
So let's take each of the above in turn and see how we could express them in a simple way.
I should point out in advance that many of my suggestions require special syntax but, as that syntax could only appear inside a contract, they would still be backwards compatible.
Although it looks strange at first, I believe the current approach for expressing operators should be retained, as it's simple, clear and at least as good as anything else. So, thumbs up to:
contract comparable(T) {
    T == T
}
I also think that it is a good and convenient idea for some operators to permit other operators as suggested in the draft design. For example:
- T < T implies T <= T, T == T, T >= T, T > T and T != T
- T == T implies T != T
Support for indexing and for the len, cap and range operations could be expressed as follows:
contract sequencing(T, U) {
    T[int] U // a T, when indexed with an int, produces a U
    len(T)
    cap(T)
    range(T) (int, U) // when iterating through a T, an int and a U are produced
}
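A minimal sketch of how such a contract might be used, assuming the draft's way of naming a contract with explicit arguments combined with the bracket syntax proposed above (Last is an illustrative name):

// Last returns the final element of a non-empty sequence.
func Last[type T, U sequencing(T, U)](s T) U {
    return s[len(s)-1]
}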
Again no problem sticking with the status quo:
contract convertible(To, From) {
    To(From)
    From(To)
    From == From
}
Conversions from untyped constants could be dealt with as follows:
contract untyped(T) {
    T(untyped int)
}
The int in the above contract could be replaced by bool, rune, float64, complex128 or string, and the constraint would apply to any untyped constant value which has that type as its default.
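As a short sketch of why this is convenient, combining it with an operator constraint lets an untyped literal be used directly for values of type T (the contract and function names here are illustrative):

contract incrementable(T) {
    T + T
    T(untyped int)
}

func Next[type T incrementable](v T) T {
    return v + 1 // the untyped constant 1 is converted to T
}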
Arguably, all of these untyped conversions are unnecessary anyway and could be replaced by the corresponding typed conversions.
If I is an interface which T satisfies, this can be dealt with as a conversion:
contract ibased(T) {
    I(T)
}
If T and U satisfy another contract E, then it can be dealt with as at present:
contract embedded(T, U) {
    T == T
    U == U
    E(T, U)
}
If T is required to be a struct with a field F of type int, I suggest:
contract sbased(T) {
    struct(F int)(T)
}
Parentheses (rather than braces) are used around the field list to avoid possible confusion with a struct conversion.
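A minimal sketch of a function relying on that constraint (illustrative only):

func GetF[type T sbased](v T) int {
    return v.F // permitted because every T is a struct with a field F int
}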
For methods which are not included in an interface, I suggest we simply state the method signature:
contract mbased(T) {
    func (T) somefunc(int, bool) T
    func (*T) anotherfunc(T, T) *T
    func (*T) variadicfunc(...T) *T
}
Note that we can now express any kind of method here and its return type. We can also distinguish between value and pointer receivers and there can be no confusion with struct fields which have a function type.
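A minimal sketch of a function relying on one of those signatures (illustrative only):

func Transform[type T mbased](v T) T {
    return v.somefunc(1, true) // declared in the contract as func (T) somefunc(int, bool) T
}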
For use as a condition in an if or for statement, quite simply:
contract condition(T) {
    if T
    for T
}
Perhaps the use of one should automatically include the other.
Arguably, they're unnecessary anyway and could be replaced by a bool(T) conversion.
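For illustration, a sketch of a function relying on the condition contract (names illustrative):

func CountTrue[type T condition](vals []T) int {
    n := 0
    for _, v := range vals {
        if v { // allowed because the contract permits T as a condition
            n++
        }
    }
    return n
}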
It has become clear that some means of excluding types is needed in the case of numeric types, where certain operations involving the smaller types simply won't compile if they would produce out-of-range values. Also, the present approach for excluding them seems both clumsy and not particularly intuitive.
I suggest:
contract add1K(T) {
    T + T
    T(untyped int)
    exclude{string, int8, uint8}(T) // can't add a thousand to these
}
Any type in an exclusion clause should automatically exclude any type with the same underlying type. So if you have:
type UByte uint8
then UByte would be excluded as well.
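For illustration, the sort of function that motivates those exclusions (a sketch only):

func AddThousand[type T add1K](v T) T {
    // 1000 would overflow int8 and uint8, and is meaningless for string,
    // hence the exclusion clause in the contract.
    return v + 1000
}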
So that's basically it, folks!
I don't think there's anything here which is particularly difficult to understand, parse or compile and, as far as I can see, it will still deal with all the examples in the draft paper.
These are my thoughts on other aspects raised in the draft design and overview papers as well as issues raised by other commentators.
If a single type parameter only needs to satisfy an interface, then one could use the interface directly in the function/type definition rather than having to embed it in a separate contract.
For example, if you had this where Iface is an interface:
func doSomething[type T Iface]() { ... }
then it would behave exactly the same as if T were subject to a contract temp defined as follows:
contract temp(T) {
    Iface(T)
}
I've seen this proposed elsewhere and I think it would be a sensible convenience feature which should please those who prefer interfaces to contracts in any case.
Although there are ways around it, I think it would be useful to introduce specific syntax for denoting the zero value of a type parameter, my own preference being T{}.
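A minimal sketch of how T{} would read in practice, using the comparable contract from earlier (find is an illustrative name):

func find[type T comparable](s []T, want T) (T, bool) {
    for _, v := range s {
        if v == want {
            return v, true
        }
    }
    return T{}, false // T{} stands for the zero value of T
}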
Although attractive at first sight, I think it might prove confusing if the T == T operator constraint is insufficient for describing the key type, or if the value type also needs to be constrained in some way.
On balance, I think it's best to express this explicitly.
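For example, rather than inferring a key constraint from a map[K]V appearing in the body, the constraint would be spelled out along these lines (a sketch only, with illustrative names):

contract mapkey(K, V) {
    K == K // map keys must support ==
}

func keys[type K, V mapkey](m map[K]V) []K {
    ks := make([]K, 0, len(m))
    for k := range m {
        ks = append(ks, k)
    }
    return ks
}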
My view is that they should not for the reasons given in the draft paper.
It can always be worked around using a top-level function which may be clearer in any case.
I don't really see why not. It's one of those areas where you should simply trust the programmer.
My view is that this should be permitted but the name of the field should always be the (original) type parameter used in its definition even if the struct has a method which uses a different name for the type parameter.
I think that using the different name to refer to that field within the method would be too confusing and, if the different name happened to coincide with the name of another field of the same struct, then it would of course be ambiguous.
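A sketch of the rule I have in mind (names illustrative):

type Wrapper[type T] struct {
    T // a type parameter embedded as a field
}

// Even though this method calls the type parameter U, the embedded field
// is still referred to by its original name, T.
func (w Wrapper[U]) Value() U {
    return w.T
}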
I know that currently the compiler can detect infinite recursion within a struct or interface (and issue an invalid recursive type error) but I'm not aware of a similar test for functions.
Failing that, if it compiles then I think it will have to be allowed on the trust the programmer principle and that the compiler should use the same considerations to determine whether implementation can be left until runtime as it does for non-recursive cases.
var t T = someFunction[T](param)
is ambiguous: is it a generic function call, or is it calling a function obtained from an indexed type? The second pattern is often used in dispatch tables, so it's not something that can easily be ignored.
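The ambiguity can be seen in code that is already legal today (illustrative names):

var handlers = map[string]func(int) int{
    "double": func(n int) int { return n * 2 },
}

const T = "double"

var result = handlers[T](21) // indexing a dispatch table, not instantiating a generic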