More than a year ago, I wrote a blog post titled Context Should Go Away For Go 2, which received a fair amount of support and response. In that post, I described why the "context" package is a bad idea: it's too infectious.
As explained in the blog post, the reason "context" spreads so much, and in such an unhealthy fashion, is that it solves the problem of cancelling long-running procedures.
I promised to follow the blog post (which only complained about the problem) with a solution. Considering the recent progress around Go 2, I decided it's the right time for the follow-up. So, here it is!
My proposed solution is to bake cancellation into the language and thus avoid the need to pass the context around just to be able to cancel long-running procedures. The "context" package could still be kept for the purpose of goroutine-local data; this purpose does not cause it to spread, so that's fine.
In the following sections I'll explain how exactly the baked-in cancellation would work.
One quick point before we start: this proposal does not make it possible to "kill" a goroutine - the cancellation is always cooperative.
I'll explain the proposal in a series of short, very contrived examples.
We start a goroutine:
go longRunningThing()
In Go 1, the go keyword is used to start a goroutine, but doesn't return anything. I propose it should return a function which, when called, cancels the spawned goroutine.
cancel := go longRunningThing()
cancel()
We started a goroutine and then cancelled it immediately.
Now, as I've said, cancellation must be a cooperative operation. The longRunningThing function needs to carry out its own cancellation on request. What could that look like?
func longRunningThing() {
	select {
	case <-time.After(5 * time.Second):
		fmt.Println("finished")
	}
}
This longRunningThing function does not cooperate. It takes 5 seconds no matter what. That's the first takeaway: cancellation is optional - if a goroutine does not support cancellation, it remains unaffected by it. Here's how we add the support:
func longRunningThing() {
	select {
	case <-time.After(5 * time.Second):
		fmt.Println("finished")
	cancelling:
		fmt.Println("cancelled")
	}
}
I propose the select statement gets an additional branch called cancelling (a new keyword) which gets triggered whenever the goroutine is scheduled for cancellation, i.e. when the function returned from the go statement gets called.
The above program would therefore print:
cancelled
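For comparison, the same behavior can be approximated in today's Go with an explicit channel. This is only a sketch under assumptions: spawn is a hypothetical helper (not part of the proposal or the standard library), and the cancelling channel parameter stands in for the proposed cancelling branch.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// spawn is a hypothetical helper that approximates the proposed
// `cancel := go f()`. It runs f in a new goroutine and returns a function
// that requests cancellation and then blocks until f has returned.
func spawn(f func(cancelling <-chan struct{})) (cancel func()) {
	cancelling := make(chan struct{})
	done := make(chan struct{})
	var once sync.Once

	go func() {
		defer close(done)
		f(cancelling)
	}()

	return func() {
		once.Do(func() { close(cancelling) }) // safe to call repeatedly
		<-done
	}
}

// longRunningThing mirrors the example above. The receive from the
// cancelling channel plays the role of the proposed `cancelling:` branch.
func longRunningThing(cancelling <-chan struct{}) string {
	select {
	case <-time.After(5 * time.Second):
		return "finished"
	case <-cancelling:
		return "cancelled"
	}
}

func main() {
	cancel := spawn(func(c <-chan struct{}) {
		fmt.Println(longRunningThing(c))
	})
	cancel() // prints "cancelled"
}
```

The point of the proposal is that neither the helper nor the extra channel parameter would be needed: the runtime would carry the signal implicitly.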
What if the long-running thing spawns some goroutines itself? Does it have to handle their cancellation explicitly? No, it doesn't. All goroutines spawned inside a cancelled goroutine get cancelled first and the originally cancelled goroutine starts its cancellation only after all its 'child' goroutines finish.
For example:
func longRunningThing() {
	go anotherLongRunningThing()
	select {
	case <-time.After(5 * time.Second):
		fmt.Println("finished")
	cancelling:
		fmt.Println("cancelled")
	}
}

func anotherLongRunningThing() {
	select {
	case <-time.After(3 * time.Second):
		fmt.Println("child finished")
	cancelling:
		fmt.Println("child cancelled")
	}
}
This time, running:
cancel := go longRunningThing()
cancel()
prints out:
child cancelled
cancelled
This feature exists because child goroutines usually communicate with the parent goroutine. It's good for the parent goroutine to stay fully intact until the child goroutines finish.
Now let's say that, instead of spawning another goroutine, longRunningThing needs to execute anotherLongRunningThing three times sequentially, like this (anotherLongRunningThing remains unchanged):
func longRunningThing() {
	anotherLongRunningThing()
	anotherLongRunningThing()
	anotherLongRunningThing()
}
This time, longRunningThing doesn't handle cancellation at all. However, cancellation propagates to all nested calls. Cancelling this longRunningThing would print:
child cancelled
child cancelled
child cancelled
All anotherLongRunningThing calls got cancelled one by one.
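To see why every nested call observes the cancellation, here is a rough Go 1 approximation. The cancelling channel is an assumption of this sketch: it is passed explicitly here, whereas the proposal would make it implicit. Once the channel is closed, every select that listens on it takes that branch immediately.

```go
package main

import (
	"fmt"
	"time"
)

// anotherLongRunningThing reports how it ended. The cancelling channel
// stands in for the proposed `cancelling:` branch.
func anotherLongRunningThing(cancelling <-chan struct{}) string {
	select {
	case <-time.After(3 * time.Second):
		return "child finished"
	case <-cancelling:
		return "child cancelled"
	}
}

// longRunningThing has no cancellation logic of its own. It only forwards
// the channel to each nested call.
func longRunningThing(cancelling <-chan struct{}) []string {
	return []string{
		anotherLongRunningThing(cancelling),
		anotherLongRunningThing(cancelling),
		anotherLongRunningThing(cancelling),
	}
}

func main() {
	cancelling := make(chan struct{})
	close(cancelling) // the goroutine is already in cancelling mode

	for _, line := range longRunningThing(cancelling) {
		fmt.Println(line) // prints "child cancelled" three times
	}
}
```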
What if anotherLongRunningThing can fail, or just wants to signal it was cancelled instead of finishing successfully? We can make it return an error:
func anotherLongRunningThing() error {
	select {
	case <-time.After(3 * time.Second):
		return nil
	cancelling:
		return errors.New("cancelled")
	}
}
Now we update longRunningThing to handle the error (using the new error handling proposal):
func longRunningThing() error {
	check anotherLongRunningThing()
	check anotherLongRunningThing()
	check anotherLongRunningThing()
	return nil
}
In this version, longRunningThing returns the first error it encounters while executing anotherLongRunningThing three times sequentially. But how do we receive the error? We spawned the function in a goroutine and there's no way to get the return value of a goroutine in Go 1.
Here comes the last part of my proposal: the function returned from the go statement has the same return values as the function that was set to run in the goroutine. So, in our case, the cancel function has type func() error:
cancel := go longRunningThing()
err := cancel()
fmt.Println(err)
This prints:
cancelled
However, if we waited 10 seconds before cancelling the goroutine (longRunningThing takes 9 seconds), we'd get no error, because the function finished successfully:
cancel := go longRunningThing()
time.Sleep(10 * time.Second)
err := cancel()
fmt.Println(err)
Prints out:
<nil>
And lastly, say we have a function called getGoods which contacts some service, gets some goods back, and sends them on a channel. We only want to wait for the goods for 5 seconds, no more. Here's how we implement a timeout:
goods := make(chan Good)
cancel := go getGoods(goods)
select {
case x := <-goods:
	// enjoy the goods
case <-time.After(5 * time.Second):
	err := cancel()
	return errors.Wrap(err, "timed out")
}
And that is the end of this series of short examples. I've shown all of the proposed features. In the next section, I'll describe the features more carefully and explain precisely how they'd work.
I propose to extend the go statement to return a function that takes no arguments and whose return types are the same as those of the function called in the go statement, including multiple return values. Secondly, I propose to extend the select statement with an optional cancelling branch.
For example:
var (
	f1 func() float64           = go math.Sqrt(100)
	f2 func() (*os.File, error) = go os.Open("file.txt")
	f3 func() int               = go rand.Intn(20)
)
Calling the function returned from the go statement blocks until the spawned goroutine returns, and then returns exactly what the spawned function returned. Calling the returned function again has no additional effect and always returns the same results.
Calling the functions assigned above results in this:
fmt.Println(f1()) // 10
fmt.Println(f2()) // &{0xc4200920f0} <nil>
fmt.Println(f3()) // 17
// we can call them as many times as we want
fmt.Println(f3(), f3(), f3()) // 17 17 17
Furthermore, calling the returned function causes the spawned goroutine to start a cancellation process. The cancellation process has two stages:
- Cancelling all child goroutines (goroutines spawned inside the goroutine that is being cancelled). This stage finishes when all child goroutines finish cancellation.
- Switching into the cancelling mode. In this mode, all select statements always select the cancelling branch if present. If not present, they function normally.
Eventually the goroutine returns. The call to the function returned from the go statement then unblocks and returns the goroutine's return values.
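The two stages can be modelled in today's Go to check the ordering. The sketch below rests on several assumptions: scope, spawn and run are hypothetical names, the scope struct stands in for bookkeeping the runtime would do, and the ready channel exists only so the demo cancels after the child has been registered (a real implementation would have to handle that race itself).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// scope is a hypothetical stand-in for per-goroutine runtime bookkeeping.
type scope struct {
	cancelling chan struct{} // closed when this goroutine should cancel
	done       chan struct{} // closed when this goroutine has returned
	once       sync.Once

	mu       sync.Mutex
	children []*scope
}

// spawn runs f in a new goroutine, registered as a child of parent
// (which may be nil for a top-level goroutine).
func spawn(parent *scope, f func(s *scope)) *scope {
	s := &scope{cancelling: make(chan struct{}), done: make(chan struct{})}
	if parent != nil {
		parent.mu.Lock()
		parent.children = append(parent.children, s)
		parent.mu.Unlock()
	}
	go func() {
		defer close(s.done)
		f(s)
	}()
	return s
}

// cancel runs both stages and then waits for the goroutine to return.
func (s *scope) cancel() {
	s.mu.Lock()
	children := append([]*scope(nil), s.children...)
	s.mu.Unlock()
	for _, c := range children {
		c.cancel() // stage 1: children finish their cancellation first
	}
	s.once.Do(func() { close(s.cancelling) }) // stage 2: cancelling mode
	<-s.done
}

// run cancels a parent with one child and returns the events in order.
func run() []string {
	var (
		mu     sync.Mutex
		events []string
	)
	record := func(e string) {
		mu.Lock()
		events = append(events, e)
		mu.Unlock()
	}

	ready := make(chan struct{})
	parent := spawn(nil, func(s *scope) {
		spawn(s, func(c *scope) {
			select {
			case <-time.After(3 * time.Second):
				record("child finished")
			case <-c.cancelling:
				record("child cancelled")
			}
		})
		close(ready) // demo only: the child is now registered
		select {
		case <-time.After(5 * time.Second):
			record("finished")
		case <-s.cancelling:
			record("cancelled")
		}
	})

	<-ready
	parent.cancel()
	return events
}

func main() {
	fmt.Println(run()) // [child cancelled cancelled]
}
```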
The mechanism can also be used for other purposes. One that comes to mind is using the functions returned from the go statement as futures. Indeed, this is a common pattern in Go:
ch := make(chan T)
go func() {
	ch <- function()
}()
// some other code
x := <-ch
This whole boilerplate is here just to execute function concurrently and use its return value later. With my proposal, we could simplify that code like this:
future := go function()
// some other code
x := future()
Of course, this would only work if function didn't support cancellation, but most functions shouldn't support it, and those that do should document it.