package chan_test

import (
	"testing"
)

func BenchmarkStructChan(b *testing.B) {
	ch := make(chan struct{})
	// Drain the channel in the background so each send in the loop
	// measures one unbuffered hand-off.
	go func() {
		for {
			<-ch
		}
	}()
	for i := 0; i < b.N; i++ {
		ch <- struct{}{}
	}
}

func BenchmarkBoolChan(b *testing.B) {
	ch := make(chan bool)
	go func() {
		for {
			<-ch
		}
	}()
	for i := 0; i < b.N; i++ {
		ch <- true
	}
}

func BenchmarkIntChan(b *testing.B) {
	ch := make(chan int)
	go func() {
		for {
			<-ch
		}
	}()
	for i := 0; i < b.N; i++ {
		ch <- 1
	}
}
atotto commented Mar 4, 2014
$ go version
go version go1.5.3 linux/amd64
$ go test -test.bench Bench
testing: warning: no tests to run
PASS
BenchmarkStructChan-4 5000000 313 ns/op
BenchmarkBoolChan-4 5000000 333 ns/op
BenchmarkIntChan-4 5000000 290 ns/op
ok phpsw.net/chan/test 5.725s
On macOS El Capitan:
elie ~/tmp/Benchmarks 0$ go version
go version go1.7.3 darwin/amd64
elie ~/tmp/Benchmarks 0$ go test -bench . chan_test.go
testing: warning: no tests to run
BenchmarkStructChan-8 10000000 210 ns/op
BenchmarkBoolChan-8 10000000 210 ns/op
BenchmarkIntChan-8 10000000 211 ns/op
PASS
ok command-line-arguments 6.991s
Note that this was on an overclocked i7 4770K @ 4.3 GHz.
Channels are actually slow.
You get to pass a message in about 210 ns on a CPU overclocked to 4.3 GHz. Most CPUs run at around 2.x GHz, some at 3.x GHz, so it's easy to end up with results like the other ones above, i.e., 300-400 ns.
That directly means you'll only be able to send around 2.5 million messages per second if that was all you did (1 s / 400 ns = 2.5 million sends). And the bad news? You can't optimize that; that's the maximum it will ever get. Channels aren't the only bottleneck either: goroutine scheduling, your application logic, etc. Unfortunately, once you use Go, you'll likely be forced to use channels at least somewhere, and the overhead of going through many channels will drop your message rate much lower; think less than a million messages/s. Remember, every time you use a channel, that's 300-400 ns.
If you use a Mutex, it's usually around 20 ns of latency. If you get your CPU caches right, that means you'll be able to update locked values 50 million times per second. Of course, that's not the only thing your application does, but it's a huge saving in CPU time compared to 400 ns; mutexes are roughly 20 times faster than channels here.
Depending on your application, if you need the throughput, channels will kill your performance. If you can avoid channels in your hot spots, do so and use a Mutex instead.
Go definitely needs some new "kind of channel" with a lot more performance thought put into it. After all, the current channel behavior is just weird at times: closing an already-closed channel panics, sending to a closed channel panics, receiving from a nil channel blocks forever, etc. Man!
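For reference, a minimal sketch of those behaviors; the lines that would panic are commented out so the snippet itself runs:

package main

func main() {
	ch := make(chan int)
	close(ch)
	// close(ch) // panics: "close of closed channel"
	// ch <- 1   // panics: "send on closed channel"
	v, ok := <-ch // receiving from a closed channel returns the zero value, ok == false
	_, _ = v, ok

	var nilCh chan int
	_ = nilCh
	// <-nilCh   // receiving from a nil channel blocks forever
}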
Hint: you can use buffered channels (a sketch of the buffered variant follows the numbers below):
BenchmarkStructChan 20000000 72.3 ns/op
BenchmarkBoolChan 20000000 73.9 ns/op
BenchmarkIntChan 20000000 73.7 ns/op
vs.
BenchmarkStructChan-2 5000000 288 ns/op
BenchmarkBoolChan-2 5000000 300 ns/op
BenchmarkIntChan-2 5000000 302 ns/op
go version go1.7.5 linux/amd64
on Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
Nothing overclocked
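The buffered run above is presumably just the same benchmarks with a capacity added to make(). A minimal sketch, assuming a buffer of 128 (the exact capacity used above is not stated):

package chan_test

import "testing"

func BenchmarkStructChanBuffered(b *testing.B) {
	// With a non-zero capacity most sends complete without waiting for the
	// receiver, which is where the ~60-75 ns/op numbers come from.
	ch := make(chan struct{}, 128)
	go func() {
		for {
			<-ch
		}
	}()
	for i := 0; i < b.N; i++ {
		ch <- struct{}{}
	}
}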
To unsuspecting visitors of this page, a few things:
- This is not how you should use channels. Do not pass tiny amounts of work over channels; pass around a chunk of work and process it in batches (see the sketch at the end of this comment). You will see almost horizontal scaling of the program using this simple technique.
- Results of the exact same bench on my 3.5 GHz dual-core VirtualBox VM:
BenchmarkStructChan-2 10000000 172 ns/op 0 B/op 0 allocs/op
BenchmarkBoolChan-2 10000000 175 ns/op 0 B/op 0 allocs/op
BenchmarkIntChan-2 10000000 175 ns/op 0 B/op 0 allocs/op
PASS
ok benchmarks/ifac2 5.800s
go version go1.8.3 linux/amd64
So, fret not. Go is fine, it requires a little bit of introspection. It ain't gonna do all the work for you on its own.
- With buffered channels:
go test -bench=. -benchmem -benchtime=1s
BenchmarkStructChan-2 20000000 59.0 ns/op 0 B/op 0 allocs/op
BenchmarkBoolChan-2 30000000 55.9 ns/op 0 B/op 0 allocs/op
BenchmarkIntChan-2 30000000 60.0 ns/op 0 B/op 0 allocs/op
PASS
ok benchmarks/ifac2 4.991s
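A minimal sketch of the batching idea mentioned above; the batch size and the trivial per-item work are assumptions, just to show the shape:

package main

import "sync"

const batchSize = 1024 // assumed; tune to your workload

func worker(ch <-chan []int, wg *sync.WaitGroup) {
	defer wg.Done()
	for batch := range ch {
		for _, v := range batch {
			_ = v // real per-item work goes here
		}
	}
}

func main() {
	ch := make(chan []int, 8) // the channel now carries chunks, not single items
	var wg sync.WaitGroup
	wg.Add(1)
	go worker(ch, &wg)

	batch := make([]int, 0, batchSize)
	for i := 0; i < 1000000; i++ {
		batch = append(batch, i)
		if len(batch) == batchSize {
			ch <- batch // one channel operation amortized over batchSize items
			batch = make([]int, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		ch <- batch
	}
	close(ch)
	wg.Wait()
}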
To add onto that: if you really were trying to send one item at a time, you should be using a mutex from sync instead; a sketch of that follows.
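A minimal sketch of the sync.Mutex alternative, written as a benchmark in the same style as the ones above (the shared counter is just an illustrative stand-in for whatever state you'd protect):

package chan_test

import (
	"sync"
	"testing"
)

func BenchmarkMutexCounter(b *testing.B) {
	var mu sync.Mutex
	counter := 0
	for i := 0; i < b.N; i++ {
		mu.Lock()
		counter++ // the protected update; typically tens of ns per Lock/Unlock pair
		mu.Unlock()
	}
	_ = counter
}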
Here's what I got on a 2 vCPU (Haswell, 3 GHz) VPS with Go 1.6.2 and 1.9.2.
$ go version
go version go1.6.2 linux/amd64
$ go test -test.bench Bench
testing: warning: no tests to run
PASS
BenchmarkStructChan-2 5000000 260 ns/op
BenchmarkBoolChan-2 5000000 283 ns/op
BenchmarkIntChan-2 5000000 264 ns/op
ok _/root/test 4.948s
$ go version
go version go1.9.2 linux/amd64
$ go test -test.bench Bench
goos: linux
goarch: amd64
BenchmarkStructChan-2 10000000 258 ns/op
BenchmarkBoolChan-2 5000000 243 ns/op
BenchmarkIntChan-2 10000000 234 ns/op
PASS
ok _/root/test 5.650s