The default Go implementation of
sync.RWMutex does not scale well
to multiple cores, as all readers contend on the same memory location
when they all try to atomically increment it. This gist explores an
n
-way RWMutex, also known as a "big reader" lock, which gives each
CPU core its own RWMutex. Readers take only a read lock local to their
core, whereas writers must take all locks in order.