Skip to content

Instantly share code, notes, and snippets.

@richardbuckle
Last active June 12, 2019 21:18
Show Gist options
  • Save richardbuckle/e6064ae2a2d2c9be1f303231d669a289 to your computer and use it in GitHub Desktop.
Save richardbuckle/e6064ae2a2d2c9be1f303231d669a289 to your computer and use it in GitHub Desktop.
Why public mutable collection properties aren't thread-safe

Intro

I've been asked to expand on why C#/.NET classes having public mutable collection properties are incompatible with thread-safety, and why immutable collections or the specialised concurrent collections might be better choices.

All code below can be run in Visual Studio's "C# Interactive" window.

The problem

The problem is this: it's fundamentally not possible to make a class thread-safe which exposes properties that are mutable collections, even if the setter is private.

This is because there is no good way to lock a mutable collection for exclusive access. Consider this:

class A
{
    public List<int> guts {get; private set;} = new List<int>();
}

A a = new A();

Now suppose two threads each hold a reference to a. One is a producer thread that adds integers to a.guts. The other is a consumer thread that removes them and does something with them. There is no good way for the two threads to co-ordinate access: List is fundamentally not suited for this because there is no good way to lock the list for exclusive access.

(There is a bad way involving the Synchronized property, but is has very poor performance and is not 100% safe against race conditions, so Microsoft have quite rightly deprecated it.)

Moreover, making set private doesn't really help us at all because any code can obtain a reference to a's internal list with List<int> reachInside = a.guts;, pass it to any thread, and start manipulating it arbitrarily and in a thread-unsafe manner. So however much we might add lock sections within our own code, external code can torpedo us at any time.

var a = new A();
a.guts.Add(42); // guts is now [42]

List<int> reachInside = a.guts;
someOtherThread(reachInside); // reachInside is now passed to another thread...

// in another thread, possibly client code in a thread of that client's own devising...
reachInside.Add(57); // guts is now [42, 57] via only reachInside, bypassing any locks

The lock keyword is not your friend

Now in C# we have the lock() keyword. It's the most primitive thread-safety construct in the language, so it's tempting to reach for it first, but it's often not the best choice: it has serious performance implications because it makes threads wait upon each other, limiting CPU parallelism, and it is terribly easy to cause deadlocks. Also there are a lot of misconceptions and bad online advice surrounding it:

  1. It locks an object for exclusive access. No, it doesn't. It locks a section of code from being run on more than one thread at the same time (this is more commonly called a critical section). If you want to ensure exclusive write access to a collection, you have manually make sure to add locks to every bit of code that might mutate it, and as we have seen above this is impossible with a public List<T> property, even if the setter is private, as client code can easily obtain a reference to the property and mutate it without taking the lock.
  2. The performance cost is small. No. The performance cost of an individual lock() statement is small, but the cost of locking several sections of code with the same lock can be very severe, as threads that were designed to run in parallel without blocking each other find themselves waiting in each other all the time: in effect they have become serialized again. It's all too easy to turn a multi-core machine back into a single-core machine. This can be huge, and is particularly noticeable if the UI thread becomes blocked on a worker thread.
  3. It is safe to just lock the relevant sections of code and have done with it. Nope. It is incredibly easy to use lock to write code that will throw the program into deadlock, where one thread is waiting on thread A and another is waiting on thread B. I see novices do this time after time. There are lots of rules of thumb for avoiding this, the most current being "only take a lock() on a privately owned object and on code that cannot possibly call out to code you don't control." Given that we've shown that you cannot control access to a mutable collection property, that's tantamount to saying that it's impossible to manage a mutable collection property in a thread-safe manner.

The bottom line is that locks are a severe pain point for both performance and code safety.

The solution

The issues surrounding locks are why MS introduced the concurrent collection classes in .NET 4 and the immutable collection classes in 4.5, both of which enable lock-less concurrent programming.

https://docs.microsoft.com/en-us/dotnet/standard/collections/

The solution is in fact very simple and stems from functional programming. Simply use immutable collection classes:

class A
{
    public ImmutableList<int> guts {get; private set;} = new List<int>();
}

With immutable collection classes, thread safety is no longer an issue.

Addendum: Hoodathunk has pointed me to this very good MSDN article on the immutable collection classes. It is very clear and gives a lot of useful insight into the performance trade-offs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment