@polytypic · Last active October 5, 2023
A generalized approach to turn a functional data structure into a lock-free starvation avoiding data structure

Any functional data structure can be turned into a lock-free data structure by wrapping it inside an Atomic. For example, consider the following set implementation:

```ocaml
module Identity_set : sig
  type 'a t
  val empty : 'a t
  val add : 'a -> 'a t -> 'a t
  val contains : 'a -> 'a t -> bool
  val remove : 'a -> 'a t -> 'a t
end = struct
  type 'a t = 'a list
  let empty = []
  (* O(1): simply conses the element onto the list. *)
  let add x' xs = x' :: xs
  let contains x' xs = List.exists ((==) x') xs
  (* O(n): rebuilds the prefix up to the first physically equal element. *)
  let[@tail_mod_cons] rec remove x' = function
    | [] -> []
    | x :: xs -> if x == x' then xs else x :: remove x' xs
end
```
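As a quick illustration (a sketch assuming the Identity_set module above is in scope), the structure is persistent, and membership is decided by physical equality, so two structurally equal references are distinct elements:

```ocaml
(* Assumes the [Identity_set] module defined above is in scope. *)
let () =
  let a = ref 1 and b = ref 1 in  (* structurally equal, physically distinct *)
  let s = Identity_set.(empty |> add a |> add b) in
  assert (Identity_set.contains a s);
  assert (Identity_set.contains b s);
  (* [remove] returns a new set; the original set [s] is unchanged. *)
  let s' = Identity_set.remove a s in
  assert (not (Identity_set.contains a s'));
  assert (Identity_set.contains a s)
```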

One can turn that into a lock-free data structure by wrapping it inside an Atomic:

```ocaml
module Lock_free_identity_set : sig
  type 'a t
  val create : unit -> 'a t
  val add : 'a t -> 'a -> unit
  val contains : 'a t -> 'a -> bool
  val remove : 'a t -> 'a -> unit
end = struct
  type 'a t = 'a Identity_set.t Atomic.t
  let create () = Atomic.make Identity_set.empty
  let rec add t x =
    let before = Atomic.get t in
    let after = Identity_set.add x before in
    if not (Atomic.compare_and_set t before after) then
      add t x
  let contains t x = Identity_set.contains x (Atomic.get t)
  let rec remove t x =
    let before = Atomic.get t in
    let after = Identity_set.remove x before in
    if not (Atomic.compare_and_set t before after) then
      remove t x
end
```
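A quick smoke test (a sketch assuming OCaml 5, for Domain, and the modules above): two domains add disjoint elements in parallel, and every element is present afterwards:

```ocaml
(* Assumes OCaml 5 (for [Domain]) and the modules defined above. *)
let () =
  let t = Lock_free_identity_set.create () in
  let xs = Array.init 100 (fun i -> ref i) in
  let add_range lo hi () =
    for i = lo to hi - 1 do Lock_free_identity_set.add t xs.(i) done
  in
  let d1 = Domain.spawn (add_range 0 50) in
  let d2 = Domain.spawn (add_range 50 100) in
  Domain.join d1;
  Domain.join d2;
  (* Every element added by either domain ends up in the set. *)
  assert (Array.for_all (fun x -> Lock_free_identity_set.contains t x) xs)
```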

While the above is a correct lock-free data structure, it has a major problem: it is prone to starvation.

That is because different operations on the functional set implementation take different amounts of time: the operation to add an element is O(1), while the operation to remove an element is O(n). Suppose two domains run in parallel, one repeatedly adding new elements to the set and the other removing them. Once the set grows sufficiently large, add operations tend to finish before remove operations, so the compare-and-set attempted by the removing domain keeps failing and the remove operations never make progress.

How do we fix this?

Let's first examine a potentially obvious, but incorrect solution.

Let's assume we had a safe Lazy implementation that allows a lazy value to be forced in parallel from multiple domains. One could then wrap the computations with Lazy:

```ocaml
module Blocking_identity_set : sig
  type 'a t
  val create : unit -> 'a t
  val add : 'a t -> 'a -> unit
  val contains : 'a t -> 'a -> bool
  val remove : 'a t -> 'a -> unit
end = struct
  type 'a t = 'a Identity_set.t Lazy.t Atomic.t
  let create () = Atomic.make (Lazy.from_val Identity_set.empty)
  let rec add t x =
    let before = Atomic.get t in
    let after = lazy (Identity_set.add x (Lazy.force before)) in
    if Atomic.compare_and_set t before after then
      Lazy.force after |> ignore
    else
      add t x
  let contains t x = Identity_set.contains x (Lazy.force (Atomic.get t))
  let rec remove t x =
    let before = Atomic.get t in
    let after = lazy (Identity_set.remove x (Lazy.force before)) in
    if Atomic.compare_and_set t before after then
      Lazy.force after |> ignore
    else
      remove t x
end
```

As the name suggests, that is actually not lock-free. It is not even non-blocking. That is because forcing a lazy value blocks the caller until the domain that first forced it has completed the operation.

Let's then create a lock-free implementation that avoids starvation. We do that by making all operations take roughly the same time, so that no matter which operations are attempted in parallel, they all have roughly the same probability of winning the race.

```ocaml
module Starvation_avoiding_identity_set : sig
  type 'a t
  val create : unit -> 'a t
  val add : 'a t -> 'a -> unit
  val contains : 'a t -> 'a -> bool
  val remove : 'a t -> 'a -> unit
end = struct
  type 'a operation =
    | Id of 'a Identity_set.t
    | Add of 'a * 'a Identity_set.t
    | Remove of 'a * 'a Identity_set.t
  type 'a t = 'a operation Atomic.t
  let create () = Atomic.make (Id Identity_set.empty)
  let complete = function
    | Id xs -> xs
    | Add (x, xs) -> Identity_set.add x xs
    | Remove (x, xs) -> Identity_set.remove x xs
  let rec add t x =
    let before = Atomic.get t in
    let intent = Add (x, complete before) in
    if not (Atomic.compare_and_set t before intent) then
      add t x
  let rec contains t x =
    match Atomic.get t with
    | Id xs -> Identity_set.contains x xs
    | intent ->
      let xs = complete intent in
      let after = Id xs in
      if Atomic.compare_and_set t intent after then
        Identity_set.contains x xs
      else
        contains t x
  let rec remove t x =
    let before = Atomic.get t in
    let intent = Remove (x, complete before) in
    if Atomic.compare_and_set t before intent then
      (* We explicitly make sure to complete the operation after publishing it
         to avoid potentially leaking the [x] indefinitely. *)
      let after = Id (complete intent) in
      Atomic.compare_and_set t intent after |> ignore
    else
      remove t x
end
```
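The interface is identical to the earlier lock-free version, so this is a drop-in replacement. A small sketch (assuming the modules above are in scope) shows the two-step behavior: add merely publishes an intent, which a later operation completes:

```ocaml
(* Assumes the modules defined above are in scope. *)
let () =
  let t = Starvation_avoiding_identity_set.create () in
  let x = ref 42 in
  (* Publishes an [Add] intent; the set itself is not yet updated. *)
  Starvation_avoiding_identity_set.add t x;
  (* [contains] first completes the pending intent, then reads the set. *)
  assert (Starvation_avoiding_identity_set.contains t x);
  Starvation_avoiding_identity_set.remove t x;
  assert (not (Starvation_avoiding_identity_set.contains t x))
```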

As one can see, all operations on the starvation avoiding set take roughly the same amount of time in their critical sections, between Atomic.get and Atomic.compare_and_set. The reason for that is simple: operations are delayed by one step, so each operation first completes the previous operation (whatever that was) and then merely records the next one. It no longer matters that remove operations are slower, because every domain racing on a given state will try to perform the same completion.

With proper use of a backoff mechanism, this technique should give acceptable performance. Read-only operations can proceed largely in parallel. Read-write operations are serialized, which means that they do not scale to take advantage of parallelism.
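As a sketch of such a backoff mechanism (the Backoff module and add_with_backoff below are hypothetical helpers, not part of the code above), one might spin with Domain.cpu_relax for an exponentially growing number of iterations after each failed compare-and-set:

```ocaml
(* A minimal exponential-backoff sketch; [Backoff] is a hypothetical helper,
   not part of the sets above.  Production code would typically also
   randomize the spin count to avoid lock-step retries. *)
module Backoff = struct
  type t = int ref
  let create () = ref 1
  let once b =
    for _ = 1 to !b do Domain.cpu_relax () done;
    b := min (!b * 2) 4096  (* cap the spin count *)
end

(* Example: a retry loop for [add] using the backoff (assumes the
   [Identity_set] module above). *)
let rec add_with_backoff b t x =
  let before = Atomic.get t in
  let after = Identity_set.add x before in
  if not (Atomic.compare_and_set t before after) then begin
    Backoff.once b;
    add_with_backoff b t x
  end
```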
