/*
   * There are two fundamental modes here: sequential and parallel. There is very little overlap
   * in semantics between the two apart from the submission side. The whole thing is split into a
   * submission queue (with impure enqueue and cancel functions) drained by the `Worker`, and an
   * internal execution protocol which also involves a queue. The `Worker` encapsulates all of the
   * race conditions and negotiations with impure code, while the `Executor` manages running the
   * tasks with appropriate semantics. In parallel mode, we shard the `Worker`s according to the
   * number of CPUs and select a random queue (in impure code) as a target. This reduces contention
   * at the cost of ordering, which is not guaranteed in parallel mode. In sequential mode, there is
   * only a single `Worker`.
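   *
   * As a rough illustration of the parallel-mode sharding (names here are made up for the sketch;
   * the real fields differ), the impure submission side amounts to something like:
   *
   *   import java.util.concurrent.{LinkedBlockingQueue, ThreadLocalRandom}
   *
   *   val shards: Int = Runtime.getRuntime.availableProcessors()
   *
   *   // one impure submission queue per `Worker`
   *   val queues: Array[LinkedBlockingQueue[Registration]] =
   *     Array.fill(shards)(new LinkedBlockingQueue[Registration])
   *
   *   // impure submit: pick a random shard, trading ordering for reduced contention
   *   def submit(reg: Registration): Unit =
   *     queues(ThreadLocalRandom.current().nextInt(shards)).put(reg)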
   *
   * On the impure side, the queue bit is the easy part: it's just a `LinkedBlockingQueue` which
   * accepts Registration(s). It's easiest to think of this a bit like an actor model, where the
   * `Worker` is the actor and the enqueue is the send. Whenever we send a unit of work, that
   * message has an `AtomicReference` which allows us to back-propagate a cancelation action. That
   * cancelation action can be used in impure code by sending it back to us using the `Finalizer`
   * message. There are certain race conditions involved in canceling work which is still on the
   * queue versus work in the process of being taken off the queue, and those races are negotiated
   * between the impure code and the `Worker`.
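   *
   * For illustration only, the message shapes might look roughly like the following (this is not
   * the real encoding, and the race negotiation around the `AtomicReference` is considerably
   * subtler than the naive cancel shown here):
   *
   *   import cats.effect.IO
   *   import java.util.concurrent.atomic.AtomicReference
   *
   *   sealed trait Registration
   *
   *   // a unit of work; the `Worker` back-propagates its cancelation action into `cancel`
   *   final case class Work(task: IO[Unit], cancel: AtomicReference[IO[Unit]]) extends Registration
   *
   *   // impure code sends a previously back-propagated cancelation action back to us
   *   final case class Finalizer(cancel: IO[Unit]) extends Registration
   *
   *   // impure cancel: if the `Worker` has already published a cancelation action, enqueue it
   *   // (using the `submit` from the sketch above)
   *   def cancelImpure(work: Work): Unit = {
   *     val action = work.cancel.get()
   *     if (action ne null) submit(Finalizer(action))
   *   }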
   *
   * On the pure side, the two different `Executor`s are very distinct. In parallel mode, it's easy:
   * we have a separate `Supervisor` which doesn't respawn actions, and we use that supervisor to
   * spawn a new fiber for each task unit. Cancelation in this mode is easy: we just cancel the fiber.
   * For sequential mode, we spawn a *single* executor fiber on the main supervisor (which respawns).
   * This fiber is paired with a pure unbounded queue and a shutoff latch. New work is placed on the
   * queue, which the fiber takes from in order and executes in place. If the work self-cancels or
   * errors, the executor will be restarted. In the case of external cancelation, we shut off the
   * latch (to hold back new work), drain the entire work queue into a scratch space, then cancel the
   * executor fiber in place, so long as we're sure it's actively working on the target task. Once
   * that cancelation completes (which will ultimately restart the executor fiber), we re-fill the
   * queue and unlock the latch to allow new work (from the `Worker`).
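   *
   * In parallel mode the pure side is essentially just `supervisor.supervise(task)`, with
   * cancelation being `.cancel` on the resulting fiber. A heavily simplified sketch of the
   * sequential machinery, using a one-permit `Semaphore` to stand in for the shutoff latch (the
   * real protocol, particularly the "only if it's actively working on the target task" check, is
   * more involved than this):
   *
   *   import cats.effect.IO
   *   import cats.effect.std.{Queue, Semaphore}
   *   import cats.syntax.all._
   *
   *   def sequentialSketch: IO[(IO[Unit] => IO[Unit], IO[Unit])] =
   *     for {
   *       tasks <- Queue.unbounded[IO, IO[Unit]]
   *       latch <- Semaphore[IO](1)
   *
   *       // the executor fiber: take work in order and execute it in place
   *       // (in reality it is spawned on the respawning supervisor, not bare `start`)
   *       executor <- tasks.take.flatMap(identity).foreverM.start
   *
   *       // new work from the `Worker` is gated by the latch
   *       enqueue = (task: IO[Unit]) => latch.permit.surround(tasks.offer(task))
   *
   *       // external cancelation (roughly): shut the latch, drain the queue into scratch
   *       // space, cancel the executor, re-offer the scratch work, reopen the latch
   *       cancelActive = latch.acquire *>
   *         tasks.tryTakeN(None).flatMap { scratch =>
   *           executor.cancel *> scratch.traverse_(tasks.offer)
   *         } *>
   *         latch.release
   *     } yield (enqueue, cancelActive)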
   *
   * Note that a third mode is possible but not implemented: sequential *without* cancelation. In
   * this mode, we execute each task directly on the worker fiber as it comes in, without the
   * added indirection of the executor queue. This reduces overhead considerably, but the price is that
   * we can no longer support external (impure) cancelation. This is because the worker fiber is
   * *also* responsible for dequeueing from the impure queue, which is where the cancelation tasks
   * come in. The worker can't observe a cancelation action while it's executing another action, so
   * cancelation cannot then preempt and is effectively worthless.
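   *
   * Reusing the illustrative shapes from the earlier sketches, that unimplemented mode would
   * amount to something like the loop below, which makes the problem plain: while `task` runs,
   * the worker fiber cannot take a `Finalizer` off the queue, so by the time one is seen there is
   * nothing left to preempt.
   *
   *   def directWorkerLoop(queue: LinkedBlockingQueue[Registration]): IO[Nothing] =
   *     IO.interruptible(queue.take())
   *       .flatMap {
   *         case Work(task, _) => task    // executed directly on the worker fiber
   *         case Finalizer(_)  => IO.unit // too late to preempt anything
   *       }
   *       .foreverM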
   */