This document is now a blog post! You can find it at https://beaurivage.io/atsamd-hal-async/

The atsamd-hal async story

This document aims to explain the concepts and design decisions that led to adding async support to the HAL.

Understanding async in an embedded context

async/await is a fantastic tool to leverage concurrency, especially on single-core systems, where some tasks may be offloaded to peripherals, freeing up the CPU to do other and better things. Instead of polling these peripherals - by either busy-waiting, or using interrupts with a potentially complex state machine to check on progress - we can abstract that complexity away to the compiler, and write code that has the appearance of being straightforward and linear.

The three building blocks of `async/await` in Rust

To make cooperative multitasking possible in an embedded context, we will need three things:

A Future: The task we want to perform. Conceptually, a Future is just a task that will complete eventually. The trait looks like this:
```
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output>;
}
```
It returns Poll, which just indicates whether the task is ready to return:
```
pub enum Poll<T> {
    Ready(T),
    Pending,
}
```
An async function or block is just syntax sugar for a function which returns a future (some details erased for clarity):
```
async fn some_function() {}
// Desugars to
fn some_function() -> impl Future<Output = ()>
```
In essence, awaiting a future just means calling poll() until the future returns Poll::Ready. Note that a future should do nothing until it is first polled. If it returns Poll::Pending, it will be polled again some time later, until it returns Ready. A future is free to panic (but not to exhibit undefined behavior) if it is polled again after it has returned Ready.
An executor: Schedules the tasks to be run. It's responsible for polling tasks when they are ready to make progress, and for parking tasks that aren't. In the embedded world, the two most popular executors are probably embassy-executor and rtic.
A reactor: The name says it all: it reacts to stuff. Its job is to listen to external events, and wake the executor when a task is ready to make progress. Typically, a std runtime like tokio bundles its own reactors with its executor. In the embedded world, when building our own futures from scratch, we will need to provide those reactors ourselves.
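
To make the polling contract concrete, here is a minimal sketch of a hand-written future that runs on a host with std. The names CountDown, NoopWaker and poll_to_completion are made up for this illustration and are not part of atsamd-hal.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that reports `Pending` a fixed number of times
/// before completing.
struct CountDown {
    remaining: u32,
}

impl Future for CountDown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("done");
        }
        self.remaining -= 1;
        // Ask to be polled again right away. A real future would instead
        // hand the waker to whatever signals its progress.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// A waker that does nothing; enough to drive a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` until it is ready. Returns its output together with
/// the number of `poll` calls that were needed.
fn poll_to_completion<F: Future>(future: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = std::pin::pin!(future);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return (value, polls);
        }
    }
}
```

Calling poll_to_completion(CountDown { remaining: 2 }) returns ("done", 3): two Pending results followed by one Ready.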

How do we know when a task is ready?

Alright, so a naive executor implementation could repeatedly poll a future in a busy loop until it returns:

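```
use core::future::Future;
use core::pin::pin;
use core::task::Poll;

async fn some_future() {
    // ...
}

fn naive_executor() {
    // A future must be pinned before it can be polled
    let mut future = pin!(some_future());

    let result = loop {
        if let Poll::Ready(value) = future.as_mut().poll(/* what should we put in here? */) {
            break value;
        }
    };
}
```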

However, this completely defeats the purpose of using cooperative multitasking, because our CPU is completely tied up in repeatedly checking if our task is ready. Therefore, we can't free it up to do other things while we're waiting for this task to complete. So how does the executor know when a task is ready to make progress?

Wakers

The Future::poll method takes a &mut Context argument, which itself contains a Waker. The executor provides this waker when polling a future, and expects Waker::wake to be called once the future is ready to make more progress. When constructing our own futures, we need some way to know when we're ready to wake the executor.
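
As a rough sketch of the executor side of this contract, here is a single-task block_on for a host with std. ThreadWaker and block_on are illustrative names, not the way embassy-executor or rtic are implemented. The executor sleeps after a Pending result and only polls again once the waker has been called.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread running the executor.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical single-task executor: polls once, then sleeps until the
/// waker is called instead of spinning.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until `wake` unparks this thread. `park` may also
            // return spuriously, which only costs one extra poll.
            Poll::Pending => thread::park(),
        }
    }
}
```

With this in place, block_on(async { 1 + 2 }) evaluates to 3. On a microcontroller there is no thread to park; the analogous step is usually to put the core to sleep until an interrupt arrives.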

You guessed it - Interrupts

Conveniently enough, microcontrollers have interrupts! They're perfect for this task - they can preempt whatever code is running at the time, wake the waker, and let the executor know that we're ready to move forward. The interrupt handlers are our reactors - they react to external events (the peripherals making progress), and can wake the executor to signal that progress has been made.

How to build a future from scratch - microcontroller style

Let's now dive into the design of the atsamd-hal async APIs. Let's take the example of an async timer, as it has the simplest API.

Our goal is to start with TimerCounter, and end up with the following function:

```
pub async fn delay(&mut self, count: NanosDurationU32) {
    // ...
}
```

To get there, we need to take multiple steps:

Figure out a way to tie a particular peripheral to its interrupt handler. We want the peripheral to take ownership of the handler - we don't want to rely on the user calling the right functions in the handler, which could be very error-prone.
Wake the waker inside the interrupt handler.
Register the waker to be woken inside our delay method.

Tying an interrupt handler to a peripheral

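To provide a reusable way to bind an interrupt handler to an interrupt source, we leverage two traits, Handler and Binding, along with an InterruptSource trait that represents the interrupt itself: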

```
// src/async_hal/interrupt.rs

// Represents a struct that holds an interrupt handler
pub trait Handler<I: InterruptSource>: Sealed {
    // The actual handler
    unsafe fn on_interrupt();
}

// Represents a valid binding of interrupt source to handler
pub unsafe trait Binding<I: InterruptSource, H: Handler<I>> {}

// This trait has a blanket implementation for every interrupt in the PAC.
pub trait InterruptSource: crate::typelevel::Sealed {
    unsafe fn enable();

    fn disable();

    fn unpend();

    fn set_priority(prio: Priority);
}
```
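
To see how a Binding acts as a compile-time proof, here is a stripped-down sketch that compiles on a host with std. The traits are simplified copies of the ones above (no Sealed bound, no enable/disable methods), and Tc0Irq, Tc0Handler, Irqs, tc0_vector and AsyncTimer are hypothetical names invented for this example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Stand-in for an interrupt line.
trait InterruptSource {}

/// A type that knows how to service interrupt `I`.
trait Handler<I: InterruptSource> {
    fn on_interrupt();
}

/// Proof that handler `H` is installed for interrupt `I`.
/// `unsafe` because the compiler cannot check that claim.
unsafe trait Binding<I: InterruptSource, H: Handler<I>> {}

/// Hypothetical interrupt line for a timer.
struct Tc0Irq;
impl InterruptSource for Tc0Irq {}

/// Counts how often the handler ran, so the effect is observable.
static TC0_EVENTS: AtomicU32 = AtomicU32::new(0);

/// Hypothetical handler type for that line.
struct Tc0Handler;
impl Handler<Tc0Irq> for Tc0Handler {
    fn on_interrupt() {
        TC0_EVENTS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Zero-sized token asserting that `Tc0Handler` services `Tc0Irq`.
#[derive(Clone, Copy)]
struct Irqs;
unsafe impl Binding<Tc0Irq, Tc0Handler> for Irqs {}

/// Plays the role of the vector table entry for the interrupt.
fn tc0_vector() {
    <Tc0Handler as Handler<Tc0Irq>>::on_interrupt();
}

/// Hypothetical async driver that insists on a valid binding.
struct AsyncTimer {
    _private: (),
}

impl AsyncTimer {
    /// Only compiles when `B` proves the right handler is bound.
    fn new<B: Binding<Tc0Irq, Tc0Handler>>(_irqs: B) -> Self {
        AsyncTimer { _private: () }
    }
}
```

AsyncTimer::new(Irqs) compiles, whereas passing any type that lacks the matching Binding impl is rejected at compile time. The token carries no data; its only job is to satisfy the trait bound.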

First, we need a place to store the wakers that our executor gives us. This needs to be static storage, since the poll method may return before the task is ready. We need to keep the waker around until we've returned Ready.

```
// src/peripherals/timer/async_api.rs
use core::sync::atomic::{AtomicBool, Ordering};
use core::task::Waker;
use embassy_sync::waitqueue::AtomicWaker;

struct State {
    // AtomicWaker is part of `embassy-sync`, and provides a convenient way to hold
    // a waker that can be woken with a shared reference, meaning we can store it
    // in a static variable.
    waker: AtomicWaker,
    ready: AtomicBool,
}

impl State {
    /// Create an empty state: no waker registered, not ready
    const fn new() -> Self {
        Self {
            waker: AtomicWaker::new(),
            ready: AtomicBool::new(false),
        }
    }

    /// Store the waker the executor gave us in our atomic waker
    fn register(&self, waker: &Waker) {
        self.waker.register(waker)
    }

    /// Wake the executor
    fn wake(&self) {
        self.ready.store(true, Ordering::SeqCst);
        self.waker.wake()
    }

    /// Is our task ready? Reading the flag also clears it.
    fn ready(&self) -> bool {
        self.ready.swap(false, Ordering::SeqCst)
    }
}

// Each TC gets its own entry
const STATE_NEW: State = State::new();
static STATE: [State; NUM_TIMERS] = [STATE_NEW; NUM_TIMERS];
```
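
The handshake between register, wake and ready can be traced on a host with std. The sketch below is an approximation rather than the HAL's code: a Mutex<Option<Waker>> stands in for AtomicWaker, and poll_ready, CountingWaker and handshake are hypothetical helpers added for the demonstration.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Host-side stand-in for `State`; the mutex replaces `AtomicWaker`.
struct State {
    waker: Mutex<Option<Waker>>,
    ready: AtomicBool,
}

impl State {
    const fn new() -> Self {
        Self { waker: Mutex::new(None), ready: AtomicBool::new(false) }
    }

    fn register(&self, waker: &Waker) {
        *self.waker.lock().unwrap() = Some(waker.clone());
    }

    /// What the interrupt handler would call.
    fn wake(&self) {
        self.ready.store(true, Ordering::SeqCst);
        let waker = self.waker.lock().unwrap().take();
        if let Some(waker) = waker {
            waker.wake();
        }
    }

    fn ready(&self) -> bool {
        self.ready.swap(false, Ordering::SeqCst)
    }

    /// Register first, then check the flag, so a wake-up arriving
    /// between the two steps is not lost.
    fn poll_ready(&self, cx: &mut Context<'_>) -> Poll<()> {
        self.register(cx.waker());
        if self.ready() { Poll::Ready(()) } else { Poll::Pending }
    }
}

/// Waker that counts how many times it was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Walks through one wait cycle. Returns whether each of three polls
/// was ready, plus the number of wake-ups the executor received.
fn handshake() -> ([bool; 3], usize) {
    let state = State::new();
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);

    let before = state.poll_ready(&mut cx).is_ready(); // nothing happened yet
    state.wake(); // the "interrupt" fires
    let after = state.poll_ready(&mut cx).is_ready(); // flag seen and cleared
    let again = state.poll_ready(&mut cx).is_ready(); // next wait starts clean
    ([before, after, again], counter.0.load(Ordering::SeqCst))
}
```

handshake() returns ([false, true, false], 1). Because ready() swaps the flag back to false, a completed wait does not leak into the next one.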

We can then write the interrupt handler:

```
// src/peripherals/timer/async_api.rs

/// Interrupt handler for async timer operations
pub struct InterruptHandler<T: AsyncCount16> {
    _private: (),
    _tc: PhantomData<T>,
}

impl<A: AsyncCount16> Handler<A::Interrupt> for InterruptHandler<A> {
    unsafe fn on_interrupt() {
        // Steal the TC peripheral. We will only mess with the interrupt flags.
        let periph = unsafe { crate::pac::Peripherals::steal() };
        let tc = A::reg_block(&periph);
        let intflag = &tc.count16().intflag();

        // Check if the overflow interrupt flag is set for this TC
        if intflag.read().ovf().bit_is_set() {
            // Clear the flag so we don't reenter the handler in a loop
            intflag.modify(|_, w| w.ovf().set_bit());
            // Wake the waker!
            STATE[A::STATE_ID].wake();
        }
    }
}
```

Now we can create a newtype struct that wraps a TimerCounter, statically ensuring that the correct interrupt source has been bound to the handler:

```
// src/peripherals/timer/async_api.rs

pub struct TimerFuture<T>
where
    T: AsyncCount16,
{
    timer: TimerCounter<T>,
}

impl<T> TimerCounter<T>
where
    T: AsyncCount16,
{
    /// Transform a [`TimerCounter`] into a [`TimerFuture`]
    #[inline]
    pub fn into_future<I>(mut self, _irq: I) -> TimerFuture<T>
    where
        I: Binding<T::Interrupt, InterruptHandler<T>>,
    {
        T::Interrupt::unpend();
        unsafe { T::Interrupt::enable() };
        self.enable_interrupt();

        TimerFuture { timer: self }
    }
}
```

And finally, write our async method:

```
// src/peripherals/timer/async_api.rs
impl<T> TimerFuture<T>
where
    T: AsyncCount16,
{
    pub async fn delay(&mut self, count: NanosDurationU32) {
        self.timer.start(count);
        self.timer.enable_interrupt();

        // poll_fn is a nice way to avoid writing a struct that implements Future.
        // The closure will be called every time the waker is woken. That's why we need
        // to start the timer outside the closure, otherwise it would be restarted every
        // time the future is polled!
        poll_fn(|cx| {
            // Register the waker the executor gave us into the corresponding AtomicWaker
            STATE[T::STATE_ID].register(cx.waker());
            // The interrupt handler determines if the task is done.
            if STATE[T::STATE_ID].ready() {
                return Poll::Ready(());
            }

            Poll::Pending
        })
        .await;
    }
}
```
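
The split between "run once" setup and the poll_fn closure can be checked on a host with std. In this sketch, operation, count_runs and NoopWaker are hypothetical names, and plain counters stand in for starting the timer and for checking the ready flag.

```rust
use std::cell::Cell;
use std::future::{poll_fn, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; we drive the future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical operation: one setup step, then a wait that needs
/// `polls_needed` polls before it completes.
async fn operation(setup_runs: &Cell<u32>, closure_runs: &Cell<u32>, polls_needed: u32) {
    // Runs once, on the first poll (like `self.timer.start(count)`).
    setup_runs.set(setup_runs.get() + 1);

    poll_fn(|_cx| {
        // Runs again on every poll of the future.
        closure_runs.set(closure_runs.get() + 1);
        if closure_runs.get() >= polls_needed {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    })
    .await;
}

/// Polls `operation` to completion.
/// Returns (times the setup ran, times the closure ran).
fn count_runs(polls_needed: u32) -> (u32, u32) {
    let setup_runs = Cell::new(0);
    let closure_runs = Cell::new(0);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut future = pin!(operation(&setup_runs, &closure_runs, polls_needed));
    while future.as_mut().poll(&mut cx).is_pending() {}

    (setup_runs.get(), closure_runs.get())
}
```

count_runs(3) returns (1, 3): the setup code ran once while the closure ran on each of the three polls. Had the setup been placed inside the closure, it would have run three times as well.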

Registering interrupts: the `bind_interrupts` macro

The user-facing mechanism to bind an interrupt source to the correct interrupt handler is the bind_interrupts macro. This macro does two things:

Creates a ZST struct that implements Binding. This struct may be passed to any peripheral to statically prove that the correct interrupt source has been bound to the correct interrupt handler for a given peripheral.
Declares the interrupt handler function, which calls Handler::on_interrupt().

```
use atsamd_hal::async_hal::interrupts;

// For example,
atsamd_hal::bind_interrupts!(struct Irqs {
    EIC => atsamd_hal::eic::InterruptHandler;
});

// Expands to
#[derive(Copy, Clone)]
struct Irqs;

#[allow(non_snake_case)]
#[no_mangle]
unsafe extern "C" fn EIC() {
    <atsamd_hal::eic::InterruptHandler as interrupts::Handler<interrupts::EIC>>::on_interrupt();
}

unsafe impl interrupts::Binding<
    interrupts::EIC,
    atsamd_hal::eic::InterruptHandler
> for Irqs
where
    interrupts::EIC: interrupts::SingleInterruptSource
{
}
```
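
For readers curious how such a macro can be put together, here is a heavily simplified sketch that compiles on a host with std. It is not the HAL's macro: bind_interrupts_sketch, EicHandler, EIC_EVENTS and requires_binding are hypothetical, the traits are reduced copies of the real ones, and the generated function is a plain Rust function instead of an exported extern "C" vector.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

trait InterruptSource {}

trait Handler<I: InterruptSource> {
    fn on_interrupt();
}

unsafe trait Binding<I: InterruptSource, H: Handler<I>> {}

/// Simplified, hypothetical take on the macro: one ZST, and for each
/// entry a handler function plus a `Binding` impl.
macro_rules! bind_interrupts_sketch {
    (struct $name:ident { $($irq:ident => $handler:ty;)* }) => {
        #[derive(Copy, Clone)]
        struct $name;

        $(
            // The real macro exports this as the interrupt vector.
            #[allow(non_snake_case)]
            fn $irq() {
                <$handler as Handler<$irq>>::on_interrupt();
            }

            unsafe impl Binding<$irq, $handler> for $name {}
        )*
    };
}

/// Stand-in interrupt source. An empty enum lives only in the type
/// namespace, so it can share its name with the handler function.
enum EIC {}
impl InterruptSource for EIC {}

/// Counts handler invocations so the effect is observable.
static EIC_EVENTS: AtomicU32 = AtomicU32::new(0);

/// Hypothetical handler type for the EIC interrupt.
struct EicHandler;
impl Handler<EIC> for EicHandler {
    fn on_interrupt() {
        EIC_EVENTS.fetch_add(1, Ordering::SeqCst);
    }
}

bind_interrupts_sketch!(struct Irqs {
    EIC => EicHandler;
});

/// Stands in for a peripheral constructor that demands the binding.
fn requires_binding<B: Binding<EIC, EicHandler>>(_irqs: B) {}
```

After the macro invocation, requires_binding(Irqs) type-checks, and calling EIC() runs EicHandler::on_interrupt exactly as the vector table would.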

Multi-interrupt peripherals

Some peripherals have multiple interrupt sources. In the HAL, to simplify the design, we only use a single handler for all the interrupts. The bind_multiple_interrupts macro lets the user bind multiple interrupt sources to the same handler. Currently, the DMAC and SERCOM peripherals follow this interrupt scheme.

```
use atsamd_hal::async_hal::interrupts;

// For example,
atsamd_hal::bind_multiple_interrupts!(struct SpiIrqs {
    SERCOM2: [SERCOM2_0, SERCOM2_1, SERCOM2_2, SERCOM2_3, SERCOM2_OTHER] => atsamd_hal::sercom::spi::InterruptHandler<Sercom2>;
});

// Expands to
#[derive(Copy, Clone)]
struct SpiIrqs;

#[allow(non_snake_case)]
#[no_mangle]
unsafe extern "C" fn SERCOM2_0() {
    <atsamd_hal::sercom::spi::InterruptHandler<Sercom2> as interrupts::Handler<interrupts::SERCOM2>>::on_interrupt();
}
#[allow(non_snake_case)]
#[no_mangle]
unsafe extern "C" fn SERCOM2_1() {
    <atsamd_hal::sercom::spi::InterruptHandler<Sercom2> as interrupts::Handler<interrupts::SERCOM2>>::on_interrupt();
}
#[allow(non_snake_case)]
#[no_mangle]
unsafe extern "C" fn SERCOM2_2() {
    <atsamd_hal::sercom::spi::InterruptHandler<Sercom2> as interrupts::Handler<interrupts::SERCOM2>>::on_interrupt();
}
#[allow(non_snake_case)]
#[no_mangle]
unsafe extern "C" fn SERCOM2_3() {
    <atsamd_hal::sercom::spi::InterruptHandler<Sercom2> as interrupts::Handler<interrupts::SERCOM2>>::on_interrupt();
}
#[allow(non_snake_case)]
#[no_mangle]
unsafe extern "C" fn SERCOM2_OTHER() {
    <atsamd_hal::sercom::spi::InterruptHandler<Sercom2> as interrupts::Handler<interrupts::SERCOM2>>::on_interrupt();
}

unsafe impl interrupts::Binding<
    interrupts::SERCOM2,
    atsamd_hal::sercom::spi::InterruptHandler<Sercom2>
> for SpiIrqs
where
    interrupts::SERCOM2: interrupts::MultipleInterruptSources
{
}
```
