In the last chapter, we explained how to use threads in Rust to spread compute-intensive work across multiple processor cores, putting the entire machine to work. But threads are more than just a way to get things done faster: they also make a programming language more expressive. The clearest way to write a program that carries out multiple activities independently, but concurrently, is by creating a thread to handle each one. For example, if a network server creates a separate thread for each connection, then each thread can focus on managing state, receiving requests, sending replies, and so on for that one connection. In principle, you could handle all the connections with a single thread, using a hand-rolled state machine to juggle all the connections, but the multi-threaded approach is easier to read and maintain.
As an expressive tool, however, threads have some drawbacks. Even a thread that does almost nothing uses around twenty kilobytes of memory on most platforms (including kernel and user space memory), and the call stack's memory consumption can be surprising as well. And creating threads can be slow.
Rust's asynchronous code is a cooperative multitasking system that offers the expressiveness of threaded code, with lower overhead---sometimes much lower. The asynchronous counterpart of a thread is called a task. Asynchronous tasks in Rust are analogous to "goroutines" in the Go language, or "virtual threads" in Java. Those facilities provide a smoother developer experience than asynchronous code in Rust, but they also place complex demands on their language runtimes. Rust's asynchronous code, in contrast, works well even in embedded environments.
If you're doing compute-intensive work, you'll need to use threads to exploit your machine's parallelism. But if your workload spends most of its time waiting for I/O, as is often the case for network services or interactive programs, then, for a given budget of memory and processor time, asynchronous code can handle many more tasks simultaneously. This is why most popular I/O-related crates in the Rust ecosystem provide asynchronous interfaces. But threads and asynchronous code are often used in combination, to handle workloads that have a mix of interactive and computationally intensive tasks.
This chapter approaches asynchronous code from a user's point of view: how to write it, how to use published crates with asynchronous interfaces, and how to work through common problems. The next chapter explains how asynchronous code works: exactly how it is suspended, scheduled, and resumed. You'll need those details to understand how asynchronous code will perform, and to extend the system with your own primitives. But in this chapter, we're only concerned with making stuff work.
Although asynchronous code shows its value best in applications that create many tasks, its core features are easiest to understand in simpler situations. So while we'll see highly concurrent code later in the chapter, this section starts with an example that offers fewer distractions.
Here's a program that uses Wikipedia's public API to list the sections of the Rust language's wiki page:
use anyhow::Result;

async fn make_request() -> Result<String> {
    let url = "https://en.wikipedia.org/w/api.php\
               ?action=parse&format=json&prop=sections\
               &page=Rust_(programming_language)";
    let response = reqwest::get(url)
        .await?
        .text()
        .await?;
    Ok(response)
}

fn main() -> Result<()> {
    let runtime = tokio::runtime::Runtime::new()?;
    let response = runtime.block_on(make_request())?;
    println!("{response}");
    Ok(())
}
Here's our program's Cargo.toml file:
[package]
name = "reqwest-wikipedia"
version = "0.1.0"
edition = "2021"
[dependencies]
anyhow = "1"
tokio = { version = "1.41", features = ["full"] }
reqwest = "0.12"
When we run this, it writes the query's result to its standard output:
$ cargo run -p reqwest-wikipedia
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.08s
     Running `/home/jimb/async/target/debug/reqwest-wikipedia`
{
    "parse": {
        "title": "Rust (programming language)",
        "sections": [
            {
                "line": "History",
                "number": "1",
                "fromtitle": "Rust_(programming_language)",
                "linkAnchor": "History"
            },
            ...
        ]
    }
}
$
This program's code doesn't look wildly different from ordinary Rust. And that's the idea: asynchronous code generally resembles classic, synchronous Rust, with a few additions to make the cooperative multitasking possible:

- An asynchronous function, defined with async fn, is one whose execution can be suspended while it waits for I/O or other blocking operations to complete. In our program, make_request is an asynchronous function. It calls reqwest::get, an asynchronous function provided by reqwest, a popular HTTP client crate.

- Within an asynchronous function, we can use await expressions to wait for asynchronous operations to complete. An await expression is written as .await, following the operation we want to wait for. (Although this looks like we're accessing a struct field named await, .await is actually a postfix operator built into Rust's grammar.)

  The make_request function uses await expressions to wait for the result from reqwest::get, and then for the result of calling the text method on its success value. The calls to get and text will typically block waiting for the network, and when they do, the await expression suspends the execution of make_request itself until the task is ready to make progress again.

- In any program that uses asynchronous code, you must specify the runtime that will run your asynchronous tasks. A runtime is asynchronous Rust's analogue of the operating system's thread scheduler, maintaining a queue of runnable tasks, and tracking tasks that are blocked on operations like I/O, waiting for a timer, or acquiring a mutex.

  Our program uses the tokio crate's runtime. We create a Runtime value, and then call its block_on method to start an asynchronous task that drives the call to make_request to completion. When it's done, block_on returns make_request's result, which we then print.
In addition to the Runtime
type,
the tokio
crate provides asynchronous versions
of the standard library's I/O and synchronization primitives,
and various frequently needed utilities.
tokio
is by far the most widely used asynchronous support crate,
so almost any other crate you find will be ready to work with it.
That said, tokio
is just an ordinary third-party crate,
not part of the Rust language.
There's nothing about implementing a runtime
that entails using non-portable, unstable or private Rust internal details.
In fact, the next chapter shows how to implement
several of a runtime's core features yourself.
Because it's so common for programs that use asynchronous code
to start, as we do here, by creating a runtime
and then immediately using it to call some asynchronous function
that does all the real work,
tokio
provides an attribute macro, #[tokio::main]
,
to make this easy.
Using the macro, we can replace both make_request
and main
with a single function:
#[tokio::main]
async fn main() -> Result<()> {
    let url = "https://en.wikipedia.org/w/api.php\
               ?action=parse&format=json&prop=sections\
               &page=Rust_(programming_language)";
    let response = reqwest::get(url)
        .await?
        .text()
        .await?;
    println!("{response}");
    Ok(())
}
This is equivalent to our original code,
but the runtime boilerplate has been hidden away,
and main
has become an asynchronous function,
so it can use await expressions directly.
tokio
's interface is designed such that
you rarely need to refer to the runtime explicitly,
so having it hidden away like this is almost always fine.
In the Wikipedia query program,
the call to block_on
may look a little strange:
runtime.block_on(make_request())
If make_request
were an ordinary function,
this code would wait for its body to finish execution
before calling block_on
, which doesn't make much sense:
the work is done, so there's nothing left to block on.
But make_request
is not an ordinary function: it's asynchronous.
When you call an asynchronous function,
its body does not even begin execution.
Instead, the function immediately returns a future:
a Rust value representing an asynchronous call in progress.
A future records the call's arguments,
and includes space for whatever local variables and temporaries
the function body will need as it executes.
Passing this future to runtime.block_on
drives it to completion,
and then passes along its ultimate return value---in this case,
the query results.
Once you know that asynchronous function calls return futures, you can make better sense of code like this:
reqwest::get(url).await?
Here, too, the call to reqwest::get
returns a future.
The await expression waits for that future to produce a value,
to which we apply the ?
operator to see if the get
succeeded.
We've said that futures are values representing an asynchronous function call in progress, but let's make the point a bit more explicit. Consider the following program:
async fn print(message: &str) {
    println!("{message}");
}

#[tokio::main]
async fn main() {
    let bonnie = print("Bonnie");
    let clyde = print("Clyde");

    println!("Starring:");
    clyde.await;
    println!("and");
    bonnie.await;
}
This produces the output:
Starring:
Clyde
and
Bonnie
Each call to print
returns a future,
without beginning execution of the function's body.
The main
function saves these futures
in the local variables bonnie
and clyde
.
Only when we await a future does the body begin execution.
Thus, even though we call print("Bonnie")
before print("Clyde")
,
the program prints Clyde
first,
because we await the future clyde
first.
Concretely,
a future is any value that implements the standard library trait
std::future::Future
.
The Future
trait is tricky to use directly,
so we'll cover its definition in the next chapter.
For now, what matters is that
a type that implements Future
can be polled to see if the computation it represents is done yet.
If not,
the future must tell the runtime when it should be polled again,
to make some more progress towards completion.
The concept of "polling" is often associated with inefficient implementation techniques like busy waiting, but rest assured that Rust's asynchronous code performs quite well. In practice, properly implemented runtimes poll properly implemented futures only when there is actual progress to be made.
Rust's futures are roughly analogous to
Promise
in JavaScript or Task
in C#:
these all represent the not-yet-available value
of an asynchronous function call.
However, there is a key difference.
In most languages that support asynchronous programming,
calling an asynchronous function implicitly registers the call
with some sort of task scheduler, built into the language runtime,
that will drive the function call to completion,
even if you never await the promise or task.
In Rust, this is not the case:
the language has no built-in task scheduler.
Futures are ordinary, passive Rust values
that do nothing unless you await them,
pass them to a runtime function like block_on
,
or take some other explicit action to drive their execution.
So, for example, if you change the program above
to simply drop the clyde
future without awaiting it,
the asynchronous call print("Clyde")
will never run.
In practice, Rust prints a warning at compile time if you ignore the future of an asynchronous function call, so mistakes of that sort are rare. And requiring the program to indicate explicitly how it expects a future to be driven avoids the need for a runtime built into the language itself. Leaving the choice of runtime up to the developer helps make Rust's asynchronous code practical to use in a broad range of environments, ranging from high-powered servers to embedded systems with minimal operating systems, or no operating system at all.
Beyond asynchronous functions, Rust also has asynchronous blocks:
async {
    println!("Starring:");
    clyde.await;
    println!("and");
    bonnie.await;
}
An asynchronous block produces a future which, when driven to completion, will run the statements it contains, and return the value of the final expression, if any.
Like an ordinary block,
an asynchronous block's contents may refer to variables in the surrounding code.
This means that the future of an asynchronous block
must capture those variables' values,
just as a closure would.
The asynchronous block above would return a future
that captures the values of bonnie
and clyde
.
A few more details about asynchronous blocks are worth noting.
Since an asynchronous block has no function signature
to pin down its return type,
using the ? operator inside one
can leave Rust unable to infer the error type;
in that case, you can spell out the type
on the block's final expression,
as in Ok::<(), anyhow::Error>(()).
Like a closure, an asynchronous block
normally captures the variables it uses by reference;
writing async move instead
makes the resulting future take ownership of them.
And unlike an ordinary block,
you cannot use break or continue
to jump out of an asynchronous block from within it.
Calling an asynchronous function returns a future,
and a future is a value that implements the Future
trait---but
isn't that something ordinary functions can do, too?
Indeed it is. Consider an asynchronous function like this:
async fn f(...) -> Xlerb {
    // function body
}
The Future
trait has an associated type, Output
,
which is the type of the value the future will produce when it completes.
We can use this to write the above as an ordinary, synchronous function:
fn f(...) -> impl Future<Output = Xlerb> {
    async move {
        // function body
    }
}
In other words,
an asynchronous function returning a value of type Xlerb
is just an ordinary function that returns a future
that will produce a value of type Xlerb
.
And we can use an asynchronous block
to build exactly the future we want to return:
one that executes the function body and passes along its value.
In our Bonnie and Clyde program,
note that the print
function's argument is a &str
,
a reference to a string slice.
Our initial definition takes advantage of
Rust's tolerance for omitting boring details,
but if we write out the reference's full lifetime
and include the return type,
print
's definition is:
async fn print<'a>(message: &'a str) -> () {
    println!("{message}");
}
Since an asynchronous function call's future
retains the argument values passed,
the future that print
returns must not outlive the referent of message
,
which has the lifetime 'a
.
So print
's full definition as an ordinary, synchronous function would be:
fn print<'a>(message: &'a str) -> impl Future<Output = ()> + 'a {
    async move {
        println!("{message}");
    }
}
Since the body of print
uses the string that message
borrows,
it makes sense that print
's future can live
no longer than lifetime 'a
,
as the return type now makes explicit.