
@Ryu204
Last active February 5, 2025 16:10
My understanding of WebGPU rendering related concepts

Disclaimer: This is my understanding of WebGPU concepts, so some content might be outright wrong. Please correct me in the comment section and I'll fix it.

Adapter, device, and surface

Adapter: An implementation of WebGPU on the machine, i.e. the GPU plus a software implementation on top of it.

Device: A connection to the adapter that exposes the WebGPU interface to the user.

Why are they separated?

  1. It allows separation of concerns.
  2. A device provides an isolated context.

Surface

This is where the rendered result is presented. It can be an OS window in a native environment or an HTML <canvas> in a web browser. A surface is not mandatory; for example, when rendering off-screen and saving the result to disk, none is needed.
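In the browser, the surface is obtained from a canvas. A minimal sketch of configuring it, assuming the standard JavaScript WebGPU API (the function name is illustrative):

```javascript
// Sketch: configure an HTML <canvas> as the presentation surface.
// Not a definitive implementation; names are illustrative.
function configureCanvasSurface(device, canvas) {
  // The canvas's "webgpu" context plays the role of the surface.
  const context = canvas.getContext("webgpu");
  context.configure({
    device, // the device that will render into this surface
    format: navigator.gpu.getPreferredCanvasFormat(), // browser's preferred texture format
    alphaMode: "opaque",
  });
  return context;
}
```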

Capabilities

When requesting the adapter we can state preferences, which may or may not be satisfied. When requesting the device we specify its required capabilities, which must be a subset of what the adapter supports. Subsequent operations on the device are validated against its capabilities.

This makes dependency on hardware explicit and the program portable.
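A hedged sketch of this request flow in the JavaScript API: ask for an adapter, inspect what it supports, then require only a subset of that when requesting the device. The specific feature name is just an example:

```javascript
// Sketch of adapter -> device capability negotiation.
async function requestDeviceWithCapabilities() {
  const adapter = await navigator.gpu.requestAdapter({
    powerPreference: "high-performance", // a preference; may or may not be honored
  });
  // Only require a feature the adapter actually supports:
  // the device's capabilities must be a subset of the adapter's.
  const requiredFeatures = adapter.features.has("timestamp-query")
    ? ["timestamp-query"]
    : [];
  const device = await adapter.requestDevice({ requiredFeatures });
  return device;
}
```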

Asynchrony

Some operations with WebGPU are async, for example, requesting device and adapter.

Creating new objects from a device (and its child objects) always returns immediately on the CPU side. However, in the JavaScript API they effectively behave asynchronously: when an object is created on the CPU side, the actual content on the GPU might be created a bit later, because creating real GPU objects takes time for validation and allocation. The immediately returned CPU-side objects act like handles and are either valid or invalid. Quoting the draft:

One way to interpret WebGPU’s semantics is that every WebGPU object is actually a Promise internally and that all WebGPU methods are async and await before using each of the WebGPU objects it gets as argument. However, the execution of the async code is outsourced to the GPU process (where it is actually done synchronously).

However, this does not hold when WebGPU is used from C++ (and perhaps Rust, too). Unlike JavaScript's event-driven model, those environments allow blocking GPU operations. See more in Timelines.

Contagious invalidity

If the returned object later turns out to be invalid (for example, due to an invalid descriptor or an out-of-memory condition), the caller can still use it for subsequent operations. If those operations create new objects, they will also be invalid (hence the name).

This is useful because the caller does not need to wait until the GPU finishes creating an object to do something with it.

For example, the user creates an object a of type A, then creates an object b of type B by calling webgpuMakeB(a, ...). Normally, they would have to wait until the GPU finishes creating the underlying representation of a, check whether a is valid, and only then call the function. That takes at least one round trip before the next step can be sent to the GPU. In the above model, however, webgpuMakeB can be invoked directly without waiting, because if a is invalid then b is simply also invalid and the whole operation is a no-op.¹
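The same pattern in the JavaScript API, as a sketch: createShaderModule and createComputePipeline both return handles immediately, so they can be chained without awaiting anything. The WGSL string is a placeholder:

```javascript
// Illustration of the "handles" model: no await between creations.
// If `module` turns out invalid, `pipeline` is contagiously invalid too.
function buildPipelineWithoutWaiting(device) {
  const module = device.createShaderModule({ code: "/* WGSL here */" }); // returns at once
  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: { module, entryPoint: "main" }, // uses the handle without waiting
  });
  return pipeline;
}
```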

Timelines

Timelines in WebGPU represent the execution context of different parts of WebGPU. There are Device, Content, and Queue timelines.

  • Device timeline: executes the creation of WebGPU's underlying objects, like the adapter, the device, and GPU resources.

  • Content timeline: refers to client code, including but not limited to your code that calls exposed API functions of WebGPU.

  • Queue timeline: refers to the executions on graphical units, e.g. draw, copy, etc.

In a web environment, the content timeline is the wasm and JavaScript being executed, the device timeline is often the OS process(es) of the browser controlling the GPU (some browsers do not use a separate process), and the queue timeline is the work done by the GPU itself.

Because the content and device timelines live in different processes, communication takes time and is therefore asynchronous, as mentioned in Asynchrony.

In contrast, in a native environment (for example, native software written in C++), the content and device timelines are roughly the same.²

Error handling

WebGPU provides an error scope mechanism to handle errors. We can mark certain sections of code with pushErrorScope and popErrorScope calls. If an error happens during the execution of that scope, it is returned by the pop call. One pop call returns at most one error; other potential errors in the same scope are ignored.
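A sketch of the mechanism in the JavaScript API: push a scope, do some work, then pop it; popErrorScope resolves with the first captured error, or null if none occurred. The helper name is illustrative:

```javascript
// Sketch: wrap buffer creation in a validation error scope.
async function createBufferChecked(device, size) {
  device.pushErrorScope("validation");
  const buffer = device.createBuffer({
    size,
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.COPY_SRC,
  });
  const error = await device.popErrorScope(); // first error in the scope, or null
  if (error) {
    console.warn("Buffer creation failed:", error.message);
    return null;
  }
  return buffer;
}
```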

Buffer

A buffer is a contiguous chunk of memory that the GPU controls and uses for its operations. A buffer can live in the GPU's video RAM (on a discrete GPU) or in main RAM (on an integrated GPU). The CPU can also read and write some buffers. However, in WebGPU the contents of buffers are not visible to the CPU by default; only some of them can be made available to the CPU via buffer mapping.

Buffer mapping

It is a mechanism to exchange data between GPU and CPU efficiently.

  • CPU -> GPU: One way to send data from the CPU to the GPU is to copy it from main RAM to wherever the GPU's memory is with writeBuffer, but this duplicates data. From the user's perspective, buffer mapping allows the CPU to write directly to the GPU's memory. What happens under the hood is implementation-defined and may still involve copying data around.
  • GPU -> CPU: Buffer mapping is one way to read data back from the GPU. Currently, there is no nice and easy readBuffer.
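The GPU -> CPU direction can be sketched as follows: copy the source buffer into a MAP_READ staging buffer, then map that staging buffer on the CPU. The function name is illustrative:

```javascript
// Sketch: read data back from the GPU via a mappable staging buffer.
async function readBack(device, srcBuffer, size) {
  const staging = device.createBuffer({
    size,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(srcBuffer, 0, staging, 0, size);
  device.queue.submit([encoder.finish()]);
  await staging.mapAsync(GPUMapMode.READ);        // wait until the CPU owns the buffer
  const copy = staging.getMappedRange().slice(0); // copy out before unmapping
  staging.unmap();                                // return ownership to the GPU
  return new Uint8Array(copy);
}
```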

Why?

Queue::writeBuffer is efficient and may suffice for your use case even when you transfer a large amount of data. But buffer mapping enables more sophisticated methods, especially when you have to transfer data very often, at the cost of complexity.
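The two upload paths side by side, as a sketch. writeBuffer is the simple path; mappedAtCreation lets the CPU write into the buffer's memory directly before handing it to the GPU (here `data` is assumed to be a Uint8Array):

```javascript
// Simple path: writeBuffer copies from main RAM into the buffer.
function uploadSimple(device, buffer, data) {
  device.queue.writeBuffer(buffer, 0, data);
}

// Mapped path: the buffer starts out owned by the CPU.
function uploadMapped(device, data) {
  const buffer = device.createBuffer({
    size: data.byteLength,
    usage: GPUBufferUsage.VERTEX,
    mappedAtCreation: true, // CPU can write into it immediately
  });
  new Uint8Array(buffer.getMappedRange()).set(data);
  buffer.unmap(); // transfer ownership to the GPU
  return buffer;
}
```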

Ownership

Buffer mapping could lead to data races if both the GPU and the CPU had access to the same buffer at the same time. WebGPU states that mapping and unmapping transfer ownership, so only one of them owns the buffer at any given moment.

Command encoding

The content timeline needs to tell the queue timeline what to do. It can record a batch of commands and then send them all at once via an object called a Queue. Commands have different representations in the two timelines: on the CPU they might be, for example, a queue of action variants, while on the GPU they are simply a list of more primitive instructions. A command encoder converts the former into the latter.
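The record-then-submit flow above can be sketched like this: the encoder records commands on the content timeline, finish() produces the command buffer, and queue.submit hands the batch to the queue timeline:

```javascript
// Sketch: record a copy command, then submit it in one batch.
function copyOnGpu(device, src, dst, size) {
  const encoder = device.createCommandEncoder();    // starts recording
  encoder.copyBufferToBuffer(src, 0, dst, 0, size); // recorded, not yet executed
  const commandBuffer = encoder.finish();           // the "primitive instructions" form
  device.queue.submit([commandBuffer]);             // hand the batch to the GPU
}
```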

Pipeline

A rendering pipeline is a series of steps that transform a 3D scene into a 2D image, i.e. it defines how the GPU will operate on the scene data to produce the final result. It includes:

  • Programmable stages, i.e. shaders
  • Render state settings, e.g. stencil and depth testing
  • The structural layout of input data

If two objects in the scene need to be drawn with different render states, for example antialiasing, they will likely need two pipelines. Otherwise, they can use the same pipeline with different bind groups.
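A hedged sketch of a render pipeline tying these pieces together: programmable stages, render-state settings, and the structural layout of the vertex input. The entry points "vs_main"/"fs_main" and the vertex layout are illustrative assumptions:

```javascript
// Sketch: a render pipeline with shaders, vertex layout, and depth state.
function makePipeline(device, shaderModule, canvasFormat) {
  return device.createRenderPipeline({
    layout: "auto",
    vertex: {
      module: shaderModule,
      entryPoint: "vs_main",
      buffers: [{                       // structural layout of input data
        arrayStride: 12,                // 3 floats (x, y, z) per vertex
        attributes: [{ shaderLocation: 0, offset: 0, format: "float32x3" }],
      }],
    },
    fragment: {
      module: shaderModule,
      entryPoint: "fs_main",
      targets: [{ format: canvasFormat }],
    },
    depthStencil: {                     // render-state settings
      format: "depth24plus",
      depthWriteEnabled: true,
      depthCompare: "less",
    },
    primitive: { topology: "triangle-list" },
  });
}
```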

Shader

Shaders are the programmable stages in a pipeline. Shaders can access resources like buffers or images (via texture views). WebGPU render pipelines provide vertex and fragment shader stages (compute pipelines use compute shaders).

Bind group

Shaders often need to use resources like textures or uniform buffers. The collection of resources that are used by (or bound to) a shader is called a bind group.

There are similarly named types: BindGroupLayout, BindGroupLayoutEntry, BindGroup, BindGroupEntry and PipelineLayout. This is how I differentiate them:

An xxxEntry type can be thought of as the element type, while the corresponding xxx type acts like an array holding those elements. With that in mind, we only need to care about BindGroupLayout, PipelineLayout, and BindGroup.

  • BindGroupLayout: Defines the ordering, types, visibility, and other properties of each element in a resource group that shaders will use. One shader can use multiple groups.
  • BindGroup: Contains actual, concrete handles to the objects used by the shader. To construct a bind group we need a layout, and the specified objects must match the layout definition.

BindGroupLayout and BindGroup are somewhat similar to a class and its actual instance.

PipelineLayout defines multiple layouts of resource groups that the pipeline will use (i.e. an array of BindGroupLayout).
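The class/instance relationship above can be sketched as follows: the layout declares what binding 0 looks like, the bind group supplies a concrete buffer matching that declaration, and the pipeline layout collects the group layouts. The helper name is illustrative:

```javascript
// Sketch: BindGroupLayout ("class"), BindGroup ("instance"), PipelineLayout.
function makeUniformBindGroup(device, uniformBuffer) {
  const layout = device.createBindGroupLayout({
    entries: [{
      binding: 0,
      visibility: GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT,
      buffer: { type: "uniform" },
    }],
  });
  const bindGroup = device.createBindGroup({
    layout, // the "class"
    entries: [{ binding: 0, resource: { buffer: uniformBuffer } }], // the "instance"
  });
  const pipelineLayout = device.createePipelineLayout
    ? null // (guard against typos; see below)
    : null;
  return { layout, bindGroup };
}
```

Actually, to keep the sketch minimal: the pipeline layout would be created with device.createPipelineLayout({ bindGroupLayouts: [layout] }) and passed to the pipeline descriptor.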

Texture

Represents image data on the GPU. A texture can be 1D, 2D, or 3D, and it holds the image color data. A texture lets shaders access image data, and can also act as the target to draw on.

Texture view

A texture view gives access to the content of a region of a texture, such as a subset of its mip levels or array layers, possibly with a reinterpreted format. A single texture can have many texture views.

Sampler

Helps shaders read an arbitrary location on a texture view. Samplers support many filtering and addressing methods to interpolate the color value at the requested location.
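A sketch combining texture, texture view, and sampler: the view exposes one mip level of the texture, and the sampler defines how the shader interpolates when reading through the view. Sizes and formats are illustrative:

```javascript
// Sketch: create a texture, a view over its first mip level, and a sampler.
function makeSampledTexture(device) {
  const texture = device.createTexture({
    size: [256, 256],
    format: "rgba8unorm",
    usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST,
  });
  const view = texture.createView({ baseMipLevel: 0, mipLevelCount: 1 });
  const sampler = device.createSampler({
    magFilter: "linear",     // interpolate when magnifying
    minFilter: "linear",     // interpolate when minifying
    addressModeU: "repeat",  // wrap coordinates outside [0, 1]
    addressModeV: "repeat",
  });
  return { texture, view, sampler };
}
```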

More learning materials

Drafts and official documentation:

C/C++:

Rust:

JavaScript:

Others:

Footnotes

  1. Actually, such operations do affect error reporting.

  2. https://eliemichel.github.io/LearnWebGPU/getting-started/the-command-queue.html
