@nihalpasham
Last active May 24, 2025 10:35
rust-gpu

What is it?

  • rust-gpu is a custom codegen backend for rustc that compiles native Rust code (albeit a subset of it) to SPIR-V
    • MIR -> SPIR-V: to be precise, it takes in Rust’s MIR and converts it to SPIR-V

rustc front-end IR(s):

Stuff that happens at each stage in the front-end

  • AST: macro-expansion, name resolution
  • HIR (High-level IR): type-checking, type-inference and trait solving
  • THIR (Typed High-level IR): pattern checking, exhaustiveness checking
  • MIR (Mid-level IR): Simplified, control-flow-oriented representation. Closer to machine code than HIR.
    • borrow checking, optimization (ConstProp, CopyProp, dead store elimination), monomorphization.

rust compiler internals:

  • Arenas: Efficient memory pools used for allocating compiler data structures (like HIR, AST nodes) with minimal overhead and no deallocation during compilation. Enables fast allocation and simplified lifetime management.
  • Interning: Deduplicates and reuses identical values (like strings, types, symbols) via lookup tables to save memory and improve comparison speed.
  • Query System: Demand-driven, memoized system where compiler data (like types, traits) is computed lazily as needed, ensuring incremental and modular compilation.
    • traditional compilers have multiple fixed passes
    • rustc is based around function-like queries that compute information
      • E.g. type_of
    • Queries are memoized: computed once and stored in a table
    • Used by incremental compilation
      • Query result table saved on disk, can be reused next time
  • TyCtxt (Type Context): Central context object passed around compiler stages. Gives access to interning, arenas, queries, and type information.
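
To make interning and memoized queries concrete, here is a minimal sketch. It is not rustc's implementation: `Symbol`, `Interner`, `ToyCtxt` and the string returned by `type_of` are simplified stand-ins invented for illustration.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

/// Small integer handle for an interned string (toy analogue of a symbol).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Symbol(u32);

/// Toy interner: each distinct string is stored once and compared by index.
#[derive(Default)]
struct Interner {
    lookup: HashMap<String, Symbol>,
    strings: Vec<String>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.lookup.get(s) {
            return sym; // already interned: reuse the existing handle
        }
        let sym = Symbol(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.lookup.insert(s.to_owned(), sym);
        sym
    }

    fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}

/// Toy context object that owns the memo table for a single query.
#[derive(Default)]
struct ToyCtxt {
    type_of_cache: RefCell<HashMap<Symbol, String>>,
    computations: Cell<u32>, // counts how often the query really ran
}

impl ToyCtxt {
    /// Demand-driven query: computed on first use, then served from the table.
    fn type_of(&self, item: Symbol, interner: &Interner) -> String {
        if let Some(hit) = self.type_of_cache.borrow().get(&item) {
            return hit.clone();
        }
        self.computations.set(self.computations.get() + 1);
        // Stand-in for real type inference.
        let ty = format!("fn {}()", interner.resolve(item));
        self.type_of_cache.borrow_mut().insert(item, ty.clone());
        ty
    }
}
```

Interning the same string twice yields the same `Symbol`, so equality checks become integer comparisons, and asking `type_of` twice for the same item runs the computation only once.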

rustc back-end refactoring:

  • The Rust compiler team aimed to make it easier to support new codegen backends by introducing a backend-agnostic crate. At the time, the only backend was rustc_codegen_llvm, which was tightly coupled to LLVM-specific code.
    • Their solution was to separate shared logic from backend-specific details using generics and traits, enabling pluggable backends without code duplication or performance loss.
    • This involved two key changes:
      • Replacing all LLVM-specific types with generics in function signatures and struct definitions.
      • Encapsulating all LLVM FFI calls within traits that define the interface between backend-agnostic and backend-specific code.
  • The core LLVM structs—CodegenCx and Builder—remain backend-specific because their internals are too specialized to generalize. Each backend is expected to define its own versions of these types:
    • CodegenCx: manages code generation for a compilation unit.
    • Builder: emits IR for individual basic blocks.
  • These backend-defined types implement a set of common traits (BuilderMethods on Builder, for example), while the backend’s entry-point type implements CodegenBackend.
  • rustc_codegen_ssa is this backend-agnostic crate. It defines a shared interface that all backends—LLVM, Cranelift, and GCC—can implement.
  • rustc_codegen_spirv, for example, implements traits from this crate, including CodegenBackend and ExtraBackendMethods.
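
The generics-plus-traits split can be sketched in a few lines. This is a toy under stated assumptions: `ToyBuilderMethods`, `LlvmLikeBuilder` and `SpirvLikeBuilder` are hypothetical names, the emitted text is only LLVM- or SPIR-V-flavoured, and the real traits in rustc_codegen_ssa are far larger.

```rust
/// Backend-specific operations hidden behind a trait (toy version of the idea).
trait ToyBuilderMethods {
    /// Each backend picks its own value representation.
    type Value;

    fn const_i32(&mut self, v: i32) -> Self::Value;
    fn add(&mut self, lhs: Self::Value, rhs: Self::Value) -> Self::Value;
    fn ret(&mut self, v: Self::Value);
    fn finish(self) -> String;
}

/// Backend-agnostic lowering: written once, generic over the backend.
fn codegen_add_consts<B: ToyBuilderMethods>(mut bx: B, a: i32, b: i32) -> String {
    let lhs = bx.const_i32(a);
    let rhs = bx.const_i32(b);
    let sum = bx.add(lhs, rhs);
    bx.ret(sum);
    bx.finish()
}

/// Emits LLVM-flavoured text; constants are inline operands.
#[derive(Default)]
struct LlvmLikeBuilder {
    lines: Vec<String>,
    next: u32,
}

impl ToyBuilderMethods for LlvmLikeBuilder {
    type Value = String;

    fn const_i32(&mut self, v: i32) -> String {
        v.to_string()
    }
    fn add(&mut self, lhs: String, rhs: String) -> String {
        let name = format!("%{}", self.next);
        self.next += 1;
        self.lines.push(format!("{} = add i32 {}, {}", name, lhs, rhs));
        name
    }
    fn ret(&mut self, v: String) {
        self.lines.push(format!("ret i32 {}", v));
    }
    fn finish(self) -> String {
        self.lines.join("\n")
    }
}

/// Emits SPIR-V-flavoured text; every value, constants included, is an id.
#[derive(Default)]
struct SpirvLikeBuilder {
    lines: Vec<String>,
    next: u32,
}

impl SpirvLikeBuilder {
    fn fresh(&mut self) -> u32 {
        self.next += 1;
        self.next
    }
}

impl ToyBuilderMethods for SpirvLikeBuilder {
    type Value = u32;

    fn const_i32(&mut self, v: i32) -> u32 {
        let id = self.fresh();
        self.lines.push(format!("%{} = OpConstant %int {}", id, v));
        id
    }
    fn add(&mut self, lhs: u32, rhs: u32) -> u32 {
        let id = self.fresh();
        self.lines.push(format!("%{} = OpIAdd %int %{} %{}", id, lhs, rhs));
        id
    }
    fn ret(&mut self, v: u32) {
        self.lines.push(format!("OpReturnValue %{}", v));
    }
    fn finish(self) -> String {
        self.lines.join("\n")
    }
}
```

`codegen_add_consts` never names a concrete backend, so the same lowering logic serves both builders. Because the function is generic rather than using trait objects, each backend gets its own monomorphized copy with no dynamic dispatch.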

A look at the included examples and tests:

# list all executable binaries (cargo prints the available names when --bin is given no value)
cargo run --bin

# example
cargo run --bin example-runner-wgpu-builder

# dump MIR for example-runner-wgpu-builder
RUSTGPU_CODEGEN_ARGS="--dump-mir DIR" cargo run --bin example-runner-wgpu-builder

# we could also dump the MIR control-flow graph with rustc
cargo rustc -- -Z unpretty=mir-cfg
# or dump MIR before and after every single pass
cargo rustc -- -Z dump-mir=main_vs

# tests
cargo compiletest hello_world
  • If you go through the examples, you’ll see that they use a pre-compiled instance of rustc_codegen_spirv (stashed somewhere in the target directory), so you can’t simply set breakpoints in the backend and feed it some input.
  • The core flow of the shader construction process can be understood by examining rust-gpu/examples/runners/wgpu/builder/src/main.rs:
    • Instantiate a new SpirvBuilder with the desired target configuration.
    • Call its build() method, which internally invokes rustc with the appropriate SPIR-V codegen backend and settings.
      • validates the target and prerequisites, like the crate path and SPIR-V environment.
      • locates the rustc_codegen_spirv backend, either from the environment or Cargo’s search path.
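      • sets up all build flags, including RUSTFLAGS, LLVM args (as far as I can tell, -C llvm-args is simply the channel rustc offers for forwarding options to whichever codegen backend is in use), and encoded panic/validation options.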
      • runs a nested Cargo build using the SPIR-V backend with the configured settings.
      • ensures the final SPIR-V output is built with the correct metadata, features, and abort strategy.
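
The flag-assembly step can be sketched as below. This is a simplified illustration, not the actual spirv-builder code: `backend_rustflags` and `encode_rustflags` are hypothetical helpers and the dylib path in the usage note is made up. The pieces it relies on are rustc's `-Zcodegen-backend` and `-C llvm-args` flags and Cargo's `CARGO_ENCODED_RUSTFLAGS` variable, which separates flags with the 0x1f character. Forwarding backend options through `-C llvm-args` is my assumption about how the real builder does it.

```rust
/// Assemble rustc flags for a nested build that uses a custom codegen backend.
/// Sketch only: the real builder sets many more options than this.
fn backend_rustflags(backend_dylib: &str, codegen_args: &[&str]) -> Vec<String> {
    // Tell rustc which dynamic library provides the codegen backend.
    let mut flags = vec![format!("-Zcodegen-backend={}", backend_dylib)];
    if !codegen_args.is_empty() {
        // Assumption: backend options are forwarded via `-C llvm-args`,
        // joined with spaces.
        flags.push(format!("-Cllvm-args={}", codegen_args.join(" ")));
    }
    flags
}

/// Encode flags the way CARGO_ENCODED_RUSTFLAGS expects: joined by 0x1f, so
/// that individual flags may themselves contain spaces.
fn encode_rustflags(flags: &[String]) -> String {
    flags.join("\x1f")
}
```

A caller would pass the encoded string to the nested build, for example with `std::process::Command::new("cargo").env("CARGO_ENCODED_RUSTFLAGS", encoded)`. The 0x1f separator matters here because the `llvm-args` value contains spaces.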

rustc_codegen_spirv:

  • Finally, let’s take a look at this backend.
  • The first thing to notice is that the build script makes a patched copy of the rustc_codegen_ssa crate so that the backend can keep using the old, type-specific stack allocations (called typed allocas).
  • alloca is an LLVM instruction that allocates memory on the stack.
  • a typed alloca is one where the allocation carries the type of the value it will hold, e.g.
    • old rust-gpu code expects typed allocas:
It relies on knowing the pointee type at allocation time (SPIR-V variables need a concrete pointee type).
    %my_var = alloca i32                 ; knows it's an i32
    store i32 42, i32* %my_var
    • newer rustc/LLVM use opaque pointers and size-based allocas:
The allocation is just a block of bytes, so the pointee type is no longer available where the old code expects it.
    %my_var = alloca [4 x i8], align 4   ; 4 bytes, type info is lost
    store i32 42, ptr %my_var
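
Why a size-only allocation interface is awkward for a typed target can be shown with a toy. Everything here is hypothetical (`ToyType`, `alloca_by_size`, `typed_alloca`, and the pseudo-SPIR-V strings); it is not the patched rustc_codegen_ssa API.

```rust
/// Minimal stand-in for a backend type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyType {
    I32,
    F32,
}

impl ToyType {
    fn size_bytes(self) -> u64 {
        match self {
            ToyType::I32 | ToyType::F32 => 4,
        }
    }

    fn spirv_name(self) -> &'static str {
        match self {
            ToyType::I32 => "int",
            ToyType::F32 => "float",
        }
    }
}

/// Size-only interface: the backend learns just how many bytes to reserve,
/// so the best it can do is declare a byte array.
fn alloca_by_size(size_bytes: u64) -> String {
    format!("OpVariable %_ptr_Function_bytes_{} Function", size_bytes)
}

/// Typed interface: the pointee type is known when the variable is declared.
fn typed_alloca(ty: ToyType) -> String {
    format!("OpVariable %_ptr_Function_{} Function", ty.spirv_name())
}
```

With the size-only version an `i32` slot and an `f32` slot become indistinguishable 4-byte allocations, whereas the typed version keeps them apart. That lost distinction is what the patched copy of the crate is there to preserve.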

Where to start?

  • SpirvCodegenBackend::codegen_crate: SpirvCodegenBackend implements CodegenBackend, which is how custom Rust codegen backends plug into the compiler.
  • This in turn invokes the rustc_codegen_ssa::base::codegen_crate function, which is backend-agnostic and orchestrates the full codegen pipeline:
    • early exit for metadata-only builds
    • 🟡 collect & partition monomorphized items
    • force queries for codegen units
    • metadata module emission (optional)
    • 🟡 start async codegen coordination
    • allocator shim codegen (optional)
    • sort cgus for throughput/memory tradeoff
    • determine cgu reuse
    • precompile first batch (parallel mode)
    • 🟣 main compilation loop
    • finalize
  • Each codegen unit (CGU) is either:
    • compiled using the backend (SpirvCodegenBackend::compile_codegen_unit)
    • or reused from cache
    • 🟣 all of this happens in the main compilation loop
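
The compile-or-reuse decision in the main loop can be sketched as follows. This is a toy model, not the rustc_codegen_ssa code: `ArtifactCache`, `CguOutcome`, the content hash per CGU and the `.spv` artifact naming are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// What happened to one codegen unit in this session.
#[derive(Debug, PartialEq)]
enum CguOutcome {
    Compiled(String),
    Reused(String),
}

/// Toy incremental cache: CGU name -> (content hash, artifact name).
#[derive(Default)]
struct ArtifactCache {
    entries: HashMap<String, (u64, String)>,
}

/// Stand-in for the backend's per-CGU compile step.
fn compile_codegen_unit(name: &str) -> String {
    format!("{}.spv", name)
}

/// Main loop: reuse a cached artifact when the CGU is unchanged, otherwise
/// hand the CGU to the backend and remember the result.
fn run_codegen(cgus: &[(&str, u64)], cache: &mut ArtifactCache) -> Vec<CguOutcome> {
    let mut outcomes = Vec::new();
    for &(name, hash) in cgus {
        let reusable = cache
            .entries
            .get(name)
            .filter(|(cached_hash, _)| *cached_hash == hash)
            .map(|(_, artifact)| artifact.clone());
        let outcome = match reusable {
            Some(artifact) => CguOutcome::Reused(artifact),
            None => {
                let artifact = compile_codegen_unit(name);
                cache
                    .entries
                    .insert(name.to_owned(), (hash, artifact.clone()));
                CguOutcome::Compiled(artifact)
            }
        };
        outcomes.push(outcome);
    }
    outcomes
}
```

Running the loop twice with one changed hash shows the effect: the unchanged CGU comes back as `Reused` and only the modified one goes through `compile_codegen_unit` again.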

Code generation unit:

A Codegen Unit (CGU) is a collection of monomorphized items (functions, statics, etc.) that are grouped together to be compiled as a single unit by the backend.

  • essentially a container for many functions/statics/constants. These are the things that actually get handed off to the backend in batches.

TLDR

  • monomorphized item = one function, static, or const with concrete type substitutions (MIR exists here).
  • codegen unit (CGU) = Group of such items compiled together as a chunk.
  • you get MIR per item, but you compile batches of MIR together as CGUs.

How does the compiler group monomorphized items:

  • The compiler partitions items into Codegen Units, each of which might contain dozens or hundreds of items, depending on optimization level, inlining heuristics, LTO, and other factors.
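
A toy model of the relationship between monomorphized items and CGUs is sketched below. The greedy size-budget partitioner is an assumption chosen for brevity; rustc's actual partitioner uses different and more involved heuristics, and `MonoItem`, `CodegenUnit`, `monomorphize` and `partition` are hypothetical names.

```rust
/// One monomorphized item: a generic definition plus concrete type arguments.
#[derive(Clone, Debug, PartialEq)]
struct MonoItem {
    name: String,
    size_estimate: usize,
}

/// A batch of items that is handed to the backend as one unit.
#[derive(Debug, Default, PartialEq)]
struct CodegenUnit {
    items: Vec<MonoItem>,
    size: usize,
}

/// Instantiate one generic function for each concrete type it is used with.
fn monomorphize(generic: &str, type_args: &[&str], size_estimate: usize) -> Vec<MonoItem> {
    type_args
        .iter()
        .map(|ty| MonoItem {
            name: format!("{}::<{}>", generic, ty),
            size_estimate,
        })
        .collect()
}

/// Greedy partitioning: start a new CGU whenever the size budget would be exceeded.
fn partition(items: Vec<MonoItem>, max_size: usize) -> Vec<CodegenUnit> {
    let mut cgus: Vec<CodegenUnit> = Vec::new();
    for item in items {
        let needs_new = match cgus.last() {
            Some(cgu) => cgu.size + item.size_estimate > max_size,
            None => true,
        };
        if needs_new {
            cgus.push(CodegenUnit::default());
        }
        let cgu = cgus.last_mut().expect("at least one CGU exists here");
        cgu.size += item.size_estimate;
        cgu.items.push(item);
    }
    cgus
}
```

For example, one generic `lerp` used with three types becomes three items, and with a budget of two items' worth of size they land in two CGUs. The backend then sees two batches rather than three separate functions.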

Three interesting data structures:

  • SpirvCodegenBackend:
  • CodegenCx:
    • The CodegenCx is created once per CGU:
    let cx = CodegenCx::new(tcx, cgu);
    • It’s the context used for all MIR-to-SPIR-V lowering for that one CGU. It holds:
      • The SPIR-V module under construction and its module builder (🟢 BuilderSpirv)
      • The symbol table (e.g., function handles, static variables),
      • Backend-specific config like optimization level, dump flags,
      • Possibly interning caches or string interner handles.
  • BuilderSpirv:
    • Primarily wraps an rspirv::dr::Builder inside a RefCell to allow interior mutability.
    • This lets many parts of the backend hold shared references to the underlying SPIR-V builder (rspirv::dr::Builder) while still mutating it; RefCell checks at runtime that only one mutable borrow is active at a time.
    • When you want to emit SPIR-V code in the backend, you’ll typically:
      1. Construct a local Builder<'a, 'tcx> with a cursor and a reference to CodegenCx
      2. The local Builder uses emit() to push instructions to the rspirv::dr::Builder.
      3. The local Builder struct is really a convenient wrapper for cursor management that knows:
        • where in the module you want to emit
        • which function/block you’re in
        • Note: global statics or variables don’t use the local Builder abstraction.
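
The RefCell-plus-cursor arrangement can be sketched with toy types. None of this is rust-gpu's actual code: `ToyModule` stands in for the rspirv builder's module, and `ToyBuilderSpirv`, `ToyCodegenCx`, `ToyBuilder` and `Cursor` are simplified, hypothetical versions of the structures described above.

```rust
use std::cell::RefCell;

/// Stand-in for the module being built: functions -> blocks -> instructions.
#[derive(Default)]
struct ToyModule {
    functions: Vec<Vec<Vec<String>>>,
}

/// Shared wrapper with interior mutability, so `&self` holders can emit.
#[derive(Default)]
struct ToyBuilderSpirv {
    inner: RefCell<ToyModule>,
}

/// Where in the module a local builder is emitting.
#[derive(Clone, Copy)]
struct Cursor {
    function: usize,
    block: usize,
}

/// Per-CGU context that owns the shared module builder.
#[derive(Default)]
struct ToyCodegenCx {
    builder: ToyBuilderSpirv,
}

impl ToyCodegenCx {
    /// Add a function with one empty block and return a cursor into it.
    fn begin_function(&self) -> Cursor {
        let mut module = self.builder.inner.borrow_mut();
        module.functions.push(vec![Vec::new()]);
        Cursor {
            function: module.functions.len() - 1,
            block: 0,
        }
    }
}

/// Local builder: a shared reference to the context plus a cursor.
struct ToyBuilder<'a> {
    cx: &'a ToyCodegenCx,
    cursor: Cursor,
}

impl<'a> ToyBuilder<'a> {
    fn new(cx: &'a ToyCodegenCx) -> Self {
        let cursor = cx.begin_function();
        ToyBuilder { cx, cursor }
    }

    /// Append one instruction at the cursor. The mutable borrow lasts only
    /// for this call, so several local builders can take turns.
    fn emit(&self, inst: &str) {
        let mut module = self.cx.builder.inner.borrow_mut();
        module.functions[self.cursor.function][self.cursor.block].push(inst.to_owned());
    }
}
```

Two `ToyBuilder` values created from the same context can interleave `emit` calls and each instruction still lands in its own function. Holding a `borrow()` on the module while calling `try_borrow_mut()` returns an error, which is the runtime check RefCell provides in place of compile-time borrow checking.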