nihalpasham

Nova GPU driver

Nova Driver Architecture Overview

The Nova driver is a two-tier GPU driver architecture written in Rust for NVIDIA GPUs. It consists of two main components that work together:

1. Nova-Core (`drivers/gpu/nova-core/`)

This is the low-level hardware abstraction layer that directly interfaces with the GPU hardware.

Entry Point:

How does automatic kernel fusion work in Burn?

Burn’s Tensor Abstraction:

Generic parameters of Tensor<B, D, K>:
- B: Backend - execution backend (composable)
- D: usize - dimensionality (compile-time constant)
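To make the role of the first two parameters concrete, here is a minimal sketch in plain Rust. It does not use the Burn crate: the `Backend` trait, `CpuBackend`, and this `Tensor` struct are hypothetical stand-ins invented for illustration, and the `K` parameter is left out. It only shows how a backend type parameter and a const-generic dimension count can be carried in the type.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for a backend trait; NOT Burn's real `Backend`.
trait Backend {
    fn name() -> &'static str;
}

// Hypothetical backend used only for this sketch.
struct CpuBackend;

impl Backend for CpuBackend {
    fn name() -> &'static str {
        "cpu"
    }
}

// `B` selects the execution backend; `D` is the number of dimensions,
// fixed at compile time through a const generic.
struct Tensor<B: Backend, const D: usize> {
    shape: [usize; D],
    data: Vec<f32>,
    _backend: PhantomData<B>,
}

impl<B: Backend, const D: usize> Tensor<B, D> {
    // Allocates a zero-filled tensor with the given shape.
    fn zeros(shape: [usize; D]) -> Self {
        let len: usize = shape.iter().product();
        Tensor {
            shape,
            data: vec![0.0; len],
            _backend: PhantomData,
        }
    }

    // The rank is known statically: it is just the const parameter.
    fn rank(&self) -> usize {
        D
    }

    fn num_elements(&self) -> usize {
        self.data.len()
    }

    fn backend_name(&self) -> &'static str {
        B::name()
    }
}

fn main() {
    let t: Tensor<CpuBackend, 2> = Tensor::zeros([2, 3]);
    println!(
        "backend={} rank={} shape={:?} elements={}",
        t.backend_name(),
        t.rank(),
        t.shape,
        t.num_elements()
    );
}
```

Because `D` is part of the type, passing a 3-element shape to a `Tensor<CpuBackend, 2>` is rejected at compile time rather than at run time.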

Mapping Rust Shader to SPIR-V

#[spirv(fragment)]
pub fn main_fs(output: &mut Vec4) {
    *output = vec4(1.0, 0.0, 0.0, 1.0);
}

rust-gpu

What is it?

rust-gpu is a custom codegen backend for rustc that compiles native Rust code (albeit a subset of it) to SPIR-V.
- MIR -> SPIR-V: to be precise, it takes Rust's MIR as input and converts it to SPIR-V

rustc front-end IR(s):

Stuff that happens at each stage in the front-end

AST: macro-expansion, name resolution
HIR (High-level IR): type-checking, type-inference and trait solving

Rust: A Modern Language for Safe and Fast General-Purpose Programming

Rust is a relatively young but increasingly popular systems programming language that emphasizes safety, performance, and concurrency. It was originally developed at Mozilla, and its first stable release, version 1.0, came out in 2015.

Here's why Rust is gaining traction, especially in areas like AI:

Memory Safety without Garbage Collection: This is a core strength of Rust. C and C++ offer low-level memory management but are prone to errors (e.g., buffer overflows, null pointer dereferences, data races). Rust instead uses an "ownership model" and "borrowing rules" to enforce memory safety at compile time. This catches many potential bugs before the code even runs, leading to more reliable and secure software. Crucially, it achieves this without a garbage collector, which can introduce unpredictable pauses.
High Performance: Rust compiles directly to machine code, similar to C and C++. This allows for highly optimized exe
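The ownership and borrowing rules mentioned above can be illustrated with a small sketch. The function names here (`total`, `push_double`, `consume`) are made up for this example; the point is only to show the three ways a value can be passed: by shared borrow, by mutable borrow, and by move.

```rust
// Borrows the slice immutably; the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Borrows the vector mutably; only one such borrow can be live at a time.
fn push_double(values: &mut Vec<i32>, x: i32) {
    values.push(x * 2);
}

// Takes ownership; the caller can no longer use the vector afterwards.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let mut v = vec![1, 2, 3];
    push_double(&mut v, 4); // mutable borrow ends when the call returns
    let sum = total(&v);    // shared borrow
    let len = consume(v);   // `v` is moved here
    // println!("{:?}", v); // would not compile: `v` was moved into `consume`
    println!("sum={sum} len={len}");
}
```

Uncommenting the last `println!` of `v` produces a compile-time "borrow of moved value" error, which is the kind of bug the compiler rules out before the program runs.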

Pliron

#mlir #llvm #compiler

An MLIR-inspired extensible compiler framework written in pure Rust.

Design Evaluation Notes:

IR Format: Pliron’s generic IR (Intermediate Representation) format is based on SSA (Static Single Assignment) form and is conceptually similar to MLIR. Like MLIR, it is a nested IR, meaning it supports hierarchical structures such as operations containing regions, which in turn contain blocks and other operations. However, there may be differences in specific implementation details.
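The nesting described above (operations contain regions, regions contain blocks, blocks contain operations) can be sketched as a recursive data structure. This is a simplified illustration only: the type names are hypothetical, the sketch uses owned `Vec`s for brevity, it does not model SSA values, and it is not how Pliron actually stores its IR.

```rust
// Hypothetical, simplified nested IR; not Pliron's actual types.
struct Operation {
    name: String,
    regions: Vec<Region>,
}

struct Region {
    blocks: Vec<Block>,
}

struct Block {
    ops: Vec<Operation>,
}

// Counts an operation plus every operation nested inside its regions.
fn count_ops(op: &Operation) -> usize {
    let nested: usize = op
        .regions
        .iter()
        .flat_map(|region| region.blocks.iter())
        .flat_map(|block| block.ops.iter())
        .map(count_ops)
        .sum();
    1 + nested
}

// Builds an operation with no regions.
fn leaf(name: &str) -> Operation {
    Operation {
        name: name.to_string(),
        regions: Vec::new(),
    }
}

// module { func { const; add; return } }
fn sample_module() -> Operation {
    let body = Block {
        ops: vec![leaf("const"), leaf("add"), leaf("return")],
    };
    let func = Operation {
        name: "func".to_string(),
        regions: vec![Region { blocks: vec![body] }],
    };
    Operation {
        name: "module".to_string(),
        regions: vec![Region {
            blocks: vec![Block { ops: vec![func] }],
        }],
    }
}

fn main() {
    let module = sample_module();
    println!("{} contains {} operations", module.name, count_ops(&module));
}
```

The recursive walk in `count_ops` is the shape most passes over a nested IR take: visit an operation, then descend through its regions and blocks.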

Plan for Building a Backend in Cranelift

Steps to Add a New Backend:

Create a Folder for the Backend
- Place the new backend directory under /cranelift/codegen/src/isa, where each backend resides.
Define Backend and Implement Required Traits
- Ensure the backend implements these essential traits:
  - TargetIsa: Specifies the target architecture’s interface.
  - LowerBackend: Manages instruction lowering for the architecture.
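As a rough illustration of what "instruction lowering" means, the sketch below maps a toy IR to textual machine instructions. Everything in it is an assumption made for illustration: `IrOp`, `ToyLower`, `ToyIsa`, and the register names are invented, and the trait is far simpler than Cranelift's real `TargetIsa` and `LowerBackend` traits, whose actual methods are not reproduced here.

```rust
// Toy IR operations; NOT Cranelift's IR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IrOp {
    Iconst(i64),
    Iadd,
    Imul,
}

// Hypothetical, heavily simplified trait playing the role of a lowering backend.
trait ToyLower {
    // Turns one IR operation into zero or more target instructions.
    fn lower(&self, op: IrOp) -> Vec<String>;
}

// Hypothetical target with made-up mnemonics and fixed registers.
struct ToyIsa;

impl ToyLower for ToyIsa {
    fn lower(&self, op: IrOp) -> Vec<String> {
        match op {
            IrOp::Iconst(value) => vec![format!("li r0, {value}")],
            IrOp::Iadd => vec!["add r0, r1, r2".to_string()],
            IrOp::Imul => vec!["mul r0, r1, r2".to_string()],
        }
    }
}

// Lowers a sequence of IR operations with whichever backend is supplied.
fn lower_all(backend: &dyn ToyLower, ops: &[IrOp]) -> Vec<String> {
    ops.iter().flat_map(|&op| backend.lower(op)).collect()
}

fn main() {
    let program = [IrOp::Iconst(7), IrOp::Iadd, IrOp::Imul];
    for inst in lower_all(&ToyIsa, &program) {
        println!("{inst}");
    }
}
```

The design point it mirrors is that the target-independent driver (`lower_all`) only talks to a trait, so adding a new architecture means adding a new implementor rather than changing shared code.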

CubeCL

#gpu #kernel #rust

High Level Overview:

GPU kernels in Rust
Comptime
- Automatic vectorization
- Instruction and shape specialization
- Loop unrolling

	KernelDefinition {
	    inputs: [
	        Binding { location: Storage, visibility: Read, item: Item { elem: Float(F32), vectorization: Some(1) }, size: None }
	    ],
	    outputs: [
	        Binding { location: Storage, visibility: ReadWrite, item: Item { elem: Float(F32), vectorization: Some(1) }, size: None }
	    ],
	    named: [
	        ("info", Binding { location: Storage, visibility: Read, item: Item { elem: UInt, vectorization: None }, size: None })
	    ],
	    cube_dim: CubeDim { x: 4, y: 1, z: 1

	// 🦀 Generated by Rust Macro Expand 🦀
	// 🦀 Timestamp: 16/09/2024, 12:50:30 🦀
	#![allow(warnings)]
	#![feature(print_internals)]
	#![feature(panic_internals)]
	#![feature(prelude_import)]
	#[prelude_import]
	use std::prelude::rust_2021::*;
	#[macro_use]
	extern crate std;