@maxious
Created February 4, 2026 12:41

SYCL Graph Topology Mismatch Fix

The Error

[SYCL-GRAPH] Exception when updating graph, Cannot update using a graph with a different topology. Mismatch found in the number of nodes.

Root Cause

SYCL command graphs can only be updated (via update()) if the new graph has exactly the same topology as the cached one: the same number of nodes, the same kernel types, and the same execution order. Parameter values (buffer pointers, scalar arguments) may differ, but the structure may not.
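To make the distinction between structure and parameters concrete, here is a toy model. It is not the SYCL API: `ToyNode` and `same_topology` are hypothetical names used only for illustration, and a real topology check also covers the edges between nodes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a recorded graph node (hypothetical, not a SYCL type).
struct ToyNode {
    std::string kernel;  // which kernel the node launches
    std::uint64_t arg;   // a parameter value, e.g. a pointer or a scalar
};

// Two toy graphs are update-compatible when they have the same number of
// nodes and the same kernels in the same order. Argument values are ignored.
bool same_topology(const std::vector<ToyNode> & a, const std::vector<ToyNode> & b) {
    if (a.size() != b.size()) {
        return false;  // corresponds to "Mismatch found in the number of nodes"
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i].kernel != b[i].kernel) {
            return false;
        }
    }
    return true;
}
```

In this model, two graphs that differ only in `arg` compare equal, while adding a single node makes them incompatible.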

The original implementation cached a single executable graph in sycl_ctx->exec_graph. When test-backend-ops ran tests with varying tensor dimensions (n=1, 2, 3, 4...), each test produced a graph with a different node count. The code:

  1. Tried to update() the cached graph with new topology → failed with exception
  2. Caught exception, re-recorded and finalized a new graph
  3. Repeated this cycle for every mismatched topology

After ~9-10 graph recreations, the GPU crashed with UR_RESULT_ERROR_DEVICE_LOST due to driver resource exhaustion.
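The re-recording cycle can be reproduced with a small simulation. This is a sketch under stated assumptions: `SingleSlotCache` is a hypothetical stand-in for the single `exec_graph` slot, and each topology is reduced to its node count. It only counts how often a graph would be rebuilt and does not touch the GPU.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a cache that holds one executable graph, identified
// here only by its node count.
struct SingleSlotCache {
    bool has_graph = false;        // false until the first graph is recorded
    std::size_t cached_nodes = 0;  // node count of the cached graph
    std::size_t recordings = 0;    // how many times a graph was (re)built

    void run(std::size_t n_nodes) {
        if (has_graph && cached_nodes == n_nodes) {
            return;  // same topology: an update would succeed, nothing is rebuilt
        }
        // Topology differs (or nothing is cached): record and finalize again.
        has_graph = true;
        cached_nodes = n_nodes;
        ++recordings;
    }
};

// Count the recordings triggered by a sequence of graph sizes.
std::size_t count_recordings(const std::vector<std::size_t> & sizes) {
    SingleSlotCache cache;
    for (std::size_t n : sizes) {
        cache.run(n);
    }
    return cache.recordings;
}
```

With sizes 1, 2, 3, 4 every call rebuilds the graph. Alternating between just two sizes also rebuilds on every call, because the single slot always holds the previous topology.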

Fix: Hash-Based Multi-Entry Graph Cache

  1. Topology Hash Function (backend.cpp):

    static uint64_t compute_cgraph_hash(const ggml_cgraph * cgraph) {
        uint64_t hash = 0xcbf29ce484222325ULL;  // FNV-1a
        // Hash: n_nodes, each node's op, type, dimensions, source tensor info
        ...
    }

  2. Cache Structure (common.hpp):

    std::map<uint64_t, std::unique_ptr<executable_graph>> graph_cache;
    static constexpr size_t MAX_GRAPH_CACHE_SIZE = 8;

  3. Lookup Logic:

  • Cache hit: Reuse graph, call update() (expected to succeed since the topology hash matches)
  • Cache miss: Record new graph, finalize, store in cache
  • Eviction: FIFO when cache exceeds 8 entries
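The three pieces above can be sketched together in plain C++. This is a minimal illustration under assumptions, not the code from backend.cpp or common.hpp: `ToyTensorNode`, `ToyExecGraph`, `topology_hash`, and `GraphCache` are hypothetical names, the hash covers only op and type, and the `std::deque` used for FIFO order is an assumption (a `std::map` iterates in key order, and the notes do not say how insertion order is tracked).

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <memory>
#include <vector>

// FNV-1a over a byte range, continuing from a running hash value.
inline std::uint64_t fnv1a(std::uint64_t hash, const void * data, std::size_t len) {
    const unsigned char * bytes = static_cast<const unsigned char *>(data);
    for (std::size_t i = 0; i < len; ++i) {
        hash ^= bytes[i];
        hash *= 0x100000001b3ULL;  // 64-bit FNV prime
    }
    return hash;
}

// Hypothetical, simplified node. The hash described above also covers
// dimensions and source tensor info.
struct ToyTensorNode { std::int32_t op; std::int32_t type; };

std::uint64_t topology_hash(const std::vector<ToyTensorNode> & nodes) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // 64-bit FNV offset basis
    const std::uint64_t n = nodes.size();
    hash = fnv1a(hash, &n, sizeof(n));
    for (const ToyTensorNode & node : nodes) {
        // Fields are hashed one by one so struct padding never enters the hash.
        hash = fnv1a(hash, &node.op, sizeof(node.op));
        hash = fnv1a(hash, &node.type, sizeof(node.type));
    }
    return hash;
}

struct ToyExecGraph { std::size_t n_nodes; };  // stand-in for executable_graph

class GraphCache {
public:
    explicit GraphCache(std::size_t max_entries) : max_entries_(max_entries) {}

    // Returns the cached graph for this topology, recording a new one on a miss.
    ToyExecGraph & get_or_record(const std::vector<ToyTensorNode> & nodes, bool & was_hit) {
        const std::uint64_t key = topology_hash(nodes);
        auto it = cache_.find(key);
        was_hit = (it != cache_.end());
        if (was_hit) {
            return *it->second;  // the caller would now call update() on it
        }
        if (!order_.empty() && cache_.size() >= max_entries_) {
            cache_.erase(order_.front());  // FIFO: drop the oldest insertion
            order_.pop_front();
        }
        order_.push_back(key);
        auto inserted = cache_.emplace(
            key, std::make_unique<ToyExecGraph>(ToyExecGraph{nodes.size()}));
        return *inserted.first->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::size_t max_entries_;
    std::map<std::uint64_t, std::unique_ptr<ToyExecGraph>> cache_;
    std::deque<std::uint64_t> order_;  // insertion order for FIFO eviction
};
```

A 64-bit hash can collide in principle, so an implementation that must never see the mismatch exception again could still keep the try/catch around update() as a fallback.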

Additional Fixes

  • Disabled graphs for F16×F32 and BF16×F32 GEMM (tiled GEMM had incorrect type casts causing NaN)
  • Only F32×F32 tiled GEMM is graph-compatible; others fall back to oneMKL
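The resulting rule can be written as a one-line predicate. This is only a sketch: `ElemType` and `gemm_is_graph_compatible` are hypothetical names, not identifiers from the backend.

```cpp
// Hypothetical stand-in for the tensor element types mentioned above.
enum class ElemType { F32, F16, BF16 };

// Returns true when the tiled GEMM path may be captured in a graph.
// Per the notes above, only F32 x F32 qualifies; other pairs fall back
// to oneMKL with graphs disabled.
constexpr bool gemm_is_graph_compatible(ElemType src0, ElemType src1) {
    return src0 == ElemType::F32 && src1 == ElemType::F32;
}
```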
