llama.cpp Metal mgpu overlay (env shim + optional hooks)

This overlay adds a tiny Objective‑C helper that:

  1. Lets you specify Metal device(s) via GGML_METAL_DEVICES="3,4,5"
    • If GGML_METAL_DEVICE_INDEX is not set, it is derived from the first index in GGML_METAL_DEVICES.
    • Example log:
      [metal-env-shim] derived GGML_METAL_DEVICE_INDEX='3' from GGML_METAL_DEVICES='3,4'
  2. Provides weak optional hooks:
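
The derivation described in item 1 can be sketched in plain C, which also compiles as Objective‑C. This is only an illustration of the documented behaviour, not the overlay's actual code: the function name `metal_env_shim_derive_index` is hypothetical, and the choices to skip leading blanks and to ignore a non-numeric first entry are assumptions. Only the two environment variable names and the log format come from the text above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict C modes */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if GGML_METAL_DEVICE_INDEX is not set, derive it
 * from the first entry of GGML_METAL_DEVICES (e.g. "3,4,5" -> "3").
 * Returns 1 if the index was derived, 0 if nothing was changed. */
int metal_env_shim_derive_index(void) {
    /* An explicitly set index always wins. */
    if (getenv("GGML_METAL_DEVICE_INDEX") != NULL) {
        return 0;
    }

    const char *devs = getenv("GGML_METAL_DEVICES");
    if (devs == NULL) {
        return 0;
    }

    /* Assumption: tolerate leading blanks before the first index. */
    const char *p = devs;
    while (*p == ' ' || *p == '\t') {
        p++;
    }

    /* Copy the leading run of digits, i.e. the first index in the list. */
    char first[16];
    size_t len = strspn(p, "0123456789");
    if (len == 0 || len >= sizeof first) {
        return 0;  /* assumption: leave the environment alone on bad input */
    }
    memcpy(first, p, len);
    first[len] = '\0';

    if (setenv("GGML_METAL_DEVICE_INDEX", first, 1) != 0) {
        return 0;
    }

    /* Re-read the list for the log, since setenv() may invalidate
     * pointers returned by earlier getenv() calls. */
    fprintf(stderr,
            "[metal-env-shim] derived GGML_METAL_DEVICE_INDEX='%s' "
            "from GGML_METAL_DEVICES='%s'\n",
            first, getenv("GGML_METAL_DEVICES"));
    return 1;
}
```

A helper like this would need to run before the Metal backend reads its environment, for example at the start of `main` or from a constructor function. With GGML_METAL_DEVICES="3,4" and no index set, it sets GGML_METAL_DEVICE_INDEX to 3 and writes the log line shown in item 1.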

Basten7 / gist:091df055c04edaa9c88eb0cdc7fc429d
Last active September 10, 2025 14:59
Prompt Processing vs Token Generation
Classic LLM-inference trace on the GPU
Basten7 / ggml-metal-optimized-4.m
Created August 11, 2025 08:24
Evol for A new Metal3 Backend for llama.cpp
#import "ggml-metal.h"
#import "ggml-impl.h"
#import "ggml-backend-impl.h"
#import "ggml-metal-impl.h"
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>