
@jdnichollsc
Last active April 8, 2026 02:24
Fine-tuning LLMs with Blackwell, Unsloth and my RTX 5090 (32GB VRAM) + 128GB RAM

Recommended Architecture

  ┌──────────────────────────────────────┐
  │         Windows 11 (Host)            │
  │                                      │
  │  ┌──────────────┐  ┌──────────────┐  │
  │  │  LM Studio   │  │ .wslconfig   │  │
  │  │  (Inference) │  │ memory=96GB  │  │
  │  │  loads .gguf │  │ processors=8 │  │
  │  └──────┬───────┘  └──────────────┘  │
  │         │ reads from                 │
  │  ┌──────┴────────────────────────┐   │
  │  │    Shared Models Directory    │   │
  │  │  C:\Users\you\.cache\models\  │   │
  │  └──────┬────────────────────────┘   │
  │         │ /mnt/c/...                 │
  │  ┌──────┴────────────────────────┐   │
  │  │         WSL2 (Ubuntu)         │   │
  │  │                               │   │
  │  │  Unsloth Studio / Unsloth CLI │   │
  │  │  - Fine-tune Gemma 4          │   │
  │  │  - Export GGUF → shared dir   │   │
  │  │  - CUDA 12.8 + RTX 5090       │   │
  │  └───────────────────────────────┘   │
  └──────────────────────────────────────┘
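The shared-directory handoff above relies on WSL2's automatic drive mounts: a Windows path such as C:\Users\you\.cache\models appears inside Ubuntu as /mnt/c/Users/you/.cache/models. A minimal sketch of that mapping (`win_to_wsl` is a hypothetical helper name, and it assumes the default /mnt/&lt;drive&gt; automount root, which /etc/wsl.conf can change):

```python
from pathlib import PureWindowsPath, PurePosixPath

def win_to_wsl(win_path: str) -> str:
    """Translate a Windows path to its default WSL2 mount point.

    Assumes the standard /mnt/<drive> automount; a custom mount root
    in /etc/wsl.conf would change the prefix.
    """
    p = PureWindowsPath(win_path)
    drive = p.drive.rstrip(":").lower()  # "C:" -> "c"
    return str(PurePosixPath("/mnt", drive, *p.parts[1:]))

print(win_to_wsl(r"C:\Users\you\.cache\models"))
# -> /mnt/c/Users/you/.cache/models
```

This is why Unsloth can export GGUF files straight into the shared directory and LM Studio can load them without any copying.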

Installation

Windows

Configuring WSL2

  • C:\Users\<your-username>\.wslconfig:
  ┌─────────────────────┬───────┬────────────────────────────────────────────────────────────────┐
  │       Setting       │ Value │                           Reasoning                            │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ memory              │ 96GB  │ Leaves 32GB for Windows/LM Studio; maximizes training headroom │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ swap                │ 16GB  │ Safety net for memory spikes during training                   │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ processors          │ 12    │ Parallelizes data loading and tokenization                     │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ localhostForwarding │ true  │ Access Unsloth Studio UI from Windows browser                  │
  └─────────────────────┴───────┴────────────────────────────────────────────────────────────────┘
  • Apply the changes by restarting WSL: wsl --shutdown (then reopen your WSL terminal)
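Put together, the table above corresponds to a .wslconfig along these lines (a sketch; adjust memory and processors to your own RAM and core count):

```ini
[wsl2]
memory=96GB
swap=16GB
processors=12
localhostForwarding=true
```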

WSL

Install Unsloth Studio (includes training)

curl -fsSL https://unsloth.ai/install.sh | sh
  • Launch: unsloth studio -H 0.0.0.0 -p 8888

Install Llama.cpp

brew install llama.cpp

Download an LLM for Testing Inference (GGUF)

On first run, llama-server fetches the GGUF from Hugging Face and caches it locally, then serves it (by default on port 8080, with an OpenAI-compatible API):

llama-server -hf unsloth/gemma-4-31B-it-GGUF:UD-Q5_K_XL
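As a rough sanity check before pulling a quant, you can estimate the GGUF file size (and hence the VRAM needed for full GPU offload) as parameter count × bits per weight / 8, plus headroom for KV cache and activations. A back-of-envelope sketch (the ~5.5 bits/weight figure for Q5_K-class quants is an approximation, not an exact spec):

```python
def gguf_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate quantized model size in GB: params * bits / 8."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# A ~31B model at ~5.5 bits/weight (Q5_K-class):
size = gguf_size_gb(31, 5.5)
print(f"~{size:.1f} GB")  # fits in 32 GB VRAM, with some room left for KV cache
```

The same arithmetic explains why larger quants (Q8_0, ~8 bits/weight) of a 31B model would overflow a 32 GB card once cache and context are accounted for.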