
@jdnichollsc
Last active April 8, 2026 02:24
Fine-tuning LLMs with Blackwell, Unsloth and my RTX 5090 (32GB VRAM) + 128GB RAM

Recommended Architecture

  ┌──────────────────────────────────────┐
  │         Windows 11 (Host)            │
  │                                      │
  │  ┌──────────────┐  ┌──────────────┐  │
  │  │  LM Studio   │  │ .wslconfig   │  │
  │  │  (Inference) │  │ memory=96GB  │  │
  │  │  loads .gguf │  │ processors=8 │  │
  │  └──────┬───────┘  └──────────────┘  │
  │         │ reads from                 │
  │  ┌──────┴────────────────────────┐   │
  │  │    Shared Models Directory    │   │
  │  │  C:\Users\you\.cache\models\  │   │
  │  └──────┬────────────────────────┘   │
  │         │ /mnt/c/...                 │
  │  ┌──────┴────────────────────────┐   │
  │  │         WSL2 (Ubuntu)         │   │
  │  │                               │   │
  │  │  Unsloth Studio / Unsloth CLI │   │
  │  │  - Fine-tune Gemma 4          │   │
  │  │  - Export GGUF → shared dir   │   │
  │  │  - CUDA 12.8 + RTX 5090       │   │
  │  └───────────────────────────────┘   │
  └──────────────────────────────────────┘
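The shared-directory handoff above relies on WSL2's automatic drive mounts: a Windows path such as C:\Users\you\.cache\models appears inside Ubuntu as /mnt/c/Users/you/.cache/models. A minimal sketch of that mapping (`win_to_wsl` is a hypothetical helper name, and it assumes the default /mnt/&lt;drive&gt; automount root, which /etc/wsl.conf can change):

```python
from pathlib import PureWindowsPath, PurePosixPath

def win_to_wsl(win_path: str) -> str:
    """Translate a Windows path to its default WSL2 mount point.

    Assumes the standard /mnt/<drive> automount; a custom mount root
    in /etc/wsl.conf would change the prefix.
    """
    p = PureWindowsPath(win_path)
    drive = p.drive.rstrip(":").lower()  # "C:" -> "c"
    return str(PurePosixPath("/mnt", drive, *p.parts[1:]))

print(win_to_wsl(r"C:\Users\you\.cache\models"))
# -> /mnt/c/Users/you/.cache/models
```

This is why Unsloth can export GGUF files straight into the shared directory and LM Studio can load them without any copying.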

Installation

Windows

Configuring WSL2

  • C:\Users\<your-username>\.wslconfig:
  ┌─────────────────────┬───────┬────────────────────────────────────────────────────────────────┐
  │       Setting       │ Value │                           Reasoning                            │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ memory              │ 96GB  │ Leaves 32GB for Windows/LM Studio; maximizes training headroom │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ swap                │ 16GB  │ Safety net for memory spikes during training                   │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ processors          │ 12    │ Parallelizes data loading and tokenization                     │
  ├─────────────────────┼───────┼────────────────────────────────────────────────────────────────┤
  │ localhostForwarding │ true  │ Access Unsloth Studio UI from Windows browser                  │
  └─────────────────────┴───────┴────────────────────────────────────────────────────────────────┘
  • Apply the changes by restarting WSL: wsl --shutdown (then reopen your WSL terminal)
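Put together, the table above corresponds to a .wslconfig along these lines (a sketch; adjust memory and processors to your own RAM and core count):

```ini
[wsl2]
memory=96GB
swap=16GB
processors=12
localhostForwarding=true
```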

WSL

Install Unsloth Studio (includes training)

curl -fsSL https://unsloth.ai/install.sh | sh
  • Launch: unsloth studio -H 0.0.0.0 -p 8888

Install Llama.cpp

brew install llama.cpp

Download an LLM for Testing Inference (GGUF)

On first run, llama-server fetches the GGUF from Hugging Face and caches it locally, then serves it (by default on port 8080, with an OpenAI-compatible API):

llama-server -hf unsloth/gemma-4-31B-it-GGUF:UD-Q5_K_XL
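As a rough sanity check before pulling a quant, you can estimate the GGUF file size (and hence the VRAM needed for full GPU offload) as parameter count × bits per weight / 8, plus headroom for KV cache and activations. A back-of-envelope sketch (the ~5.5 bits/weight figure for Q5_K-class quants is an approximation, not an exact spec):

```python
def gguf_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate quantized model size in GB: params * bits / 8."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# A ~31B model at ~5.5 bits/weight (Q5_K-class):
size = gguf_size_gb(31, 5.5)
print(f"~{size:.1f} GB")  # fits in 32 GB VRAM, with some room left for KV cache
```

The same arithmetic explains why larger quants (Q8_0, ~8 bits/weight) of a 31B model would overflow a 32 GB card once cache and context are accounted for.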