@apollo-mg
Created November 24, 2025 19:44
Running Wan 2.1 (14B) on AMD RDNA 4 (RX 9070 XT) - Stability Guide

Hardware: AMD Radeon RX 9070 XT (16GB VRAM)
OS: Ubuntu 22.04 / Linux
ROCm: 7.0 / 7.1 Preview
Goal: Stable text-to-video generation with Wan 2.1 (14B) without crashes or OOM.

The Problem

Running Wan 2.1 on RDNA 4 currently triggers frequent "HIP error: illegal memory access" crashes or immediate out-of-memory (OOM) failures during VAE decoding. The root causes are kernel conflicts with PyTorch's TunableOp and VRAM fragmentation.

The Fix (Launch Script)

Save this as run_wan_safe.sh, make it executable (chmod +x run_wan_safe.sh), and launch ComfyUI through it. The specific environment variables are critical.

#!/bin/bash

# 1. DISABLE System Direct Memory Access (SDMA)
# Prevents data corruption during heavy GGUF transfers on RDNA 4.
export HSA_ENABLE_SDMA=0

# 2. DISABLE PyTorch TunableOp
# Crucial. While TunableOp helps Flux, it causes "Illegal Memory Access" 
# crashes with Wan 2.1 kernels on Navi 4x.
export PYTORCH_TUNABLEOP_ENABLED=0

# 3. ENABLE Triton Backend for Flash Attention
# The default Composable Kernel (CK) backend often fails on RDNA 4.
# Requires flash-attn to be built with this var set.
export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"

# 4. Aggressive Memory Fragmentation Control
# Forces PyTorch to split blocks earlier (128MB) and GC sooner (60%).
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128,expandable_segments:True"

echo "Launch Config:"
echo "  SDMA: OFF (Stability)"
echo "  TunableOp: OFF (Fix Illegal Access)"
echo "  Triton FA: ON (Performance)"
echo "  HIP Alloc: Optimized for 16GB"

# Launch ComfyUI with Low VRAM mode to force aggressive offloading
python3 main.py --lowvram --use-split-cross-attention
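If you prefer keeping the exports in your shell profile instead of a wrapper script, a minimal sanity-check sketch (same variable names as above) that sets the flags and echoes what a launched process will inherit:

```bash
#!/bin/bash
# Sanity check: export the stability-critical flags from run_wan_safe.sh,
# then print each one so you can confirm ComfyUI will inherit them.
export HSA_ENABLE_SDMA=0
export PYTORCH_TUNABLEOP_ENABLED=0
export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128,expandable_segments:True"

for pair in "HSA_ENABLE_SDMA=$HSA_ENABLE_SDMA" \
            "PYTORCH_TUNABLEOP_ENABLED=$PYTORCH_TUNABLEOP_ENABLED" \
            "FLASH_ATTENTION_TRITON_AMD_ENABLE=$FLASH_ATTENTION_TRITON_AMD_ENABLE" \
            "PYTORCH_HIP_ALLOC_CONF=$PYTORCH_HIP_ALLOC_CONF"; do
  echo "OK: $pair"
done
```

Exported variables propagate to child processes, so a `python3 main.py ...` started from the same shell will see them; an already-running ComfyUI instance will not pick them up until restarted.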

ComfyUI Workflow Settings

Even with the script, you will OOM during the final VAE Decode step unless you use these settings:

  1. Node: Use VAEDecodeTiled (Not standard VAEDecode).
  2. Tile Size: 256 (Default 512 is too large for 16GB VRAM + 14B Model).
  3. Temporal Tiling: 16 (Helps smooth out the decoding).
  4. Overlap: 64.
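To see why the tile size matters, here is a rough back-of-envelope sketch. The numbers (fp16 activations, 3 output channels, a ~16x activation blow-up inside the decoder, 16 frames per temporal tile) are my own illustrative assumptions, not measured values; the point is that per-tile decode memory grows with the square of the tile edge, so halving the tile from 512 to 256 cuts the working set by roughly 4x:

```bash
#!/bin/bash
# Illustrative per-tile memory estimate (assumed constants, for intuition only):
#   bytes   = 2   (fp16)
#   channels= 3   (RGB output)
#   blowup  = 16  (assumed decoder activation blow-up factor)
#   frames  = 16  (temporal tile size from the settings above)
bytes=2; channels=3; blowup=16; frames=16
for tile in 512 256; do
  mb=$(( tile * tile * channels * blowup * frames * bytes / 1024 / 1024 ))
  echo "tile=$tile -> ~${mb} MiB per tile"
done
```

Whatever the true constants are for Wan's VAE, the quadratic scaling holds, which is why dropping to 256 is the difference between OOM and a clean decode on 16GB.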

Notes on Flash Attention

You must build flash-attention from source with the Triton flag enabled:

export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"
pip install git+https://github.com/Dao-AILab/flash-attention.git
@flexusjan

In this guide you install flash-attention like this:

export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"
pip install git+https://github.com/Dao-AILab/flash-attention.git

In your flash-attention guide you do it like this:

export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"
pip install flash-attn --no-build-isolation

What is the preferred way to do it?

@apollo-mg (Author) commented Nov 30, 2025 via email
