@danielrosehill
Created December 7, 2025 11:41
Docker Layering For PyTorch ROCm In Action: How to Build Efficient AI/ML Image Stacks

When building multiple AI/ML Docker images that all need PyTorch with ROCm (AMD GPU) support, naive approaches can waste tens of gigabytes of disk space. This guide shows how Docker's layer sharing works and how to verify your images are efficiently layered.

The Problem: Misleading Image Sizes

Running docker images shows seemingly massive duplication:

REPOSITORY          TAG       SIZE
whisperx-rocm       latest    30.7GB
whisper-rocm        latest    29.7GB
rocm/pytorch        latest    29.3GB

At first glance, this looks like ~90GB of disk usage. But is it really?

The Reality: Shared Layers

These images are actually layered on top of each other:

rocm/pytorch (29.3GB base)
  └── whisper-rocm (+471MB unique)
      └── whisperx-rocm (+998MB unique)

Actual disk usage: ~31GB (not 90GB!)
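
A quick back-of-the-envelope check of those two numbers, using the rounded megabyte figures from the example above:

```shell
# Apparent usage: sum of each image's reported SIZE (shared layers are
# counted once per image). Actual usage: base image plus unique layers only.
# Figures are the rounded MB values from the example above.
apparent=$((30700 + 29700 + 29300))
actual=$((29300 + 471 + 998))
echo "apparent: ${apparent} MB"   # ~90GB
echo "actual:   ${actual} MB"     # ~31GB
```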

How to Check Real Disk Usage

Command 1: docker system df -v

This shows the SHARED SIZE vs UNIQUE SIZE for each image:

docker system df -v

Output:

REPOSITORY          TAG       SIZE      SHARED SIZE   UNIQUE SIZE
whisperx-rocm       latest    30.7GB    29.73GB       997.9MB
whisper-rocm        latest    29.7GB    29.73GB       0B
rocm/pytorch        latest    29.3GB    29.26GB       0B

Key insight: whisperx-rocm reports 30.7GB total, but only 997.9MB of that is unique - the rest is shared with its parent layers. whisper-rocm shows 0B unique here because every one of its layers is also part of whisperx-rocm.
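
To pull out just the unique sizes programmatically, you can filter the table with awk. This is a sketch: the heredoc stands in for real docker system df -v output, and the real table has extra columns (IMAGE ID, CREATED, CONTAINERS), so adjust the column numbers for your Docker version:

```shell
# Print repository and unique size from a simplified copy of the table.
# In this simplified 5-column layout, $1 = REPOSITORY and $5 = UNIQUE SIZE.
awk 'NR > 1 { print $1, $5 }' <<'EOF'
REPOSITORY TAG SIZE SHARED_SIZE UNIQUE_SIZE
whisperx-rocm latest 30.7GB 29.73GB 997.9MB
whisper-rocm latest 29.7GB 29.73GB 0B
rocm/pytorch latest 29.3GB 29.26GB 0B
EOF
```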

Command 2: docker history

See how an image was built and what each layer adds:

docker history whisper-rocm:latest --no-trunc | head -15

This reveals the base image and what packages were added on top.

Command 3: Total disk summary

docker system df

Shows aggregate disk usage across all images, containers, and volumes.

Building Efficient Layered Images

Pattern: Use a Common Base Image

Instead of each AI tool installing its own PyTorch + ROCm stack, layer them:

Bad (standalone images):

# whisper/Dockerfile - 30GB standalone
FROM ubuntu:22.04
RUN install-rocm-from-scratch   # placeholder for a manual ROCm install
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
RUN pip install openai-whisper flask gunicorn

# chatterbox/Dockerfile - another 25GB standalone
FROM ubuntu:22.04
RUN install-rocm-from-scratch   # placeholder for a manual ROCm install
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
RUN pip install chatterbox-tts fastapi uvicorn

Good (layered on common base):

# whisper/Dockerfile - only adds ~500MB
FROM rocm/pytorch:latest

RUN apt-get update && apt-get install -y ffmpeg git \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir \
    openai-whisper flask flask-cors gunicorn

# chatterbox/Dockerfile - only adds ~500MB
FROM rocm/pytorch:latest

RUN apt-get update && apt-get install -y ffmpeg \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir \
    chatterbox-tts fastapi uvicorn

Even Better: Chain Your Images

If Whisper and WhisperX share dependencies, chain them:

# whisperx/Dockerfile - builds on whisper, adds ~1GB
FROM whisper-rocm:latest

RUN pip install whisperx

Real-World Example: AI Audio Stack

Here's an efficient stack for audio AI on AMD GPUs:

Image                 Total Size   Unique Size   Purpose
rocm/pytorch:latest   29.3GB       29.3GB        Base (PyTorch + ROCm)
whisper-rocm          29.7GB       471MB         Speech-to-text API
whisperx-rocm         30.7GB       998MB         Enhanced STT with alignment
chatterbox-tts        ~30GB        ~500MB        Text-to-speech with voice cloning

Total apparent size: 120GB
Actual disk usage: ~31GB
Space saved: ~89GB (74% reduction)
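
Sanity-checking the headline numbers with integer arithmetic:

```shell
# GB figures from the table above, rounded to whole numbers
apparent=120   # sum of the Total Size column
actual=31      # base image plus unique layers
saved=$((apparent - actual))
pct=$((saved * 100 / apparent))
echo "saved ${saved}GB (${pct}%)"
```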

Key Takeaways

  1. Always check docker system df -v - the SIZE column in docker images double-counts shared layers
  2. Use official base images like rocm/pytorch instead of building from scratch
  3. Chain related images - if B needs everything A has, build B FROM A
  4. Order Dockerfile commands wisely - put rarely-changing layers first
  5. Clean up dangling images with docker image prune
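
Takeaway 4 in practice: order instructions from least to most frequently changed, so edits to your app code only rebuild the final layers. A sketch (the file paths are illustrative):

```dockerfile
FROM rocm/pytorch:latest
# Huge, stable base: almost never changes

# System packages change rarely
RUN apt-get update && apt-get install -y ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Python deps change occasionally; copying only requirements.txt first
# means editing app code does not invalidate this layer's cache
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r /app/requirements.txt

# App code changes most often, so it goes last
COPY . /app/
```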

Useful Commands Reference

# Check actual disk usage with layer sharing info
docker system df -v

# See image layer history
docker history IMAGE_NAME --no-trunc

# Clean up dangling (untagged) images
docker image prune -f

# Clean up everything unused (careful!)
docker system prune -a

# See what base image was used (often empty for BuildKit-built images)
docker inspect IMAGE_NAME | jq '.[0].Config.Image'

Hardware Context

This guide was developed on:

  • GPU: AMD Radeon RX 7700 XT (gfx1101, Navi 32)
  • ROCm: 6.x / 7.x
  • Base Image: rocm/pytorch:latest

The same principles apply to NVIDIA setups with nvidia/cuda or pytorch/pytorch base images.


This gist was generated by Claude Code. Please verify any information before relying on it.
