@alvarobartt
Last active January 31, 2025 15:50
Calculates the required VRAM for different precisions based on the number of parameters of a model (pulled from the Hugging Face Hub Safetensors metadata). This Gist is inspired by https://gist.github.com/philschmid/d188034c759811a7183e7949e1fa0aa4.
from huggingface_hub import get_safetensors_metadata

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
precision = "F8"

# Bytes per parameter for each supported precision
dtype_bytes = {"F32": 4, "F16": 2, "BF16": 2, "F8": 1, "INT8": 1, "INT4": 0.5}

# Fetch the Safetensors metadata from the Hub (no weights are downloaded)
metadata = get_safetensors_metadata(model_id)

# Total parameters x bytes per parameter, converted to GiB, plus ~18% overhead
memory = (sum(metadata.parameter_count.values()) * dtype_bytes[precision] / (1024**3)) * 1.18
print(f"{model_id} requires ~{memory:.2f}GB of VRAM in {precision}")