
@xrsrke · Created June 28, 2024 12:41
Install FP8 support for nanotron
conda create --prefix ./env python=3.10
conda activate ./env  # activate the env so the installs below land in it
module load cuda/12.1
export CUDNN_PATH=/usr/local/cuda-12.1/lib
pip install pyyaml packaging
pip install torch==2.1.2
pip install numpy
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
git clone [email protected]:huggingface/nanotron.git
cd nanotron
pip install -e .
pip install pytest pytest-xdist
# NOTE: keep these packages pinned to the following versions:
# triton==2.1.0 nvidia-nccl-cu12==2.18.1 torch==2.1.2
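
Quick sanity check after the install (a minimal sketch; it assumes the ./env environment is activated and a CUDA GPU is visible):

# confirm the pinned torch build and the CUDA version it was built against
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# confirm TransformerEngine's PyTorch bindings import cleanly
python -c "import transformer_engine.pytorch as te; print(te.__file__)"
# optionally run nanotron's tests in parallel with pytest-xdist
# (the tests/ path is an assumption about the repo layout)
pytest -n auto tests/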
xrsrke commented Oct 9, 2024

import transformer_engine as te
/fsx/phuc/temp/fp8_for_nanotron/env/lib/python3.10/site-packages/transformer_engine/pytorch/attention.py:108: UserWarning: To use flash-attn v3, please use the following commands to install:
(1) pip install "git+https://github.com/Dao-AILab/flash-attention.git#egg=flashattn-hopper&subdirectory=hopper"
(2) python_path=`python -c "import site; print(site.getsitepackages()[0])"`
(3) mkdir -p $python_path/flashattn_hopper
(4) wget -P $python_path/flashattn_hopper https://raw.githubusercontent.com/Dao-AILab/flash-attention/main/hopper/flash_attn_interface.py
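
The same four steps as a copy-pasteable shell block (a sketch of the commands printed by the warning, using $(...) instead of backticks; flash-attn v3 targets Hopper GPUs):

# build flash-attn v3 from the hopper/ subdirectory of the flash-attention repo
pip install "git+https://github.com/Dao-AILab/flash-attention.git#egg=flashattn-hopper&subdirectory=hopper"
# copy the Python interface next to the installed package
python_path=$(python -c "import site; print(site.getsitepackages()[0])")
mkdir -p "$python_path/flashattn_hopper"
wget -P "$python_path/flashattn_hopper" https://raw.githubusercontent.com/Dao-AILab/flash-attention/main/hopper/flash_attn_interface.py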
