@alvarovm
Forked from jingsk/MACE_install.md
Last active February 27, 2024 17:35
Install MACE with cuda-enabled Pytorch

MACE installation on ANL's Swing as of 1/30/24

Resources available on Swing: NVIDIA A100 GPUs (8 GPUs per node). Requesting 1 GPU allocates 1/8 of a node.

MACE requires PyTorch 2, which needs CUDA 11.7 or 11.8. Check the required CUDA versions here

Install Torch and CUDA

Create a Conda environment with Python 3.10

conda create -n "CUDA-torch-base" python=3.10.0

Activate the environment once it is created

conda activate CUDA-torch-base

Read the PyTorch installation instructions on the official website. For me, this is the command:

conda install pytorch=2.2.0 torchvision=0.17.0 torchaudio=2.2.0 pytorch-cuda=11.8  -c pytorch -c nvidia

Before confirming the installation of Torch, Torchvision, and Torchaudio, check that conda is about to install the CUDA builds and not the CPU builds. For example, a CPU build of PyTorch is listed as pytorch/linux-64::pytorch-2.0.1-py3.10_cpu_0, while a CUDA build is listed as pytorch/linux-64::pytorch-2.2.0-py3.10_cuda11.8_cudnn8.7.0_0. For PyTorch 1.11.0 (for Allegro), I had success with CUDA 11.3.
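The build-string check above can also be done programmatically. The following is a minimal sketch, assuming the package spec lines look like the two examples quoted above; the helper name `classify_build` is hypothetical and not part of conda or PyTorch.

```python
import re


def classify_build(spec: str) -> str:
    """Return 'cuda', 'cpu', or 'unknown' for a conda package spec line.

    Assumes the build string is the last dash-separated field,
    e.g. 'py3.10_cuda11.8_cudnn8.7.0_0'.
    """
    build = spec.rsplit("-", 1)[-1]
    if re.search(r"(^|_)cuda\d", build):
        return "cuda"
    if re.search(r"(^|_)cpu(_|$)", build):
        return "cpu"
    return "unknown"


# The two example spec lines quoted in the text above.
cpu_spec = "pytorch/linux-64::pytorch-2.0.1-py3.10_cpu_0"
cuda_spec = "pytorch/linux-64::pytorch-2.2.0-py3.10_cuda11.8_cudnn8.7.0_0"

print(classify_build(cpu_spec))   # cpu
print(classify_build(cuda_spec))  # cuda
```

Other channels may format build strings differently, so treat an "unknown" result as a prompt to read the line by eye.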

Check whether CUDA-enabled Torch is installed correctly. Request a GPU (don't forget to ssh into the compute node once the resource is granted), then in Python run:

import torch
assert torch.cuda.is_available(), "CUDA is not available to PyTorch"
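If the assertion fails, it helps to see why. The sketch below gathers a few facts about the install using standard PyTorch calls (`torch.version.cuda`, `torch.cuda.device_count()`, `torch.cuda.get_device_name()`); the function name `torch_cuda_report` is hypothetical. It returns a small report instead of raising, so it also runs on a login node without a GPU.

```python
def torch_cuda_report() -> dict:
    """Collect basic facts about the PyTorch install without raising."""
    try:
        import torch
    except ImportError:
        return {"installed": False}

    report = {
        "installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only build was installed.
        "built_with_cuda": torch.version.cuda,
        # False on a node without a visible GPU, even for a CUDA build.
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


for key, value in torch_cuda_report().items():
    print(f"{key}: {value}")
```

A `built_with_cuda` value of `None` points to a CPU build, while a CUDA version paired with `cuda_available: False` usually means the script is not running on a GPU node.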

Install MACE

If all is well, install MACE following its installation guide:

conda create --name NNFF-MACE --clone CUDA-torch-base
conda activate NNFF-MACE
pip install mace-torch
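To confirm that the cloned environment contains both packages, a short check with the standard-library `importlib.metadata` can be used. This is a sketch, assuming the distributions are registered as `torch` and `mace-torch`; the helper `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them if your install differs.
for name in ("torch", "mace-torch"):
    print(name, installed_version(name) or "not installed")
```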

To learn about MACE, follow the tutorial at https://github.com/ilyes319/mace-tutorials/blob/main/mace-users/MACE_users.ipynb. As a test run, download solvent_test.xyz and solvent_train.xyz from the repo, set $DATA to the directory that holds them, then run this command on a compute node:

mace_run_train \
    --name="model" \
    --train_file="$DATA/solvent_train.xyz" \
    --valid_fraction=0.05 \
    --test_file="$DATA/solvent_test.xyz" \
    --E0s="isolated" \
    --energy_key="energy" \
    --forces_key="forces" \
    --model="MACE" \
    --num_interactions=2 \
    --max_ell=2 \
    --hidden_irreps="16x0e" \
    --num_cutoff_basis=5 \
    --correlation=2 \
    --r_max=3.0 \
    --batch_size=5 \
    --valid_batch_size=5 \
    --eval_interval=1 \
    --max_num_epochs=50 \
    --start_swa=15 \
    --swa_energy_weight=1000 \
    --ema \
    --ema_decay=0.99 \
    --amsgrad \
    --error_table="PerAtomRMSE" \
    --default_dtype="float32" \
    --swa \
    --device=cuda \
    --seed=1234
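When trying several settings, it can be convenient to assemble the same command from Python. The sketch below only rebuilds the command shown above from a dictionary and does not add any flags; the function name `build_mace_command` and the `overrides` argument are hypothetical, and the data directory path is a placeholder.

```python
import shlex
from typing import List, Optional


def build_mace_command(data_dir: str, overrides: Optional[dict] = None) -> List[str]:
    """Assemble the mace_run_train call shown above as an argument list."""
    options = {
        "name": "model",
        "train_file": f"{data_dir}/solvent_train.xyz",
        "valid_fraction": 0.05,
        "test_file": f"{data_dir}/solvent_test.xyz",
        "E0s": "isolated",
        "energy_key": "energy",
        "forces_key": "forces",
        "model": "MACE",
        "num_interactions": 2,
        "max_ell": 2,
        "hidden_irreps": "16x0e",
        "num_cutoff_basis": 5,
        "correlation": 2,
        "r_max": 3.0,
        "batch_size": 5,
        "valid_batch_size": 5,
        "eval_interval": 1,
        "max_num_epochs": 50,
        "start_swa": 15,
        "swa_energy_weight": 1000,
        "ema_decay": 0.99,
        "error_table": "PerAtomRMSE",
        "default_dtype": "float32",
        "device": "cuda",
        "seed": 1234,
    }
    options.update(overrides or {})

    # Flags that take no value in the original command.
    switches = ["ema", "amsgrad", "swa"]

    cmd = ["mace_run_train"]
    cmd += [f"--{key}={value}" for key, value in options.items()]
    cmd += [f"--{flag}" for flag in switches]
    return cmd


# "/path/to/data" is a placeholder for the directory with the two .xyz files.
cmd = build_mace_command("/path/to/data", overrides={"max_num_epochs": 10})
print(shlex.join(cmd))
# To launch it on a compute node: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting issues; the printed line is only for inspection.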

MACE+LAMMPS

Instructions for LAMMPS with MACE here

KOKKOS in ALCF

Instructions for compiling with Kokkos on Polaris here