Skip to content

Instantly share code, notes, and snippets.

@janosh
Last active November 12, 2024 04:50
Show Gist options
  • Save janosh/a484f3842b600b60cd575440e99455c0 to your computer and use it in GitHub Desktop.
Save janosh/a484f3842b600b60cd575440e99455c0 to your computer and use it in GitHub Desktop.
VASP M1 Mac Compilation Guide

Compiling VASP on M1 Mac

Written by Alex Ganose @utf and Janosh Riebesell @janosh. Published 2022-03-28. Last updated 2024-03-30.

  1. Install Xcode command line tools:

    xcode-select --install
  2. Install dependencies using Homebrew:

    brew install gcc openmpi scalapack fftw qd openblas

    Optionally, add hdf5 for HDF5 support in VASP.

  3. Compile VASP:

    These instructions are for VASP 6.4.1 but should work with minor adjustments for other versions.

    cd /path/to/vasp-6.x.y
    cp arch/makefile.include.gnu_omp makefile.include

    Edit makefile.include in the VASP src directory:

    • Add to CPP_OPTIONS:

      -D_OPENMP \
      -Dqd_emulate
    • Change all instances of gcc to gcc-13 and g++ to g++-13

    • Add after LLIBS = -lstdc++ to emulate quad precision:

      QD ?= /opt/homebrew/
      LLIBS += -L$(QD)/lib -lqdmod -lqd
      INCS += -I$(QD)/include/qd
    • Set SCALAPACK_ROOT ?= /opt/homebrew

    • Set OPENBLAS_ROOT ?= /opt/homebrew/Cellar/openblas/0.3.20 (Double check this is the path on your system)

    • Set FFTW_ROOT ?= /opt/homebrew

    • (optional but recommended by VASP) For HDF5 support, add

      CPP_OPTIONS+= -DVASP_HDF5
      HDF5_ROOT  ?= /opt/homebrew/
      LLIBS      += -L$(HDF5_ROOT)/lib -lhdf5_fortran
      INCS       += -I$(HDF5_ROOT)/include
    • Append getshmem.o to OBJECTS_LIB in makefile.include (VASP wiki)

      - OBJECTS_LIB = linpack_double.o
      + OBJECTS_LIB = linpack_double.o getshmem.o
    • In src/parser/makefile, change (as noted by @zhuligs):

      - ar vq libparser.a $(CPPOBJ_PARS) $(COBJ_PARS) locproj.tab.h
      + ar vq libparser.a $(CPPOBJ_PARS) $(COBJ_PARS)

      Do not replace the tab at the beginning of the line with spaces!

    • In src/lib/getshmem.c, add the line #define SHM_NORESERVE 0 (VASP forum):

      /*output: shmem id
      */
      #define SHM_NORESERVE 0 // this line was added
      
      void getshmem_C(size_t _size, int *_id)
    • In makefile.include, update the parser section (VASP forum):

      # For the parser library
      CXX_PARS = g++-13
      - LLIBS = -lstdc++
      + LIBS += parser
      + LLIBS = -Lparser -lparser -lstdc++
      QD ?= /opt/homebrew
      LLIBS += -L$(QD)/lib -lqdmod -lqd
      INCS += -I$(QD)/include/qd
  4. Finally, run:

    make all

    If a previous compilation failed, remember to run make veryclean to start from a clean slate. Fixes gfortran errors like

    Fatal Error: string.mod not found

If successful, the VASP binaries will be in src/bin. Test with make test.

Last Tested on 2024-03-30

Confirmed working with VASP 6.4.1 on M1 Pro with Sonoma 14.2.1 and [email protected].

Resulting makefile.include with all modifications

See makefile.include.

Benchmarking

Initial performance testing suggests optimal parameters for the M1 Pro chip with 8 performance, 2 efficiency cores and 16" MBP cooling are

export OMP_NUM_THREADS=1 # important
mpiexec -np 8 vasp_std
NCORE = 4 # in INCAR
n_proc n_threads n_core elapsed (sec)
0 1 1 2 93.3
1 1 1 4 92.8
2 1 2 2 82.8
3 1 2 4 82.7
4 2 1 2 42.8
5 2 1 4 42.9
6 2 2 2 52.9
7 2 2 4 52.7
8 4 1 2 32.9
9 4 1 4 32.9
10 4 2 2 52.9
11 4 2 4 53.0
12 8 1 2 32.8
13 8 1 4 22.8
14 8 2 2 62.8
15 8 2 4 62.9

Brings wall time for this Si2 relaxation down to 23 sec.

from time import perf_counter

from atomate2.vasp.jobs.core import RelaxMaker
from atomate2.vasp.powerups import update_user_incar_settings
from jobflow import run_locally
from pymatgen.core import Structure


start = perf_counter()

# FCC silicon structure
si_structure = Structure(
    lattice=[[0, 2.73, 2.73], [2.73, 0, 2.73], [2.73, 2.73, 0]],
    species=["Si", "Si"],
    coords=[[0, 0, 0], [0.25, 0.25, 0.25]],
)

# relax job to optimize structure
relax_job = RelaxMaker().make(si_structure)

relax_job = update_user_incar_settings(relax_job, {"NCORE": 4})

# run job
run_locally(relax_job, create_folders=True, ensure_success=True)

print(f"Si relaxation took {perf_counter() - start:.3f} sec")
# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxGNU\" \
-DMPI -DMPI_BLOCK=8000 -Duse_collective \
-DscaLAPACK \
-DCACHE_SIZE=4000 \
-Davoidalloc \
-Dvasp6 \
-Duse_bse_te \
-Dtbdyn \
-Dfock_dblbuf \
-D_OPENMP \
-Dqd_emulate
CPP = gcc-13 -E -C -w $*$(FUFFIX) >$*$(SUFFIX) $(CPP_OPTIONS)
FC = mpif90 -fopenmp
FCL = mpif90 -fopenmp
FREE = -ffree-form -ffree-line-length-none
FFLAGS = -w -ffpe-summary=invalid,zero,overflow -L /opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/13
OFLAG = -O2
OFLAG_IN = $(OFLAG)
DEBUG = -O0
OBJECTS = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
OBJECTS_O1 += fftw3d.o fftmpi.o fftmpiw.o
OBJECTS_O2 += fft3dlib.o
# For what used to be vasp.5.lib
CPP_LIB = $(CPP)
FC_LIB = $(FC)
CC_LIB = gcc-13
CFLAGS_LIB = -O
FFLAGS_LIB = -O1
FREE_LIB = $(FREE)
OBJECTS_LIB = linpack_double.o getshmem.o
# For the parser library
CXX_PARS = g++-13
LIBS += parser
LLIBS = -Lparser -lparser -lstdc++
QD ?= /opt/homebrew
LLIBS += -L$(QD)/lib -lqdmod -lqd
INCS += -I$(QD)/include/qd
##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##
# When compiling on the target machine itself, change this to the
# relevant target when cross-compiling for another architecture
FFLAGS += -march=native
# For gcc-10 and higher (comment out for older versions)
FFLAGS += -fallow-argument-mismatch
# BLAS and LAPACK (mandatory)
OPENBLAS_ROOT ?= /opt/homebrew/Cellar/openblas/0.3.26
BLASPACK = -L$(OPENBLAS_ROOT)/lib -lopenblas
# scaLAPACK (mandatory)
SCALAPACK_ROOT ?= /opt/homebrew
SCALAPACK = -L$(SCALAPACK_ROOT)/lib -lscalapack
LLIBS += $(SCALAPACK) $(BLASPACK)
# FFTW (mandatory)
FFTW_ROOT ?= /opt/homebrew
LLIBS += -L$(FFTW_ROOT)/lib -lfftw3 -lfftw3_omp
INCS += -I$(FFTW_ROOT)/include
# HDF5-support (optional but strongly recommended)
#CPP_OPTIONS+= -DVASP_HDF5
#HDF5_ROOT ?= /path/to/your/hdf5/installation
#LLIBS += -L$(HDF5_ROOT)/lib -lhdf5_fortran
#INCS += -I$(HDF5_ROOT)/include
# For the VASP-2-Wannier90 interface (optional)
#CPP_OPTIONS += -DVASP2WANNIER90
#WANNIER90_ROOT ?= /path/to/your/wannier90/installation
#LLIBS += -L$(WANNIER90_ROOT)/lib -lwannier
# For the fftlib library (experimental)
#CPP_OPTIONS+= -Dsysv
#FCL += fftlib.o
#CXX_FFTLIB = g++-13 -fopenmp -std=c++11 -DFFTLIB_THREADSAFE
#INCS_FFTLIB = -I./include -I$(FFTW_ROOT)/include
#LIBS += fftlib
#LLIBS += -ldl
"""This script grid-searches OMP_NUM_THREADS, NCORE and number of MPI processes for
minimal VASP runtime on a simple Si2 relaxation.
It writes the results to CSV and copies
markdown table to clipboard. Requires Python 3.10. To keep a log, invoke with
python vasp-perf-grid-search.py 2>&1 | tee Si-relax.log
To install OpenMPI's mpiexec on macOS, use Homebrew:
brew install open-mpi
"""
import os
import warnings
from itertools import product
from time import perf_counter, sleep
import pandas as pd
from atomate2.vasp.jobs.core import RelaxMaker
from atomate2.vasp.powerups import update_user_incar_settings
from jobflow import run_locally
from pandas.io.clipboard import clipboard_set
from pymatgen.core import Structure
warnings.filterwarnings("ignore") # hide pymatgen warnings clogging up the logs
VASP_BIN = "/Users/janosh/dev/vasp/compiled/vasp_std_6.3.0_m1"
results: list[tuple[int, int, int, float]] = []
# construct an FCC silicon structure
si_structure = Structure(
lattice=[[0, 2.73, 2.73], [2.73, 0, 2.73], [2.73, 2.73, 0]],
species=["Si", "Si"],
coords=[[0, 0, 0], [0.25, 0.25, 0.25]],
)
# grid-search OMP_NUM_THREADS, NCORE and number of MPI processes
try:
prod = list(product([1, 2, 4, 8], [1, 2], [2, 4]))
for idx, (n_proc, n_threads, n_core) in enumerate(prod, 1):
os.environ["OMP_NUM_THREADS"] = str(n_threads)
print(f"Run {idx} / {len(prod)}")
# make a relax job to optimize the structure
relax_job = RelaxMaker(
run_vasp_kwargs={"vasp_cmd": f"mpiexec -np {n_proc} {VASP_BIN}"},
).make(si_structure)
relax_job = update_user_incar_settings(relax_job, {"NCORE": n_core})
start = perf_counter()
# run the job
run_locally(relax_job, create_folders=True, ensure_success=True)
elapsed = perf_counter() - start
print(
f"run with {n_proc=}, {n_threads=}, {n_core=} took {elapsed:.1f} sec",
)
results += [(n_proc, n_threads, n_core, elapsed)]
print("Waiting 10 secs to cooldown...\n\n", flush=True)
sleep(10) # so every run is a bit more like the first
except KeyboardInterrupt: # exit gracefully on ctrl+c and write partial results
print("Job was interrupted")
df_perf = pd.DataFrame(results, columns=["n_proc", "n_threads", "n_core", "elapsed"])
df_perf.round(2).to_csv("vasp-perf-results.csv")
clipboard_set(df_perf.to_markdown())
@mumu6651
Copy link

Hi everyone,
I use MacBook M1 with Sonoma 14.5. I install VASP 6.1.0.
As following the guideline above, I have got these messages.

lex.yy.c:1291:32: warning: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
1291 | register char *source =
| ^~~~~~
g++-14 -D YY_parse_DEBUG=1 -c locproj.tab.c -o locproj.tab.o
g++-14 -D YY_parse_DEBUG=1 -c yywrap.c -o yywrap.o
rm -f libparser.a
ar vq libparser.a sites.o functions.o radial.o basis.o lex.yy.o locproj.tab.o yywrap.o
ar: creating archive libparser.a
q - sites.o
q - functions.o
q - radial.o
q - basis.o
q - lex.yy.o
q - locproj.tab.o
q - yywrap.o
rsync -ru ../../src/CUDA .
cp makefile.include CUDA
/Library/Developer/CommandLineTools/usr/bin/make -C CUDA -j1
make -f Makefile_CUDASDK verbose=1
mkdir -p lib
mkdir -p obj/x86_64/release
mkdir -p lib
cc -arch x86_64 -DKERNEL_DP -DKERNEL_ZP -DDEBUG -DUSE_STREAM -DMPICH_IGNORE_CXX_SEEK -D__PARA -I/include -I/include -I -I. -I/include -I/samples/common/inc -DUNIX -O3 -g -o obj/x86_64/release/fortran.c.o -c fortran.c
fortran.c:68:10: fatal error: 'cuda_runtime.h' file not found
#include <cuda_runtime.h>
^~~~~~~~~~~~~~~~
1 error generated.
make[3]: *** [obj/x86_64/release/fortran.c.o] Error 1
make[2]: *** [release] Error 2
make[1]: *** [CUDA] Error 2
make: *** [gpu] Error 2

could anyone please suggest me how to solve these errors?
right now, I have no idea.

Thank you in advance,
Ham

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment