This document will walk you through compiling your own scientific python distribution from source, without sudo, on a linux machine. The core numpy and scipy libraries will be linked against Intel MKL for maximum performance.
This procedure has been tested with Rocks Cluster Linux 6.0 (Mamba) and CentOS 6.3.
Most scientific python software has not been ported to python3 yet, so we're going to use the latest and final version of python2.
To get started, download and compile python:
wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
tar -xzvf Python-2.7.3.tgz
cd Python-2.7.3
./configure --prefix=$HOME/local/python
make
make install
export PATH=$HOME/local/python/bin:$PATH
cd ..
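At this point, it's worth double-checking that your shell picks up the freshly built interpreter rather than the system one. Something along these lines should report a path under $HOME/local/python and version 2.7.3:
which python
python --version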
The core scientific python packages are numpy and scipy. To get the most performance, we want to build them with intel compilers and link them against the Intel Math Kernel Library (MKL), which contains the fastest linear algebra routines.
The first step is to download the intel compilers. They contain MKL as well -- you don't need to download a separate package for MKL.
- Intel® Fortran Composer XE 2013 for Linux
- Intel® C++ Composer XE 2013 for Linux
The compilers are free for academic use. You can sign the academic license and get the download links at this website.
When you sign up, they send you an email containing the links, and you can download files called l_fcompxe_2013.2.146.tgz for the fortran compiler, and l_ccompxe_2013.2.146.tgz for the C/C++ compiler. Untar these files.
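For example, assuming both archives are sitting in your current directory:
tar -xzvf l_ccompxe_2013.2.146.tgz
tar -xzvf l_fcompxe_2013.2.146.tgz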
The installation is done via an interactive shell script. When it asks, change the install location to /home/<your_user_name>/opt.
cd l_ccompxe_2013.2.146
./install.sh
Install the fortran compiler as well, also using /home/<your_user_name>/opt as the install path:
cd ../l_fcompxe_2013.2.146
./install.sh
To make icc, ifort, and the MKL libraries available to other programs, you'll need to add some entries to your ~/.bashrc file:
# add the intel compilers to the PATH
export PATH=$HOME/opt/intel/bin:$PATH
# add MKL and the compiler libs to the path
export LD_LIBRARY_PATH=$HOME/opt/intel/mkl/lib/intel64/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/opt/intel/lib/intel64/:$LD_LIBRARY_PATH
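After reloading your shell configuration, a quick sanity check is to ask both compilers for their version strings; each command should succeed and report the 2013 release you just installed:
source ~/.bashrc
which icc ifort
icc --version
ifort --version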
Intel has some good directions for installing numpy and scipy with MKL, available here.
The latest version of numpy, 1.7.0, can be downloaded from the python package index at http://pypi.python.org/pypi/numpy/1.7.0.
Inside the numpy-1.7.0 directory, make a file called site.cfg, containing these lines:
[mkl]
library_dirs = /home/<your_username>/opt/intel/mkl/composer_xe_2013/lib/intel64
include_dirs = /home/<your_username>/opt/intel/mkl/include
mkl_libs = mkl_rt
lapack_libs =
As the article describes, you want to add some compiler flags to how icc is invoked. Open the numpy/distutils/intelccompiler.py file in your editor, and edit the line that reads cc_exe='icc' (probably line 7) to instead read cc_exe='icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'.
Now execute the python install command, specifying the intelem compiler:
python setup.py config --compiler=intelem build_clib --compiler=intelem build_ext --compiler=intelem install
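Once the build finishes, you can check that numpy actually picked up MKL. numpy.show_config() prints the libraries it was linked against; you should see mkl_rt listed rather than the reference BLAS/LAPACK:
python -c "import numpy; numpy.show_config()"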
The latest version of scipy, currently 0.11, can be downloaded from the python package index at http://pypi.python.org/pypi/scipy/0.11.0.
The install command is:
python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install
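As a quick check that the build succeeded, import scipy and print its version (the full test suite is run further below):
python -c "import scipy; print scipy.__version__"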
Now that we've got numpy and scipy installed, we can add in some more packages. One of the key packages, pytables, requires a C library, hdf5, that might not be installed by default.
hdf5 is a C library for efficient storage of hierarchical datasets. Unfortunately, it's not installed by default on all clusters. We'll install it to $HOME/opt/hdf5.
wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.10-patch1.tar.gz
tar -xzvf hdf5-1.8.10-patch1.tar.gz
cd hdf5-1.8.10-patch1
./configure --prefix=$HOME/opt/hdf5
make
make install
You should add the following lines to your ~/.bashrc file:
# add the hdf5 libraries and executables to the paths
export LD_LIBRARY_PATH=$HOME/opt/hdf5/lib:$LD_LIBRARY_PATH
export PATH=$HOME/opt/hdf5/bin:$PATH
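After sourcing your ~/.bashrc, the hdf5 command line tools should resolve to the new install; for example:
source ~/.bashrc
which h5dump h5cc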
We're going to need to get one python package called setuptools manually.
wget http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
tar -xzvf setuptools-0.6c11.tar.gz
cd setuptools-0.6c11
python setup.py install
cd ..
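If the install worked, the easy_install script should now live in your python's bin directory. A quick check (the pkg_resources query is just one way to confirm the installed version):
which easy_install
python -c "import pkg_resources; print pkg_resources.get_distribution('setuptools').version"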
Using setuptools we can automatically install some other python packages. Here are commands to paste into the shell that will install a wide range of core scientific python packages.
# First, get pip, a better python package manager
easy_install pip
# Virtualenv helps to manage dependencies via isolated virtual environments
pip install virtualenv
pip install virtualenvwrapper
# For command line interaction, we want the GNU readline wrappers
pip install readline
# Nose is a widely used unit testing library
pip install nose
# Cython is a python/c hybrid language, and numexpr is an inline
# mathematical expression compiler. Both are required by `tables`
pip install cython
pip install numexpr
# `tables` gives us hierarchical data sets. note: we have to tell
# it where to find our hdf5 libraries
export HDF5_DIR=$HOME/opt/hdf5
pip install tables
# Matplotlib is the standard 2d plotting library
pip install matplotlib
# IPython is great for interactive data analysis
pip install ipython
# pandas builds on numpy with a powerful numpy-backed DataFrame object for
# smart data analysis. sklearn and statsmodels give us key statistical
# algorithms
pip install pandas
pip install scikit-learn
pip install statsmodels
Here are a few commands that can be executed from the shell to verify your installation:
python -c "import numpy; numpy.test()"
python -c "import scipy; scipy.test()"
python -c "import tables; tables.test()"
nosetests pandas
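As a rough (and unscientific) check that the MKL-backed linear algebra is doing its job, you can time a large matrix multiply and vary the MKL_NUM_THREADS environment variable, which controls how many threads MKL uses. The sizes and thread count below are just illustrative:
export MKL_NUM_THREADS=4
python -c "
import time
import numpy as np
a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)
start = time.time()
np.dot(a, b)
print 'matrix multiply took %.2f seconds' % (time.time() - start)
"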
For reference, here are the steps I used to build some molecular dynamics codes (OpenMPI, FFTW, GROMACS, and Amber) with this toolchain.
wget http://www.open-mpi.org/software/ompi/v1.6/downloads/openmpi-1.6.3.tar.gz
tar -xzvf openmpi-1.6.3.tar.gz
cd openmpi-1.6.3
# vt is not compatible with recent versions of CUDA
./configure --disable-vt --prefix=$HOME/opt/openmpi163
make
make install
cd ..
# rebuild OpenMPI with the intel compilers, into a separate prefix
cd openmpi-1.6.3
# vt is not compatible with recent versions of CUDA
./configure --disable-vt --prefix=$HOME/opt/intel/openmpi/1.6.3/ CC=icc CXX=icpc F77=ifort FC=ifort
make -j4
make install
Add the following to your ~/.bashrc file:
# add OpenMPI to the paths
export LD_LIBRARY_PATH=$HOME/opt/openmpi163/lib:$LD_LIBRARY_PATH
export PATH=$HOME/opt/openmpi163/bin:$PATH
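After reloading ~/.bashrc, the Open MPI wrappers and tools should resolve to the new install; ompi_info reports the build configuration:
source ~/.bashrc
which mpicc mpirun
ompi_info | head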
wget http://www.fftw.org/fftw-3.3.3.tar.gz
tar -xzvf fftw-3.3.3.tar.gz
cd fftw-3.3.3
./configure --prefix=$HOME/opt/fftw --enable-float --enable-shared --enable-sse2
make
make install
cd ..
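Since the gromacs 4.6 cmake build below locates fftw through pkg-config, it's worth confirming that the single-precision library registered itself. This assumes pkg-config is available and that the .pc files landed in $HOME/opt/fftw/lib/pkgconfig:
PKG_CONFIG_PATH=$HOME/opt/fftw/lib/pkgconfig pkg-config --modversion fftw3f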
wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-4.5.5.tar.gz
tar -xzvf gromacs-4.5.5.tar.gz
cd gromacs-4.5.5
export CPPFLAGS="-I$HOME/opt/fftw/include"
export LDFLAGS="-L$HOME/opt/fftw/lib"
./configure --prefix=$HOME/opt/gromacs455 --enable-mpi --enable-shared --enable-threads --with-fft=fftw3
make
make install
cd ..
wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-4.6.tar.gz
tar -xzvf gromacs-4.6.tar.gz
mkdir build
cd build
export PKG_CONFIG_PATH=$HOME/opt/fftw/lib/pkgconfig/:$PKG_CONFIG_PATH
cmake -DGMX_MPI=ON -DCMAKE_INSTALL_PREFIX:PATH=$HOME/opt/gromacs46 ../gromacs-4.6
make -j4
make install
cd ..
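gromacs installs a GMXRC script that sets up its environment; after sourcing it you can check which mdrun you got. With GMX_MPI=ON the binary typically carries an _mpi suffix, so adjust the name if your build used different settings:
source $HOME/opt/gromacs46/bin/GMXRC
which mdrun_mpi
mdrun_mpi -version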
Amber isn't open source, so you're going to have to obtain the source code yourself. It comes in two files, Amber12.tar.bz2 and AmberTools12.tar.bz2.
tar -xjvf Amber12.tar.bz2
tar -xjvf AmberTools12.tar.bz2
cd amber12
export AMBERHOME=`pwd`
export MKL_HOME=$HOME/opt/intel/mkl/
export CUDA_HOME=/usr/local/cuda/
export LD_LIBRARY_PATH=$HOME/opt/intel/openmpi/1.6.3/lib/:$LD_LIBRARY_PATH
export PATH=$HOME/opt/intel/openmpi/1.6.3/bin/:$PATH
echo 'Y' | ./configure -cuda -noX11 intel
make install
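Amber ships with a test suite that exercises the build. Assuming the standard Amber12 layout, something along these lines should run it (the exact make targets vary between Amber versions and serial/parallel/CUDA builds):
cd $AMBERHOME
make test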