For the Disc-ussions workshop at Monash University, July 15, 2019.
In this tutorial I will cover:
- How to install the HDF5 libraries with Homebrew (on macOS) or APT (in a Debian-based Linux distribution).
- How to compile Phantom with HDF5.
- How to convert a Phantom dump to HDF5, possibly for use with Plonk.
- How to read the data in the HDF5 file.
See the Phantom wiki for more documentation of HDF5 with Phantom.
What is HDF5? And why HDF5?
From Wikipedia:
Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data. Originally developed at the National Center for Supercomputing Applications, it is supported by The HDF Group, a non-profit corporation whose mission is to ensure continued development of HDF5 technologies and the continued accessibility of data stored in HDF.
HDF5 has the following nice features.
- It is widely available.
- It has bindings in C and Fortran.
- It has command line tools for reading data.
- It has Python packages to read data into NumPy arrays.
- It has compression built-in.
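The features above can be sketched with a minimal h5py example. It writes a small hierarchical file with a gzip-compressed dataset and reads it back into a NumPy array. The file name and group layout here are illustrative only, not the exact Phantom schema.

```python
import h5py
import numpy as np

# Write a small HDF5 file with a group hierarchy and a
# gzip-compressed dataset (compression is built into HDF5).
with h5py.File('example.h5', 'w') as f:
    header = f.create_group('header')
    header.create_dataset('time', data=0.0)
    xyz = np.random.rand(1000, 3)
    f.create_dataset('particles/xyz', data=xyz, compression='gzip')

# Read the data back: datasets slice directly into NumPy arrays.
with h5py.File('example.h5', 'r') as f:
    positions = f['particles/xyz'][:]
    print(positions.shape)
```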
On macOS you can install HDF5 with Homebrew.
brew install hdf5
The shared object library and include files are at /usr/local/opt/hdf5.
On Ubuntu, for example, you can install HDF5 with APT.
sudo apt install libhdf5-serial-dev
The location of the library is then /usr/lib/x86_64-linux-gnu/hdf5/serial.
Note: these packages use gfortran to compile the HDF5 source. Because Fortran .mod files are compiler-specific, you need to use gfortran to compile Phantom too.
The alternative is to compile HDF5 yourself with ifort. This is easy to do but I won't go over it in this tutorial.
Note: In what follows I assume Phantom is at ~/repos/phantom.
Writing HDF5 dump files is a compile time option and requires access to the Fortran HDF5 library.
To compile Phantom with HDF5 support, the HDF5ROOT Makefile variable must be set. For example, if HDF5 was installed with Homebrew on macOS, then HDF5ROOT=/usr/local/opt/hdf5.
For example, to compile phantom and phantomsetup for a dusty disc, do:
~/repos/phantom/scripts/writemake.sh dustydisc > Makefile
make SYSTEM=gfortran HDF5=yes HDF5ROOT=/usr/local/opt/hdf5 phantom setup
Now when you run phantom and phantomsetup they should read and write HDF5 dump files.
Note: if you have issues at runtime, you may need to add the HDF5 library location to LD_LIBRARY_PATH (on Linux) or DYLD_LIBRARY_PATH (on macOS). For example, on macOS with HDF5 installed with Homebrew:
export DYLD_LIBRARY_PATH=/usr/local/opt/hdf5/lib:$DYLD_LIBRARY_PATH
I assume the following:
- You have the HDF5 library installed on your machine, or on some remote machine. In my case, installed via Homebrew, it is located at /usr/local/opt/hdf5.
- You have a non-HDF5 Phantom dump, and you know how Phantom was compiled to produce that dump, i.e. what the Phantom SETUP Makefile variable was. In my case, this is dump_00000, and SETUP=dustydisc.
You can compile the phantom2hdf5 utility with
~/repos/phantom/scripts/writemake.sh dustydisc > Makefile
make SYSTEM=gfortran HDF5=yes PHANTOM2HDF5=yes HDF5ROOT=/usr/local/opt/hdf5 phantom2hdf5
Now pass a dump file (or a list of dump files) to the converter
./phantom2hdf5 dump_*
and you should get an HDF5 dump file for each file passed to the converter.
The HDF5 writer in Phantom takes advantage of gzip compression. The resulting HDF5 dump file should be smaller than the original Phantom dump.
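The size reduction from gzip compression can be demonstrated with h5py directly. This is a minimal sketch of HDF5's built-in compression, not Phantom's writer itself; the file names and the use of an all-zeros array (which compresses extremely well) are for illustration.

```python
import os

import h5py
import numpy as np

# Highly compressible data: an array of zeros.
data = np.zeros((1000, 3))

# Write the same data without and with gzip compression.
with h5py.File('plain.h5', 'w') as f:
    f.create_dataset('xyz', data=data)

with h5py.File('compressed.h5', 'w') as f:
    f.create_dataset('xyz', data=data, compression='gzip')

# The compressed file should be noticeably smaller.
print(os.path.getsize('plain.h5'), os.path.getsize('compressed.h5'))
```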
HDF5 comes with command line tools to investigate the contents of a file. h5ls and h5dump are very useful for quickly seeing the data contained in the file. For example
h5ls -r dump_00000.h5
recursively lists all datasets within the file. And
h5dump -d "/header/time" dump_00000.h5
prints the dump file time.
The Python package h5py comes with Anaconda. Alternatively you can install it with pip or Conda.
conda install h5py
To read a dump file
>>> import h5py
>>> f = h5py.File('dump_00000.h5', 'r')
Then you can access datasets like
>>> f['particles/xyz'][:]
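Putting this together, here is a sketch of the access pattern: scalar datasets are read with [()], arrays with [:]. It uses a small stand-in file (mock_dump.h5, with made-up values) so it runs without a real Phantom dump; the 'header/time' and 'particles/xyz' paths match those shown above.

```python
import h5py
import numpy as np

# Build a tiny stand-in file with the same layout as a Phantom HDF5
# dump. The name mock_dump.h5 and the values are illustrative only.
with h5py.File('mock_dump.h5', 'w') as f:
    f.create_dataset('header/time', data=1.5)
    f.create_dataset('particles/xyz', data=np.random.rand(100, 3))

# Read it back; the context manager closes the file afterwards.
with h5py.File('mock_dump.h5', 'r') as f:
    time = f['header/time'][()]   # scalar dataset
    xyz = f['particles/xyz'][:]   # (100, 3) NumPy array
    print(time, xyz.shape)
```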
Plonk is a Python package for analysis and visualisation of SPH data. It is open source and available at https://github.com/dmentipl/plonk.
It can read Phantom dump files in HDF5 format, but not in the traditional binary format.
It uses compiled Fortran from Splash to do interpolation from the data to a pixel grid.
Use Conda to install Plonk
conda install plonk --channel dmentipl