These last few months, I have tried a lot of difference formulation to calculate Standardized Precipitation Index (SPI) based on rainfall data in netCDF format, check below files as a background:
- SPI using IMERG netCDF (notebook)
- SPI using CHIRPS netCDF (how-to guideline)
The reason why I use rainfall in netCDF format in above files because the software to calculate SPI: climate-indices python package will only accept single netCDF as input, and the SPI script will read the netCDF input file based on time dimension.
Converting raster files into netCDF is easy using GDAL or other GIS software, but to make the time dimension enabled netCDF output and CF-Compliant is another story. Using NCO, I can easily rename any variable with ncrename and add attributes to the content with ncatted to fit the desired data structure. Yet, I think this is too much work if every time I update the data, I should update the attribute too.
Then I found this solution.
Now, I am able to convert bunch of GeoTIFF in a folder into a single netCDF file with time dimension enabled and CF-Compliant and can be accepted as input using climate-indices
software.
Lets try!
This step-by-step guide was tested using Macbook Pro, 2.9 GHz 6-Core Intel Core i9, 32 GB 2400 MHz DDR4, running on macOS Big Sur 11.1
For this test, I am working at /Users/bennyistanto/Temp/CHIRPS/SPI/Java directory. I have some folder inside this directory:
- Download
- Fitting
- Input_nc
- Input_TIF
- Output_nc
- Output_TEMP
- Output_TIF
- Script
- Subset
Feel free to use your own preferences for this setting.
The installation and configuration described below are performed using a bash/zsh shell on macOS. Windows users will need to install and configure a bash shell in order to follow the instruction below. Try to use Windows Subsystem for Linux for this purpose.
-
If you are new to using Bash refer to the following lessons with Software Carpentry: http://swcarpentry.github.io/shell-novice/
-
If you don't have Homebrew, you can install it by pasting below code in your macOS terminal.
bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
-
Install wget (for downloading data). Use Hombrew to install it by pasting below code in your macOS terminal.
brew install wget
-
Download and install Panoply Data Viewer from NASA GISS on your machine: macOS, Windows, Linux.
-
Download and install Anaconda Python on your machine: macOS, Windows, Linux.
The code for calculating SPI is written in Python 3. It is recommended to use either the Miniconda3 (minimal Anaconda) or Anaconda3 distribution. The below instructions will be Anaconda specific (although relevant to any Python virtual environment), and assume the use of a bash shell.
A new Anaconda environment can be created using the conda environment management system that comes packaged with Anaconda. In the following examples, I’ll use an environment named climate_indices
(any environment name can be used instead of climate_indices
) which will be created and populated with all required dependencies through the use of the provided setup.py file.
-
First, create the Python environment:
conda create -n climate_indices python=3.7
-
The environment created can now be ‘activated’:
conda activate climate_indices
-
Install climate-indices package. Once the environment has been activated then subsequent Python commands will run in this environment where the package dependencies for this project are present. Now the package can be added to the environment along with all required modules (dependencies) via pip:
pip install climate-indices
-
Install jupyter and other package using conda
conda install -c conda-forge jupyter numpy datetime gdal netCDF4
-
Install Climate Data Operator (CDO) from Max-Planck-Institut für Meteorologie using conda
conda install -c conda-forge cdo
-
Install netCDF Operator (NCO) using conda
conda install -c conda-forge nco
SPI requires monthly rainfall data, and there are many source providing global high-resolution gridded monthly rainfall data:
For this step-by-step guideline, I will use CHIRPS monthly data in GeoTIFF format and the case study is Java island - Indonesia.
Why CHIRPS? It is produced at 0.05 x 0.05 degree spatial resolution, make it CHIRPS is the highest gridded rainfall data, and long-term historical data from 1981 – now.
-
Navigate to
Download
folder in the working directory. Download usingwget
all CHIRPS monthly data in GeoTIFF format from Jan 1981 to Dec 2020 (this is lot of data ~7GB zipped files, and become 27GB after extraction, please make sure you have bandwidth and unlimited data package):wget -r https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_monthly/tifs/
-
Gunzip all the downloaded files
gunzip *.gz
-
Download the Java boundary shapefile http://on.istan.to/365PSyH. And save it in
Subset
directory then unzip it. -
Still in your
Download
directory, Clip your area of interest using Java boundary and save it toInput_TIF
directory. I will use gdalwarp command from GDAL to clip all GeoTIFF files in a folder.for i in `find *.tif`; do gdalwarp --config GDALWARP_IGNORE_BAD_CUTLINE YES -srcnodata NoData -dstnodata -9999 -cutline ../Subset/java_bnd_chirps_subset.shp -crop_to_cutline $i ../Input_TIF/java_cli_$i; done
if you are using Windows, you can follow below script.
for %i IN (*.tif) do gdalwarp --config GDALWARP_IGNORE_BAD_CUTLINE YES -srcnodata NoData -dstnodata -9999 -cutline ../Subset/java_bnd_chirps_subset.shp -crop_to_cutline %i ../Input_TIF/java_%i
If you have limited data connection or lazy to download ~7GB and process ~27GB data, you can get pre-processed clipped data for Java covering Jan 1981 to Dec 2020, with file size ~6.8MB. Link: https://on.istan.to/3iLu68v
-
Download python script/notebook that I use to convert GeoTIFF in a folder to single netCDF, save it to
Script
folder.You MUST adjust the folder location (replace /path/to/directory/ with yours) in line 13 and 114.
If you are using other data source (I assume all the data in WGS84), you need to adjust few code in:
- Line 31: folder location.
- Line 40: start of the date
- Line 44: output name
- Line 53: date attribute
- Line 85-88: bounding box
- Line 110: output filename structure
- Line 114: folder location
- Line 120-122: date character location in a filename
-
After everything is set, then you can execute the translation process (choose one or you can try both)
-
Using Python in Terminal, navigate to your
Script
directory, typepython tiff2nc.py
Wait for a few moments, you will get the output
java_cli_chirps_1months_1981_2020.nc
. You will find this file insideInput_TIF
folder. Move it toInput_nc
folder. -
Using Jupyter, make sure you still inside conda
climate_indices
environment, type in Terminaljupyter notebook
Navigate to your notebook directory (where you put *.ipynb file), run Cell by Cell until completed. Wait for a few moments, you will get the output
java_cli_chirps_1months_1981_2020.nc
. You will find this file insideInput_TIF
folder. Move it toInput_nc
folder.
-
-
You also can get this data: java_cli_chirps_1months_1981_2020.nc via this link http://on.istan.to/3iRDl7e
The climate-indices software enables the user to calculate SPI using any gridded netCDF dataset, However, there are certain specifications for input files that vary based on input type.
Precipitation unit must be written as millimeters
, milimeter
, mm
, inches
, inch
or in
.
Data dimension and order must be written as lat
, lon
, time
or time
, lat
, lon
.
-
I have try to make script in Step 3 are following above specification. To make sure everything is correct, in your Terminal - navigate to your directory where you save java_cli_chirps_1months_1981_2020.nc file:
Input_nc
. Then typencdump -h java_cli_chirps_1months_1981_2020.nc
From above picture, I can say:
- Time dimension is enabled,
480
is total months from Jan 1981 to Dec 2020 - Data dimension and order are following the specification
time
,lat
,lon
- The unit is in
mm
So, everything is correct and I am ready to calculate SPI. Make sure in your Terminal still inside climate_indices environment.
Other requirements and options related to the indices calculation, please follow https://climate-indices.readthedocs.io/en/latest/#indices-processing
- Time dimension is enabled,
-
In order to pre-compute fitting parameters for later use as inputs to subsequent SPI calculations I can save both gamma and Pearson distribution fitting parameters to NetCDF, and later use this file as input for SPI calculations over the same calibration period.
In your Terminal, run the following code.
spi --periodicity monthly --scales 1 2 3 6 9 12 24 36 48 60 72 --calibration_start_year 1981 --calibration_end_year 2020 --netcdf_precip /Users/bennyistanto/Temp/CHIRPS/Java/Input_nc/java_cli_chirps_1months_1981_2021.nc --var_name_precip precip --output_file_base /Users/bennyistanto/Temp/CHIRPS/Java/Output_nc/java_CHIRPS --multiprocessing all --save_params /Users/bennyistanto/Temp/CHIRPS/Java/Fitting/java_CHIRPS_fitting.nc --overwrite
The above command will compute SPI (standardized precipitation index, both gamma and Pearson Type III distributions) from an input precipitation dataset (in this case, CHIRPS precipitation dataset). The input dataset is monthly rainfall accumulation data and the calibration period used will be Jan-1981 through Dec-2020. The index will be computed at 1,2,3,6,9,12,24,36,48,60 and 72-month timescales. The output files will be
<out_dir>/java_CHIRPS_spi_gamma_xx.nc
, and<out_dir>/java_CHIRPS_spi_pearson_xx.nc
.The output files will be:
- 1-month:
/Output_nc/java_CHIRPS_spi_gamma_01.nc
- 2-month:
/Output_nc/java_CHIRPS_spi_gamma_02.nc
- 3-month:
/Output_nc/java_CHIRPS_spi_gamma_03.nc
- 6-month:
/Output_nc/java_CHIRPS_spi_gamma_06.nc
- 9-month:
/Output_nc/java_CHIRPS_spi_gamma_09.nc
- 12-month:
/Output_nc/java_CHIRPS_spi_gamma_12.nc
- 24-month:
/Output_nc/java_CHIRPS_spi_gamma_24.nc
- 36-month:
/Output_nc/java_CHIRPS_spi_gamma_36.nc
- 48-month:
/Output_nc/java_CHIRPS_spi_gamma_48.nc
- 60-month:
/Output_nc/java_CHIRPS_spi_gamma_60.nc
- 72-month:
/Output_nc/java_CHIRPS_spi_gamma_72.nc
- 1-month:
/Output_nc/java_CHIRPS_spi_pearson_01.nc
- 2-month:
/Output_nc/java_CHIRPS_spi_pearson_02.nc
- 3-month:
/Output_nc/java_CHIRPS_spi_pearson_03.nc
- 6-month:
/Output_nc/java_CHIRPS_spi_pearson_06.nc
- 9-month:
/Output_nc/java_CHIRPS_spi_pearson_09.nc
- 12-month:
/Output_nc/java_CHIRPS_spi_pearson_12.nc
- 24-month:
/Output_nc/java_CHIRPS_spi_pearson_24.nc
- 36-month:
/Output_nc/java_CHIRPS_spi_pearson_36.nc
- 48-month:
/Output_nc/java_CHIRPS_spi_pearson_48.nc
- 60-month:
/Output_nc/java_CHIRPS_spi_pearson_60.nc
- 72-month:
/Output_nc/java_CHIRPS_spi_pearson_72.nc
Parallelization will occur utilizing all CPUs.
For small area of interest, the calculation will fast and don’t take much time. Below is example if you processed bigger area:
- Monthly IMERG data, global coverage 180W - 180E, 60N - 60S, 0.1 deg spatial resolution. It takes almost 9-hours to process SPI 1-72 months.
Output gamma and pearson file https://on.istan.to/2MhVnTP
Fitting file http://on.istan.to/3c6ZLjq
- 1-month:
What if the new data is coming (Jan 2021)? Should I re-run again for the whole periods, 1981 to date? That's not practical as it requires large storage and time processing if you do for bigger coverage (country or regional analysis).
Updating SPI up to SPI72, I should have data at least 6 years back (2014) from the latest (Jan 2021). In order to avoid computation for the whole periods (1981-2021), I will process data data only for year 2014 to 2021.
After that, I continue the process following Step 3.
Step 4 demonstrates how distribution fitting parameters can be saved as NetCDF. This fittings NetCDF can then be used as pre-computed variables in subsequent SPI computations. Initial command computes both distribution fitting values and SPI for various month scales.
The distribution fitting variables are written to the file specified by the
--save_params
option.The second command also computes SPI but instead of computing the distribution fitting values it loads the pre-computed fitting values from the NetCDF file specified by the
--load_params
option.See below code:
spi --periodicity monthly --scales 1 2 3 6 9 12 24 36 48 60 72 --calibration_start_year 1981 --calibration_end_year 2020 --netcdf_precip /Users/bennyistanto/Temp/CHIRPS/Java/Input_nc/java_cli_chirps_1months_2014_2021.nc --var_name_precip precip --output_file_base /Users/bennyistanto/Temp/CHIRPS/Java/Output_nc/java_CHIRPS --multiprocessing all --load_params /Users/bennyistanto/Temp/CHIRPS/Java/Fitting/java_CHIRPS_fitting.nc
Let see the result.
-
From the
Output_nc
directory, right-click filejava_CHIRPS_spi_gamma_3_month.nc
(you can download this file from this link: https://on.istan.to/2MhVnTP and Open With Panoply. -
From the Datasets tab select spi_gamma_3_month and click Create Plot
-
In the Create Plot window select option Georeferenced Longitude-Latitude.
-
When the Plot window opens:
- Array tab: Change the time into 469 to view data on Jan 2020
- Scale tab: Change value on Min -3, Max 3, Major 6, Color Table CB_RdBu_09.cpt
- Map tab: Change value on Center on Lon 110.0 Lat -7.5, then Zoom in the map through menu-editor Plot > Zoom Plot In few times until Indonesia appear proportionally. Set grid spacing 2.0 and Labels on every grid lines.
- Overlays tab: Change Overlay 1 to MWDB_Coasts_Countries_1.cnob
Conversion of the result into GeoTIFF format, CDO required the variable should be in time
, lat
, lon
, while the output from SPI: java_CHIRPS_spi_xxxxx_x_month.nc
in lat
, lon
, time
, you can check this via ncdump -h file.nc
-
Navigate your Terminal to folder
Output_nc
-
Let's re-order the variables into
time
,lat
,lon
usingncpdq
command from NCO and save the result to folderOutput_TEMP
ncpdq -a time,lat,lon java_CHIRPS_spi_gamma_3_month.nc ../Output_TEMP/java_CHIRPS_spi_gamma_3_month_rev.nc
-
Navigate your Terminal to folder Output_TEMP
-
Check result and metadata to make sure everything is correct.
ncdump -h java_CHIRPS_spi_gamma_3_month_rev.nc
-
Then convert all SPI value into GeoTIFF with
time
dimension information as thefilename
using CDO and GDAL. -
Navigate your Terminal to folder
Output_TEMP
and save the result to folderOutput_TIF
for t in `cdo showdate java_CHIRPS_spi_gamma_3_month_rev.nc`; do cdo seldate,$t java_CHIRPS_spi_gamma_3_month_rev.nc dummy.nc gdal_translate -of GTiff -a_ullr 105.05 -5.05 116.25 -8.8 -a_srs EPSG:4326 -co COMPRESS=LZW -co PREDICTOR=1 dummy.nc ../Output_TIF/$t.tif done
FINISH
Congrats, now you are able to calculate SPI based on monthly rainfall in GeoTIFF format.