Skip to content

Instantly share code, notes, and snippets.

@bennyistanto
Created January 26, 2024 05:48
Show Gist options
  • Save bennyistanto/38997cf5cbaf70b05a3d8f894d0533b0 to your computer and use it in GitHub Desktop.
Save bennyistanto/38997cf5cbaf70b05a3d8f894d0533b0 to your computer and use it in GitHub Desktop.
Zonal statistics (regular and categorical) using Earth Engine Python API
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"id": "dc92de18-88f6-485b-88c7-e620489b1481",
"metadata": {},
"source": [
"# Zonal Statistics on GEE"
]
},
{
"cell_type": "markdown",
"id": "3f838da7-75d0-41a2-85d2-6ad205761a4b",
"metadata": {},
"source": [
"A zonal statistics operation is one that calculates statistics on cell values of a raster (a value raster) within the zones defined by another datasets. We will utilise raster data from Google Earth Engine, and do the zonal statistics process in a notebook utilizing Earth Engine Python API.\n",
"\n",
"Contact:</br>\r\n",
"Benny Istanto. Climate Geographer.</br>\r\n",
"GOST/DECSC/DECDG, The World Bank</br>\r\n",
"[email protected]\r\n",
"\r\n",
"If you found a mistake, send me an email.\r\n",
"\r\n",
"License: </br>\r\n",
"This script is in the public domain, free from copyrights or restrictions.\r\n",
"\r\n",
"Changelog:</br>\r\n",
"2024-01-25 first draft \r\n",
"rst draft \r\n",
"d panel\r\n",
" * "
]
},
{
"cell_type": "markdown",
"id": "c4a9d29a-2b3e-4943-9585-f7c3314e5500",
"metadata": {},
"source": [
"## 1. Setup\n",
"\n",
"The Earth Engine Python API needs to be authenticated with Google Account. If you does not have an access to the Google Earth Engine (GEE), you need to sign up first here: https://code.earthengine.google.com/register"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "161dca30-56eb-434a-8328-4504b6b2e0e4",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<p>To authorize access needed by Earth Engine, open the following\n",
" URL in a web browser and follow the instructions:</p>\n",
" <p><a href=https://code.earthengine.google.com/client-auth?scopes=https%3A//www.googleapis.com/auth/earthengine%20https%3A//www.googleapis.com/auth/devstorage.full_control&request_id=qYLZnDjp8W2sMWtTTGhG9Gyq1YWk9HdVKwj7UgOaRv4&tc=dJAl5MlV4uw9ckSqHeKY6L4eT8caVrk6pndscD7IcwQ&cc=TOEyD3jkycCaPDsKmWfZxUKOUgcC6OdWjZm8pHRIAU4>https://code.earthengine.google.com/client-auth?scopes=https%3A//www.googleapis.com/auth/earthengine%20https%3A//www.googleapis.com/auth/devstorage.full_control&request_id=qYLZnDjp8W2sMWtTTGhG9Gyq1YWk9HdVKwj7UgOaRv4&tc=dJAl5MlV4uw9ckSqHeKY6L4eT8caVrk6pndscD7IcwQ&cc=TOEyD3jkycCaPDsKmWfZxUKOUgcC6OdWjZm8pHRIAU4</a></p>\n",
" <p>The authorization workflow will generate a code, which you should paste in the box below.</p>\n",
" "
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"Command '/usr/bin/x-www-browser' requires the firefox snap to be installed.\n",
"Please install it with:\n",
"\n",
"snap install firefox\n",
"\n"
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"Enter verification code: 4/1AfJohXljoOu5mDzvL-iARAtRaTTztGjJVamKtUrKRf4GfB20oMQD4bvknU8\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Successfully saved authorization token.\n"
]
}
],
"source": [
"# Authenticate the GEE\n",
"import ee\n",
"ee.Authenticate()\n"
]
},
{
"cell_type": "markdown",
"id": "f82fba1a-03d1-4225-bbf3-874c90732bc3",
"metadata": {},
"source": [
"Check if this working by running below:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cee65e89-1bc4-4a73-894b-2cc26ca1f6e9",
"metadata": {},
"outputs": [],
"source": [
"ee.Initialize()\n"
]
},
{
"cell_type": "markdown",
"id": "788961d2-096f-440f-93bc-e8d344fab022",
"metadata": {},
"source": [
"## 2. Boundary\r\n",
"\r\n",
"Let's define the boundary that will use as the zones.\r\n",
"\r\n",
"You can:\r\n",
"\r\n",
"* Select a full country by changing the countryname variable, and select the zones field.\r\n",
"* Use the Geometry Polygon as the study area.\r\n",
"* Load a shapefile stored in the user Assets collection.ection"
]
},
{
"cell_type": "markdown",
"id": "6a474c82-7d18-4fac-b689-d4ed24b79aa2",
"metadata": {},
"source": [
"### 2.1. Define AoI by Country Name and Administrative Level"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "04d28304-0234-413b-bae1-eac935e8f276",
"metadata": {},
"outputs": [],
"source": [
"# Define the FAO GAUL 2015 dataset\n",
"boundary = ee.FeatureCollection(\"FAO/GAUL/2015/level2\")\n",
"\n",
"# User inputs\n",
"# To check list of country name used by GAUL data\n",
"# https://data.apps.fao.org/catalog/dataset/gaul-codes\n",
"country_name = 'Indonesia' # Example country name\n",
"admin_level = 'ADM1_CODE' # Options: 'ADM1_CODE' or 'ADM2_CODE'\n",
"\n",
"# Filter by country name\n",
"aoi_country = boundary.filter(ee.Filter.eq('ADM0_NAME', country_name))\n",
"\n",
"# Filter by administrative level within the country\n",
"if admin_level == 'ADM1_CODE':\n",
" aoi = aoi_country.distinct('ADM1_CODE')\n",
"elif admin_level == 'ADM2_CODE':\n",
" aoi = aoi_country.distinct('ADM2_CODE')\n"
]
},
{
"cell_type": "markdown",
"id": "d6109cce-8923-41a8-a35e-8e1f624ad52c",
"metadata": {},
"source": [
"### 2.2. Use a User-Defined Geometry\r\n",
"\n",
"For this method, you need to define the geometry in Python. This is typically done using ee.Geometry object. Assuming you have a defined geometr.\n",
":"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "caf170ec-281a-4ef1-a89a-38d6d801d36e",
"metadata": {},
"outputs": [],
"source": [
"# # Example geometry (replace with your geometry)\n",
"# geometry = ee.Geometry.Polygon([\n",
"# [[-103.80, 44.18], [-103.80, 43.78],\n",
"# [-104.06, 43.78], [-104.06, 44.18]]\n",
"# ])\n",
"# aoi = ee.FeatureCollection(geometry)\n"
]
},
{
"cell_type": "markdown",
"id": "b6d4a4e2-6e63-43da-b4dc-404730d468dc",
"metadata": {},
"source": [
"### 2.3. Load a Shapefile from Assets Collection\n",
"\r\n",
"This method assumes that you have uploaded a shapefile to your GEE Asset\n",
":"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7beac031-a310-44c9-bf7f-bd2ba6262214",
"metadata": {},
"outputs": [],
"source": [
"# # User input for asset path and column name\n",
"# asset_path = 'users/your_username/your_asset_name'\n",
"# column_name = 'your_column_name' # e.g., 'ADM1_CODE'\n",
"\n",
"# # Load the shapefile from Assets and select the specified column\n",
"# aoi = ee.FeatureCollection(asset_path).distinct(column_name)\n"
]
},
{
"cell_type": "markdown",
"id": "4365064d-6ce4-430a-8bfa-dd19056064fc",
"metadata": {},
"source": [
"## 3. Regular Zonal Statistics\n",
"\n",
"The output from regular zonal statistics are similar to what ArcPy Zonal Statistics do, or `rasterstats` python library with `categorical=False`.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "d1da1cbb-c7cd-4c1f-a867-f8cfd277f834",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processing image 1 of 13...\n",
"Processing image 2 of 13...\n",
"Processing image 3 of 13...\n",
"Processing image 4 of 13...\n",
"Processing image 5 of 13...\n",
"Processing image 6 of 13...\n",
"Processing image 7 of 13...\n",
"Processing image 8 of 13...\n",
"Processing image 9 of 13...\n",
"Processing image 10 of 13...\n",
"Processing image 11 of 13...\n",
"Processing image 12 of 13...\n",
"Processing image 13 of 13...\n",
"All tasks submitted.\n"
]
}
],
"source": [
"# Define the image collection and dates\n",
"raster = ee.ImageCollection(\"MODIS/061/MOD13Q1\").select('EVI')\n",
"start_date = '2020-01-01' # Example start date\n",
"end_date = '2021-01-01' # Example end date\n",
"frequency = 'monthly' # Options: 'default', 'monthly', 'annual'\n",
"stat_operation = 'mean' # Options: 'mean', 'median', 'min', 'max', 'stdDev'\n",
"\n",
"# Function to calculate statistics\n",
"# Use include_sum=True or sum_only=True in the function call if SUM is needed\n",
"# for population related data that required SUM\n",
"def calc_stats(feature, img, include_sum=False, sum_only=False):\n",
" if sum_only:\n",
" # Use only SUM reducer\n",
" reducers = ee.Reducer.sum()\n",
" else:\n",
" # Use combination of reducers\n",
" reducers = ee.Reducer.mean().combine(\n",
" reducer2=ee.Reducer.minMax(),\n",
" sharedInputs=True\n",
" ).combine(\n",
" reducer2=ee.Reducer.median(),\n",
" sharedInputs=True\n",
" ).combine(\n",
" reducer2=ee.Reducer.stdDev(),\n",
" sharedInputs=True\n",
" )\n",
" if include_sum:\n",
" reducers = reducers.combine(reducer2=ee.Reducer.sum(), sharedInputs=True)\n",
"\n",
" stats = img.reduceRegion(**{\n",
" 'reducer': reducers,\n",
" 'geometry': feature.geometry(),\n",
" 'scale': 250,\n",
" 'maxPixels': 1e9\n",
" })\n",
" return feature.set(stats)\n",
"\n",
"# Filter the collection\n",
"filtered_raster = raster.filterDate(start_date, end_date)\n",
"\n",
"# Convert start and end dates to ee.Date objects\n",
"start_date_ee = ee.Date(start_date)\n",
"end_date_ee = ee.Date(end_date)\n",
"\n",
"# Adjusted Function to Perform Temporal Aggregation\n",
"def aggregate_temporally(collection, freq):\n",
" if freq == 'default':\n",
" return collection\n",
" elif freq == 'monthly':\n",
" start_year = start_date_ee.get('year').getInfo()\n",
" end_year = end_date_ee.get('year').getInfo()\n",
" start_month = start_date_ee.get('month').getInfo()\n",
" end_month = end_date_ee.get('month').getInfo()\n",
"\n",
" def monthlyComposite(year, month):\n",
" y_str = ee.Number(year).format()\n",
" # Add leading zero for single-digit months\n",
" m_str = ee.Number(month).format() if month > 9 else ee.String('0').cat(ee.Number(month).format())\n",
" return collection.filter(ee.Filter.calendarRange(year, year, 'year')) \\\n",
" .filter(ee.Filter.calendarRange(month, month, 'month')) \\\n",
" .mean() \\\n",
" .set('system:index', y_str.cat('_').cat(m_str))\n",
"\n",
" composites = []\n",
" for year in range(start_year, end_year + 1):\n",
" for month in range(1, 13):\n",
" if year == start_year and month < start_month:\n",
" continue\n",
" if year == end_year and month > end_month:\n",
" continue\n",
" composite = monthlyComposite(year, month)\n",
" composites.append(composite)\n",
"\n",
" return ee.ImageCollection(composites)\n",
"\n",
" elif freq == 'annual':\n",
" years = ee.List.sequence(start_date_ee.get('year'), end_date_ee.get('year'))\n",
" return ee.ImageCollection.fromImages(years.map(annualComposite))\n",
"\n",
"# Apply the temporal aggregation\n",
"temporal_collection = ee.ImageCollection(aggregate_temporally(filtered_raster, frequency))\n",
"\n",
"# Initialize a counter for progress tracking\n",
"total_images = temporal_collection.size().getInfo()\n",
"current_image = 1\n",
"\n",
"# Specify the folder name where the CSV files will be saved\n",
"folder_name = 'zonal_gee'\n",
"\n",
"# Map the function over the FeatureCollection for each image\n",
"for img_id in temporal_collection.aggregate_array('system:index').getInfo():\n",
" print(f\"Processing image {current_image} of {total_images}...\")\n",
" image = temporal_collection.filter(ee.Filter.eq('system:index', img_id)).first()\n",
" \n",
" if isinstance(image, ee.Image):\n",
" stats_fc = aoi.map(lambda f: calc_stats(f, image, include_sum=False, sum_only=False))\n",
"\n",
" # Format the file name using a naming convention\n",
" # Example: 'EVI_Stats_2020_01' for an image from January 2020\n",
" file_description = f'EVI_Stats_{img_id.replace(\"_\", \"-\")}'\n",
"\n",
" # Export the FeatureCollection to a CSV file in the specified folder\n",
" task = ee.batch.Export.table.toDrive(**{\n",
" 'collection': stats_fc,\n",
" 'description': file_description,\n",
" 'folder': folder_name,\n",
" 'fileFormat': 'CSV'\n",
" })\n",
" task.start()\n",
" \n",
" else:\n",
" print(f\"Skipped: Image with ID {img_id} is not valid.\")\n",
"\n",
" current_image += 1\n",
"\n",
"print(\"All tasks submitted.\")\n"
]
},
{
"cell_type": "markdown",
"id": "052570f3-d71d-4a3a-8b1e-2c6887fc015d",
"metadata": {},
"source": [
"## 4. Categorical Zonal Statistics\n",
"\n",
"The output from categorical zonal statistics are similar to what ArcPy Zonal Histogram do, or `rasterstats` python library with `categorical=True`.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8e2e4f8e-8304-49e3-bcaa-6125ccef4b48",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All tasks submitted.\n"
]
}
],
"source": [
"# Define the land cover image\n",
"raster = ee.ImageCollection(\"ESA/WorldCover/v200\").first()\n",
"\n",
"# Function to calculate land cover statistics based on the selected reducer type\n",
"# In below example, we use \"count\" because the input is landcover, and we would like to\n",
"# calculate count of pixel of each land cover class by admin boundary\n",
"def calc_land_cover_stats(feature, reducer_type='count'):\n",
" if reducer_type == 'all':\n",
" reducer = ee.Reducer.minMax().combine(\n",
" reducer2=ee.Reducer.mean(), sharedInputs=True).combine(\n",
" reducer2=ee.Reducer.median(), sharedInputs=True).combine(\n",
" reducer2=ee.Reducer.stdDev(), sharedInputs=True)\n",
" elif reducer_type == 'all_with_sum':\n",
" reducer = ee.Reducer.minMax().combine(\n",
" reducer2=ee.Reducer.mean(), sharedInputs=True).combine(\n",
" reducer2=ee.Reducer.median(), sharedInputs=True).combine(\n",
" reducer2=ee.Reducer.stdDev(), sharedInputs=True).combine(\n",
" reducer2=ee.Reducer.sum(), sharedInputs=True)\n",
" elif reducer_type == 'sum':\n",
" reducer = ee.Reducer.sum()\n",
" elif reducer_type == 'count':\n",
" reducer = ee.Reducer.frequencyHistogram()\n",
" else:\n",
" raise ValueError(\"Invalid reducer type. Choose 'all', 'all_with_sum', 'sum', or 'count'.\")\n",
"\n",
" stats = raster.reduceRegion(reducer=reducer, geometry=feature.geometry(), scale=300, maxPixels=1e9)\n",
" if reducer_type == 'count':\n",
" stats = ee.Dictionary(stats.get('Map')) # Convert histogram to a dictionary\n",
" return feature.set(stats)\n",
"\n",
"# Specify the reducer type ('all', 'all_with_sum', 'sum', or 'count')\n",
"reducer_type = 'count' # Example: change this to 'all', 'all_with_sum', 'sum', or 'count' as needed\n",
"\n",
"# Map the function over the FeatureCollection\n",
"land_cover_stats_fc = aoi.map(lambda f: calc_land_cover_stats(f, reducer_type))\n",
"\n",
"# Specify the folder name where the CSV files will be saved\n",
"folder_name = 'zonal_gee'\n",
"\n",
"# Format the file description based on the reducer type\n",
"file_description = f'exportLandCoverStats_{reducer_type}'\n",
"\n",
"# Export the FeatureCollection to a CSV file\n",
"task = ee.batch.Export.table.toDrive(**{\n",
" 'collection': land_cover_stats_fc,\n",
" 'description': file_description,\n",
" 'folder': folder_name,\n",
" 'fileFormat': 'CSV'\n",
"})\n",
"task.start()\n",
"\n",
"print(\"All tasks submitted.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b817b20-2ac9-4cab-b811-e3c40c738c3b",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment