@cbcunc
Last active December 12, 2024 01:55
GCSFuse THREDDS Test
{
"cells": [
{
"cell_type": "markdown",
"id": "82a458cd-7a0b-43c1-a850-88e6706a0495",
"metadata": {},
"source": [
"## Let's compare THREDDS with local disk vs gcsfuse mount"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1643995c-c53e-4bce-bd89-0452a83a601c",
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"import xarray as x"
]
},
{
"cell_type": "markdown",
"id": "e2b05e36-65d7-4609-97dd-365fd374e1a7",
"metadata": {},
"source": [
"### GCSFuse with small subset of dataset from the Hydroshare tutorial"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "4989c022-d616-4525-b3b5-903e07d02959",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Elapsed time with gcsfuse mount: 105.129 milliseconds\n"
]
}
],
"source": [
"# process_time_ns() counts CPU time of this process, not wall-clock time\n",
"start_time = time.process_time_ns()\n",
"ds = x.open_dataset(\"http://thredds.hydroshare.org/thredds/dodsC/gcsfuse/resources/f3f947be65ca4b258e88b600141b85f3/data/contents/SWE_time.nc?time[0:1:2183],y[0:1:58],x[0:1:38],transverse_mercator,SWE[0:1:0][0:1:58][0:1:38]\")\n",
"ds.to_netcdf(\"swe-t0-test.nc\")\n",
"end_time = time.process_time_ns()\n",
"elapsed_time = end_time - start_time\n",
"# 1 millisecond = 1e6 nanoseconds\n",
"print(f\"Elapsed time with gcsfuse mount: {elapsed_time / 1e6} milliseconds\")"
]
},
{
"cell_type": "markdown",
"id": "064f5ca9-9f2f-4b6a-a1ae-c719110c3f59",
"metadata": {},
"source": [
"### Local disk with small subset of dataset from the Hydroshare tutorial"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ff0aa091-e76b-4878-bf0b-aab790eeb69d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Elapsed time with local disk: 73.861 milliseconds\n"
]
}
],
"source": [
"# process_time_ns() counts CPU time of this process, not wall-clock time\n",
"start_time = time.process_time_ns()\n",
"ds = x.open_dataset(\"http://thredds.hydroshare.org/thredds/dodsC/hydroshare/resources/f3f947be65ca4b258e88b600141b85f3/data/contents/SWE_time.nc?time[0:1:2183],y[0:1:58],x[0:1:38],transverse_mercator,SWE[0:1:0][0:1:58][0:1:38]\")\n",
"ds.to_netcdf(\"swe-t0.nc\")\n",
"end_time = time.process_time_ns()\n",
"elapsed_time = end_time - start_time\n",
"# 1 millisecond = 1e6 nanoseconds\n",
"print(f\"Elapsed time with local disk: {elapsed_time / 1e6} milliseconds\")"
]
},
{
"cell_type": "markdown",
"id": "16605256-5f77-4c9f-b47a-2663265e999c",
"metadata": {},
"source": [
"### GCSFuse with 3 MB dataset used for THREDDS heartbeat"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "88567007-9fdf-46e3-885b-fae4ba7bca26",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Elapsed time with gcsfuse mount: 115.806 milliseconds\n"
]
}
],
"source": [
"# process_time_ns() counts CPU time of this process, not wall-clock time\n",
"start_time = time.process_time_ns()\n",
"ds = x.open_dataset(\"http://thredds.hydroshare.org/thredds/dodsC/gcsfuse/resources/f203458dcdde4cc8982a4a02aec9de8f/data/contents/tos_O1_2001-2002.nc\")\n",
"ds.to_netcdf(\"tos_O1_2001-2002-test.nc\")\n",
"end_time = time.process_time_ns()\n",
"elapsed_time = end_time - start_time\n",
"# 1 millisecond = 1e6 nanoseconds\n",
"print(f\"Elapsed time with gcsfuse mount: {elapsed_time / 1e6} milliseconds\")"
]
},
{
"cell_type": "markdown",
"id": "0c6fa34f-ec04-4c45-a199-bcf72acbb57d",
"metadata": {},
"source": [
"### Local disk with 3 MB dataset used for THREDDS heartbeat"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "79a236b5-924f-41d5-af33-c3b5910cfce4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Elapsed time with local disk: 103.747 milliseconds\n"
]
}
],
"source": [
"# process_time_ns() counts CPU time of this process, not wall-clock time\n",
"start_time = time.process_time_ns()\n",
"ds = x.open_dataset(\"http://thredds.hydroshare.org/thredds/dodsC/hydroshare/resources/f203458dcdde4cc8982a4a02aec9de8f/data/contents/tos_O1_2001-2002.nc\")\n",
"ds.to_netcdf(\"tos_O1_2001-2002.nc\")\n",
"end_time = time.process_time_ns()\n",
"elapsed_time = end_time - start_time\n",
"# 1 millisecond = 1e6 nanoseconds\n",
"print(f\"Elapsed time with local disk: {elapsed_time / 1e6} milliseconds\")"
]
},
{
"cell_type": "markdown",
"id": "58d3bf15-55a6-4f5f-97e9-f8d52568a3f8",
"metadata": {},
"source": [
"### GCSFuse with 3 MB subset of 18 GB dataset used for Global Distribution of Shallow Groundwater"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "136038db-fb48-4992-9e1d-b018c5ef9462",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Elapsed time with gcsfuse mount: 61.657 milliseconds\n"
]
}
],
"source": [
"# process_time_ns() counts CPU time of this process, not wall-clock time\n",
"start_time = time.process_time_ns()\n",
"ds = x.open_dataset(\"http://thredds.hydroshare.org/thredds/dodsC/gcsfuse/resources/9462b23c5e1e46bdae6ef8abcdbed365/data/contents/ShallowGW2021.nc?ShallowGW[180:1:190][1250:1:1260][900:1:910],latitude[900:1:910],time[180:1:190],longitude[1250:1:1260]\")\n",
"del ds.attrs['_NCProperties']\n",
"ds.to_netcdf(\"ShallowGW2021-test.nc\")\n",
"end_time = time.process_time_ns()\n",
"elapsed_time = end_time - start_time\n",
"# 1 millisecond = 1e6 nanoseconds\n",
"print(f\"Elapsed time with gcsfuse mount: {elapsed_time / 1e6} milliseconds\")"
]
},
{
"cell_type": "markdown",
"id": "4c53501f-ac64-4503-ac49-e452c0a815ab",
"metadata": {},
"source": [
"### Local disk with 3 MB subset of 18 GB dataset used for Global Distribution of Shallow Groundwater"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "49ab61fd-a194-4e66-9932-3d9a74723469",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Elapsed time with local disk: 67.853 milliseconds\n"
]
}
],
"source": [
"# process_time_ns() counts CPU time of this process, not wall-clock time\n",
"start_time = time.process_time_ns()\n",
"# Note: this URL uses the dodsC/gcsfuse path, as in the previous cell\n",
"ds = x.open_dataset(\"http://thredds.hydroshare.org/thredds/dodsC/gcsfuse/resources/9462b23c5e1e46bdae6ef8abcdbed365/data/contents/ShallowGW2021.nc?ShallowGW[180:1:190][1250:1:1260][900:1:910],latitude[900:1:910],time[180:1:190],longitude[1250:1:1260]\")\n",
"del ds.attrs['_NCProperties']\n",
"ds.to_netcdf(\"ShallowGW2021.nc\")\n",
"end_time = time.process_time_ns()\n",
"elapsed_time = end_time - start_time\n",
"# 1 millisecond = 1e6 nanoseconds\n",
"print(f\"Elapsed time with local disk: {elapsed_time / 1e6} milliseconds\")"
]
},
{
"cell_type": "markdown",
"id": "72483c92-703b-40b8-90ad-f5e579e6bc84",
"metadata": {},
"source": [
"#### Note\n",
"\n",
"See https://docs.unidata.ucar.edu/tds/current/userguide/tds_config_ref.html#opendap-service\n",
"\n",
"Because it’s easy for a user to inadvertently request very large amounts of data, the TDS limits the size of the data response. <span style=\"color:red\">In our experience legitimate requests ask for subset sizes that are well below the defaults.</span>\n",
"\n",
"- `ascLimit`: maximum size of an ASCII data request, in megabytes. Default is 50 Mbytes.\n",
"- `binLimit`: maximum size of a binary data request, in megabytes. Default is 500 Mbytes.\n",
"\n",
"These limits can be changed in the `<Opendap>` stanza of `threddsConfig.xml`.\n"
]
}
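,
{
"cell_type": "markdown",
"id": "subset-size-sketch-note",
"metadata": {},
"source": [
"#### Sketch: estimating a subset's size against `binLimit`\n",
"\n",
"A rough way to see how far a subset request sits below the default `binLimit` is to count the values selected by its index ranges. The cell below is only a sketch: `hyperslab_count` is a hypothetical helper (not part of xarray or the TDS), and both the 4 bytes per value and 1 Mbyte = 10**6 bytes are assumptions, not facts taken from the datasets or the TDS documentation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "subset-size-sketch-code",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical helper: count the values selected by [start:stride:stop] index ranges\n",
"def hyperslab_count(constraint):\n",
"    count = 1\n",
"    for part in constraint.strip('[]').split(']['):\n",
"        start, stride, stop = (int(v) for v in part.split(':'))\n",
"        count *= (stop - start) // stride + 1\n",
"    return count\n",
"\n",
"BIN_LIMIT_MBYTES = 500  # TDS default binLimit quoted in the note\n",
"BYTES_PER_VALUE = 4  # assumption: 32-bit values\n",
"\n",
"n_values = hyperslab_count('[0:1:0][0:1:58][0:1:38]')  # index ranges of the SWE subset request\n",
"size_mbytes = n_values * BYTES_PER_VALUE / 1e6  # assumption: 1 Mbyte = 10**6 bytes\n",
"print(f'{n_values} values, about {size_mbytes} Mbytes, within binLimit: {size_mbytes < BIN_LIMIT_MBYTES}')"
]
}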
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}