@mdsumner
Created September 12, 2025 00:57

In R, just do:

arrow::read_parquet("https://github.com/mdsumner/dryrun/raw/refs/heads/main/data-raw/noaa_oi_025_degree_daily_sst_avhrr.parquet")$url

In Python, the way I know how to do it is with GDAL:

from osgeo import ogr
ogr.UseExceptions()
path = "https://github.com/mdsumner/dryrun/raw/refs/heads/main/data-raw/noaa_oi_025_degree_daily_sst_avhrr.parquet"

ds = ogr.Open(f"/vsicurl/{path}")
l = ds.GetLayer()
stream = l.GetArrowStreamAsPyArrow()
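url = []
for batch in stream:
    # pull the "url" field out of each batch and keep the non-null values
    a = batch.field("url")
    url.extend(u for u in a.to_pylist() if u is not None)
# drop references to the layer, stream and last array before closing the dataset
l, stream, a = None, None, None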
ds.close()
url[-2:]
# ['https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250826_preliminary.nc', 'https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250827_preliminary.nc']

What is the most idiomatic way to do this in Python?
