@mdsumner
Created August 24, 2025 23:08

See how the target doesn't have the "/%Y%m/" component:

Sun Aug 24 22:59:11 2025
Synchronizing dataset: NOAA OI 1/4 Degree Daily SST AVHRR
Source URL https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/
--------------------------------------------------------------------------------------------

 this dataset path is: /perm_storage/home/data/r_tmp/Rtmp7EG26o/bowerbird_files/www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr
 visiting https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/ ... 
Downloading: 4 kB     
 0 download links, 2 links to visit done.
 visiting https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202507 ... 
Downloading: 1 kB
 done.
 visiting https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202508 ... 
Downloading: 900 B
 done.
 dry_run is TRUE, bb_rget is not downloading the following files:
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250701.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250702.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250703.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250704.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250705.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250706.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250707.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250708.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250709.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250710.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250711.nc
 https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/oisst-avhrr-v02r01.20250712.nc

reprex:

library(bowerbird)
library(blueant)

# Temporary local root for the mirrored files
datadir <- file.path(tempdir(), "bowerbird_files")
if (!dir.exists(datadir)) dir.create(datadir)

src <- blueant::sources("NOAA OI 1/4 Degree Daily SST AVHRR")

# Regex matching the current month and the month 30 days ago, as YYYYMM
year00 <- sprintf("(%s|%s)", format(Sys.Date(), "%Y%m"), format(Sys.Date() - 30, "%Y%m"))

# Follow only those monthly directories and accept only files ending in "nc"
src <- src |>
  bb_modify_source(method = list(accept_follow = year00, accept_download = ".*nc$", no_host = FALSE))

cf <- bb_config(local_file_root = datadir)
cf <- bb_add(cf, src)
# Dry run: list what would be downloaded without fetching anything
status <- bb_sync(cf, confirm_downloads_larger_than = NULL, dry_run = TRUE, verbose = TRUE)
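
For reference, the accept_follow pattern in the reprex is an ordinary regular expression built from two YYYYMM strings. A minimal sketch of what it evaluates to, using a fixed date instead of Sys.Date() so the result is reproducible (the names `today`, `pattern` and `dirs` are only for illustration):

```r
# Fixed date (the day of the log above) in place of Sys.Date()
today <- as.Date("2025-08-24")

# Same construction as in the reprex: current month, and the month 30 days ago
pattern <- sprintf("(%s|%s)", format(today, "%Y%m"), format(today - 30, "%Y%m"))

# Candidate directory names as they might appear on the index page
dirs <- c("202506", "202507", "202508")
matched <- dirs[grepl(pattern, dirs)]

print(pattern)
print(matched)
```

Only the two most recent monthly directories match here, which is consistent with the two "visiting" lines in the log.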
@raymondben

%Y%m gets dropped because the index page (https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/) links to the yearmon directory with no trailing slash, see e.g.:

visiting https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/202507 ... 

But when you visit that URL you get a 301 permanent redirect to the same directory with a slash appended. I guess bowerbird constructs the target path from the original URL, not the redirected one. Will look further ...
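
A rough illustration of why the missing slash matters, assuming the file links on the monthly page are relative (not confirmed in this thread). The helper `resolve_relative()` is hypothetical and only handles the simple case of a bare file name; it is not bowerbird code. It drops everything after the last "/" in the base URL and appends the file name, which is how a relative reference is normally merged with a base URL:

```r
# Hypothetical helper: merge a bare file name with a base URL by replacing
# whatever follows the last "/" in the base.
resolve_relative <- function(base, rel) {
  paste0(sub("[^/]*$", "", base), rel)
}

root <- "https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/"
file <- "oisst-avhrr-v02r01.20250701.nc"

# Base as linked from the index page (no trailing slash): "202507" is treated
# as a file name and replaced, so the month component is lost
no_slash <- resolve_relative(paste0(root, "202507"), file)

# Base after the 301 redirect (trailing slash): the month component is kept
with_slash <- resolve_relative(paste0(root, "202507/"), file)

print(no_slash)
print(with_slash)
```

The first result has the same shape as the URLs listed in the dry-run log, with no "/%Y%m/" component.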

@raymondben

Also not sure why that isn't coming up as a failure in the log. All of those file download attempts give a 404 Not Found error, but those are only being raised as warnings, and there is no indication in the log output that things went to custard.
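
As a generic base-R sketch (not bowerbird's internals, and not necessarily how this was fixed), warnings can be collected with withCallingHandlers() and turned into an explicit failure flag instead of scrolling past. Both `fake_download()` and `run_checked()` are hypothetical names made up for this example:

```r
# Hypothetical stand-in for a download step that only warns on a 404
fake_download <- function(url) {
  warning(sprintf("HTTP 404 for %s", url))
  invisible(FALSE)
}

# Run a function, record every warning it raises, and report overall success
run_checked <- function(f) {
  warnings_seen <- character()
  withCallingHandlers(
    f(),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")  # continue after recording the warning
    }
  )
  list(ok = length(warnings_seen) == 0, warnings = warnings_seen)
}

res <- run_checked(function() fake_download("https://example.org/missing.nc"))
print(res$ok)
print(res$warnings)
```

With something along these lines, a run in which every download warned would end with `ok` set to FALSE, so the failure is visible in the final status.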

@raymondben

Both should be fixed now. I have updated the server; we will see on the next run.

@mdsumner

awesome
