The NOAA ERDDAP server gives access to what appears to be 802 datasets.
# install_github() comes from the devtools package
devtools::install_github("ropensci/rnoaa")
library("rnoaa")
Search for data using erddap_search(). Grab a datasetid that you want more information on.
(out <- erddap_search(query='fish size'))
## 7 results, showing first 20
## title dataset_id
## 1 CalCOFI Fish Sizes erdCalCOFIfshsiz
## 2 CalCOFI Larvae Sizes erdCalCOFIlrvsiz
## 3 CalCOFI Tows erdCalCOFItows
## 4 GLOBEC NEP MOCNESS Plankton (MOC1) Data erdGlobecMoc1
## 5 GLOBEC NEP Vertical Plankton Tow (VPT) Data erdGlobecVpt
## 6 CalCOFI Larvae Counts Positive Tows erdCalCOFIlrvcntpos
## 7 OBIS - ARGOS Satellite Tracking of Animals aadcArgos
id <- out$info$dataset_id[1]
Using a datasetid, look up information on that dataset with erddap_info(). A list of length 2 is returned. The variables slot has all column names and their data types (float, String, etc.). The alldata slot has the comprehensive list of information on each column. It's given as a list because descriptions can be quite long and would make for a hard-to-use data frame.
erddap_info(datasetid=id)$variables
## row_type variable_name data_type
## 31 variable cruise String
## 35 variable ship String
## 38 variable ship_code String
## 42 variable order_occupied int
## 46 variable tow_type String
## 49 variable net_type String
## 53 variable tow_number int
## 58 variable net_location String
## 62 variable standard_haul_factor float
## 66 variable volume_sampled float
## 71 variable percent_sorted float
## 76 variable sample_quality float
## 80 variable latitude float
## 88 variable longitude float
## 96 variable line float
## 101 variable station float
## 106 variable time double
## 116 variable scientific_name String
## 119 variable common_name String
## 122 variable itis_tsn int
## 126 variable calcofi_species_code int
## 130 variable fish_size float
## 134 variable fish_count float
## 137 variable fish_1000m3 float
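Once you have the variables table, ordinary data frame subsetting is enough to pick out columns by type. Here is a minimal sketch, assuming the variables slot is a plain data frame with the three columns shown above. The vars object is a hand-made stand-in with a few rows copied from that output, so the example runs without rnoaa or a network connection.

```r
# Stand-in for erddap_info(datasetid = id)$variables (an assumption for
# illustration): a few rows hand-copied from the output above.
vars <- data.frame(
  row_type = "variable",
  variable_name = c("cruise", "latitude", "longitude",
                    "scientific_name", "itis_tsn", "fish_count"),
  data_type = c("String", "float", "float", "String", "int", "float"),
  stringsAsFactors = FALSE
)

# Keep only the columns whose type is numeric
numeric_fields <- vars$variable_name[vars$data_type %in% c("int", "float", "double")]
print(numeric_fields)
```

The resulting character vector could then be passed as the fields argument in the next step.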
Get data from the dataset, with column names gleaned from the last step
head(erddap_data(datasetid = id, fields = c('latitude','longitude','scientific_name')))
## latitude longitude scientific_name
## 2 35.038334 -120.88333 Microstomus pacificus
## 3 34.97167 -121.02333 Cyclothone signata
## 4 34.97167 -121.02333 Cyclothone signata
## 5 34.97167 -121.02333 Cyclothone signata
## 6 34.97167 -121.02333 Cyclothone signata
## 7 34.97167 -121.02333 Cyclothone signata
Some datasets have ITIS taxonomic identifiers - we can use taxize to get more information on each species using these identifiers.
Get data on the California Cooperative Oceanic Fisheries Investigations fish sizes dataset (erdCalCOFIfshsiz)
out <- erddap_data(datasetid = 'erdCalCOFIfshsiz', fields = c('latitude','longitude','scientific_name','itis_tsn'))
tsns <- unique(out$itis_tsn[1:100])
Load taxize and get classifications for each taxon, then combine to a single data frame
install.packages("taxize")
library("taxize")
classif <- classification(tsns, db = "itis")
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=172887
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162168
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=623625
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162172
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162301
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162685
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162664
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162221
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=164792
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162167
## http://www.itis.gov/ITISWebService/services/ITISService/getFullHierarchyFromTSN?tsn=162092
alldata <- rbind(classif)
nrow(alldata)
## [1] 166
head(alldata)
## source taxonid name rank
## 1 itis 172887 Animalia Kingdom
## 2 itis 172887 Bilateria Subkingdom
## 3 itis 172887 Deuterostomia Infrakingdom
## 4 itis 172887 Chordata Phylum
## 5 itis 172887 Vertebrata Subphylum
## 6 itis 172887 Gnathostomata Infraphylum
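Because the combined result is a long data frame with one row per taxon per rank, you can pull out a single rank with plain subsetting. Here is a small sketch, assuming the combined data frame has the four columns shown above. The alldata_demo object is a stand-in built from the six rows printed by head(alldata), so it runs without taxize or a network connection.

```r
# Stand-in for the combined classification data frame (an assumption for
# illustration): the six rows shown in the head(alldata) output above.
alldata_demo <- data.frame(
  source = "itis",
  taxonid = 172887,
  name = c("Animalia", "Bilateria", "Deuterostomia",
           "Chordata", "Vertebrata", "Gnathostomata"),
  rank = c("Kingdom", "Subkingdom", "Infrakingdom",
           "Phylum", "Subphylum", "Infraphylum"),
  stringsAsFactors = FALSE
)

# One row per taxon at the chosen rank
phyla <- alldata_demo[alldata_demo$rank == "Phylum", c("taxonid", "name")]
print(phyla)
```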
So ERDDAP is a great service. It doesn't have a unified API against which to specify constraints on what data gets returned, though, because each dataset has different possible parameters. So you do need to use erddap_info() to find out what columns there are, then you can pass constraints based on those column names to erddap_data(). For example, from above we see that there is a column called fish_count in the erdCalCOFIfshsiz dataset. We can use that column to get back only data that meet a criterion.
Records with a count greater than or equal to 6
erddap_data(datasetid = id, fields = c('latitude','longitude','scientific_name','fish_count'), 'fish_count>=6')
## latitude longitude scientific_name fish_count
## 2 35.595 -121.736664 Cyclothone signata 6
## 3 35.336666 -122.555 Stenobrachius leucopsarus 8
## 4 35.336666 -122.861664 Cyclothone signata 6
## 5 35.336666 -122.861664 Cyclothone signata 6
## 6 35.336666 -122.861664 Stenobrachius leucopsarus 6
## 7 35.356667 -123.12666 Cyclothone signata 6
## 8 35.356667 -123.12666 Stenobrachius leucopsarus 7
## 9 35.333332 -123.473335 Cyclothone signata 6
## 10 35.333332 -123.473335 Cyclothone signata 6
## 11 35.333332 -123.473335 Cyclothone signata 6
## 12 35.333332 -123.473335 Cyclothone signata 7
## 13 35.333332 -123.473335 Cyclothone signata 7
## 14 35.333332 -123.473335 Cyclothone signata 6
## 15 35.336666 -123.77 Cyclothone signata 8
## 16 35.336666 -123.77 Cyclothone signata 7
## 17 35.331665 -124.375 Cyclothone signata 7
## 18 35.338333 -125.28667 Cyclothone signata 6
## 19 35.338333 -125.28667 Cyclothone signata 6
## 20 36.836666 -124.878334 Cyclothone signata 6
## 21 35.981667 -124.88 Cyclothone signata 6
## 22 36.333332 -124.878334 Cyclothone signata 6
## 23 36.333332 -124.878334 Cyclothone signata 6
## 24 36.583332 -124.875 Cyclothone signata 6
## 25 35.335 -123.26667 Stenobrachius leucopsarus 6
## 26 37.001667 -124.901665 Cyclothone signata 6
## 27 37.5 -126.70167 Cyclothone signata 6
## 28 38.0 -128.5 Cyclothone signata 6
## 29 38.0 -128.5 Cyclothone signata 11
## 30 38.5 -125.5 Cyclothone signata 7
You can use latitude and longitude constraints to limit a search geographically.
out <- erddap_data(datasetid = 'erdCalCOFIfshsiz', fields = c('latitude','longitude','scientific_name'), 'latitude>=34.8', 'latitude<=35', 'longitude>=-125', 'longitude<=-124.4')
head(out)
## latitude longitude scientific_name
## 2 34.881668 -124.48 Cyclothone atraria
## 3 34.881668 -124.48 Lipolagus ochotensis
## 4 34.881668 -124.48 Bathylagoides wesethi
## 5 34.881668 -124.48 Cyclothone acclinidens
## 6 34.881668 -124.48 Cyclothone acclinidens
## 7 34.881668 -124.48 Cyclothone signata
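The four constraints in the call above amount to a bounding box. Here is a minimal sketch of a helper that builds them from the box corners. bbox_constraints() is a hypothetical function, not part of rnoaa. The commented-out call at the end assumes erddap_data() takes constraints as unnamed character arguments, as in the example above.

```r
# Hypothetical helper (not part of rnoaa): turn a bounding box into the
# four constraint strings used in the geographic query above.
bbox_constraints <- function(lat_min, lat_max, lon_min, lon_max) {
  c(
    paste0("latitude>=", lat_min),
    paste0("latitude<=", lat_max),
    paste0("longitude>=", lon_min),
    paste0("longitude<=", lon_max)
  )
}

cons <- bbox_constraints(34.8, 35, -125, -124.4)
print(cons)

# Assumed usage, needs rnoaa and network access, so it is left commented out:
# args <- c(list(datasetid = "erdCalCOFIfshsiz",
#                fields = c("latitude", "longitude", "scientific_name")),
#           as.list(cons))
# out <- do.call(erddap_data, args)
```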
hey @dill - 'fish_count>=' = 6 is weird. I'll try that change. For a geographic constraint you pass latitude<40 & latitude>30, etc. for longitude. That is, I don't think there's any global parameter for a bounding box or passing in a WKT string.