Load rgbif
library("rgbif")
occ_search(taxonKey=6351, fields = "all", hasCoordinate=TRUE, hasGeospatialIssue = FALSE)
#> Records found [6522]
#> Records returned [500]
#> No. unique hierarchies [2]
#> No. media records [1]
#> Args [taxonKey=6351, hasCoordinate=TRUE, hasGeospatialIssue=FALSE, limit=500, offset=0, fields=all]
#> First 10 rows of data
#>
#> name key decimalLatitude decimalLongitude issues
#> 1 Raphoneis amphiceros 113516533 53.74579 1.67182 cdround,cudc,gass84,txmatfuz
#> 2 Diatomaceae 113425616 53.94419 1.23671 cdround,cudc,gass84,txmathi
#> 3 Raphoneis amphiceros 113524813 53.94419 1.23671 cdround,cudc,gass84,txmatfuz
#> 4 Diatomaceae 113422759 53.64906 1.10778 cdround,cudc,gass84,txmathi
#> 5 Raphoneis amphiceros 113516272 52.72840 -5.72135 cdround,cudc,gass84,txmatfuz
#> 6 Diatomaceae 113422760 53.80887 2.19924 cdround,cudc,gass84,txmathi
#> 7 Diatomaceae 113425650 54.07139 1.52203 cdround,cudc,gass84,txmathi
#> 8 Raphoneis amphiceros 113516576 53.64906 1.10778 cdround,cudc,gass84,txmatfuz
#> 9 Raphoneis amphiceros 113517123 53.33904 4.61214 cdround,cudc,gass84,txmatfuz
#> 10 Diatomaceae 113424196 44.22176 -60.25779 cdround,cudc,gass84,txmathi
#> .. ... ... ... ... ...
#> Variables not shown: datasetKey (chr), publishingOrgKey (chr), publishingCountry (chr), protocol
#> (chr), lastCrawled (chr), extensions (chr), basisOfRecord (chr), taxonKey (int), kingdomKey
#> (int), phylumKey (int), classKey (int), familyKey (int), genusKey (int), speciesKey (int),
#> scientificName (chr), kingdom (chr), phylum (chr), family (chr), genus (chr), species (chr),
#> genericName (chr), specificEpithet (chr), taxonRank (chr), depth (dbl), depthAccuracy (dbl),
#> year (int), month (int), day (int), eventDate (chr), lastInterpreted (chr), identifiers (chr),
#> facts (chr), relations (chr), geodeticDatum (chr), class (chr), countryCode (chr), country
#> (chr), catalogNumber (chr), institutionCode (chr), collectionCode (chr), gbifID (chr),
#> lastParsed (chr), elevation (dbl), elevationAccuracy (dbl), stateProvince (chr), recordedBy
#> (chr), county (chr), locality (chr), identifiedBy (chr)
Note that some of the download API functions require your GBIF website username, email, and password.
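One way to avoid typing credentials into every call is to set them once per session. The sketch below assumes that rgbif falls back to the environment variables GBIF_USER, GBIF_PWD, and GBIF_EMAIL when the user, pwd, and email arguments are not supplied; check the occ_download() help page for your installed version, since this behaviour may differ between releases. All values shown are placeholders.

```r
# Placeholder credentials; replace with your own GBIF account details.
# Assumption: rgbif reads these variables when user, pwd and email are not passed.
Sys.setenv(
  GBIF_USER  = "your_username",
  GBIF_PWD   = "your_password",
  GBIF_EMAIL = "you@example.org"
)

# Check that all three variables are non-empty before starting a download.
creds_set <- nzchar(Sys.getenv(c("GBIF_USER", "GBIF_PWD", "GBIF_EMAIL")))
```

For anything beyond a quick session, keeping these values in a private .Renviron file rather than in a script avoids committing a password by accident.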
Start a download
(dload <- occ_download('taxonKey = 6351',
'hasCoordinate = TRUE',
'hasGeospatialIssue = FALSE'))
#> <<gbif download>>
#> Username: xxx
#> E-mail: [email protected]
#> Download key: 0003358-150721130643939
Then wait for the GBIF servers to prepare the download file. In the meantime, you can list all of your downloads:
occ_download_list()
#> $meta
#> offset limit endofrecords count
#> 1 0 3 FALSE 38
#>
#> $results
#> key doi request.predicate.type
#> 1 0003358-150721130643939 doi:10.15468/dl.ll3wue and
#> 2 0003357-150721130643939 doi:10.15468/dl.bjzstf equals
#> 3 0007658-150615163101818 doi:10.15468/dl.jh2wda equals
#> request.predicate.predicates
#> 1 equals, equals, equals, TAXON_KEY, HAS_COORDINATE, HAS_GEOSPATIAL_ISSUE, 6351, TRUE, FALSE
#> 2 NULL
#> 3 NULL
#> request.predicate.key request.predicate.value request.creator request.format
#> 1 <NA> <NA> sckott DWCA
#> 2 TAXON_KEY 6351 sckott DWCA
#> 3 TAXON_KEY 2433433 sckott DWCA
#> request.notificationAddresses request.sendNotification created
#> 1 [email protected] FALSE 2015-08-04T15:58:11.902+0000
#> 2 [email protected] FALSE 2015-08-04T15:56:23.864+0000
#> 3 [email protected] FALSE 2015-07-09T14:48:40.705+0000
#> modified status
#> 1 2015-08-04T15:59:00.874+0000 SUCCEEDED
#> 2 2015-08-04T15:57:19.062+0000 SUCCEEDED
#> 3 2015-07-09T14:50:28.628+0000 SUCCEEDED
#> downloadLink size totalRecords
#> 1 http://api.gbif.org/v1/occurrence/download/request/0003358-150721130643939.zip 0.41 6522
#> 2 http://api.gbif.org/v1/occurrence/download/request/0003357-150721130643939.zip 0.50 6943
#> 3 http://api.gbif.org/v1/occurrence/download/request/0007658-150615163101818.zip 0.00 16613
#> numberDatasets
#> 1 7
#> 2 11
#> 3 174
You can also check on a specific download:
occ_download_meta(dload)
#> <<gbif download metadata>>
#> Status: SUCCEEDED
#> Download key: 0003358-150721130643939
#> Created: 2015-08-04T15:58:11.902+0000
#> Modified: 2015-08-04T15:59:00.874+0000
#> Download link: http://api.gbif.org/v1/occurrence/download/request/0003358-150721130643939.zip
#> Total records: 6522
#> Request:
#> type: and
#> predicates:
#> - type: equals, key: TAXON_KEY, value: 6351
#> - type: equals, key: HAS_COORDINATE, value: TRUE
#> - type: equals, key: HAS_GEOSPATIAL_ISSUE, value: FALSE
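Rather than re-running the metadata call by hand, the check can be wrapped in a small polling loop. This is a minimal sketch, not part of rgbif: wait_for_download() is a hypothetical helper, and apart from SUCCEEDED (seen in the output above) the list of terminal states is an assumption. It takes any zero-argument function that returns a status string, so it can be tried without touching the network.

```r
# Poll a status function until it reports a terminal state.
# `get_status` is any zero-argument function returning a status string.
wait_for_download <- function(get_status, interval = 30, max_tries = 120) {
  # "SUCCEEDED" appears in the output above; the other states are assumed.
  terminal <- c("SUCCEEDED", "FAILED", "KILLED", "CANCELLED")
  for (i in seq_len(max_tries)) {
    status <- get_status()
    if (status %in% terminal) {
      return(status)
    }
    # Pause between checks so the server is not queried in a tight loop.
    Sys.sleep(interval)
  }
  stop("download not ready after ", max_tries, " checks")
}

# Hypothetical usage, assuming the metadata object exposes a `status` element:
# wait_for_download(function() occ_download_meta(dload)$status)
```

Passing the status lookup in as a function keeps the loop independent of how the status is obtained, and the max_tries cap prevents a script from hanging indefinitely.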
Once the download status is SUCCEEDED, you can download the file and import it into R:
library("magrittr")
occ_download_get(dload) %>% occ_download_import()
#> gbifID abstract accessRights accrualMethod accrualPeriodicity accrualPolicy alternative
#> 1 197225455 NA NA NA NA NA NA
#> 2 197226215 NA NA NA NA NA NA
#> 3 113421846 NA NA NA NA NA NA
#> 4 113421847 NA NA NA NA NA NA
#> 5 113421851 NA NA NA NA NA NA
#> 6 113421860 NA NA NA NA NA NA
#> 7 113421982 NA NA NA NA NA NA
#> 8 113421988 NA NA NA NA NA NA
#> 9 113421994 NA NA NA NA NA NA
#> 10 113422101 NA NA NA NA NA NA
#> .. ... ... ... ... ... ... ...
#> Variables not shown: audience (lgl), available (lgl), bibliographicCitation (lgl), conformsTo (lgl),
#> contributor (lgl), coverage (lgl), created (lgl), creator (lgl), date (lgl), dateAccepted
#> (lgl), dateCopyrighted (lgl), dateSubmitted (lgl), description (lgl), educationLevel (lgl),
#> extent (lgl), format (lgl), hasFormat (lgl), hasPart (lgl), hasVersion (lgl), identifier (lgl),
#>
#> .... cutoff for brevity
You may get some file-reading warnings from the occ_download_import() call, but these shouldn't be a problem.
You can also read in the file however you like. The output of occ_download_get() contains the path to the downloaded file.
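As an illustration of reading the file yourself, here is a minimal base R sketch. It rests on two assumptions that are not confirmed by the text above: that the downloaded zip contains a tab-delimited file named occurrence.txt, and that zip_path holds the path reported by occ_download_get(). read_occurrences() is a hypothetical helper, not an rgbif function.

```r
# Read a tab-delimited occurrence table. Quotes and comment characters are
# treated as ordinary text, since they can appear in free-text fields, and
# empty fields are read as NA.
read_occurrences <- function(txt_path) {
  utils::read.delim(
    txt_path,
    quote = "",
    comment.char = "",
    na.strings = "",
    stringsAsFactors = FALSE
  )
}

# Hypothetical usage. Assumptions: `zip_path` is the path reported by
# occ_download_get(), and the archive contains a file named occurrence.txt.
# exdir <- tempfile("gbif_")
# utils::unzip(zip_path, files = "occurrence.txt", exdir = exdir)
# occ <- read_occurrences(file.path(exdir, "occurrence.txt"))
```

Disabling quote handling is a deliberate choice here: a stray apostrophe or quotation mark in a locality string would otherwise cause rows to be merged or dropped.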