@amdevine
Last active March 8, 2021 14:30
Access data from the GGBN API


The Global Genome Biodiversity Network (GGBN) API is a service that allows users to retrieve data from the GGBN Data Portal quickly and easily. Data can be accessed via a script or a web browser. Data are output in the JavaScript Object Notation (JSON) format, a common format for sharing and accessing data from websites.

Types of data available

The GGBN API is accessed by using a web browser (or a script) to navigate to a specific URL (the endpoint) on the GGBN website:

http://data.ggbn.org/ggbn_portal/api/search

This base URL alone provides no data; to access different kinds of data, you specify a particular search method at the end of the URL, depending on what kind of data you'd like to request.

Here are the five different kinds of data you can request:

Method: getRepositories
Description: Returns the list of GGBN repositories contributing sample data to the Data Portal
URL: http://data.ggbn.org/ggbn_portal/api/search?getRepositories
Filters: none

Method: getCounts
Description: Returns GGBN Data Portal statistics
URL: http://data.ggbn.org/ggbn_portal/api/search?getCounts
Filters: repository, name, isocountry, sampleType

Method: getClassification
Description: Returns taxonomic names and corresponding number of samples in the Data Portal
URL: http://data.ggbn.org/ggbn_portal/api/search?getClassification
Filters: unitID, sampleType, name, country, isocountry, repository

Method: getSampletype
Description: Returns counts of each sample type in the Data Portal
URL: http://data.ggbn.org/ggbn_portal/api/search?getSampletype
Filters: unitID, sampleType, name, country, isocountry, repository

Method: getCollYear
Description: Returns counts of records by collection year in the Data Portal
URL: http://data.ggbn.org/ggbn_portal/api/search?getCollYear
Filters: sampleType, name, isocountry
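
The list of methods above can be captured in a small lookup so that a script can build the URL for any search method. This is only a minimal sketch: the names BASE_URL, METHOD_FILTERS, and method_url are illustrative and not part of the GGBN API, and it assumes every method follows the same ?methodName pattern.

```python
# Endpoint and search methods as listed above.
BASE_URL = "http://data.ggbn.org/ggbn_portal/api/search"

# Search method -> filters listed for it (hypothetical helper structure).
METHOD_FILTERS = {
    "getRepositories": [],
    "getCounts": ["repository", "name", "isocountry", "sampleType"],
    "getClassification": ["unitID", "sampleType", "name", "country", "isocountry", "repository"],
    "getSampletype": ["unitID", "sampleType", "name", "country", "isocountry", "repository"],
    "getCollYear": ["sampleType", "name", "isocountry"],
}


def method_url(method):
    """Return the URL for a search method with no filters applied."""
    if method not in METHOD_FILTERS:
        raise ValueError(f"Unknown search method: {method}")
    return f"{BASE_URL}?{method}"


print(method_url("getCounts"))
```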

Filters

Several of the search methods in the API allow you to add additional parameters at the end of the URL, which will filter the results provided by the search method. The general format for a filter is &filtername=value.

You can specify multiple filters; just add &filtername=value for each filter you would like to specify.
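
As a rough illustration of the &filtername=value pattern, the sketch below appends one such pair per filter. The function name add_filters is hypothetical, and values are inserted exactly as typed, so this version only suits values without spaces or other special characters.

```python
BASE_URL = "http://data.ggbn.org/ggbn_portal/api/search"


def add_filters(method, filters):
    """Append filtername=value for each filter, joined with &, in the order given."""
    parts = [f"{BASE_URL}?{method}"]
    for filter_name, value in filters.items():
        parts.append(f"{filter_name}={value}")
    return "&".join(parts)


# Two filters on getCounts: a taxon name and an ISO country code.
url = add_filters("getCounts", {"name": "Aves", "isocountry": "US"})
print(url)
```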

The filters available for each search method are listed in the Types of data available section above.

For the isocountry filter, use upper-case two-letter country codes. See ISO 3166-1 alpha-2.

See the Examples section below for different examples of queries.

Genus names

If you are supplying a genus name to the name filter, you have to enter it in the following format: &name=Canis *. The name needs to be followed by a space and an asterisk. Otherwise, the query will most likely return no results or an incorrect count.
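
In a script, the genus format can be applied with a one-line helper. This is a sketch only; genus_filter is a made-up name, not something the API provides.

```python
def genus_filter(genus):
    """Format a genus for the name filter: the name, a space, then an asterisk."""
    return f"{genus.strip()} *"


print("&name=" + genus_filter("Canis"))
```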

Spaces

If your filter value contains a space and/or a comma (e.g. NMNH, Washington), it is okay to just type the space and/or the comma into the URL. Alternatively, you can substitute the space with the characters %20 (e.g. &repository=NMNH,%20Washington).
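
A script can do the %20 substitution with Python's standard urllib.parse.quote. The sketch below keeps the comma readable and encodes only the space, which reproduces the value shown above; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

repository = "NMNH, Washington"

# safe="," leaves the comma as typed; the space becomes %20.
encoded = quote(repository, safe=",")
print("&repository=" + encoded)

# Decoding recovers the original value.
decoded = unquote(encoded)
print(decoded)
```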

Limit results by taxon

If you wish to filter by name, you can further limit results to a particular taxonomic lineage. You can do this by submitting a value to name= in the format name AND lineage, where lineage is a higher taxon such as a kingdom.

For example, the genus Calamus is a valid genus name in both plants and animals. To just search for Calamus, we would search name=Calamus *. (See section on querying genus names above.)

Animal-only query: name=Calamus * AND Animalia

Animal-only results: http://www.ggbn.org/ggbn_portal/api/search?getClassification&name=Calamus%20*%20AND%20Animalia

Plant-only query: name=Calamus * AND Plantae

Plant-only results: http://www.ggbn.org/ggbn_portal/api/search?getClassification&name=Calamus%20*%20AND%20Plantae

As usual, if your query contains spaces, your browser will automatically encode them as %20 in the resulting URL.
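
The lineage-limited query can also be assembled in a script. The following is a sketch under assumptions: the helper names lineage_name and classification_url are hypothetical, and the host is copied from the example URLs in this section.

```python
from urllib.parse import quote

# Host as written in the example URLs above.
BASE_URL = "http://www.ggbn.org/ggbn_portal/api/search"


def lineage_name(genus, lineage):
    """Combine the genus format with an AND clause naming a higher taxon."""
    return f"{genus} * AND {lineage}"


def classification_url(name_value):
    """Build a getClassification URL, leaving * as typed and encoding spaces as %20."""
    return f"{BASE_URL}?getClassification&name={quote(name_value, safe='*')}"


animal_url = classification_url(lineage_name("Calamus", "Animalia"))
plant_url = classification_url(lineage_name("Calamus", "Plantae"))
print(animal_url)
print(plant_url)
```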

Results

Results are returned in the JSON data format. The basic structure of JSON data is:

{
    "field1": "value1",
    "field2": "value2",
    "field3": [
        "value3.1",
        "value3.2",
        "value3.3"
    ],
    "field4": "value4",
    "field5": {
        "field5a": "value5a",
        "field5b": "value5b",
        "field5c": "value5c"
    }
}

All results are surrounded by curly braces {}. The curly braces contain "name": "value" pairs, separated by commas. If a field has multiple values, those values are written between square brackets [] and separated by commas. If a field contains a set of sub-results, those sub-results are stored in their own curly braces.
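
To show how this structure maps onto a script, here is a small sketch using Python's standard json module. Curly braces become dictionaries and square brackets become lists. Note that valid JSON uses double quotes around names and text values. The field names are the placeholder names from the structure above.

```python
import json

text = """
{
    "field1": "value1",
    "field3": ["value3.1", "value3.2", "value3.3"],
    "field5": {"field5a": "value5a", "field5b": "value5b"}
}
"""

data = json.loads(text)

print(data["field1"])             # a single value
print(data["field3"][1])          # second item of a list of values
print(data["field5"]["field5b"])  # a value inside a set of sub-results
```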

See the Examples section at the end for examples of JSON outputs.

Reading JSON in Chrome

In order to more easily read JSON output from a website, it is recommended that you install a browser extension that automatically parses and organizes JSON data for reading. For the Chrome browser, I would recommend JSON Formatter (https://chrome.google.com/webstore/detail/json-formatter/bcjindcccaagfpapjjmafapmmgkkhgoa).

Pasting JSON data into a spreadsheet

If you would like to copy and paste JSON data into a text editor or spreadsheet program, you can run it through a 'pretty print' tool such as JSON Pretty Print (http://jsonprettyprint.com/). Copy and paste the entire JSON output from your web browser into the pretty print tool, and it will return JSON output that includes spacing, hard-coded line returns, nesting, etc.

Even with a browser extension like JSON Formatter installed, I would still recommend using a pretty printer to copy and paste JSON data.
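
If you are already working in a script, the same result can be had without an online tool. The sketch below pretty-prints a response with Python's json module and also writes the counts as two-column CSV text, which spreadsheet programs generally accept. The input is a shortened copy of a getSampletype response from the Examples section; the column headings are my own choice.

```python
import csv
import io
import json

raw = ('{"method": ["getSampletype"], "sampletype": '
       '{"DNA": 13209, "specimen": 10586, "tissue": 15595, "unknown": 213}}')
data = json.loads(raw)

# Pretty-print with indentation and line breaks.
pretty = json.dumps(data, indent=4)
print(pretty)

# Write one row per sample type as CSV text.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["sampletype", "count"])
for sample_type, count in data["sampletype"].items():
    writer.writerow([sample_type, count])
csv_text = buffer.getvalue()
print(csv_text)
```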

Examples

Querying the API

getCounts

Data Portal statistics (getCounts) for the NHM, London repository:

http://data.ggbn.org/ggbn_portal/api/search?getCounts&repository=NHM, London

Optionally, you can replace spaces in URLs with %20, e.g. http://data.ggbn.org/ggbn_portal/api/search?getCounts&repository=NHM,%20London

Data Portal statistics for NMNH, Washington bird (Aves) samples:

http://data.ggbn.org/ggbn_portal/api/search?getCounts&repository=NMNH, Washington&name=Aves

getClassification

All Arthropod names and sample counts:

http://data.ggbn.org/ggbn_portal/api/search?getClassification&name=Arthropoda

Taxonomic classification and sample counts of the canine genus Canis:

http://data.ggbn.org/ggbn_portal/api/search?getClassification&name=Canis *

When searching genus names, you need to add a space and a * after the name to get the correct count.

All mammals from the United States in the NMNH, Washington collection:

http://data.ggbn.org/ggbn_portal/api/search?getClassification&name=Mammalia&country=United States&repository=NMNH, Washington

getSampletype

Count of DNA samples from Costa Rica:

http://data.ggbn.org/ggbn_portal/api/search?getSampletype&sampleType=DNA&country=Costa Rica

Sample types and counts of arthropods in the NMNH, Washington collection:

http://data.ggbn.org/ggbn_portal/api/search?getSampletype&name=Arthropoda&repository=NMNH, Washington
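
The queries above can also be run from a script rather than a browser. The following is a minimal sketch using only the Python standard library; build_url and fetch are hypothetical helper names, and it assumes the service is reachable and returns JSON as described. Unlike a browser, recent versions of Python's urllib reject URLs that contain raw spaces, so the sketch percent-encodes filter values.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://data.ggbn.org/ggbn_portal/api/search"


def build_url(method, **filters):
    """Build a query URL, encoding spaces in filter values as %20."""
    query = method
    for filter_name, value in filters.items():
        query += f"&{filter_name}={quote(str(value), safe=',*')}"
    return f"{BASE_URL}?{query}"


def fetch(method, **filters):
    """Request a query URL and parse the JSON response into Python objects."""
    with urlopen(build_url(method, **filters), timeout=30) as response:
        return json.loads(response.read())


# Count of DNA samples from Costa Rica, as in the getSampletype example above.
url = build_url("getSampletype", sampleType="DNA", country="Costa Rica")
print(url)

# With network access, the request itself would be:
# result = fetch("getSampletype", sampleType="DNA", country="Costa Rica")
```

The call to fetch is left commented out so that the block runs without a network connection.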

JSON Results

getCounts

{
    "method": "getCounts",
    "nbSamples": 5016869,
    "nbFamilies": 4460,
    "nbGenera": 25536,
    "nbSpecies": 54575,
    "nbSampleCollections": 24,
    "nbVoucherCollections": 38,
    "nbCountries": 235,
    "nbContinents": 7,
    "nbSeas": 54,
    "nbOceans": 6,
    "nbIsMarine": {
        "1": 58
    },
    "nbCollectionYear": {
        "0": 524344,
        "1005": 1,
        "1220": 1,
        "1800": 4914,
        "1818": 1079,
        "1819": 3,
        "1827": 1,
        "1830": 1,
        "1837": 2,
        "1842": 2,
        "1843": 1,
        "1845": 2,
        "1846": 4,
        "1847": 2,
        "1850": 53,
        "1851": 3,
        "1854": 1,
        "1855": 3,
        "1856": 2,
        "1858": 2,
        "1860": 8,
        "1861": 6,
        "1862": 2,
        "1865": 3,
        "1866": 3,
        "1867": 2,
        "1874": 4,
        "1875": 2,
        "1877": 1,
        "1878": 2,
        "1879": 11,
        "1880": 913,
        "1881": 1,
        "1883": 2,
        "1884": 1,
        "1885": 4,
        "1886": 3,
        "1887": 5,
        "1888": 7,
        "1889": 4,
        "1890": 3,
        "1891": 4,
        "1892": 9,
        "1893": 3,
        "1894": 8,
        "1895": 5,
        "1896": 7,
        "1897": 22,
        "1898": 14,
        "1899": 28,
        "1900": 108,
        "1901": 15,
        "1902": 22,
        "1903": 24,
        "1904": 10,
        "1905": 67,
        "1906": 9,
        "1907": 22,
        "1908": 63,
        "1909": 48,
        "1910": 40,
        "1911": 47,
        "1912": 14,
        "1913": 30,
        "1914": 19,
        "1915": 189,
        "1916": 15,
        "1917": 21,
        "1918": 14,
        "1919": 13,
        "1920": 25,
        "1921": 27,
        "1922": 19,
        "1923": 24,
        "1924": 14,
        "1925": 34,
        "1926": 28,
        "1927": 20,
        "1928": 42,
        "1929": 31,
        "1930": 31,
        "1931": 37,
        "1932": 26,
        "1933": 24,
        "1934": 43,
        "1935": 30,
        "1936": 57,
        "1937": 65,
        "1938": 68,
        "1939": 47,
        "1940": 45,
        "1941": 30,
        "1942": 27,
        "1943": 19,
        "1944": 23,
        "1945": 19,
        "1946": 38,
        "1947": 40,
        "1948": 32,
        "1949": 53,
        "1950": 90,
        "1951": 34,
        "1952": 54,
        "1953": 61,
        "1954": 55,
        "1955": 83,
        "1956": 72,
        "1957": 103,
        "1958": 80,
        "1959": 170,
        "1960": 109,
        "1961": 162,
        "1962": 176,
        "1963": 282,
        "1964": 223,
        "1965": 184,
        "1966": 233,
        "1967": 277,
        "1968": 362,
        "1969": 1201,
        "1970": 391,
        "1971": 824,
        "1972": 506,
        "1973": 1285,
        "1974": 3793,
        "1975": 3787,
        "1976": 5981,
        "1977": 5041,
        "1978": 3774,
        "1979": 8018,
        "1980": 7065,
        "1981": 9072,
        "1982": 8432,
        "1983": 5708,
        "1984": 8476,
        "1985": 8816,
        "1986": 11742,
        "1987": 12870,
        "1988": 16468,
        "1989": 17025,
        "1990": 18819,
        "1991": 20927,
        "1992": 20415,
        "1993": 22941,
        "1994": 45303,
        "1995": 27523,
        "1996": 35797,
        "1997": 35818,
        "1998": 39729,
        "1999": 37840,
        "2000": 42319,
        "2001": 42482,
        "2002": 46679,
        "2003": 55573,
        "2004": 50989,
        "2005": 58379,
        "2006": 60864,
        "2007": 67807,
        "2008": 110374,
        "2009": 106054,
        "2010": 227289,
        "2011": 118247,
        "2012": 660344,
        "2013": 941301,
        "2014": 977110,
        "2015": 292250,
        "2016": 94100,
        "2017": 53983,
        "2018": 22051,
        "2019": 7131,
        "2020": 30
    },
    "samples": {
        "DNA": 1628763,
        "culture": 25343,
        "eVoucher": 2,
        "environmental sample": 227,
        "observation": 10,
        "specimen": 2096658,
        "tissue": 1262488,
        "unknown": 3378
    }
}
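
Once parsed, the nbCollectionYear object can be summarized in a few lines. The sketch below uses a handful of entries copied from the output above and finds the year with the most records. It assumes the "0" key holds records without a usable collection year, which the output itself does not state.

```python
# A few entries copied from the nbCollectionYear object above.
nb_collection_year = {
    "0": 524344,
    "2012": 660344,
    "2013": 941301,
    "2014": 977110,
    "2015": 292250,
}

# JSON object names are always text, so convert the years to integers.
# Assumption: "0" means the collection year is missing, so it is skipped.
dated = {int(year): count for year, count in nb_collection_year.items() if year != "0"}

peak_year = max(dated, key=dated.get)
print(peak_year, dated[peak_year])
```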

getClassification (name=Stomatopoda)

{
    "method": [
        "getClassification"
    ],
    "filters": "fullScientificName_nc=Stomatopoda",
    "familia": {
        "Coronididae": 2,
        "Gonodactylidae": 56,
        "Lysiosquillidae": 7,
        "N/A": 68,
        "Nannosquillidae": 9,
        "Odontodactylidae": 2,
        "Protosquillidae": 4,
        "Pseudosquillidae": 17,
        "Squillidae": 18,
        "Tetrasquillidae": 1
    },
    "genus": {
        "Chorisquilla": 3,
        "Echinosquilla": 1,
        "Gonodactylaceus": 13,
        "Gonodactylellus": 9,
        "Gonodactylus": 14,
        "Lysiosquilla": 3,
        "Lysiosquillina": 3,
        "Meiosquilla": 1,
        "N/A": 82,
        "Neogonodactylus": 10,
        "Odontodactylus": 2,
        "Pseudosquilla": 10,
        "Pseudosquillana": 4,
        "Pullosquilla": 8,
        "Raoulserenea": 3,
        "Squilla": 17,
        "Tetrasquilla": 1
    },
    "species": {
        "Chorisquilla spinosissima": 3,
        "Gonodactylaceus falcatus": 3,
        "Gonodactylaceus randalli": 1,
        "Gonodactylellus affinis": 1,
        "Gonodactylus bredini": 2,
        "Gonodactylus childi": 1,
        "Gonodactylus smithii": 2
    },
    "classis": {
        "Malacostraca": 184
    },
    "ordo": {
        "Stomatopoda": 184
    },
    "phylum": {
        "Arthropoda": 184
    },
    "regnum": {
        "Animalia": 184
    }
}
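
As an illustration of working with a getClassification result, the sketch below takes the familia counts from the output above, totals them, and lists the three best-represented named families. Treating "N/A" as records without a family name is my reading of the output, not something the API documents here.

```python
# The familia object copied from the output above.
familia = {
    "Coronididae": 2,
    "Gonodactylidae": 56,
    "Lysiosquillidae": 7,
    "N/A": 68,
    "Nannosquillidae": 9,
    "Odontodactylidae": 2,
    "Protosquillidae": 4,
    "Pseudosquillidae": 17,
    "Squillidae": 18,
    "Tetrasquillidae": 1,
}

# The family counts add up to the 184 samples reported for the order.
total = sum(familia.values())
print(total)

# Assumption: "N/A" marks samples with no family recorded, so it is left out of the ranking.
named = {name: count for name, count in familia.items() if name != "N/A"}
top_three = sorted(named.items(), key=lambda item: item[1], reverse=True)[:3]
print(top_three)
```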

getSampletype (name=Actinopterygii)

{
    "method": [
        "getSampletype"
    ],
    "filters": "fullScientificName_nc=Actinopterygii",
    "sampletype": {
        "DNA": 13209,
        "specimen": 10586,
        "tissue": 15595,
        "unknown": 213
    }
}
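
A getSampletype result like the one above is easy to turn into percentages. This sketch hard-codes the response shown above rather than requesting it, and the variable names are illustrative. Note that "method" is a list in this response, so the method name is its first item.

```python
# The getSampletype response shown above, as a Python dictionary.
response = {
    "method": ["getSampletype"],
    "filters": "fullScientificName_nc=Actinopterygii",
    "sampletype": {"DNA": 13209, "specimen": 10586, "tissue": 15595, "unknown": 213},
}

method_name = response["method"][0]
counts = response["sampletype"]
total = sum(counts.values())

# Share of each sample type among all counted samples.
shares = {sample_type: count / total for sample_type, count in counts.items()}
for sample_type, share in shares.items():
    print(f"{sample_type}: {counts[sample_type]} ({share:.1%})")
```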