@blahah
Last active April 8, 2017 16:23
get CC-BY 3 or 4 papers from CrossRef that have XML fulltext available (example URLs / bash pipelines)
# get count of fulltext XML papers by license
http://api.crossref.org/v1/works?filter=has-full-text:true,full-text.type:text/xml&facet=t
# for a given license, get the paper count per publisher, e.g.
http://api.crossref.org/v1/works?filter=has-full-text:true,full-text.type:text/xml,license.url:http://creativecommons.org/licenses/by/3.0/&facet=t
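The two URLs above differ only in the filter list, so they can be generated rather than typed by hand. A minimal sketch: `works_url` is a hypothetical helper name (not part of the CrossRef API), and the CC-BY 4.0 license URL is assumed to be recorded in the same form as the 3.0 one used above.

```shell
# Hypothetical helper: build a CrossRef works URL for papers with XML fulltext.
#   $1 = license URL to filter on (optional, pass "" to skip)
#   $2 = extra query string to append (optional), e.g. "facet=t"
works_url() {
  local filter="has-full-text:true,full-text.type:text/xml"
  if [ -n "${1:-}" ]; then
    filter="${filter},license.url:$1"
  fi
  local url="http://api.crossref.org/v1/works?filter=${filter}"
  if [ -n "${2:-}" ]; then
    url="${url}&$2"
  fi
  printf '%s\n' "$url"
}

# Print the per-license facet URLs for CC-BY 3.0 and 4.0.
# This only prints the URLs; it makes no network requests.
for version in 3.0 4.0; do
  works_url "http://creativecommons.org/licenses/by/${version}/" "facet=t"
done
```

Calling `works_url "" "facet=t"` reproduces the first URL above (all licenses).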
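# for a given license and publisher, get the URLs of the first 10 papers and download them, e.g.
# (no spaces around "=" in the assignment; the space in the publisher name is percent-encoded)
URL="http://api.crossref.org/v1/works?filter=has-full-text:true,full-text.type:text/xml,license.url:http://creativecommons.org/licenses/by/3.0/,publisher-name:Elsevier%20BV&rows=10"
# jq -r prints raw URLs without quotes; wget -i - reads the URL list from stdin
curl -s "$URL" | jq -r ".message.items[].link[].URL" | grep 'text/xml' | wget -i -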
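The same pipeline can be wrapped in a function so that both CC-BY versions and other publishers are covered by changing arguments. This is a sketch under stated assumptions, not a tested downloader. `cc_by_filter` and `download_cc_by_xml` are hypothetical names. The sketch assumes `curl`, `jq` and `wget` are installed. It lets curl percent-encode the query via `-G --data-urlencode`. It also selects links by their `content-type` field rather than grepping the URL text, which is a swapped-in technique that assumes each link object carries that field.

```shell
# Hypothetical helper: filter string for one CC-BY version and publisher.
#   $1 = CC-BY version (e.g. 3.0 or 4.0), $2 = publisher name
cc_by_filter() {
  printf 'has-full-text:true,full-text.type:text/xml,license.url:http://creativecommons.org/licenses/by/%s/,publisher-name:%s\n' "$1" "$2"
}

# Hypothetical helper: download XML fulltexts into the current directory.
#   $1 = CC-BY version, $2 = publisher name, $3 = number of papers (default 10)
download_cc_by_xml() {
  # -G sends the data as a GET query string; --data-urlencode encodes the
  # space in the publisher name (and other reserved characters) for us.
  curl -sG "http://api.crossref.org/v1/works" \
    --data-urlencode "filter=$(cc_by_filter "$1" "$2")" \
    --data-urlencode "rows=${3:-10}" |
    # Keep only links declared as text/xml, then print their raw URLs.
    jq -r '.message.items[].link[] | select(.["content-type"] == "text/xml") | .URL' |
    # wget -i - reads the list of URLs from standard input.
    wget -i -
}
```

For example, `download_cc_by_xml 3.0 "Elsevier BV" 10` corresponds to the pipeline above, and `download_cc_by_xml 4.0 "Elsevier BV" 10` would do the same for CC-BY 4.0. Some publishers may require an API key or token before serving the fulltext, which this sketch does not handle.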