@max-mapper
Created August 9, 2017 01:30
get all DOIs on DataCite
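# Install the command-line tools used below (requires Node.js and npm)
npm install gunzip-maybe xml-json jsonfilter nugget -g
# Fetch the sitemap index, pull out each sitemap URL, and download them all into ./datacite
curl "https://search.datacite.org/sitemaps/sitemap.xml.gz" | gunzip-maybe | xml-json sitemapindex | jsonfilter sitemap.*.loc | xargs nugget -d datacite
# Unpack each downloaded sitemap, keep only the URLs containing "works", and append them to urls.txt
ls datacite | xargs -I {} sh -c "cat datacite/{} | gunzip-maybe | xml-json urlset | jsonfilter url.*.loc | grep works" >> urls.txt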
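
The commands above leave a list of page URLs in urls.txt rather than bare DOIs. A minimal sketch of one way to reduce them to DOIs follows. It assumes each line looks like `https://search.datacite.org/works/<DOI>`, which the gist does not confirm. `extract_dois` is a hypothetical helper name, and `dois.txt` is an illustrative output file.

```shell
# Hypothetical helper: read URLs on stdin and print the unique DOIs.
# ASSUMPTION: the DOI is everything after the "/works/" path segment.
# Lines without "/works/" directly after the host are dropped.
# Percent-encoded characters are left as-is.
extract_dois() {
  sed -n 's|^https*://[^/]*/works/||p' | sort -u
}

# Usage (assumes urls.txt was produced by the commands above):
#   extract_dois < urls.txt > dois.txt
```

`sort -u` is there because the same URL could appear more than once if the last command is re-run, since it appends to urls.txt with `>>`.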