@slint
Last active November 8, 2018 12:48
# Fetch events from DataCite
curl -G https://api.datacite.org/events \
-d source-id=crossref \
-d relation-type-id=references \
-d prefix=10.5281 > datacite-0.json
# Get next page from ".links.next" link
curl "$(jq -r .links.next datacite-0.json)" > datacite-1.json
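# Sketch (not part of the original workflow): follow ".links.next" until no
# further page is advertised. Assumes the last page has a null or missing
# ".links.next"; fetch_datacite_pages is a hypothetical helper name.
fetch_datacite_pages() (
  i=0
  while :; do
    # "// empty" turns a null or missing link into an empty string
    next=$(jq -r '.links.next // empty' "datacite-$i.json")
    [ -n "$next" ] || break
    i=$((i + 1))
    curl -sS "$next" > "datacite-$i.json"
  done
)
# Usage, once datacite-0.json exists: fetch_datacite_pages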
# Fetch events from Crossref
curl -G https://api.eventdata.crossref.org/v1/events \
-d source=crossref \
-d relation-type=references \
-d obj-id.prefix=10.5281 > crossref-0.json
# Fetch next page using cursor
curl -G https://api.eventdata.crossref.org/v1/events \
-d "cursor=$(jq -r '.message["next-cursor"]' crossref-0.json)" \
-d source=crossref \
-d relation-type=references \
-d obj-id.prefix=10.5281 > crossref-1.json
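# Sketch (not part of the original workflow): keep requesting pages while a
# cursor is returned. Assumes the final page carries a null "next-cursor";
# fetch_crossref_pages is a hypothetical helper name.
fetch_crossref_pages() (
  i=0
  while :; do
    # "// empty" turns a null or missing cursor into an empty string
    cursor=$(jq -r '.message["next-cursor"] // empty' "crossref-$i.json")
    [ -n "$cursor" ] || break
    i=$((i + 1))
    curl -sS -G https://api.eventdata.crossref.org/v1/events \
      --data-urlencode "cursor=$cursor" \
      -d source=crossref \
      -d relation-type=references \
      -d obj-id.prefix=10.5281 > "crossref-$i.json"
  done
)
# Usage, once crossref-0.json exists: fetch_crossref_pages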
# Compare totals
jq .meta.total datacite-0.json
jq '.message["total-results"]' crossref-0.json
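# Sketch: the reported totals can also be checked against the number of events
# actually downloaded, to confirm that no page is missing. count_fetched is a
# hypothetical helper; its arguments are a jq path to the event array,
# followed by the page files.
count_fetched() (
  path=$1
  shift
  # Slurp all pages, take the length of each event array, and sum them
  jq -s "[.[] | $path | length] | add" "$@"
)
# Usage: count_fetched .data datacite-*.json
#        count_fetched .message.events crossref-*.json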
# Normalize results to a [id, subj_id, obj_id, occurred_at] tuple
jq -r '.[] | .data[] | [.id, (.relationships.subj.data.id | ascii_downcase), (.relationships.obj.data.id | ascii_downcase), .attributes["occurred-at"][:10] ] | @csv' -s datacite-*.json > datacite-unique.csv
sort -u -o datacite-unique.csv datacite-unique.csv
jq -r '.[] | .message.events[] | [.id, (.subj_id | ascii_downcase), (.obj_id | ascii_downcase), .occurred_at[:10]] | @csv' -s crossref-*.json > crossref-unique.csv
sort -u -o crossref-unique.csv crossref-unique.csv
# Show links that are only available in Crossref
comm -2 -3 crossref-unique.csv datacite-unique.csv
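# Sketch of two related comparisons, assuming both CSV files were sorted as
# above. The helper names are hypothetical. The second helper drops the event
# id (column 1) before comparing, in case the two APIs assign different ids to
# the same link; whether they do is an assumption worth checking. It also
# assumes no field contains a comma, since cut does not parse CSV quoting.
only_in_datacite() {
  # Rows present in datacite-unique.csv but not in crossref-unique.csv
  comm -1 -3 crossref-unique.csv datacite-unique.csv
}
links_only_in_crossref() (
  tmp=$(mktemp -d)
  # Keep [subj_id, obj_id, occurred_at] only, then sort again for comm
  cut -d, -f2- crossref-unique.csv | sort -u > "$tmp/crossref"
  cut -d, -f2- datacite-unique.csv | sort -u > "$tmp/datacite"
  comm -2 -3 "$tmp/crossref" "$tmp/datacite"
  rm -r "$tmp"
)
# Usage: only_in_datacite
#        links_only_in_crossref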