Last active December 27, 2023 13:29
Fetch Zenodo stats for communities and queries
#!/bin/bash
# To use this script you need to have "curl" and "jq" installed.
COMMUNITY_ID="community_id"
OUTPUT_CSV="${COMMUNITY_ID}_community_stats.csv"
# Create CSV file header
echo "URL,DOI,Title,PublicationDate,Views,Downloads" > "${OUTPUT_CSV}"
# Download all records (including multiple versions) from the community (max 10k records)
curl -s -G "https://zenodo.org/api/records/" -d "size=10000" -d "all_versions=true" -d "communities=${COMMUNITY_ID}" \
`# Process with jq to extract the required fields` \
| jq -r '.hits.hits[] | [.links.self, .metadata.doi, .metadata.title, .metadata.publication_date, .stats.views, .stats.downloads] | @csv' \
>> "${OUTPUT_CSV}"
#!/bin/bash
# To use this script you need to have "curl" and "jq" installed.
QUERY="grants.code:12345"
OUTPUT_CSV="[${QUERY}]_query_stats.csv"
# Create CSV file header
echo "URL,DOI,Title,PublicationDate,Views,Downloads" > "${OUTPUT_CSV}"
# Download all records (including multiple versions) for the query (max 10k records)
curl -s -G "https://zenodo.org/api/records/" -d "size=10000" -d "all_versions=true" -d "q=${QUERY}" \
`# Process with jq to extract the required fields` \
| jq -r '.hits.hits[] | [.links.self, .metadata.doi, .metadata.title, .metadata.publication_date, .stats.views, .stats.downloads] | @csv' \
>> "${OUTPUT_CSV}"
#!/bin/bash
# To use this script you need to have "curl" and "jq" installed.
USER_ID="12345"
OUTPUT_CSV="${USER_ID}_user_stats.csv"
# Create CSV file header
echo "URL,DOI,Title,PublicationDate,Views,Downloads" > "${OUTPUT_CSV}"
# Download all records (including multiple versions) for the user (max 10k records)
curl -s -G "https://zenodo.org/api/records/" -d "size=10000" -d "all_versions=true" -d "q=owners:${USER_ID}" \
`# Process with jq to extract the required fields` \
| jq -r '.hits.hits[] | [.links.self, .metadata.doi, .metadata.title, .metadata.publication_date, .stats.views, .stats.downloads] | @csv' \
>> "${OUTPUT_CSV}"
I managed to parse some communities that way (most of the smaller ones, anyway), but not all. For example, this:
curl -s -G "https://zenodo.org/api/records?communities=${COMMUNITY_ID}" -d "size=10000" -d "all_versions=true" | jq -r '.hits.hits[] | [.metadata.doi, .metadata.title, .stats.unique_views, .stats.unique_downloads, .metadata.communities[][]] |@csv' >> "${OUTPUT_CSV}"
works with community 'lory_unilu_tf' but not 'lory_unilu'. The first is a smaller community, the second a bigger one. So far the biggest community I have managed had about 500 records; the request fails at 600 records. I've tried leaving out 'metadata.title' in the hope that some wonky character was generating the error, but that has not helped.
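Since the failures seem to depend on community size, one thing that might be worth trying is to request the records in smaller pages rather than in a single size=10000 call. The following is only a sketch and has not been run against the live API: it assumes the records endpoint accepts a `page` parameter alongside `size`, and the names `PAGE_SIZE`, `MAX_PAGES`, `fetch_page` and `fetch_community_stats` are made up for illustration. It reuses the jq fields from the scripts above.

```shell
#!/bin/bash
# Sketch only: fetch a community page by page instead of in one large request.
# Assumptions (not part of the original scripts): the "page" query parameter,
# and the hypothetical PAGE_SIZE / MAX_PAGES settings below.
PAGE_SIZE="${PAGE_SIZE:-100}"
MAX_PAGES="${MAX_PAGES:-100}"   # 100 pages x 100 records = the 10k cap noted above

# Print one page of records as CSV rows. Prints nothing if the page has no hits.
fetch_page() {
  local community_id="$1" page="$2"
  curl -s -G "https://zenodo.org/api/records" \
    -d "communities=${community_id}" \
    -d "all_versions=true" \
    -d "size=${PAGE_SIZE}" \
    -d "page=${page}" \
  | jq -r '.hits.hits[]? | [.links.self, .metadata.doi, .metadata.title, .metadata.publication_date, .stats.views, .stats.downloads] | @csv'
}

# Print the CSV header, then rows from successive pages until a page is empty.
fetch_community_stats() {
  local community_id="$1" page=1 rows
  echo "URL,DOI,Title,PublicationDate,Views,Downloads"
  while [ "${page}" -le "${MAX_PAGES}" ]; do
    rows="$(fetch_page "${community_id}" "${page}")"
    [ -z "${rows}" ] && break
    printf '%s\n' "${rows}"
    page=$((page + 1))
  done
}
```

After sourcing the file, it could be used like `fetch_community_stats "lory_unilu" > lory_unilu_community_stats.csv`. The `[]?` in the jq filter is there so that an error response without a `hits` field ends the loop quietly instead of aborting with a jq error; whether paging actually avoids the failure is something I have not verified.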
Hi, removing the forward slash (/) and reducing the size does not help in my case: the original error is gone, but the wrong documents are fetched. The only workaround so far is to pack all options into the URL:
curl -G "https://zenodo.org/api/records?communities=operaseu&all_versions=true&size=10000" \