Last active
February 25, 2025 20:42
ERA Open Access report
| year | count | community | collection | collection url |
|---|---|---|---|---|
| 2025 | 22 | The department of Cat | The annals of 'Cat International' | http://era.lvh.me:3000/communities/16ebf0dd-19f0-4ee7-a8f5-dfd533652a81/collections/786ee7fe-c345-4a92-8a3e-833724737f62 |
| 2025 | 2 | The department of Cat | Theses about cats | http://era.lvh.me:3000/communities/16ebf0dd-19f0-4ee7-a8f5-dfd533652a81/collections/618912fe-0d3f-470b-bb99-1ee7bacd69a3 |
| 2025 | 22 | The department of Unicorn | The annals of 'Unicorn International' | http://era.lvh.me:3000/communities/d49ff70a-24fd-42a5-97d0-ac1c7747f56e/collections/99df8858-f7bc-4c25-9dc2-cc7181ced3fc |
| 2025 | 22 | Special reports about dogs | The annals of 'Dog International' | http://era.lvh.me:3000/communities/0917355b-39a7-4484-bdc7-43e9a6a0ee39/collections/2b8f986f-5123-4148-82fb-56794da56cc9 |
| 2025 | 2 | Special reports about dogs | Theses about dogs | http://era.lvh.me:3000/communities/0917355b-39a7-4484-bdc7-43e9a6a0ee39/collections/61618a4e-16a7-4814-bd23-80afbc38f396 |
| 2025 | 2 | The department of Unicorn | Theses about unicorns | http://era.lvh.me:3000/communities/d49ff70a-24fd-42a5-97d0-ac1c7747f56e/collections/cc41166d-9f59-4944-aa7c-5bc4cd57bed1 |
| 2025 | 22 | Special reports about hamburgers | The annals of 'Hamburger International' | http://era.lvh.me:3000/communities/c7e8e2c1-5448-4277-ae63-2d0158bef0a4/collections/a45f7a22-b6b3-4cc0-bbab-7de707b2ca01 |
| 2025 | 2 | Special reports about hamburgers | Theses about hamburgers | http://era.lvh.me:3000/communities/c7e8e2c1-5448-4277-ae63-2d0158bef0a4/collections/242b3e37-3ebd-4a4e-89bb-780b1f1d8697 |
| 2025 | 22 | The department of Librarian | The annals of 'Librarian International' | http://era.lvh.me:3000/communities/a9b73242-610a-4939-99c5-8aaed1e3b2f6/collections/3a1db360-5368-4985-8d04-90051f7ccd5a |
| 2025 | 2 | The department of Librarian | Theses about librarians | http://era.lvh.me:3000/communities/a9b73242-610a-4939-99c5-8aaed1e3b2f6/collections/7f2ac0dd-64d4-48ae-97dd-6e0e02d81048 |
| 2000 | 3 | Special reports about dogs | The annals of 'Dog International' | http://era.lvh.me:3000/communities/0917355b-39a7-4484-bdc7-43e9a6a0ee39/collections/2b8f986f-5123-4148-82fb-56794da56cc9 |

    CSV.open('open_access.csv', 'wb') do |csv|
      csv << ['year', 'count', 'community', 'collection', 'collection url']
      open_access_items = Item.select(:member_of_paths, :record_created_at)
                              .where(visibility: JupiterCore::VISIBILITY_PUBLIC)
                              .group_by { |item| item.record_created_at.year } # { year => [items] }
      open_access_items.each do |year, items|
        items.map(&:member_of_paths).flatten.tally.each do |member_of_path, count|
          community_id, collection_id = member_of_path.split('/') # path is "community_id/collection_id"
          csv << [year, count, Community.find(community_id).title, Collection.find(collection_id).title,
                  Rails.application.routes.url_helpers.community_collection_url(community_id, collection_id)]
        end
      end
    end
Author
Looks good! I had a quick look at this as my day is ending; a deeper dive could be done.
I think a final tally of the total item count by year (across all collections) may be beneficial, although that could easily be done after the fact in Sheets/Excel if needed.
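For illustration, a per-year total could also be computed in plain Ruby from the report rows. This is only a sketch: `totals_by_year` and `sample_rows` are hypothetical names, and the rows are a small hand-copied subset of the table above. Note that the script tallies one entry per community/collection path, so an item that belongs to several collections would presumably be counted once per collection in such a sum.

```ruby
# Sum the per-collection counts into a single total per year.
# `rows` is assumed to be an array of [year, count, ...] rows,
# in the same column order the script writes to open_access.csv.
def totals_by_year(rows)
  rows.each_with_object(Hash.new(0)) do |(year, count, *_rest), totals|
    totals[year.to_i] += count.to_i
  end
end

# Hypothetical sample: three rows copied from the report above.
sample_rows = [
  ['2025', '22', 'The department of Cat', "The annals of 'Cat International'"],
  ['2025', '2', 'The department of Cat', 'Theses about cats'],
  ['2000', '3', 'Special reports about dogs', "The annals of 'Dog International'"]
]

totals_by_year(sample_rows).each { |year, total| puts "#{year}: #{total}" }
```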
Some optimisation could be done, though it may not be worthwhile for a one-time script. One thing that comes to mind is using ActiveRecord's in_batches/find_in_batches for line 3/6; however, that would require testing (and setting the batch size, as the default is 1000, which wouldn't test well in the test environment).
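A rough sketch of what batched tallying might look like, so the idea is concrete. It is not the script's actual code: `FakeItem` and `tally_paths_in_batches` are hypothetical, and `each_slice` stands in for `find_in_batches(batch_size: ...)` so the example runs without Rails. In the real query I believe the `select` would also need to include the primary key for batching to work, which is part of what would need testing.

```ruby
# Hypothetical stand-in for an Item row; the real script loads these via ActiveRecord.
FakeItem = Struct.new(:member_of_paths, :record_created_at)

# Tally community/collection paths per year, one batch at a time, so that
# only one batch of records needs to be held in memory at once.
# `each_slice` is a stand-in for ActiveRecord's `find_in_batches`.
def tally_paths_in_batches(items, batch_size: 1000)
  counts = Hash.new { |by_year, year| by_year[year] = Hash.new(0) }
  items.each_slice(batch_size) do |batch|
    batch.each do |item|
      year = item.record_created_at.year
      item.member_of_paths.each { |path| counts[year][path] += 1 }
    end
  end
  counts
end

# Made-up paths and dates, purely for illustration.
items = [
  FakeItem.new(['cat/annals'], Time.new(2025, 1, 10)),
  FakeItem.new(['cat/annals', 'cat/theses'], Time.new(2025, 2, 1)),
  FakeItem.new(['dog/annals'], Time.new(2000, 6, 30))
]

# A small batch size forces more than one batch even with little test data.
p tally_paths_in_batches(items, batch_size: 2)
```

Passing the batch size as a parameter is one way to keep the default for production while exercising the multi-batch path in a test environment with only a handful of records.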
Given that this is a one-time script and the time constraint, I don't see any issues with this as is.
record_created_at is the date created in ERA - see https://docs.google.com/document/d/1-d0eM0kn79h1DC6X4aO_0IRzq7qKoDcymqdcW7RASV8/edit?tab=t.0#heading=h.q9yzq2h9giy5