Created
April 28, 2021 20:05
Check Resource Counts
# frozen_string_literal: true

source 'https://rubygems.org'

git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }

# colorize is required by the Rake task for colored progress messages.
gem 'colorize'
gem 'mechanize'
gem 'progress_bar'
gem 'terminal-table'

require 'colorize'
require 'mechanize'
require 'progress_bar'
require 'terminal-table'

@base_url = 'https://dlmenetwork.org/library/browse'
@agent = Mechanize.new

namespace :test do
  desc 'Test landing page item counts'
  task :landing_page do
    rows = []

    # Fetch the landing page first so the progress bar can be sized from it.
    @page = @agent.get(@base_url)
    categories = @page.search("//div[contains(@class, 'category')]")
    bar = ProgressBar.new(categories.size)

    categories.each do |category|
      # Extract the count, label, and link shown on the landing page.
      count = category.search('small').text.gsub(/items?/, '').strip.to_i
      label = category.search('span[@class="title"]').text
      link = category.search('a').first

      # Follow the link and read the count shown on the category page.
      view = @agent.click(link)
      page_count = view.search('small').text.gsub(/items?/, '').strip.to_i
      difference = page_count - count

      bar.puts "Checking #{label}".green
      bar.increment!
      rows << [label, count, page_count, difference]
    end

    table = Terminal::Table.new headings: ['Category', 'Index Count', 'Page Count', 'Difference'], rows: rows
    puts table
  end
end

+--------------------------------------------------------------+-------------+------------+------------+
| Category                                                     | Index Count | Page Count | Difference |
+--------------------------------------------------------------+-------------+------------+------------+
| Manuscripts from the Free Library of Philadelphia            | 87          | 87         | 0          |
| Abdul-Hamid II Books and Serials, Library of Congress        | 321         | 3414       | 3093       |
| Manuscripts from the University of Pennsylvania Libraries    | 692         | 692        | 0          |
| Persian Language Rare Materials, Library of Congress         | 5232        | 5232       | 0          |
| Manuscripts from Haveford College                            | 32          | 32         | 0          |
| Manuscripts from Bryn Mawr College                           | 22          | 22         | 0          |
| Rare Books & Manuscripts, Columbia University Library        | 220         | 220        | 0          |
| Manuscripts from the Library Company of Philadelphia         | 4           | 4          | 0          |
| Sakip Sabanci Museum's Emirgân Archive                       | 305         | 3515       | 3210       |
| Abdul Hamid II Photograph Collection, Library of Congress    | 1817        | 1818       | 1          |
| Manuscripts from the American Philosophical Society          | 4           | 4          | 0          |
| Libraries of the Greek & Armenian Patriarchates in Jerusalem | 1002        | 1002       | 0          |
| Manuscripts from the Philadelphia Museum of Art              | 1           | 1          | 0          |
| Muhammad Ali Eltaher Collection, Library of Congress         | 105         | 5232       | 5127       |
| Manuscripts from St. Catherine's Monastery, Mt. Sinai        | 1687        | 1687       | 0          |
| Medical Manuscripts                                          | 189         | 31386      | 31197      |
| Qur'an Manuscripts                                           | 186         | 141285     | 141099     |
| Persian Manuscripts                                          | 677         | 677        | 0          |
| Manuscripts in Naskh Script                                  | 1472        | 69247      | 67775      |
| Mathematical Manuscripts                                     | 183         | 31386      | 31203      |
| Manuscripts in Muhaqqaq Script                               | 18          | 69247      | 69229      |
| Cairo Genizah Manuscripts                                    | 22743       | 31386      | 8643       |
| Richard B. Parker Nile Watercraft Photographs                | 75          | 5705       | 5630       |
| Arabic Manuscripts                                           | 7211        | 7211       | 0          |
| Manuscripts in Riqa Script                                   | 10          | 69247      | 69237      |
| Émile Béchard's Oriental Studies Photographs                 | 90          | 5705       | 5615       |
| Astronomy Manuscripts                                        | 130         | 31386      | 31256      |
| Ottoman Turkish Manuscripts                                  | 182         | 182        | 0          |
| Manuscripts in Thuluth Script                                | 84          | 69247      | 69163      |
| Manuscripts in Nastaliq Script                               | 515         | 69247      | 68732      |
| Turkish Painting: Ottoman Reformation to the Republic        | 492         | 3515       | 3023       |
| Abidin Dino Archive                                          | 2118        | 141285     | 139167     |
| Hassan Fathy Architectural Archives                          | 56          | 3931       | 3875       |
| Ramses Wissa Wassef Architectural Drawings                   | 48          | 5705       | 5657       |
| K.A.C. Creswell Photographs of Islamic Architecture          | 1006        | 5705       | 4699       |
+--------------------------------------------------------------+-------------+------------+------------+
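
The Rake task reads each count by stripping the word "items" from the text of a `small` element and calling `to_i`. As a minimal sketch, and not part of the original gist, the hypothetical helper `parse_item_count` below pulls the first integer out of such a label and tolerates thousands separators. The label formats it assumes ("87 items", "1,002 items") are guesses and have not been checked against the site's markup.

```ruby
# Hypothetical helper (not in the original gist): extract the first
# integer from a label such as "87 items" or "1,002 items".
# Returns nil when the text contains no digits at all.
def parse_item_count(text)
  # Match a digit followed by any run of digits or commas.
  match = text.to_s.match(/\d[\d,]*/)
  return nil unless match

  # Drop thousands separators before converting to an Integer.
  match[0].delete(',').to_i
end

puts parse_item_count('87 items')     # prints 87
puts parse_item_count('1,002 items')  # prints 1002
puts parse_item_count('1 item')       # prints 1
p parse_item_count('no count here')   # prints nil
```

Returning nil rather than 0 for unparseable text would let the task tell "no count found" apart from a genuine count of zero; whether that distinction matters for this report is a design choice left open here.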