Last active
December 23, 2018 04:01
notes from a meeting i had with hugo around may 2018 (just after bootcamp) to discuss ideas i had for a few different apps
to get all artists => https://pitchfork.com/artists/by/alpha/(a..z)/
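a minimal sketch of how the (a..z) part of that url could be expanded in ruby. the method name artist_index_urls is made up for illustration; only the url pattern comes from the note above.

```ruby
# Hypothetical helper: builds one artist index URL per letter, a through z.
# The URL pattern is the one from the note; the method name is an assumption.
def artist_index_urls
  ("a".."z").map do |letter|
    "https://pitchfork.com/artists/by/alpha/#{letter}/"
  end
end

urls = artist_index_urls
puts urls.first
puts urls.last
```

each of these urls could then be passed to driver.navigate.to in turn.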
to get specific elements on each page =>
[1] pry(main)> driver.find_element(:class, "score")
=> #<Selenium::WebDriver::Element:0x77d43895b7210db2 id="0.5838103215094534-1">
[2] pry(main)> driver.find_element(:class, "score").text
=> "6.8"
[3] pry(main)> driver.find_element(:class, "artist-links artist-list single-album-tombstone__artist-links").text
\
Selenium::WebDriver::Error::InvalidSelectorError: invalid selector: Compound class names not permitted
(Session info: chrome=66.0.3359.139)
(Driver info: chromedriver=2.38.552518 (183d19265345f54ce39cbb94cf81ba5f15905011),platform=Mac OS X 10.13.4 x86_64)
from /Users/harrisonmalone/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/selenium-webdriver-3.11.0/lib/selenium/webdriver/remote/response.rb:69:in `assert_ok'
[4] pry(main)> driver.find_element(:class, "artist-list").text
=> "Middle Kids"
[5] pry(main)> driver.find_element(:class, "single-album-tombstone__meta-year").text
=> "• 2018"
[6] pry(main)> year = driver.find_element(:class, "single-album-tombstone__meta-year").text
=> "• 2018"
[7] pry(main)> year
=> "• 2018"
[8] pry(main)> year.gsub!("• ","")
=> "2018"
[9] pry(main)> year
=> "2018"
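two things from the session above, sketched without a browser. the :class locator only accepts a single class name, so one way around the "Compound class names not permitted" error is a css selector that chains the classes. also, gsub! returns nil when nothing was replaced, so a non-mutating extraction is a bit safer for the year. both helper names below are made up for illustration.

```ruby
# Hypothetical helper (not part of selenium): turns a space-separated
# class attribute into a CSS selector, e.g. "a b" becomes ".a.b".
def css_for_classes(class_attr)
  class_attr.split.map { |name| ".#{name}" }.join
end

# Hypothetical helper: pulls the four-digit year out of text like "• 2018".
# Returns nil only when the text contains no four-digit run at all.
def clean_year(text)
  text[/\d{4}/]
end

selector = css_for_classes("artist-links artist-list single-album-tombstone__artist-links")
puts selector
puts clean_year("• 2018")

# With a live driver the selector could be used like this:
#   driver.find_element(:css, selector).text
```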
to get the albums of an artist =>
[10] pry(main)> array[0]
=> #<Selenium::WebDriver::Element:0x..f9ba00174faee20d6 id="0.7842375297075637-2">
[11] pry(main)> band_urls = []
=> []
[12] pry(main)> array[0].find_atribute("href")
NoMethodError: undefined method `find_atribute' for #<Selenium::WebDriver::Element:0x00007faaf797dae8>
from (pry):12:in `<main>'
[13] pry(main)> array[0].atribute("href")
NoMethodError: undefined method `atribute' for #<Selenium::WebDriver::Element:0x00007faaf797dae8>
Did you mean? attribute
from (pry):13:in `<main>'
[14] pry(main)> array[0].attribute("href")
=> "https://pitchfork.com/reviews/albums/iceage-beyondless/"
[15] pry(main)> band_urls << array[0].attribute("href")
=> ["https://pitchfork.com/reviews/albums/iceage-beyondless/"]
[16] pry(main)> band_urls
=> ["https://pitchfork.com/reviews/albums/iceage-beyondless/"]
[17] pry(main)> urls = array.each do |url|
[17] pry(main)* array[url].attribute("href")
[17] pry(main)* end
TypeError: no implicit conversion of Selenium::WebDriver::Element into Integer
from (pry):18:in `[]'
[18] pry(main)> array
=> [#<Selenium::WebDriver::Element:0x..f9ba00174faee20d6 id="0.7842375297075637-2">,
#<Selenium::WebDriver::Element:0x1e5356c3dcf6656 id="0.7842375297075637-3">,
#<Selenium::WebDriver::Element:0x2b89a3de71cc8c64 id="0.7842375297075637-4">,
#<Selenium::WebDriver::Element:0x..fb0caa6f3cc3b78a id="0.7842375297075637-5">]
[19] pry(main)> array.each do |url|
[19] pry(main)* url.attribute("href")
[19] pry(main)* end
=> [#<Selenium::WebDriver::Element:0x..f9ba00174faee20d6 id="0.7842375297075637-2">,
#<Selenium::WebDriver::Element:0x1e5356c3dcf6656 id="0.7842375297075637-3">,
#<Selenium::WebDriver::Element:0x2b89a3de71cc8c64 id="0.7842375297075637-4">,
#<Selenium::WebDriver::Element:0x..fb0caa6f3cc3b78a id="0.7842375297075637-5">]
[20] pry(main)> array.each do |url|
[20] pry(main)* url.attribute("href")
[20] pry(main)* end
=> [#<Selenium::WebDriver::Element:0x..f9ba00174faee20d6 id="0.7842375297075637-2">,
#<Selenium::WebDriver::Element:0x1e5356c3dcf6656 id="0.7842375297075637-3">,
#<Selenium::WebDriver::Element:0x2b89a3de71cc8c64 id="0.7842375297075637-4">,
#<Selenium::WebDriver::Element:0x..fb0caa6f3cc3b78a id="0.7842375297075637-5">]
[21] pry(main)> array.each do |url|
[21] pry(main)* urls = url.attribute("href")
[21] pry(main)* band_array << urls
[21] pry(main)* end
NameError: undefined local variable or method `band_array' for main:Object
Did you mean? band_urls
from (pry):29:in `block in <main>'
[22] pry(main)> array.each do |url|
[22] pry(main)* urls = url.attribute("href")
[22] pry(main)* band_urls << urls
[22] pry(main)* end
=> [#<Selenium::WebDriver::Element:0x..f9ba00174faee20d6 id="0.7842375297075637-2">,
#<Selenium::WebDriver::Element:0x1e5356c3dcf6656 id="0.7842375297075637-3">,
#<Selenium::WebDriver::Element:0x2b89a3de71cc8c64 id="0.7842375297075637-4">,
#<Selenium::WebDriver::Element:0x..fb0caa6f3cc3b78a id="0.7842375297075637-5">]
[23] pry(main)> band_urls
=> ["https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/19806-iceage-plowing-into-the-field-of-love/",
"https://pitchfork.com/reviews/albums/17623-iceage-youre-nothing/",
"https://pitchfork.com/reviews/albums/15576-new-brigade/"]
[24] pry(main)> array.each do |url|
[24] pry(main)* band_urls << url.attribute("href")
[24] pry(main)* end
=> [#<Selenium::WebDriver::Element:0x..f9ba00174faee20d6 id="0.7842375297075637-2">,
#<Selenium::WebDriver::Element:0x1e5356c3dcf6656 id="0.7842375297075637-3">,
#<Selenium::WebDriver::Element:0x2b89a3de71cc8c64 id="0.7842375297075637-4">,
#<Selenium::WebDriver::Element:0x..fb0caa6f3cc3b78a id="0.7842375297075637-5">]
[25] pry(main)> band_urls
=> ["https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/19806-iceage-plowing-into-the-field-of-love/",
"https://pitchfork.com/reviews/albums/17623-iceage-youre-nothing/",
"https://pitchfork.com/reviews/albums/15576-new-brigade/",
"https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/19806-iceage-plowing-into-the-field-of-love/",
"https://pitchfork.com/reviews/albums/17623-iceage-youre-nothing/",
"https://pitchfork.com/reviews/albums/15576-new-brigade/"]
[26] pry(main)> new_band_url = array.map do |url|
[26] pry(main)* url.attribute("href")
[26] pry(main)* end
=> ["https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/19806-iceage-plowing-into-the-field-of-love/",
"https://pitchfork.com/reviews/albums/17623-iceage-youre-nothing/",
"https://pitchfork.com/reviews/albums/15576-new-brigade/"]
[27] pry(main)> new_band_url
=> ["https://pitchfork.com/reviews/albums/iceage-beyondless/",
"https://pitchfork.com/reviews/albums/19806-iceage-plowing-into-the-field-of-love/",
"https://pitchfork.com/reviews/albums/17623-iceage-youre-nothing/",
"https://pitchfork.com/reviews/albums/15576-new-brigade/"]
[28] pry(main)>
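the takeaway from the session above, as a sketch that runs without a browser: each returns the array it was called on, so the block's values are thrown away, while map returns a new array of the block's values. fake_element stands in for Selenium::WebDriver::Element and hrefs is a made-up helper name.

```ruby
# Hypothetical helper: collects the href of every element with map.
# Works with anything that responds to attribute("href").
def hrefs(elements)
  elements.map { |element| element.attribute("href") }
end

# Stand-in for Selenium::WebDriver::Element (an assumption for this sketch).
fake_element = Struct.new(:href) do
  def attribute(name)
    name == "href" ? href : nil
  end
end

array = [
  fake_element.new("https://pitchfork.com/reviews/albums/iceage-beyondless/"),
  fake_element.new("https://pitchfork.com/reviews/albums/15576-new-brigade/")
]

# each hands back the original array, not the block's values
each_result = array.each { |element| element.attribute("href") }
puts each_result.equal?(array)

# map hands back the urls
band_urls = hrefs(array)
puts band_urls

# uniq drops repeats if the same link was pushed more than once
puts (band_urls + band_urls).uniq.size
```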
firstly talked about how to do the pace calculator
remember simple math
find pace per kilometre
time = 53 mins 16 seconds
distance = 12.03km
convert into seconds
53 * 60 = 3180
3180 + 16 = 3196 seconds
3196 / 12.03 = 265.6691
265.6691 / 60 = 4.4278
4 mins + (.4278 * 60) seconds = 4 mins 26 seconds
pace = 4 mins 26 seconds (266 seconds)
time = 53 mins 16 seconds (3196 seconds)
now find distance
3196 / 266 = 12.02km
? strava uses some other equation
now find time
pace = 4 mins 26 seconds (266 seconds)
distance = 12.03km
266 * 12.03 = 3200 seconds
3200 / 60 = 53.33333
53 mins + (.3333 * 60) seconds = 53 mins 20 seconds
everything is slightly off because the pace was rounded up to 266 seconds, but the formulas are correct
do this in terminal with gets.chomp
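the three formulas above, sketched as small ruby methods. all method names are made up for illustration, and the numbers are the ones from the notes. the last line shows that keeping the unrounded pace gets back to the original 53:16.

```ruby
# Converts minutes and seconds into total seconds.
def to_seconds(mins, secs)
  mins * 60 + secs
end

# Formats seconds as "m:ss", rounding to the nearest whole second.
def format_time(seconds)
  mins, secs = seconds.round.divmod(60)
  format("%d:%02d", mins, secs)
end

# pace (seconds per km) = time / distance
def pace(seconds, km)
  seconds / km.to_f
end

# distance (km) = time / pace
def distance(seconds, pace_seconds)
  seconds / pace_seconds.to_f
end

# time (seconds) = pace * distance
def total_time(pace_seconds, km)
  pace_seconds * km
end

run = to_seconds(53, 16)                                # 3196
puts format_time(pace(run, 12.03))                      # 4:26
puts distance(run, 266).round(2)                        # 12.02
puts format_time(total_time(266, 12.03))                # 53:20
puts format_time(total_time(pace(run, 12.03), 12.03))   # 53:16
```

for the terminal version the inputs could be read with gets.chomp and converted with to_i / to_f before calling these methods.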
we created the models together for the pitchfork score scraper
artist model with a name
album model with a name, year, album url, foreign key to the artist
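a rough plain-ruby sketch of those two models, using Struct instead of a real database layer. the field names, the id field, and the albums_for helper are all assumptions for illustration, not the models we actually wrote.

```ruby
# Plain-Ruby stand-ins for the two models (field names are assumptions).
Artist = Struct.new(:id, :name)

# artist_id plays the role of the foreign key pointing back at Artist.
Album = Struct.new(:name, :year, :url, :artist_id)

# Hypothetical lookup: every album whose foreign key matches the artist.
def albums_for(artist, albums)
  albums.select { |album| album.artist_id == artist.id }
end

iceage = Artist.new(1, "Iceage")
albums = [
  Album.new("Beyondless", 2018, "https://pitchfork.com/reviews/albums/iceage-beyondless/", 1)
]

albums_for(iceage, albums).each do |album|
  puts "#{iceage.name}: #{album.name} (#{album.year})"
end
```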
used selenium-webdriver as a scraper
using an a..z range to get all artists
trigger the infinite scroll (keep scrolling to the bottom until nothing more loads) before grabbing all the artists
notes for specific scrapes are in find_elements_using_selenium.txt
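one way the infinite scroll step could look, as a sketch only. it relies on execute_script, which selenium-webdriver drivers provide; the method name scroll_to_end and the max_rounds and pause parameters are made up, and the pause length would need tuning against the real page.

```ruby
# Scrolls to the bottom repeatedly until the page height stops growing.
# driver must respond to execute_script (Selenium::WebDriver drivers do).
# max_rounds and pause are hypothetical safety knobs, not selenium options.
def scroll_to_end(driver, max_rounds: 50, pause: 1)
  last_height = driver.execute_script("return document.body.scrollHeight")
  max_rounds.times do
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
    sleep(pause) # give the page time to load the next batch of artists
    new_height = driver.execute_script("return document.body.scrollHeight")
    break if new_height == last_height # nothing new loaded, so stop
    last_height = new_height
  end
  last_height
end

# With a live driver:
#   scroll_to_end(driver)
#   # ...then grab the artist links
```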
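require "pry-byebug"
require "selenium-webdriver"
# open a chrome window controlled by selenium (needs chromedriver installed)
driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://pitchfork.com/artists/29540-iceage/"
# pause here to try selectors against the live page
binding.pry
sleep(5)
# close the browser once the session is done
driver.quit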