@rickychilcott
Created October 18, 2016 02:26
Quick and dirty script to fetch data from Heroku's Partner page
require 'rubygems'
require 'mechanize'
require 'csv'
require 'pry'

# Shared Mechanize agent that identifies itself as desktop Safari.
A = Mechanize.new { |agent|
  agent.user_agent_alias = 'Mac Safari'
}

# Fetches one partner page and returns its fields as a single CSV row.
def fetchDetailsFor(url)
  page = A.get(url)

  # Value of `attr` on the first node matching `selector`,
  # or '' when nothing on the page matches.
  first_attr = lambda do |selector, attr|
    node = page.search(selector).first
    node ? node[attr].to_s : ''
  end

  name        = page.search(".title").text
  slogan      = page.search(".tagline").text
  address     = page.search("address").text
  description = page.search("p").text
  website     = first_attr.call(".website", "href")
  github      = first_attr.call(".github", "href")
  twitter     = first_attr.call(".twitter", "href")
  skills      = page.search(".pill").map { |s| s.text }.join(", ")
  image_url   = first_attr.call("img", "src")

  [name, slogan, address, description, website, github, twitter, skills, image_url]
end

# Ruby does not expand "~" in file paths, so resolve it explicitly.
input_path  = File.expand_path("~/Downloads/partners.heroku.com_17th_Oct_2016.csv")
output_path = File.expand_path("~/Downloads/partners.heroku.com.clean.csv")

# The block form closes the output file even if a fetch raises.
CSV.open(output_path, "wb") do |output|
  output << ["Name", "Slogan", "Address", "Description", "Website", "Github", "Twitter", "Skills", "Image URL"]

  CSV.foreach(input_path) do |row|
    url = row[0].to_s
    # Skip the header and any row that is not a partner page URL.
    next unless url.include?("partners.heroku.com")

    output << fetchDetailsFor(url)
  end
end
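
The path handling and row filtering in the script can be tried on their own, without hitting the network. This is a minimal sketch, not part of the original gist: the helper names `downloads_path` and `partner_url` are hypothetical. It relies on the fact that Ruby's file APIs treat a leading `~` as a literal directory name, so the home directory has to be resolved with `File.expand_path`.

```ruby
# Hypothetical helpers for illustration; neither name appears in the script above.

# Builds an absolute path to a file in the current user's Downloads folder.
# File.expand_path resolves the leading "~" to the home directory.
def downloads_path(filename)
  File.expand_path(File.join("~", "Downloads", filename))
end

# Returns the URL in the first column if it points at partners.heroku.com,
# otherwise nil (header rows, blank cells, unrelated links).
def partner_url(row)
  url = row[0].to_s
  url.include?("partners.heroku.com") ? url : nil
end
```

With these in place, the main loop could call `partner_url(row)` and skip the row when it returns `nil`.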