@TrevMcKendrick
Created October 11, 2013 15:55
require 'nokogiri'
require 'open-uri'

# Scrapes the student profile pages linked from a class index page.
class StudentScraper
  attr_accessor :main_index_url

  def initialize(main_index_url)
    @main_index_url = main_index_url
  end

  # Returns an array of hashes, one per student profile that could be scraped.
  def call
    # URI.open comes from open-uri; a bare Kernel#open no longer fetches URLs on Ruby 3.0+.
    index_page = Nokogiri::HTML(URI.open(main_index_url))

    # Relative links to each student's profile page.
    student_paths = index_page.css('li.home-blog-post div.blog-thumb a').map do |link|
      link.attr('href')
    end

    students = []
    student_paths.each do |path|
      student_website = "#{main_index_url}/#{path}"
      begin
        student_page = Nokogiri::HTML(URI.open(student_website))

        name = student_page.css('h4.ib_main_header').text
        social_media = student_page.css('div.social-icons a').map do |link|
          link.attr('href')
        end
        quote = student_page.css('div.textwidget h3').text

        # Keep only paragraphs without child elements (plain-text paragraphs).
        text = student_page.css('div.services p').map do |paragraph|
          paragraph.content.strip if paragraph.element_children.empty?
        end.compact

        # Collect the scraped values into a hash for this student.
        # Social links are assigned by position, so the page's icon order matters.
        student = {}
        student[:name] = name
        student[:twitter] = social_media[0]
        student[:linkedin] = social_media[1]
        student[:github] = social_media[2]
        student[:facebook] = social_media[3]
        student[:website] = student_website
        student[:quote] = quote
        student[:bio] = text[0]
        student[:work] = text[1]
        students << student
      rescue StandardError => e
        # Report and skip a failing profile instead of silently ending the whole run.
        warn "Skipping #{student_website}: #{e.message}"
      end
    end

    students
  end
end
# Example usage:
# puts StudentScraper.new("http://students.flatironschool.com").call
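
The scraper fills `:twitter`, `:linkedin`, `:github` and `:facebook` from `social_media[0]` through `social_media[3]`, which only works if every profile lists its icons in that order. A minimal sketch of one alternative is to classify each link by its hostname. The `SOCIAL_HOSTS` table and the `classify_social_links` helper are hypothetical names introduced here for illustration, not part of the original gist, and the sketch assumes the links are absolute URLs.

```ruby
require 'uri'

# Hypothetical mapping from domain to hash key (an assumption, not in the gist).
SOCIAL_HOSTS = {
  'twitter.com'  => :twitter,
  'linkedin.com' => :linkedin,
  'github.com'   => :github,
  'facebook.com' => :facebook
}.freeze

# Builds a hash such as { github: "https://github.com/..." } from a list of
# URLs, independent of the order in which the links appear on the page.
def classify_social_links(urls)
  urls.each_with_object({}) do |url, result|
    host = begin
      URI.parse(url.to_s).host
    rescue URI::InvalidURIError
      nil # Unparseable hrefs are ignored.
    end
    next if host.nil?

    # Match the bare domain or any subdomain of it (for example "www.").
    match = SOCIAL_HOSTS.find do |domain, _key|
      host == domain || host.end_with?(".#{domain}")
    end
    next if match.nil?

    key = match.last
    result[key] ||= url # Keep the first link seen for each network.
  end
end
```

Inside `call`, the four positional assignments could then be replaced with something like `student.merge!(classify_social_links(social_media))`. Networks missing from a profile would simply be absent from the hash rather than shifting the other links into the wrong keys.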