@irmiller22
Created October 4, 2013 18:09
Scraping Example
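# Scraping Most Voted Hackernews
require 'nokogiri'
require 'open-uri'

# Get all the posts on Hacker News.
# URI.open is used because Ruby 3.0 no longer lets a bare open() fetch URLs.
hacker_news = Nokogiri::HTML(URI.open('https://news.ycombinator.com/'))

# Each span.comhead is assumed to sit next to its story's title link,
# so the link is looked up through the span's parent node.
story_nodes = hacker_news.css("span.comhead")
# story_nodes.first.class
# story_nodes.class
# story_nodes.first.parent.css("a").to_s
# story_nodes.first.parent.css("a").text

# Build one hash per story with its title and link.
stories = story_nodes.map do |source_doc|
  link = source_doc.parent.at_css("a")
  #raise link.inspect
  { :title => link.text, :href => link["href"].to_s }
end

# Figure out their vote count.
# "td.subtext span" is a descendant selector; the span text is assumed
# to start with the number of points, so to_i extracts it.
vote_counts = hacker_news.css("td.subtext span").collect { |e| e.text.to_i }

# Get story array to mirror subtext array.
vote_counts.each_with_index do |vote, idx|
  stories[idx][:vote_count] = vote if stories[idx]
end

# Sort that array by vote count, highest first, and print it.
sorted_stories = stories.sort_by { |story| -story[:vote_count].to_i }
sorted_stories.each do |story_hash|
  puts "#{story_hash[:vote_count]} #{story_hash[:title]} (#{story_hash[:href]})"
end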
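
The "sort that array by vote count" step can be tried without touching the network. What follows is a minimal sketch, assuming story hashes with `:title`, `:href`, and an optional `:vote_count` key; the sample titles, links, and counts are made up for illustration, and `sort_by_votes` is a hypothetical helper name that does not appear in the gist.

```ruby
# Hypothetical helper: returns the stories ordered from most to fewest votes.
# Stories without a :vote_count are treated as having 0 votes.
def sort_by_votes(stories)
  stories.sort_by { |story| -story[:vote_count].to_i }
end

# Made-up sample data, shaped like the hashes the scraper builds.
sample_stories = [
  { :title => "Story A", :href => "http://example.com/a", :vote_count => 12 },
  { :title => "Story B", :href => "http://example.com/b", :vote_count => 87 },
  { :title => "Story C", :href => "http://example.com/c" }
]

sort_by_votes(sample_stories).each do |story|
  puts "#{story[:vote_count].to_i} #{story[:title]}"
end
```

Negating the count inside `sort_by` gives a descending order in one pass, and `sort_by` returns a new array, so the original list keeps the page order.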