Created October 1, 2013 18:18
Downloading and scraping an HTML table
require 'uri'
require 'net/http'
require 'nokogiri'

# Fetch the page over HTTPS: Wikipedia redirects plain-HTTP requests,
# and Net::HTTP.get does not follow redirects.
uri = URI.parse('https://en.wikipedia.org/wiki/Doctor_Who')
html = Net::HTTP.get(uri)

# Parse the HTML and take the first table with the "wikitable" class.
document = Nokogiri::HTML.parse(html)
tables = document.css('table.wikitable')
table = tables.first

# Print every row that has exactly three data cells (doctor, actor, tenure).
table.css('tr').each do |row|
  cells = row.css('td')
  if cells.length == 3
    doctor, actor, tenure = cells.map(&:text)
    puts "#{doctor} was played by #{actor} during #{tenure}"
  end
end
# First Doctor was played by William Hartnell during 1963–1966[note 5]
# Second Doctor was played by Patrick Troughton during 1966–1969[note 5]
# Third Doctor was played by Jon Pertwee during 1970–1974[note 5]
# Fourth Doctor was played by Tom Baker during 1974–1981[note 5]
# Fifth Doctor was played by Peter Davison during 1982–1984[note 5]
# Sixth Doctor was played by Colin Baker during 1984–1986
# Seventh Doctor was played by Sylvester McCoy during 1987–1989, 1996[64][65][66]
# Eighth Doctor was played by Paul McGann during 1996
# Ninth Doctor was played by Christopher Eccleston during 2005
# Tenth Doctor was played by David Tennant during 2005–2010[12][note 5]
# Eleventh Doctor was played by Matt Smith during 2010–present
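
The sample output above still carries Wikipedia footnote markers such as `[note 5]` and `[64]` in the tenure column. One possible way to clean them up is sketched below. `strip_footnotes` and `FOOTNOTE_MARKER` are hypothetical names that are not part of the original script, and the pattern assumes markers always look like `[N]` or `[note N]`.

```ruby
# Hypothetical helper (not in the original script): removes bracketed
# footnote markers like "[64]" or "[note 5]" from scraped cell text.
# Assumes markers are always a number, optionally preceded by "note".
FOOTNOTE_MARKER = /\[(?:note\s+)?\d+\]/

def strip_footnotes(text)
  text.gsub(FOOTNOTE_MARKER, '').strip
end

puts strip_footnotes('1963–1966[note 5]')            # prints "1963–1966"
puts strip_footnotes('1987–1989, 1996[64][65][66]')  # prints "1987–1989, 1996"
```

In the script itself, this could be applied with `cells.map { |cell| strip_footnotes(cell.text) }` in place of `cells.map(&:text)`.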