@gnarl
Created July 3, 2010 18:05
require 'rubygems'
require 'mechanize'
require 'ap'
require 'logger'
ISHARES_URL = "http://us.ishares.com/product_info/fund/overview/"
RESOURCE = ".htm"
@tickers = ['IVV', 'IGE', 'IWV', 'IWB', 'IWR', 'IYR', 'IYT', 'IYM', 'IYE', 'SLV']
@small_cap_tickers = ['JKJ', 'JKK', 'JKL', 'IWO', 'IWM', 'IWN', 'IWC', 'IJT', 'IJR', 'IJS']
@emerging_tickers = ['EEM', 'FXI', 'EWZ', 'EIDO', 'EWY', 'EWT', 'TUR', 'INDY']
def get_page( url, agent )
  agent.get(url)
end
# Given the Mechanize::Page, return the Nokogiri::XML::NodeSet of <td> cells
# in the fundamentals table, one of which holds the P/E label.
def get_nodes( page )
  # page.parser returns the Nokogiri document;
  # call to_html on a node to print it as HTML when debugging
  page.parser.xpath("//table[@class='module eq-fundamentals-risk']//tr//td")
end
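# Given the Mechanize::Page, get the Nokogiri::XML::NodeSet.
# Iterate over the NodeSet until the P/E label cell is located.
# Return the P/E value from the cell that follows it, or nil if not found.
def pe( page )
  nodes = get_nodes( page )
  pe = nil
  nodes.each do |n|
    #ap n.text
    # n.text is "" for an empty cell, whereas n.child would be nil there
    if n.text.match(/\APrice to Earnings Ratio\s*\Z/)
      # next_element skips any whitespace text node between the <td> cells
      value = n.next_element
      pe = value.text.strip unless value.nil?
      break
    end
  end
  pe
end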
# Given the Mechanize::Page, locate and return the Fund's name.
def fund_name(page)
  nodes = page.parser.xpath("//span[@class='fund-name']")
  ret_text = ""
  nodes.each do |n|
    text = n.child.text.strip
    unless text.empty?
      ret_text = text
    end
  end
  ret_text
end
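# Method that ties all the other methods together:
# fetch a fund's overview page and print its ticker, P/E, and name on one line.
def print_pe(ticker, agent)
  page = get_page( ISHARES_URL + ticker + RESOURCE, agent )
  # named ratio rather than pe so the local does not shadow the pe method
  ratio = pe(page)
  # output P/E; to_s turns a missing P/E (nil) into blanks so the name column stays aligned
  puts "#{ticker.ljust(5)} #{ratio.to_s.rjust(7)} #{fund_name(page)}"
end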
#Main
agent = Mechanize.new { |a| a.log = Logger.new("mech.log") }
agent.user_agent_alias = 'Mac Safari'
# Output list of P/Es with the Fund's name
puts "GENERAL ETFs"
puts "----------------------------------"
@tickers.sort.each {|t| print_pe(t, agent) }
puts " "
puts "SMALL CAP ETFs"
puts "----------------------------------"
@small_cap_tickers.each {|t| print_pe(t, agent) }
puts " "
puts "EMERGING ETFs"
puts "----------------------------------"
@emerging_tickers.each {|t| print_pe(t, agent) }
#TODO time the page retrieval versus parsing page
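One way to approach the timing TODO is Ruby's standard Benchmark module. The following is a minimal sketch, not part of the original script: `time_phases` is a hypothetical helper, and the two lambdas are stand-ins (a short sleep and a regex lookup) so that it runs without network access. In the script itself they would presumably wrap `get_page` and `pe`.

```ruby
require 'benchmark'

# Hypothetical helper (not part of the original script): runs the fetch step
# and the parse step separately and records the wall-clock seconds of each.
def time_phases(fetch, parse)
  page = nil
  result = nil
  fetch_secs = Benchmark.realtime { page = fetch.call }
  parse_secs = Benchmark.realtime { result = parse.call(page) }
  { :fetch => fetch_secs, :parse => parse_secs, :result => result }
end

# Stand-in steps so the sketch runs offline; the HTML and the value in it
# are made-up sample data.
fake_fetch = lambda { sleep 0.05; "<td>Price to Earnings Ratio</td><td>15.2</td>" }
fake_parse = lambda { |html| html[/<td>([\d.]+)<\/td>/, 1] }

timing = time_phases(fake_fetch, fake_parse)
puts "fetch: %.4fs  parse: %.4fs  P/E: %s" % [timing[:fetch], timing[:parse], timing[:result]]
```

Passing the steps in as lambdas keeps the helper independent of Mechanize, so the same function can time real requests or canned pages.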
#TODO Test Cases
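A possible starting point for the test-case TODO, using Minitest, which ships with Ruby. This is a sketch under assumptions: `pe_label?` and `format_row` are hypothetical helpers that pull the label regex and the output line out of `pe` and `print_pe` so they can be checked without a network connection or an HTML fixture. Padding a missing P/E with blanks is also an assumption, and the fund names are sample data.

```ruby
require 'minitest/autorun'

# Hypothetical helpers (not in the original script) isolating the logic
# that can be tested without fetching a page.
PE_LABEL = /\APrice to Earnings Ratio\s*\Z/

def pe_label?(text)
  !!(text =~ PE_LABEL)
end

# Builds the same kind of line print_pe writes; a nil P/E is padded with
# blanks (an assumption) so the fund-name column stays aligned.
def format_row(ticker, pe, name)
  "#{ticker.ljust(5)} #{pe.to_s.rjust(7)} #{name}"
end

class PeScriptTest < Minitest::Test
  def test_label_matches_with_trailing_whitespace
    assert pe_label?("Price to Earnings Ratio \n")
  end

  def test_label_rejects_other_cells
    refute pe_label?("Price to Book Ratio")
  end

  def test_row_pads_ticker_and_pe
    expected = "IVV" + " " * 5 + "15.20 Example Fund"
    assert_equal expected, format_row("IVV", "15.20", "Example Fund")
  end

  def test_row_with_missing_pe
    expected = "SLV" + " " * 11 + "Example Fund"
    assert_equal expected, format_row("SLV", nil, "Example Fund")
  end
end
```

Testing `get_nodes`, `pe`, and `fund_name` directly would additionally need a saved copy of a fund page parsed with Nokogiri, which this sketch leaves out.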