@camkidman
Created June 3, 2015 03:35
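require 'nokogiri'
require 'open-uri'

class Scraper
  # Fetch the page at `url` and parse it into a Nokogiri document.
  # URI.open is used because Kernel#open no longer opens URLs on Ruby 3.0+.
  def self.scrape_page(url)
    Nokogiri::HTML(URI.open(url))
  end

  # Not implemented yet.
  def self.to_json
  end
end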
attr_titles = []
attr_values = []
full_attrs = {}
seltas = Scraper.scrape_page("http://kiranico.com/en/mh4u/monster/seltas")
name = seltas.css('h1[data-swiftype-name="title"]').text
description = seltas.css('h1[data-swiftype-name="body"]').text
# Record the text of each section heading (h5) as an attribute title.
seltas.css('.col-sm-4 h5').each { |heading| attr_titles << heading.text.strip }
seltas.css('.col-sm-2 h5').each { |heading| attr_titles << heading.text.strip }
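# Plan: find the h5 whose text names the data you want, then target the
# table right after it. For each TR, the first TD is the key and the
# second TD is the value.
# Something like this (assumes the table is a following sibling of the h5):
#
# enraged = {}
# heading = seltas.css('h5').find { |h5| h5.text.strip == 'Enraged' }
# heading.at_xpath('following-sibling::table[1]').css('tr').each do |table_row|
#   key, value = table_row.css('td').map { |td| td.text.strip }
#   enraged[key] = value
# end
#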
seltas.css('.col-sm-2 table td').each { |s| attr_values << s.text }
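# Problem: the inner loop assigns every value to the same key in turn, so
# each key ends up holding the last value in attr_values (see output below).
attr_titles.each do |key|
  attr_values.each do |value|
    full_attrs[key] = value
  end
end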
puts full_attrs
#=> {"HP"=>">757.6", "Limping"=>">757.6", "Capture"=>">757.6", "Habitat"=>">757.6", "Enraged"=>">757.6", "Crown Sizes"=>">757.6"}
puts attr_titles
=begin
HP
Limping
Capture
Habitat
Enraged
Crown Sizes
=end
puts attr_values
=begin
Duration
75
Attack
x1.1
Defense
x1.0
Speed
x1.1
Miniature
<554.3
Large
>708.3
King
>757.6
=end
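
The `attr_values` output above alternates label and value (Duration, 75, Attack, x1.1, ...), so the flat list can be paired up directly instead of being looped over once per title. The following is a minimal sketch of that idea, not the gist author's solution. It assumes the list always strictly alternates label then value, and `pair_values` is a hypothetical helper name introduced only for illustration.

```ruby
# Hypothetical helper: turn a flat [label, value, label, value, ...] list
# into a hash. Assumes strict alternation; an odd-length list would make
# to_h raise, because the final slice would hold only one element.
def pair_values(flat_values)
  flat_values.each_slice(2).to_h
end

# Values copied from the `puts attr_values` output above.
attr_values = ["Duration", "75", "Attack", "x1.1", "Defense", "x1.0",
               "Speed", "x1.1", "Miniature", "<554.3", "Large", ">708.3",
               "King", ">757.6"]

paired = pair_values(attr_values)
puts paired["Attack"] # x1.1
puts paired["King"]   # >757.6
```

This still mixes the "Enraged" and "Crown Sizes" rows into one hash. Grouping them under their own headings would need the per-table traversal described in the comments earlier in the script.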