@kjlape
Last active August 29, 2015 14:02
Scrape wordlists from BigIQKids.com.
require 'nokogiri'
require 'open-uri'

# Build one word-list URL per grade, first through eighth.
ordinals = %w[First Second Third Fourth Fifth Sixth Seventh Eighth]
urls = ordinals.map do |ordinal|
  "http://www.bigiqkids.com/SpellingVocabulary/Lessons/wordlistSpelling#{ordinal}Grade.shtml"
end

urls.each_with_index do |url, index|
  # URI.open is used because open-uri no longer patches the bare Kernel#open on Ruby 3.0+.
  page = Nokogiri::HTML(URI.open(url))

  # Write one file per grade: grade1.wordlist through grade8.wordlist.
  File.open("grade#{index + 1}.wordlist", 'w') do |file|
    page.css('table tr td a')
        .map { |link| link.text.strip.downcase }
        .select { |text| text.split.count == 1 } # keep single-word entries only
        .each { |word| file.puts word }
  end
end
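
The single-word filter in the script can be tried without touching the network. The following is a minimal sketch using only the standard library. The helper name `single_words` and the sample strings are hypothetical and not part of the original gist, and the `strip` call is an added assumption to guard against stray whitespace in link text.

```ruby
# Hypothetical helper (not in the original gist): keep only entries that
# consist of exactly one word, normalised to lowercase.
def single_words(texts)
  texts
    .map(&:strip)                              # assumption: link text may carry padding
    .select { |text| text.split.length == 1 }  # drops empty and multi-word entries
    .map(&:downcase)
end

# Made-up sample data standing in for the link text scraped from a page.
sample = ['Apple', 'ice cream', '  Banana ', '', 'New York']
puts single_words(sample)
```

Multi-word links such as navigation labels are dropped, which appears to be why the script checks the word count before writing each entry.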