@postmodern
Created October 12, 2010 20:15
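#!/usr/bin/env ruby
# Trains a bigram model on the comments of a YouTube watch page, or prints a
# random sentence from the saved model when no URL is given.

require 'open-uri'
require 'nokogiri'
require 'raingrams'

MODEL_FILE = 'youtube_corpora.ngrams'

# Load the saved model if one exists; otherwise start with an empty one.
model = if File.file?(MODEL_FILE)
          Raingrams::BigramModel.open(MODEL_FILE)
        else
          Raingrams::BigramModel.new
        end

if ARGV[0]
  # URI.open is required on Ruby 3.0+, where open-uri no longer patches Kernel#open.
  doc = Nokogiri::HTML(URI.open(ARGV[0]))

  # The selectors target the watch-page markup as of this script's writing;
  # fall back to the URL if the title element is missing.
  title = doc.at('#watch-headline-title')
  puts "Training from: #{title ? title.inner_text.strip : ARGV[0]}"

  # Feed the text of every comment into the model.
  doc.search('div.comment-text').each do |node|
    model.train_with_text(node.inner_text)
  end

  model.save(MODEL_FILE)
else
  puts model.random_sentence
end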