Last active
December 10, 2015 17:38
Get the number of Google results for each word in a list of words.
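| # google-results.rb | |
| require 'nokogiri' | |
| require 'open-uri' | |
| require 'uri' | |
| rstr, q_list = [], [] | |
| # The list of words, one per line, escaped for use in a query string | |
| # (URI.escape was removed in Ruby 3.0) | |
| File.foreach('list.txt') { |s| q_list << URI.encode_www_form_component(s.chomp) } | |
| q_list.each do |q| | |
| # URI.open replaces the bare open(url), which no longer fetches URLs in Ruby 3.0 | |
| doc = Nokogiri::HTML(URI.open("http://www.google.com/search?q=#{q}")) | |
| stats = doc.at_css('#resultStats') | |
| # Assumes text like "About 1,230 results"; nil if the element is missing | |
| result = stats && stats.content.delete(',').split[1] | |
| rstr << result | |
| puts result | |
| end | |
| # Write to a file | |
| File.open('result.txt', 'w') do |f| | |
| f.puts rstr | |
| end | |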
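The count extraction inside the loop can be pulled into a small helper and checked without a network call. This is a minimal sketch, assuming the stats text looks something like "About 1,230,000 results (0.45 seconds)". `parse_result_count` is a hypothetical name, not part of the gist, and Google's actual markup may differ.

```ruby
# Hypothetical helper (not part of the original gist): pull the first
# number out of a stats string such as "About 1,230,000 results".
# Returns an Integer, or nil when the text is nil or contains no digits.
def parse_result_count(text)
  return nil if text.nil?

  # First run of digits, allowing thousands separators
  match = text.match(/\d[\d,]*/)
  match && match[0].delete(',').to_i
end

puts parse_result_count('About 1,230,000 results (0.45 seconds)') # prints 1230000
puts parse_result_count('No results found').inspect               # prints nil
```

Matching the first number rather than taking the second word avoids depending on the leading "About", though it still assumes the count is the first number in the text.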
Author
Thanks!
I ran into exactly the same issue (and did the same thing) just after I posted this.
I've updated the gist accordingly.
I'm still not the best developer, but I believe your way is better. It's storing the escaped strings in the array, instead of doing the escaping on each query like I have it. Not sure what kind of processing gains there are, but your way feels a lot cleaner.
Really cool work man.
Author
I'm no expert in Ruby performance, but I don't imagine it'd make any difference where the string is escaped. Storing the escaped strings in the array just feels a little cleaner to me.
This is awesome! Nice work.
I was getting some errors at first when I put words with spaces in the list.txt document, so I required the uri library and then encoded the query like so:
doc = Nokogiri::HTML(open("http://www.google.com/search?q=#{URI.encode(q)}"))
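On Ruby 3.0 and later, `URI.encode` and `URI.escape` are no longer available, so the line above raises an error there. Here is a sketch of the same idea using `URI.encode_www_form_component` from the standard library. `search_url` is a hypothetical helper name, and the URL is the one used in the gist.

```ruby
require 'uri'

# Hypothetical helper: build the search URL for one word or phrase.
# encode_www_form_component escapes spaces as "+" and other reserved
# characters as %XX, which is valid inside a query string.
def search_url(term)
  "http://www.google.com/search?q=#{URI.encode_www_form_component(term)}"
end

puts search_url('hello world')
# prints http://www.google.com/search?q=hello+world
```

The result can be passed to `URI.open` (from `open-uri`) in place of the bare `open` call.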