@rbreve
Last active December 29, 2015 11:39
Crawls actas (tally sheets) from the TSE
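require 'open-uri'
require 'nokogiri' # only needed for the commented-out HTML scraping path

(1..16000).each do |i|
  # u = "http://siede.tse.hn/app_dev.php/divulgacionmonitoreo/reporte-acta/#{i}"
  u = "http://s3.amazonaws.com/actas2013/icr/40/1/%05d104.jpg" % i
  print u
  begin
    # URI.open replaces the bare open(url), which no longer works in Ruby 3.0+
    data = URI.open(u, &:read)
  rescue StandardError
    # Missing acta or network failure: report it and move on to the next number
    print " invalid\n"
    next
  end
  print "\n"
  # doc = Nokogiri::HTML(data)
  # if image_link = doc.at_css(".image-acta")
  #   uri = image_link['src']
  #   File.open(File.basename(uri), 'wb') { |f| f.write(URI.open(uri).read) }
  # end
  # Write the bytes already downloaded; the original re-opened the response
  # object after Nokogiri had already consumed it
  File.open(File.basename(u), 'wb') { |f| f.write(data) }
end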
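
The format string in the script is easy to misread, so here is a minimal sketch of what it produces. The helper name `acta_url` and the constant `ACTA_URL_FORMAT` are hypothetical and not part of the original gist; the URL pattern itself is copied from the script. The meaning of the fixed `104` suffix is not stated in the source.

```ruby
# Hypothetical helper (not in the original gist): builds the S3 image URL
# for a given acta number using the same format string as the script.
ACTA_URL_FORMAT = "http://s3.amazonaws.com/actas2013/icr/40/1/%05d104.jpg"

def acta_url(i)
  # %05d zero-pads the number to five digits; "104" is a fixed suffix
  ACTA_URL_FORMAT % i
end

puts acta_url(1)                      # http://s3.amazonaws.com/actas2013/icr/40/1/00001104.jpg
puts acta_url(16000)                  # http://s3.amazonaws.com/actas2013/icr/40/1/16000104.jpg
puts File.basename(acta_url(42))      # 00042104.jpg (the local file name the script saves to)
```

Printing a few URLs this way before starting the full 1..16000 loop is a cheap check that the padding matches the bucket's naming.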