@robsalasco
Last active August 29, 2015 14:02
La Tercera
require 'open-uri'
require 'net/http'
require 'fileutils'
fecha = Time.now.strftime("%Y/%m/%d")
# Returns true when a HEAD request for the URL answers with HTTP 200.
def remote_file_exists?(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port) do |http|
    http.head(uri.request_uri).code == "200"
  end
end
puts "Creating temporary download directory"
# mkdir_p does not raise if the directory is left over from an earlier run.
FileUtils.mkdir_p("/tmp/latercera/")
count = 1
loop do
  # Pages are numbered 001.pdf, 002.pdf, ...; stop at the first missing page.
  page = "%03d.pdf" % count
  url = "http://papeldigital.info/lt/#{fecha}/01/paginas/" + page
  break unless remote_file_exists?(url)

  puts "Downloading #{url}"
  File.open("/tmp/latercera/" + page, "wb") do |pdf|
    # URI.open comes from open-uri; plain Kernel#open no longer accepts URLs in Ruby 3.
    URI.open(url) { |remote| pdf.write(remote.read) }
  end
  count += 1
end
out = "LATERCERA_" + Time.now.strftime("%Y-%m-%d") + ".pdf"
puts "Merging PDFs"
system("gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=#{out} /tmp/latercera/*.pdf")
puts "Deleting files"
FileUtils.rm(Dir.glob('/tmp/latercera/0*.pdf'))
puts "Removing temporary directory"
FileUtils.rm_rf("/tmp/latercera/")