@rtanglao
Created September 23, 2010 02:35
Download Flickr "Z" size photos, which for 5MP originals are oddly 640x481
#!/usr/bin/env ruby
require 'json'
require 'pp'
require 'curb'
# Reads serialized Flickr JSON (one page per line) from $stdin or from files named on the command line,
# then downloads the Flickr "Z" size photos to the current directory.
# These are 640x481 for 5MP originals (481 is odd, methinks!).
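# Download each URL in turn, reusing a single Curl::Easy handle.
def fetch_1_at_a_time(urls)
  easy = Curl::Easy.new
  easy.follow_location = true
  urls.each do |url|
    easy.url = url
    # Local filename: drop any query string, keep the last path segment
    filename = url.split(/\?/).first.split(/\//).last
    $stderr.print "'#{url}' :"
    File.open(filename, 'wb') do |f|
      # Print one "=" per progress callback; returning true keeps the transfer going
      easy.on_progress { |dl_total, dl_now, ul_total, ul_now| $stderr.print "="; true }
      # Append each chunk to the file and report how much was handled
      easy.on_body { |data| f << data; data.size }
      easy.perform
      $stderr.puts "=> '#{filename}'"
    end
  end
end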
# Each input line is one page of serialized Flickr JSON
ARGF.each_line do |line|
  flickr_data_page = JSON.parse(line)
  photos = flickr_data_page["photos"]
  total = photos["total"].to_i
  total_pages = photos["pages"].to_i
  page = photos["page"].to_i
  $stderr.printf "Total photos to download:%d page:%d of:%d\n", total, page, total_pages
  # Take url_z from every photo actually present on this page, skipping photos
  # that have no "Z" size. Iterating over the array avoids hard-coding 250 per
  # page, and avoids a last page of 0 photos when total is a multiple of 250.
  urls = photos["photo"].map { |photo| photo["url_z"] }.compact
  fetch_1_at_a_time(urls)
end
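
The script never shows the input it expects, so here is a minimal sketch of one input line and of how URLs and local filenames fall out of it. The field names (`photos`, `page`, `pages`, `total`, `photo`, `url_z`) are the ones the script reads; the sample values, the `example.com` URLs, and the helper names `z_urls` and `local_filename` are assumptions made up for illustration and are not part of the gist.

```ruby
require 'json'

# Hypothetical sample page: field names match what the script reads,
# but the values and example.com URLs are invented for illustration.
sample_line = JSON.generate(
  "photos" => {
    "page" => 1, "pages" => 1, "total" => "3",
    "photo" => [
      { "id" => "1", "url_z" => "http://example.com/a/1_z.jpg?zz=1" },
      { "id" => "2" }, # no "Z" size available for this photo
      { "id" => "3", "url_z" => "http://example.com/b/3_z.jpg" }
    ]
  }
)

# Hypothetical helper: collect every non-nil url_z from one parsed page.
def z_urls(page)
  page["photos"]["photo"].map { |photo| photo["url_z"] }.compact
end

# Hypothetical helper: the same filename rule fetch_1_at_a_time uses
# (drop the query string, keep the last path segment).
def local_filename(url)
  url.split(/\?/).first.split(/\//).last
end

urls = z_urls(JSON.parse(sample_line))
filenames = urls.map { |u| local_filename(u) }
puts filenames
```

Because the filename is only the last path segment, two photos whose URLs end in the same segment would overwrite each other in the current directory.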