@aisuii
Created June 29, 2011 06:38
geinou2dl
Overview:
Fetches the images from an entry URL on http://blog.livedoor.jp/geinow2. Who would even want this?
Requirements:
- Ruby 1.9 or so
- bundler
- the gems that bundle install pulls in
Usage:
dl.rb
usage: ruby dl.rb 'http://blog.livedoor.jp/geinow2/archives/3339639.html'
Takes an entry URL on http://blog.livedoor.jp/geinow2 as its argument and downloads the images, saving them under the data directory.
list.rb
usage: ruby list.rb
Prints the recent entries of http://blog.livedoor.jp/geinow2 to standard output.
server.rb
usage: ruby server.rb
Serves the files on port 9292 for simple browsing.
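To make the "saved under the data directory" part concrete, here is a minimal sketch of how dl.rb appears to derive the output directory from the entry URL. It uses only the standard library and makes no network requests. `save_path_for` is a hypothetical helper name introduced for illustration; dl.rb performs the same two calls inline.

```ruby
# Sketch of how dl.rb picks its output directory (stdlib only, no network).
# save_path_for is a hypothetical helper; dl.rb inlines these two calls.
def save_path_for(entry_url, data_path = "data")
  # With ".*", File.basename strips the extension:
  # ".../archives/3339639.html" becomes "3339639".
  save_dir = File.basename(entry_url, ".*")
  File.join(data_path, save_dir)
end

puts save_path_for('http://blog.livedoor.jp/geinow2/archives/3339639.html')
# prints data/3339639
```

Each entry therefore gets its own subdirectory named after the numeric part of its URL.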
# coding: utf-8
# dl.rb
require 'open-uri'
require 'fileutils'
require 'json'
require 'nokogiri'

# Fetch the images linked from a http://blog.livedoor.jp/geinow2 entry URL.
if ARGV.empty?
  puts "usage: ruby #{__FILE__} 'http://blog.livedoor.jp/geinow2/archives/3339639.html'"
  exit
end

data_path = "data"
entry_url = ARGV[0]
save_dir = File.basename(entry_url, ".*")
file_list = []
missing_file_list = []
wait_time = 0.5

path = File.join(data_path, save_dir)
FileUtils.mkdir_p path
Dir.chdir path
puts "mkdir -p #{path} && cd #{path}"
puts "dl images from #{entry_url}"

# URI#open is provided by open-uri. Unlike Kernel#open with a URL string,
# it also works on Ruby 3.0 and later.
noko = Nokogiri::HTML(URI.parse(entry_url).open)
noko.css('.blogbody a[href$=jpg]').each do |elem|
  remote_file_name = elem.attr('href')
  local_file_name = File.basename(remote_file_name)
  begin
    puts "dl #{remote_file_name} and wait #{wait_time} sec."
    # Download first so that a failed request does not leave an empty file behind.
    data = URI.parse(remote_file_name).open.read
    File.open(local_file_name, 'wb'){ |f| f.write(data) }
    file_list << {filename: local_file_name, original: remote_file_name}
  rescue
    missing_file_list << {filename: local_file_name, original: remote_file_name}
    puts "miss #{remote_file_name}"
  end
  sleep wait_time
end

# Record the entry metadata, the downloaded files, and the failed downloads.
File.open('info.json', 'wb'){|json| json.write({ url: entry_url, title: noko.css('title').text }.to_json) }
File.open('list.json', 'wb'){|json| json.write(file_list.to_json) }
File.open('errors.json', 'wb'){|json| json.write(missing_file_list.to_json) }
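dl.rb leaves three JSON files in each entry directory: info.json, list.json, and errors.json. The sketch below shows one way they might be read back. `summarize_entry` is a hypothetical helper that is not part of the gist, and the demo data is made up and written to a temporary directory so that nothing is downloaded. The symbol keys dl.rb writes come back as string keys after the JSON round trip.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper (not part of the gist): summarize one entry directory
# written by dl.rb, e.g. data/3339639.
def summarize_entry(dir)
  info   = JSON.parse(File.read(File.join(dir, 'info.json')))
  files  = JSON.parse(File.read(File.join(dir, 'list.json')))
  errors = JSON.parse(File.read(File.join(dir, 'errors.json')))
  {
    'url'        => info['url'],
    'title'      => info['title'],
    'downloaded' => files.map { |f| f['filename'] },
    'failed'     => errors.map { |e| e['original'] }
  }
end

# Demo against made-up data in a temporary directory.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'info.json'),
             { url: 'http://example.com/archives/1.html', title: 'sample entry' }.to_json)
  File.write(File.join(dir, 'list.json'),
             [{ filename: 'a.jpg', original: 'http://example.com/a.jpg' }].to_json)
  File.write(File.join(dir, 'errors.json'),
             [{ filename: 'b.jpg', original: 'http://example.com/b.jpg' }].to_json)
  p summarize_entry(dir)
end
```

The 'failed' list holds the original URLs, which could be passed to a retry step.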
# Gemfile
source 'https://rubygems.org'
gem 'nokogiri'
gem 'rack'
gem 'thin'
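# coding: utf-8
# list.rb
require 'open-uri'
require 'nokogiri'

# Print the recent entries of http://blog.livedoor.jp/geinow2 from its Atom feed.
# URI#open is provided by open-uri and also works on Ruby 3.0 and later.
atom = Nokogiri::XML(URI.parse('http://blog.livedoor.jp/geinow2/atom.xml').open)
entries = atom.css('entry').map do |entry|
  {link: entry.css('link').attr('href').value, title: entry.css('title').text}
end
entries.each do |entry|
  puts "%s: %s" % entry.values_at(:link, :title)
end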
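The output format of list.rb depends on `Hash#values_at` returning values in the order the keys are requested, not the order they were stored in. A small offline sketch of that step follows. `format_entry` is a hypothetical helper, and the sample entries are made up to stand in for the parsed feed.

```ruby
# Hypothetical helper (not in the gist): format one entry hash the way
# list.rb prints it.
def format_entry(entry)
  # values_at returns the values in the order the keys are given,
  # regardless of the order they were stored in the hash.
  "%s: %s" % entry.values_at(:link, :title)
end

# Made-up sample data standing in for the parsed Atom feed.
sample = [
  { title: 'first entry',  link: 'http://example.com/archives/1.html' },
  { title: 'second entry', link: 'http://example.com/archives/2.html' }
]
sample.each { |entry| puts format_entry(entry) }
# prints:
# http://example.com/archives/1.html: first entry
# http://example.com/archives/2.html: second entry
```

Each printed URL can be passed straight to dl.rb as its argument.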
# server.rb
require 'rack'

# Serve the data directory as a browsable file listing on port 9292.
HANDLER = Rack::Handler::Thin
PORT = 9292
DATA_DIR = "data"
HANDLER.run Rack::Directory.new(DATA_DIR), :Port => PORT