@toritori0318
Created July 9, 2011 15:19
Scraping images from Hatena Fotolife
use strict;
use warnings;
use Scrappy;

my $s = Scrappy->new;

# Crawl at 1-second intervals
$s->pause(1);

# Run the crawl
$s->crawl('http://f.hatena.ne.jp/hotfoto',
    # 1. Root page (the "hotfoto" listing)
    '/hotfoto' => {
        '//ul[ @class="fotolist" ]/li/a' => sub {
            my ($self, $item) = @_;
            # Add the photo detail link to the crawl queue
            $self->queue->add($item->{href});
        }
    },
    # 2. Photo detail page
    '/:user/:id' => {
        '//img[ @class="foto" ]' => sub {
            my ($self, $item) = @_;
            # Use the last path segment of the image URL as the local file name
            my @file = split(/\//, $item->{src});
            # Download the image to local disk
            $s->get($item->{src})->store('/tmp/' . $file[-1]);
        }
    },
);