@mattn
Created April 6, 2009 04:48
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use WWW::Mechanize::AutoPager;
use Web::Scraper;
use YAML::Syck;
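# HTTP client with a 10-second timeout.
my $mech = WWW::Mechanize->new(timeout => 10);
# Ask the server for an uncompressed response body.
$mech->add_header('Accept-Encoding', 'identity');
# Load the AutoPager SITEINFO rules used to locate each page's "next" link.
$mech->autopager->load_siteinfo();
# Collect the text of the first paragraph matched inside div.storycontent.
my $greeting = scraper { process 'div.storycontent p', text => 'TEXT'; };
# First page to fetch.
my $u = 'http://alpha.mixi.co.jp/blog/?author=3';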
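# Scrape each page, then follow its "next" link until none is found.
while ($u) {
    # get() dies on failure when autocheck is enabled (the WWW::Mechanize
    # default), so wrap the fetch in eval; otherwise $@ is never set here.
    my $res = eval { $greeting->scrape( $mech->get($u)->decoded_content ) };
    if ($@) {
        warn "Failed to fetch or scrape $u: $@";
        last;
    }
    print Dump $res;
    # The while condition ends the loop when no next link is found.
    $u = $mech->next_link;
}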