Rob Hammond robhammond

robhammond / mojo-crawler
Last active January 16, 2018 09:17
EventSource Mojolicious crawler
package WebCrawl::Crawl;
use Mojo::Base 'Mojolicious::Controller';
sub crawl {
my $self = shift;
# Increase inactivity timeout for connection a bit
Mojo::IOLoop->stream($self->tx->connection)->timeout(15);
# Change content type
robhammond / url-analysis.pl
Created January 24, 2013 17:30
Analysis of keywords in different parts of URLs. You'll need to extract the URLs from Google's SERPs yourself and paste them into the @urls array.
#!/usr/bin/env perl
use strict;
use Modern::Perl;
use Domain::PublicSuffix;
use URI;
my $keyword = 'broadband';
my @urls = qw(http://www.moneysupermarket.com/broadband/
http://www.o2.co.uk/broadband
robhammond / gist:4155823
Created November 27, 2012 17:46
Basic Mojo UserAgent parallel crawler
#!/usr/bin/env perl
use Modern::Perl;
use Mojo::UserAgent;
use Mojo::IOLoop;
use Mojo::URL;
# FIFO queue
my $seed_url = 'http://www.google.co.uk/';
my @urls = ($seed_url);
robhammond / perl-dep-scanner.pl
Created July 24, 2012 08:52
Perl dependency scanner
#!/usr/bin/env perl
###
# This snippet recursively scans directories for use/require statements
# in Perl scripts or modules, making it easy to build a list of dependencies
# from code you've inherited or neglected to document.
###
use Perl::PrereqScanner;
use File::Spec::Functions qw( catfile );
use File::Find qw(finddepth);
use Data::Dumper;
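
The preview of the dependency scanner stops at its imports, so the following is only a rough sketch of the idea its header comment describes, not the gist's actual code. It deliberately swaps Perl::PrereqScanner for a plain regex over each line so that it runs with core modules only; the helper names `modules_in_source` and `modules_in_dir` and the `.pl`/`.pm`/`.t` file filter are assumptions made for illustration. A regex will also pick up statements inside POD or heredocs and miss dynamically built `require` calls, which a real parser-based scanner is better placed to handle.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find qw(find);

# Hypothetical helper (not from the gist): return the sorted, de-duplicated
# module names that appear in use/require statements in a string of Perl source.
sub modules_in_source {
    my ($source) = @_;
    my %seen;
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;    # skip whole-line comments
        # Match "use Foo::Bar" or "require Foo::Bar"; the lookahead skips
        # version declarations such as "use v5.10".
        if ($line =~ /^\s*(?:use|require)\s+(?!v\d)([A-Za-z_]\w*(?:::\w+)*)/) {
            $seen{$1} = 1;
        }
    }
    return sort keys %seen;
}

# Hypothetical helper: walk $dir recursively and collect module names from
# every file that looks like a Perl script, module, or test.
sub modules_in_dir {
    my ($dir) = @_;
    my %seen;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path && $path =~ /\.(?:pl|pm|t)\z/;
            open my $fh, '<', $path or return;    # silently skip unreadable files
            my $source = do { local $/; <$fh> };  # slurp the whole file
            close $fh;
            $seen{$_} = 1 for modules_in_source($source);
        },
    }, $dir);
    return sort keys %seen;
}

# Only scan when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for modules_in_dir($ARGV[0]);
}
```

Saved under a name of your choosing and run with a directory argument, this prints one module name per line, pragmas such as `strict` included; filtering out core modules is left to the reader.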