@SeptiyanAndika
Last active August 29, 2015 14:07
Node.js crawler using request, cheerio, and async
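// From https://github.com/chriso/node.io
var request = require('request')
  , cheerio = require('cheerio')
  , async = require('async')
  , format = require('util').format;

var reddits = [ 'programming', 'javascript', 'node' ]
  , concurrency = 2; // maximum number of requests in flight at once

// Fetch each subreddit page, running at most `concurrency` requests at a time.
async.eachLimit(reddits, concurrency, function (reddit, next) {
    var url = format('http://reddit.com/r/%s', reddit);
    request(url, function (err, response, body) {
        // Pass the error to async instead of throwing inside the callback.
        if (err) return next(err);
        var $ = cheerio.load(body);
        // Print the text and link target of every post title.
        $('a.title').each(function () {
            console.log('%s (%s)', $(this).text(), $(this).attr('href'));
        });
        next();
    });
}, function (err) {
    // Runs once after all pages are processed, or on the first error.
    if (err) console.error(err);
});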
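The `concurrency` argument to `async.eachLimit` caps how many requests are in flight at once. As a rough illustration of that idea, here is a minimal sketch in plain JavaScript with no dependencies. `eachLimitSketch` is a hypothetical helper written for this note; it is a simplified stand-in, not the async library's actual implementation, and the `setTimeout` call only simulates a slow request.

```javascript
// Hypothetical helper: a simplified stand-in for async.eachLimit,
// not the library's actual implementation.
function eachLimitSketch(items, limit, worker, done) {
  var index = 0;        // position of the next item to start
  var running = 0;      // workers currently in flight
  var finished = false; // guards against calling done() twice

  function finish(err) {
    if (finished) return;
    finished = true;
    done(err || null);
  }

  function launch() {
    if (finished) return;
    // Everything has been started and nothing is still running: we are done.
    if (index >= items.length && running === 0) return finish();
    // Start workers until the limit is reached or the items run out.
    while (running < limit && index < items.length && !finished) {
      running++;
      worker(items[index++], function (err) {
        running--;
        if (err) return finish(err); // stop at the first error
        launch();                    // a slot freed up, try to start more
      });
    }
  }

  launch();
}

// Usage: simulate three slow "fetches", with at most two in flight at a time.
eachLimitSketch(['programming', 'javascript', 'node'], 2, function (reddit, next) {
  console.log('start', reddit);
  setTimeout(function () {
    console.log('done', reddit);
    next();
  }, 10);
}, function (err) {
  console.log(err ? 'failed' : 'all done');
});
```

With a limit of 2, the third item only starts after one of the first two calls `next()`. That is the same back-pressure the crawler relies on to avoid issuing all of its requests at once.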