@ds0nt
Last active September 13, 2015 05:51
nodejs docker sandbox
#!/bin/bash
# Exposes the host dir /code to the container as /toast-me,
# then prints /toast-me/npm-crawlers.data from inside the container.
# Now that I think about it, I could pipe the file through stdin instead.
docker run --rm \
  -v /code:/toast-me \
  iojs:latest \
  node -e 'require("fs").readFile("/toast-me/npm-crawlers.data", "utf8", function (err, data) { if (err) throw err; console.log(data) })'
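The stdin idea from the script comment could look roughly like the sketch below. It is not part of the original gist: the helper name `toast_stdin` is hypothetical, and it assumes the `iojs:latest` image is available locally. `-i` keeps the container's stdin open, so no volume mount is needed and the container never sees the host directory.

```shell
# Hypothetical helper: stream a host file into a throwaway container over stdin.
# $1 is the path of the host file to send (e.g. /code/npm-crawlers.data).
toast_stdin() {
  # -i keeps stdin attached; the node one-liner echoes stdin back to stdout.
  docker run --rm -i \
    iojs:latest \
    node -e 'process.stdin.pipe(process.stdout)' < "$1"
}
```

Calling `toast_stdin /code/npm-crawlers.data` should print the same listing as the mounted version, with the file contents travelling through the pipe rather than a bind mount.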
NAME DESCRIPTION AUTHOR DATE VERSION KEYWORDS
access_lint-js PhantomJS accessibility auditor =dgalarza =ckundo… 2014-12-15 0.0.8 accessibility crawler
aci-crawler 爬虫 =aci 2015-03-03 0.1.10 crawler
alexandria A storage interface to store crawled content in… =maheshyellai… 2014-08-01 0.1.7 cache crawler elasticsearch
alfresco_crawler 2015-04-13
algolia-webcrawler Simple node worker that crawls sitemaps in order to keep an… =deuxhuithuit 2014-11-23 0.1.0 algolia web-crawler search
alinator Gera uma tabela de projetores alocados por professor do dia… =fegemo 2015-05-22 0.2.2 extrator crawler
allbower Bower dependency graph crawler =anvaka 2015-06-29 1.0.2 bower dependency graph crawler indexer
angular-translation-crawler crawl the project to get out which translation key is in… =drmabuse 2014-11-07 0.0.1-alpha angular-translation atom-shell node
arachnid node.js spider/crawler =dbalcomb 2013-07-05 0.0.0
arachnod Web Crawler for Node.js (Redis) =risyasin 2015-02-09 0.1.4 Arachnode Crawler Spider Bot Scraper Scraping
arania Node.js screen scraping and web crawling module =dreyacosta 2014-09-14 0.1.2 crawler spider
askro This tool allow you to parse, collect and traverse through… =aulizko 2014-11-26 0.1.6 parser webcrawler opendata radiation level askro SARMS АСКРО
auto-web-crawler auto web crawler =junsangpil 2015-03-13 0.0.7 autoWebCrawler node crawler webCrawler awc
autocomplete_wallcrawler create autosuggestions using the english dictionary =wallcrawler 2015-05-26 0.0.1 autocomplete english dictionary suggestions
awc auto web crawler =junsangpil 2015-03-04 0.0.7 autoWebCrawler node crawler webCrawler awc
bas Behaviour Assertion Sheets: CSS-like declarative syntax for… =Christopher… 2014-06-13 0.0.32 test css behaviour integration testing declarative client-side crawl crawler assertion sheets bas
bauer-cli Command-line interface for bauer-crawler. =yneves 2015-08-08 0.1.5 multi-thread multi-core process fork multi-process cluster crawler scrape scraper
bauer-crawler Multi-thread crawler engine. =yneves 2015-08-08 0.1.10 multi-thread multi-core process fork multi-process cluster crawler scraper
bauer-crawler-csv-to-json Plugin for bauer-crawler to convert CSV into JSON. =yneves 2015-07-25 0.1.2 multi-thread multi-core process fork multi-process cluster crawler scrape scraper
bauer-crawler-extract Plugin for bauer-crawler to extract values from JSON data. =yneves 2015-07-25 0.1.3 multi-thread multi-core process fork multi-process cluster crawler fetch
bauer-crawler-fetch Plugin for bauer-crawler to make http requests. =yneves 2015-08-09 0.1.6 multi-thread multi-core process fork multi-process cluster crawler fetch
bauer-crawler-pdf-to-text Plugin for bauer-crawler to convert PDF into text. =yneves 2015-07-25 0.1.3 multi-thread multi-core process fork multi-process cluster crawler scrape scraper
bauer-crawler-scrape Plugin for bauer-crawler to scrape content. =yneves 2015-08-09 0.1.4 multi-thread multi-core process fork multi-process cluster crawler scrape scraper
bcycle B-cycle station status crawler =chbrown 2015-07-21 0.4.1 bcycle
bigseo BigSEO is a ExpresJS module built for apps who need a SEO… =grillorafael 2015-04-17 0.6.4 seo engine express cache crawler bigseo angularjs
birdeater A command-line tool for backing up a user's public Tweets… =bcoe 2012-08-06 0.0.3 crawler twitter public-timeline
bogeyman Bogeyman is application build upon awesome PhantomJS and it… =reneklacan 2014-07-02 0.0.4 scraping crawler phantomjs api
bolero Web crawler for Node and browsers =mountainmoon 2015-01-08 0.1.5 crawler browser robot spider
bot-detector It uses the user-agents.org xml file for detecting bots. =slooker 2015-07-09 1.0.10 spiders crawlers bot detection robots.txt
brackets-store crawler it is downloading all extensions available on… =sparta-code 2014-12-20 0.2.3 bracket offline extension
cameo-crawler Crawl the web breadth-first from a seed url, statefully =chbrown 2014-05-22 0.3.0 web spider crawl recurse postgres
casper-crawler 2014-09-22
casper-sdk A development kit for casperjs =sandwind 2015-01-31 1.0.0 Casperjs crawler scrapy
check-site-for Check sites for a given content =boo1ean 2014-07-17 0.1.0 check site crawler scrapper
climate-data-crawler Data Crawler for CDO (Climate Data Online) web services =jonbern 2015-07-27 2.1.3 climate data crawler
cobweb Web auditing and analysis framework =dbalcomb 2014-04-17 0.0.7 web analysis framework middleware spider crawler
cobweb-queue Adds queuing functionality to Cobweb =dbalcomb 2014-04-17 0.0.2 web analysis framework middleware spider crawler queue
compendiovicorum-crawler A crawler that reads all the 'comune' info from italian… =davidepastore 2015-04-11 0.1.2
component-crawler registry that crawls github users =jongleberry… 2014-08-29 0.1.2
congregator-sitescraper Site scraper for Congregator =eiriklv 2014-06-28 1.0.1 scraper parser site crawler
connect-is-bot adds `isBot` to the request, when requested from a bot,… =andineck 2015-03-13 1.0.0 bot search engine spider crawler
console-crawler A simple web crawler that keeps to the domain =robcolburn 2014-03-10 0.2.0 spider crawler scraper phantom phantomjs
crawl Website crawler and differencer =mmoulton 2013-01-24 0.3.1 crawl differencer diff web website
crawl-parser Connect middleware that allow the search engines, facebook,… =maxs15 2014-06-02 0.0.6 open graph facebook parser middleware express connect seo share node
crawl-shot Site crawler and screenshot generator =connormckelvey 2015-05-01 1.0.1 website screenshots site crawler
crawl2tweet Crawl web pages to tweet new articles =hyunjo_on 2015-02-25 0.1.4 bot crawl crawler tweet
crawler Crawler is a web spider written with Nodejs. It gives you… =sylvinus =paulvalla 2015-03-17 0.4.3 dom javascript crawling spider scraper scraping jquery crawler
crawler-cat =frederikli 2015-08-14 1.0.1
crawler-diagnostic-tool * Install the package globally. =tusharmathur 2014-07-06 0.0.8 crawler diagnostics scraper
crawler-hq Crawler is a web spider written with Nodejs. It gives you… =anatoliy 2014-01-14 0.2.7 dom javascript crawling spider scraper scraping jquery
crawler-indexer The best module ever. =nhhagen 2014-02-27 0.1.2
crawler-js Opensource Framework Crawler in Node.js =rodrigorizando 2015-08-15 2.0.0 crawler scrapy hacker crawlers robots robot dom extraction nsa bigdata
crawler-master =harippe 2015-04-09 1.0.0
crawler-mod based on node-crawler =bonwei 2014-02-01 0.0.1 dom javascript crawling spider scraper scraping jquery
crawler-ninja A web crawler made for the SEO based on plugins. Please… =christophebe 2015-06-11 0.1.8 web crawler crawler seo crawler
crawler-ninja-expired Expired domains finder for crawler.ninja =christophebe 2015-05-14 0.1.1 web crawler expired domains
crawler-slave-engine =harippe 2015-04-09 1.0.0
crawler-tmp 2015-04-15
crawler-user-agents Package crawler-user-agents contains a list of of HTTP… =shumkov 2015-06-23 0.0.1
crawler-worker the worker side of nova-crawler =yi 2013-10-18 0.1.0
crawler.io crawler util =you21979 2014-10-08 0.0.1 crawler scraping
crawler2 Crawler is a web spider written with Nodejs. It gives you… =damngoto 2015-01-03 0.0.2 dom javascript crawling spider scraper scraping jquery crawler
crawlerjs A simple crawler =bhou 2014-03-19 0.0.3
crawlerx A powerful crawler support strategy to different url, the… =dlutwuwei 2015-01-13 0.0.8 crawl spider web url
crawlho Simple web crawler =gammasoft 2014-11-06 0.0.2 web crawler
crawlit A node.js crawler support custom crawl rules for special… =inaction 2014-04-14 0.1.5 crawler crawl spidder
crawljs A basic nodejs crawler. =juzerali 2013-05-19 0.0.1 crawler crawl spider
crawlstream Crawl websites in a streaming fashion =edmellum 2013-08-14 0.3.5 crawl crawler stream streaming
css-crawler Crawl web via css selector =huang47 2013-02-10 0.4.0
dcrawler DCrawler is a distribited web spider written in Nodejs and… =blikenoother 2015-01-21 0.0.8 distribited crawling spider scraper scraping jquery crawler
docpad-plugin-catalogs A crawler that scans a physical folder structure to… =knyki12 2014-08-27 2.0.0 docpad docpad-plugin catalogs products
dom-collector A simple DOM crawler based on JSON scheme. =eces 2015-08-14 1.0.6 crawler dom parser lexer cheerio request api spider phantomjs
easy-crawler curling is really esay =chilijung 2014-02-13 0.0.2
easycrawler Simple and Easy Crawler library for Node.js =dante62 2015-06-04 0.1.2 crawler spider nodejs scraper
easyspider mini spider. =sxyizhiren 2015-06-10 1.0.0 spider crawler
ebook-crawler Download ebook data from amazon, douban, duokan =pipi32167 2014-01-10 0.0.5
eightylegs Simplified api wrapper for 80legs api =edwincen 2014-08-28 0.0.1 web crawler 80legs
email-extractor extract emails address from website by following links =moein7tl 2014-12-17 0.2.9 scraper crawler extract mail email spider
express-bot Crawler(robots) decision middleware for Express =fkei 2015-07-23 0.2.1 robots crawler useragent express
express-crawler-snapshots express.js middleware for generating web page html… =domasx2 2015-07-18 0.2.0 express.js express middleware crawler googlebot phantomjs snapshots seo
express-turnout Pre-rendering Single-Page-Application for crawlers. =59naga 2015-07-02 0.0.4-alpha express middleware prerendering angular seo
extrae A web scraping framework written in coffeescript =carrasti 2014-06-29 0.1.6 coffeescript web scraper spider crawler
fight-matrix Crawls and parses FightMatrix.com =valish 2015-03-13 0.0.1 mma api fightmatrix crawler
flexible Easily build flexible, scalable, and distributed, web… =eckardt 2013-10-23 0.1.20 flexible web html paser dom document crawler spider queue distributed postgres postgresql sql database eventemitter evented router querystring middleware scalable node nodejs node.js
flickr-tag-crawler Flickr Tag Crawler =abarth500 2015-06-25 0.0.6
forage-fetch forage-fetch changed its name to norch-fetch go here… =fergie 2015-02-19 0.0.3 crawler
forvy 抓取电商网站的相关数据 =forsigner 2015-02-27 1.0.3 crawler 淘宝 天猫 京东
foundation-icon-fonts-3-glyphsearch-crawler Crawls the Foundation Icon Fonts 3 website and formats it… =matiassingers 2014-09-24 0.2.1 icons glyphs fonts foundation zurb
funnelweb Detect search engine crawlers by their User-Agent strings. =presentcompany 2012-12-13 0.0.1 search engine user agent crawler spider
gamer-crawler crawler of www.gamer.com.tw =xuhaojun 2015-06-23 0.0.5
gc-crawler Simple helper to bind crawl list api to list items =ddikman 2014-09-06 0.9.1
get-image-urls Scrape image urls from HTML website including CSS… =vorg 2014-11-06 1.0.4 scraping web-crawler images
github-events-crawler =darashi 2012-03-26 0.0.2
google-crawler Google crawler middleware (for SPA) =saalaa 2014-09-19 0.1.0 express middleware google escaped_fragment seo
google-play-search Crawls Google Play store apps website, returning results as… =mikkolehtinen 2014-02-18 0.0.2 googleplay search crawler play store
googlebot Express middleware that returns the resulting html after… =dvidsilva 2014-02-25 0.1.41 phantomjs SEO crawler google
gpapi use google play protobuf api in node =dweinstein 2015-08-14 1.1.15 google play store play api crawler
grabby Simple node crawler =i4got10 2014-10-26 0.1.9 crawler http grabber
great-reaper Scrap and collect data from urls, html, json and stuff.. =boo1ean 2015-02-05 0.5.3 scrapper scrap crawler collect data reaper parser crawl
gretel Follows and collects breadcrumbs accross the web =mauricebutler 2013-09-27 0.0.6 gretel grettel simple crawler spider web crawl breadcrumbs hansel
grunt-cookielist A grunt crawler to list all cookies on urls using phantomjs =nrmnrsh 2015-01-22 0.1.0 cookie cookies list phantom phantomjs crawl crawler
grunt-crawl PhantomJS-based web crawler with support for sitemap,… =mradcliffe 2015-07-17 0.2.1 gruntplugin phantom sitemap.xml crawler
grunt-ddvfont ddv for crawler =darkylin 2015-01-19 1.0.9
grunt-license-crawler Analyzes license information for multiple node.js modules… =mwittig 2015-04-14 0.0.1 license npm checker crawler gruntplugin
grunt-link-checker Finds broken links and resources on websites =chriswren 2015-04-21 0.1.0 grunt plugin gruntplugin link-checker broken links crawler
grunt-ulimit Bumps ulimit on the system so that more files can be opened. =chriswren 2014-05-28 0.0.0 grunt plugin gruntplugin ulimit broken links crawler
grunt-url-image-crawler Crawl your CSS/SCSS or HTML files for img URL's and store… =fvanharreveld 2013-12-10 1.3.1 gruntplugin css scss html img crawl images urls local file find search find
handelsregisterbekanntmachungen a module to crawl entries in the German commercial register… =maxwellium 2015-03-24 1.2.0 crawler handelsregisterbekanntmachungen handelsregister
hcrawler a hierachical web crawler with concurrency control and… =llllkkkk 2013-07-15 0.0.5 web crawler hierachical multiple level concurrency jQuery
hidemyemail A jQuery plugin that helps you to hide your email on your… =frenchfreelance 2015-08-30 0.1.0 jquery-plugin ecosystem:jquery jquery antispam email-protection hide-my-email crawlers script
hive-parser A JSON crawler that parses a JSON tree and responds to… =bingomanatee 2013-08-23 0.0.2 JSON node.js hive
hto-webcrawler 2014-09-10
http-agent A simple agent for performing a sequence of http requests… =indexzero =isaacs 2013-08-18 0.1.2 http-agent iterator http webcrawler
huntsman Super configurable async web spider =missinglink 2015-02-24 0.2.12 spider crawler crawl huntsman robot aync
hyperspider A declarative HATEOAS API crawler for node.js =jed 2012-08-09 0.1.1
ig.crawler =infogeeker 2014-05-09 0.0.0
image-crawler Scrape image urls from HTML website including CSS… =vorg 2014-11-06 1.0.1 scraping web-crawler images
images-spider simple cli-tool to get images from the website =luicfer 2015-04-19 0.0.5 crawler spider images-spider images-crawler images
img-crawler A module to download images from a given URL =radvieira 2013-02-21 0.0.2 crawl crawler image images downloader graphics spider
inka.js Reactive, configurable web crawler =wikp 2015-06-07 1.0.0 crawler reactive rx spider blog archive
is-bot Determines if a user-agent is a bot/spider/crawler. =gjohnson 2013-10-30 0.0.1 is-bot bot-regexp
is-crawler detect crawler =poying 2014-05-14 0.0.1 detect spider crawler googlebot bot
is-spider Using differents rules, try to know if a user agent string… =rodati =leandono 2015-07-04 2.0.1 bot user-agent crawler spider
isbot detects bots/crawlers/spiders via the user agent. =feroc1ty 2015-07-24 0.0.1 bot crawlers spiders googlebot useragent
itpub-crawler itpub crawler =pingjiang 2014-09-21 0.0.3 itpub-crawler node itpub crawler
jdistiller A page scraping DSL for extracting structured information… =bcoe 2015-01-09 2.0.0 crawler jQuery
jedi-crawler Lightsabing Node/PhantomJS crawler. Crawl almost… =spacenick 2013-08-22 0.0.3 phantom scraping scrape crawler crawl parse parser web
js-crawler Web crawler for Node.js =ant-ivanov 2015-07-07 0.3.2 web-crawler crawler scraping website-crawler crawling web-bot
json-web-crawler Turn the web crawler code into a single json =knovour 2015-02-03 0.0.6 json web-crawler crawl jquery
jwebquery An jQuery style web crawler(actually extend jquery). =nstal 2014-04-20 0.0.7 crawler web download
kickstarter-crawler Crawl kickstarter project data (30 + 8n data points - where… =ghostsnstuff 2015-03-16 0.1.3 kickstarter crawler crowdfunding data spider bot
knightcrawler DDM web article crawler =jwarkentin 2015-08-12 0.0.1 crawler
koa-detect-crawler Handle http request from crawler (Something like Googlebot) =poying 2014-05-14 0.0.0 detect crawler bot koa googlebot
krawler Fast and lightweight web crawler with built-in cheerio, xml… =ondrs 2014-06-23 0.3.3 dom javascript crawler crawling spider scraper scraping cheerio html xml json promise event
kreepy Simple web crawler which converts to Markdown =clinth 2014-12-17 0.0.2 spider crawler indexer
krowlr A fast asynchrone crawler =fe_lix_ 2012-01-05 1.0.0
le-manga CLI for download manga and serve it locally. =attomos 2014-05-14 0.0.6 manga downloader webcrawler
limit-request-promise http request for web scraping =you21979 2014-11-15 0.0.5 http request web scraping crawler
listal bot to download pictures from listal.com =bitoiu 2013-11-27 0.1.10 listal download pictures crawler
loki Crawl all the things =matomesc 2013-11-05 0.0.0 web crawler
magnetic Magnetic is a tool that makes it easy to fetch web pages… =neodon1014 2015-07-17 0.1.2 magnetic prerender googlebot crawl crawler search index
magnetic-example Magnetic example - Magnetic is an Express middleware that… =neodon1014 2015-07-16 0.1.0 magnetic prerender googlebot crawl crawler search index
magnetic-express Magnetic Express is an Express middleware for Magnetic - a… =neodon1014 2015-07-16 0.1.0 magnetic express middleware prerender googlebot crawl crawler search index
mangaweb Use this command line tool to download manga =mangaweb… 2015-08-31 3.2.13
maproxy A caching proxy with result parsing. =dominykas… 2015-04-14 0.0.1 proxy cache crawl crawler
markdown-crawler Returns map of markdown files in markdown =ralphsaunders 2015-02-13 0.0.1 markdown npm module
mean-seo SEO Solution for MEAN.JS applications which forwards… =roieki =amoshaviv 2014-10-11 0.0.8 phantomjs headless webkit mean meanjs seo
medusa-crawler a configureable distributed crawler =hiyijian 2014-07-10 0.0.1 distributed crawler
microcrawler Micro implementation of crawler =korczis 2015-05-30 0.0.5
mrspider simple polite crawling of the web. =vermiculite 2015-08-14 1.3.0 polite spider crawler crawling spidering scraping scraper screenscraper
neocrawler Nodejs Distribute Crawler =successage 2015-05-11 2.0.0 nodejs crawler phantomjs
netcrawler Net Crawler is a web spider written with Nodejs =zhengzhiyu 2014-07-02 0.8.6 dom javascript crawling spider scraper scraping jquery
nginxtop See what IPs are hitting your website the most in real… =dmuth 2014-05-06 0.1.2
nightcrawler Tor control interface and anonymizer =d-oliveros 2014-08-24 0.2.1 tor anonymizer anonymous tor-control
nj-npm-crawler-wrapper =xumoooo 2014-11-23 0.0.1
nlp-gs parse nginx log files, filtering crawler bot, pass to… =80xer 2015-08-05 1.0.5 nginxparser nginx logfile google spreadsheet
node-ajax-seo It deals with the most popular crawlers, redirecting them… =ericzon 2015-01-27 1.0.3 node ajax seo
node-bot Fast and Real-time extraction of web pages information… =ayms 2012-08-28 0.1.0 bot gadget widget crawler dom html js style css w3c javascript ajax
node-ckan-crawler NodeJS based crawler for CKAN sites =sogko 2014-07-30 0.0.3 crawler ckan
node-crawler Node.JS Multithreaded Web Crawler with rules to parse site =13w 2013-10-01 0.0.2 crawler jquery spider parser web site
node-crawler-server casperjs server for scraping, with it you can simple write… =shawn_ljw 2014-09-23 0.0.9 casperjs phantomjs crawler mongo job queue
node-googleplay-api use google play protobuf api in node =dweinstein 2015-08-14 1.1.14 google play store play api crawler
node-nightcrawler 2014-08-16
node-readability-cheerio node-readability的cheerio版。支持GBK、GB2312等编码的网页抓… =indooorsman 2015-05-22 1.0.0 readability GBK GB2312 web crawler cheerio
node-scrapy Simple, lightweight and expressive web scraping with Node.js =stefanmaric 2014-12-26 0.2.1 web html scraping scrape scraper scrapy crawler
node-simple-crawler Simple web-crawler for nodejs =antixrist 2015-06-07 0.1.6 spider crawler grabber parser
node-solrcrawler A solr crawler =hguillermo 2014-09-25 0.0.3
node-spider Generic web crawler powered by NodeJS =flesler 2015-02-10 0.8.1 spider crawler node nodejs web scrap crawl
node-web-scraper node-web-crawler ================ =kbepari 2014-01-17 0.0.1
node-webcrawler Crawler is a web spider written with Nodejs. It gives you… =mike442144 2015-08-10 0.5.1 dom javascript crawling spider scraper scraping jquery crawler
nodecrawler Package to crawl websites =ramseydsilva 2014-04-25 0.0.5 link request
nolimit-crawl-storing Nolimit's own crawler storing engine module =ans4175 2015-04-10 1.0.3 nolimit
nolimitid-crawl-storing Nolimit's own crawler storing engine module =ans4175 2015-06-11 1.0.12 nolimit
nolimitid-crawler-master =harippe 2015-04-09 1.0.0
nolimitid-crawler-slave-engine =harippe 2015-07-01 2.7.0
norch-fetch Fetch pure HTML from a webserver and save it to disk =fergie 2014-09-07 0.0.2 crawler
npm-license-crawler Analyzes license information for multiple node.js modules… =mwittig 2015-08-10 0.1.0 license npm checker crawler
nspider A Node.js Web Spider =xiongjia 2014-11-02 0.0.1 spider crawler
nz-npm-crawler-wrapper =two_capitals 2015-03-05 0.0.2-beta
object-crawl object crawler =stryju 2015-04-13 1.0.1 object crawl key chain
octicons-glyphsearch-crawler Crawls the GitHub Octicons website and formats it for… =matiassingers 2014-09-24 0.2.1 icons glyphs octicons fonts
orbweaver A simple webcrawler that prints out the URLs of the pages… =boutell 2014-12-07 0.1.2 webcrawler webspider spider crawler load testing load tester siege traffic
osctranslatecrawler oscTranslateCrawler =youxiachai 2013-06-23 0.0.9 ocschina translation
osmosis Web scraper for NodeJS =rc0x03 2015-06-18 0.0.8 web scraper crawler html xml parser
pagemunch A node.js wrapper for the PageMunch web crawler API =tommoor 2013-02-12 0.1.0 web crawler spider metadata parsing link url microformats schema.org json
painless-crawler A painless Node.js web crawler that simply works =skewedlines 2015-08-05 0.0.2-alpha crawler scraper jquery node
pauk web crawler =igord 2014-05-27 0.0.2 crawler scraping web spider
peter-parker A web spider app, it converts a page into a json object =johnhu 2014-09-18 0.1.1 html jsdom spider crawler
pgn-crawler Crawls pgns (chess files) for a given player. =ibihim 2015-08-21 1.0.0 chess pgn zombiejs es6
phanos Simple human like stress test tool. This tools doesn't… =dtaynov 2014-02-03 0.1.4 stresstest stress-test stress stresslogs stress-logs load-test load test performance site crawler spider
phantalyzer A PhantomJS script for running Wappalyzer over many sites… =mcheavyd 2013-12-17 0.1.80 compliance tags analytics crawler report PhantomJS Wappalyzer
phantom-crawl Web crawler for ajax applications =vmeurisse 2013-06-26 0.0.3 ajax crawler crawl SEO phantom phantomjs
phantom-scraper promise based interface to inject a script into a phantom… =diffalot 2015-02-08 0.0.3 phantomjs crawler scraper
pixnet-posts-crawler PIXNET posts crawler for node.js =pleasurazy 2015-07-09 0.2.1 pixnet crawler article post node
plentiful-files Abstract layer for managing big sets of files =f1ames 2015-01-17 0.0.5 files fs read write unlink manage crawler
plucky-crawler The error crawler that powers http://plucky.io/ =pauljohncleary 2014-05-03 0.0.1 plucky error dom javascript crawling spider scraper scraping jquery
promise-parser Web scraper =rc3 2014-10-18 0.0.4 web scraper crawler html xml parser
pub-crawler Visualize homebrew formulae in your machine. =shuhei 2014-09-23 1.2.1
qonsumer configurable API crawler for scraping feeds to static files =cryptoquick 2015-02-10 0.5.1 api crawler scraper json consumer command static grunt
rawblog-crawler DEEP BETA. Crawls pages, a bit like Jekyll =devgru 2012-03-23 0.0.2 blog crawler
repunt Simple, configurable and extensible webcrawler =jlarsson 2014-06-03 0.1.4 crawl crawler spider util utility
reqscraper Lightweight wrapper for Request and X-Ray JS. =kengz 2015-08-18 0.0.3 HTTP request scraper crawler web x-ray phantom javascript js
revealation Reveal.js slide crawler to generate slide PDFs =codemiller 2014-09-09 0.0.5 Reveal.js PDF generation presentation slides
revenant A headless browser powered by PhantomJS functions in Node.js =skewedlines 2015-08-31 0.1.2 phantom phantomjs headless browser scraper crawler
rippled-network-crawler command line interface to crawl a rippled network =souren 2015-08-11 0.0.1 ripple crawler rippled
roach A very adaptable web crawler framework. Impossible to kill. =ekryski 2014-05-28 0.1.2
roboto A web crawler for Nodejs. =jculvey 2014-08-24 0.8.2 crawler crawling spider spidering scraping scraper robot bot cheerio
routers-news A crawler for various popular tech news sources. Read… =bcoe 2014-11-04 1.0.4 crawler
ruthless Deprecated. Use 'cameo-crawler' instead. =chbrown 2015-06-09 1.0.0
salmonjs Web Crawler in Node.js to spider dynamically whole websites. =fabiocicerchia 2014-05-26 0.5.0 cli web crawler salmonjs
sandcrawler Scrape front-end, automatize back-end. =yomguithereal 2015-04-05 0.0.1 crawler scraper
sandcrawler-dashboard A handy terminal dashboard plugin for sandcrawler. =yomguithereal 2015-03-24 0.1.1 sandcrawler-plugin dashboard
sandcrawler-logger A logger plugin for sandcrawler. =yomguithereal 2015-03-24 0.1.1 logger sandcrawler-plugin
scawler A scraping crawler =weidong 2013-12-11 0.0.1 scraper crawler
scrapebp Boilerplate code for a Node.js scraper with CLI =leesei 2015-06-25 0.5.0 scraper crawler boilerplate
scraper-js From the Bay to LA, scraper will collect all of the images… =jasonaibrahim 2015-03-02 1.0.2 scrape thumbnails images facebook twitter thumbnail image scraper web crawler image web oakland
scraperrr Web crawler configured by JSON configurations defining what… =dahie 2014-03-07 0.7.0 wiki pirate spickerrr scrape spider crawl
scrapinode content driven and route based scraper =lbdremy 2013-06-22 0.2.0 scraper dom manipulation crawler jquery cheerio jsdom
seenreq A library to test if a url is crawled, usually used in a… =mike442144 2015-06-30 0.0.5 nodejs url seen test reomve duplicate url request normalize
seo-checker A library for checking basic SEO signals of a website =billpatrianakos 2015-01-16 0.3.2 SEO HTML parser crawler
seoserver <h3>Welcome!</h3> <p>Seo Server is a command line tool that… =thomasdavis 2012-10-03 1.1.6
simple-crawler 2015-06-02
simple-web-crawler Simple Web Crawler =victorkl 2015-07-03 0.0.1 web crawler spider
simplecrawler Very straigntforward web crawler. Uses EventEmitter.… =Christopher… 2015-04-21 0.5.2 simple crawler spider cache queue simplecrawler eventemitter
simplecrawler-queue-mongo MongoDB queue for Node Simple Crawler =lazurski 2014-10-07 0.1.8 SimpleCrawler Mongo Mongoose Queue
simplecrawler-referrer-filter Very straigntforward web crawler. Uses EventEmitter.… =davidsinclair 2015-04-01 0.3.1-1.1 simple crawler spider cache queue simplecrawler eventemitter
simplecrawling Crawler made simple =rafa.cesar 2014-12-28 0.0.3 crawler crawling crawlers spider web-request request requests
site-parser Parse websites with templates =eiriklv 2015-08-13 1.0.2 scraper parser site crawler
sitemap-generator Creates an XML-Sitemap by crawling a given site. =graubnla 2015-08-14 2.1.1 sitemap xml generator crawler seo google
sitemapper Parser for XML Sitemaps to be used with Robots.txt and web… =hawaiianchimp 2014-08-21 0.0.1 parse sitemap xml robots.txt sitemaps crawlers webcrawler
sitescraper Site scraper for Congregator =eiriklv 2014-06-28 1.0.1 scraper parser site crawler
slinky web crawler just for links =andrejewski 2014-08-03 0.0.1 web crawler link hyperlink sitemap
smart-crawler This module is propose to scrapy website pages and extract… =hh54188 2014-07-03 0.0.0 scrapy crawler robot robots hacker
smeagol A easy to use NodeJS HTTP web-crawler. =gserrano 2015-01-27 0.0.4 crawler scrapper
snapshooter Simple crawler for Single Page Applications =arboleya =hems 2014-03-24 0.3.9 crawler html javascript ajax render
snapshoter Recusively loads javascript pages and render then to plain… =arboleya 2013-02-08 0.1.1 crawler html javascript ajax render
sound-crawler A nodejs directory crawler that indexes audio files =codecurve 2013-10-15 0.0.0
soundcrawler SoundCrawler =================== =antonhansel 2015-01-05 1.0.2 soundcloud download
spa-crawler Crawl 100% JS single page apps with phantomjs and node. =lukekarrys 2014-09-12 1.1.0 phantomjs spa crawler
special-agent Thin wrapper around a compilation of common user agent… =yomguithereal 2015-02-26 0.1.0 user-agent ua scraper crawler
spidee Tiny web crawler =michal.szajter 2015-03-24 0.1.6 web crawler spider
spider-detector A tiny node module to detect spiders/crawlers quickly and… =michael.heuberger 2015-08-08 1.0.14 crawler detector spider bot middleware single page app expressjs
spider-engine Web crawling and scraping engine. =d-oliveros 2014-09-20 0.1.3 crawler spider scraping scrape engine event-emitter
spider-event A simple evented web scraping framework using node.js =luoyetx 2014-04-21 0.0.2 spider crawler nodejs event-driven web
spidercheck basic web crawler =bencevans 2013-05-05 0.0.0
spiderman-crawler Spiderman makes it trivial work to write a crawler. Just… =ltebean 2014-07-10 0.2.3 spider crawler
spidex A web crawler for node.js. =xadillax 2015-08-08 2.0.5 clawler
spidey Web Crawler in Node.js to spider dynamically whole websites. =fabiocicerchia 2013-12-29 0.2.2 cli web crawler spidey
splitzee-crawler Crawler is a web spider written with Nodejs. It gives you… =andreioprisan 2013-10-31 0.2.6 dom javascript crawling spider scraper scraping jquery
spoder A nodejs crawler module =menixator 2014-12-07 0.2.3 javascript crawler scraper dom cheerio
spotify-crawler Crawls all international Spotify pages and returns prices… =matiassingers 2014-09-24 0.2.0 spotify crawler scrape pricing
spotlight An object crawler/property search library that works on… =jdalton =d10 2014-12-01 1.1.0 crawl find search utility
spyderino A Web Crawler =aisaacs =vampirical 2015-07-24 0.1.8
starkana-manga-crawler Manga downloader using starkana.com. =jfmengels 2015-02-18 0.1.4 manga download starkana
status-check This package take a list of website link as csv and create… =sguha-work 2015-05-13 0.0.15 HTTP HTTPS response link-checker status-check website-status HTTP-status Web-crawler http checker status connect express
steer Use steer to control your chrome (the browser) =andreasmadsen 2015-03-14 0.6.0 google chrome chrome remote crawler webkit inspector extension browser control
suck Simple crawler tree of patterns. =kaiquewdev 2013-09-15 0.0.7 cli web-crawler pattern
tarantula nodejs crawler/spider which provides a simple interface for… =gpolitis… 2014-04-18 0.2.1 spider crawler scraper phantom phantomjs
teemo Node.js crawler frame. =youyudehexie 2014-11-06 0.0.1 crawler spider scrapy
tibia-crawler A tibia crawler module for Node. =renatorib =gpedro 2015-04-07 0.1.1 tibia-crawler tibia otserv crawler tibiajs tibiacrawler
tibia-node-crawler Tibia Crawler in nodejs =renatorib 2015-04-09 0.1.2
tpbcrawler The Pirate Bay crawler for Node.js =elmccd 2014-05-27 0.1.0 tpb the pirate bay torrent crawler
trawler Express middleware to troll bots. A combination of… =prinzhorn 2014-01-30 0.0.1 crawling crawler bots
unsplash-crawl Crawl all images from unsplash =chirag04 2015-02-05 0.0.2 unsplash crawler
unsplash-crawler Crawl all images from unsplash =duyetdev 2015-08-27 0.0.1 unsplash crawler duyetdev crawl
url-info-scraper Library to retrieve meta data (title, favicon address etc)… =pauljohncleary 2015-08-07 0.1.2 url-info-scraper url info scraper url metadata scraper title crawler favicon mime type checker info
usenet-crawler-js-api Implementation of the usenet-crawler.com api in node =larsvonqualen 2015-02-12 1.1.0
vigenere npm package for basic encoding and decoding =wallcrawler 2015-09-06 1.0.2 vigenere
w3c-validator Crawls a given site and checks for W3C validity. =graubnla 2015-08-21 2.2.0 w3c validator crawler check
waff simple crawler with cheerio =zheng1 2014-09-13 1.0.2
wangyi-game-news-crawler A crawler for http://a.163.com, which 綜合, 攻略, 趣聞,… =pleasurazy 2015-08-12 2.3.1
web-automation A crawler framework base on jsdom. =zyp001a 2014-06-25 0.0.1 node crawler scrapper
web-crawler Scalable, extensible, web crawler framework. =eckardt 2013-10-23 0.0.0 framework web html rss crawler crawling spider spidering scraper scraping router dom selector jquery cheerio distributed cloud ironio iron.io
web-htmlparser moved to web-automation. =zyp001a 2014-06-20 0.0.4 node crawler
webcheck A module to analyse websites for SEO, validation and… =atd 2014-05-28 0.3.1 SEO web crawler crawl analyzer reporter quality management change management data mining
webcrawler Crawls given domains to provide a site map of static assets =lawrencejones 2014-12-09 2.0.2 web crawler sitemap node coffee
webrobber A light weight nodejs library to helps you grab the… =tanker327 2015-08-13 0.2.0 webrobber web crawler content
website-crawler Under development, will become available soon with install… =yassine-khachlek 2015-08-10 1.0.0
wikifetch Uses cheerio to return a structured JSON representation of… =bcoe 2015-01-09 0.0.2 crawler twitter public-timeline
wikivet-crawler Crawls WikiVet for quizzes. =sugarstack 2013-04-10 0.0.4
wordpress-posts-crawler Wordpress posts crawler for node.js =pleasurazy 2015-07-17 0.3.1 wordpress crawler article post node
x-ray-crawler x-ray's crawler =mattmueller 2015-07-18 2.0.2 x-ray crawler request scrape scraper
x-ray-harvest Web Scraper Service =diogoazevedos 2015-09-10 2.0.0 x-ray cheerio web api scraper crawler
yan-crawler Yet another Node crawler. =cgavrila 2015-08-24 0.0.1 crawler
zh-vote-crawler The goal is to write a crawler to fetch and extract the… =tpreusse 2015-04-12 0.1.5
zoo-crawler The crawler module fetches the best general information for… =bastianallgeier 2014-02-10 0.0.1 crawler url open graph content type detection oembed
zx_crawler nodejs之使用 superagent 与 cheerio 完成简单爬虫 =zx 2014-10-11 1.0.0 nodejs 爬虫