The following gist is an extract from the article Building a simple crawler. It crawls outward from a starting URL for a given number of bounces.
from crawler import Crawler
crawler = Crawler()
crawler.crawl('http://techcrunch.com/')
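The `Crawler` class itself is not shown in this extract. As a rough illustration of what "crawling from a URL for a given number of bounces" could look like, here is a minimal sketch using only the Python standard library. It is not the article's implementation: the names `crawl`, `max_bounces`, `fetch_page`, and `LinkParser` are hypothetical, and it assumes that a "bounce" means following one link away from the current page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_bounces=1, fetch_page=fetch):
    """Breadth-first crawl. Returns {url: number of bounces from start_url}.

    Assumption: one bounce = following one link from an already visited page.
    """
    seen = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_bounces:
            continue  # bounce limit reached: record the page but do not follow its links
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts
            link, _ = urldefrag(urljoin(url, href))
            if link.startswith("http") and link not in seen:
                seen[link] = depth + 1
                queue.append(link)
    return seen
```

Passing the download function as `fetch_page` is a design choice made for this sketch: it lets the bounce logic be exercised against an in-memory dictionary of pages instead of the live network. A real crawler would also want to respect `robots.txt` and rate-limit its requests, which this sketch does not do.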
#! /bin/sh
### BEGIN INIT INFO
# Provides: elasticsearch
# Required-Start: $all
# Required-Stop: $all
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Starts elasticsearch
# Description: Starts elasticsearch using start-stop-daemon
### END INIT INFO