'''
Spider for IMDb
- Retrieve most popular movies & TV series with rating of 8.0 and above
- Crawl next pages recursively
'''
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import Selector

import MySQLdb.cursors
from twisted.enterprise import adbapi
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
from scrapy.utils.project import get_project_settings
from scrapy import log
SETTINGS = get_project_settings()

"""
Celery base task aimed at longish-running jobs that return a result.
``AwesomeResultTask`` adds thundering herd avoidance, result caching, progress
reporting, error fallback and JSON encoding of results.
"""
from __future__ import division
import logging
import simplejson

#!/usr/bin/env python
"""
Regex for URIs
These regex are directly derived from the collected ABNF in RFC3986
(except for DIGIT, ALPHA and HEXDIG, defined by RFC2234).
They should be processed with re.VERBOSE.
"""
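
As a rough illustration of the approach this docstring describes, the sketch below transcribes a handful of RFC 3986 ABNF rules (scheme, pct-encoded, unreserved, sub-delims, pchar) into regex fragments compiled with `re.VERBOSE`. The variable names and the choice of rules are assumptions made for this example; they are not taken from the original file, whose body is not shown here.

```python
import re

# Each fragment mirrors one ABNF rule from RFC 3986. Spaces between tokens
# are only for readability: re.VERBOSE ignores whitespace outside
# character classes.

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
scheme = r"[A-Za-z] [A-Za-z0-9+.-]*"

# pct-encoded = "%" HEXDIG HEXDIG
pct_encoded = r"% [0-9A-Fa-f]{2}"

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
unreserved = r"[A-Za-z0-9._~-]"

# sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
sub_delims = r"[!$&'()*+,;=]"

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
pchar = r"(?: %s | %s | %s | [:@] )" % (unreserved, pct_encoded, sub_delims)

# Hypothetical names for the compiled patterns used in this sketch.
SCHEME_RE = re.compile(scheme, re.VERBOSE)
PCHAR_RE = re.compile(pchar, re.VERBOSE)

if __name__ == "__main__":
    for candidate in ("http", "svn+ssh", "1http"):
        print(candidate, bool(SCHEME_RE.fullmatch(candidate)))
    for candidate in ("a", "%2F", "%2", "/"):
        print(candidate, bool(PCHAR_RE.fullmatch(candidate)))
```

Building larger rules by interpolating smaller fragments keeps each pattern close to the ABNF text it comes from, which is presumably why the fragments need `re.VERBOSE` to stay readable.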