"""
This is a simple example of WebSocket + Tornado + Redis Pub/Sub usage.
Do not forget to replace YOURSERVER with the correct value.
Keep in mind that you need the *very latest* version of your web browser.
You also need to add Jacob Kristhammar's websocket implementation to Tornado.
Grab it here:
http://gist.github.com/526746
Or clone my fork of Tornado with websocket included:
http://github.com/pelletier/tornado
Note that the Pub/Sub protocol is only available in Redis 2.0.0 and later:
from scrapy import log
from scrapy.item import Item
from scrapy.http import Request
from scrapy.contrib.spiders import XMLFeedSpider


def NextURL():
    """
    Generate a list of URLs to crawl. You can query a database or come up with some other means.
    Note that if you generate URLs to crawl from a scraped URL then you're better off using a
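
The docstring above describes producing the URLs to crawl from some external source. A minimal sketch of that idea follows, assuming a hypothetical in-memory list in place of a database query; the names `next_url` and `SEED_URLS` are illustrative and do not come from the original snippet.

```python
def next_url(seed_urls):
    """Yield each URL to crawl once, skipping blank entries and duplicates."""
    seen = set()
    for url in seed_urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            yield url


# Hypothetical seed list standing in for a database query.
SEED_URLS = [
    "http://example.com/feed.xml",
    "http://example.com/feed.xml",  # duplicate, yielded only once
    " ",                            # blank entry, skipped
    "http://example.org/rss",
]

urls = list(next_url(SEED_URLS))
```

Using a generator keeps the source of URLs lazy, so a database cursor could replace the list without changing the calling code.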
import os
import shutil
import subprocess
import sys
import tarfile
import urllib2

LIBXML2_PREFIX = "libxml2"
LIBXSLT_PREFIX = "libxslt"
LIBXML2_FTPURL = "ftp://xmlsoft.org/libxml2/"