@keroxil
Last active December 10, 2015 07:28
from scrapy.spider import BaseSpider  # newer Scrapy releases expose this as scrapy.Spider


class DmozSpider(BaseSpider):
    # Unique name used to launch the spider: scrapy crawl dmoz
    name = "dmoz"
    # Pages the crawl starts from
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Do something useful here with the response,
        # e.g. extract structured data from the page
        pass
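
The `parse` placeholder above only hints at extracting structured data. Here is a minimal sketch of what that step might look like. It uses only the standard library's `html.parser`, not Scrapy's own selectors, so it runs without Scrapy installed. `LinkCollector`, `extract_links`, and the sample markup are hypothetical names introduced for illustration and are not part of the gist.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect a {"url", "title"} dict for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append({"url": self._href, "title": title})
            self._href = None


def extract_links(html):
    """Return the links found in an HTML string (hypothetical helper)."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Made-up markup standing in for a downloaded page
sample = (
    '<ul><li><a href="/book/1">Dive Into Python</a></li>'
    '<li><a href="/book/2"> Learning Python </a></li></ul>'
)
links = extract_links(sample)
```

Inside a spider, `parse` could pass the page's HTML to a helper like this and return or yield the resulting dicts. Scrapy's built-in selectors are the more usual choice there.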