@xurenlu
Created September 1, 2014 14:32
scrapy example
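import scrapy
from scrapy import Request

# `MyItem` is assumed to be a scrapy.Item subclass defined elsewhere in the project.


class MySpider(scrapy.Spider):  # scrapy.Spider replaces the older BaseSpider
    name = 'myspider'
    start_urls = (
        'http://example.com/page1',
        'http://example.com/page2',
    )

    def parse(self, response):
        # collect `item_urls` from the listing page
        for item_url in item_urls:
            yield Request(url=item_url, callback=self.parse_item)

    def parse_item(self, response):
        item = MyItem()
        # populate `item` fields
        # pass the partially filled item to the next callback through `meta`
        yield Request(url=item_details_url, meta={'item': item},
                      callback=self.parse_details)

    def parse_details(self, response):
        # pick up the item that parse_item attached to the request
        item = response.meta['item']
        # populate more `item` fields
        return item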
# This code snippet comes from: http://www.sharejs.com/codes/python/6398
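
The spider above leaves `item_urls` and `item_details_url` as placeholders, so the hand-off of one item across three callbacks is easy to miss. The sketch below imitates that flow with the standard library only, so it runs without Scrapy installed. Everything in it is an assumption made for illustration: the `PAGES` dictionary stands in for the network, and the `Request`, `Response`, and `crawl` names are simplified stand-ins rather than Scrapy's own classes or scheduler.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical in-memory "site": URL -> already-parsed page content.
PAGES = {
    "http://example.com/page1": {"items": ["http://example.com/item/1"]},
    "http://example.com/item/1": {
        "title": "Widget",
        "details": "http://example.com/item/1/details",
    },
    "http://example.com/item/1/details": {"price": "9.99"},
}


@dataclass
class Request:
    """Simplified stand-in for a crawl request (not scrapy.Request)."""
    url: str
    callback: Callable
    meta: dict = field(default_factory=dict)


@dataclass
class Response:
    """Simplified stand-in for a response; carries the request's meta."""
    url: str
    data: dict
    meta: dict


def parse(response):
    # Listing page: schedule one request per item URL.
    for item_url in response.data["items"]:
        yield Request(item_url, callback=parse_item)


def parse_item(response):
    # Item page: start the item, then forward it through `meta`.
    item = {"title": response.data["title"]}
    yield Request(response.data["details"], callback=parse_details,
                  meta={"item": item})


def parse_details(response):
    # Details page: finish the item that parse_item started.
    item = response.meta["item"]
    item["price"] = response.data["price"]
    return item


def crawl(start_urls):
    """Run callbacks until no requests remain; return the finished items."""
    queue = deque(Request(url, callback=parse) for url in start_urls)
    items = []
    while queue:
        request = queue.popleft()
        response = Response(request.url, PAGES[request.url], request.meta)
        result = request.callback(response)
        if isinstance(result, dict):  # a callback may return a single item
            result = [result]
        for output in result or []:
            if isinstance(output, Request):
                queue.append(output)
            else:
                items.append(output)
    return items


items = crawl(["http://example.com/page1"])
```

Here `items` ends up holding one dictionary that carries both the `title` set in `parse_item` and the `price` added in `parse_details`, which is the same role `response.meta['item']` plays in the spider.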