@jmcarp
Created November 8, 2015 05:12
scraping for humans?
"""
Scrapy includes an `ItemLoader` class and associated helpers to abstract
data extraction from `Response` objects. But this API is verbose and can
easily result in more boilerplate, not less. The following is a quick sketch of
a possible interface for using marshmallow, with a few custom fields, to
pull data from Scrapy responses.
"""
class PersonSchema(Schema):
    name = fields.XPath('//title/text()', fields.Str)
    hobbies = fields.CSS('.hobby', fields.List(fields.Str))

    @fields.Method()
    def details(self, response):
        labels = response.css('.label::text').extract()
        values = response.css('.value::text').extract()
        return dict(zip(labels, values))

PersonSchema().dump(response)
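
The sketch above leaves the custom fields undefined. As a rough illustration of how such a declarative interface might work, here is a minimal stand-in that uses only the standard library. It is not marshmallow or Scrapy code: the `XPath`, `Method`, and `Schema` classes below are hypothetical, the "response" is a parsed `xml.etree.ElementTree` element, and the selectors use ElementTree's limited path syntax rather than full XPath or CSS.

```python
import xml.etree.ElementTree as ET


class XPath:
    """Hypothetical field: select nodes by ElementTree path and convert their text."""

    def __init__(self, path, convert=str, many=False):
        self.path = path
        self.convert = convert
        self.many = many

    def extract(self, root):
        texts = [(node.text or "").strip() for node in root.findall(self.path)]
        if self.many:
            return [self.convert(text) for text in texts]
        # Single-valued field: first match, or None when nothing matched.
        return self.convert(texts[0]) if texts else None


class Method:
    """Hypothetical field: wraps a function called with the schema and the document."""

    def __init__(self, func):
        self.func = func


class Schema:
    """Hypothetical base class: dump() runs every field declared on the subclass."""

    def dump(self, root):
        result = {}
        # Only fields declared directly on this class are collected;
        # inherited fields are ignored to keep the sketch short.
        for name, field in vars(type(self)).items():
            if isinstance(field, XPath):
                result[name] = field.extract(root)
            elif isinstance(field, Method):
                result[name] = field.func(self, root)
        return result


class PersonSchema(Schema):
    name = XPath(".//title")
    hobbies = XPath(".//li[@class='hobby']", many=True)

    @Method
    def details(self, root):
        labels = [node.text for node in root.findall(".//span[@class='label']")]
        values = [node.text for node in root.findall(".//span[@class='value']")]
        return dict(zip(labels, values))


# Made-up sample document, well-formed so that ElementTree can parse it.
PAGE = """
<html>
  <head><title>Ada</title></head>
  <body>
    <ul><li class="hobby">chess</li><li class="hobby">math</li></ul>
    <p><span class="label">born</span><span class="value">1815</span></p>
  </body>
</html>
"""

person = PersonSchema().dump(ET.fromstring(PAGE.strip()))
print(person)
```

A real version would presumably subclass marshmallow's `fields.Field` and call `response.xpath(...)` or `response.css(...)` on a Scrapy response; that wiring is left out here because the original sketch does not specify it.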