@mdeous
Created April 8, 2011 13:40
sitemap spider
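# Older Scrapy releases exposed this class as scrapy.contrib.spiders.XMLFeedSpider.
from scrapy.spiders import XMLFeedSpider


class SitemapSpider(XMLFeedSpider):
    name = "sitemap"
    # (prefix, uri) pairs made available to XPath queries on each node
    namespaces = [
        # ('', 'http://www.sitemaps.org/schemas/sitemap/0.9'),
        ('video', 'http://www.sitemaps.org/schemas/sitemap-video/1.1'),
    ]
    start_urls = ["http://mattoufoutu.rafale.org/sample_sitemap.xml"]
    itertag = 'url'  # parse_node() is called once per <url> element

    def parse_node(self, response, node):
        # print() with a single argument behaves the same on Python 2 and 3
        print("PARSING %s" % str(node))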
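
The spider above only prints each `<url>` node. As a rough illustration of what namespace-aware extraction from such nodes could look like, here is a minimal sketch that uses the standard library's `xml.etree.ElementTree` rather than Scrapy. The `sm` prefix, the `SAMPLE` document, and the `extract_entries` helper are all hypothetical names made up for this example; the namespace URIs are the ones listed in the spider.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the spider's `namespaces` list.
# The "sm" prefix is an arbitrary choice for this sketch.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "video": "http://www.sitemaps.org/schemas/sitemap-video/1.1",
}

# Hypothetical sample sitemap, invented for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.sitemaps.org/schemas/sitemap-video/1.1">
  <url>
    <loc>http://example.com/a</loc>
    <video:video><video:title>Clip A</video:title></video:video>
  </url>
  <url>
    <loc>http://example.com/b</loc>
  </url>
</urlset>
"""


def extract_entries(xml_text):
    """Return a list of (loc, video_title) tuples, one per <url> element."""
    root = ET.fromstring(xml_text)
    entries = []
    # <url> lives in the default sitemap namespace, so it needs a prefix here.
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        # findtext() returns None when the element is absent.
        title = url.findtext("video:video/video:title", namespaces=NS)
        entries.append((loc, title))
    return entries


if __name__ == "__main__":
    for loc, title in extract_entries(SAMPLE):
        print("PARSING %s (video title: %s)" % (loc, title))
```

Elements in the document's default namespace still have to be addressed through a prefix in the query, which may be relevant to the commented-out `('', ...)` entry in the spider.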