@comewalk
Created June 28, 2012 16:23
Parse sitemap.xml for URLs
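#!/usr/bin/env python
from xml.etree import ElementTree


def main():
    # Namespace the sitemap is expected to declare; newer sitemaps usually
    # declare 'http://www.sitemaps.org/schemas/sitemap/0.9' instead.
    namespace = 'http://www.google.com/schemas/sitemap/0.84'
    xml = ElementTree.parse('./sitemap.xml')
    root = xml.getroot()
    # Print the text of every <loc> element in that namespace.
    for item in root.findall('.//s:loc', namespaces=dict(s=namespace)):
        print(item.text)


if __name__ == '__main__':
    main()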
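
The script above finds nothing if the sitemap declares a different namespace than the hard-coded one. A minimal sketch of one way around that, assuming the namespace can be read from the root element's tag: `extract_locs` is a hypothetical helper name, not part of the original gist, and it takes the XML as a string rather than a file path.

```python
from xml.etree import ElementTree


def extract_locs(xml_text):
    """Return the text of every <loc> element, whatever namespace is declared.

    Hypothetical helper for illustration; not part of the original script.
    """
    root = ElementTree.fromstring(xml_text)
    # ElementTree writes a namespaced tag as '{uri}localname'.
    if root.tag.startswith('{'):
        uri = root.tag[1:].split('}', 1)[0]
        path = './/{%s}loc' % uri
    else:
        # No namespace declared on the root element.
        path = './/loc'
    # Skip empty <loc> elements and trim surrounding whitespace.
    return [el.text.strip() for el in root.findall(path) if el.text]
```

For a file on disk, the same logic applies to `ElementTree.parse(path).getroot()` in place of `fromstring`.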