@saml
Last active December 13, 2015 18:38
download sitemap and urls
#!/bin/bash
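# Fetch each URL listed in sitemap.xml (-nc skips files already present), then gunzip it. Needs GNU sed for \s and \?.
while read -r x; do wget -nc "$x" && gzip -d -f "$(basename "$x")"; done < <(grep '<loc>' sitemap.xml | sed 's!\s*<.\?loc>!!g')
# Print the unique URLs found in sitemap1.xml through sitemap8.xml.
grep --no-filename '<loc>' sitemap[1-8].xml | sed 's!\s*<.\?loc>!!g' | sort -u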
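
The grep/sed pipeline appears twice above, so it could be factored into a small function. The following is only a sketch: `extract_locs` is a hypothetical name that is not part of the original gist, and it assumes, as the commands above do, that each `<loc>...</loc>` element sits on a line of its own. It uses POSIX character classes instead of `\s` and `\?`, so it should not depend on GNU sed.

```shell
#!/bin/bash
# extract_locs (hypothetical helper): print the text of every <loc> element
# in the given sitemap files, one URL per line, sorted and de-duplicated.
# Assumption: each <loc>...</loc> pair is on its own line.
extract_locs() {
    grep -h '<loc>' "$@" \
        | sed -e 's!</\{0,1\}loc>!!g' \
              -e 's/^[[:space:]]*//' \
              -e 's/[[:space:]]*$//' \
        | sort -u
}
```

With this in place, the second command could be written as `extract_locs sitemap[1-8].xml`, and the download loop could read from `< <(extract_locs sitemap.xml)`.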