@ifnull
Created November 17, 2013 21:11
Get URLs from sitemap.xml
# Fetch the sitemap, extract every <loc> value, put each URL on its own line, and drop blank lines.
curl -ks http://ec2-50-17-173-58.compute-1.amazonaws.com/sitemap.xml | xpath '/urlset/url/loc/text()' 2>/dev/null | sed -E $'s~https?:~\\\n&~g' | grep -vE '^[[:space:]]*$' > urls.txt
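
The sed step is the least obvious part of the pipeline, so here is a minimal sketch of it in isolation. It assumes the xpath tool prints the matched text nodes back to back with no separator, which is why a newline has to be inserted before every http: or https:. The function name split_urls and the example.com input are made up for illustration and are not part of the original gist. This variant puts a literal newline inside the single-quoted sed script instead of using bash's $'...' quoting, so it should also work in a plain POSIX sh.

```shell
# split_urls: hypothetical helper, not part of the original gist.
# Reads concatenated URL text on stdin and prints one URL per line.
split_urls() {
  # The backslash followed by a real newline inserts a line break in the
  # replacement; & re-emits the matched "http:" or "https:" unchanged.
  sed -E 's~https?:~\
&~g' | grep -vE '^[[:space:]]*$'   # drop the empty line created before the first URL
}

# Made-up sample standing in for xpath output with no separators.
printf '%s' 'http://example.com/ahttps://example.com/b' | split_urls
```

With the sample input above this prints http://example.com/a and https://example.com/b on two separate lines. Note that any URL containing a second "http:" (for example inside a query string) would also be split at that point.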