
@chales
Last active January 19, 2022 14:01
A couple of simple options to parse sitemap.xml to warm the cache or for other actions such as generating memory_profiler checks.
# This can be added to your cron job to run right after Drupal's cron, or combined with it into a single command so
# that it runs automatically when the cron run completes.
wget -q "http://www.example.com/sitemap.xml" -O - | grep -oE "http://www\.example\.com[^<]+" | wget -q -i - -O /dev/null --wait 1
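As a rough sketch of the cron setup described above, the one-liner could live in a small wrapper script that cron calls once Drupal's cron run has finished. The script path, the function names, the drush invocation and the schedule are all assumptions made for illustration; none of them come from the original gist.

```shell
#!/bin/bash
# Hypothetical wrapper, e.g. saved as /usr/local/bin/warm-cache.sh (the path is an assumption).

# Print every URL under the given base URL found in sitemap XML read from stdin.
extract_urls() {
  local base="$1"
  # Escape the dots so they match literally in the extended regex.
  local pattern="${base//./\\.}"
  grep -oE "${pattern}[^<]+"
}

# Fetch the sitemap and request each listed page once, discarding the responses.
warm_cache() {
  local base="$1"
  wget -q "${base}/sitemap.xml" -O - | extract_urls "$base" | wget -q -i - -O /dev/null --wait 1
}

# Only run when a base URL is passed on the command line.
if [ "$#" -gt 0 ]; then
  warm_cache "$1"
fi

# Example crontab entry (assumes drush is installed and on cron's PATH):
# 0 * * * * drush --root=/var/www/html cron && /usr/local/bin/warm-cache.sh http://www.example.com
```

Chaining with `&&` means the warm-up only starts if the cron run exits successfully. Cron usually runs with a minimal PATH, so absolute paths to drush and wget may be needed.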
#!/bin/bash
DOMAIN='example.com'
# One-liner with wget. To run it directly on the CLI, replace $DOMAIN with the actual domain.
wget -q "http://$DOMAIN/sitemap.xml" --no-cache -O - | grep -oE "http://$DOMAIN[^<]+" | wget --spider -i - --wait 1
#!/bin/bash
DOMAIN='www.example.com'
# wget fetches the sitemap; cURL requests each URL and "time" reports how long each request took.
wget -q "http://$DOMAIN/sitemap.xml" --no-cache -O - | grep -oE "http://$DOMAIN[^<]+" | while read -r line
do
  time curl -A 'Cache Warmer' -s -L "$line" > /dev/null 2>&1
  echo "$line"
done
# If you have multiple languages, add another sitemap, e.g. http://$DOMAIN/es/sitemap.xml
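One possible way to cover several language sitemaps without copying the pipeline is to loop over a list of language prefixes. This is only a sketch: the function names `sitemap_locations` and `warm_all` are hypothetical, and it assumes each language exposes its sitemap at http://$DOMAIN/<prefix>/sitemap.xml, as in the example comment above.

```shell
#!/bin/bash

# Print the default sitemap URL plus one per language prefix.
# Usage: sitemap_locations DOMAIN [LANG...]
sitemap_locations() {
  local domain="$1"
  shift
  echo "http://${domain}/sitemap.xml"
  local lang
  for lang in "$@"; do
    echo "http://${domain}/${lang}/sitemap.xml"
  done
}

# Fetch every sitemap and request each listed URL with cURL.
# As in the scripts above, dots in the domain are not escaped in the regex.
warm_all() {
  local domain="$1" sitemap url
  sitemap_locations "$@" | while read -r sitemap; do
    wget -q "$sitemap" --no-cache -O - | grep -oE "http://${domain}[^<]+" | while read -r url; do
      curl -A 'Cache Warmer' -s -L -o /dev/null "$url"
      echo "$url"
    done
  done
}

# Example call (not executed here):
# warm_all www.example.com es de
```

Keeping the list of sitemap locations in its own function makes it easy to check which sitemaps will be fetched before any requests are sent.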
@kaziqta

kaziqta commented Nov 1, 2016

Very good, thank you! :)

Can the first one be directly used in crontab?

@JPustkuchen

JPustkuchen commented Jan 15, 2018

Thank you very much. Simple and cool!
Does this also work if sitemap.xml is split up into sub-sitemaps? That would be typical on larger websites.

@JPustkuchen

JPustkuchen commented Jan 15, 2018

Inspired by your gist I created: https://gist.github.com/JPustkuchen/f185bee60c5a36211cdf6f1c8f6deebe
Of course, the best approach would be to add an if check that tests whether the second level contains only sub-sitemap pages and only then enters the loop. So far, though, I have always needed sub-sitemaps.
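The check described above could be sketched roughly as follows. This is not the code from the linked gist; the function names are hypothetical. It assumes sub-sitemaps are announced through a `<sitemapindex>` root element as defined by the sitemaps.org protocol, that each `<loc>` value sits on one line without CDATA wrapping, and that the sub-sitemaps are not gzip-compressed.

```shell
#!/bin/bash

# Print the content of every <loc> element in sitemap XML read from stdin.
extract_locs() {
  grep -oE '<loc>[^<]+</loc>' | sed -E 's#</?loc>##g'
}

# Succeed if the XML passed as $1 is a sitemap index (a list of sub-sitemaps).
is_sitemap_index() {
  grep -q '<sitemapindex' <<< "$1"
}

# Print all page URLs reachable from the sitemap at $1, following sub-sitemaps.
collect_page_urls() {
  local xml loc
  xml=$(wget -q "$1" --no-cache -O -) || return 1
  if is_sitemap_index "$xml"; then
    # Each <loc> points at another sitemap, so descend into it.
    while read -r loc; do
      collect_page_urls "$loc"
    done < <(extract_locs <<< "$xml")
  else
    # Plain sitemap: each <loc> is a page URL.
    extract_locs <<< "$xml"
  fi
}

# Example call (not executed here):
# collect_page_urls "http://www.example.com/sitemap.xml" | wget --spider -q -i - --wait 1
```

Because the same function handles both cases, a site with a flat sitemap.xml and a site with split-up sitemaps can be warmed with the same command.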

@paweldesign

Very good, thank you! :)
