Last active
January 19, 2022 14:01
A couple of simple options to parse sitemap.xml to warm the cache or for other actions such as generating memory_profiler checks.
# This can be added to your cron job to run right after Drupal's cron, or combined with it into a
# single command so that it runs automatically when the cron run completes.
wget -q http://www.example.com/sitemap.xml -O - | grep -Eo "http://www\.example\.com[^<]+" | wget -q -i - -O /dev/null --wait 1
#!/bin/bash
DOMAIN='example.com'
# One-liner with wget. To use it on the CLI, replace $DOMAIN with the domain itself.
wget -q "http://$DOMAIN/sitemap.xml" --no-cache -O - | grep -Eo "http://$DOMAIN[^<]+" | wget --spider -i - --wait 1
#!/bin/bash
DOMAIN='www.example.com'
# wget and cURL: fetch the sitemap with wget, then request and time each URL with curl.
wget -q "http://$DOMAIN/sitemap.xml" --no-cache -O - | grep -Eo "http://$DOMAIN[^<]+" | while read -r line;
do
  # Quote the URL so characters such as & or ? are not interpreted by the shell.
  time curl -A 'Cache Warmer' -s -L "$line" > /dev/null 2>&1
  echo "$line"
done
# If you have multiple languages, add another sitemap, e.g. http://$DOMAIN/es/sitemap.xml
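
One possible way to handle several language sitemaps without copying the pipeline is sketched below. The helper names `extract_urls` and `sitemap_urls` are hypothetical and not part of the original scripts, and the `es` prefix is only the example from the comment above. The sketch also escapes the dots in the domain before using it in the regular expression, which the one-liners above do not do.

```shell
#!/bin/bash
# Hypothetical helper: print every http:// URL for the given domain found on stdin.
extract_urls() {
  local domain_re
  # Escape dots so "example.com" does not also match "exampleXcom".
  domain_re=$(printf '%s' "$1" | sed 's/\./\\./g')
  grep -Eo "http://${domain_re}[^<]+"
}

# Hypothetical helper: print one sitemap URL per language prefix.
# An empty prefix ("") stands for the default-language sitemap.
sitemap_urls() {
  local domain=$1 prefix
  shift
  for prefix in "$@"; do
    if [ -n "$prefix" ]; then
      printf 'http://%s/%s/sitemap.xml\n' "$domain" "$prefix"
    else
      printf 'http://%s/sitemap.xml\n' "$domain"
    fi
  done
}

# Example wiring (commented out because it makes network requests):
# sitemap_urls "$DOMAIN" "" es | while read -r sitemap; do
#   wget -q "$sitemap" --no-cache -O - | extract_urls "$DOMAIN"
# done | while read -r url; do
#   curl -A 'Cache Warmer' -s -L -o /dev/null "$url"
# done
```

Keeping the URL extraction in its own function means it can be checked against a saved sitemap file before any requests are sent to the site.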
Inspired by your gist I created: https://gist.github.com/JPustkuchen/f185bee60c5a36211cdf6f1c8f6deebe
Ideally there would be an if that checks whether the second level contains only sub-sitemap pages before jumping into the loop, but so far I have always needed sub-sitemaps.
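
A check like the one described could look roughly like the sketch below. It is not the code from the linked gist. The names `is_sitemap_index`, `loc_urls`, `page_urls` and the `FETCH` variable are hypothetical, and it assumes each `<loc>` element sits on a single line and that an index is only one level deep.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the XML on stdin is a sitemap index.
is_sitemap_index() {
  grep -q '<sitemapindex'
}

# Hypothetical helper: print the contents of every <loc> element on stdin.
loc_urls() {
  grep -Eo '<loc>[^<]+' | sed 's/^<loc>//'
}

# FETCH is an assumption: any command that prints the body of a URL to stdout.
FETCH=${FETCH:-wget -q --no-cache -O -}

# Print the page URLs of a sitemap, descending one level if it is an index.
page_urls() {
  local body sub
  body=$($FETCH "$1")
  if printf '%s\n' "$body" | is_sitemap_index; then
    # Each <loc> in an index points at a sub-sitemap, so fetch those too.
    printf '%s\n' "$body" | loc_urls | while read -r sub; do
      $FETCH "$sub" | loc_urls
    done
  else
    printf '%s\n' "$body" | loc_urls
  fi
}

# Example (commented out because it makes network requests):
# page_urls "http://$DOMAIN/sitemap.xml" | wget --spider -i - --wait 1
```

With this shape the same call works whether or not the site splits its sitemap into sub-sitemaps.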