@spleenteo
Last active December 4, 2024 15:26
#!/usr/bin/env bash
# Script to download a website for offline browsing using HTTrack.
# It mirrors the site and keeps its assets (CSS, JS, and images).
# Usage: ./statify.sh <url>   (for example: ./statify.sh https://www.example.com)
# Exit with a usage message if no URL was given
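if [ -z "$1" ]; then
  echo "Usage: $0 <url>" >&2
  exit 1
fi
DOMAIN=$1
# Derive the output folder from the host name: strip the scheme and any path,
# so both "https://www.example.com/blog" and "www.example.com" give "www.example.com"
DOMAIN_FOLDER=${DOMAIN#*://}
DOMAIN_FOLDER=${DOMAIN_FOLDER%%/*}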
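# Mirror the site with HTTrack. The "+" filters limit the crawl to the main domain and
# the assets hosted on datocms-assets.com; --robots=0 ignores robots.txt, and -N100 saves
# files in the site's structure without a per-host folder, so the assets land in the main domain folder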
httrack "$DOMAIN" -O "./$DOMAIN_FOLDER" "+*.${DOMAIN_FOLDER}/*" "+*.datocms-assets.com/*" -v -%c -%k --robots=0 -N100
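# Rewrite absolute datocms-assets.com URLs in the downloaded HTML so they point at the local copies.
# Note: "./" resolves relative to each HTML file, so the rewritten links only match for pages in the folder root.
# Note: 'sed -i' without a suffix is GNU sed syntax; BSD/macOS sed needs -i ''.
find "./$DOMAIN_FOLDER" -name '*.html' -exec sed -i 's#https://www\.datocms-assets\.com/#./#g' {} +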
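# Illustration only (an addition, not part of the original gist): the substitution from the
# find/sed step above, applied to text on stdin, with the dots in the host name escaped so
# they match literally. The helper name rewrite_asset_urls is hypothetical and nothing else
# in this script calls it; it is a sketch for trying the rewrite on one line before
# touching the mirrored files.
rewrite_asset_urls() {
  # Replace every absolute datocms-assets.com prefix with a relative "./"
  sed 's#https://www\.datocms-assets\.com/#./#g'
}
# Example:
#   echo '<img src="https://www.datocms-assets.com/123/logo.png">' | rewrite_asset_urls
# changes the src attribute to ./123/logo.png and leaves other URLs untouched.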
# Inform the user that the download is complete
echo "Website download completed."
# How to use this script:
# - Save the script as 'statify.sh'.
# - Make it executable with 'chmod +x statify.sh' (this adds execute permission so it can be run directly).
# - Run it with './statify.sh <url>' in your terminal.
# License: MIT
# Feel free to use, modify, and distribute this script as per the terms of the MIT license.