
@styks1987
Last active July 27, 2018 21:25

Download the site

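  • --adjust-extension - Appends .html or .css to the local filename when the downloaded file is HTML or CSS but its URL lacks that suffix
  • -m - Mirror: turns on recursion with infinite depth and timestamping
  • --no-host-directories - Normally wget puts the downloaded files into a subdirectory named after the host. This prevents that
  • -k - Makes links in downloaded HTML or CSS point to local files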
wget -mk --adjust-extension --no-host-directories www.visionworksconsulting.com
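
For readability, the same download can be written with wget's long option names. This is only a sketch: the function name `mirror_site`, the optional destination directory (passed through `--directory-prefix`), and the `DRY_RUN` switch are assumptions added for illustration and are not part of the original command.

```shell
# Hypothetical helper: mirror_site URL [DEST_DIR]
# Set DRY_RUN=1 to print the wget command instead of running it.
mirror_site() {
    url=$1
    dest=${2:-.}
    # --mirror is -m, --convert-links is -k
    set -- wget --mirror --convert-links --adjust-extension \
        --no-host-directories --directory-prefix="$dest" "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Example: `mirror_site www.visionworksconsulting.com ./site` would place the mirrored files under ./site.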

.htaccess

DirectoryIndex index.html
Options -Indexes

THE BELOW DOES NOT CURRENTLY WORK PROPERLY.

find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s/\/index\.html/\//g"
find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s/\"index\.html\"/\"\.\"/g"
find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s/index\.html\#/\.\#/g"
find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s/\.\.\/index\//\.\.\//g"

Looking for 404s

Run this locally

wget --no-check-certificate --spider -o ~/wget.log -e robots=off -D frontier.upupdev.net -r -p https://frontier.upupdev.net
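
Once the crawl finishes, the log can be searched for the URLs that returned 404. The sketch below assumes wget's usual log layout, where each request starts with a line of the form `--DATE TIME--  URL` and the response status appears on a later line; that layout and the function name `list_404s` are assumptions, so check them against your own wget.log.

```shell
# Hypothetical helper: list_404s [LOGFILE]
# Prints each URL whose response line contains "404 Not Found".
list_404s() {
    awk '
        /^--[0-9]/       { url = $3 }    # request line: remember the URL
        / 404 Not Found/ { print url }   # status line: report the last URL seen
    ' "${1:-$HOME/wget.log}" | sort -u
}
```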

Watch for 404s in the access log while the crawl runs

tail -f /var/log/apache2/other_vhosts_access.log | awk '{if($10 == "404") print $8,$10,$12;}'
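
The field numbers in the awk filter depend on the log format. Assuming the `vhost_combined` format that Debian and Ubuntu use by default for other_vhosts_access.log, field 8 is the request path, field 10 the status code, and field 12 the referer. The same filter as a reusable function (the name `filter_404` is hypothetical):

```shell
# Hypothetical helper: reads access-log lines on stdin and prints
# "path status referer" for every request that returned 404.
filter_404() {
    awk '$10 == "404" { print $8, $10, $12 }'
}
```

Usage: `tail -f /var/log/apache2/other_vhosts_access.log | filter_404`. If the server uses a different LogFormat, the field numbers need adjusting.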

Search and replace in the database

srdb.cli.php -h 127.0.0.1 -n frontierwp -u droopal -p Mdr00p4l -s www.frontierlabel.com -r frontier.upupdev.net
srdb.cli.php -h 127.0.0.1 -n frontierwp -u droopal -p Mdr00p4l -s /wordpress/uploads -r /app/uploads
srdb.cli.php -h 127.0.0.1 -n frontierwp -u droopal -p Mdr00p4l -s /wp-content/uploads -r /app/uploads