@tonejito
Created June 2, 2017 23:02
Filter out valid htdocs files to expose (old and backup) files with odd names
#!/bin/bash
# filter-htdocs-files
# Filter out valid htdocs files to expose (old and backup) files with odd names
#
# Output files:
# ~/htdocs.files: List of all files found by this script
# ~/htdocs.urls: List of all URLs where the found files can be accessed
#
# The htdocs.urls file can also be passed to wget to check the URLs automatically:
#
# % wget --no-verbose --spider --input-file=htdocs.urls
# 2000-01-03 00:00:00 URL: https://www.example.com/.gitignore 200 OK
#
# Andres Hernandez - tonejito
# This script is released under the BSD license
HTDOCS=/var/www/html
DOMAIN=www.example.com
SED1="\/var\/www\/html"
SED2="http:\/\/$DOMAIN"
# Remove the filter lines you don't need, then check the output files
# List every non-directory entry, then drop the names that are expected in a web root.
# Whatever is left over (backups, editor leftovers, odd extensions) ends up in the list.
find "$HTDOCS" ! -type d | \
grep -Evi '\.?(README|LICENSE|CHANGELOG|INSTALL|UPGRAD(E|ING))\.?' | \
grep -Evi '(Thumbs\.db|\.DS_Store)$' | \
grep -Ev '\.(gitignore|htaccess|db)$' | \
grep -Ev '\.(md|TXT|list|ini|var|template|sample|config|functions)$' | \
grep -Ev '\.(html|tpl|php|inc|css)$' | \
grep -Ev '\.(ico|png|jpe?g|gif|swf)$' | \
grep -Ev '\.(ttf|woff2?|eot|otf|svg|pxm)$' | \
grep -Ev '\.(ya?ml|xml|xsl)$' | \
grep -Ev '\.(js(on)?|(c|t)sv|pdf|(doc|xls|ppt)x?|ods|slk)$' | \
grep -Ev '\.(sh|pl|py|rb)$' | \
grep -Ev '\.(pem|crt|fdf|p12)$' | \
grep -Ev '\.(g?z(ip)?|tgz|(tar|sql)(\.(gz|bz2))?)$' | \
grep -Ev '\.(sass|scss|less)$' | \
sort > ~/htdocs.files
# Rewrite only the leading document-root prefix of each path into the site URL
sed -e "s/^$SED1/$SED2/" < ~/htdocs.files > ~/htdocs.urls
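
The filter chain can be tried without touching a real document root. The following is a minimal sketch, not part of the original script: it applies a reduced subset of the filters to a handful of made-up paths read from stdin, then maps the survivors to URLs. The function names `filter_odd_files`, `to_urls`, and `sample_paths`, as well as the sample paths themselves, are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: a reduced version of the filter chain that reads paths on stdin.
# Function names and sample data are hypothetical, not from the original script.

# Drop names that are expected in a web root; print whatever is left.
filter_odd_files() {
  grep -Evi '\.?(README|LICENSE|CHANGELOG|INSTALL|UPGRAD(E|ING))\.?' |
  grep -Ev '\.(html|tpl|php|inc|css)$' |
  grep -Ev '\.(ico|png|jpe?g|gif|swf)$' |
  grep -Ev '\.(g?z(ip)?|tgz|(tar|sql)(\.(gz|bz2))?)$'
}

# Replace the leading document root ($1) with the base URL ($2).
# Uses '|' as the sed delimiter so the slashes need no escaping;
# this assumes neither argument contains '|' or regex metacharacters.
to_urls() {
  sed -e "s|^$1|$2|"
}

# Made-up sample paths standing in for the output of find.
sample_paths() {
  printf '%s\n' \
    '/var/www/html/index.php' \
    '/var/www/html/index.php.bak' \
    '/var/www/html/img/logo.png' \
    '/var/www/html/README.md' \
    '/var/www/html/config.php~' \
    '/var/www/html/dump.sql.gz' \
    '/var/www/html/old/site.tar.old'
}

sample_paths | filter_odd_files | to_urls /var/www/html http://www.example.com
```

With these sample paths, only the three oddly named files should survive, printed as `http://www.example.com/index.php.bak`, `http://www.example.com/config.php~`, and `http://www.example.com/old/site.tar.old`. The regular page, the image, the README, and the compressed SQL dump are filtered out because their names match an expected pattern.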