@rkennesson
Last active February 2, 2017 18:58
wget --mirror --convert-links --adjust-extension --page-requisites https://template16.carrd.co/
wget --exclude-directories="PS/3-Updates" --mirror --convert-links --adjust-extension --page-requisites www.test.com
#!/usr/bin/env bash
# Download numbered files whose names follow the pattern
#   http://site.com/filename-1.ext
#   http://site.com/filename-3.ext
# Start at 1 and count by 2 (1, 3, 5, ..., 13), pausing 5 seconds between requests.
for (( COUNTER=1; COUNTER<=13; COUNTER+=2 )); do
    wget "http://site.com/filename-$COUNTER.ext"
    sleep 5
done
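# A minimal sketch of the same counting loop as a dry run: it prints the URLs
# instead of fetching them, so the list can be checked before anything is downloaded.
# gen_urls and its arguments are hypothetical names; site.com is the placeholder used above.

```shell
#!/usr/bin/env bash
# Print one URL per line for the counters start..end, in steps of step.
# Usage: gen_urls BASE EXT START END STEP   (hypothetical helper)
gen_urls() {
    local base="$1" ext="$2" start="$3" end="$4" step="$5"
    local i
    for (( i = start; i <= end; i += step )); do
        printf '%s-%d.%s\n' "$base" "$i" "$ext"
    done
}

# Dry run: list the URLs that the loop above would request.
gen_urls "http://site.com/filename" "ext" 1 13 2

# To download instead, pipe the list into wget, which reads URLs from stdin:
#   gen_urls "http://site.com/filename" "ext" 1 13 2 | wget --wait=5 --input-file=-
```

# --wait=5 plays the role of the "sleep 5" in the loop above.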
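# Recursive download restricted to the listed file types: ignores robots.txt,
# stays below the starting directory, waits 3 seconds between requests,
# caps bandwidth at 20 KB/s, and sends an old Internet Explorer user agent.
wget -e robots=off --no-parent --wait=3 --limit-rate=20K -r -p -U "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" -A htm,html,css,js,json,gif,jpeg,jpg,bmp http://website.com
# Same idea with more file types; also converts links for local viewing (-k)
# and saves everything under C:\rip (Windows path).
wget -e robots=off --no-parent --wait=3 --limit-rate=50K -r -p -k -U "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" --directory-prefix "C:\rip" -A htm,aspx,php,jsp,asp,zip,png,html,css,js,json,gif,jpeg,jpg,bmp http://website.com
# As above, but limited to one level of recursion (--level=1) and also accepting PDFs.
wget -e robots=off --no-parent --wait=3 --level=1 --limit-rate=50K -r -p -k -U "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" --directory-prefix "C:\rip" -A htm,aspx,php,jsp,asp,zip,png,html,css,js,json,gif,jpeg,jpg,bmp,pdf http://website.com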
# Name the saved file from the server's Content-Disposition header instead of the URL
wget --content-disposition http://site.com/file
// Search a page for links from the browser's developer console:
// collect the href of every <a> whose text is "PDF" and whose host is website.com.
var el = document.getElementsByTagName('a');
var arr = [];
for (var i = 0; i < el.length; i++) {
    if (el[i].innerText === "PDF" && el[i].host === "website.com") {
        arr.push(el[i].href);
    }
}
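# A possible follow-up on the shell side, assuming the links collected in the
# console snippet above have been saved one per line to a file. The file name
# urls.txt and the helper clean_urls are hypothetical, not part of the original notes.

```shell
#!/usr/bin/env bash
# Keep only http(s) URLs from a file and drop duplicates.
# Usage: clean_urls FILE   (hypothetical helper)
clean_urls() {
    grep -E '^https?://' "$1" | sort -u
}

# Example (commented out so nothing is fetched by accident):
#   clean_urls urls.txt > urls.clean.txt
#   wget --wait=3 --content-disposition --input-file=urls.clean.txt
```

# Filtering first keeps entries such as javascript: links and repeated URLs out of the download queue.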
#!/bin/bash
# Recursive download to a chosen depth.
# $1 = level (maximum recursion depth)
# $2 = url
wget -e robots=off --no-parent --wait=3 --level="$1" --limit-rate=50K -r -p -U "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" -A htm,aspx,php,jsp,asp,zip,png,html,css,js,json,gif,jpeg,jpg,bmp,pdf "$2"