Filter | Description | Example |
---|---|---|
allintext | Searches for occurrences of all the keywords given. | allintext:"keyword" |
intext | Searches for the occurrences of keywords all at once or one at a time. | intext:"keyword" |
inurl | Searches for a URL matching one of the keywords. | inurl:"keyword" |
allinurl | Searches for a URL matching all the keywords in the query. | allinurl:"keyword" |
intitle | Searches for occurrences of keywords in title all or one. | intitle:"keyword" |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
" _ _ " | |
" _ /|| . . ||\ _ " | |
" ( } \||D ' ' ' C||/ { % " | |
" | /\__,=_[_] ' . . ' [_]_=,__/\ |" | |
" |_\_ |----| |----| _/_|" | |
" | |/ | | | | \| |" | |
" | /_ | | | | _\ |" | |
It is all fun and games until someone gets hacked! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import requests | |
import re | |
import sys | |
from multiprocessing.dummy import Pool | |
def robots(host): | |
r = requests.get( | |
'https://web.archive.org/cdx/search/cdx\ | |
?url=%s/robots.txt&output=json&fl=timestamp,original&filter=statuscode:200&collapse=digest' % host) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import requests | |
import sys | |
import json | |
def waybackurls(host, with_subs): | |
if with_subs: | |
url = 'http://web.archive.org/cdx/search/cdx?url=*.%s/*&output=json&fl=original&collapse=urlkey' % host | |
else: | |
url = 'http://web.archive.org/cdx/search/cdx?url=%s/*&output=json&fl=original&collapse=urlkey' % host |