@githubfoam
Last active March 12, 2025 14:05
URLscan Dorking Techniques
#=====================================================================
Dorking is the practice of crafting advanced search queries to uncover publicly available but sensitive information. URLscan.io lets users filter its indexed scans using a Lucene-based query syntax.
Examples:
page.domain:example.com → Searches all scans related to example.com
page.ip:192.168.1.1 → Finds all URLs hosted on this IP
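Queries like the two above can also be sent to the urlscan.io Search API instead of being typed into the web UI. The sketch below only builds the request URL and sends nothing. It assumes the public search endpoint https://urlscan.io/api/v1/search/ with `q` and `size` parameters; `build_search_url` is a hypothetical helper name, not part of any urlscan.io client.

```python
from urllib.parse import urlencode

# Assumed public Search API endpoint; check the current API docs before relying on it.
SEARCH_ENDPOINT = "https://urlscan.io/api/v1/search/"


def build_search_url(query: str, size: int = 100) -> str:
    """Return a Search API URL with the query string percent-encoded."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query, 'size': size})}"


print(build_search_url("page.domain:example.com"))
```

The colon in the query is percent-encoded, so the same helper is safe for queries that contain quotes or spaces.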
Common URLscan Dorking Queries
a) Finding Open Admin Panels
page.title:"Admin" OR page.title:"Dashboard" OR page.title:"Login"
Finds websites with admin dashboards exposed.
page.url:"/admin"
Lists URLs containing /admin, a path commonly used for login panels.
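The OR chain in the title query gets tedious to type by hand once the keyword list grows. A minimal sketch of generating it, where `any_of` is a hypothetical helper and the field name is taken as-is from the example above:

```python
def any_of(field: str, phrases: list[str]) -> str:
    """Join quoted phrases for a single field with OR."""
    return " OR ".join(f'{field}:"{phrase}"' for phrase in phrases)


admin_titles = any_of("page.title", ["Admin", "Dashboard", "Login"])
print(admin_titles)
```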
b) Exposed API Keys & Secrets
page.body:"apikey" OR page.body:"api_key"
Looks for leaked API keys in scanned web pages.
page.url:"key=" OR page.url:"token="
Identifies URLs containing authentication keys or tokens.
c) Identifying Open Directories
page.title:"Index of /"
Detects misconfigured open directories listing files.
page.url:"/.git"
Finds exposed Git repositories that could contain source code and credentials.
d) Searching for Specific Technologies
page.headers:"server: apache"
Lists all scanned websites using the Apache web server.
page.body:"WordPress"
Finds WordPress-powered sites that may be vulnerable to outdated plugins.
e) Hunting for Potentially Malicious URLs
task.tags:phishing
Displays flagged phishing websites.
page.body:"2FA" AND task.tags:malware
Searches for malware-tagged pages that mention 2FA (two-factor authentication), a common lure on phishing sites.
Combine Queries with OSINT
Use Google Dorking (site:urlscan.io) to refine results in Google searches.
Correlate data with other tools like Shodan, Censys, or VirusTotal.
Use Filters to Reduce Noise
Use after: and before: filters to limit results by date.
Use exclusions (-task.tags:phishing) to remove unwanted results.
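Exclusions can be appended to an existing query mechanically. The sketch below assumes the `-task.tags:` form shown above; `drop_tags` is a hypothetical helper name.

```python
def drop_tags(query: str, tags: list[str]) -> str:
    """Append one -task.tags: exclusion per tag to a base query."""
    exclusions = " ".join(f"-task.tags:{tag}" for tag in tags)
    # Parenthesise the base query so the exclusions apply to all of it.
    return f"({query}) {exclusions}" if exclusions else query


print(drop_tags('page.url:"/admin"', ["phishing"]))
```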
#=====================================================================
Understanding URLscan.io Search Syntax:
URLscan.io uses a Lucene-based search syntax, which provides flexibility and precision. Key operators include:
field:value: Searches for a specific value within a field.
AND, OR, NOT: Boolean operators for combining search terms.
*: Wildcard character for partial matches.
"": Encloses phrases for exact matches.
-: Excludes results containing the specified value.
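Because several of these operators are ordinary characters in URLs and paths, an unquoted value that contains them can be misread as syntax. The sketch below backslash-escapes the characters reserved by the classic Lucene query parser. Whether urlscan.io treats every one of them as special is an assumption, and `escape_term` is a hypothetical helper.

```python
import re

# Characters (and the && / || pairs) reserved by the classic Lucene query parser.
_SPECIAL = re.compile(r'(&&|\|\||[+\-!(){}\[\]^"~*?:\\/])')


def escape_term(value: str) -> str:
    """Prefix each reserved Lucene character with a backslash."""
    return _SPECIAL.sub(r"\\\1", value)


print(escape_term("/admin?id=1"))
```

Quoting the whole phrase is usually simpler. Escaping matters mainly for unquoted terms that must be combined with wildcards.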
Common URLscan.io Dorking Techniques:
Targeting Specific Domains or Subdomains:
domain:example.com: Finds all scans related to example.com.
domain:*.example.com: Finds scans for all subdomains of example.com.
domain:subdomain.example.com: Finds scans for the specific subdomain.
Filtering by IP Address:
ip:192.168.1.100: Finds scans associated with the specified IP address.
ip:192.168.1.*: Finds scans within a specific IP range.
Searching for Specific Technologies or Headers:
headers.server:nginx: Finds scans where the Server header contains "nginx."
headers."X-Powered-By":PHP: Finds scans with a specific X-Powered-By header.
technology:wordpress: Finds scans where WordPress was detected.
technology:joomla: Finds scans where Joomla was detected.
Filtering by Status Code:
status:200: Finds scans with a successful HTTP status code.
status:404: Finds scans with a "Not Found" status code.
Searching for Specific Content or Strings:
page.title:"Admin Panel": Finds scans where the page title contains "Admin Panel."
page.body:"password": Finds scans where the page body contains the string "password." Be very careful with sensitive searches.
page.body:"<script>alert": Finds scans where the page body contains an inline JavaScript alert call.
Filtering by Country or ASN:
country:US: Finds scans from the United States.
asn:13335: Finds scans from Cloudflare's ASN.
Searching for Specific Resources or Files:
resource:"config.php": Finds scans that include the resource "config.php."
resource:"/admin/": Finds scans where the URL path contains /admin/.
resource:".git": Finds scans that contain .git directories.
Filtering by Time:
timestamp:[2023-01-01 TO 2023-01-31]: Finds scans within a specific date range.
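The bracketed range is easy to mistype, so it can help to build it from real date objects. A minimal sketch, with the field name passed in rather than assumed and `date_range` as a hypothetical helper:

```python
from datetime import date


def date_range(field: str, start: date, end: date) -> str:
    """Format an inclusive Lucene range clause such as field:[A TO B]."""
    if start > end:
        raise ValueError("start must not be after end")
    return f"{field}:[{start.isoformat()} TO {end.isoformat()}]"


print(date_range("timestamp", date(2023, 1, 1), date(2023, 1, 31)))
```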
Combining Operators:
domain:example.com AND status:404: Finds scans from example.com with a 404 status code.
ip:192.168.1.* AND NOT headers.server:nginx: Finds scans within the IP range that do not use Nginx.
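Both combinations follow the same pattern: required clauses joined with AND, and unwanted clauses prefixed with NOT. A small sketch of that pattern, where `combine` is a hypothetical helper and the clauses are passed through unchanged:

```python
from typing import Sequence


def combine(required: Sequence[str], excluded: Sequence[str] = ()) -> str:
    """Join required clauses with AND and negate each excluded clause."""
    parts = list(required) + [f"NOT {clause}" for clause in excluded]
    return " AND ".join(parts)


print(combine(["domain:example.com", "status:404"]))
print(combine(["ip:192.168.1.*"], ["headers.server:nginx"]))
```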
Best Practices:
Be Specific: Use precise search terms and operators to narrow down your results.
Use Quotes: Enclose phrases in quotes to search for exact matches.
Experiment: Try different combinations of search terms and operators to discover new information.
Respect Privacy: Only use URLscan.io for ethical and legal purposes. Avoid accessing or disclosing sensitive information without authorization.
Use responsibly: Do not use URLscan.io to conduct unauthorized scans or to perform malicious activity.
Combine with other tools: URLscan.io is most powerful when combined with other OSINT tools.
Understand limitations: URLscan.io's data is based on snapshots, so it may not always reflect the current state of a website.
Use the API: For automated tasks, URLscan.io's API provides more control and flexibility.
Review the robots.txt: Before performing any automated scanning, verify that it is allowed by the robots.txt file.
Be aware of false positives: Some results may be false positives. Verify the findings with other resources.
Example Dorking Scenarios:
Finding Exposed Configuration Files:
domain:example.com resource:"config.php"
Identifying Vulnerable WordPress Plugins:
domain:example.com technology:wordpress page.body:"/wp-content/plugins/vulnerable-plugin/"
Discovering Exposed Git Repositories:
domain:example.com resource:".git/config"
Searching for Admin Panels:
page.title:"Admin Panel"
Finding servers that expose directory listings:
page.body:"Index of /"
#=====================================================================
A user has submitted a short link from their Vodafone Rewards site (submitted URL), which has then redirected to a unique URL (effective URL) containing a free movie ticket.
page.domain:"nz" AND page.url:"voucher"
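To spot cases like this in bulk, the submitted URL and the effective URL of each result can be compared. The sketch below assumes each search result is a dict with the submitted URL under `task.url` and the effective URL under `page.url`. That layout is an assumption about the API response, and the sample data is made up for illustration.

```python
def redirected(results: list[dict]) -> list[tuple[str, str]]:
    """Return (submitted, effective) pairs where the scan ended on a different URL."""
    pairs = []
    for item in results:
        submitted = item.get("task", {}).get("url")
        effective = item.get("page", {}).get("url")
        if submitted and effective and submitted != effective:
            pairs.append((submitted, effective))
    return pairs


# Hypothetical results: one short link that redirects, one URL that does not.
sample = [
    {"task": {"url": "https://short.example/abc"},
     "page": {"url": "https://rewards.example/voucher/123"}},
    {"task": {"url": "https://example.com/"},
     "page": {"url": "https://example.com/"}},
]
print(redirected(sample))
```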
#=====================================================================
Finding password reset links:
page.domain:"nz" AND page.url:"reset"
#=====================================================================
For example, a phishing kit might use the same CSS file across all of its deployments. You can compute the SHA256 hash of that CSS file and search urlscan.io for the hash to find submissions that use the same phishing kit.
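Computing the hash takes only the standard library. The sketch below hashes the raw bytes of a resource and formats a query for it. The `hash:` field name is an assumption about the search syntax, and `hash_query` is a hypothetical helper.

```python
import hashlib


def hash_query(resource_bytes: bytes) -> str:
    """Return a search clause for the SHA256 digest of a resource body."""
    digest = hashlib.sha256(resource_bytes).hexdigest()
    return f"hash:{digest}"  # assumed field name


# In practice the bytes would be the CSS file exactly as served, e.g.
# Path("kit.css").read_bytes(); a literal keeps the sketch self-contained.
print(hash_query(b"body { color: red; }"))
```

The digest must be computed over the exact bytes served. Any reformatting or re-encoding of the file changes the hash.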
#=====================================================================
For example, if WordPress password reset pages all use the same core CSS file, the hash of that file can be used to find submissions that have exposed their password reset links.
#=====================================================================