First, in the context of web application security, what is content? Content can be many things: a file, a video, a picture, a backup, or a website feature. When we talk about content discovery, we don't mean the obvious things we can see on a website; we mean the things that aren't immediately presented to us and that weren't always intended for public access.
This content could be, for example, pages or portals intended for staff usage, older versions of the website, backup files, configuration files, administration panels, etc.
There are three main ways of discovering content on a website, which we'll cover: manual, automated, and OSINT (Open-Source Intelligence).
Because web applications rely heavily on frameworks and libraries, it is to be expected that these resources contain security vulnerabilities of their own.
When researching vulnerabilities, identifying which resource is in use is a useful first step in every case.
If the logo (favicon) of the resource in use has not been changed in the application, or is still present somewhere else on the site, it can help us identify which resource it is. To do this, we take the MD5 hash of the logo and search for it in databases such as the OWASP Favicon Database.
curl (Client for URLs): a command-line tool that lets you exchange data with URLs.
md5sum: a command-line tool that generates the MD5 hash of a file.
curl https://getbootstrap.com/docs/5.3/assets/img/favicons/apple-touch-icon.png | md5sum
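The same hash can be computed without shell tools. The following is a minimal Python sketch, assuming the icon bytes can be downloaded over HTTP; `favicon_md5` and `fetch_favicon` are hypothetical helper names, and looking the hash up in a favicon database remains a manual step.

```python
import hashlib
import urllib.request


def favicon_md5(data: bytes) -> str:
    """Return the hex MD5 digest of raw icon bytes (the value md5sum prints)."""
    return hashlib.md5(data).hexdigest()


def fetch_favicon(url: str, timeout: float = 10.0) -> bytes:
    """Download the icon at `url` and return its raw bytes (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Example usage with the URL from the curl command above:
#   icon = fetch_favicon("https://getbootstrap.com/docs/5.3/assets/img/favicons/apple-touch-icon.png")
#   print(favicon_md5(icon))
```

Hashing the raw bytes, rather than a decoded image, matters here: the lookup only works if the digest matches what md5sum would produce for the unmodified file.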
It is available as a Chrome extension and can list the technologies used in a web application, together with their version numbers.
The Wayback Machine maintains an archive that can cover website content dating back as far as the 1990s.
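Archived copies can also be looked up programmatically. The sketch below assumes the Internet Archive's public availability endpoint and the JSON shape it is documented to return (an `archived_snapshots` object with a `closest` entry); the function names are hypothetical, and the response layout should be checked against the current documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(site: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL; `timestamp` is an optional YYYYMMDD-style string."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the URL of the closest archived snapshot, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(site: str, timestamp: Optional[str] = None, timeout: float = 10.0) -> Optional[str]:
    """Query the endpoint over the network and return the snapshot URL, if any."""
    with urllib.request.urlopen(availability_url(site, timestamp), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Keeping URL building and response parsing separate from the network call makes both easy to check offline.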
If the repository of the website in question is publicly available, this is a great opportunity.
Their addresses end in s3.amazonaws.com.
If the access permissions on the website content stored in these buckets are not configured correctly, that content becomes open to outside interference.
One common automation method is to use the company name followed by common terms, such as:
- name-assets.s3.amazonaws.com (fds-assets.s3.amazonaws.com)
- name-www.s3.amazonaws.com (fds-www.s3.amazonaws.com)
- name-public.s3.amazonaws.com (fds-public.s3.amazonaws.com)
- name-private.s3.amazonaws.com (fds-private.s3.amazonaws.com)
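The naming pattern in the list above can be generated for any company name. This is a small sketch under the assumption that only the four suffixes shown are of interest; `bucket_candidates` and `COMMON_SUFFIXES` are illustrative names, and the output is only a list of hostnames to check, not proof that any bucket exists.

```python
from typing import Iterable, List

# Suffixes taken from the list above; extend as needed.
COMMON_SUFFIXES = ("assets", "www", "public", "private")


def bucket_candidates(name: str, suffixes: Iterable[str] = COMMON_SUFFIXES) -> List[str]:
    """Return candidate S3 hostnames of the form <name>-<suffix>.s3.amazonaws.com."""
    return [f"{name}-{suffix}.s3.amazonaws.com" for suffix in suffixes]


for host in bucket_candidates("fds"):
    print(host)
```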
This refers to performing content discovery with existing tools that were built for that purpose.
What is Automated Discovery?
Automated discovery is the process of using tools to discover content rather than doing it manually. The process is automated because it usually involves hundreds, thousands or even millions of requests to a web server. Each request checks whether a file or directory exists on the website, which can reveal resources we didn't previously know existed. This is made possible by a resource called a wordlist.
What are wordlists? Wordlists are just text files that contain a long list of commonly used words; they can cover many different use cases. For example, a password wordlist would include the most frequently used passwords, whereas we're looking for content in our case, so we'd require a list containing the most commonly used directory and file names. An excellent resource for wordlists that is preinstalled on the THM AttackBox is https://github.com/danielmiessler/SecLists which Daniel Miessler curates.
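To make the idea concrete, here is a minimal Python sketch of what a wordlist-driven scan does: turn each word into a candidate URL and record the status code of the response. It is not how ffuf, dirb or gobuster are implemented; `load_words`, `candidate_urls` and `probe` are hypothetical helpers, and it should only be run against targets you are authorised to test.

```python
import urllib.error
import urllib.request
from typing import Iterable, Iterator


def load_words(lines: Iterable[str]) -> Iterator[str]:
    """Yield non-empty wordlist entries with surrounding whitespace removed."""
    for line in lines:
        word = line.strip()
        if word:
            yield word


def candidate_urls(base_url: str, words: Iterable[str]) -> Iterator[str]:
    """Join each word onto the base URL with exactly one slash between them."""
    base = base_url.rstrip("/")
    for word in words:
        yield f"{base}/{word.lstrip('/')}"


def probe(url: str, timeout: float = 10.0) -> int:
    """Request `url` and return the HTTP status code (needs network access).

    Note: urllib follows redirects by default, so a 301/302 is reported as the
    status of the final page rather than the redirect itself.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


# Example usage against the lab target used in these notes:
#   with open("/usr/share/wordlists/seclists/Discovery/Web-Content/common.txt") as handle:
#       for url in candidate_urls("http://10.10.171.120/", load_words(handle)):
#           status = probe(url)
#           if status != 404:
#               print(status, url)
```

Dedicated tools do the same thing with many concurrent requests, which is why they are far faster than a sequential loop like this one.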
➜ Downloads ffuf -w /usr/share/wordlists/seclists/Discovery/Web-Content/common.txt -u http://10.10.171.120/FUZZ
        /'___\  /'___\           /'___\
       /\ \__/ /\ \__/  __  __  /\ \__/
       \ \ ,__\\ \ ,__\/\ \/\ \ \ \ ,__\
        \ \ \_/ \ \ \_/\ \ \_\ \ \ \ \_/
         \ \_\   \ \_\  \ \____/  \ \_\
          \/_/    \/_/   \/___/    \/_/

       v2.1.0-dev
________________________________________________
:: Method : GET
:: URL : http://10.10.171.120/FUZZ
:: Wordlist : FUZZ: /usr/share/wordlists/seclists/Discovery/Web-Content/common.txt
:: Follow redirects : false
:: Calibration : false
:: Timeout : 10
:: Threads : 40
:: Matcher : Response status: 200-299,301,302,307,401,403,405,500
________________________________________________
assets [Status: 301, Size: 178, Words: 6, Lines: 8, Duration: 115ms]
contact [Status: 200, Size: 3108, Words: 747, Lines: 65, Duration: 121ms]
customers [Status: 302, Size: 0, Words: 1, Lines: 1, Duration: 132ms]
development.log [Status: 200, Size: 27, Words: 5, Lines: 1, Duration: 115ms]
monthly [Status: 200, Size: 28, Words: 4, Lines: 1, Duration: 116ms]
news [Status: 200, Size: 2538, Words: 518, Lines: 51, Duration: 116ms]
private [Status: 301, Size: 178, Words: 6, Lines: 8, Duration: 113ms]
robots.txt [Status: 200, Size: 46, Words: 4, Lines: 3, Duration: 113ms]
sitemap.xml [Status: 200, Size: 1391, Words: 260, Lines: 43, Duration: 113ms]
:: Progress: [4723/4723] :: Job [1/1] :: 345 req/sec :: Duration: [0:00:15] :: Errors: 0 ::
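Result lines like the ones above are easy to post-process. The sketch below parses them with a regular expression that assumes exactly the line format shown in this output; `parse_result_line` is a hypothetical helper. For anything beyond quick notes, ffuf's own structured output (for example `-o results.json -of json`) is likely the more robust choice.

```python
import re
from typing import Optional

# Matches lines such as:
#   assets [Status: 301, Size: 178, Words: 6, Lines: 8, Duration: 115ms]
RESULT_LINE = re.compile(
    r"^(?P<path>\S+)\s+\[Status: (?P<status>\d+), Size: (?P<size>\d+), "
    r"Words: (?P<words>\d+), Lines: (?P<lines>\d+), Duration: (?P<duration_ms>\d+)ms\]$"
)


def parse_result_line(line: str) -> Optional[dict]:
    """Return the fields of one result line, or None for banner/progress lines."""
    match = RESULT_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Every field except the path is numeric.
    return {key: (value if key == "path" else int(value)) for key, value in fields.items()}


sample = "development.log [Status: 200, Size: 27, Words: 5, Lines: 1, Duration: 115ms]"
print(parse_result_line(sample))
```

Small responses with a 200 status, such as development.log here, are often worth opening by hand first.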
➜ Downloads dirb http://10.10.171.120/ /usr/share/wordlists/seclists/Discovery/Web-Content/common.txt
-----------------
DIRB v2.22
By The Dark Raver
-----------------
START_TIME: Sun Dec 24 00:25:20 2023
URL_BASE: http://10.10.171.120/
WORDLIST_FILES: /usr/share/wordlists/seclists/Discovery/Web-Content/common.txt
-----------------
GENERATED WORDS: 4722
---- Scanning URL: http://10.10.171.120/ ----
==> DIRECTORY: http://10.10.171.120/assets/
+ http://10.10.171.120/contact (CODE:200|SIZE:3108)
+ http://10.10.171.120/customers (CODE:302|SIZE:0)
+ http://10.10.171.120/development.log (CODE:200|SIZE:27)
Testing: http://10.10.171.120/jrun
^C
➜ Downloads gobuster dir --url http://10.10.171.120/ -w /usr/share/wordlists/seclists/Discovery/Web-Content/common.txt
===============================================================
Gobuster v3.6
by OJ Reeves (@TheColonial) & Christian Mehlmauer (@firefart)
===============================================================
[+] Url: http://10.10.171.120/
[+] Method: GET
[+] Threads: 10
[+] Wordlist: /usr/share/wordlists/seclists/Discovery/Web-Content/common.txt
[+] Negative Status codes: 404
[+] User Agent: gobuster/3.6
[+] Timeout: 10s
===============================================================
Starting gobuster in directory enumeration mode
===============================================================
/assets (Status: 301) [Size: 178] [--> http://10.10.171.120/assets/]
/contact (Status: 200) [Size: 3108]
/customers (Status: 302) [Size: 0] [--> /customers/login]
/development.log (Status: 200) [Size: 27]
/monthly (Status: 200) [Size: 28]
/news (Status: 200) [Size: 2538]
/private (Status: 301) [Size: 178] [--> http://10.10.171.120/private/]
/robots.txt (Status: 200) [Size: 46]
/sitemap.xml (Status: 200) [Size: 1391]
Progress: 4723 / 4724 (99.98%)
===============================================================
Finished
===============================================================