Last active
April 30, 2024 15:42
Bash Script to Parse Nginx Access Logs: https://sysadmins.co.za/bash-script-to-parse-and-analyze-nginx-access-logs/
#!/bin/bash
# variables
LOGFILE="/var/log/nginx/access.log"
LOGFILE_GZ="/var/log/nginx/access.log.*"
RESPONSE_CODE="200"
# functions
filters(){
  # NOTE: this matches $RESPONSE_CODE anywhere in the line (IP, byte count, URL),
  # not only in the status field
  grep $RESPONSE_CODE \
  | grep -v "/rss/" \
  | grep -v "robots\.txt" \
  | grep -v "\.css" \
  | grep -v "\.js" \
  | grep -v "\.png" \
  | grep -v "\.ico"
}
filters_404(){
  # NOTE: same caveat, "404" is matched anywhere in the line
  grep "404"
}
request_ips(){
  awk '{print $1}'
}
request_method(){
  awk '{print $6}' \
  | cut -d'"' -f2
}
request_pages(){
  awk '{print $7}'
}
wordcount(){
  sort \
  | uniq -c
}
sort_desc(){
  sort -rn
}
return_kv(){
  awk '{print $1, $2}'
}
request_pages(){
  awk '{print $7}'
}
return_top_ten(){
  head -10
}
## actions
get_request_ips(){
  echo ""
  echo "Top 10 Request IP's:"
  echo "===================="
  cat $LOGFILE \
  | filters \
  | request_ips \
  | wordcount \
  | sort_desc \
  | return_kv \
  | return_top_ten
  echo ""
}
get_request_methods(){
  echo "Top Request Methods:"
  echo "===================="
  cat $LOGFILE \
  | filters \
  | request_method \
  | wordcount \
  | return_kv
  echo ""
}
get_request_pages_404(){
  echo "Top 10: 404 Page Responses:"
  echo "==========================="
  zgrep '-' $LOGFILE $LOGFILE_GZ \
  | filters_404 \
  | request_pages \
  | wordcount \
  | sort_desc \
  | return_kv \
  | return_top_ten
  echo ""
}
get_request_pages(){
  echo "Top 10 Request Pages:"
  echo "====================="
  cat $LOGFILE \
  | filters \
  | request_pages \
  | wordcount \
  | sort_desc \
  | return_kv \
  | return_top_ten
  echo ""
}
get_request_pages_all(){
  echo "Top 10 Request Pages from All Logs:"
  echo "==================================="
  zgrep '-' --no-filename $LOGFILE $LOGFILE_GZ \
  | filters \
  | request_pages \
  | wordcount \
  | sort_desc \
  | return_kv \
  | return_top_ten
  echo ""
}
# executing
get_request_ips
get_request_methods
get_request_pages
get_request_pages_all
get_request_pages_404
Hi there, nice script, thanks for gist-ing it!
I'm trying to get this to work with a ServerPilot.io server, scanning all logs in all users app directories.
I've changed the $LOGFILE and $LOGFILE_GZ lines to this:
LOGFILE="/srv/users/*/log/*/*nginx.access.log*"
LOGFILE_GZ="/srv/users/*/log/*/*nginx.access.log*.gz"
... but I'm not sure that it's doing what I need it to do... (sorry, hopeless!) Do I need to loop over the $LOGFILES to get this to work?
Thanks again!
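One possible approach, sketched under assumptions rather than tested on a ServerPilot server: the script expands $LOGFILE unquoted, so the shell already expands the wildcards and an explicit loop should not be needed. The catch is that the first pattern above also matches the .gz files, which cat would emit as compressed bytes. The helper name read_all_logs below is hypothetical (it is not part of the gist); it relies on gzip -cdf, which decompresses gzipped input and passes plain files through unchanged.

```bash
#!/bin/bash
# Hypothetical helper, not part of the original gist:
# stream every file matching a glob pattern, whether plain or gzipped.
read_all_logs(){
  local pattern="$1"
  # $pattern is left unquoted on purpose so the shell expands the wildcards.
  # gzip -c writes to stdout, -d decompresses, -f copies non-gzip input as-is.
  gzip -cdf -- $pattern
}

# Example call, assuming the directory layout from the comment above:
# read_all_logs "/srv/users/*/log/*/*nginx.access.log*" \
#   | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
```

With a helper like this, a single pattern covers both the current and the rotated logs, so a separate LOGFILE_GZ variable would not be required.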
Hello, I think grep $RESPONSE_CODE should be changed to awk '$9==200', because the IP address may contain the number 200.
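To illustrate the point, here is a small sketch that assumes nginx's default "combined" log format, where the status code is the ninth whitespace-separated field. The two log lines and the function name sample_log are made up for the demonstration.

```bash
#!/bin/bash
# Made-up log lines in the "combined" format (field 9 is the status code).
sample_log(){
  cat <<'EOF'
10.0.0.200 - - [30/Apr/2024:15:42:00 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
10.0.0.5 - - [30/Apr/2024:15:42:01 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"
EOF
}

# Substring match: counts both lines, because the first IP contains "200".
sample_log | grep -c "200"

# Field match: counts only the line whose status field equals 200.
sample_log | awk '$9 == 200 {n++} END {print n+0}'
```

If the server uses a custom log_format, the field number would need to be adjusted accordingly.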
I fixed a small error in the scan of 404 and 200 responses; see my fork.
Hi,
the function request_pages() is specified twice in this script. Please check lines 32-34 and 49-51.
Regards,
Tronde
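For context, bash does not complain about a repeated function definition; the later one simply replaces the earlier one. Since both copies of request_pages() are identical, the script's behavior should be unaffected, but the duplicate can safely be removed. A tiny sketch with a hypothetical function name:

```bash
#!/bin/bash
# Hypothetical function, defined twice on purpose.
which_one(){ echo "first definition"; }
which_one(){ echo "second definition"; }

# Runs the second body; the first was silently replaced.
which_one
```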
Made my life easy... Thanks.
One thing though... need some help with zgrep '-' --no-filename
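As a rough explanation of that command, assuming GNU zgrep: the pattern '-' matches any line containing a hyphen, which in practice is every line of a standard nginx access log, so the call acts like a cat that also reads gzipped files. --no-filename suppresses the "file:" prefix that grep otherwise adds when it is given more than one file. The function name zgrep_demo and the sample lines below are invented for the sketch.

```bash
#!/bin/bash
# Hypothetical demo: build one plain and one gzipped log, then read both.
zgrep_demo(){
  local demo_dir
  demo_dir=$(mktemp -d)
  printf '%s\n' '10.0.0.5 - - [30/Apr/2024:15:42:01 +0000] "GET / HTTP/1.1" 200 612' \
    > "$demo_dir/access.log"
  printf '%s\n' '10.0.0.6 - - [29/Apr/2024:10:00:00 +0000] "GET /old HTTP/1.1" 200 100' \
    > "$demo_dir/access.log.1"
  gzip "$demo_dir/access.log.1"   # produces access.log.1.gz

  # Same invocation style as in the script: both files come out as plain
  # text, with no filename prefix on the lines.
  zgrep '-' --no-filename "$demo_dir/access.log" "$demo_dir/access.log.1.gz"

  rm -rf "$demo_dir"
}

zgrep_demo
```

Without --no-filename, each output line would start with the path of the file it came from, which would then be glued onto the IP address in the first field.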