@huynhbaoan
Last active October 25, 2024 09:14
Log extract
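# Requires gawk: gensub() and \s are GNU extensions.
awk '{
# Extract IP (second field)
ip = $2
# Extract URL (last quoted http(s) URL on the line, because the leading .* is greedy;
# gensub returns the whole line unchanged when nothing matches)
url = gensub(/.*"(https?:\/\/[^"]+)".*/, "\\1", "g", $0)
# Extract Status Code (seventh field)
status = $7
# Extract User Agent (quoted string that directly follows another quoted string)
user_agent = gensub(/.*"[^"]*"\s+"([^"]+)".*/, "\\1", "g", $0)
# Print fields separated by pipes for easy re-parsing
print ip "|" url "|" status "|" user_agent
}' esg.access.log-20241016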
# Portable variant: uses only POSIX awk features (match, RSTART, RLENGTH, substr).
awk '{
# Extract IP (third field)
ip = $3;
# Extract URL (first quoted http(s) URL on the line)
match($0, /"https?:\/\/[^"]+"/);
url = substr($0, RSTART+1, RLENGTH-2);
# Extract Status Code (last three digits of the number following the URL)
match($0, /"https?:\/\/[^"]+" [0-9]+/);
status = substr($0, RSTART + RLENGTH - 3, 3);
# Extract User Agent (last quoted string)
match($0, /"[^"]+"$/);
user_agent = substr($0, RSTART+1, RLENGTH-2);
# Print fields separated by pipes for easy re-parsing
print ip "|" url "|" status "|" user_agent;
}' esg.access.log-20241016
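
The scripts above print pipe-separated fields "for easy re-parsing". As a minimal sketch of that re-parsing step, the hypothetical helper count_by_status (not part of the original gist) tallies lines per status code. It assumes the ip|url|status|user_agent layout printed above and that neither the IP nor the URL contains a "|".

```shell
# Hypothetical helper (an assumption, not from the original gist):
# count lines per status code in ip|url|status|user_agent records.
# Reads the files given as arguments, or stdin when none are given.
count_by_status() {
  awk -F'|' '
    # Skip lines that do not have at least the four expected fields
    NF >= 4 { count[$3]++ }
    # Print one "status count" pair per line
    END { for (s in count) print s, count[s] }
  ' "$@" | sort
}
```

One possible use is to pipe the output of either script into it, for example: awk '...' esg.access.log-20241016 | count_by_status. The trailing sort only makes the order stable, since awk does not guarantee any iteration order for "for (s in count)".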