vmadman/5472166
Created April 27, 2013 06:59
An Apache log format that allows access logs (but not error logs) to be output in JSON format. I found this here: http://untergeek.com/2012/10/11/getting-apache-to-output-json-for-logstash/ and modified it a good deal for my own purposes.
# Access Logs
LogFormat "{ \
\"@vips\":[\"%v\"], \
\"@source\":\"%v%U%q\", \
\"@source_host\": \"%v\", \
\"@source_path\": \"%f\", \
\"@tags\":[\"Apache\",\"Access\"], \
\"@message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
\"@fields\": { \
\"timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
\"clientip\": \"%a\", \
\"duration\": %D, \
\"status\": %>s, \
\"request\": \"%U%q\", \
\"urlpath\": \"%U\", \
\"urlquery\": \"%q\", \
\"method\": \"%m\", \
\"referer\": \"%{Referer}i\", \
\"user-agent\": \"%{User-agent}i\", \
\"bytes\": %B \
} \
}" ls_apache_json

# The catch-all
CustomLog "||/usr/local/bin/udpclient.pl 127.0.0.1 5001" ls_apache_json
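
The `CustomLog` line pipes every formatted entry to `/usr/local/bin/udpclient.pl`, which is not included in this gist. As a rough illustration of what such a piped-log program has to do, here is a minimal sketch in Python (not the author's Perl script). The function name `forward_lines` and the one-datagram-per-line behavior are assumptions made for the example.

```python
import socket
import sys


def forward_lines(lines, host, port):
    """Send each non-empty line (bytes) as one UDP datagram.

    Hypothetical stand-in for udpclient.pl; returns the number of
    datagrams sent.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    try:
        for line in lines:
            # Drop the trailing newline that terminates each log entry.
            payload = line.rstrip(b"\r\n")
            if not payload:
                continue
            sock.sendto(payload, (host, port))
            sent += 1
    finally:
        sock.close()
    return sent


# Assumed invocation: <script> 127.0.0.1 5001, with log lines on stdin.
if __name__ == "__main__" and len(sys.argv) == 3:
    # Read raw bytes so undecodable input cannot raise a decode error.
    forward_lines(sys.stdin.buffer, sys.argv[1], int(sys.argv[2]))
```

The loop only ends when standard input is closed, so a single process can serve every log line written to the pipe. UDP datagrams can be dropped or truncated, so a very long entry may not arrive intact.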
> Likewise that's like one `sed` invocation per web request. Surely this will perform badly on a highly loaded server.
@pacohope Luckily, Apache 2.4 (and earlier versions when using the `||` form) starts the filter process once at Apache startup and pushes all log data through that one long-running process. It does not respawn for each request, so it adds little overhead as long as the filter program is efficient.
Seems to me that parts of the `access_log` contain user input from untrusted sources on the Internet. Is it really safe to pipe that through `sed` on the web server? The `sed` command will be running as the web server user, and this might create opportunities for command injection. Likewise, that's like one `sed` invocation per web request. Surely this will perform badly on a highly loaded server. Apache goes to great lengths to be scalable, but if we invoke a whole Unix process on each and every log line, I think that would significantly hurt scalability.
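
On the untrusted-input point, one practical consequence for this JSON format is worth sketching. The `mod_log_config` documentation says that non-printable and special characters in `%r`, `%i` and `%o` are written as `\xhh`, which is not a valid JSON escape, so a hostile `User-agent` header can produce a line that a strict JSON parser rejects. Below is a minimal Python sketch of a consumer that rewrites `\xhh` to `\u00hh` before parsing. The function name `parse_log_line` is hypothetical, and the sketch assumes `\xhh` is the only non-JSON escape present.

```python
import json
import re

# Match one backslash escape at a time, so that an escaped backslash
# followed by "x41" is not mistaken for a \x41 escape.
_ESCAPE = re.compile(r"\\(x[0-9a-fA-F]{2}|.)")


def _fix_escape(match):
    body = match.group(1)
    if body.startswith("x") and len(body) == 3:
        # \xhh -> \u00hh. Bytes above 0x7f are mapped as Latin-1 code
        # points, which is lossy for multi-byte UTF-8 input.
        return "\\u00" + body[1:]
    return match.group(0)  # leave \" \\ \n \t etc. untouched


def parse_log_line(line):
    """Parse one ls_apache_json line after repairing \\xhh escapes."""
    return json.loads(_ESCAPE.sub(_fix_escape, line))
```

Lines that still fail to parse raise `json.JSONDecodeError`, which the caller can catch in order to skip or quarantine the entry.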