# Access Logs
LogFormat "{ \
\"@vips\":[\"%v\"], \
\"@source\":\"%v%U%q\", \
\"@source_host\": \"%v\", \
\"@source_path\": \"%f\", \
\"@tags\":[\"Apache\",\"Access\"], \
\"@message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
\"@fields\": { \
\"timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
\"clientip\": \"%a\", \
\"duration\": %D, \
\"status\": %>s, \
\"request\": \"%U%q\", \
\"urlpath\": \"%U\", \
\"urlquery\": \"%q\", \
\"method\": \"%m\", \
\"referer\": \"%{Referer}i\", \
\"user-agent\": \"%{User-agent}i\", \
\"bytes\": %B \
} \
}" ls_apache_json
# The catch-all
CustomLog "||/usr/local/bin/udpclient.pl 127.0.0.1 5001" ls_apache_json
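The gist does not show udpclient.pl itself. As a rough illustration of what a piped-log forwarder of that kind might do, here is a minimal Python sketch that sends each log line as one UDP datagram. The function name `forward_lines` is hypothetical, and the one-datagram-per-line behavior is an assumption, not a description of the original Perl script.

```python
import socket
from typing import Iterable


def forward_lines(lines: Iterable[str], host: str, port: int) -> int:
    """Send each non-empty line as one UDP datagram; return how many were sent.

    Hypothetical stand-in for udpclient.pl: Apache would write one JSON
    log entry per line to this process's stdin.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            payload = line.rstrip("\n")
            if not payload:
                continue  # skip blank lines rather than send empty datagrams
            sock.sendto(payload.encode("utf-8"), (host, port))
            sent += 1
    return sent
```

Calling `forward_lines(sys.stdin, "127.0.0.1", 5001)` from a small wrapper script would make it usable as the target of a `||` piped CustomLog, since it keeps reading until Apache closes the pipe.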
Actually, even simpler is to use the new feature of Filebeat that handles JSON decode errors automatically:
- type: log
  enabled: true
  paths:
    - /var/log/mylog.json
  json:
    keys_under_root: true
    add_error_key: true
If parsing fails, you'll get an entry with fields (among others):
{
  "error.message": "Error decoding JSON: invalid character 'o' in literal false (expecting 'a')",
  "error.type": "json",
  "message": "fooooo"
}
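To make the fallback behavior concrete, here is a small Python sketch of the same idea: try to decode each line as JSON, and on failure keep the raw line alongside error fields. This only imitates the shape of the Filebeat output shown above and is not Filebeat's implementation. The function name `decode_line` and the handling of non-object JSON are assumptions.

```python
import json
from typing import Any, Dict


def decode_line(line: str) -> Dict[str, Any]:
    """Decode one log line as a JSON object, or return an error event."""
    raw = line.rstrip("\n")
    try:
        event = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Keep the undecodable text in "message" so nothing is lost.
        return {
            "error.message": f"Error decoding JSON: {exc}",
            "error.type": "json",
            "message": raw,
        }
    if not isinstance(event, dict):
        # Assumption: a bare number, string, or list is treated as an error too.
        return {
            "error.message": "Error decoding JSON: not a JSON object",
            "error.type": "json",
            "message": raw,
        }
    return event
```

With this approach a malformed line such as `fooooo` still reaches the index as an event, flagged with `error.type` set to `json`, instead of being dropped.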
Then you can set up CustomLog to write directly to a file, bypassing sed.
Small correction, the sed customlog command needs to be spawned with a shell (|$ vs |) for the >> redirect to have meaning.
Seems to me that parts of the access_log contain user input from untrusted sources on the Internet. Is it really safe to pipe that through sed on the web server? The sed command will be running as the web server user and this might create opportunities for command injection. Likewise that's like one sed invocation per web request. Surely this will perform badly on a highly loaded server. Apache goes to great lengths to be scalable, but if we invoke a whole Unix process on each and every log line, I think that would significantly hurt scalability.
Likewise that's like one sed invocation per web request. Surely this will perform badly on a highly loaded server.
@pacohope Luckily Apache 2.4 (and < 2.4 when using the || form) starts the filtering process once at Apache startup time and pushes data through that one always-running process. So no, it doesn't add much overhead, assuming the filter program is efficient, and it doesn't respawn anew for each request:
Small correction, the sed customlog command needs to be spawned with a shell (|$ vs |) for the >> redirect to have meaning.
Corrected line (with \v fix as well):
CustomLog "|$/bin/sed -e s/\\v/\\ u000b/g -e s/\\x/\\u00/ >> /var/log/httpd/access_log" myformat
Doc reference: https://httpd.apache.org/docs/2.4/logs.html#piped
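For anyone who would rather not depend on sed quoting, the same rewrite can be sketched as a long-running Python filter. It turns the `\v` and `\xhh` escapes that Apache writes into the `\u000b` and `\u00hh` forms that JSON accepts. The names `fix_escapes` and `run` are hypothetical. Unlike the sed expressions above, this sketch rewrites every occurrence on a line and skips already-escaped backslashes, which is a design choice of the sketch rather than something taken from the thread.

```python
import re
from typing import TextIO

# Match an escaped backslash first so that "\\v" (a literal backslash
# followed by "v") is left alone; then match \v or \x followed by two hex digits.
_PATTERN = re.compile(r"\\\\|\\v|\\x([0-9A-Fa-f]{2})")


def _replace(match: "re.Match[str]") -> str:
    token = match.group(0)
    if token == "\\\\":
        return token            # escaped backslash: keep as is
    if token == "\\v":
        return "\\u000b"        # vertical tab as a JSON unicode escape
    return "\\u00" + match.group(1)  # \xhh becomes \u00hh


def fix_escapes(line: str) -> str:
    """Rewrite Apache-style \\v and \\xhh escapes into JSON-valid \\u escapes."""
    return _PATTERN.sub(_replace, line)


def run(stream_in: TextIO, stream_out: TextIO) -> None:
    """Filter lines until EOF, flushing after each so entries are not delayed."""
    for line in stream_in:
        stream_out.write(fix_escapes(line))
        stream_out.flush()
```

Calling `run(sys.stdin, sys.stdout)` from a wrapper script gives a single process that Apache starts once and feeds for its whole lifetime, matching the piped-log behavior described in the reply above.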