@tomplayford
Last active May 18, 2023 21:16
HAProxy log file regex for fluentd. This actually seems to work, unlike the one in the fluent docs. It parses the full HTTP log format. It will also catch all other HAProxy log lines, but only as one big string in the message field.
^(?<syslog_time>[^ ]* +[^ ]* +[^ ]*) (?<syslog_host>[\w-\.]+) (?<ps>\w+)\[(?<pid>\d+)\]: ((?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w\~-]+) (?<b_end>[\w-]+)\/(?<b_server>[\w\.-]+) (?<tq>\d+)\/(?<tw>\d+)\/(?<tc>\d+)\/(?<tr>\d+)\/(?<tt>\d+) (?<status_code>\d+) (?<bytes>\d+) (?<req_cookie>\S?) (?<res_cookie>\S?) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+) \{?(?<req_headers>[^}]*)\}? ?\{?(?<res_headers>[^}]*)\}? ?"(?<request>[^"]*)"|(?<message>.+))

salmanmp commented Jan 26, 2020

Hi, thank you. I used this and made it a little better.

^(?<syslog_time>[^ ]* +[^ ]* +[^ ]*) (?<syslog_host>[\w-\.]+) (?<ps>\w+)\[(?<pid>\d+)\]: ((?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w\~-]+) (?<b_end>[\w-]+)\/(?<b_server>[<>\w\.-]+) (?<tq>\d+)\/(?<tw>[\d-]+)\/(?<tc>[\d-]+)\/(?<tr>[\d-]+)\/(?<tt>[\d-]+) (?<status_code>\d+) (?<bytes>\d+) (?<req_cookie>\S?) (?<res_cookie>\S?) (?<t_state>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+) \{?(?<req_headers>[^}]*)\}?( ?\{(?<res_headers2>[^}]*)\})? ?"(?<request>[^"]*)"|(?<message>.+))$

la-bibe commented Sep 21, 2020

Hey, thanks for this regex. I used it with Fluent Bit and improved it a bit to handle different log message levels. I also escaped the minus (-) signs to prevent the error "unmatched range specifier in char-class":

^(?<syslog_time>[^ ]* +[^ ]* +[^ ]*) (?<syslog_host>[\w\-\.]+) (?<ps>\w+)\[(?<pid>\d+)\]: ((?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] (?<f_end>[\w\~\-]+) (?<b_end>[\w\-]+)\/(?<b_server>[\w\.\-]+) (?<tq>[\d\-]+)\/(?<tw>[\d\-]+)\/(?<tc>[\d\-]+)\/(?<tr>[\d\-]+)\/(?<tt>[\d\-]+) (?<status_code>\d+) (?<bytes>\d+) (?<req_cookie>\S?) (?<res_cookie>\S?) (?<t_state>[\w\-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) (?<srv_queue>\d+)\/(?<backend_queue>\d+) \{?(?<req_headers>[^}"]*)\}? ?\{?(?<res_headers>[^"}]*)\}? ?"(?<request>[^"]*).*|((\[(?<message_level>[^\]]+)\] )?(?<message>.+)))
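
For anyone who wants to try the pattern above outside fluentd or Fluent Bit, here is a minimal Python sketch. It rests on a few assumptions: Python's re module does not accept the (?<name>...) group syntax, so the script rewrites it to (?P<name>...) and leaves the rest of the pattern untouched; the HTTP sample line is the example given in the HAProxy configuration manual's HTTP log format section; the second sample line and the parse helper are hypothetical and only there for illustration. Python's engine is not Onigmo, so treat a match here as a sanity check rather than proof of how fluentd will behave.

```python
import re

# The pattern from the comment above, split across lines for readability only.
ONIGMO_PATTERN = (
    r'^(?<syslog_time>[^ ]* +[^ ]* +[^ ]*) (?<syslog_host>[\w\-\.]+) '
    r'(?<ps>\w+)\[(?<pid>\d+)\]: '
    r'((?<c_ip>[\w\.]+):(?<c_port>\d+) \[(?<time>.+)\] '
    r'(?<f_end>[\w\~\-]+) (?<b_end>[\w\-]+)\/(?<b_server>[\w\.\-]+) '
    r'(?<tq>[\d\-]+)\/(?<tw>[\d\-]+)\/(?<tc>[\d\-]+)\/(?<tr>[\d\-]+)\/(?<tt>[\d\-]+) '
    r'(?<status_code>\d+) (?<bytes>\d+) (?<req_cookie>\S?) (?<res_cookie>\S?) '
    r'(?<t_state>[\w\-]+) '
    r'(?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srv_conn>\d+)\/(?<retries>\d+) '
    r'(?<srv_queue>\d+)\/(?<backend_queue>\d+) '
    r'\{?(?<req_headers>[^}"]*)\}? ?\{?(?<res_headers>[^"}]*)\}? ?"(?<request>[^"]*).*'
    r'|((\[(?<message_level>[^\]]+)\] )?(?<message>.+)))'
)

# Assumption: only the named-group syntax differs between Onigmo and Python here.
PATTERN = re.compile(ONIGMO_PATTERN.replace("(?<", "(?P<"))

# Example HTTP log line as shown in the HAProxy configuration manual.
HTTP_LINE = (
    'Feb  6 12:14:14 localhost haproxy[14389]: 10.0.1.2:33317 '
    '[06/Feb/2009:12:14:14.655] http-in static/srv1 10/0/30/69/109 200 2750 '
    '- - ---- 1/1/1/1/0 0/0 {1wt.eu} {} "GET /index.html HTTP/1.1"'
)

# Hypothetical non-HTTP line, made up to exercise the fallback branch.
OTHER_LINE = (
    'Feb  6 12:14:15 localhost haproxy[14389]: '
    '[WARNING] hypothetical non-HTTP message'
)


def parse(line):
    """Hypothetical helper: return the named groups that matched, or None."""
    m = PATTERN.match(line)
    if m is None:
        return None
    # Drop groups from the alternation branch that did not take part in the match.
    return {k: v for k, v in m.groupdict().items() if v is not None}


if __name__ == "__main__":
    print(parse(HTTP_LINE))
    print(parse(OTHER_LINE))
```

The HTTP line should come back split into its individual fields, while the second line only fills the syslog prefix plus message_level and message, which is the "one big string" fallback described at the top.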

salmanmp commented Dec 9, 2020

Very good. Thank you.
Sorry for the late response.
