Created
August 23, 2018 14:35
Get some extra file names from HTTP
redef record HTTP::Info += {
	potential_fname: string &optional;
};

event http_request(c: connection, method: string, original_URI: string,
                   unescaped_URI: string, version: string) &priority=5
	{
	# Get rid of uri arguments
	local path = split_string(c$http$uri, /\?/)[0];
	local out = split_string(path, /\//);
	# Take the last component in the uri path
	c$http$potential_fname = out[|out|-1];
	}

event http_header(c: connection, is_orig: bool, name: string, value: string) &priority=3
	{
	if ( is_orig )
		return;

	if ( name == "ETAG" && /\"/ in value )
		{
		if ( c$http?$potential_fname && c$http$potential_fname != "" )
			c$http$current_entity$filename = c$http$potential_fname;
		}
	}
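
To try the path handling without live traffic, the same two `split_string` calls can be pulled into a standalone function. This is only a minimal sketch: `last_path_component` and the sample URIs are hypothetical and are not part of the script above or of Zeek itself.

```zeek
# Hypothetical helper (not part of Zeek): returns the last '/'-separated
# component of a URI after dropping any query string.
function last_path_component(uri: string): string
	{
	# Everything before the first '?' is the path.
	local path = split_string(uri, /\?/)[0];
	local parts = split_string(path, /\//);
	return parts[|parts| - 1];
	}

event zeek_init()
	{
	# Made-up sample URIs for illustration.
	print last_path_component("/downloads/report.pdf?session=abc");
	print last_path_component("/downloads/");
	}
```

A URI that ends in a slash yields an empty last component, which is why the header handler checks for `!= ""` before using the value.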
Oh, and a small clarification, so that we don't digress over a canard. I realize Zeek logs aren't sequences of bytes where anything could end up in them, because the tab-separated and JSON output both escape non-printable characters. But internally in Zeek, I worry whether every datatype is just an arbitrary sequence of bytes, which would mean values can technically have nulls or anything else in them. I would blanch if hash results could have nulls or anything of the sort in them. The point I am raising in this discussion is that programmers carry some semantic baggage as they read variable and type names. I blanch if a "filename" can contain a * or / or \. It needs to be termed a filepath if it is the '/'-delimited hierarchy. It needs to be a fullpath if it is the filepath and filename concatenated. It needs to be a pattern if it can contain * or ?.
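
As a rough illustration of those distinctions, here is a sketch in Zeek script. The function `classify_name` and its four labels are hypothetical: they are taken from the wording of the comment above, not from any Zeek API, and treating both `/` and `\` as separators is an assumption.

```zeek
# Hypothetical helper (not part of Zeek): label a string using the
# naming distinctions proposed in the comment above.
function classify_name(s: string): string
	{
	# A '*' or '?' anywhere makes it a pattern.
	if ( /\*|\?/ in s )
		return "pattern";

	# No separator at all: a bare filename.
	if ( ! ( /\/|\\/ in s ) )
		return "filename";

	# Ends in a separator: only the directory hierarchy.
	if ( /.*(\/|\\)/ == s )
		return "filepath";

	# Hierarchy followed by a final component.
	return "fullpath";
	}

event zeek_init()
	{
	# Made-up sample inputs for illustration.
	print classify_name("index.html");
	print classify_name("/images/");
	print classify_name("/images/logo.png");
	print classify_name("*.png");
	}
```

Under this scheme, the value the script stores in `potential_fname` would count as a filename only because it is the last `/`-separated component of the path; nothing in the script rules out a `*` or `\` in it.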