FxA Log Format

version 0.0.1-draft-1

Goals

A standard base log format that enables log messages to be easily routed and filtered while allowing applications maximum flexibility in the data they record.

  • record format
    • few required fields (just enough for coarse routing/filtering)
    • "easy" serialization and deserialization
    • easily shipped through Heka
    • support arbitrary application data
  • file (stream) format
    • newline delimited
    • "grepable" on disk

Non-Goals

  • serialized size
  • processing speed over format flexibility
  • comprehensive schemas for all applications

Open Questions

  • How close to the Heka message format should we try to stay?
    • Heka "Fields" are less flexible than JSON.
    • Heka natively uses protobufs, which are not flexible.
  • How big of a concern is routing performance?
    • Is a strict field order important?
    • Should we expect a router to parse the entire message?
  • Is it important for the message to use a single encoding?
    • For example, if the envelope was protobuf and payload was JSON?
    • I lean toward yes, a single encoding would be ideal.
  • How important is nested data in a log message?
    • I think it's very important.
    • Without nesting, ad hoc "flattening" is inevitable (see the sketch below).
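
On the flattening point, here is a minimal TypeScript sketch in the style of Heka's util.table_to_fields (the function itself is hypothetical, for illustration only). It shows where the hierarchy ends up: in the key names.

// Hypothetical sketch of ad hoc flattening: the hierarchy survives only
// as dotted key names, e.g. { a: { b: 1 }, c: 2 } -> { "a.b": 1, "c": 2 }.
function flatten(
  obj: Record<string, unknown>,
  prefix = "",
  out: Record<string, unknown> = {}
): Record<string, unknown> {
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      flatten(value as Record<string, unknown>, path, out);
    } else {
      out[path] = value; // arrays and scalars become leaves
    }
  }
  return out;
}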

Record Format

Log messages are objects (in JSON terminology) containing a small set of required fields. Additional fields belong in the payload field, which may be of any JSON-compatible type.

Top Level Fields

The purpose of the top level fields is to uniquely identify each log message by application, location, time, and importance.

  • version
    • string
    • log format version (semver)
  • severity
    • number
    • syslog-style severity level (lower is more severe)
  • type
    • string
    • designates the "type" of the payload field
    • optionally a JSON Schema URL
  • logger
    • string
    • application name
  • hostname
    • string
  • pid
    • number
    • Process ID of the logging process
  • timestamp
    • number
    • microseconds since UNIX epoch (UTC)
  • payload
    • object
  • _*
    • optional
    • string
    • additional metadata fields

These fields are strictly ordered from least to most specific. A process may choose to leave any value unparsed that it doesn't require. For example, a router may only need to parse up to the logger field in order to send the message to the correct log aggregator. It can leave the rest of the message unparsed and pass the entire message along as its payload.
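As a sketch of that kind of coarse routing: because the field order is strict, a prefix match on the raw line can be enough. This assumes JSON serialization; the destination names and the severity threshold are invented for illustration.

// Hypothetical router sketch: strict field order means the prefix of a
// serialized line is enough to pick a destination; the rest stays opaque.
// (The regex ignores JSON string escapes; fine for a sketch.)
const PREFIX = /^\{"version":"[^"]*","severity":(\d+),"type":"[^"]*","logger":"([^"]*)"/;

function route(line: string): string {
  const m = PREFIX.exec(line);
  if (!m) return "quarantine";          // unparseable: don't drop it
  const severity = Number(m[1]);
  const logger = m[2];
  if (severity <= 3) return "alerts";   // syslog-style: 3 is "error"
  return `aggregator:${logger}`;        // e.g. aggregator:fxa-auth-server
}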

The type value may be any string that signals additional context for the payload value in applications that emit structured log messages. It may be simply "object", or "FMLError:v1.0.2", or a JSON Schema URL.

_* fields are for metadata about the payload that doesn't fit into the other fields or into the payload itself. Some examples: _appVersion, _threadId, _awsRegion.

Application Data Types

Application-specific payload types may be hierarchical (nesting is even encouraged), but their schemas are beyond the scope of this document.

Example

{
  "version":"0.0.1-draft-1",
  "severity":6,
  "type":"http://schemas.accounts.firefox.com/v1/request_summary.json",
  "logger":"fxa-auth-server",
  "hostname":"apollo",
  "timestamp": 1405915343869000,
  "payload": {
    "code":400,
    "errno":103,
    "rid":"1405915831047-56717-80",
    "path":"/v1/password/change/start",
    "lang":"en-US,en",
    "remoteAddressChain":["127.0.0.1"],
    "t":1,
    "email":"[email protected]"
  },
  "_appVersion": "0.18.0"
}
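
For illustration, a minimal sketch of an emitter that would produce a record like the one above, assuming Node.js. The record helper is hypothetical; the logger and _appVersion values are copied from the example, and the pid is whatever the emitting process has.

import * as os from "os";

// Hypothetical emitter sketch for the draft format above. Insertion order
// of the keys preserves the strict top-level field order when stringified.
function record(severity: number, type: string, payload: unknown): void {
  const msg = {
    version: "0.0.1-draft-1",
    severity,
    type,
    logger: "fxa-auth-server",     // application name
    hostname: os.hostname(),
    pid: process.pid,
    timestamp: Date.now() * 1000,  // ms clock scaled to microseconds (UTC)
    payload,
    _appVersion: "0.18.0",
  };
  process.stdout.write(JSON.stringify(msg) + "\n"); // newline delimited
}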

Serialization

TO BE DETERMINED

  • JSON
  • TNetStrings
  • MsgPack

Lines may be compressed. (???)

Human Readability

Serialized messages do not need to be directly human readable. A pretty-printer application may be used to display messages in a human-optimized format.
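
A minimal sketch of such a pretty printer, assuming JSON serialization and Node.js; the output layout here is arbitrary.

import * as readline from "readline";

// Hypothetical pretty printer: newline-delimited records in on stdin,
// a human-oriented rendering out on stdout.
const rl = readline.createInterface({ input: process.stdin });
rl.on("line", (line) => {
  try {
    const msg = JSON.parse(line);
    console.log(`${msg.timestamp} ${msg.logger}[${msg.pid}] sev=${msg.severity} ${msg.type}`);
    console.log(JSON.stringify(msg.payload, null, 2));
  } catch {
    console.log(line); // pass through anything that isn't a record
  }
});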

Grepability

Raw log files should be easily searchable from a *nix shell, somehow.

Potential Problems

  • arbitrarily nested objects
    • parsers skipping unknown fields (+1 for tnetstrings; see the sketch below)
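
On the tnetstrings point: length prefixes let a reader hop over any value it does not understand, because the byte count comes first. A toy sketch of that skip (not a full tnetstrings parser):

// Toy sketch: in tnetstrings every value is "LENGTH:PAYLOAD<type char>",
// e.g. the string "hello" is "5:hello,". Skipping an unknown field is
// arithmetic, no matter how deeply nested the value is.
function skipValue(data: string, pos: number): number {
  const colon = data.indexOf(":", pos);
  const len = parseInt(data.slice(pos, colon), 10);
  return colon + 1 + len + 1; // jump past payload and its trailing type char
}

// skipValue("5:hello,", 0) === 8  (the whole value consumed)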

Prior Art

References

@dannycoates (Author)

I've renamed some of the fields to be closer to Heka's native message naming.

@seanmonstar
> Yes, it identifies the code that is running. @chilts had a good suggestion of adding the app version also.

Can't that be inferred by where the log is coming from?

> Yes, I think all of these top level fields should be set by the logger, not by the caller.

I meant that the actual log pipe (fxa-auth-server.log, or whatever) would be able to infer the app name.

> However, now the semantics of what 7 means is harder for a machine to determine (and query later)

Yea, I struggled a lot in intel trying to figure out how to serialize strings like that. Currently, intel's Record.toJSON does json.args = args, and figures that a machine can put them back together if desired.

> Or do we not do either and force the router to parse data if it wants to make intelligent routing decisions?

I'm not sure. That's what I ended up expecting in intel. I see that hapi uses tags for server and request logging. I catch "uncaught" errors from hapi by checking tags.error && tags.implementation. Though, coming from Python, and hence what I built into intel, was the idea of routing on logger.name and level. It felt like that information was sufficient for getting the right log messages wherever.

Since we're talking about logging here, and I happen to work on a logging library, I'd be interested in adjusting intel to solve all these problems easily.

@seanmonstar
For the app name + version, how about app?

@dannycoates (Author)

From the #heka IRC channel:

11:10 (dcoates) so, I'd really like to be able to encode a tree structure in a heka message, and I don't see a straightforward way to do that looking at message.proto. Is there a recommended way to do it?
11:12 (trink) dcoates: the only way to do it is flatten the structure and put the hierarchy in the key name
11:13 (trink) dcoates: util.table_to_fields
11:14 (RaFromBRC) https://github.com/mozilla-services/lua_sandbox/blob/dev/modules/util.lua#L13
11:15 (dcoates) trink: yeah, thats what i was afraid of. It means I can't just use heka messages as a "native" format and have to kind of "double encode" my message in a sense.
11:16 (dcoates) I think it would be *really nice* to make Field recursive
11:20 (trink) dcoates: that still wouldn't do what you want (since there is no concept of a map)
11:24 (dcoates) well, you got my intent ;) I'm not trying to offer a solution
11:31 (dcoates) Is supporting encoding trees in messages without flattening something y'all would be willing to consider?
11:32 (trink) dcoates: the what is clear, the intent [why] is unclear (i.e. the need for a highly structured/nested log to be used in real time analysis)
11:36 (dcoates) trees are natural structures for a lot of things. I may want to trigger on something without having to push it up to the top scope. It'd also be nice to not have to transform my data for each system it goes through.
11:41 (chance) just wondering, what kinda data in this case might be in the tree format
11:42 (chance) maybe git commits? :D
11:42 (chance) you could serialize to json, and unserialize back
11:43 (chance) not in the message fields itself, but nonetheless its there.
11:44 (dcoates) chance: indeed, but now I've got 2 encodings in one message when I could have 1, and as a string payload its harder to use in heka filters.
11:45 (chance) i dont disagree. but supporting a tree like structure in pure heka messages would likely make it harder to use *everything* in heka, not even just filters.
11:46 (chance) but its hard to say.
11:46 (chance) at least protobuf supports it im reading, so its a possibility.
11:47 (chance) probably not on the top of things we'll be working on until we get other stuff done.
11:48 (RaFromBRC) dcoates: the double encoding issue is valid, and i know json sucks in a lot of ways, but in lua filters it'd actually be easier to deal w/ tree structured data in json format than it would embedded in the message directly
11:48 (dcoates) chance: im interested in what would be harder for heka internally so that I can understand what I should and shouldn't expect heka to do for me
11:49 (RaFromBRC) `cjson.decode(your_json)` will give you a lua table (like a javascript object) all ready for action
11:49 (chance) it's hard to say, because i haven't investigated. we use protobuf, protobuf supports this kind of data. however that isn't to say doing such a thing would be as simple as changing protobuf
11:50 (RaFromBRC) extracting data from a bunch of different message fields would be multiple `read_message(FieldName)` calls, and constructing the data structure by hand
11:54 (dcoates) RaFromBRC: awesome, that really helps!
11:55 (chance) yeah, its a good point that its probably easier for you as a user to just get the data from the payload for that specific rason
11:55 (chance) reason, even
11:55 (RaFromBRC) :)
11:55 RaFromBRC likes being helpful
11:56 (dcoates) chance: yes, that is what I was trying to understand, which would be better, all native protobuf, or something else in the payload. payload wins!
11:59 (chance) definitely. sometimes i find myself unsure as well, but im still new to heka :)

@trink commented Jul 22, 2014

> Non-Goals: comprehensive schemas for all applications

This non-goal is already causing problems in Ops. If the output isn't documented it should be identifiable as such and not sent to Heka or any data warehouse. Everything else should be fully documented and versioned to prevent breaking downstream consumers. There have been several ad-hoc FxA logging changes over the past few months: removing memory statistics that we actually alert on, an error message that is now sometimes an object instead of a string, and sporadically missing required fields.

@dannycoates (Author)

@trink I'm not saying having documented types is not a goal. I am saying it shouldn't be a prerequisite for getting any logs flowing. The purpose of this document is to get a format in place that will not fail to parse in heka for any message going through. There is a type field that should be strict enough that we can write heka filters to process them if we want, but I think it is important to allow untyped dynamic payload data to at least not get dropped on the floor, and to end up somewhere a developer can at least look at it to decide how fucked up it is. Strict schemas for everything make it unnecessarily hard to discover what is important, especially for unexpected errors. Our current setup needs to improve; this is working towards that.

@rfk commented Jul 22, 2014

Suggestion: could the _* fields be cleaner if put into another sub-object, e.g. meta to go alongside payload? It's not clear what advantage we get from having them in the top-level fields.

@dannycoates (Author)

> could the _* fields be cleaner if put into another sub-object, e.g. meta to go alongside payload?

Yep, I took that convention from GELF, but a meta object might be nicer. If it were named fields it would map directly to the Heka message, but be less clear in its purpose.
