Filebeat modules are all either open source or provided under the Elastic License. You can read any of them to understand how the parsing, the conversion and the mapping to ECS are done.
- All Filebeat modules are listed here: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-modules.html
- The code for Apache 2 open source modules is here: https://github.com/elastic/beats/tree/master/filebeat
- The code for Elastic Licensed modules is here: https://github.com/elastic/beats/tree/master/x-pack/filebeat/module
From either module directory, the structure is the same:
- You'll have a directory named after the module
- Under it you'll have one or more directories for "filesets" (different logs, such as the Apache error and access logs).
- For a given fileset directory, you will have either Beats processors in config/*.yml or an Elasticsearch ingest pipeline at ingest/*.json or ingest/*.yml. Some modules have both Beats processors and ES pipelines.
- Concrete example: Suricata (x-pack/filebeat/module/suricata, under the "eve" fileset):
  - Beats processors: x-pack/filebeat/module/suricata/eve/config/eve.yml
  - ES ingest pipeline: x-pack/filebeat/module/suricata/eve/ingest/pipeline.yml
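To get an overview of a module without clicking through each directory, the layout above can be walked from a local checkout. This is only a sketch: the function name describe_module is made up for illustration, and it assumes nothing beyond the config/*.yml and ingest/*.json or ingest/*.yml conventions described above.

```python
from pathlib import Path


def describe_module(module_dir):
    """Map each fileset name to the config and ingest files found under it.

    Assumes the layout described above:
      <module>/<fileset>/config/*.yml
      <module>/<fileset>/ingest/*.json or *.yml
    Directories without either kind of file (for example _meta) are skipped.
    """
    layout = {}
    for fileset in sorted(Path(module_dir).iterdir()):
        if not fileset.is_dir():
            continue
        config = sorted(p.name for p in fileset.glob("config/*.yml"))
        ingest = sorted(
            p.name
            for pattern in ("ingest/*.json", "ingest/*.yml")
            for p in fileset.glob(pattern)
        )
        if config or ingest:
            layout[fileset.name] = {"config": config, "ingest": ingest}
    return layout


if __name__ == "__main__":
    # Path is relative to a local clone of the beats repository (an assumption).
    for name, files in describe_module("x-pack/filebeat/module/suricata").items():
        print(name, files)
```

A fileset that only lists files under "config" does its work in Beats processors, one that only lists files under "ingest" relies on an Elasticsearch pipeline, and some will list both.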
Most modules have tests which include raw logs and the converted log, which you can also look at.
- The test directory will contain pairs of log files: one with the original logs, and another with the same name plus -expected.json at the end, which shows the resulting event documents after conversion.
- Continuing the Suricata example:
  - An original Suricata EVE log: x-pack/filebeat/module/suricata/eve/test/eve-small.log
  - The same log converted: x-pack/filebeat/module/suricata/eve/test/eve-small.log-expected.json
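The naming convention makes it easy to match each raw log with its converted counterpart. A minimal sketch, assuming a local checkout; find_test_pairs is a hypothetical helper name, not something shipped with Beats.

```python
from pathlib import Path

SUFFIX = "-expected.json"


def find_test_pairs(test_dir):
    """Return (raw_log, expected_json) path pairs from a fileset's test directory."""
    pairs = []
    for expected in sorted(Path(test_dir).glob("*" + SUFFIX)):
        # Strip the suffix to recover the name of the original log file.
        raw = expected.with_name(expected.name[: -len(SUFFIX)])
        if raw.is_file():
            pairs.append((raw, expected))
    return pairs


if __name__ == "__main__":
    # Path is relative to a local clone of the beats repository (an assumption).
    for raw, expected in find_test_pairs("x-pack/filebeat/module/suricata/eve/test"):
        print(raw.name, "->", expected.name)
```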
These -expected.json files do not show the actual format of the documents as they will be in Elasticsearch. They are instead optimized for "diffing" before/after when making changes to the module. In other words, they are made easier for humans to read.
In their real format, the converted JSON documents have no dotted keys; everything is nested JSON objects.
So where you'd see this in the "-expected.json" file:
{
  "@timestamp": "2018-07-05T19:01:09.820Z",
  "destination.address": "192.168.253.112",
  ...
}
the document would look like this in Elasticsearch:
{
  "@timestamp": "2018-07-05T19:01:09.820Z",
  "destination": {
    "address": "192.168.253.112",
    ...
  }
  ...
}
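If you want to see a test document in its nested shape, the dotted keys can be expanded with a few lines of Python. This is an illustrative sketch only, not how Filebeat or Elasticsearch build the document; expand_dotted is a made-up name, and the sample event contains only the two fields shown above.

```python
import json


def expand_dotted(flat):
    """Turn {"a.b": 1} into {"a": {"b": 1}}.

    Only top-level keys are expanded; values are copied as-is.
    Raises ValueError when one key needs an object where another key
    already stored a plain value (or the other way around).
    """
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{key!r} conflicts with the value stored at {part!r}")
        if leaf in node:
            raise ValueError(f"{key!r} conflicts with an already expanded object")
        node[leaf] = value
    return nested


if __name__ == "__main__":
    event = {
        "@timestamp": "2018-07-05T19:01:09.820Z",
        "destination.address": "192.168.253.112",
    }
    print(json.dumps(expand_dotted(event), indent=2))
```

The conflict checks are a design choice for this sketch: a flattened file should never contain both "destination" as a plain value and "destination.address", so failing loudly seems safer than silently overwriting one of them.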
If you're looking for the field definitions of a given module, you'll generally find them inside each fileset's directory as well. If some field definitions are common across the module, and not specific to a fileset, you may also find them at the module level. Simply navigate to _meta/fields.yml in each of these locations.
Concretely:
- Module-level definitions for Zeek: x-pack/filebeat/module/zeek/_meta/fields.yml
- Definitions for DNS events: x-pack/filebeat/module/zeek/dns/_meta/fields.yml
- Definitions for HTTP events: x-pack/filebeat/module/zeek/http/_meta/fields.yml
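The same lookup can be scripted against a local checkout. A small sketch, assuming only the _meta/fields.yml locations described above; find_field_definitions and the "(module)" label are hypothetical names chosen for this example.

```python
from pathlib import Path


def find_field_definitions(module_dir):
    """Map "(module)" and each fileset name to its _meta/fields.yml path, if present."""
    module_dir = Path(module_dir)
    found = {}
    module_level = module_dir / "_meta" / "fields.yml"
    if module_level.is_file():
        found["(module)"] = module_level
    # Fileset-level definitions live one directory deeper.
    for fields in sorted(module_dir.glob("*/_meta/fields.yml")):
        found[fields.parent.parent.name] = fields
    return found


if __name__ == "__main__":
    # Path is relative to a local clone of the beats repository (an assumption).
    for name, path in find_field_definitions("x-pack/filebeat/module/zeek").items():
        print(name, path)
```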