As part of helping a customer develop their proof-of-concept monitoring system with Sensu Enterprise, I worked up a mutator which uses stash data to determine whether an event occurred within a pre-defined maintenance window.
The idea is that, for SLA reporting purposes, event data needs to be annotated to indicate whether an event occurred during a scheduled maintenance window. With this added downtime context, events logged to an external source (e.g. Graylog, Elasticsearch) via the Sensu Enterprise event bridge should provide enough information to determine whether a client's check result falls within a scheduled downtime window.
Please note that I have done very little testing, so this plugin is unlikely to be very robust. Because this mutator probably needs to be applied to every event, and a pipe mutator spawns a new process for each event it handles, it should probably be implemented as an extension before being put into a production system.
The mutator assumes the following:
- Relative to the Sensu event processor, the Sensu API is running on 127.0.0.1:4567. This will be true of any Sensu Enterprise server.
- Sensu clients are configured with a custom attribute, services, whose value is an array containing zero or more strings defining service names, which will be compared to named stashes under the downtime path.
- Stashes will be created via the Sensu API under the downtime path, with a name matching a service defined on clients, and with start and end attributes whose values are Unix epoch timestamps.
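The assumptions above imply a simple matching rule, which can be sketched in Ruby as follows. This is a minimal illustration, not the mutator's actual code: the helper name `matching_downtime`, the shape of the `stashes` argument, and the choice to compare against the event's top-level `timestamp` are all assumptions made for the example.

```ruby
# Sketch only: helper name and argument shapes are assumptions, not the
# original mutator's code.

# Returns the downtime stashes whose start..end window contains the event
# timestamp, checking each service listed on the client.
#
# event   - Hash parsed from Sensu event JSON (string keys)
# stashes - Hash of stash path => stash content (string keys)
def matching_downtime(event, stashes)
  timestamp = event["timestamp"]
  return [] unless timestamp.is_a?(Integer)

  services = event.dig("client", "services") || []
  services.each_with_object([]) do |service, windows|
    stash = stashes["downtime/#{service}"]
    next unless stash.is_a?(Hash)

    start_at = stash["start"]
    end_at = stash["end"]
    next unless start_at.is_a?(Integer) && end_at.is_a?(Integer)

    # The window is treated as inclusive at both ends (an assumption).
    windows << stash if timestamp.between?(start_at, end_at)
  end
end

example_event = {
  "client" => { "name" => "datboi", "services" => ["arbitrary_service_id"] },
  "timestamp" => 1493160075
}
example_stashes = {
  "downtime/arbitrary_service_id" => { "start" => 1493159338, "end" => 1493169338 }
}
p matching_downtime(example_event, example_stashes).length # => 1
```

A client with an empty services array, or a service with no stash, simply yields no windows.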
Example client definition. Note "arbitrary_service_id" as a value in the services array:
{
  "client": {
    "name": "datboi",
    "address": "192.168.2.227",
    "subscriptions": [
      "client:datboi"
    ],
    "environment": "staging",
    "tags": [],
    "services": [
      "arbitrary_service_id"
    ]
  }
}
Example curl command to create a "scheduled downtime" stash under the downtime path, matching the arbitrary_service_id service defined on the client above:
curl -X POST -H 'Content-Type: application/json' -d '{"path":"downtime/arbitrary_service_id","content":{"start":1493158003,"end":1493168003,"creator":"Your Name Here","description":"this is a test"}}' http://127.0.0.1:4567/stashes
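If the downtime window is scheduled from a script rather than by hand, the same request could be built in Ruby. The sketch below uses only the standard library; the helper names `downtime_stash_request` and `create_downtime_stash` are hypothetical, and the host and port defaults simply repeat the API address assumed earlier.

```ruby
require "json"
require "net/http"

# Builds (but does not send) a POST /stashes request for a downtime window.
# start_time and end_time are Unix epoch timestamps.
def downtime_stash_request(service, start_time, end_time, creator:, description:)
  request = Net::HTTP::Post.new("/stashes", "Content-Type" => "application/json")
  request.body = JSON.generate(
    path: "downtime/#{service}",
    content: {
      start: start_time,
      end: end_time,
      creator: creator,
      description: description
    }
  )
  request
end

# Sends a prepared request to the Sensu API and returns the HTTP response.
def create_downtime_stash(request, host: "127.0.0.1", port: 4567)
  Net::HTTP.start(host, port) { |http| http.request(request) }
end

# Usage (not executed here), scheduling a two-hour window starting now:
#   now = Time.now.to_i
#   req = downtime_stash_request("arbitrary_service_id", now, now + 7200,
#                                creator: "Your Name Here",
#                                description: "this is a test")
#   puts create_downtime_stash(req).code
```

Separating request construction from sending keeps the payload easy to inspect before anything touches the API.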
With a client configured and a stash created, the mutator can be defined in configuration and applied to a handler. Here's the combined handler and mutator configuration I used in my testing:
{
  "handlers": {
    "downtime_test": {
      "type": "pipe",
      "command": "tee /tmp/downtime_test",
      "mutator": "scheduled_downtime"
    }
  },
  "mutators": {
    "scheduled_downtime": {
      "command": "/usr/local/bin/scheduled-downtime.rb"
    }
  }
}
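For context on what the configured command has to do: a pipe mutator receives the event as JSON on STDIN and writes the mutated event to STDOUT, which is then passed to the handler (here, tee). A minimal sketch of that contract is shown below; `run_pipe_mutator` is a hypothetical helper for illustration, not the contents of scheduled-downtime.rb.

```ruby
require "json"

# Reads one event as JSON from `input`, yields the parsed Hash to the block,
# and writes whatever the block returns to `output` as JSON.
def run_pipe_mutator(input: $stdin, output: $stdout)
  event = JSON.parse(input.read)
  mutated = yield(event)
  output.write(JSON.generate(mutated))
end

# Usage inside a mutator script (not executed here). The block is where the
# downtime lookup would go; this placeholder just adds an empty array:
#   run_pipe_mutator { |event| event.merge("downtime" => []) }
```

Passing the streams as arguments keeps the transformation testable without invoking Sensu.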
After restarting Sensu services to apply the configuration, I tested the mutator using nc (netcat) to send a check result to the local client socket:
echo '{"name":"test","status":2,"output":"test output","handler":"downtime_test"}' | nc 127.0.0.1 3030
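Where nc is not available, the same test result could be sent from Ruby. This is a rough equivalent under the same assumptions as the command above (client socket on 127.0.0.1:3030); the helper names are hypothetical.

```ruby
require "json"
require "socket"

# Serializes a check result in the shape used by the nc example.
def check_result_json(name:, status:, output:, handler:)
  JSON.generate(name: name, status: status, output: output, handler: handler)
end

# Writes the payload to the client socket and returns whatever the client
# sends back before closing the connection.
def send_check_result(payload, host: "127.0.0.1", port: 3030)
  TCPSocket.open(host, port) do |socket|
    socket.write(payload)
    socket.close_write # signal that the payload is complete
    socket.read
  end
end

# Usage (not executed here):
#   payload = check_result_json(name: "test", status: 2,
#                               output: "test output",
#                               handler: "downtime_test")
#   send_check_result(payload)
```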
And I see the data written to disk by tee, with a copy of the downtime stash incorporated in the event data under the downtime array, as I expect:
$ cat /tmp/downtime_test | jq .
{
  "client": {
    "name": "datboi",
    "address": "192.168.2.227",
    "subscriptions": [
      "client:datboi"
    ],
    "environment": "staging",
    "tags": [],
    "services": [
      "arbitrary_service_id"
    ],
    "version": "0.29.0",
    "timestamp": 1493160058
  },
  "check": {
    "name": "test",
    "status": 2,
    "output": "test output",
    "handler": "downtime_test",
    "executed": 1493160075,
    "issued": 1493160075,
    "type": "standard",
    "history": [
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2"
    ],
    "total_state_change": 0
  },
  "occurrences": 21,
  "occurrences_watermark": 21,
  "action": "create",
  "timestamp": 1493160075,
  "id": "fc081db1-961a-4f64-8412-d5a56a152ed4",
  "last_state_change": 1491797301,
  "last_ok": 1491797301,
  "silenced": false,
  "silenced_by": [],
  "downtime": [
    {
      "start": 1493159338,
      "end": 1493169338,
      "creator": "Your Name Here",
      "description": "this is a test"
    }
  ]
}
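On the reporting side, a consumer of these annotated events could separate them by whether the downtime array is populated. The sketch below is only an illustration of that idea: the helper names and bucket labels are invented for the example, and it treats a missing downtime key the same as an empty array, since I have not confirmed which one the mutator emits when no window matches.

```ruby
# True when the event carries at least one downtime window.
def during_downtime?(event)
  Array(event["downtime"]).any?
end

# Counts events inside and outside scheduled downtime.
# The bucket names are hypothetical labels chosen for this example.
def downtime_counts(events)
  inside, outside = events.partition { |event| during_downtime?(event) }
  { "scheduled_downtime" => inside.length, "unscheduled" => outside.length }
end

events = [
  { "id" => "a", "downtime" => [{ "start" => 1493159338, "end" => 1493169338 }] },
  { "id" => "b", "downtime" => [] },
  { "id" => "c" }
]
p downtime_counts(events) # => {"scheduled_downtime"=>1, "unscheduled"=>2}
```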