@grobie
Created November 9, 2017 14:05
Prometheus alerts (in 1.x format) to monitor a Prometheus 2.0 server
ALERT PrometheusTSDBReloadsFailing
  IF increase(prometheus_tsdb_reloads_failures_total[4h]) > 0
  FOR 1d
  LABELS {
    system = "prometheus",
    severity = "warning",
  }
  ANNOTATIONS {
    summary = "Prometheus has issues reloading data blocks from disk",
    description = "{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}} reload failures over the last four hours.",
    runbook = "http://eng-doc/runbooks/prometheus/#prometheustsdbreloadsfailing",
  }

ALERT PrometheusTSDBCompactionsFailing
  IF increase(prometheus_tsdb_compactions_failed_total[4h]) > 0
  FOR 1d
  LABELS {
    system = "prometheus",
    severity = "warning",
  }
  ANNOTATIONS {
    summary = "Prometheus has issues compacting sample blocks",
    description = "{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}} compaction failures over the last four hours.",
    runbook = "http://eng-doc/runbooks/prometheus/#prometheustsdbcompactionsfailing",
  }

ALERT PrometheusTSDBWALCorruptions
  IF tsdb_wal_corruptions_total > 0
  FOR 4h
  LABELS {
    system = "prometheus",
    severity = "warning",
  }
  ANNOTATIONS {
    summary = "Prometheus write-ahead log is corrupted",
    description = "{{$labels.job}} at {{$labels.instance}} has a corrupted write-ahead log (WAL).",
    runbook = "http://eng-doc/runbooks/prometheus/#prometheustsdbwalcorruptions",
  }
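
The rules above use the 1.x syntax, so they are meant to be loaded by a 1.x server that scrapes the 2.0 instance. If they were instead loaded by a 2.x server, they would need to be in the YAML rule file format. A minimal sketch of the first alert in that format follows. The group name `prometheus-tsdb` is a made-up placeholder that is not part of the original gist. The expression, labels, and annotations are copied unchanged.

```yaml
groups:
  # Group name is hypothetical; pick whatever fits your setup.
  - name: prometheus-tsdb
    rules:
      - alert: PrometheusTSDBReloadsFailing
        expr: increase(prometheus_tsdb_reloads_failures_total[4h]) > 0
        for: 1d
        labels:
          system: prometheus
          severity: warning
        annotations:
          summary: Prometheus has issues reloading data blocks from disk
          description: '{{$labels.job}} at {{$labels.instance}} had {{$value | humanize}} reload failures over the last four hours.'
          runbook: http://eng-doc/runbooks/prometheus/#prometheustsdbreloadsfailing
```

The other two alerts would translate the same way: `ALERT` becomes `alert:`, `IF` becomes `expr:`, `FOR` becomes `for:`, and the `LABELS` and `ANNOTATIONS` blocks become YAML mappings.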