@mikejoh
Last active January 3, 2025 10:03
Prometheus troubleshooting

List the top 10 metrics by number of time series

This query counts the time series per metric name and job, which gives a hint about which metrics have the highest cardinality.

topk(10, count by (__name__, job)({__name__=~".+"}))
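
To make clear what the query above computes, here is a minimal Python sketch of the same grouping logic applied to an in-memory list of label sets. The function name `top_series_counts` and the sample series are hypothetical and only for illustration; Prometheus evaluates the real query against its own TSDB.

```python
from collections import Counter


def top_series_counts(series, k=10):
    """Count series per (__name__, job) pair and return the k largest groups."""
    counts = Counter(
        (labels.get("__name__", ""), labels.get("job", "")) for labels in series
    )
    return counts.most_common(k)


# Hypothetical label sets, one dict per time series.
example_series = [
    {"__name__": "http_requests_total", "job": "api", "code": "200"},
    {"__name__": "http_requests_total", "job": "api", "code": "404"},
    {"__name__": "http_requests_total", "job": "api", "code": "500"},
    {"__name__": "up", "job": "api"},
    {"__name__": "up", "job": "node"},
]

for (name, job), n in top_series_counts(example_series, k=2):
    print(f"{name} (job={job}): {n} series")
```

Each extra label value (here `code`) adds another series to its group, which is why metrics with many label combinations end up at the top of the list.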

Calculate the disk space required by Prometheus

Formula:

needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample

Example values, each followed by the query that produces it:

  • retention_time_seconds: 864000 (10 days)
    prometheus_tsdb_retention_limit_seconds
  • ingested_samples_per_second: 21359.936262203624
    rate(prometheus_tsdb_head_samples_appended_total[2h])
  • bytes_per_sample: 1.640350514654668
    rate(prometheus_tsdb_compaction_chunk_size_bytes_sum[2h]) / rate(prometheus_tsdb_compaction_chunk_samples_sum[2h])

This means we'll need 30272644028.76188 bytes of storage, which is about 28 GiB (or about 30 GB in decimal units).
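
The arithmetic can be checked with a short Python sketch that plugs the example values into the formula. The function name `needed_disk_space_bytes` is made up for this illustration; the three inputs are the values listed above.

```python
def needed_disk_space_bytes(retention_time_seconds,
                            ingested_samples_per_second,
                            bytes_per_sample):
    """Apply the formula: retention * ingestion rate * bytes per sample."""
    return retention_time_seconds * ingested_samples_per_second * bytes_per_sample


# Example values taken from the metrics above.
retention_time_seconds = 864000                   # 10 days
ingested_samples_per_second = 21359.936262203624
bytes_per_sample = 1.640350514654668

needed = needed_disk_space_bytes(
    retention_time_seconds, ingested_samples_per_second, bytes_per_sample
)

print(f"{needed:.0f} bytes")
print(f"{needed / 1e9:.1f} GB (decimal, 10^9 bytes)")
print(f"{needed / 2**30:.1f} GiB (binary, 2^30 bytes)")
```

The result is an estimate based on current ingestion and compression rates, so it is worth leaving some headroom when sizing the volume.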
