prometheus-config.md

That's a good question. It mostly comes down to how many individual metrics you expose and how many samples per second you plan to ingest. The number of actual targets isn't as big an issue, since each scrape is just a cheap HTTP GET, but sample ingestion takes some work.
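
Since the bottleneck is ingestion rather than target count, a useful first estimate is simply active series divided by scrape interval. A minimal sketch, where the target count, series per target, and 15s scrape interval are illustrative assumptions rather than figures from this discussion:

```python
# Back-of-the-envelope ingestion estimate. The target count, series-per-target
# figure, and 15s scrape interval are illustrative assumptions.
targets = 1000                 # endpoints Prometheus scrapes
series_per_target = 1500       # distinct time series each target exposes
scrape_interval_s = 15         # seconds between scrapes of each target

active_series = targets * series_per_target
samples_per_second = active_series / scrape_interval_s

print(f"active series:      {active_series:,}")
print(f"samples per second: {samples_per_second:,.0f}")
```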

RAM is a big factor (a rough sizing sketch follows this list):

  • It limits how much data you can crunch with queries
  • It limits how much data can be buffered before writing to the disk storage
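
A crude way to translate series count into a RAM budget is to multiply by an assumed per-series memory cost. The ~20KB/series figure below is only back-calculated from the example server later in this note (45GB over ~2.3M series), not a documented Prometheus constant, and real usage varies with query load and series churn:

```python
# Very rough RAM sizing sketch. The per-series cost is an assumption inferred
# from the example server below, not an official figure.
active_series = 2_300_000
bytes_per_series = 20 * 1024   # assumed per-series overhead incl. query/write buffers

ram_bytes = active_series * bytes_per_series
print(f"estimated RAM: ~{ram_bytes / 2**30:.0f} GiB")
```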

Network throughput is not a huge issue. A single server with millions of timeseries and 100k samples/second only needs a few megabits/second.
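
To see why the bandwidth stays small, multiply the sample rate by an assumed on-the-wire cost per sample. The 10 bytes/sample figure below is a guess for compressed text exposition payloads, not a measured value:

```python
# Rough network estimate for scrape traffic. The bytes-per-sample figure is an
# assumption; actual payload size depends on metric/label names and compression.
samples_per_second = 100_000
wire_bytes_per_sample = 10     # assumed compressed on-the-wire cost per sample

bits_per_second = samples_per_second * wire_bytes_per_sample * 8
print(f"scrape bandwidth: ~{bits_per_second / 1e6:.0f} Mbit/s")
```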

CPU is important: a large server can easily use many cores.

For example, a Prometheus server configured to monitor just node_exporter metrics:

  • ~1700 nodes
  • ~1400 metrics/node
  • ~2.3M in-memory series
  • ~78k samples/second

This server uses about 45GB of RAM and typically uses about 5 CPUs.

It also needs about 5GB/day of storage space (SSD in this case) with varbit encoding.
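
Those figures hang together arithmetically. A quick sanity check, assuming the storage number means decimal gigabytes:

```python
# Sanity check of the example server's numbers. Only the first four values come
# from the figures above; the derived quantities are rough arithmetic.
nodes = 1700
series_per_node = 1400
samples_per_second = 78_000
disk_gb_per_day = 5

active_series = nodes * series_per_node                       # ~2.4M series
implied_scrape_interval = active_series / samples_per_second  # ~30s
bytes_per_sample_on_disk = disk_gb_per_day * 1e9 / (samples_per_second * 86_400)

print(f"active series:           {active_series:,}")
print(f"implied scrape interval: ~{implied_scrape_interval:.0f}s")
print(f"on-disk cost:            ~{bytes_per_sample_on_disk:.1f} bytes/sample")
```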

We could probably get away with a lot less RAM, but the extra headroom allows for very large historical queries.
