App resource monitoring
- Data should be kept for the whole lifetime of a long-running app
- Data should be kept for months after an app dies
- Support querying data per container (see the query sketch after this list)
- Support querying data per app
- Support querying data per cluster
- Tens of thousands of collectors
- Load balancing across multiple query engines
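Tag-based storage can cover all three query scopes with one schema. A minimal Go sketch, assuming points carry hypothetical `container`, `app`, and `cluster` tags and are served by OpenTSDB's `/api/query` endpoint (host/port are placeholders):

```go
// Sketch: per-container / per-app / per-cluster queries against OpenTSDB's
// HTTP API. The tag names and endpoint are assumptions, not from these notes.
package main

import (
	"fmt"
	"net/url"
)

// buildQuery returns an OpenTSDB /api/query URL that aggregates a metric
// over the given tag filter, e.g. {app=web} or {container=abc123}.
func buildQuery(host, metric string, start int64, tags map[string]string) string {
	filter := ""
	for k, v := range tags {
		if filter != "" {
			filter += ","
		}
		filter += k + "=" + v
	}
	m := fmt.Sprintf("sum:%s{%s}", metric, filter)
	return fmt.Sprintf("http://%s/api/query?start=%d&m=%s", host, start, url.QueryEscape(m))
}

func main() {
	// Same metric, three query scopes: container, app, cluster.
	fmt.Println(buildQuery("tsdb:4242", "sys.cpu.user", 1356998400, map[string]string{"container": "abc123"}))
	fmt.Println(buildQuery("tsdb:4242", "sys.cpu.user", 1356998400, map[string]string{"app": "web"}))
	fmt.Println(buildQuery("tsdb:4242", "sys.cpu.user", 1356998400, map[string]string{"cluster": "cn-east-1"}))
}
```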
Comparison of Prometheus, OpenTSDB, InfluxDB, Graphite, and Zabbix
name | storage design | performance | scale | availability | reference |
---|---|---|---|---|---|
OpenTSDB | HBase | a single TSD handles thousands of writes per second; a cluster scales to millions | tens of thousands | HBase HA + HAProxy | http://www.searchtb.com/2012/07/opentsdb-monitoring-system.html |
InfluxDB | RocksDB/LevelDB engines; SQL-like query language | - | pre-configured shards | HAProxy + double write (sketch below) | - |
Prometheus | custom local storage | - | requires explicit sharding once a single node's capacity is exceeded | - | - |
Graphite | Whisper (fixed-size, RRD-style files) | - | - | - | - |
Zabbix | RDBMS (MySQL/PostgreSQL) | - | - | - | - |
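The "double write" entry above is a client-side availability scheme: every point goes to two independent nodes, and reads fail over through HAProxy. A rough Go sketch, assuming two hypothetical InfluxDB 1.x endpoints and its line-protocol `/write` API:

```go
// Sketch of the "HAProxy + double write" scheme from the table: the client
// writes each point to two independent InfluxDB nodes. Endpoints, database
// name, and sample point are assumptions for illustration.
package main

import (
	"fmt"
	"net/http"
	"strings"
)

func doubleWrite(line string) {
	endpoints := []string{
		"http://influx-a:8086/write?db=monitoring",
		"http://influx-b:8086/write?db=monitoring",
	}
	for _, ep := range endpoints {
		resp, err := http.Post(ep, "text/plain", strings.NewReader(line))
		if err != nil {
			fmt.Println("write failed:", ep, err) // one node down: the other copy survives
			continue
		}
		resp.Body.Close()
	}
}

func main() {
	// InfluxDB 1.x line protocol: measurement,tags fields timestamp(ns)
	doubleWrite("sys.cpu.user,host=webserver01,cpu=0 value=42.5 1356998400000000000")
}
```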
Time-series DBMS popularity ranking: http://db-engines.com/en/ranking/time+series+dbms
Data is stored with tags: a query like `start=1356998400&m=sum:sys.cpu.user{host=webserver01}` selects a metric by tag, and a time series is identified as `sys.cpu.user host=webserver01,cpu=0`
At most 8 tags per data point (OpenTSDB default)
Each metric name, tag key, and tag value is interned as a 3-byte UID
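A sketch of writing one such tagged point through OpenTSDB's JSON `/api/put` endpoint (host and port are assumptions); the tag map is where the 8-tag default limit and the per-string 3-byte UIDs apply:

```go
// Sketch: write a tagged data point via OpenTSDB's HTTP /api/put endpoint.
// OpenTSDB interns each metric name, tag key, and tag value as a 3-byte UID
// and by default allows at most 8 tags per point, so tag sets should stay
// small and low-cardinality. The tsdb host below is an assumption.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type dataPoint struct {
	Metric    string            `json:"metric"`
	Timestamp int64             `json:"timestamp"`
	Value     float64           `json:"value"`
	Tags      map[string]string `json:"tags"` // at most 8 by default
}

func main() {
	p := dataPoint{
		Metric:    "sys.cpu.user",
		Timestamp: 1356998400,
		Value:     42.5,
		Tags:      map[string]string{"host": "webserver01", "cpu": "0"},
	}
	body, _ := json.Marshal(p)
	resp, err := http.Post("http://tsdb:4242/api/put", "application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Println("put failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```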
cAdvisor as the per-container metrics collector (https://github.com/google/cadvisor)
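A small Go sketch of pulling container metrics from cAdvisor's Prometheus-format `/metrics` endpoint (cAdvisor listens on port 8080 by default; the metric name used for filtering is illustrative):

```go
// Sketch: scrape a local cAdvisor instance and print its per-container CPU
// counters. The localhost address and the filtered metric name are
// assumptions for illustration.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	resp, err := http.Get("http://localhost:8080/metrics")
	if err != nil {
		fmt.Println("cAdvisor not reachable:", err)
		return
	}
	defer resp.Body.Close()

	sc := bufio.NewScanner(resp.Body)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // metric lines can be long
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "container_cpu_usage_seconds_total") {
			fmt.Println(line)
		}
	}
}
```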
Storage-engine benchmark (LevelDB vs RocksDB vs HyperLevelDB vs LMDB) for InfluxDB: https://influxdata.com/blog/benchmarking-leveldb-vs-rocksdb-vs-hyperleveldb-vs-lmdb-performance-for-influxdb/