This sprint there are two issues in the product backlog about dashboards and metrics.
The current deis-router dashboard shows response time, status code, requests per second, CPU, and memory. We get CPU and memory from the Kubernetes Prometheus endpoint. The reason the router has additional metrics beyond CPU and memory is that its log lines carry them, for example:
[2016-08-11T19:50:32+00:00] - deis/deis-monitor-grafana - 10.240.0.23 - - - 200 - "GET /api/datasources/proxy/1/query?db=kubernetes&q=SELECT%20last(%22gauge%22)%20FROM%20%22container_memory_usage_bytes%22%20WHERE%20%22kubernetes_container_name%22%20%3D%20%27deis-logger-redis%27%20AND%20time%20%3E%20now()%20-%205m%20GROUP%20BY%20time(2s)%20fill(null)&epoch=ms HTTP/1.1" - 772 - "http://grafana.104.154.18.233.nip.io/dashboard/db/redis" - "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:47.0) Gecko/20100101 Firefox/47.0" - "~^grafana\x5C.(?<domain>.+)$" - 10.135.243.27:80 - grafana.104.154.18.233.nip.io - 0.088 - 0.088
The log line above contains a timestamp, a status code, and the response time. fluentd collects these lines and sends them to NSQ, where the log is formatted as Influx JSON and sent on to telegraf.
Other components do not log in this form. For example, a controller log line looks like this: INFO Pulling Docker image localhost:5555/jaffa:v2
For other components to show router-like metrics, they have to be modified to emit metrics in their logs, or to send them to NSQ some other way so that they end up in telegraf.
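To make the difference concrete, here is a minimal Python sketch of pulling the metric fields out of the router line. It is not the actual fluentd/NSQ pipeline code. The regex, the function name `parse_router_line`, and the output keys are all assumptions made for illustration, and it assumes the last field of the line is the response time.

```python
import json
import re

# The router log line quoted above, split across literals only for readability.
ROUTER_LINE = (
    r'[2016-08-11T19:50:32+00:00] - deis/deis-monitor-grafana - 10.240.0.23 - - - 200 - '
    r'"GET /api/datasources/proxy/1/query?db=kubernetes&q=SELECT%20last(%22gauge%22)%20FROM%20'
    r'%22container_memory_usage_bytes%22%20WHERE%20%22kubernetes_container_name%22%20%3D%20'
    r'%27deis-logger-redis%27%20AND%20time%20%3E%20now()%20-%205m%20GROUP%20BY%20time(2s)%20'
    r'fill(null)&epoch=ms HTTP/1.1" - 772 - '
    r'"http://grafana.104.154.18.233.nip.io/dashboard/db/redis" - '
    r'"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:47.0) Gecko/20100101 Firefox/47.0" - '
    r'"~^grafana\x5C.(?<domain>.+)$" - 10.135.243.27:80 - grafana.104.154.18.233.nip.io - '
    r'0.088 - 0.088'
)
CONTROLLER_LINE = "INFO Pulling Docker image localhost:5555/jaffa:v2"

# Assumed layout: [timestamp] - app - addr - user - status - "request" - ... - response time
ROUTER_RE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\] - (?P<app>\S+) - \S+ - \S+ - (?P<status>\d{3}) - '
    r'"[^"]*" - .* - (?P<response_time>\d+\.\d+)$'
)


def parse_router_line(line):
    """Return the metric fields of a router-style line, or None if it has none."""
    match = ROUTER_RE.match(line)
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "app": match.group("app"),
        "status": int(match.group("status")),
        "response_time": float(match.group("response_time")),
    }


print(json.dumps(parse_router_line(ROUTER_LINE)))
print(parse_router_line(CONTROLLER_LINE))  # the controller line has nothing to extract
```

The router line yields a status code and a response time that can be graphed, while the controller line yields nothing, which is why the other components need changes before they can get a dashboard like the router's.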
Now, on the default metric values to set in the manifests for each deis component: we have a concrete problem here. We don't know the cluster size the customer is using, and we have no information about when a metric will spike for the other deis components, since they currently expose no metrics other than CPU and memory.
For now, what can we achieve for the metrics dashboard and for setting limits on components?
- For the metrics dashboard, we already collect CPU and memory for every component in the deis-health dashboard.
- The deis-health dashboard has a single graph that shows everything. We can split that graph into individual panels, one per component.
- For setting limits on deis components, give the user an interface to set custom limits in each chart.
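
The last point could look roughly like the Python sketch below: since we cannot pick good defaults, user-supplied values are merged into a container spec's `resources.limits` block (the standard Kubernetes field) and nothing is set when the user gives nothing. The helper name `with_custom_limits`, the container entry, and the example values are hypothetical, not existing chart code or recommended limits.

```python
import copy


def with_custom_limits(container, cpu=None, memory=None):
    """Return a copy of a container spec with user-supplied limits merged in.

    Values the user did not provide are left out, so no default is imposed.
    """
    result = copy.deepcopy(container)
    overrides = {
        key: value
        for key, value in (("cpu", cpu), ("memory", memory))
        if value is not None
    }
    if overrides:
        limits = result.setdefault("resources", {}).setdefault("limits", {})
        limits.update(overrides)
    return result


# Hypothetical container entry; the image is a placeholder, not a real tag.
router = {"name": "deis-router", "image": "example/router:placeholder"}

# Example values only, chosen by the user for their own cluster size.
limited = with_custom_limits(router, cpu="500m", memory="256Mi")
print(limited["resources"])

# Without user input the spec is returned unchanged.
print(with_custom_limits(router) == router)
```

Keeping the unset case a no-op means a chart rendered without any user input behaves exactly as it does today.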