cAdvisor doesn't monitor container restarts, but it does pass through the Docker restart count to you as the restartcount label, which shows up on its metrics as container_label_restartcount.
Unfortunately, because it is a label rather than a metric value, you cannot graph it or alert on it directly. It also looks like cAdvisor does not plan to support this soon, as the issue has been closed.
I am not proud of this and hope cAdvisor will support it more easily in the future, but it does work well. Here is a Prometheus recording rule you can use to define this metric.
Limitation: If the restart count exceeds 99,999, the behavior is undefined.
groups:
  - name: cadvisor-restart-count.rules
    rules:
      # See https://gist.github.com/slimsag/85e06781eb0d4d35beee12916aefac5f
      - record: container_restart_count
        expr: |-
          (
            (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1$"}) * 1)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2$"}) * 2)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3$"}) * 3)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4$"}) * 4)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5$"}) * 5)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6$"}) * 6)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7$"}) * 7)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8$"}) * 8)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9$"}) * 9)
          )
          +
          (
            (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^.?$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0.$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1.$"}) * 10)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2.$"}) * 20)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3.$"}) * 30)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4.$"}) * 40)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5.$"}) * 50)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6.$"}) * 60)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7.$"}) * 70)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8.$"}) * 80)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9.$"}) * 90)
          )
          +
          (
            (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^.{0,2}$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0..$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1..$"}) * 100)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2..$"}) * 200)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3..$"}) * 300)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4..$"}) * 400)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5..$"}) * 500)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6..$"}) * 600)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7..$"}) * 700)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8..$"}) * 800)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9..$"}) * 900)
          )
          +
          (
            (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^.{0,3}$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0...$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1...$"}) * 1000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2...$"}) * 2000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3...$"}) * 3000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4...$"}) * 4000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5...$"}) * 5000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6...$"}) * 6000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7...$"}) * 7000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8...$"}) * 8000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9...$"}) * 9000)
          )
          +
          (
            (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^.{0,4}$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0....$"}) * 0)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1....$"}) * 10000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2....$"}) * 20000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3....$"}) * 30000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4....$"}) * 40000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5....$"}) * 50000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6....$"}) * 60000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7....$"}) * 70000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8....$"}) * 80000)
            or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9....$"}) * 90000)
          )

The first section:

(
  (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^$"}) * 0)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0$"}) * 0)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1$"}) * 1)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2$"}) * 2)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3$"}) * 3)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4$"}) * 4)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5$"}) * 5)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6$"}) * 6)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7$"}) * 7)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8$"}) * 8)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9$"}) * 9)
)

extracts the last digit of the restartcount label. Each alternative matches values ending in one particular digit; count by (name) yields 1 for the matching container, and multiplying by the digit turns that match into a number. The first case, ^$, is for when there is no restartcount label.

Then, we add the second-to-last digit:

+
(
  (count by (name)(container_spec_cpu_shares{container_label_restartcount=~"^.?$"}) * 0)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*0.$"}) * 0)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*1.$"}) * 10)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*2.$"}) * 20)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*3.$"}) * 30)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*4.$"}) * 40)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*5.$"}) * 50)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*6.$"}) * 60)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*7.$"}) * 70)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*8.$"}) * 80)
  or (count by (name)(container_spec_cpu_shares{container_label_restartcount=~".*9.$"}) * 90)
)

Similarly, the first case, ^.?$, handles the restartcount being missing or a single-digit number, where there is no tens digit to add. Every section needs a zero case like this for every value it does not otherwise match (a shorter number, or a 0 in that position), because + only keeps containers that appear on both sides: a container missing from any one section would be dropped from the result entirely.

This is repeated for each further digit to handle restartcount values in the range of 0-99,999.
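Once the recording rule is loaded, container_restart_count can be used like any other gauge. As a minimal sketch of what that enables, an alert on recent restarts might look like the following. The group name, alert name, 15-minute window, and severity label are illustrative assumptions, not part of the original rule.

```yaml
groups:
  - name: cadvisor-restart-alerts   # hypothetical group name
    rules:
      # Fires when the recorded restart count changed at least once
      # in the last 15 minutes. Window and severity are assumptions.
      - alert: ContainerRestarted
        expr: changes(container_restart_count[15m]) > 0
        labels:
          severity: warning
        annotations:
          summary: 'Container {{ $labels.name }} restarted in the last 15 minutes'
```

changes() is used here rather than increase() because the recorded series is a gauge rebuilt from a label, not a native counter.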
This is working for me => https://stackoverflow.com/a/63782891
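One way to check that the rule also works in your setup is a promtool unit test. The sketch below assumes the recording rule is saved as cadvisor-restart-count.rules.yml (a hypothetical filename) and feeds it two made-up containers, web and db, with restart counts of 42 and 7.

```yaml
# Run with: promtool test rules restart-count.test.yml
rule_files:
  - cadvisor-restart-count.rules.yml   # assumed filename for the rule above

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Made-up containers; only the restartcount label matters to the rule.
      - series: 'container_spec_cpu_shares{name="web", container_label_restartcount="42"}'
        values: '1024+0x5'
      - series: 'container_spec_cpu_shares{name="db", container_label_restartcount="7"}'
        values: '1024+0x5'
    promql_expr_test:
      - expr: container_restart_count
        eval_time: 1m
        exp_samples:
          - labels: 'container_restart_count{name="web"}'
            value: 42
          - labels: 'container_restart_count{name="db"}'
            value: 7
```

Adding further input series with other restart counts is a cheap way to probe the edge cases you care about before relying on the metric for alerting.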