This document provides an overview of all metrics generated by the llm-d
components. The llm-d system uses Prometheus as the primary metrics collection
framework, with metrics covering inference performance, resource utilization,
error rates, and energy consumption across multiple components.