Inspired by https://github.com/kubernetes/kube-state-metrics, this is a proposal for a Prometheus metrics exporter that exports metrics solely for Gateway API resources. The exporter will watch the Kubernetes API server for Gateway API resources and convert their contents into metrics with labels and values.
A standard set of metrics for Gateway API resources will allow further standardization and sharing of things like:
- Alert queries and recording rules
- A General Gateway API Dashboard
Additionally, the metrics available from this exporter could provide the glue for doing more complex and useful queries when combined with metrics from underlying gateway providers.
Goals:
- A metrics exporter component that exposes metrics for a core set of Gateway API resources on an HTTP endpoint in the Prometheus exposition format
- The exposed metrics follow the same conventions as kube-state-metrics
- The exporter will follow best practices and guidelines
Non-goals:
- Implementing metrics for all Gateway API resources. Additional metrics can be added later
- Any example dashboards or alerts that make use of the new metrics
This proposal focuses on an initial set of metrics for the core Gateway API resources:
- Gateway
- GatewayClass
- HTTPRoute
The reasons for choosing these three resources are:
- to limit the scope while a pattern is established for naming, labels and resource relationships without getting bogged down in content
- because these resources are the main resources used in the Kuadrant project day to day, and value can be added there quickly
Information about a Gateway, Gauge
gatewayapi_gateway_info{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>"} 1
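As a minimal sketch of how a watched Gateway object might be turned into the info series above, the following renders one line of the Prometheus text exposition format from a plain dictionary. The helper names (`escape_label_value`, `render_sample`, `gateway_info`) are hypothetical and not part of any existing exporter; the object shape assumes the usual `metadata.namespace`, `metadata.name` and `spec.gatewayClassName` fields.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, labels: dict, value) -> str:
    """Render a single sample line: name{label="value",...} value."""
    body = ",".join(f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items())
    return f"{name}{{{body}}} {value}"


def gateway_info(gateway: dict) -> str:
    """Build the gatewayapi_gateway_info series for one Gateway (hypothetical helper)."""
    labels = {
        "namespace": gateway["metadata"]["namespace"],
        "gateway": gateway["metadata"]["name"],
        "gatewayclass": gateway["spec"]["gatewayClassName"],
    }
    # Info metrics always carry the constant value 1; the labels hold the information.
    return render_sample("gatewayapi_gateway_info", labels, 1)


if __name__ == "__main__":
    example = {
        "metadata": {"namespace": "default", "name": "prod-web"},
        "spec": {"gatewayClassName": "istio"},
    }
    print(gateway_info(example))
```

The same `render_sample` shape would apply to every other metric in this proposal; only the metric name, label set and value change.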
Unix creation timestamp in seconds, Gauge
gatewayapi_gateway_created{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>"} 1690879977
Unix deletion timestamp in seconds, Gauge
gatewayapi_gateway_deletion_timestamp{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>"} 1690879977
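The two timestamp gauges above carry Unix seconds, while Kubernetes stores `metadata.creationTimestamp` and `metadata.deletionTimestamp` as RFC 3339 strings. A possible conversion is sketched below. The function names are hypothetical, and emitting no deletion series for objects that are not being deleted is an assumption rather than something this proposal specifies.

```python
from datetime import datetime, timezone


def unix_seconds(rfc3339: str) -> int:
    """Convert a Kubernetes UTC timestamp such as 2023-08-01T08:52:57Z to Unix seconds."""
    parsed = datetime.strptime(rfc3339, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def timestamp_samples(gateway: dict) -> dict:
    """Map metric name to value for the timestamp gauges of one Gateway."""
    meta = gateway["metadata"]
    samples = {"gatewayapi_gateway_created": unix_seconds(meta["creationTimestamp"])}
    # deletionTimestamp is only set once deletion has been requested.
    # Assumption: no series is exported when the field is absent.
    if "deletionTimestamp" in meta:
        samples["gatewayapi_gateway_deletion_timestamp"] = unix_seconds(meta["deletionTimestamp"])
    return samples
```

For example, `unix_seconds("2023-08-01T08:52:57Z")` yields `1690879977`, the sample value used in the metric examples.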
Per Listener information, Gauge
gatewayapi_gateway_listener{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",listener="<LISTENER_NAME>",port="<PORT>",protocol="<PROTOCOL>"} 1
Potential additional labels for optional fields include hostname and tls-mode.
AllowedRoutes and CertificateRefs would need some thought if they are to be represented as metrics, as they are lists. Perhaps separate metrics, e.g. gatewayapi_gateway_listener_allowed_route{} and gatewayapi_gateway_listener_certificate_ref{}.
While technically possible, this could generate an excessive number of metrics.
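The per-listener metric produces one series per entry in `spec.listeners`. The sketch below builds the label sets for gatewayapi_gateway_listener, including the optional hostname and TLS mode discussed above. It is illustrative only: the function name is hypothetical, and the label is spelled `tls_mode` because Prometheus label names cannot contain hyphens.

```python
def listener_labels(gateway: dict) -> list:
    """Return one label set per listener of a Gateway (hypothetical helper)."""
    base = {
        "namespace": gateway["metadata"]["namespace"],
        "gateway": gateway["metadata"]["name"],
        "gatewayclass": gateway["spec"]["gatewayClassName"],
    }
    series = []
    for listener in gateway["spec"].get("listeners", []):
        labels = dict(
            base,
            listener=listener["name"],
            port=str(listener["port"]),  # label values are always strings
            protocol=listener["protocol"],
        )
        # Optional fields only become labels when they are set on the listener.
        if "hostname" in listener:
            labels["hostname"] = listener["hostname"]
        tls_mode = listener.get("tls", {}).get("mode")
        if tls_mode:
            labels["tls_mode"] = tls_mode
        series.append(labels)
    return series
```

Scalar fields like these add no extra series. List-valued fields would multiply the series count by the list length, which is the cardinality concern raised above.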
Status Conditions of Gateway, Gauge, 1 or 0 (1 means this condition currently applies to this gateway)
gatewayapi_gateway_status_condition{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",condition="<Accepted|Scheduled|Ready>",status="<true|false|unknown>"} 1
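One way to read the "1 or 0" convention, following the pattern kube-state-metrics uses for its condition metrics, is to emit a series for every possible status of each condition and set only the current one to 1. The sketch below assumes that interpretation; the function name and the lower-casing of Kubernetes' `True`/`False`/`Unknown` values are illustrative choices, not decisions made by this proposal.

```python
STATUSES = ("true", "false", "unknown")


def condition_samples(conditions: list) -> list:
    """Expand status conditions into (labels, value) pairs, one per possible status."""
    samples = []
    for condition in conditions:
        current = condition["status"].lower()  # Kubernetes reports True, False or Unknown
        for status in STATUSES:
            labels = {"condition": condition["type"], "status": status}
            # 1 marks the status that currently applies, 0 the others.
            samples.append((labels, 1 if status == current else 0))
    return samples
```

With this shape, a query such as `{condition="Ready",status!="true"} > 0` selects only resources whose Ready condition is currently false or unknown.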
Number of attached routes for an individual listener, Gauge
gatewayapi_gateway_status_listener_attached_routes{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",listener="<LISTENER_NAME>"} 5
Status conditions of individual listeners, Gauge, 1 or 0 (1 means this condition currently applies to this listener)
gatewayapi_gateway_status_listener_condition{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",listener="<LISTENER_NAME>",condition="<Accepted|Conflicted|Detached|Programmed|Ready|ResolvedRefs>",status="<true|false|unknown>"} 1
Address types and values, Gauge
gatewayapi_gateway_status_address{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",type="<IPAddress|Hostname>",value="<ADDRESS>"} 1
gatewayapi_gateway_status_address{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",type="<DOMAIN_PREFIXED_STRING_IDENTIFIER>",value="<ADDRESS>"} 1
Kubernetes annotations converted to Prometheus labels, Gauge
gatewayapi_gateway_annotations{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",annotation_GATEWAY_ANNOTATION="<GATEWAY_ANNOTATION>"} 1
Kubernetes labels converted to Prometheus labels, Gauge
gatewayapi_gateway_labels{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",label_GATEWAY_LABEL="<GATEWAY_LABEL>"} 1
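Kubernetes label and annotation keys often contain characters such as `.`, `/` and `-` that are not valid in Prometheus label names. A common approach, and the one kube-state-metrics takes, is to replace such characters with underscores. The sketch below assumes the same rule; the function names are hypothetical.

```python
import re

# Characters outside [a-zA-Z0-9_] are not allowed in Prometheus label names.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def prom_label_name(prefix: str, key: str) -> str:
    """Turn a Kubernetes label or annotation key into a Prometheus label name."""
    return f"{prefix}_{_INVALID_CHARS.sub('_', key)}"


def kube_metadata_to_prom(prefix: str, metadata: dict) -> dict:
    """Convert a map of Kubernetes labels or annotations into Prometheus labels."""
    return {prom_label_name(prefix, key): value for key, value in metadata.items()}
```

For example, `prom_label_name("label", "app.kubernetes.io/name")` gives `label_app_kubernetes_io_name`. Note that two distinct keys can collapse to the same name under this rule (for instance `a.b` and `a_b`), which an implementation would need to handle.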
Find all Gateways not in a Ready state
gatewayapi_gateway_status_condition{condition="Ready",status!="true"} > 0
Count the number of listeners across all gateways
count(gatewayapi_gateway_listener)
Find any Gateway listeners with 0 attached routes
gatewayapi_gateway_status_listener_attached_routes == 0
Find any listeners not in a Ready state
gatewayapi_gateway_status_listener_condition{condition="Ready",status!="true"} > 0
Information about a GatewayClass, Gauge
gatewayapi_gatewayclass_info{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1
ParametersReference would need some thought if it is to be represented as a metric, as it is a structured reference (group, kind, name and an optional namespace) rather than a single value. Perhaps a separate metric, e.g. gatewayapi_gatewayclass_parameter_ref{}
The GatewayClass description is an optional field that has limited value in a metric, so it doesn't make sense to include it as a label.
Unix creation timestamp in seconds, Gauge
gatewayapi_gatewayclass_created{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1690879977
Unix deletion timestamp in seconds, Gauge
gatewayapi_gatewayclass_deletion_timestamp{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1690879977
Status Conditions of GatewayClass, Gauge, 1 or 0 (1 means this condition currently applies to this GatewayClass)
gatewayapi_gatewayclass_status_condition{gatewayclass="<GATEWAYCLASS_NAME>",condition="Accepted",status="<true|false|unknown>"} 1
Kubernetes annotations converted to Prometheus labels, Gauge
gatewayapi_gatewayclass_annotations{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>",annotation_GATEWAYCLASS_ANNOTATION="<GATEWAYCLASS_ANNOTATION>"} 1
Kubernetes labels converted to Prometheus labels, Gauge
gatewayapi_gatewayclass_labels{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>",label_GATEWAYCLASS_LABEL="<GATEWAYCLASS_LABEL>"} 1
Find any GatewayClasses that are not in an accepted state
gatewayapi_gatewayclass_status_condition{condition="Accepted",status!="true"} > 0
Get the GatewayClass info (e.g. controller name) of any Gateways that are not in a Ready state
(gatewayapi_gateway_status_condition{condition="Ready",status!="true"} > 0)
* on(gatewayclass) group_right gatewayapi_gatewayclass_info
Information about a HTTPRoute, Gauge
gatewayapi_httproute_info{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>"} 1
Unix creation timestamp in seconds, Gauge
gatewayapi_httproute_created{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>"} 1690879977
Unix deletion timestamp in seconds, Gauge
gatewayapi_httproute_deletion_timestamp{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>"} 1690879977
Parent References that the route wants to be attached to, Gauge
gatewayapi_httproute_parent_ref{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",parent_ref_group="<GROUP>",parent_ref_kind="<KIND>",parent_ref_namespace="<PARENT_REF_NAMESPACE>",parent_ref_name="<PARENT_REF_NAME>",parent_ref_section_name="<PARENT_REF_SECTION_NAME>",parent_ref_port="<PARENT_REF_PORT>"} 1
- To avoid confusion, the parent ref namespace will use the label parent_ref_namespace instead of namespace (which is the namespace of the HTTPRoute). Then, to keep the pattern consistent, all other parent ref fields will be prefixed with parent_ref_, e.g. parent_ref_group
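The parent_ref_ labelling scheme described above could be implemented along these lines. In the Gateway API, a parent reference defaults to group `gateway.networking.k8s.io` and kind `Gateway`, and an omitted namespace means the route's own namespace. Whether the exporter fills in those defaults or exports empty label values is an assumption of this sketch, as is the function name.

```python
def parent_ref_labels(route_namespace: str, parent_ref: dict) -> dict:
    """Build the parent_ref_* labels for one entry of an HTTPRoute's spec.parentRefs."""
    return {
        # group and kind are normally defaulted by the API server; the fallbacks are defensive.
        "parent_ref_group": parent_ref.get("group", "gateway.networking.k8s.io"),
        "parent_ref_kind": parent_ref.get("kind", "Gateway"),
        # An omitted namespace refers to the namespace of the route itself.
        "parent_ref_namespace": parent_ref.get("namespace", route_namespace),
        "parent_ref_name": parent_ref["name"],
        # Optional fields become empty label values when unset.
        "parent_ref_section_name": parent_ref.get("sectionName", ""),
        "parent_ref_port": str(parent_ref.get("port", "")),
    }
```

Resolving the namespace up front keeps joins against Gateway metrics simple, since `parent_ref_namespace` and `parent_ref_name` can then be matched directly to a Gateway's `namespace` and `gateway` labels.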
Hostnames to match against the HTTP Host header, Gauge
gatewayapi_httproute_hostname{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",hostname="<HOSTNAME>"} 1
Rules would need some thought if they are to be represented as metrics, as there are nested lists for matches, filters and backendRefs.
While technically possible, this could generate an excessive number of metrics.
HTTPRouteStatus parent status conditions, Gauge
gatewayapi_httproute_status_parent_condition{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",condition="<Accepted|ResolvedRefs>",status="<true|false|unknown>",parent_ref_group="<GROUP>",parent_ref_kind="<KIND>",parent_ref_namespace="<PARENT_REF_NAMESPACE>",parent_ref_name="<PARENT_REF_NAME>",parent_ref_section_name="<PARENT_REF_SECTION_NAME>",parent_ref_port="<PARENT_REF_PORT>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1
Kubernetes annotations converted to Prometheus labels, Gauge
gatewayapi_httproute_annotations{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",annotation_HTTPROUTE_ANNOTATION="<HTTPROUTE_ANNOTATION>"} 1
Kubernetes labels converted to Prometheus labels, Gauge
gatewayapi_httproute_labels{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",label_HTTPROUTE_LABEL="<HTTPROUTE_LABEL>"} 1
Find any HTTPRoutes that haven't been Accepted by a parent.
gatewayapi_httproute_status_parent_condition{condition="Accepted",status!="true"} > 0
Get any non-true Gateway listener status conditions for HTTPRoutes in the default namespace that haven't been Accepted by a parent.
(gatewayapi_gateway_status_listener_condition{status!="true"} > 0)
* on(gateway)
label_replace(
gatewayapi_httproute_status_parent_condition{namespace="default",condition="Accepted",status!="true"},
"gateway","$1","parent_ref_name", "(.+)"
)