Cloned and made sure the nightly image was up to date:

docker run -it --ipc=host --network=host --group-add render \
    --privileged --security-opt seccomp=unconfined \
    --cap-add=CAP_SYS_ADMIN --cap-add=SYS_PTRACE \
    --device=/dev/kfd --device=/dev/dri --device=/dev/mem \
    -e HF_TOKEN=$HF_TOKEN -e HF_HOME=/data/model_cache \
    -e MODEL=$MODEL \

docker run --rm -it --ipc=host --network=host --group-add render \
    --privileged --security-opt seccomp=unconfined \
    --cap-add=CAP_SYS_ADMIN --cap-add=SYS_PTRACE \
    --device=/dev/kfd --device=/dev/dri --device=/dev/mem \
    -e HF_TOKEN=$HF_TOKEN -e HF_HOME=/data/model_cache \
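
The device and group flags are what expose the AMD GPU to the container: `/dev/kfd` is the ROCm compute (KFD) node, `/dev/dri` holds the render nodes, and `--group-add render` gives the container user permission to open them. `HF_TOKEN` and `HF_HOME=/data/model_cache` point the Hugging Face client at an access token and a shared model cache so weights are not re-downloaded on every run.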
$ ./install.sh
+ set -e
+ set -o pipefail
++ command -v git
+ '[' -z /usr/bin/git ']'
++ command -v kubectl
+ '[' -z '/usr/local/bin/kubectl]'
./install.sh: line 10: [: missing `]'
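
The `missing ]` error is in the kubectl check on line 10: the trace shows the closing bracket glued onto the path (`/usr/local/bin/kubectl]`), so the script is missing a space before `]` and `[` never receives its closing argument. Adding the space, e.g. `[ -z "$(command -v kubectl)" ]`, fixes it.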
$ python ./benchmark-e2e.py --port 8000 --model "meta-llama/Llama-3.2-1B" --cuda-device 0
Using port: 8000
Removing /home/ubuntu/vllm/benchmark-e2e/benchmark-compare
Removing /home/ubuntu/vllm/benchmark-e2e/venv-vllm
Removing /home/ubuntu/vllm/benchmark-e2e/venv-vllm-src
Removing /home/ubuntu/vllm/benchmark-e2e/venv-sgl
▶ git clone https://github.com/neuralmagic/benchmark-compare.git /home/ubuntu/vllm/benchmark-e2e/benchmark-compare
Cloning into '/home/ubuntu/vllm/benchmark-e2e/benchmark-compare'...
remote: Enumerating objects: 78, done.
$ go build
$ ./benchmark-go --port 8000 --model meta-llama/Llama-3.2-1B --cuda-device 0
[main] 2025/04/23 04:20:18 Using port: 8000
[main] 2025/04/23 04:20:18 Removing /home/ubuntu/vllm/benchmark-go/benchmark-compare
[main] 2025/04/23 04:20:18 Removing /home/ubuntu/vllm/benchmark-go/venv-vllm
[main] 2025/04/23 04:20:19 Removing /home/ubuntu/vllm/benchmark-go/venv-vllm-src
[main] 2025/04/23 04:20:19 Removing /home/ubuntu/vllm/benchmark-go/venv-sgl
[main] 2025/04/23 04:20:19 ▶ git clone https://github.com/neuralmagic/benchmark-compare.git /home/ubuntu/vllm/benchmark-go/benchmark-compare
Cloning into '/home/ubuntu/vllm/benchmark-go/benchmark-compare'...
podman run --rm -it \
    --network host \
    -e MODEL=meta-llama/Llama-3.2-1B \
    -e FRAMEWORK=vllm \
    -e HF_TOKEN="${HF_TOKEN}" \
    -e PORT=8000 \
    -e HOST=172.31.37.101 \
    -v "$(pwd)":/host:Z \
    -w /opt/benchmark \
    quay.io/bsalisbu/vllm-benchmark:latest

===== vllm - RUNNING meta-llama/Llama-3.2-1B FOR 120 PROMPTS WITH 1 QPS =====

INFO 04-23 01:38:59 [__init__.py:243] No platform detected, vLLM is running on UnspecifiedPlatform
Namespace(backend='vllm', base_url=None, host='127.0.0.1', port=8000, endpoint='/v1/completions', dataset_name='random', dataset_path=None, max_concurrency=None, model='meta-llama/Llama-3.2-1B', tokenizer=None, use_beam_search=False, num_prompts=120, logprobs=None, request_rate=1.0, burstiness=1.0, seed=1, trust_remote_code=False, disable_tqdm=False, profile=False, save_result=True, save_detailed=False, metadata=['framework=vllm'], result_dir=None, result_filename='results.json', ignore_eos=True, percentile_metrics='ttft,tpot,itl', metric_percentiles='99', goodput=None, sonn
{"date": "20250411-002323", "backend": "vllm", "model_id": "meta-llama/Llama-3.2-1B", "tokenizer_id": "meta-llama/Llama-3.2-1B", "num_prompts": 120, "framework": "vllm", "request_rate": 1.0, "burstiness": 1.0, "max_concurrency": null, "duration": 99.19297069497406, "completed": 120, "total_input_tokens": 120000, "total_output_tokens": 12000, "request_throughput": 1.2097631430861078, "request_goodput:": null, "output_throughput": 120.97631430861077, "total_token_throughput": 1330.7394573947186, "mean_ttft_ms": 56.25359537589247, "median_ttft_ms": 55.28098650393076, "std_ttft_ms": 6.545660891106274, "p99_ttft_ms": 78.70767521642848, "mean_tpot_ms": 7.615017463035274, "median_tpot_ms": 7.524641732229014, "std_tpot_ms": 0.5137661324558762, "p99_tpot_ms": 8.92187871250578, "mean_itl_ms": 7.615019605385871, "median_itl_ms": 7.299988501472399, "std_itl_ms": 3.706068885790247, "p99_itl_ms": 8.394360903184861}
{"date": "20250411-002536", "backend": "vllm", "model_id": "meta-llama/Llama-3.2-1B", "tokenizer_id": "meta-l

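Each run appends one JSON object per line to results.json; the key latency metrics are ttft (time to first token), tpot (time per output token), and itl (inter-token latency). A comparison table like the one below can be built from that file. This is a minimal sketch, assuming pandas and tabulate are installed and the file is in JSON-lines format:

```
#!/usr/bin/env python3
"""Flatten benchmark results.json (JSON lines) into a markdown table."""
import pandas as pd  # requires: pip install pandas tabulate

# Each line in results.json is one benchmark run serialized as a JSON object.
df = pd.read_json("results.json", lines=True)

# Keep a readable subset of columns; the full set matches the JSON keys above.
cols = [
    "date", "framework", "model_id", "num_prompts", "request_rate",
    "request_throughput", "output_throughput", "total_token_throughput",
    "mean_ttft_ms", "p99_ttft_ms", "mean_tpot_ms", "p99_tpot_ms",
]
print(df[cols].to_markdown(index=False))
```
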
| date | backend | model_id | tokenizer_id | num_prompts | framework | request_rate | burstiness | max_concurrency | duration | completed | total_input_tokens | total_output_tokens | request_throughput | request_goodput: | output_throughput | total_token_throughput | mean_ttft_ms | median_ttft_ms | std_ttft_ms | p99_ttft_ms | mean_tpot_ms | median_tpot_ms | std_tpot_ms | p99_tpot_ms | mean_itl_ms | median_itl_ms | std_itl_ms | p99_itl_ms |
|:---|:---|:---|:---|---:|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 20250411-002323 | vllm | meta-llama/Llama-3.2-1B | meta-llama/Llama-3.2-1B | 120 | vllm | 1.0 | 1.0 |  | 99.19297069497406 | 120 | 120000 | 12000 | 1.2097631430861078 |  | 120.97631430861077 | 1330.7394573947186 | 56.25359537589247 | 55.28098650393076 | 6.545660891106274 | 78.70767521642848 | 7.615017463035274 | 7.524641732229014 | 0.5137661324558762 | 8.92187871250578 | 7.615019605385871 | 7.299988501472399 | 3.706068885790247 | 8.394360903184861 |


Ilab UI API Server


1. GET /models

Purpose:

#!/usr/bin/env python3
"""
Example script that:
1. Converts a document to a Docling Document.
2. Chunks it using a HybridChunker.
3. Embeds each chunk using a SentenceTransformer.
4. Stores them in a LanceDB index.
5. Searches for a user-provided query and returns the best matching chunk or all matching chunks based on a flag.
"""