OpenShift GPU Monitoring Configuration
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDataSource
metadata:
  name: nv-ds
  namespace: nvidia-gpu-operator
spec:
  datasources:
    - basicAuthUser: nvadmin
      access: proxy
      editable: true
      secureJsonData:
        basicAuthPassword: nvopenshift
      name: Prometheus
      url: 'https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091'
      jsonData:
        timeInterval: 5s
        tlsSkipVerify: true
      basicAuth: true
      isDefault: true
      version: 1
      type: prometheus
  name: nv-ds.yaml
#!/bin/bash
# Alternative configuration using a bearer token and the Thanos Querier endpoint
oc adm policy add-cluster-role-to-user cluster-monitoring-view -z grafana-serviceaccount -n nvidia-gpu-operator
TOKEN=$(oc serviceaccounts get-token grafana-serviceaccount -n nvidia-gpu-operator)
SVCIP=$(oc get svc thanos-querier -n openshift-monitoring -o jsonpath='{.spec.clusterIP}')
# Confirm the token is set; retry until it is
i=0
while [ "${#TOKEN}" -lt 2 ]; do
  i=$((i+1))
  echo "Waiting for token initialization (Attempt $i)"
  sleep 2
  TOKEN=$(oc serviceaccounts get-token grafana-serviceaccount -n nvidia-gpu-operator)
done
sleep 10
cat << EOF | oc apply -f -
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDataSource
metadata:
  name: nv-ds
  namespace: nvidia-gpu-operator
spec:
  datasources:
    - access: proxy
      editable: true
      isDefault: true
      jsonData:
        httpHeaderName1: 'Authorization'
        timeInterval: 5s
        tlsSkipVerify: true
      name: Prometheus
      secureJsonData:
        httpHeaderValue1: 'Bearer ${TOKEN}'
      type: prometheus
      url: 'https://${SVCIP}:9091'
  name: nv-ds.yaml
EOF
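
The token wait loop in the script above retries forever if the service account never receives a token. A minimal sketch of a bounded variant follows. The helper name `retry_until_nonempty` and its arguments are hypothetical and not part of the original gist; it only assumes that the wrapped command prints the value on stdout.

```shell
#!/bin/bash
# Hypothetical helper (not from the original gist): run a command until it
# prints a non-empty value, giving up after MAX_ATTEMPTS tries.
# Usage: retry_until_nonempty MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_nonempty() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1 value
  while [ "$attempt" -le "$max_attempts" ]; do
    # A failing command is treated the same as empty output.
    value=$("$@" 2>/dev/null) || value=""
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
    # Progress goes to stderr so it is not captured as the value.
    echo "Waiting for value (attempt $attempt of $max_attempts)" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Assumed usage with the command from the script above:
# TOKEN=$(retry_until_nonempty 30 2 oc serviceaccounts get-token grafana-serviceaccount -n nvidia-gpu-operator) || exit 1
```

With this shape the script can exit with an error instead of hanging, and the attempt count and delay can be tuned per cluster.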