KServe vLLM
Custom KServe dashboard for use with the vLLM backend. All the metrics it uses start with "vllm:".
In order to scrape KServe metrics, you have to:
- Change your Helm deployment to enable scraping:
  helm upgrade kserve oci://ghcr.io/kserve/charts/kserve \
    --reuse-values \
    --set metricsaggregator.enablePrometheusScraping=true
- Add an annotation to your InferenceService:
  annotations:
    serving.kserve.io/enable-prometheus-scraping: "true"
- Create a ServiceMonitor:
  apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    name: <model>
    labels:
      release: prometheus # must match your Prometheus release name
  spec:
    selector:
      matchLabels:
        serving.kserve.io/inferenceservice: <model>
    namespaceSelector:
      matchNames:
        - <namespace>
    endpoints:
      - port: <model-predictor>
        path: /metrics
        interval: 15s
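
Before opening the dashboard, it can help to confirm that the predictor really exposes metrics with the "vllm:" prefix. The following is a minimal Python sketch, not part of the dashboard itself: the function names vllm_metric_names and fetch_vllm_metric_names are hypothetical helpers, and the sketch assumes the endpoint returns the standard Prometheus text exposition format.

```python
from urllib.request import urlopen


def vllm_metric_names(exposition_text: str) -> list[str]:
    """Return the sorted, unique metric names that start with 'vllm:'.

    Expects Prometheus text exposition format: one sample per line,
    with '#' lines carrying HELP/TYPE comments.
    """
    names = set()
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or whitespace (value).
        parts = line.split("{", 1)[0].split()
        if not parts:
            continue
        name = parts[0]
        if name.startswith("vllm:"):
            names.add(name)
    return sorted(names)


def fetch_vllm_metric_names(url: str, timeout: float = 5.0) -> list[str]:
    """Fetch a /metrics endpoint and list the 'vllm:' metric names it exposes."""
    with urlopen(url, timeout=timeout) as response:
        return vllm_metric_names(response.read().decode("utf-8"))
```

For example, if the predictor's metrics port is forwarded to your machine, calling fetch_vllm_metric_names("http://localhost:8080/metrics") should return a non-empty list when the vLLM backend is exporting metrics. The local URL is an assumption; adjust it to your own port-forward.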
Enjoy!