KServe vLLM

Custom KServe dashboard for use with the vLLM backend. All the metrics used start with "vllm:".

In order to scrape KServe metrics, you have to:

  1. Change your Helm deployment to enable scraping:
helm upgrade kserve oci://ghcr.io/kserve/charts/kserve \
--reuse-values \
--set metricsaggregator.enablePrometheusScraping=true
  1. Add an annotation to your InferenceService:
metadata:
  1. Create a ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: <model>
  labels:
    release: prometheus          # must match your Prometheus release name
spec:
  selector:
    matchLabels:
      serving.kserve.io/inferenceservice: <model>
  namespaceSelector:
    matchNames:
      - <namespace>
  endpoints:
    - port: <model-predictor>
      path: /metrics
      interval: 15s
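
Once the steps above are in place, it can help to confirm that the predictor actually exposes metrics with the "vllm:" prefix before importing the dashboard. The following is a minimal sketch, not part of the original instructions: the helper name `vllm_metric_names` is hypothetical, and the pod name and port in the usage comment are placeholders you need to fill in for your own deployment.

```shell
# Hypothetical helper: print the distinct vLLM metric names found in
# Prometheus text-format input read from stdin.
vllm_metric_names() {
  # Keep sample lines starting with "vllm:" (HELP/TYPE comment lines start
  # with "#" and are skipped), strip labels and values, then de-duplicate.
  grep '^vllm:' | sed 's/[{ ].*//' | sort -u
}

# Usage (assumption: the predictor pod is reachable through a port-forward;
# <namespace>, <predictor-pod> and <port> are placeholders):
#   kubectl port-forward -n <namespace> pod/<predictor-pod> <port>:<port>
#   curl -s http://localhost:<port>/metrics | vllm_metric_names
```

If the command prints nothing, the endpoint is not serving vLLM metrics, and the dashboard panels will most likely stay empty regardless of the ServiceMonitor configuration.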

Enjoy!
