GPU monitoring
Monitors a Kubernetes cluster using Prometheus. Shows overall cluster CPU, memory, and filesystem usage, as well as statistics for individual pods, containers, and systemd services. Uses cAdvisor metrics only.
cAdvisor collects GPU usage information. If you also want to collect GPU temperature or power information, call the NVIDIA NVML library and expose the results through node-exporter.
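One possible way to do this is to read temperature and power through NVML and write them to a file that node-exporter's textfile collector picks up. The sketch below is an illustration under stated assumptions, not part of this dashboard: it assumes the `pynvml` Python bindings are installed on the node, and the metric names `gpu_temperature_celsius` and `gpu_power_watts`, the `gpu` label, and the `gpu.prom` file name are hypothetical choices.

```python
import os
import tempfile


def read_gpu_stats():
    """Return a list of (gpu_index, temperature_celsius, power_watts).

    Assumes the pynvml bindings and an NVIDIA driver are available on the node.
    """
    import pynvml  # imported lazily so the formatting helpers work without a GPU

    pynvml.nvmlInit()
    try:
        stats = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
            # NVML reports power draw in milliwatts; convert to watts.
            power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0
            stats.append((i, temp, power_w))
        return stats
    finally:
        pynvml.nvmlShutdown()


def format_metrics(stats):
    """Render readings in the Prometheus text exposition format.

    Metric and label names here are hypothetical, chosen for illustration.
    """
    lines = [
        "# HELP gpu_temperature_celsius GPU temperature read through NVML.",
        "# TYPE gpu_temperature_celsius gauge",
    ]
    for i, temp, _ in stats:
        lines.append(f'gpu_temperature_celsius{{gpu="{i}"}} {temp}')
    lines += [
        "# HELP gpu_power_watts GPU power draw read through NVML.",
        "# TYPE gpu_power_watts gauge",
    ]
    for i, _, power_w in stats:
        lines.append(f'gpu_power_watts{{gpu="{i}"}} {power_w}')
    return "\n".join(lines) + "\n"


def write_textfile(stats, directory):
    """Write gpu.prom atomically so node-exporter never reads a partial file."""
    path = os.path.join(directory, "gpu.prom")
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(format_metrics(stats))
    os.chmod(tmp, 0o644)  # mkstemp creates the file as 0600; make it readable
    os.replace(tmp, path)
    return path
```

To use it, call `write_textfile(read_gpu_stats(), directory)` periodically (for example from cron), where `directory` is the path passed to node-exporter via `--collector.textfile.directory`. The resulting series can then be graphed alongside the cAdvisor metrics.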