GPU monitoring


Monitors a Kubernetes cluster using Prometheus. Shows overall cluster CPU, memory, and filesystem usage, as well as statistics for individual pods, containers, and systemd services. Uses cAdvisor metrics only.
Last updated: 3 years ago

    cAdvisor collects GPU usage information. If you want to collect GPU temperature or power information, additionally call the NVIDIA NVML library with node-exporter.
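One possible way to follow the note above is a small script that reads temperature and power through NVML and prints them in the Prometheus text format, so that node-exporter's textfile collector can pick them up. This is only a sketch under several assumptions: it uses the `pynvml` Python bindings for NVML, the metric names `gpu_temperature_celsius` and `gpu_power_watts` and the helper names `format_metrics` and `read_gpu_samples` are hypothetical, and the dashboard itself does not ship or require this script.

```python
"""Sketch: expose GPU temperature and power for node-exporter's textfile collector."""


def format_metrics(samples):
    """Render (gpu_index, temperature_celsius, power_watts) tuples as
    Prometheus text exposition lines. The metric names are hypothetical."""
    lines = [
        "# HELP gpu_temperature_celsius GPU temperature reported by NVML.",
        "# TYPE gpu_temperature_celsius gauge",
    ]
    for index, temp_c, _ in samples:
        lines.append(f'gpu_temperature_celsius{{gpu="{index}"}} {temp_c}')
    lines += [
        "# HELP gpu_power_watts GPU power draw reported by NVML.",
        "# TYPE gpu_power_watts gauge",
    ]
    for index, _, power_w in samples:
        lines.append(f'gpu_power_watts{{gpu="{index}"}} {power_w}')
    return "\n".join(lines) + "\n"


def read_gpu_samples():
    """Query NVML through pynvml (assumes an NVIDIA driver is installed)."""
    import pynvml  # imported lazily so format_metrics works without a GPU

    pynvml.nvmlInit()
    try:
        samples = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            temp_c = pynvml.nvmlDeviceGetTemperature(
                handle, pynvml.NVML_TEMPERATURE_GPU
            )
            # NVML reports power draw in milliwatts; convert to watts.
            power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0
            samples.append((index, temp_c, power_w))
        return samples
    finally:
        pynvml.nvmlShutdown()
```

On a GPU node, the output of `format_metrics(read_gpu_samples())` could be written periodically to a `.prom` file in the directory that node-exporter watches via `--collector.textfile.directory`. Separating the formatting from the NVML calls keeps the output logic checkable on machines without a GPU.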

