GPU Health - EKS Cluster
Cluster-Wide View of Common GPU Errors
HyperPod EKS dashboard for cluster-wide GPU metrics reported by the NVIDIA DCGM Exporter.