K8S Cluster Health
A dashboard that gives an overview of a production K8S cluster.
With this dashboard you get a full overview of your K8S cluster services, such as the API Server, ETCD, Ingress, Cluster Autoscaler and Prometheus, among others.
This dashboard uses metrics from many cluster services, so make sure your Prometheus is configured to scrape metrics from all of these apps:
- Cluster Autoscaler Metrics
- API Server Metrics
- ETCD Metrics
- Node Exporter Metrics
- Ingress Metrics
- Kube State Metrics
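One way to confirm that every source listed above is being scraped is to run the instant query `up` against the Prometheus HTTP API (`/api/v1/query?query=up`) and compare the jobs it reports with the ones the dashboard needs. The sketch below only processes an already-decoded JSON response, so it makes no network calls. The function name `missing_jobs` and every job name except `etcd-manager` are hypothetical; substitute the job names from your own scrape config.

```python
def missing_jobs(up_response, required_jobs):
    """Return the required jobs that have no target reporting up == 1.

    `up_response` is the decoded JSON body of an instant query for `up`.
    """
    healthy = set()
    for sample in up_response.get("data", {}).get("result", []):
        job = sample.get("metric", {}).get("job")
        # Instant-vector samples have the form [timestamp, "value"].
        value = sample.get("value", [None, "0"])[1]
        if job and value == "1":
            healthy.add(job)
    return sorted(set(required_jobs) - healthy)


# Hand-written sample response (not real data) in the API's vector format.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "etcd-manager"}, "value": [0, "1"]},
            {"metric": {"__name__": "up", "job": "node-exporter"}, "value": [0, "0"]},
        ],
    },
}

# Hypothetical job names, except 'etcd-manager'.
required = ["etcd-manager", "node-exporter", "kube-state-metrics"]
print(missing_jobs(sample_response, required))
```

A job counts as missing both when it is absent from the response and when all of its targets report `up == 0`, since in either case the matching dashboard panels would stay empty.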
Version Information:
- Kubernetes: 1.15.0+
- Prometheus: 2.17.0+ (tested and working fine with Prometheus 1.x)
- Node Exporter: 0.18.0+
- Cluster Autoscaler: 1.17.0+
- Kube State Metrics: 1.8.0+
Our Prometheus scrape rule for ETCD, including relabeling:
- job_name: 'etcd-manager'
  kubernetes_sd_configs:
    - role: pod
  tls_config:
    ca_file: /etc/prometheus/etcd-certs/etcd-clients-ca.crt
    cert_file: /etc/prometheus/etcd-certs/prometheus-etcd.crt
    key_file: /etc/prometheus/etcd-certs/prometheus-etcd.key
    insecure_skip_verify: true
  scheme: https
  metrics_path: '/metrics'
  relabel_configs:
    - action: keep
      regex: ^(etcd-manager-main-.*)$
      source_labels:
        - __meta_kubernetes_pod_name
    - source_labels: [__address__]
      action: replace
      regex: (.+)
      replacement: $1:4001
      target_label: __address__
    - source_labels: [__meta_kubernetes_pod_node_name]
      action: replace
      target_label: node_name
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod_name
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: container_name
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
      replacement: $1
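To make the effect of these relabel rules easier to follow, here is a small Python sketch that imitates the `keep`, `replace` and `labelmap` actions on a single discovered target. It is a simplified model, not Prometheus code: it covers only the behaviour used above (source values joined with `;`, fully anchored regex, `$N` expansion), and the helper names, pod name, IP address and pod label in the example are made up for illustration.

```python
import re


def _expand(replacement, match):
    # Expand Prometheus-style $1, $2, ... using the groups of `match`.
    return re.sub(r"\$(\d+)", lambda g: match.group(int(g.group(1))), replacement)


def keep(labels, source_labels, regex, separator=";"):
    """Return True if the target survives a 'keep' rule."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, value) is not None


def replace(labels, source_labels, target_label,
            regex="(.*)", replacement="$1", separator=";"):
    """Return a copy of `labels` with a 'replace' rule applied."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)
    result = dict(labels)
    if match is not None:
        result[target_label] = _expand(replacement, match)
    return result


def labelmap(labels, regex, replacement="$1"):
    """Return a copy of `labels` with matching label names copied to new names."""
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is not None:
            result[_expand(replacement, match)] = value
    return result


# Hypothetical target as pod discovery might present it.
target = {
    "__address__": "10.0.0.12",
    "__meta_kubernetes_pod_name": "etcd-manager-main-ip-10-0-0-12",
    "__meta_kubernetes_pod_node_name": "ip-10-0-0-12",
    "__meta_kubernetes_pod_label_k8s_app": "etcd-manager-main",
}

if keep(target, ["__meta_kubernetes_pod_name"], r"^(etcd-manager-main-.*)$"):
    target = replace(target, ["__address__"], "__address__",
                     regex=r"(.+)", replacement="$1:4001")
    target = replace(target, ["__meta_kubernetes_pod_node_name"], "node_name")
    target = replace(target, ["__meta_kubernetes_pod_name"], "pod_name")
    target = labelmap(target, r"__meta_kubernetes_pod_label_(.+)")

print(target["__address__"], target["node_name"], target["k8s_app"])
```

Note that the address rule appends `:4001` to whatever `__address__` already contains, so it assumes the discovered address carries no port of its own.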