K8S Cluster Health

Dashboard

A dashboard that gives an overview of a K8S production cluster.
Last updated: a year ago

This dashboard gives you a full overview of your K8S cluster services, including the API Server, ETCD, Ingress, Cluster Autoscaler and Prometheus, among others.

This dashboard uses metrics from many cluster services, so make sure your Prometheus is configured to scrape metrics from all of these apps:

  • Cluster Autoscaler Metrics
  • API Server Metrics
  • ETCD Metrics
  • Node Exporter Metrics
  • Ingress Metrics
  • Kube State Metrics
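
As a rough illustration of what scraping some of these sources can look like, here is a minimal sketch of two Prometheus jobs, modeled on the commonly used Kubernetes example configuration. It is not the author's setup. It assumes Prometheus runs inside the cluster with a service account that may read endpoints, and that services such as Kube State Metrics or Node Exporter carry the prometheus.io/scrape and prometheus.io/port annotations. The job names, the annotation convention and the "service" label are assumptions, not part of this dashboard.

```yaml
scrape_configs:
  # API Server metrics, scraped through the default "kubernetes" service.
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
      - role: endpoints
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

  # Any service annotated with prometheus.io/scrape: "true"
  # (assumed convention; adjust to how your exporters are exposed).
  - job_name: 'kubernetes-service-endpoints'
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the port from the prometheus.io/port annotation, if set.
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: service
```

The dashboard panels may expect specific job or label names, so check the panel queries against whatever names you choose here.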

Version Information:

  • Kubernetes: 1.15.0+
  • Prometheus: 2.17.0+ (tested and working fine with Prometheus 1.x)
  • Node Exporter: 0.18.0+
  • Cluster Autoscaler: 1.17.0+
  • Kube State Metrics: 1.8.0+

Our Prometheus scrape config for ETCD, including relabel rules:

- job_name: 'etcd-manager'
  kubernetes_sd_configs:
    - role: pod
  tls_config:
    # Client certificates used to authenticate against ETCD.
    ca_file: /etc/prometheus/etcd-certs/etcd-clients-ca.crt
    cert_file: /etc/prometheus/etcd-certs/prometheus-etcd.crt
    key_file: /etc/prometheus/etcd-certs/prometheus-etcd.key
    # Skips verification of the server certificate.
    insecure_skip_verify: true
  scheme: https
  metrics_path: '/metrics'
  relabel_configs:
    # Keep only the main etcd-manager pods.
    - action: keep
      regex: ^(etcd-manager-main-.*)$
      source_labels:
        - __meta_kubernetes_pod_name
    # Append port 4001 to the pod address.
    - source_labels: [__address__]
      action: replace
      regex: (.+)
      replacement: $1:4001
      target_label: __address__
    # Expose node, pod and container names as labels.
    - source_labels: [__meta_kubernetes_pod_node_name]
      action: replace
      target_label: node_name
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod_name
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: container_name
    # Copy all pod labels onto the scraped series.
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
      replacement: $1
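
The tls_config paths above only work if the ETCD client certificates are present inside the Prometheus container. One possible way to get them there, sketched below, is to store them in a Kubernetes Secret and mount it at /etc/prometheus/etcd-certs. This is an assumption about the deployment, not something described by the author: the Secret name "etcd-certs", the container name "prometheus" and the use of a Deployment-style pod template are all hypothetical.

```yaml
# Fragment of a Prometheus pod template (hypothetical names throughout).
spec:
  template:
    spec:
      containers:
        - name: prometheus
          volumeMounts:
            # Must match the directory used in tls_config.
            - name: etcd-certs
              mountPath: /etc/prometheus/etcd-certs
              readOnly: true
      volumes:
        - name: etcd-certs
          secret:
            # Assumed Secret holding etcd-clients-ca.crt,
            # prometheus-etcd.crt and prometheus-etcd.key.
            secretName: etcd-certs
```

The keys inside the Secret need to match the file names referenced by ca_file, cert_file and key_file.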