Openshift Overview v2.0

New Dashboard for K8/OKD/OCP > 3.7

Openshift Overview v2.0 screenshot 1
Openshift Overview v2.0 screenshot 2
Openshift Overview v2.0 screenshot 3
Openshift Overview v2.0 screenshot 4
Openshift Overview v2.0 screenshot 5
Openshift Overview v2.0 screenshot 6

This dashboards does alerting, elasticsearch indices sizes, etcd stuff, capacity utilization, general node stuff, CPU throttling. Oh, and you need the Pie Chart plugin. I like pie.

You will need the elaticsearch exporter, and some other stuff I can't remember right now

Here is my ConfigMap for Prometheus Adjust host names appropriately. Create your own ConfigMap for the etcd certficates

apiVersion: v1
data:
  prometheus.yml: |+
    global:
      scrape_interval: 30s
      evaluation_interval: 30s
scrape_configs:
  - job_name: 'kubernetes-controllers'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - default
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: kubernetes;https
    - source_labels: [__address__]
      action: replace
      target_label: __address__
      regex: (.+)(?::\d+)
      replacement: $1:8444

  - job_name: 'kubernetes-nodes'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
    - role: node
    metric_relabel_configs:
    - source_labels: [__name__]
      action: drop
      regex: 'openshift_sdn_pod_(setup|teardown)_latency(.*)'
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)

  - job_name: 'kubernetes-cadvisor'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
    - role: node
    metric_relabel_configs:
    - source_labels: [__name__]
      action: drop
      regex: 'container_(cpu_user_seconds_total|cpu_cfs_periods_total|memory_usage_bytes|memory_swap|memory_cache|last_seen|fs_(read_seconds_total|write_seconds_total|sector_(.*)|io_(.*)|reads_merged_total|writes_merged_total)|tasks_state|memory_failcnt|memory_failures_total|spec_memory_swap_limit_bytes|fs_(.*)_bytes_total|spec_(.*))'
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
  - job_name: 'kubernetes-service-endpoints'
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    kubernetes_sd_configs:
    - role: endpoints
    relabel_configs:
    - source_labels: [__meta_kubernetes_namespace]
      action: keep
      regex: 'default|logging|metrics|kube-.+|openshift|openshift-.+'
    - source_labels: [__meta_kubernetes_service_name]
      action: drop
      regex: 'prometheus-node-exporter'
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
      action: replace
      target_label: __scheme__
      regex: (https?)
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: (.+)(?::\d+);(\d+)
      replacement: $1:$2
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: kubernetes_name

  - job_name: node
    static_configs:
      - targets:
        - "dcposen01.lsd.co.za:9100"
        - "dcposen02.lsd.co.za:9100"
        - "dcposen03.lsd.co.za:9100"
        - "dcposen04.lsd.co.za:9100"
        - "dcposen05.lsd.co.za:9100"
        - "dcposen06.lsd.co.za:9100"
        - "dcposen07.lsd.co.za:9100"
        - "dcposen08.lsd.co.za:9100"
        - "dcposei01.lsd.co.za:9100"
        - "dcposei02.lsd.co.za:9100"
        - "dcposei03.lsd.co.za:9100"
        - "dcposem02.lsd.co.za:9100"
        - "dcposem03.lsd.co.za:9100"
        - "dcposem01.lsd.co.za:9100"

  - job_name: kube-state-metrics
    static_configs:
      - targets:
        - "kube-state-metrics.openshift-monitoring.svc.cluster.local:8080"

  - job_name: openshift-etcd
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/openshift-etcd-certs/ca.crt
      cert_file: /etc/prometheus/openshift-etcd-certs/peer.crt
      key_file: /etc/prometheus/openshift-etcd-certs/peer.key
      insecure_skip_verify: true
    static_configs:
      - targets:
        - "172.30.48.31:2379"
        - "172.30.50.15:2379"
        - "172.30.50.16:2379"

  - job_name: blackbox-exporter
    metrics_path: /probe
    params:
      module: [http_2xx_no_ssl_check]  # Look for a HTTP 200 response.
    static_configs:
      - targets:
        - https://webserver-invest-prod.oseza.lsd.co.za
        - https://hawkular-metrics.oseza.lsd.co.za
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter.openshift-monitoring.svc.cluster.local:9115


  - job_name: 'tcp_cert_check'
    scrape_interval: 1h
    metrics_path: /probe
    params:
      module: [tcp_cert]
    static_configs:
      - targets:
        - '172.30.48.31:2379'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter.openshift-monitoring.svc.cluster.local:9115


  - job_name: blackbox-exporter-ssh
    metrics_path: /probe
    params:
        module: [ssh_banner]
    static_configs:
    - targets:
      - dcposen02.lsd.co.za:22
      - dcposen03.lsd.co.za:22
      - dcposen04.lsd.co.za:22
      - dcposen05.lsd.co.za:22
      - dcposen06.lsd.co.za:22
      - dcposen07.lsd.co.za:22
      - dcposen08.lsd.co.za:22
      - dcposei01.lsd.co.za:22
      - dcposei02.lsd.co.za:22
      - dcposei03.lsd.co.za:22
      - dcposem02.lsd.co.za:22
      - dcposem03.lsd.co.za:22
      - dcposem01.lsd.co.za:22
    relabel_configs:
      - source_labels: [__address__]     # Leave this spooky blackbox stuff alone
        target_label: __param_target     # Stuff breaks otherwise
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter.openshift-monitoring.svc.cluster.local:9115
  - job_name: elasticsearch-exporter
    static_configs:
      - targets:
        - "elasticsearch-exporter.openshift-monitoring.svc.cluster.local:9108"

kind: ConfigMap

Revisions
RevisionDescriptionCreated

Get this dashboard

Import the dashboard template

or

Download JSON

Datasource
Dependencies