OnDemand Clusters
Open OnDemand Clusters dashboard
This dashboard pulls data provided by several Prometheus exporters.
All metrics are expected to carry a host label. The following example relabel configuration in Prometheus assigns the host label to all instances:
relabel_configs:
  # Copy the short host name (everything before the first dot) into "host".
  - source_labels: [__address__]
    regex: '([^.]+)\..*'
    replacement: '$1'
    target_label: host
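For context, here is a minimal sketch of where such a relabel block might sit inside a scrape job. The job name, target address, and the cluster and role label values are hypothetical. They are included only because the recording rules on this page group by host, cluster, and role, so this sketch assumes those labels are attached to the targets in some way.

```yaml
# prometheus.yml (fragment) -- illustrative only, names are assumptions
scrape_configs:
  - job_name: node                      # hypothetical job name
    static_configs:
      - targets:
          - node01.example.com:9100     # hypothetical node_exporter target
        labels:
          cluster: example              # assumed label used by the rules
          role: compute                 # assumed label used by the rules
    relabel_configs:
      # Prometheus anchors the regex, so the whole address must match.
      - source_labels: [__address__]
        regex: '([^.]+)\..*'
        replacement: '$1'
        target_label: host              # node01.example.com:9100 -> host="node01"
```

With this sketch, a target whose address contains no dot does not match the regex and therefore receives no host label from this rule.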
Recording rules used for the CPU and network panels:
groups:
  - name: node
    rules:
      - record: node:cpus:count
        expr: count by(host,cluster,role) (node_cpu_info)
      - record: node:cpu_load_user:avg5m
        expr: avg by (host,cluster,role)(irate(node_cpu_seconds_total{mode="user"}[5m]))
      - record: node:cpu_load_system:avg5m
        expr: avg by (host,cluster,role)(irate(node_cpu_seconds_total{mode="system"}[5m]))
      - record: node:cpu_load_iowait:avg5m
        expr: avg by (host,cluster,role)(irate(node_cpu_seconds_total{mode="iowait"}[5m]))
      - record: node:cpu_load_total:avg5m
        expr: 1 - avg by (host,cluster,role)(irate(node_cpu_seconds_total{mode="idle"}[5m]))
      - record: node:network_received_rate_bytes
        expr: irate(node_network_receive_bytes_total[5m])
      - record: node:network_transmit_rate_bytes
        expr: irate(node_network_transmit_bytes_total[5m])
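Recording rules only take effect once Prometheus loads the file that contains them. As a rough sketch, assuming the node group above is saved under a hypothetical path such as rules/node.rules.yml, the main configuration could reference it like this. The evaluation interval shown is also an assumption, not a value required by the dashboard.

```yaml
# prometheus.yml (fragment) -- illustrative only
global:
  evaluation_interval: 1m      # assumed; how often recording rules are evaluated
rule_files:
  - rules/node.rules.yml       # hypothetical path holding the "node" group
```

The rule file can be checked for syntax errors with `promtool check rules` before Prometheus is reloaded.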
Recording rules for the cgroup-related panels:
groups:
  - name: cgroup
    rules:
      - record: cgroup:cpu_user_seconds:irate5m
        expr: (irate(cgroup_cpu_user_seconds[5m]) / cgroup_cpus) * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:cpu_system_seconds:irate5m
        expr: (irate(cgroup_cpu_system_seconds[5m]) / cgroup_cpus) * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:cpu_total_seconds:irate5m
        expr: (irate(cgroup_cpu_total_seconds[5m]) / cgroup_cpus) * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:memory_used_bytes
        expr: cgroup_memory_used_bytes * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:memory_total_bytes
        expr: cgroup_memory_total_bytes * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:memory_rss_bytes
        expr: cgroup_memory_rss_bytes * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:memory_cache_bytes
        expr: cgroup_memory_cache_bytes * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
      - record: cgroup:swap_used_bytes
        expr: (cgroup_memsw_used_bytes - cgroup_memory_used_bytes) * on(cgroup, host) group_left(jobid,uid,username) cgroup_info
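Each cgroup rule multiplies a raw metric by cgroup_info, matching on cgroup and host, so that the jobid, uid, and username labels are copied onto the recorded series. The following is a sketch of a promtool unit test that illustrates this join for one rule. The file name, cgroup path, and label values are hypothetical, and the test assumes cgroup_info is exported with a constant value of 1, which is the usual convention for info metrics but is not confirmed by this page.

```yaml
# cgroup.rules.test.yml -- illustrative only, run with: promtool test rules <file>
rule_files:
  - cgroup.rules.yml           # hypothetical file containing the "cgroup" group
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      # Raw usage metric: carries only the cgroup and host labels.
      - series: 'cgroup_memory_used_bytes{cgroup="/slurm/uid_1000/job_42", host="node01"}'
        values: '1024 1024 1024'
      # Info metric: assumed to be 1, carries the job metadata labels.
      - series: 'cgroup_info{cgroup="/slurm/uid_1000/job_42", host="node01", jobid="42", uid="1000", username="alice"}'
        values: '1 1 1'
    promql_expr_test:
      - expr: cgroup:memory_used_bytes
        eval_time: 2m
        exp_samples:
          # The value is unchanged (1024 * 1); the metadata labels are added.
          - labels: 'cgroup:memory_used_bytes{cgroup="/slurm/uid_1000/job_42", host="node01", jobid="42", uid="1000", username="alice"}'
            value: 1024
```

Rules whose input series are absent from the test, such as the CPU rules that depend on cgroup_cpus, simply produce no samples here.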