Configure Node exporter to generate Prometheus metrics

To configure nodes to generate Prometheus metrics, install the Node Exporter.
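
Once the Node Exporter is installed and being scraped, a quick query can confirm that its metrics are arriving. The following is a minimal sketch; it assumes the scrape attaches the `instance` and `job` labels, which are the labels the CPU KPI in this document groups by.

```promql
# Count the CPU time series reported per scraped node.
# An empty result suggests that no Node Exporter metrics are being ingested yet.
count by (instance, job) (node_cpu_seconds_total)
```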

Metrics and KPIs

| Resource | Metric | KPI |
|---|---|---|
| CPU | node_cpu_seconds_total | 1 - avg by(instance, job)(rate(node_cpu_seconds_total{mode="idle"}[5m])) |
| Memory | node_memory_MemTotal_bytes, node_memory_Buffers_bytes, node_memory_Cached_bytes, node_memory_MemFree_bytes, node_memory_Slab_bytes | Memory Utilization: 1 - (buffer + cached + free + slab)/total |
| Memory | node_vmstat_pgmajfault | Page Fault Rate: rate(node_vmstat_pgmajfault[1m]) |
| Network Bytes | node_network_receive_bytes_total, node_network_transmit_bytes_total | Network Byte Rate: rate(node_network_receive_bytes_total[5m]), rate(node_network_transmit_bytes_total[5m]) |
| Disk | node_filesystem_avail_bytes, node_filesystem_size_bytes | Disk Utilization: 1 - available bytes / size bytes |
| Disk | Read/Write byte rate: node_disk_read_bytes_total, node_disk_written_bytes_total | Disk IO Rate: rate(node_disk_read_bytes_total[5m]), rate(node_disk_written_bytes_total[5m]) |
| Disk | Read Time and Count: node_disk_read_time_seconds_total, node_disk_reads_completed_total. Write Time and Count: node_disk_write_time_seconds_total, node_disk_writes_completed_total | Disk Average Latency: rate(…time_seconds_total[5m]) / rate(…_completed_total[5m]) |
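
Several of the KPI formulas above are written in shorthand. As a sketch of how they might expand into full PromQL, the queries below substitute the metric names listed in the Metric column. The mapping of "buffer", "cached", "free", "slab", and "total" to specific metrics, and the pairing of each time counter with its matching completed-operations counter, are assumptions drawn from that list and not confirmed expressions from the product.

```promql
# Memory Utilization: fraction of total memory that is not buffers, cache, free, or slab.
1 - (
    node_memory_Buffers_bytes
  + node_memory_Cached_bytes
  + node_memory_MemFree_bytes
  + node_memory_Slab_bytes
) / node_memory_MemTotal_bytes

# Disk Utilization: fraction of each filesystem that is not available.
1 - node_filesystem_avail_bytes / node_filesystem_size_bytes

# Disk Average Latency (reads): seconds spent reading per completed read.
rate(node_disk_read_time_seconds_total[5m])
  / rate(node_disk_reads_completed_total[5m])

# Disk Average Latency (writes): seconds spent writing per completed write.
rate(node_disk_write_time_seconds_total[5m])
  / rate(node_disk_writes_completed_total[5m])
```

Each query is meant to be run on its own. The latency results are in seconds per operation, and a device with no completed operations in the window yields NaN because the division is 0/0.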

Alerts

| KPI | Alert |
|---|---|
| Memory Utilization | Saturation with resource_type=memory:utilization |
| High Memory Page Faults | Saturation with resource_type=memory:page_fault |
| CPU Utilization | Saturation |
| Network Bytes Rate | ResourceRateAnomaly |
| Disk Utilization | Saturation |
| Disk Read/Write Rate | ResourceRateAnomaly |
| Disk Read/Write Latency Average | Saturation when latency average breaches 100ms |
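
To illustrate the 100ms latency condition, here is a hedged sketch of a threshold expression. It is not the product's actual alert rule, which this page does not show. It assumes the read latency KPI is computed from the read time and read count counters, and it converts 100ms to 0.1 because those counters are in seconds.

```promql
# Returns only the series whose average read latency over 5 minutes exceeds 100ms.
rate(node_disk_read_time_seconds_total[5m])
  / rate(node_disk_reads_completed_total[5m])
  > 0.1
```

The same shape would apply to writes by swapping in the write time and write count counters.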

Dashboard

The following dashboard shows Node KPIs.

[Image: Node dashboard]