Node exporter

You should load the following recording rules before loading the dashboards in this guide. The dashboard queries use recording rules to reduce load on the Prometheus or Grafana Cloud Metrics servers, depending on where you’re evaluating the rules.
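
If you evaluate the rules on a self-managed Prometheus server, loading them usually means listing the file under `rule_files` in the server configuration. The fragment below is a minimal sketch, not the only way to do it: the path and file name are assumptions, so substitute wherever you saved the YAML file from this page.

```yaml
# prometheus.yml (fragment)
global:
  # How often Prometheus evaluates recording rules; 1m is an illustrative value.
  evaluation_interval: 1m

rule_files:
  # Hypothetical path: point this at your saved copy of the rules file.
  - /etc/prometheus/rules/node_exporter_recording_rules.yml
```

Running `promtool check rules` against the file before reloading Prometheus catches syntax errors early. If you evaluate the rules in Grafana Cloud instead, they are uploaded to the hosted rule evaluator rather than referenced from a local file.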

This quickstart includes the following recording rules:

  • instance:node_num_cpu:sum
  • instance:node_cpu_utilisation:rate1m
  • instance:node_load1_per_cpu:ratio
  • instance:node_memory_utilisation:ratio
  • instance:node_vmstat_pgmajfault:rate1m
  • instance_device:node_disk_io_time_seconds:rate1m
  • instance_device:node_disk_io_time_weighted_seconds:rate1m
  • instance:node_network_receive_bytes_excluding_lo:rate1m
  • instance:node_network_transmit_bytes_excluding_lo:rate1m
  • instance:node_network_receive_drop_excluding_lo:rate1m
  • instance:node_network_transmit_drop_excluding_lo:rate1m

Download the following recording rules YAML file:

"groups":
- "name": "node-exporter.rules"
  "rules":
  - "expr": |
      count without (cpu) (
        count without (mode) (
          node_cpu_seconds_total{job="node"}
        )
      )
    "record": "instance:node_num_cpu:sum"
  - "expr": |
      1 - avg without (cpu, mode) (
        rate(node_cpu_seconds_total{job="node", mode="idle"}[1m])
      )
    "record": "instance:node_cpu_utilisation:rate1m"
  - "expr": |
      (
        node_load1{job="node"}
      /
        instance:node_num_cpu:sum{job="node"}
      )
    "record": "instance:node_load1_per_cpu:ratio"
  - "expr": |
      1 - (
        node_memory_MemAvailable_bytes{job="node"}
      /
        node_memory_MemTotal_bytes{job="node"}
      )
    "record": "instance:node_memory_utilisation:ratio"
  - "expr": |
      rate(node_vmstat_pgmajfault{job="node"}[1m])
    "record": "instance:node_vmstat_pgmajfault:rate1m"
  - "expr": |
      rate(node_disk_io_time_seconds_total{job="node", device!=""}[1m])
    "record": "instance_device:node_disk_io_time_seconds:rate1m"
  - "expr": |
      rate(node_disk_io_time_weighted_seconds_total{job="node", device!=""}[1m])
    "record": "instance_device:node_disk_io_time_weighted_seconds:rate1m"
  - "expr": |
      sum without (device) (
        rate(node_network_receive_bytes_total{job="node", device!="lo"}[1m])
      )
    "record": "instance:node_network_receive_bytes_excluding_lo:rate1m"
  - "expr": |
      sum without (device) (
        rate(node_network_transmit_bytes_total{job="node", device!="lo"}[1m])
      )
    "record": "instance:node_network_transmit_bytes_excluding_lo:rate1m"
  - "expr": |
      sum without (device) (
        rate(node_network_receive_drop_total{job="node", device!="lo"}[1m])
      )
    "record": "instance:node_network_receive_drop_excluding_lo:rate1m"
  - "expr": |
      sum without (device) (
        rate(node_network_transmit_drop_total{job="node", device!="lo"}[1m])
      )
    "record": "instance:node_network_transmit_drop_excluding_lo:rate1m"
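
One way to confirm the rules behave as expected is a `promtool test rules` unit test. The sketch below checks only the memory utilisation rule. The rules file name, the `host1` instance label, and the sample values are assumptions chosen for illustration.

```yaml
# node_exporter_rules_test.yml (hypothetical file name)
rule_files:
  # Assumed name of the rules file saved from this page.
  - node_exporter_recording_rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    # Synthetic input: 250 bytes available out of 1000 bytes total.
    input_series:
      - series: 'node_memory_MemAvailable_bytes{job="node", instance="host1"}'
        values: '250 250 250'
      - series: 'node_memory_MemTotal_bytes{job="node", instance="host1"}'
        values: '1000 1000 1000'
    promql_expr_test:
      - expr: instance:node_memory_utilisation:ratio
        eval_time: 1m
        exp_samples:
          # 1 - (250 / 1000) = 0.75
          - labels: 'instance:node_memory_utilisation:ratio{job="node", instance="host1"}'
            value: 0.75
```

Run it with `promtool test rules node_exporter_rules_test.yml` from the directory that contains both files.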

This recording rules YAML file was generated using the Node Exporter mixin. By default, it uses the job="node" label selector to query metrics. If you need a different selector, modify the selector in config.libsonnet and regenerate the recording rules following the instructions in the mixin repository.
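
As an illustration of what a regenerated file would contain, suppose node exporter is scraped under a job named `node-exporter` (a hypothetical name). Every rule would then carry that selector in place of job="node". A sketch of one such rule:

```yaml
groups:
  - name: node-exporter.rules
    rules:
      # Assumption: the selector was changed from job="node" to job="node-exporter".
      - expr: |
          1 - (
            node_memory_MemAvailable_bytes{job="node-exporter"}
          /
            node_memory_MemTotal_bytes{job="node-exporter"}
          )
        record: instance:node_memory_utilisation:ratio
```

Regenerating from the mixin is preferable to editing the YAML by hand, because the dashboards and alerting rules are built from the same selector and need to stay consistent with the recording rules.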