Linux Server integration for Grafana Cloud

The Linux integration for Grafana Cloud enables you to collect operating system metrics from a node, including CPU usage, load average, memory usage, and disk and network I/O, using the well-known node_exporter. It also lets you use the agent to scrape logs with promtail. Supported log sources are syslog, auth.log, kern.log, and journal logs.

This integration includes sixteen useful alerts and two pre-built dashboards to help monitor and visualize Linux metrics and logs.

Install Linux Server integration for Grafana Cloud

  1. In your Grafana instance, click Integrations and Connections (lightning bolt icon).
  2. Navigate to the Linux Server tile and review the prerequisites. Then click Install integration.
  3. Once the integration is installed, follow the steps on the Configuration Details page to set up Grafana Agent and start sending Linux Server metrics to your Grafana Cloud instance.

Post-install configuration for the Linux Server integration

If you want your dashboards to show logs and metrics correlated with each other, ensure the following:

  • The job and instance label values must match between the node_exporter integration and the logs scrape config in your agent configuration file.
  • The job label must be set to integrations/node_exporter (already configured in the snippets).
  • The instance label must be set to a value that uniquely identifies your Linux node. Set it manually, replacing the default hostname value with one that suits your environment. If you use localhost for multiple nodes, the dashboards cannot filter correctly by instance.
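
The three conditions above can be checked mechanically. The following is a minimal sketch, assuming the agent configuration has already been parsed from YAML into Python dictionaries and lists. The function names and the example instance value web-01 are hypothetical and are not part of the integration.

```python
EXPECTED_JOB = "integrations/node_exporter"

# Hypothetical parsed form of an agent config for a node named web-01.
EXAMPLE_CONFIG = {
    "integrations": {
        "node_exporter": {
            "enabled": True,
            "relabel_configs": [
                {"replacement": "web-01", "target_label": "instance"},
            ],
        },
    },
    "logs": {
        "configs": [
            {
                "name": "integrations",
                "scrape_configs": [
                    {
                        "job_name": "integrations/node_exporter_journal_scrape",
                        "journal": {
                            "max_age": "24h",
                            "labels": {
                                "instance": "web-01",
                                "job": "integrations/node_exporter",
                            },
                        },
                    },
                ],
            },
        ],
    },
}


def metrics_instance(config):
    """Return the instance value set by the node_exporter relabel rule, or None."""
    node_exporter = config.get("integrations", {}).get("node_exporter", {})
    for rule in node_exporter.get("relabel_configs", []):
        if rule.get("target_label") == "instance":
            return rule.get("replacement")
    return None


def log_label_sets(config):
    """Yield (job_name, labels) for each journal or static log target."""
    for logs_config in config.get("logs", {}).get("configs", []):
        for scrape in logs_config.get("scrape_configs", []):
            if "journal" in scrape:
                yield scrape["job_name"], scrape["journal"].get("labels", {})
            for static in scrape.get("static_configs", []):
                yield scrape["job_name"], static.get("labels", {})


def check_correlation_labels(config):
    """Return a list of problems that would stop logs and metrics from correlating."""
    problems = []
    instance = metrics_instance(config)
    if instance is None:
        problems.append("node_exporter has no relabel rule that sets instance")
    elif instance == "localhost":
        problems.append("instance is 'localhost', which does not uniquely identify a node")
    for job_name, labels in log_label_sets(config):
        if labels.get("job") != EXPECTED_JOB:
            problems.append(f"{job_name}: job label is not {EXPECTED_JOB}")
        if labels.get("instance") != instance:
            problems.append(f"{job_name}: instance label does not match node_exporter")
    return problems


if __name__ == "__main__":
    for problem in check_correlation_labels(EXAMPLE_CONFIG):
        print(problem)
```

An empty result means the job and instance labels line up across metrics and logs. Any returned message names the log scrape job that needs attention.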

Note: Ensure each deployed Grafana Agent has a configuration that matches the node it is deployed to.

The following is the preferred agent configuration, with logs collected from the systemd journal:

integrations:
  node_exporter:
    enabled: true
    relabel_configs:
      # Replace hostname with a value that uniquely identifies this node.
      - replacement: hostname
        target_label: instance

logs:
  configs:
    - name: integrations
      scrape_configs:
        - job_name: integrations/node_exporter_journal_scrape
          journal:
            max_age: 24h
            labels:
              # Must match the instance value set for node_exporter above.
              instance: hostname
              job: integrations/node_exporter
          relabel_configs:
            - source_labels: ['__journal__systemd_unit']
              target_label: 'unit'
            - source_labels: ['__journal__boot_id']
              target_label: 'boot_id'
            - source_labels: ['__journal__transport']
              target_label: 'transport'
            - source_labels: ['__journal_priority_keyword']
              target_label: 'level'
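
Because each agent needs an instance value that matches its own node, the snippet above could be generated per node instead of edited by hand. The following is a sketch under stated assumptions: the template is an abbreviated copy of the snippet (the journal relabel_configs are left out for brevity), the node's hostname is assumed to be unique and to need no YAML quoting, and render_agent_config is a hypothetical helper, not something the agent or the integration provides.

```python
import socket

# Abbreviated copy of the journal-based snippet; only the instance value varies.
AGENT_CONFIG_TEMPLATE = """\
integrations:
  node_exporter:
    enabled: true
    relabel_configs:
      - replacement: {instance}
        target_label: instance

logs:
  configs:
    - name: integrations
      scrape_configs:
        - job_name: integrations/node_exporter_journal_scrape
          journal:
            max_age: 24h
            labels:
              instance: {instance}
              job: integrations/node_exporter
"""


def render_agent_config(instance=None):
    """Return config text that uses one instance value for both metrics and logs."""
    if instance is None:
        # Assumption: the system hostname uniquely identifies this node.
        instance = socket.gethostname()
    if instance in ("", "localhost"):
        raise ValueError(f"instance must uniquely identify the node, got {instance!r}")
    return AGENT_CONFIG_TEMPLATE.format(instance=instance)


if __name__ == "__main__":
    print(render_agent_config())
```

Filling both places from a single value keeps the metrics and logs labels from drifting apart.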

If systemd is not an option, you can scrape log files instead:

integrations:
  node_exporter:
    enabled: true
    relabel_configs:
      # Replace hostname with a value that uniquely identifies this node.
      - replacement: hostname
        target_label: instance

logs:
  configs:
    - name: integrations
      scrape_configs:
        - job_name: integrations/node_exporter_direct_scrape
          static_configs:
            - targets:
                - localhost
              labels:
                # Must match the instance value set for node_exporter above.
                instance: hostname
                __path__: /var/log/{syslog,messages,*.log}
                job: integrations/node_exporter
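
To see which files the __path__ pattern above would select, the following sketch assumes conventional glob semantics: braces list alternatives, and * matches any run of characters within a single file name. It is an illustration only and does not reproduce how the agent itself evaluates the pattern. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

LOG_DIRECTORY = "/var/log"
# The alternatives inside the braces of /var/log/{syslog,messages,*.log}.
NAME_ALTERNATIVES = ("syslog", "messages", "*.log")


def is_selected(path):
    """Return True if path sits directly in /var/log and its name matches an alternative."""
    directory, _, name = path.rpartition("/")
    if directory != LOG_DIRECTORY:
        return False  # files in subdirectories are not matched by this pattern
    return any(fnmatchcase(name, pattern) for pattern in NAME_ALTERNATIVES)


if __name__ == "__main__":
    for candidate in (
        "/var/log/syslog",
        "/var/log/auth.log",
        "/var/log/kern.log",
        "/var/log/syslog.1",
        "/var/log/nginx/access.log",
    ):
        print(candidate, is_selected(candidate))
```

Under these assumptions, auth.log and kern.log are picked up by the *.log alternative, while rotated files such as syslog.1 and files in subdirectories are not.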

Dashboards

The Linux Server integration installs the following dashboards in your Grafana Cloud instance to help monitor your Linux metrics.

  • Node Exporter / Nodes
  • Node Exporter / USE Method / Node

[Image: Linux Node dashboard]

[Image: Linux USE dashboard]

Alerts

This integration includes the following useful alerts:

Group: node-exporter

Alert: Description
NodeFilesystemAlmostOutOfSpaceWarning: Filesystem has less than 5% space left.
NodeFilesystemAlmostOutOfSpaceCritical: Filesystem has less than 3% space left.
NodeFilesystemFilesFillingUpWarning: Filesystem is predicted to run out of inodes within the next 24 hours.
NodeFilesystemFilesFillingUpCritical: Filesystem is predicted to run out of inodes within the next 4 hours.
NodeFilesystemAlmostOutOfFilesWarning: Filesystem has less than 5% inodes left.
NodeFilesystemAlmostOutOfFilesCritical: Filesystem has less than 3% inodes left.
NodeNetworkReceiveErrsWarning: Network interface is reporting many receive errors.
NodeNetworkTransmitErrsWarning: Network interface is reporting many transmit errors.
NodeHighNumberConntrackEntriesUsedWarning: Number of conntrack entries is getting close to the limit.
NodeTextFileCollectorScrapeErrorWarning: Node Exporter text file collector failed to scrape.
NodeClockSkewDetectedWarning: Clock skew detected.
NodeClockNotSynchronisingWarning: Clock not synchronising.
NodeRAIDDegradedCritical: RAID array is degraded.
NodeRAIDDiskFailureWarning: Failed device in RAID array.
NodeFileDescriptorLimitWarning: Kernel is predicted to exhaust file descriptors limit soon.
NodeFileDescriptorLimitCritical: Kernel is predicted to exhaust file descriptors limit soon.
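
As an illustration of how the two filesystem space alerts relate, the following sketch applies the 5% and 3% thresholds from the descriptions above to a pair of values such as node_filesystem_avail_bytes and node_filesystem_size_bytes. It is a simplified model with a hypothetical function name. The actual alert rules come from the upstream node_exporter mixin and may apply further conditions, such as a minimum duration, that are not modeled here.

```python
WARNING_PERCENT = 5.0   # NodeFilesystemAlmostOutOfSpaceWarning threshold
CRITICAL_PERCENT = 3.0  # NodeFilesystemAlmostOutOfSpaceCritical threshold


def space_alert_severity(avail_bytes, size_bytes):
    """Return 'critical', 'warning', or None for a filesystem's remaining space."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    percent_left = 100.0 * avail_bytes / size_bytes
    if percent_left < CRITICAL_PERCENT:
        return "critical"
    if percent_left < WARNING_PERCENT:
        return "warning"
    return None


if __name__ == "__main__":
    # A 100 GiB filesystem with 4 GiB available has 4% left.
    print(space_alert_severity(4 * 1024**3, 100 * 1024**3))
```

The critical check runs first, so a filesystem below 3% is reported once at the higher severity. It is not reported again as a warning.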

Metrics

The following metrics are automatically written to your Grafana Cloud instance by connecting your Linux Server instance through this integration:

  • instance:node_cpu_utilisation:rate5m
  • instance:node_load1_per_cpu:ratio
  • instance:node_memory_utilisation:ratio
  • instance:node_network_receive_bytes_excluding_lo:rate5m
  • instance:node_network_receive_drop_excluding_lo:rate5m
  • instance:node_network_transmit_bytes_excluding_lo:rate5m
  • instance:node_network_transmit_drop_excluding_lo:rate5m
  • instance:node_num_cpu:sum
  • instance:node_vmstat_pgmajfault:rate5m
  • instance_device:node_disk_io_time_seconds:rate5m
  • instance_device:node_disk_io_time_weighted_seconds:rate5m
  • node_cpu_seconds_total
  • node_disk_io_time_seconds_total
  • node_disk_io_time_weighted_seconds_total
  • node_disk_read_bytes_total
  • node_disk_written_bytes_total
  • node_exporter_build_info
  • node_filefd_allocated
  • node_filefd_maximum
  • node_filesystem_avail_bytes
  • node_filesystem_files
  • node_filesystem_files_free
  • node_filesystem_readonly
  • node_filesystem_size_bytes
  • node_load1
  • node_load15
  • node_load5
  • node_md_disks
  • node_md_disks_required
  • node_memory_Buffers_bytes
  • node_memory_Cached_bytes
  • node_memory_MemAvailable_bytes
  • node_memory_MemFree_bytes
  • node_memory_MemTotal_bytes
  • node_memory_Slab_bytes
  • node_network_receive_bytes_total
  • node_network_receive_drop_total
  • node_network_receive_errs_total
  • node_network_receive_packets_total
  • node_network_transmit_bytes_total
  • node_network_transmit_drop_total
  • node_network_transmit_errs_total
  • node_network_transmit_packets_total
  • node_nf_conntrack_entries
  • node_nf_conntrack_entries_limit
  • node_textfile_scrape_error
  • node_time_seconds
  • node_timex_maxerror_seconds
  • node_timex_offset_seconds
  • node_timex_sync_status
  • node_uname_info
  • node_vmstat_pgmajfault
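
Names that contain colons, such as instance:node_load1_per_cpu:ratio, follow the Prometheus recording rule naming convention and are derived from the raw node_* metrics in the same list. As a rough sketch of what two of these ratios represent, assuming the simplest reading of their names, the helpers below compute them from raw values. The function names are hypothetical. The exact rule expressions are defined by the upstream node_exporter mixin and may differ.

```python
def load1_per_cpu(node_load1, num_cpu):
    """1-minute load average divided by the number of CPUs (a value near 1.0 means busy)."""
    if num_cpu <= 0:
        raise ValueError("num_cpu must be positive")
    return node_load1 / num_cpu


def memory_utilisation(mem_available_bytes, mem_total_bytes):
    """Fraction of memory in use, taken as 1 minus the available share of total memory."""
    if mem_total_bytes <= 0:
        raise ValueError("mem_total_bytes must be positive")
    return 1.0 - mem_available_bytes / mem_total_bytes


if __name__ == "__main__":
    # Example inputs: node_load1 = 2.0 on 4 CPUs; 4 GiB available out of 16 GiB.
    print(load1_per_cpu(2.0, 4))
    print(memory_utilisation(4 * 1024**3, 16 * 1024**3))
```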

Changelog

# 0.0.8 - October 2022

- Update upstream node_exporter mixin: [ba8c043079b38748e57adf1f80e3d86a4060efc5](https://github.com/prometheus/node_exporter/commit/ba8c043079b38748e57adf1f80e3d86a4060efc5)
- Enable multicluster dashboards for use in Kubernetes
- Add direct log file scrape to the agent snippets

# 0.0.7 - September 2022

- Remove source_address from relabel_configs

# 0.0.6 - May 2022

- Reverse fsSpaceAvailableCriticalThreshold and fsSpaceAvailableWarningThreshold
- Update units for disk and networking panels

# 0.0.5 - May 2022

- Update 'Disk Space Usage' panel to table format

# 0.0.4 - April 2022

- Fixed alerts and recording rules by providing proper nodeSelector

# 0.0.3 - February 2022

- Added logs support from Loki datasource

# 0.0.2 - October 2021

- Update all rate queries to use `$__rate_interval`

# 0.0.1 - June 2020

- Initial release

Cost

By connecting your Linux Server instance to Grafana Cloud, you might incur charges. To view information on the number of active series that your Grafana Cloud account uses for metrics included in each Cloud tier, see "Active series and dpm usage" and "Cloud tier pricing".