Linux Server integration for Grafana Cloud
Linux is a family of open-source Unix-like operating systems based on the Linux kernel. Linux is the leading operating system on servers, and is one of the most prominent examples of free and open-source software collaboration.
The Linux Server integration for Grafana Cloud enables you to collect metrics related to the operating system running on a node, including CPU usage, load average, memory usage, and disk and networking I/O, using the embedded node_exporter. It also lets you use Grafana Alloy to collect logs.
This integration includes 24 useful alerts and 7 pre-built dashboards to help monitor and visualize Linux Server metrics and logs.
Before you begin
Each Linux node being observed must have its dedicated Grafana Alloy running.
If you want to monitor more than one Linux node with this integration, we recommend using the Ansible collection for Grafana Cloud to deploy Grafana Alloy to multiple machines, as described in this documentation.
Install Linux Server integration for Grafana Cloud
- In your Grafana Cloud stack, click Connections in the left-hand menu.
- Find Linux Server and click its tile to open the integration.
- Review the prerequisites in the Configuration Details tab and set up Grafana Alloy to send Linux Server metrics and logs to your Grafana Cloud instance.
- Click Install to add this integration’s pre-built dashboards and alerts to your Grafana Cloud instance, and start monitoring your Linux Server setup.
Configuration snippets for Grafana Alloy
Simple mode
These snippets are configured to scrape a single Linux Server instance running locally with default ports.
Manually copy and append the following snippets into your Grafana Alloy configuration file.
Integrations snippets
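The published snippet pairs the prometheus.exporter.unix component with a scrape pipeline. The following is a minimal sketch; the component labels and the metrics_service remote-write endpoint are assumptions that must match the rest of your configuration:

```alloy
// Expose system metrics via the embedded node_exporter.
prometheus.exporter.unix "integrations_node_exporter" { }

// Set the instance and job labels expected by the dashboards and alerts.
discovery.relabel "integrations_node_exporter" {
  targets = prometheus.exporter.unix.integrations_node_exporter.targets

  rule {
    target_label = "instance"
    replacement  = constants.hostname
  }

  rule {
    target_label = "job"
    replacement  = "integrations/node_exporter"
  }
}

// Scrape the exporter and forward samples to Grafana Cloud.
prometheus.scrape "integrations_node_exporter" {
  targets    = discovery.relabel.integrations_node_exporter.output
  forward_to = [prometheus.remote_write.metrics_service.receiver]
}
```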
Logs snippets
linux
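For logs, a sketch combining journal and file scraping might look like the following; the component labels and the grafana_cloud_loki write endpoint are assumptions:

```alloy
// Map journal fields to log labels (unit enables filtering on the dashboards).
loki.relabel "journal" {
  forward_to = []

  rule {
    source_labels = ["__journal__systemd_unit"]
    target_label  = "unit"
  }
}

// Read the systemd journal with a 24h lookback.
loki.source.journal "logs_node_exporter" {
  max_age       = "24h0m0s"
  relabel_rules = loki.relabel.journal.rules
  labels        = {
    instance = constants.hostname,
    job      = "integrations/node_exporter",
  }
  forward_to = [loki.write.grafana_cloud_loki.receiver]
}

// Tail common OS log files with the same labels.
local.file_match "logs_node_exporter" {
  path_targets = [{
    __address__ = "localhost",
    __path__    = "/var/log/{syslog,messages,*.log}",
    instance    = constants.hostname,
    job         = "integrations/node_exporter",
  }]
}

loki.source.file "logs_node_exporter" {
  targets    = local.file_match.logs_node_exporter.targets
  forward_to = [loki.write.grafana_cloud_loki.receiver]
}
```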
Advanced mode
To instruct Grafana Alloy to scrape your Linux Server instance, go through the following instructions.
The snippets provide examples to guide you through the configuration process.
First, manually copy and append the following snippets into your Grafana Alloy configuration file.
Then follow the instructions below to modify the necessary variables.
Advanced integrations snippets
This integration uses the prometheus.exporter.unix component to collect system metrics.
The supplied configuration is tuned to exclude any metrics from the exporter that are not used by the integration’s dashboards, alerts, or recording rules. If you want a broader configuration that includes additional metrics, adjust the prometheus.exporter.unix component accordingly.
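As a sketch, such tuning can disable unused collectors and exclude ephemeral interfaces and pseudo-filesystems, mirroring the filters used in the Grafana Agent static configuration later on this page; the component label and exact arguments shown are assumptions:

```alloy
prometheus.exporter.unix "integrations_node_exporter" {
  // Drop collectors whose metrics the dashboards and alerts do not use.
  disable_collectors = ["ipvs", "btrfs", "infiniband", "xfs", "zfs"]

  netclass {
    // Exclude dynamic interfaces such as veth pairs and Calico devices.
    ignored_devices = "^(veth.*|cali.*|[a-f0-9]{15})$"
  }

  netdev {
    device_exclude = "^(veth.*|cali.*|[a-f0-9]{15})$"
  }

  filesystem {
    // Ignore pseudo and temporary filesystems.
    fs_types_exclude = "^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"
  }
}
```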
Advanced logs snippets
linux
This integration uses the loki.source.journal and local.file_match components to collect system logs. This covers the systemd journal and the files matching /var/log/{syslog,messages,*.log}.
If you wish to capture other log files, add new maps to the path_targets list parameter of the local.file_match component. If you want these additionally captured logs to be labeled so that they appear on the Linux Node integration logs dashboard, each entry must include the same instance and job labels.
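For instance, an extra map can be appended to path_targets with the same labels; the component label and the /var/log/myapp path below are hypothetical:

```alloy
local.file_match "logs_node_exporter" {
  path_targets = [
    // Default OS log files collected by the integration.
    {
      __address__ = "localhost",
      __path__    = "/var/log/{syslog,messages,*.log}",
      instance    = constants.hostname,
      job         = "integrations/node_exporter",
    },
    // Additional application logs (hypothetical path), with matching
    // instance and job labels so they appear on the logs dashboard.
    {
      __address__ = "localhost",
      __path__    = "/var/log/myapp/*.log",
      instance    = constants.hostname,
      job         = "integrations/node_exporter",
    },
  ]
}
```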
Grafana Agent static configuration (deprecated)
The following section shows configuration for running Grafana Agent in static mode, which is deprecated. You should use Grafana Alloy for all new deployments.
Before you begin
Each Linux node being observed must have its dedicated Grafana Agent running.
If you want to monitor more than one Linux node with this integration, we recommend using the Ansible collection for Grafana Cloud to deploy Grafana Agent to multiple machines, as described in this documentation.
Install Linux Server integration for Grafana Cloud
- In your Grafana Cloud stack, click Connections in the left-hand menu.
- Find Linux Server and click its tile to open the integration.
- Review the prerequisites in the Configuration Details tab and set up Grafana Agent to send Linux Server metrics and logs to your Grafana Cloud instance.
- Click Install to add this integration’s pre-built dashboards and alerts to your Grafana Cloud instance, and start monitoring your Linux Server setup.
Post-install configuration for the Linux Server integration
This integration is configured to work with node_exporter, which is embedded in Grafana Agent.
Enable the integration by manually adding the provided snippets to your agent configuration file.
Note: The instance label must uniquely identify the node being scraped. Also, ensure each deployed Grafana Agent has a configuration that matches the node it is deployed to.
This integration supports metrics and logs from Linux. If you want to monitor your Linux node logs, there are three options. You can:
- scrape the journal
- scrape your OS log files directly
- scrape both your journal and OS log files
We recommend that you enable journal scraping because it comes with a unit label that can be used to filter logs on the dashboards. Config snippets for both approaches are provided.
If you want to show logs and metrics signals correlated in your dashboards, as a single pane of glass, ensure the following:
- The job and instance label values must match for the node_exporter integration and the logs scrape config in your agent configuration file.
- The job label must be set to integrations/node_exporter (already configured in the snippets).
- The instance label must be set to a value that uniquely identifies your Linux node. Replace the default <your-instance-name> value according to your environment; it should be set manually. Note that if you use localhost for multiple nodes, the dashboards will not be able to filter correctly by instance.
For a full description of configuration options, see how to configure the node_exporter_config block in the agent documentation.
Configuration snippets for Grafana Agent
Below `integrations`, insert the following lines and change the URLs according to your environment:
```yaml
  node_exporter:
    enabled: true
    # disable unused collectors
    disable_collectors:
    - ipvs #high cardinality on kubelet
    - btrfs
    - infiniband
    - xfs
    - zfs
    # exclude dynamic interfaces
    netclass_ignored_devices: "^(veth.*|cali.*|[a-f0-9]{15})$"
    netdev_device_exclude: "^(veth.*|cali.*|[a-f0-9]{15})$"
    # disable tmpfs
    filesystem_fs_types_exclude: "^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"
    # drop extensive scrape statistics
    metric_relabel_configs:
    - action: drop
      regex: node_scrape_collector_.+
      source_labels: [__name__]
    relabel_configs:
    - replacement: '<your-instance-name>'
      target_label: instance
```
Below `logs.configs.scrape_configs`, insert the following lines according to your environment.
```yaml
    - job_name: integrations/node_exporter_journal_scrape
      journal:
        max_age: 24h
        labels:
          instance: '<your-instance-name>'
          job: integrations/node_exporter
      relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'
      - source_labels: ['__journal__boot_id']
        target_label: 'boot_id'
      - source_labels: ['__journal__transport']
        target_label: 'transport'
      - source_labels: ['__journal_priority_keyword']
        target_label: 'level'
    - job_name: integrations/node_exporter_direct_scrape
      static_configs:
      - targets:
        - localhost
        labels:
          instance: '<your-instance-name>'
          __path__: /var/log/{syslog,messages,*.log}
          job: integrations/node_exporter
```
Full example configuration for Grafana Agent
Refer to the following Grafana Agent configuration for a complete example that contains all the snippets used for the Linux Server integration. This example also includes metrics that are sent to monitor your Grafana Agent instance.
```yaml
integrations:
  prometheus_remote_write:
  - basic_auth:
      password: <your_prom_pass>
      username: <your_prom_user>
    url: <your_prom_url>
  agent:
    enabled: true
    relabel_configs:
    - action: replace
      source_labels:
      - agent_hostname
      target_label: instance
    - action: replace
      target_label: job
      replacement: "integrations/agent-check"
    metric_relabel_configs:
    - action: keep
      regex: (prometheus_target_sync_length_seconds_sum|prometheus_target_scrapes_.*|prometheus_target_interval.*|prometheus_sd_discovered_targets|agent_build.*|agent_wal_samples_appended_total|process_start_time_seconds)
      source_labels:
      - __name__
  # Add here any snippet that belongs to the `integrations` section.
  # For a correct indentation, paste snippets copied from Grafana Cloud at the beginning of the line.
  node_exporter:
    enabled: true
    # disable unused collectors
    disable_collectors:
    - ipvs #high cardinality on kubelet
    - btrfs
    - infiniband
    - xfs
    - zfs
    # exclude dynamic interfaces
    netclass_ignored_devices: "^(veth.*|cali.*|[a-f0-9]{15})$"
    netdev_device_exclude: "^(veth.*|cali.*|[a-f0-9]{15})$"
    # disable tmpfs
    filesystem_fs_types_exclude: "^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"
    # drop extensive scrape statistics
    metric_relabel_configs:
    - action: drop
      regex: node_scrape_collector_.+
      source_labels: [__name__]
    relabel_configs:
    - replacement: '<your-instance-name>'
      target_label: instance
logs:
  configs:
  - clients:
    - basic_auth:
        password: <your_loki_pass>
        username: <your_loki_user>
      url: <your_loki_url>
    name: integrations
    positions:
      filename: /tmp/positions.yaml
    scrape_configs:
    # Add here any snippet that belongs to the `logs.configs.scrape_configs` section.
    # For a correct indentation, paste snippets copied from Grafana Cloud at the beginning of the line.
    - job_name: integrations/node_exporter_journal_scrape
      journal:
        max_age: 24h
        labels:
          instance: '<your-instance-name>'
          job: integrations/node_exporter
      relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'
      - source_labels: ['__journal__boot_id']
        target_label: 'boot_id'
      - source_labels: ['__journal__transport']
        target_label: 'transport'
      - source_labels: ['__journal_priority_keyword']
        target_label: 'level'
    - job_name: integrations/node_exporter_direct_scrape
      static_configs:
      - targets:
        - localhost
        labels:
          instance: '<your-instance-name>'
          __path__: /var/log/{syslog,messages,*.log}
          job: integrations/node_exporter
metrics:
  configs:
  - name: integrations
    remote_write:
    - basic_auth:
        password: <your_prom_pass>
        username: <your_prom_user>
      url: <your_prom_url>
    scrape_configs:
    # Add here any snippet that belongs to the `metrics.configs.scrape_configs` section.
    # For a correct indentation, paste snippets copied from Grafana Cloud at the beginning of the line.
  global:
    scrape_interval: 60s
  wal_directory: /tmp/grafana-agent-wal
```
Dashboards
The Linux Server integration installs the following dashboards in your Grafana Cloud instance to help monitor your system.
- Linux node / CPU and system
- Linux node / filesystem and disks
- Linux node / fleet overview
- Linux node / logs
- Linux node / memory
- Linux node / network
- Linux node / overview
Node overview dashboard
Fleet overview dashboard
Drill down dashboards: Network interfaces
Alerts
The Linux Server integration includes the following useful alerts:
- node-exporter-filesystem
- node-exporter
Metrics
The most important metrics provided by the Linux Server integration, which are used on the pre-built dashboards and Prometheus alerts, are as follows:
- node_arp_entries
- node_boot_time_seconds
- node_context_switches_total
- node_cpu_seconds_total
- node_disk_io_time_seconds_total
- node_disk_io_time_weighted_seconds_total
- node_disk_read_bytes_total
- node_disk_read_time_seconds_total
- node_disk_reads_completed_total
- node_disk_write_time_seconds_total
- node_disk_writes_completed_total
- node_disk_written_bytes_total
- node_filefd_allocated
- node_filefd_maximum
- node_filesystem_avail_bytes
- node_filesystem_device_error
- node_filesystem_files
- node_filesystem_files_free
- node_filesystem_readonly
- node_filesystem_size_bytes
- node_intr_total
- node_load1
- node_load15
- node_load5
- node_md_disks
- node_md_disks_required
- node_memory_Active_anon_bytes
- node_memory_Active_bytes
- node_memory_Active_file_bytes
- node_memory_AnonHugePages_bytes
- node_memory_AnonPages_bytes
- node_memory_Bounce_bytes
- node_memory_Buffers_bytes
- node_memory_Cached_bytes
- node_memory_CommitLimit_bytes
- node_memory_Committed_AS_bytes
- node_memory_DirectMap1G_bytes
- node_memory_DirectMap2M_bytes
- node_memory_DirectMap4k_bytes
- node_memory_Dirty_bytes
- node_memory_HugePages_Free
- node_memory_HugePages_Rsvd
- node_memory_HugePages_Surp
- node_memory_HugePages_Total
- node_memory_Hugepagesize_bytes
- node_memory_Inactive_anon_bytes
- node_memory_Inactive_bytes
- node_memory_Inactive_file_bytes
- node_memory_Mapped_bytes
- node_memory_MemAvailable_bytes
- node_memory_MemFree_bytes
- node_memory_MemTotal_bytes
- node_memory_SReclaimable_bytes
- node_memory_SUnreclaim_bytes
- node_memory_ShmemHugePages_bytes
- node_memory_ShmemPmdMapped_bytes
- node_memory_Shmem_bytes
- node_memory_Slab_bytes
- node_memory_SwapTotal_bytes
- node_memory_VmallocChunk_bytes
- node_memory_VmallocTotal_bytes
- node_memory_VmallocUsed_bytes
- node_memory_WritebackTmp_bytes
- node_memory_Writeback_bytes
- node_netstat_Icmp6_InErrors
- node_netstat_Icmp6_InMsgs
- node_netstat_Icmp6_OutMsgs
- node_netstat_Icmp_InErrors
- node_netstat_Icmp_InMsgs
- node_netstat_Icmp_OutMsgs
- node_netstat_IpExt_InOctets
- node_netstat_IpExt_OutOctets
- node_netstat_TcpExt_ListenDrops
- node_netstat_TcpExt_ListenOverflows
- node_netstat_TcpExt_TCPSynRetrans
- node_netstat_Tcp_InErrs
- node_netstat_Tcp_InSegs
- node_netstat_Tcp_OutRsts
- node_netstat_Tcp_OutSegs
- node_netstat_Tcp_RetransSegs
- node_netstat_Udp6_InDatagrams
- node_netstat_Udp6_InErrors
- node_netstat_Udp6_NoPorts
- node_netstat_Udp6_OutDatagrams
- node_netstat_Udp6_RcvbufErrors
- node_netstat_Udp6_SndbufErrors
- node_netstat_UdpLite_InErrors
- node_netstat_Udp_InDatagrams
- node_netstat_Udp_InErrors
- node_netstat_Udp_NoPorts
- node_netstat_Udp_OutDatagrams
- node_netstat_Udp_RcvbufErrors
- node_netstat_Udp_SndbufErrors
- node_network_carrier
- node_network_info
- node_network_mtu_bytes
- node_network_receive_bytes_total
- node_network_receive_compressed_total
- node_network_receive_drop_total
- node_network_receive_errs_total
- node_network_receive_fifo_total
- node_network_receive_multicast_total
- node_network_receive_packets_total
- node_network_speed_bytes
- node_network_transmit_bytes_total
- node_network_transmit_compressed_total
- node_network_transmit_drop_total
- node_network_transmit_errs_total
- node_network_transmit_fifo_total
- node_network_transmit_multicast_total
- node_network_transmit_packets_total
- node_network_transmit_queue_length
- node_network_up
- node_nf_conntrack_entries
- node_nf_conntrack_entries_limit
- node_os_info
- node_procs_running
- node_sockstat_FRAG6_inuse
- node_sockstat_FRAG_inuse
- node_sockstat_RAW6_inuse
- node_sockstat_RAW_inuse
- node_sockstat_TCP6_inuse
- node_sockstat_TCP_alloc
- node_sockstat_TCP_inuse
- node_sockstat_TCP_mem
- node_sockstat_TCP_mem_bytes
- node_sockstat_TCP_orphan
- node_sockstat_TCP_tw
- node_sockstat_UDP6_inuse
- node_sockstat_UDPLITE6_inuse
- node_sockstat_UDPLITE_inuse
- node_sockstat_UDP_inuse
- node_sockstat_UDP_mem
- node_sockstat_UDP_mem_bytes
- node_sockstat_sockets_used
- node_softnet_dropped_total
- node_softnet_processed_total
- node_softnet_times_squeezed_total
- node_systemd_service_restart_total
- node_systemd_unit_state
- node_textfile_scrape_error
- node_time_zone_offset_seconds
- node_timex_estimated_error_seconds
- node_timex_maxerror_seconds
- node_timex_offset_seconds
- node_timex_sync_status
- node_uname_info
- node_vmstat_oom_kill
- node_vmstat_pgfault
- node_vmstat_pgmajfault
- node_vmstat_pgpgin
- node_vmstat_pgpgout
- node_vmstat_pswpin
- node_vmstat_pswpout
- process_max_fds
- process_open_fds
- up
Changelog
Cost
By connecting your Linux Server instance to Grafana Cloud, you might incur charges. To view information on the number of active series that your Grafana Cloud account uses for metrics included in each Cloud tier, see Active series and dpm usage and Cloud tier pricing.