Telegraf: system dashboard
InfluxDB dashboards for telegraf metrics
Templated dashboard for telegraf + influxdb.
Similar to the basic https://grafana.net/dashboards/914, but with templating, repeating panels/rows, etc.
Started as a “learn influxdb/telegraf” project; it ended up as something I use daily.
Variables (in addition to the standard server / datasource / interval):
- CPUs (defaults to all)
- Disks (per-disk IOPS)
- Network interfaces (packets, bandwidth, errors/drops)
- Mountpoints (space / inodes)
- Detailed network stack info: the nstat plugin lets us grab raw SNMP counters, e.g.:
- TCP handshakes data
- TCP aborts data
- ICMP errors, ICMP data
- SYN data
- TCP errors (retransmissions/etc)
- IPv4 errors
- IPv6 errors
- Conntrack data
- File descriptors
- UDP data
…and basically every “generic” metric you can extract from an ordinary Linux system.
By default all variables are set to “All”, so the dashboard can become huge if you have a large number of disks or network interfaces.
So far I have tested it on a machine with 46 disks and 8 interfaces; it loaded correctly, but pretty slowly (the poor browser barely handled all that data).
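As a rough guide to which collectors the panels expect, here is a minimal telegraf.conf sketch. The plugin names (cpu, disk, diskio, net, nstat, conntrack, linux_sysctl_fs) are real Telegraf inputs, but the exact set and options this dashboard assumes may differ; treat this as a starting point, not the canonical config.

```toml
# Minimal sketch of the inputs this dashboard draws from (assumption, not the exact config)
[[inputs.cpu]]
  percpu = true      # needed for the per-CPU template variable
  totalcpu = true

[[inputs.disk]]      # mountpoint space / inodes
[[inputs.diskio]]    # per-disk IOPS
[[inputs.net]]       # per-interface packets, bandwidth, errors/drops
[[inputs.nstat]]     # raw SNMP/netstat counters (TCP handshakes, aborts, ICMP, retransmits, UDP, …)
[[inputs.conntrack]] # conntrack table usage
[[inputs.linux_sysctl_fs]]  # file descriptor counts
[[inputs.mem]]
[[inputs.system]]
```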
Known issues / kludges:
Docker “veth” interfaces are blacklisted via a template-variable regexp. Docker creates heaps of them with names like “veth%container_id%”, and they show up in the selector even if they died months ago, so information about those interfaces is practically useless.
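The blacklist boils down to a negative-lookahead regex on the interface-name variable. The exact pattern in the dashboard may differ; a sketch of the idea (here exercised with Python's `re`, though Grafana itself evaluates JS-style regexes):

```python
import re

# Hypothetical version of the template-variable regex: keep any interface
# whose name does NOT start with "veth" (Docker's autogenerated pairs).
VETH_BLACKLIST = re.compile(r"^(?!veth)")

interfaces = ["eth0", "veth1a2b3c4", "ens3", "vethdeadbee", "lo"]
kept = [name for name in interfaces if VETH_BLACKLIST.match(name)]
# kept == ["eth0", "ens3", "lo"]
```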
Disk IO is only displayed for whole disks like /dev/sda, /dev/hda, /dev/vda. Per-partition IOPS produces far too many graphs, but if you really want them you can change the regexp in the “disk” template variable. Also, I'm not sure about drbd and other “virtual” block devices.
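A whole-disk filter of this kind (including the nvme support from the changelog below) can be sketched as a single anchored regex. The pattern here is an illustration, not necessarily the one shipped in the dashboard:

```python
import re

# Hypothetical "disk" template-variable regex: match whole devices
# (sda, hda, vda, nvme0n1) but not partitions (sda1, nvme0n1p2).
WHOLE_DISK = re.compile(r"^([shv]d[a-z]+|nvme\d+n\d+)$")

names = ["sda", "sda1", "vdb", "nvme0n1", "nvme0n1p2", "dm-0"]
kept = [n for n in names if WHOLE_DISK.match(n)]
# kept == ["sda", "vdb", "nvme0n1"]
```

Virtual devices such as drbd would need their own alternation added to the pattern.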
Split IPv4/IPv6 data (I don't own any IPv6 networks, so this is not a high priority).
- 7 aug 2019: Support nvme disks (/dev/nvmeXnY)