Grafana Plugin Tutorial: Polystat Panel (Part 1)

Published: 2 Apr 2019 by Brian Gann

Polystat

The grafana-polystat-panel plugin was created to provide a way to roll up multiple metrics and implement flexible drilldowns to other dashboards.

This example will focus on creating a panel for Cassandra using real data from Prometheus collected from our Kubernetes clusters. We’ll focus on the basic metrics for CPU/Memory/Disk coming from cAdvisor, but a well-instrumented service will have many metrics that indicate overall health, such as requests per second, error rates, and more.

This panel allows you to group these metrics together into an overall health status, which can be used to drill down to more detailed dashboards. For this Cassandra example, the end result will look like this:

panel goal

The Basics

CPU, memory, and disk utilization give us enough metrics to demonstrate the idea behind compositing metrics and displaying them in Grafana. The PromQL queries below are simple and can be adapted with template variables to make the panel more “general purpose.” We’ll start with basic queries and refine them as we go.

CPU

container_cpu_usage_seconds_total{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra"}

The above query with polystat will show a large number of polygons (one per metric):

all cassandra pods

There are quite a number of pods displayed (we have multiple Cassandra clusters), so we will narrow this down to just a single cluster:

container_cpu_usage_seconds_total{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"}

The names still don’t show up since they are very long (hint: tooltips will show them). Adding {{pod_name}} to the Legend field will result in a better display:

all cassandra pods with legend

Result:

all cassandra pods with legend result

The query needs a little more work: the metric is a counter, so its raw value only ever grows. We’ll wrap it in irate to get instantaneous per-second values.

irate(container_cpu_usage_seconds_total{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"}[1m])

all cassandra pods cpu rate
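To make the irate step more concrete, here is a minimal Python sketch of the arithmetic: take the last two samples in the range window and divide the increase by the time between them. The sample data is made up for illustration, and the counter-reset handling reflects our understanding of Prometheus' behavior rather than a copy of its implementation.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples of a counter."""
    if len(samples) < 2:
        return None  # not enough samples in the range window
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if t_last <= t_prev:
        return None  # timestamps must be increasing
    # If the counter went down, assume it was reset and count up from zero.
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / (t_last - t_prev)


# Hypothetical CPU-seconds counter scraped every 15 seconds.
cpu_seconds = [(0, 100.0), (15, 103.0), (30, 110.5), (45, 116.5)]
print(irate(cpu_seconds))  # CPU cores in use between the last two scrapes
```

Only the last two samples matter, which is why irate reacts quickly to spikes but can look noisy on a graph.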

Disk

While CPU is interesting, disk space is usually what runs out first in Cassandra, so we’ll add these queries to show disk usage, the disk limit, and the remaining free space (limit minus usage):

container_fs_usage_bytes{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"}
container_fs_limit_bytes{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"}
container_fs_limit_bytes{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"} - container_fs_usage_bytes{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"}
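The third query subtracts one set of series from another. PromQL pairs up series whose labels match and drops any series without a partner. A rough Python sketch of that behavior, using made-up pod names and byte counts and keying on the pod name alone for simplicity:

```python
def free_bytes(limit_by_pod, usage_by_pod):
    """Subtract usage from limit for every pod present on both sides."""
    return {
        pod: limit_by_pod[pod] - usage_by_pod[pod]
        for pod in limit_by_pod
        if pod in usage_by_pod  # series without a partner are dropped
    }


def percent_used(limit_by_pod, usage_by_pod):
    """Disk usage as a percentage of the limit, per pod."""
    return {
        pod: 100.0 * usage_by_pod[pod] / limit_by_pod[pod]
        for pod in limit_by_pod
        if pod in usage_by_pod and limit_by_pod[pod] > 0
    }


# Hypothetical values, not real cluster data.
limits = {"cassandra-sfs-0": 1000, "cassandra-sfs-1": 1000}
usage = {"cassandra-sfs-0": 750, "cassandra-sfs-2": 100}

print(free_bytes(limits, usage))
print(percent_used(limits, usage))
```

A percentage like the second helper produces is often easier to threshold than raw bytes, since the same limits then apply to volumes of any size.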

Memory

To complete our stats, add this memory query:

container_memory_usage_bytes{namespace="metrictank", pod_name=~"cassandra-sfs-.*", container_name="cassandra", cluster="ops-tools1"}

We can now see the result:

all cassandra pods all stats

Formatting

The stats themselves have “short” as the value type in Grafana. Switching to the options of polystat, we can adjust them to something more meaningful:

cpu overrides

disk overrides

memory overrides

Thresholding

The next step is to create thresholds for each of the metrics. In the thresholds section, add a new threshold, set the name to match a metric, and configure as needed. This example sets a 60% warning and 80% critical for CPU utilization.

cpu threshold setting

The panel will now look like this:

cpu threshold result
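Conceptually, the threshold above maps a value to a state. The sketch below shows that mapping in Python using the 60% and 80% limits from this example. The state names and the choice of "greater than or equal" comparisons are assumptions made for illustration, not the plugin's actual code.

```python
def threshold_state(value, warning=60.0, critical=80.0):
    """Map a utilization percentage to a state name."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


for cpu_percent in (35.0, 60.0, 92.5):
    print(cpu_percent, threshold_state(cpu_percent))
```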

Composites

Now that we have basic metrics and thresholds, we can create composites. (NOTE: The composite being created here is for a single node to keep everything simple, but the final result will have all nodes displayed.)

Composites allow you to group multiple metrics together and display a single item with the threshold state reflected. The polygon is given the color of the “worst” state. The tooltip will show individual states, sorted by worst to best.
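The "worst state wins" rule and the tooltip ordering can be sketched in a few lines of Python. The state names and their severity ranking here are assumptions for the sake of the example.

```python
# Assumed severity ranking: higher number means worse.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def composite_state(member_states):
    """The composite takes the worst state among its members."""
    return max(member_states.values(), key=SEVERITY.__getitem__)


def tooltip_order(member_states):
    """Members sorted from worst to best, as a tooltip would list them."""
    return sorted(
        member_states.items(),
        key=lambda item: SEVERITY[item[1]],
        reverse=True,
    )


members = {"CPU": "ok", "Memory": "warning", "Disk": "critical"}
print(composite_state(members))
print(tooltip_order(members))
```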

To create a composite, click Add in the Composites section:

composites

This will create a new composite named “Cassandra” and will include all metrics that match CPU/Memory/Disk.

composite cassandra

With the composite in place, the polystat shows a single polygon that represents three different metrics and animates to cycle through the value of each one.

animated

Clickthroughs

There are three levels of clickthroughs provided by this panel.

  1. Default clickthrough
  2. Override clickthrough
  3. Composite clickthrough

The order of precedence is most-specific to least-specific (3, 2, 1).
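The precedence rule amounts to "use the most specific clickthrough that is set." A minimal Python sketch of that lookup follows. The function and parameter names are hypothetical, and the composite URL in the usage lines is made up.

```python
def resolve_clickthrough(default_url=None, override_url=None, composite_url=None):
    """Return the most specific clickthrough that is defined, or None."""
    # Checked in order of precedence: composite (3), override (2), default (1).
    for url in (composite_url, override_url, default_url):
        if url:
            return url
    return None


print(resolve_clickthrough(default_url="dashboard/db/cassandra"))
print(resolve_clickthrough(
    default_url="dashboard/db/cassandra",
    composite_url="dashboard/db/cassandra-node",  # hypothetical dashboard
))
```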

Default Clickthrough

You can set a clickthrough to be used globally when there are no override or composite clickthroughs defined for a polygon.

In this example, the clickthrough is set to:

dashboard/db/cassandra

clickthrough default

Clicking on the polygon will take you to the Cassandra dashboard on the same Grafana server. The clickthrough can be any valid URL.

The plugin also includes parameters that can be passed to other dashboards.

dashboard/db/cassandra?var-environment=$Cluster&var-instance=All

clickthrough default

Additional variables can be passed; see the plugin documentation for details: https://github.com/grafana/grafana-polystat-panel#single-metric-variables.
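Grafana reads dashboard variables from URL parameters prefixed with var-, which is what the clickthrough above relies on. As an illustration only, this Python sketch builds the same kind of URL from a dictionary. The helper name is hypothetical, and the cluster value stands in for whatever $Cluster resolves to.

```python
from urllib.parse import urlencode


def dashboard_url(path, variables):
    """Append Grafana-style var-<name>=<value> parameters to a dashboard path."""
    params = {f"var-{name}": value for name, value in variables.items()}
    return f"{path}?{urlencode(params)}"


url = dashboard_url(
    "dashboard/db/cassandra",
    {"environment": "ops-tools1", "instance": "All"},
)
print(url)
```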

Override Clickthrough

In the overrides section, you can specify a clickthrough that applies only to that override. This is mainly useful when you are not using composites.

Setting the clickthrough for CPU to be…

dashboard/cpu?var-node=${__cell_name}

…will take you to a dashboard named “CPU” and pass the value of the clicked polygon.
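To picture what happens to ${__cell_name}, here is a small Python sketch that swaps the placeholder for the name of the clicked polygon. The pod name is an example value, and whether the plugin URL-encodes the name the way this sketch does is an assumption.

```python
from urllib.parse import quote


def expand_clickthrough(template, cell_name):
    """Replace the ${__cell_name} placeholder with the clicked polygon's name."""
    return template.replace("${__cell_name}", quote(cell_name, safe=""))


expanded = expand_clickthrough("dashboard/cpu?var-node=${__cell_name}", "cassandra-sfs-0")
print(expanded)
```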

Composite Clickthrough

The third type of clickthrough is used to specify where to go when a composited polygon is clicked. The implementation is the same as above.

Composites have another set of variables that can be passed to clickthroughs. See: https://github.com/grafana/grafana-polystat-panel#composite-metric-variables.

Templating

To keep the example above simple, the names are hardcoded. Leveraging Grafana template variables will make the dashboard more flexible.

The queries use “namespace” and “cluster,” so let’s create those.

Add a template variable to allow selection of different clusters:

templated variables
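Once the variables exist, the hardcoded label values in each query can be swapped for variable references. The Python sketch below imitates that substitution with string.Template, which happens to use the same $name syntax. The variable names $namespace and $cluster are assumptions; use whatever names you gave your template variables.

```python
from string import Template

# The memory query from earlier, with hardcoded values replaced by variables.
MEMORY_QUERY = Template(
    'container_memory_usage_bytes{namespace="$namespace", '
    'pod_name=~"cassandra-sfs-.*", container_name="cassandra", '
    'cluster="$cluster"}'
)


def render(query, **variables):
    """Fill in the template variables, roughly as Grafana does before querying."""
    return query.substitute(**variables)


rendered = render(MEMORY_QUERY, namespace="metrictank", cluster="ops-tools1")
print(rendered)
```

Rendering with the original values reproduces the hardcoded query, so the panel looks the same until a different cluster is selected.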

Almost there

The dashboard will look like this, showing a single node with three different metrics displayed.

dashboard completed

dashboard completed with animation

Wrapping Up

To complete the panel, modify the composites so that each one uses a regex that matches the metrics of a single node.

composite1 composite2
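The effect of per-node regex matching is that every series lands in the composite for its own node. A rough Python sketch of that grouping follows. It assumes the pods are named cassandra-sfs-<number> and that each series name contains the pod name; the series names are made up for the example.

```python
import re
from collections import defaultdict

# Assumed naming scheme: cassandra-sfs-<ordinal>.
NODE_PATTERN = re.compile(r"(cassandra-sfs-\d+)")


def group_by_node(series_names):
    """Group series names by the node they belong to."""
    groups = defaultdict(list)
    for name in series_names:
        match = NODE_PATTERN.search(name)
        if match:  # series that match no node are left out
            groups[match.group(1)].append(name)
    return dict(groups)


series = [
    "cassandra-sfs-0 CPU",
    "cassandra-sfs-0 Disk",
    "cassandra-sfs-1 CPU",
    "unrelated-series",
]
print(group_by_node(series))
```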

After changing the composites, the end result will look like this:

panel goal

About Part 2

Part 2 will detail more composite options and advanced features to make them even easier to create.

If you have created some dashboards already with polystat, we’d love to see them!
