Outlier detection

If you’re watching average CPU across a 20-node cluster:

  • One node at 95% barely moves the cluster average, so nothing looks wrong.
  • An alert doesn’t fire, so you don’t know until it affects users.

When you’re trying to determine an instance that is behaving differently from its peers, a fixed threshold can’t help.

Outlier detection solves these kinds of issues.

  • You define the group of instances that should behave alike.
  • Outlier detection flags any that drift from the rest.
Several peer instances tracking together while one instance drifts away and is flagged as an outlier
Try it out in Grafana Play

Script

Outlier detection compares the members of a group and flags any that behave differently from the rest. For example, it can flag one instance in a cluster that uses much more CPU than its peers. You set a baseline group, and the feature highlights the odd one out.