Menu

This is documentation for the next version of Grafana. For the latest stable release, go to the latest version.

Grafana Cloud Enterprise Open source

Group alert notifications

Grouping in Grafana Alerting allows you to batch relevant alerts together into a smaller number of notifications. This is particularly important if notifications are delivered to first-responders, such as engineers on-call, where receiving lots of notifications in a short period of time can be overwhelming. In some cases, it can negatively impact a first-responders ability to respond to an incident. For example, consider a large outage where many of your systems are down. In this case, grouping can be the difference between receiving 1 phone call and 100 phone calls.

Group notifications

Grouping combines similar alert instances within a specific period into a single notification, reducing alert noise.

In the notification policy, you can configure how to group multiple alerts into a single notification:

  • The Group by option specifies the criteria for grouping incoming alerts within the policy. The default is by alert rule.
  • Timing options determine when and how often to send the notification.
A diagram about the components of a notification policy, including labels and groups

Alert instances are grouped together if they have the same exact label values for the labels configured in the Group by option.

For example, given the Group by option set to the team label:

  • alertname:foo, team=frontend, and alertname:bar, team=frontend are in one group.
  • alertname:foo, team=backend, and alertname:qux, team=backend are in another group.

Group by alert rule or labels

By default, notification policies in Grafana group alerts by the alert rule. Specifically, they are grouped using the alertname and grafana_folder labels, as alert rule names are not unique across folders.

If you want to group alerts by other labels, something other than the alert rule, change the Group by option to any other combination of labels.

A single group for all alerts

If you want to group all alerts handled by the notification policy in a single group (without grouping notifications by alert rule or other labels), you can do so by leaving Group by empty.

Disable grouping

If you want to receive every alert as a separate notification, you can do so by grouping by a special label called ..., ensuring that other labels are not present.

Timing options

In the notification policy, you can also configure how often notifications are sent for each group of alerts. There are three distinct timers applied to groups within the notification policy:

  • Group wait: the time to wait before sending the first notification for a new group of alerts.
  • Group interval: the time to wait before sending a notification about changes in the alert group.
  • Repeat interval: the time to wait before sending a notification if the group has not changed since the last notification.

These timers reduce the number of notifications sent. By delaying the delivery of notifications, incoming alerts can be grouped into just one notification instead of many.

A basic sequence diagram of the the notification policy timers
A basic sequence diagram of the notification policy timers

Group wait

Default: 30 seconds

Group wait is the duration Grafana waits before sending the first notification for a new group of alerts.

The longer the group wait, the more time other alerts have to be included in the initial notification of the new group. The shorter the group wait, the earlier the first notification is sent, but at the risk of not including some alerts.

Example

Consider a notification policy that:

  • Matches all alert instances with the team label—matching labels equals to team=~.+.
  • Groups notifications by the team label—one group for each distinct team.
  • Sets the Group wait timer to 30s.
TimeIncoming alert instanceNotification policy groupNumber of instances
00:00alert_name=f1 team=frontendfrontend1Starts the group wait timer of the frontend group.
00:10alert_name=f2 team=frontendfrontend2
00:20alert_name=b1 team=backendbackend1Starts the group wait timer of the backend group.
00:30*frontend2Group wait elapsed.
Send initial notification reporting 2 alerts.
00:35alert_name=b2 team=backendbackend2
00:40alert_name=b3 team=backendbackend3
00:50*backend3Group wait elapsed.
Send initial notification reporting 3 alerts.

Group interval

Default: 5 minutes

If an alert was too late to be included in the first notification due to group wait, it is included in subsequent notifications after group interval.

Group interval is the duration to wait before sending notifications about group changes. For instance, a group change may be adding a new firing alert to the group, or resolving an existing alert.

Example

Here are the related excerpts from the previous example:

TimeIncoming alert instanceNotification policy groupNumber of instances
00:30*frontend2Group wait elapsed and starts Group interval timer.
Send initial notification reporting 2 alerts.
00:50*backend3Group wait elapsed and starts Group interval timer.
Send initial notification reporting 3 alerts.

And below is the continuation of the example setting the Group interval timer to 5 minutes:

TimeIncoming alert instanceNotification policy groupNumber of instances
01:30alert_name=f3 team=frontendfrontend3
02:30alert_name=f4 team=frontendfrontend4
05:30*frontend4Group interval elapsed and resets timer.
Send one notification reporting 4 alerts.
05:50*backend3Group interval elapsed and resets timer.
No group changes, and do not send notification.
08:00alert_name=f4 team=backendbackend4
10:30*frontend4Group interval elapsed and resets timer.
No group changes, and do not send notification.
10:50*backend4Group interval elapsed and resets timer.
Send one notification reporting 4 alerts.

How it works

Once the first notification has been sent for a new group of alerts, the group interval timer starts.

When the group interval timer elapses, the system resets the group interval timer and sends a notification only if there were group changes. This process repeats until there are no more alerts.

It’s important to note that an alert instance exits the group after being resolved and notified of its state change. When no alerts remain, the group is deleted, and then the group wait timer handles the first notification for the next incoming alert once again.

Repeat interval

Default: 4 hours

Repeat interval acts as a reminder that alerts in the group are still firing.

The repeat interval timer decides how often notifications are sent (or repeated) if the group has not changed since the last notification.

How it works

Repeat interval is evaluated every time the group interval resets. If the alert group has not changed and the time since the last notification was longer than the repeat interval, then a notification is sent as a reminder that the alerts are still firing.

Repeat interval must not only be greater than or equal to group interval, but also must be a multiple of Group interval. If Repeat interval is not a multiple of group interval it is coerced into one. For example, if your Group interval is 5 minutes, and your Repeat interval is 9 minutes, the Repeat interval is rounded up to the nearest multiple of 5 which is 10 minutes.

Example

Here are the related excerpts from the previous example:

TimeIncoming alert instanceNotification policy groupNumber of instances
05:30*frontend4Group interval resets.
Send the last notification.
10:50*backend4Group interval resets.
Send the last notification.

And below is the continuation of the example setting the Repeat interval timer to 4 hours:

TimeIncoming alert instanceNotification policy groupNumber of instances
04:05:30frontend4Group interval resets. The time since the last notification was no longer than the repeat interval.
04:10:30frontend4Group interval resets. The time since the last notification was longer than the repeat interval.
Send one notification reminding the 4 firing alerts.
04:10:50backend4Group interval resets. The time since the last notification was no longer than the repeat interval.
04:15:50backend4Group interval resets. The time since the last notification was longer than the repeat interval.
Send one notification reminding the 4 firing alerts.