Alert rule evaluation
The criteria determining when an alert rule fires are based on two settings:
- Evaluation group: how frequently the alert rule is evaluated.
- Pending period: how long the condition must be met to start firing.
Evaluation group
Every alert rule is assigned to an evaluation group. You can assign the alert rule to an existing evaluation group or create a new one.
Each evaluation group contains an evaluation interval that determines how frequently the alert rule is checked. For instance, the evaluation may occur every 10s, 30s, 1m, 10m, etc.
Evaluation strategies
Alert rules in different groups can be evaluated simultaneously.
Grafana-managed alert rules within the same group are evaluated concurrently—they are evaluated at different times over the same evaluation interval but display the same evaluation timestamp.
Data-source managed alert rules within the same group are evaluated sequentially, one after the other—this is necessary to ensure that recording rules are evaluated before alert rules.
Pending period
You can set a pending period to prevent unnecessary alerts from temporary issues.
The pending period specifies how long the condition must be met before firing, ensuring the condition is consistently met over a consecutive period.
You can also set the pending period to zero to skip it and have the alert fire immediately once the condition is met.
Condition operator
There are several condition operators available.
- and: Both the condition before and the condition after must be true for the overall condition to be true.
- or: If either the condition before or the condition after is true, the overall condition is true.
- logic-or: If the condition before logic-or is true, the overall condition is immediately true, without evaluating subsequent conditions.
Here are some examples of operators.
- TRUE and TRUE or FALSE and FALSE evaluates to FALSE, because the last two conditions return FALSE.
- TRUE and TRUE logic-or FALSE and FALSE evaluates to TRUE, because the preceding condition returns TRUE.
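The operator behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not Grafana's implementation: the function name evaluate_conditions is hypothetical, conditions are assumed to be combined strictly left to right, and the handling of logic-or when the preceding result is false (the result simply becomes the next condition's value) is an assumption.

```python
from typing import List, Tuple


def evaluate_conditions(first: bool, rest: List[Tuple[str, bool]]) -> bool:
    """Combine conditions left to right (hypothetical helper, assumed semantics)."""
    result = first
    for operator, value in rest:
        if operator == "and":
            # Both the running result and the next condition must be true.
            result = result and value
        elif operator == "or":
            # Either the running result or the next condition may be true.
            result = result or value
        elif operator == "logic-or":
            # If everything before logic-or is true, stop immediately.
            if result:
                return True
            # Assumption: otherwise continue with the next condition's value.
            result = value
        else:
            raise ValueError(f"unknown operator: {operator}")
    return result


# TRUE and TRUE or FALSE and FALSE
example_or = evaluate_conditions(True, [("and", True), ("or", False), ("and", False)])

# TRUE and TRUE logic-or FALSE and FALSE
example_logic_or = evaluate_conditions(
    True, [("and", True), ("logic-or", False), ("and", False)]
)

print(example_or)        # False
print(example_logic_or)  # True
```

Under these assumptions, the only difference between the two expressions is that logic-or stops evaluation as soon as the preceding result is true, so the trailing "and FALSE" never applies.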
Evaluation example
Keep in mind:
- One alert rule can generate multiple alert instances - one for each time series produced by the alert rule’s query.
- Alert instances from the same alert rule may be in different states. For instance, only one observed machine might start firing.
- Only alert instances in the Alerting or Resolved state are routed for notification handling.
Consider an alert rule with an evaluation interval set at every 30 seconds and a pending period of 90 seconds. The evaluation occurs as follows:
Time | Condition | Alert instance state | Pending counter |
---|---|---|---|
00:30 (first evaluation) | Not met | Normal | - |
01:00 (second evaluation) | Breached | Pending | 0s |
01:30 (third evaluation) | Breached | Pending | 30s |
02:00 (fourth evaluation) | Breached | Pending | 60s |
02:30 (fifth evaluation) | Breached | Alerting* | 90s |
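The timeline above can be reproduced with a small simulation. This is a simplified sketch, not Grafana's scheduler: the function name simulate_pending is hypothetical, and it assumes that the pending counter starts at 0 on the first breached evaluation, grows by one evaluation interval on each consecutive breach, and resets when the condition is not met.

```python
from typing import List, Optional, Tuple


def simulate_pending(
    breached_per_evaluation: List[bool],
    interval_s: int,
    pending_period_s: int,
) -> List[Tuple[str, Optional[int]]]:
    """Return (state, pending counter in seconds) for each evaluation."""
    rows: List[Tuple[str, Optional[int]]] = []
    pending_elapsed: Optional[int] = None  # None while the condition is not breached
    for breached in breached_per_evaluation:
        if not breached:
            # Assumption: a non-breached evaluation resets the counter.
            pending_elapsed = None
            rows.append(("Normal", None))
            continue
        if pending_elapsed is None:
            pending_elapsed = 0  # first breached evaluation starts the counter
        else:
            pending_elapsed += interval_s
        # The instance fires once the condition has held for the full pending period.
        state = "Alerting" if pending_elapsed >= pending_period_s else "Pending"
        rows.append((state, pending_elapsed))
    return rows


# Evaluation interval of 30 seconds, pending period of 90 seconds.
timeline = simulate_pending([False, True, True, True, True], 30, 90)
for state, counter in timeline:
    print(state, "-" if counter is None else f"{counter}s")
```

With a pending period of 0, the same function returns Alerting on the first breached evaluation, which matches the behavior described for a zero pending period.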
An alert instance is resolved when it transitions from the Firing to the Normal state. For instance, in the previous example:
Time | Condition | Alert instance state | Pending counter |
---|---|---|---|
03:00 (sixth evaluation) | Not met | Normal Resolved * | 120s |
03:30 (seventh evaluation) | Not met | Normal | 150s |
To learn more about the state changes of alert rules and alert instances, refer to State and health of alert rules.