Alerts allow you to identify problems in your system moments after they occur. By quickly identifying unintended changes in your system, you can minimize disruptions to your services.
Alerts consist of two parts:
- Alert rules - When the alert is triggered. Alert rules are defined by one or more conditions that are regularly evaluated by Grafana.
- Notification channel - How the alert is delivered. When the conditions of an alert rule are met, Grafana notifies the channels configured for that alert.
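The relationship between the two parts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Grafana's implementation: the `Condition` and `AlertRule` classes, the "reduce, then compare against a threshold" shape of a condition, and the choice to require all conditions to be met are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Condition:
    """Hypothetical condition: reduce a series to one value and compare it to a threshold."""
    reducer: Callable[[List[float]], float]
    threshold: float

    def is_met(self, series: List[float]) -> bool:
        return self.reducer(series) > self.threshold


@dataclass
class AlertRule:
    """Hypothetical alert rule: one or more conditions plus the channels to notify."""
    name: str
    conditions: List[Condition]
    channels: List[Callable[[str], None]] = field(default_factory=list)

    def evaluate(self, series: List[float]) -> bool:
        # Assumption: the rule fires only when every condition is met.
        firing = all(c.is_met(series) for c in self.conditions)
        if firing:
            # Deliver the alert through each configured channel.
            for notify in self.channels:
                notify(f"[Alerting] {self.name}")
        return firing


# Usage: a list stands in for a real notification channel.
sent: List[str] = []
rule = AlertRule("High CPU", [Condition(max, 90.0)], [sent.append])
rule.evaluate([50.0, 95.0])  # condition met, the channel is notified
rule.evaluate([50.0, 60.0])  # condition not met, nothing is sent
```

In this sketch the rule decides when to alert and the channel decides how the alert is delivered, which mirrors the split described above.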
Currently only the graph panel visualization supports alerts.
You can perform the following tasks for alerts:
- Add or edit an alert notification channel
- Create an alert rule
- View existing alert rules and their current state
- Test alert rules and troubleshoot
Currently, alerting supports a limited form of high availability. Since Grafana v4.2.0, alert notifications are deduplicated when running multiple servers. This means that every server evaluates every alert, but the deduplication logic ensures that no duplicate notifications are sent. Proper load balancing of alerts will be introduced in the future.
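One way to picture this behavior is a shared record that every server consults before sending. The sketch below is an assumption-laden illustration and does not reproduce Grafana's actual deduplication logic: `SharedNotificationLog`, `evaluate_on_server`, and `state_change_id` are hypothetical names, and an in-memory set stands in for storage shared by all servers.

```python
from typing import List, Set, Tuple


class SharedNotificationLog:
    """Hypothetical stand-in for storage shared by all servers, such as a database."""

    def __init__(self) -> None:
        self._claimed: Set[Tuple[int, int]] = set()

    def try_claim(self, rule_id: int, state_change_id: int) -> bool:
        # In a real shared store this check-and-insert would have to be atomic.
        key = (rule_id, state_change_id)
        if key in self._claimed:
            return False
        self._claimed.add(key)
        return True


def evaluate_on_server(server: str, rule_id: int, state_change_id: int,
                       log: SharedNotificationLog,
                       outbox: List[Tuple[str, int]]) -> None:
    # Every server evaluates the alert, but only the first one to claim
    # this state change actually sends the notification.
    if log.try_claim(rule_id, state_change_id):
        outbox.append((server, rule_id))


# Usage: three servers evaluate the same alert state change.
log = SharedNotificationLog()
outbox: List[Tuple[str, int]] = []
for server in ["grafana-a", "grafana-b", "grafana-c"]:
    evaluate_on_server(server, rule_id=7, state_change_id=1, log=log, outbox=outbox)
```

Note that the evaluation work is still repeated on every server here; only the notification is suppressed, which is why this counts as a limited form of high availability rather than load balancing.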
You can also add notifications to an alert rule, along with a detailed message about the rule. The message can contain anything: information about how you might solve the issue, a link to a runbook, and so on.
The notification channels themselves are configured separately and can be shared between multiple alerts.
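As a rough sketch of this arrangement, the example below defines a channel once and lets several rules reference it by identifier while each rule keeps its own message. All field names, identifiers, addresses, and URLs are hypothetical placeholders chosen for illustration; they are not Grafana's configuration schema.

```python
from typing import Any, Dict, List

# Channels are configured once, keyed by a hypothetical identifier.
channels: Dict[str, Dict[str, str]] = {
    "ops-email": {"type": "email", "addresses": "ops@example.com"},
}

# Each rule carries its own message and references channels by identifier.
rules: List[Dict[str, Any]] = [
    {
        "name": "High CPU",
        "message": "CPU is above 90%. Runbook: https://example.com/runbooks/cpu",
        "notifications": ["ops-email"],
    },
    {
        "name": "Disk almost full",
        "message": "Free space is low. Runbook: https://example.com/runbooks/disk",
        "notifications": ["ops-email"],
    },
]


def build_notifications(rule: Dict[str, Any],
                        channels: Dict[str, Dict[str, str]]) -> List[Dict[str, Any]]:
    """Pair the rule's message with each channel it references."""
    return [
        {"channel": channels[channel_id], "title": rule["name"], "body": rule["message"]}
        for channel_id in rule["notifications"]
    ]


# Usage: both rules resolve to the same shared channel definition.
cpu_notifications = build_notifications(rules[0], channels)
disk_notifications = build_notifications(rules[1], channels)
```

Keeping the channel definition separate means a change to the delivery settings applies to every rule that references the channel.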
Alert rules are evaluated in the Grafana backend in a scheduler and query execution engine that is part of core Grafana. Only some data sources are supported right now. They include:
- Google Cloud Monitoring
- Azure Data Explorer
Metrics from the alert engine
The alert engine publishes some internal metrics about itself. You can read more about how Grafana publishes internal metrics.
| Description | Type |
| --- | --- |
| Total number of alerts | counter |
| Alert execution result | counter |
| Notifications sent counter | counter |
| Alert execution timer | timer |