Meta monitoring for Cloud

Meta monitoring is the process of monitoring your monitoring system (or alerting system).

Monitor your alerting implementation to understand its health, detect potential issues, and troubleshoot them.

Grafana provides predefined metrics and logs to enable you to meta monitor Grafana Alerting. You can monitor this data in different ways, such as:

  • Create a Grafana dashboard with panels that use these metrics, similar to Alerting insights.
  • Create an alert rule in Grafana that checks a metric regularly, just like any other alert rule. See the example query after this list.
  • Use Explore to query the metrics or logs.
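
For example, a meta-monitoring alert rule could fire whenever alert rule evaluations fail. The following query is a sketch that uses the evaluation-failure metric described later in this guide; replace the placeholder with your Grafana stack identifier:

query-example
grafanacloud_grafana_instance_alerting_rule_evaluation_failures_total:rate5m{id="<your_grafana_stack_id>"} > 0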

Before you begin

To explore your alerting metrics and logs, you must:

  • Have Admin or Editor user permissions for the managed Grafana Cloud instance.
  • Log in to your instance and click the Explore (compass) icon in the sidebar menu.

Explore insights metrics

Alerting meta-monitoring metrics are stored in Prometheus data sources, which are part of your Grafana Cloud stack and are accessible from your Grafana instance.

Note

A single Grafana Cloud account can run multiple Grafana Cloud stacks, all using the same grafanacloud-usage data source. When querying meta-monitoring metrics in the grafanacloud-usage data source, filter by your Grafana stack identifier (id).

For Grafana-managed alerts

Available in the grafanacloud-usage Prometheus data source.

grafanacloud_grafana_instance_alerting_rule_group_rules

The number of alert rules, labeled by Grafana stack (id) and alert rule state (state).

query-example
sum by(state) (grafanacloud_grafana_instance_alerting_rule_group_rules{id="<your_grafana_stack_id>"})

The state label can be active or paused.
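
For example, to count only paused rules, filter on the state label (a sketch based on the labels documented above):

query-example
sum(grafanacloud_grafana_instance_alerting_rule_group_rules{id="<your_grafana_stack_id>", state="paused"})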

grafanacloud_grafana_instance_alerting_alerts

The number of alert instances, labeled by Grafana stack (id) and alert instance state (state).

query-example
sum by(state) (grafanacloud_grafana_instance_alerting_alerts{id="<your_grafana_stack_id>"})

The state label can be alerting, error, nodata, normal, or pending.
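
For example, to track only instances that indicate problems, filter the state label for the error and nodata states listed above (a sketch):

query-example
sum(grafanacloud_grafana_instance_alerting_alerts{id="<your_grafana_stack_id>", state=~"error|nodata"})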

grafanacloud_grafana_instance_alerting_rule_evaluations_total:rate5m

The per-second rate of alert rule evaluations over the last 5 minutes, labeled by Grafana stack (id).

query-example
grafanacloud_grafana_instance_alerting_rule_evaluations_total:rate5m{id="<your_grafana_stack_id>"}

grafanacloud_grafana_instance_alerting_rule_evaluation_failures_total:rate5m

The per-second rate of failed alert rule evaluations over the last 5 minutes, labeled by Grafana stack (id).

query-example
grafanacloud_grafana_instance_alerting_rule_evaluation_failures_total:rate5m{id="<your_grafana_stack_id>"}
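
To express failures as a fraction of all evaluations, you can divide this metric by the total evaluation rate described above (a sketch; both series carry the same id label):

query-example
grafanacloud_grafana_instance_alerting_rule_evaluation_failures_total:rate5m{id="<your_grafana_stack_id>"}
/
grafanacloud_grafana_instance_alerting_rule_evaluations_total:rate5m{id="<your_grafana_stack_id>"}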

grafanacloud_grafana_instance_alerting_alertmanager_alerts

The number of alerts received by the Grafana Alertmanager for notification processing, labeled by Grafana stack (id) and alert notification state (state).

query-example
sum by(state) (grafanacloud_grafana_instance_alerting_alertmanager_alerts{id="<your_grafana_stack_id>"})

The state label can be active, suppressed, or unprocessed.
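
For example, a growing number of unprocessed alerts can indicate that the Alertmanager is falling behind. To watch them, filter on the state label (a sketch based on the states listed above):

query-example
sum(grafanacloud_grafana_instance_alerting_alertmanager_alerts{id="<your_grafana_stack_id>", state="unprocessed"})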

grafanacloud_grafana_instance_alerting_silences

The number of silences, labeled by Grafana stack (id) and silence state (state).

query-example
sum by(state) (grafanacloud_grafana_instance_alerting_silences{id="<your_grafana_stack_id>"})

The state label can be active, expired, or pending.

For Mimir alerts

Meta-monitoring metrics for Mimir alert rules are stored in the grafanacloud-usage and grafanacloud-<yourstackname>-prom Prometheus data sources.

You can find these metrics in Alerting insights.

  1. In your Grafana Cloud stack, click Alerts & IRM in the left-side menu.
  2. Click Alerting.
  3. On the Alerting landing page, view the Insights tab.
  4. Select a panel from the Mimir sections.
  5. Click the menu icon (three-dots).
  6. Click Explore to view the metrics and the data source queried by the panel.

Explore alerting logs

Alerting logs are stored in Loki data sources, which are part of your Grafana Cloud stack and are accessible from your Grafana instance.

For Grafana-managed alert state changes

Logs related to state changes in Grafana-managed alerts are stored in the grafanacloud-<yourstackname>-alert-state-history Loki data source.

To explore these logs, complete the following steps.

  1. In Explore, select the grafanacloud-<yourstackname>-alert-state-history Loki data source.

  2. Use the Loki query editor to find logs. The following query retrieves all state-history logs. See the filter sketches after this procedure for ways to narrow the results.

    basic-query
    {from="state-history"} | json
  3. Click Run query.

  4. In the Logs section, review specific details about alerts by selecting relevant fields:

    • previous: previous alert instance state.
    • current: current alert instance state.
    • ruleTitle: alert rule title.
    • ruleID and ruleUID.
    • labels_alertname, labels_new_label, and labels_grafana_folder.
    • Additional available fields.
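
You can narrow the basic query from step 2 by filtering on the parsed JSON fields. The following sketches assume the current field described above and Grafana's alert instance state names; adjust the values to match your logs:

additional-filters
{from="state-history"} | json | current="Alerting"
failing-rules
{from="state-history"} | json | current="Error"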

Alternatively, you can access the History page in Grafana to visualize and filter state changes for individual alerts or all alerts.

For Mimir alerts

Logs for Mimir-managed alerts are stored in the grafanacloud-<yourstackname>-usage-insights Loki data source.

These logs help you troubleshoot alerts by providing insight into their notification status, including error messages for failing alerts.

To explore these logs, complete the following steps.

  1. In Explore, select the grafanacloud-<yourstackname>-usage-insights Loki data source.

  2. Use the Loki query editor to find logs. The following query retrieves all alert logs:

    {instance_type="alerts"} | logfmt
  3. Click Run query.

  4. In the Logs section, review specific details about alert logs by selecting relevant fields such as msg, alert, or alerts.
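
To focus on problems, you can combine a line filter with the logfmt parser. This is a sketch, assuming that failing notifications produce log lines containing the word error:

{instance_type="alerts"} |= "error" | logfmt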