Meta monitoring for Cloud

Monitor your alerting metrics to ensure you identify potential issues before they become critical.

Meta monitoring is the process of monitoring your monitoring system and alerting when your monitoring is not working as it should.

To support meta monitoring, Grafana provides predefined metrics.

Identify which metrics are critical to your monitoring system (for example, Grafana) and then set up how you want to monitor them.

You can use meta-monitoring metrics to understand the health of your alerting system in any of the following ways:

  1. [Optional] Create a dashboard in Grafana that uses a meta-monitoring metric in a panel, just as you would for any other metric.
  2. [Optional] Create an alert rule in Grafana that checks a meta-monitoring metric regularly, just as you would for any other alert rule.
  3. [Optional] Use the Explore module in Grafana to query the metrics directly.

Explore insights metrics

The panels in Alerting Insights query meta-monitoring metrics from the grafanacloud-usage, grafanacloud-alert-state-history, and grafanacloud-prom data sources.

You can also query these data sources to monitor the status of your alerting setup and build your own dashboards. Use the Metrics Explorer or the Metrics Browser to explore all available metrics, and search for metric names that contain rule or alert.

Explore insights logs

Use insights logs to determine which Mimir-managed alerting and recording rules are failing to evaluate, and why. The logs identify the specific rules that are failing and include the actual error message, which helps you work out what is going wrong.

Before you begin

To view your insights logs, you must have the following:

  • A Grafana Cloud account
  • Admin or Editor user permissions for the managed Grafana Cloud instance

Steps

To explore logs pertaining to failing alerting and recording rules, complete the following steps.

  1. Log in to your instance and click the Explore (compass) icon in the menu sidebar.

  2. Use the data sources dropdown at the top of the page to select the usage insights data source. Its name should be similar to grafanacloud-<yourstackname>-usage-insights.

  3. To find the logs you want to see, use the Label filters and Line contains options in the query editor.

    To look at a particular stack, you can filter by instance_id instead of org_id.

    The following is an example query that would surface insights logs:

    {org_id="<your-org-id>"} | logfmt | component = `ruler` | msg = `Evaluating rule failed`

  4. Click Run query.

  5. In the Logs section, view specific information on which alert rule is failing and why.

    You can see the rule contents (in the rule field), the rule name (in the name field), the name of the group it’s in (in the group field), and the error message (in the err field).
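
If you need both the org_id and the instance_id variants of the query from step 3, a small helper can generate them. The following is a minimal sketch, not part of Grafana: the function name build_ruler_failure_query is hypothetical, and it only assembles the query text shown above so that you can paste it into the query editor.

```python
def build_ruler_failure_query(label: str, value: str) -> str:
    """Return the LogQL query from step 3, scoped by org_id or instance_id.

    Hypothetical helper for illustration; it builds a string and does not
    contact Grafana or Loki.
    """
    if label not in ("org_id", "instance_id"):
        raise ValueError(f"unsupported label: {label}")
    # Escape backslashes and double quotes so the value stays a valid
    # double-quoted LogQL string.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f'{{{label}="{escaped}"}} | logfmt '
        "| component = `ruler` | msg = `Evaluating rule failed`"
    )


if __name__ == "__main__":
    # Placeholders are kept as-is; replace them with your own IDs.
    print(build_ruler_failure_query("org_id", "<your-org-id>"))
    print(build_ruler_failure_query("instance_id", "<your-instance-id>"))
```

The first call reproduces the example query from step 3, and the second scopes the same filters to a single stack.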