Map Prometheus metrics to RED KPIs

Prometheus provides a variety of metric types that can be used to track different aspects of your service’s performance. You can map custom or non-standard Prometheus metrics to RED KPIs to leverage the features of Asserts. These KPIs include request rate, error rate, latency average, and latency quantile.

Common inputs required for all KPI mappings

When you add a new mapping, you must provide the following information:

Field | Description
Source metric | Name of the Prometheus metric to be used as input
Metric source | Identifier for the source of the metric, for example, grpc, springboot, loopback, and so on
Service name | Label from the metric that identifies the service
Request context labels | One or more labels that together uniquely identify each request. When there are multiple labels, you must specify a separator character to join the label values.
Request type | Type of request being tracked. Common values include inbound and outbound. You can also specify a custom value.
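
To illustrate how multiple request context labels combine into a single request identifier, here is a minimal Python sketch. The label names (`method`, `uri`), the sample values, and the `request_context` helper are hypothetical and only show the idea of joining label values with a separator; they are not part of Asserts.

```python
def request_context(labels: dict[str, str], context_labels: list[str], separator: str = " ") -> str:
    """Join the values of the chosen labels into one request identifier.

    Missing labels contribute an empty string (an assumption made for this sketch).
    """
    return separator.join(labels.get(name, "") for name in context_labels)


# Hypothetical label set from one time series of the source metric.
sample_labels = {
    "service": "checkout",
    "method": "GET",
    "uri": "/cart",
    "status_code": "200",
}

print(request_context(sample_labels, ["method", "uri"], separator=" "))  # GET /cart
```

Choosing labels that together identify a request, and a separator that does not occur in the label values, keeps the resulting identifiers unambiguous.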

Note

When you define the KPIs individually, keep the Request context labels and Request type definitions consistent across mappings. After you provide all the required inputs, the mapped KPI displays as a metric chart. You can review the KPI metric for any selected service, request, and time window to validate that the mapping produces the expected results.

KPI-specific inputs

The following sections provide an overview of specific inputs that you enter depending on the KPI you are mapping.

Request rate

Field | Description
Metric type | counter or gauge
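
The metric type matters because a counter only ever increases (apart from resets), so a request rate has to be derived from the difference between samples, whereas a gauge already reports a current value. The following Python sketch shows one simple way to turn counter samples into a per-second rate. It is an illustration under stated assumptions, not how Asserts or the PromQL `rate()` function is implemented (for example, it does no extrapolation), and the `counter_rate` name and sample data are hypothetical.

```python
def counter_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second rate from (timestamp_seconds, counter_value) samples.

    A drop in value is treated as a counter reset, so the value after the
    reset is counted as new increase.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        increase += current - previous if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Steady growth: 120 requests over 60 seconds.
print(counter_rate([(0.0, 100.0), (30.0, 160.0), (60.0, 220.0)]))  # 2.0

# Counter reset between the second and third sample.
print(counter_rate([(0.0, 100.0), (30.0, 160.0), (60.0, 30.0)]))  # 1.5
```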

Error rate

Field | Description
Metric type | counter or gauge
Error type | Label in the metric containing the error code, for example, status_code
Error type mapping rules | Specify how error codes are grouped into error types using:
  • Equality: status_code = "500"
  • Regex: status_code =~ "5.."
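
The two rule styles above can be sketched in Python as follows. The rule list, the error type names (`internal_error`, `server_error`, `client_error`), and the first-match-wins ordering are assumptions made for illustration. The sketch uses `re.fullmatch` because Prometheus regex matchers are fully anchored, so `5..` matches `503` but not `5030`.

```python
import re

# Hypothetical rules: (label, operator, pattern, error_type).
RULES = [
    ("status_code", "=", "500", "internal_error"),
    ("status_code", "=~", "5..", "server_error"),
    ("status_code", "=~", "4..", "client_error"),
]


def classify(labels: dict[str, str], rules=RULES):
    """Return the error type of the first matching rule, or None if no rule matches."""
    for label, operator, pattern, error_type in rules:
        value = labels.get(label)
        if value is None:
            continue
        if operator == "=" and value == pattern:
            return error_type
        # Anchored match, mirroring Prometheus label matcher semantics.
        if operator == "=~" and re.fullmatch(pattern, value):
            return error_type
    return None


print(classify({"status_code": "500"}))  # internal_error
print(classify({"status_code": "503"}))  # server_error
print(classify({"status_code": "200"}))  # None
```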

Latency average

Field | Description
Latency average type | One of the following:
  • Gauge: Source metric is the average latency
  • Sum and Count: Two counter metrics, one for total latency and one for request count
Latency unit | seconds, milliseconds, or microseconds
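
For the Sum and Count option, the average latency over a time window is the increase in total latency divided by the increase in request count. The Python sketch below shows that arithmetic and a conversion based on the latency unit. Normalizing to seconds is an assumption of this sketch, and the function name and sample numbers are hypothetical.

```python
# Number of source units in one second.
UNITS_PER_SECOND = {"seconds": 1, "milliseconds": 1_000, "microseconds": 1_000_000}


def average_latency_seconds(sum_start, sum_end, count_start, count_end, unit="seconds"):
    """Average latency in seconds over a window, from two counter readings each.

    Returns None when no requests were observed in the window.
    """
    requests = count_end - count_start
    if requests <= 0:
        return None
    average_in_unit = (sum_end - sum_start) / requests
    return average_in_unit / UNITS_PER_SECOND[unit]


# Total latency grew by 3000 ms while 100 requests completed: about 30 ms each.
average = average_latency_seconds(12_000, 15_000, 400, 500, unit="milliseconds")
print(average)
```

Specifying the wrong latency unit scales every derived value by a factor of 1,000 or more, so it is worth checking the resulting chart against a known request.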

Latency quantile

Field | Description
Latency unit | seconds, milliseconds, or microseconds
Quantiles | One or more quantiles to extract: 50%, 75%, 90%, and 99%

Map all KPIs using a histogram

Instead of mapping each KPI individually, you can derive the request rate, latency average, latency quantile, and error rate from a single histogram.

Field | Description
Histogram metric base name | Prefix of the histogram metric before _sum, _count, _bucket. Example: for http_request_duration_seconds_sum, use http_request_duration_seconds.
Latency unit | seconds, milliseconds, or microseconds
Quantiles | Supported quantiles to extract from the histogram: 50%, 75%, 90%, and 99%
Error type | Label in the histogram metric containing the error code
Error type mapping rules | Specify how to group error codes into error categories using exact or pattern match. Examples:
  • Equality: status_code = "500"
  • Regex: status_code =~ "5.."
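
As a rough illustration of why a single histogram is enough, the Python sketch below derives the base name from a series name and estimates a quantile from cumulative bucket counts by linear interpolation, in the spirit of the PromQL `histogram_quantile()` function. It is a simplified sketch, not the Asserts implementation, and the helper names and bucket data are hypothetical.

```python
def base_name(metric: str) -> str:
    """Strip the histogram series suffix to get the base name."""
    for suffix in ("_bucket", "_count", "_sum"):
        if metric.endswith(suffix):
            return metric[: -len(suffix)]
    return metric


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q (0..1) from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float("inf"):
                # No finite upper bound: fall back to the highest finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical bucket counts for a latency histogram in seconds.
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (float("inf"), 100.0)]

print(base_name("http_request_duration_seconds_sum"))  # http_request_duration_seconds
print(estimate_quantile(0.99, buckets))  # roughly 0.95
```

The `_count` and `_sum` series of the same histogram supply the request rate and latency average, and the error code label supplies the error rate, which is why one base name covers all four KPIs.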

Before you begin

Ensure that you are familiar with Prometheus metrics and RED (rate, errors, and duration) KPIs.

Steps

  1. Sign in to Grafana Cloud and click Asserts > Configuration.
  2. Click RED mapping.
  3. Determine which KPI you want to map and click Add new mapping.
  4. Define the mapping as described in the preceding sections.
  5. Click Submit.