---
title: "Request, error, and duration (RED) | Grafana Cloud documentation"
description: "Learn more about requests, errors, and duration in the knowledge graph"
---

# Request, error, and duration (RED)

Learn how the knowledge graph maps requests, errors, and duration metrics.

## Request Rate

In the knowledge graph, the `asserts:request:total` metric records the total count of requests.

In **Spring Boot**, the metrics for incoming requests are available through `http_server_requests_seconds`, which is a [histogram](https://prometheus.io/docs/concepts/metric_types/#histogram). Similarly, the metrics for outgoing calls are available through `http_client_requests_seconds`, which is also a histogram. The `_count` series of these histograms are mapped to `asserts:request:total` for incoming and outgoing requests.

```none
# Incoming requests
- record: asserts:request:total
  expr: |-
    label_replace(http_server_requests_seconds_count, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_request: total
    asserts_request_type: inbound

# Outgoing requests made through Spring classes like RestTemplate
- record: asserts:request:total
  expr: |-
    label_replace(http_client_requests_seconds_count, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_request: total
    asserts_request_type: outbound
```

| Meta Label                | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|---------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `asserts_source`          | Used by the knowledge graph to identify which framework/instrumentation captured the metric.                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `asserts_metric_request`  | Used by the knowledge graph to identify this as a request metric. Valid values are `total` when the source metric is a counter and `gauge` when the source metric is a gauge.                                                                                                                                                                                                                                                                                                                                                                         |
| `asserts_request_type`    | Used by the knowledge graph to categorize requests into different kinds. By default, for all supported HTTP-based frameworks, the knowledge graph categorizes requests as `inbound` for incoming requests and `outbound` for outgoing HTTP calls. You can also use arbitrary names to group APIs, for example, `timer_task` or `query`. |
| `asserts_request_context` | Used by the knowledge graph to identify a unique request. For HTTP requests, whether `inbound` or `outbound`, this typically maps to the relative part of the request URI with high-cardinality parameters stripped off. For example, `/track/order/{}`, where `{}` is a placeholder for an order ID. Spring Boot Actuator's Prometheus metrics carry this value in the `uri` label, and the [label\_replace](https://prometheus.io/docs/prometheus/latest/querying/functions/#label_replace) function is used to map `uri` to `asserts_request_context`. |
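
As a sketch of the custom grouping mentioned for `asserts_request_type`, the following rule records a non-HTTP workload under the `timer_task` type. The source counter `timer_task_executions_total`, its `task_name` label, and the `asserts_source` value are hypothetical names chosen for illustration; substitute whatever your instrumentation exposes.

```yaml
# Hypothetical counter: timer_task_executions_total{task_name="..."}
- record: asserts:request:total
  expr: |-
    label_replace(timer_task_executions_total, "asserts_request_context", "$1", "task_name", "(.+)")
  labels:
    asserts_source: custom_timer        # hypothetical source name
    asserts_metric_request: total       # the source metric is a counter
    asserts_request_type: timer_task    # arbitrary grouping name instead of inbound/outbound
```

Here `task_name` plays the role that `uri` plays for HTTP requests: it should be a low-cardinality label that identifies one kind of request.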

After these rules are added, the following happens:

- The **Request Rate** is computed and shown in the Service KPI Dashboard.
- The **Request Rate** is observed for anomalies, and the **RequestRateAnomaly** is triggered when there are anomalies.

> Note
> 
> In the previous example, the source metric is available as a counter, so it was mapped to `asserts:request:total`. If the source metric is a gauge, map it to `asserts:request:gauge` and set `asserts_metric_request: gauge`.
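
A minimal sketch of the gauge case described in the note, assuming a hypothetical source gauge named `my_app_requests_per_second` that already carries a `uri` label. Neither the metric name nor the `asserts_source` value comes from Spring Boot or from this page.

```yaml
# Hypothetical gauge: my_app_requests_per_second{uri="..."}
- record: asserts:request:gauge
  expr: |-
    label_replace(my_app_requests_per_second, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: my_app            # hypothetical source name
    asserts_metric_request: gauge     # tells the knowledge graph the source is a gauge
    asserts_request_type: inbound
```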

## Error Ratio

In the knowledge graph, the `asserts:error:total` metric records the total count of errors, broken down by error type. The following rules record it for Spring Boot `inbound` and `outbound` requests:

```none
# Inbound request errors
- record: asserts:client:error:total
  expr: |
    label_replace(http_server_requests_seconds_count{status=~"4.."}, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_error: client_total
    asserts_request_type: inbound

- record: asserts:error:total
  expr: |
    label_replace(http_server_requests_seconds_count{status=~"5.."}, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_error: total
    asserts_request_type: inbound
    asserts_error_type: server_errors

# Outbound request errors
- record: asserts:error:total
  expr: |
    label_replace(http_client_requests_seconds_count{status=~"4.."}, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_error: total
    asserts_request_type: outbound
    asserts_error_type: client_errors

- record: asserts:error:total
  expr: |
    label_replace(http_client_requests_seconds_count{status=~"5.."}, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_error: total
    asserts_request_type: outbound
    asserts_error_type: server_errors
```

| **Meta Label**         | **Description**                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
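| `asserts_metric_error` | Used by the knowledge graph to identify this as an error metric and to indicate the type of the source metric. Valid values are `total` and `client_total` when the source metric is a counter, and `gauge` and `client_gauge` when the source metric is a gauge. |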
| `asserts_error_type`   | Used by the knowledge graph to categorize errors into different kinds. Commonly useful types are `server_errors` and `client_errors`. In this example, a condition on the `status` label has been used to define these types. Note that client errors for `inbound` calls are mapped using a special type `client_total`. This is because inbound client errors tend to be noisy. The knowledge graph still observes them, but the signals captured surface only when anomalies occur. That is, if there is a steady stream of client errors, no signal is generated. However, if there is a sudden change in the rate of these errors, an anomaly signal is generated. |

After these rules are added, the following happens:

- The **Error Ratio** is computed for all request contexts and shown in the Service KPI Dashboard. The ratio is computed as `sum by(asserts_env, asserts_site, namespace, workload, service, job, asserts_request_type, asserts_request_context, asserts_error_type)(rate(asserts:error:total[5m])) / ignoring(asserts_error_type) group_left sum by(asserts_env, asserts_site, namespace, workload, service, job, asserts_request_type, asserts_request_context)(rate(asserts:request:total[5m]))`. Apart from `asserts_error_type`, the labels used in the aggregation for the `asserts:request:total` and `asserts:error:total` metrics must match for the ratio to be recorded.
- The **ErrorRatioBreach** is triggered if the ratio breaches a certain threshold.
- The **ErrorBuildup** (multi-burn, multi-window) is triggered if the error budget is breached.
- The **Error Ratio** is observed for anomalies, and **ErrorRatioAnomaly** is triggered when there are anomalies.
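
For ad hoc inspection, the ratio expression can be laid out as a recording rule of its own. This is only a sketch: the record name `example:error_ratio:rate5m` is hypothetical, and the knowledge graph computes the ratio itself, so you don't need to add such a rule. The vector-matching modifier is written after the `/` operator, as PromQL requires, and `group_left` is included on the assumption that several error types can match a single request series.

```yaml
# Hypothetical rule name; shown only to make the ratio expression readable.
- record: example:error_ratio:rate5m
  expr: |
    sum by (asserts_env, asserts_site, namespace, workload, service, job,
            asserts_request_type, asserts_request_context, asserts_error_type)
      (rate(asserts:error:total[5m]))
    / ignoring(asserts_error_type) group_left
    sum by (asserts_env, asserts_site, namespace, workload, service, job,
            asserts_request_type, asserts_request_context)
      (rate(asserts:request:total[5m]))
```

If a label such as `asserts_request_context` is present on only one side, the two sides don't match and the expression returns no series for that request.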

> Note
> 
> In the previous example, the source metric is available as a counter, so it was mapped to `asserts:error:total`. If the source metric is a gauge, map it to `asserts:error:gauge` and set `asserts_metric_error: gauge`, or `asserts_metric_error: client_gauge` for inbound client errors.
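
A minimal sketch of the gauge case for server errors, assuming a hypothetical gauge `my_app_errors_per_second` that carries `status` and `uri` labels. The metric name and the `asserts_source` value are illustrative, not taken from a real framework.

```yaml
# Hypothetical gauge: my_app_errors_per_second{status="...", uri="..."}
- record: asserts:error:gauge
  expr: |
    label_replace(my_app_errors_per_second{status=~"5.."}, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: my_app              # hypothetical source name
    asserts_metric_error: gauge         # the source metric is a gauge
    asserts_request_type: inbound
    asserts_error_type: server_errors
```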

## Latency Average

The knowledge graph computes the latency average using the following metrics:

- `asserts:latency:total` - the latency total time in seconds
- `asserts:latency:count` - the total number of requests

Add the recording rules for these two metrics from the `_sum` and `_count` series of the respective histogram metrics.

```none
# Inbound Latency
- record: asserts:latency:total
  expr: |
    label_replace(http_server_requests_seconds_sum, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_latency: seconds_sum
    asserts_request_type: inbound

- record: asserts:latency:count
  expr: |
    label_replace(http_server_requests_seconds_count, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_latency: count
    asserts_request_type: inbound

# Outbound latency
- record: asserts:latency:total
  expr: |
    label_replace(http_client_requests_seconds_sum, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_latency: seconds_sum
    asserts_request_type: outbound

- record: asserts:latency:count
  expr: |
    label_replace(http_client_requests_seconds_count, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: spring_boot
    asserts_metric_latency: count
    asserts_request_type: outbound
```

| **Meta Label**           | **Description**                                                                                                                                                                                                                                                                                                                     |
|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `asserts_metric_latency` | Used by the knowledge graph to identify the numerator and denominator to compute the latency average along with the unit of the source latency metric. Valid values for latency (the numerator) are `seconds_sum`, `milliseconds_sum`, and `microseconds_sum`. For the latency count (the denominator), the valid value is `count`. |
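
To illustrate the unit-specific values, here is a sketch for a source that reports latency in milliseconds. The metric names `my_app_request_duration_milliseconds_sum` and `my_app_request_duration_milliseconds_count`, and the `asserts_source` value, are hypothetical.

```yaml
# Hypothetical source reporting milliseconds instead of seconds.
- record: asserts:latency:total
  expr: |
    label_replace(my_app_request_duration_milliseconds_sum, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: my_app                     # hypothetical source name
    asserts_metric_latency: milliseconds_sum   # numerator, with its unit
    asserts_request_type: inbound

- record: asserts:latency:count
  expr: |
    label_replace(my_app_request_duration_milliseconds_count, "asserts_request_context", "$1", "uri", "(.+)")
  labels:
    asserts_source: my_app
    asserts_metric_latency: count              # denominator, unit-independent
    asserts_request_type: inbound
```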

After these rules are added, the following happens:

- The **Latency Average** is computed for all requests and shown in the Service KPI Dashboard.
- The **Latency Average** is observed for anomalies, and **LatencyAverageAnomaly** is triggered when there are anomalies.

> Note
> 
> In the previous example, the source metrics are available as counters, so they were mapped to `asserts:latency:total` and `asserts:latency:count`. If the source metric is a gauge, map it directly to `asserts:latency:average`. When you do this, be mindful of the labels in the source metric. When the source is a counter, the knowledge graph aggregates internally and retains only the key labels, which reduces the cardinality of the metrics it records. With direct mapping, no such aggregation happens.
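
One way to keep cardinality in check with direct mapping is to aggregate away unneeded labels in the rule itself. The following is a sketch under several assumptions: `my_app_request_latency_seconds` is a hypothetical gauge holding an average latency per instance, the retained labels are borrowed from the P99 rules on this page, and this page doesn't state which meta labels `asserts:latency:average` requires.

```yaml
# Hypothetical gauge: my_app_request_latency_seconds{uri="...", instance="..."}
- record: asserts:latency:average
  expr: |
    label_replace(
      avg by (asserts_env, asserts_site, namespace, workload, service, job, uri)
        (my_app_request_latency_seconds),
      "asserts_request_context", "$1", "uri", "(.+)"
    )
  labels:
    asserts_source: my_app          # hypothetical source name
    asserts_request_type: inbound
```

Note that `avg` over per-instance averages is unweighted, so instances with little traffic count as much as busy ones. If that matters, prefer a counter-based source.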

## Latency P99

Similarly, you can record the latency P99 for requests as follows:

```none
# Inbound requests latency P99
- record: asserts:latency:p99
  expr: >
    label_replace(
      histogram_quantile (
        0.99,
        sum(rate(http_server_requests_seconds_bucket[5m]) > 0) by (le, namespace, job, service, workload, uri, asserts_env, asserts_site)
      )
      , "asserts_request_context", "$1", "uri", "(.+)"
    )
  labels:
    asserts_source: spring_boot
    asserts_entity_type: Service
    asserts_request_type: inbound

# Outbound requests latency P99
- record: asserts:latency:p99
  expr: >
    label_replace(
      histogram_quantile (
        0.99,
        sum(rate(http_client_requests_seconds_bucket[5m]) > 0) by (le, namespace, job, service, workload, uri, asserts_env, asserts_site)
      )
      , "asserts_request_context", "$1", "uri", "(.+)"
    )
  labels:
    asserts_source: spring_boot
    asserts_entity_type: Service
    asserts_request_type: outbound
```

| **Meta Label**        | **Description**                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `asserts_env`         | Used by the knowledge graph to identify the environment. All discovered entities and observed metrics are automatically scoped to an environment.                                                                                                                                                                                                                                                                                                                                                                                     |
| `asserts_site`        | Used by the knowledge graph to identify the region/site within an environment. For example, you could have a `prod` environment but multiple regions, such as `us-east-1`, `us-west-2`, etc. This label is used to capture the region information. Note that this depends on how environment information is encoded in the metrics. Sometimes, both the environment and the region information may be encoded in a single label value; in such cases, the `asserts_env` label contains that value, and this label may not be present. |
| `asserts_entity_type` | Used by the knowledge graph to identify the level at which the metric is being observed. The `workload`, `service`, and `job` are special labels that the knowledge graph uses to identify the `Service`. These labels are also used to discover the `Service` entity in the knowledge graph entity model. In this example, while aggregating, these labels are retained, so this metric is observed for the corresponding `Service` entity.                                                                                          |

After this is recorded, the knowledge graph shows the metric in the Service KPI Dashboard and begins tracking the clock minutes in which the **Latency P99** exceeds a threshold. These minutes are counted in a total bad minutes counter. Based on the ratio of `bad minutes` to `total minutes` in a given time window, the **LatencyP99ErrorBuildup** alert is triggered. This is a multi-burn, multi-window alert based on the error budget.

## Latency P99 across all requests of a Service

The Latency P99 for the entire service, regardless of different request contexts, can be recorded as follows:

```none
- record: asserts:latency:service:p99
  expr: >
    histogram_quantile (
      0.99,
      sum(rate(http_server_requests_seconds_bucket[5m]) > 0)
        by (le, namespace, job, service, workload, asserts_env, asserts_site)
    )
  labels:
    asserts_entity_type: Service
    asserts_request_type: inbound
    asserts_source: spring_boot
```

This metric is useful when creating a Latency SLO for the entire service.
