Metrics summary API

Warning

The Metrics summary API is an experimental feature that is disabled by default. To enable it, adjust your configuration as suggested below.

This document explains how to use the metrics summary API in Tempo. This API returns RED metrics (span count, erroring span count, and latency information) for kind=server spans sent to Tempo in the last hour, grouped by a user-specified attribute.

Configuration

To enable the experimental metrics summary API, you must turn on the local blocks processor in the metrics generator. Be aware that the generator uses considerably more resources, including disk space, when this processor is enabled:

yaml
overrides:
  defaults:
    metrics_generator:
      processors: [..., 'local-blocks']

Request

To make a request to this API, use the following endpoint on the query-frontend:

GET http://<tempo>/api/metrics/summary

Query Parameters

All query parameters must be URL-encoded to preserve non-URL-safe characters in the query such as &.

| Name | Examples | Definition | Required? |
| --- | --- | --- | --- |
| q | { resource.service.name = "foo" && span.http.status_code != 200 } | The TraceQL query with full syntax. All spans matching this query are included in the calculations. Any valid TraceQL query is supported. | Yes |
| groupBy | name<br/>.foo<br/>resource.namespace<br/>span.http.url,span.http.status_code | One or more TraceQL values to group by. Any valid intrinsic or attribute with scope. To group by multiple values use a comma-delimited list. | Yes |
| start | 1672549200 | Start of time range in Unix seconds. If not specified, then all recent data is queried. | No |
| end | 1672549200 | End of the time range in Unix seconds. If not specified, then all recent data is queried. | No |

Example:

bash
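# -G sends the URL-encoded data as query parameters of a GET request;
# without it, curl would send the data as a POST body.
curl -G "$URL/api/metrics/summary" --data-urlencode 'q={resource.service.name="checkout-service"}' --data-urlencode 'groupBy=name'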
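
As a minimal sketch of the URL-encoding requirement, the following Python snippet builds an equivalent request URL using only the standard library. The address `http://localhost:3200` and the helper name `build_summary_url` are assumptions made for illustration; substitute the address of your own query-frontend.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_summary_url(base_url, query, group_by, start=None, end=None):
    """Return a metrics summary URL with URL-encoded query parameters.

    base_url is a placeholder for your query-frontend address.
    start and end are optional Unix timestamps in seconds.
    """
    # Multiple groupBy values are sent as one comma-delimited list.
    params = {"q": query, "groupBy": ",".join(group_by)}
    if start is not None:
        params["start"] = str(start)
    if end is not None:
        params["end"] = str(end)
    # urlencode percent-encodes characters such as &, {, }, and quotes.
    return f"{base_url.rstrip('/')}/api/metrics/summary?{urlencode(params)}"


query = '{ resource.service.name = "foo" && span.http.status_code != 200 }'
url = build_summary_url(
    "http://localhost:3200",  # assumed address, not part of the API
    query,
    ["span.http.url", "span.http.status_code"],
)
print(url)

# Decoding the query string recovers the original TraceQL text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0] == query)
```

Building the query string with `urlencode` rather than string concatenation keeps the `&&` inside the TraceQL query from being read as a parameter separator.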

Response

The Tempo response is a SpanMetricsSummaryResponse object defined in tempo.proto. The relevant section is pasted below:

message SpanMetricsSummaryResponse {
  repeated SpanMetricsSummary summaries = 1;
}

message SpanMetricsSummary {
  uint64 spanCount = 1;
  uint64 errorSpanCount = 2;
  TraceQLStatic static = 3;
  uint64 p99 = 4;
  uint64 p95 = 5;
  uint64 p90 = 6;
  uint64 p50 = 7;
}

message TraceQLStatic {
  int32 type = 1;
  int64 n = 2;
  double f = 3;
  string s = 4;
  bool b = 5;
  uint64 d = 6;
  int32 status = 7;
  int32 kind = 8;
}

The response is returned as JSON following standard protobuf->JSON mapping rules.

Note

The uint64 fields cannot be fully expressed by JSON numeric values, so they are serialized as strings.
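
To illustrate the reason, many JSON parsers read numbers as IEEE 754 doubles, which represent integers exactly only up to 2^53. The short Python sketch below shows a value above that limit losing precision when forced through a double, while the string form round-trips intact. The sample value is arbitrary.

```python
import json

big = 2**63 + 1  # an arbitrary value that fits in uint64 but not in a double

# A double has a 53-bit significand, so the conversion rounds the value.
as_double = int(float(big))
print(as_double == big)  # False

# Serialized as a string, the value round-trips without loss.
payload = json.dumps({"spanCount": str(big)})
restored = int(json.loads(payload)["spanCount"])
print(restored == big)  # True
```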

Example:

JavaScript
{
   "summaries": [
       {
           "spanCount": "20",
           "series": [
               {
                   "key": ".attr1",
                   "value": {
                       "type": 5,
                       "s": "foo"
                   }
               },
               ...
           ],
           "p99": "68719476736",
           "p95": "1073741824",
           "p90": "1017990479",
           "p50": "664499239"
       }
   ]
}

| Field | Notes |
| --- | --- |
| summaries | The list of metrics per group. |
| .spanCount | Number of spans in this group. |
| .errorSpanCount | Number of spans with status=error. (This field will not be present if the value is 0.) |
| .series | The unique values for this group. A key/value pair will be returned for each entry in groupBy. |
| .key | Key name. |
| .value | Value with TraceQL underlying type. |
| .type | Data type enum defined [here](https://github.com/grafana/tempo/blob/main/pkg/traceql/enum_statics.go#L8). (This field will not be present if the value is 0.)<br/>0 = nil<br/>3 = integer<br/>4 = float<br/>5 = string<br/>6 = bool<br/>7 = duration<br/>8 = span status<br/>9 = span kind |
| .n | Populated if this is an integer value. |
| .s | Populated if this is a string value. |
| .f | Populated if this is a float value. |
| .b | Populated if this is a boolean value. |
| .d | Populated if this is a duration value. |
| .status | Populated if this is a span status value. |
| .kind | Populated if this is a span kind value. |
| .p99 | The p99 latency of this group in nanoseconds. |
| .p95 | The p95 latency of this group in nanoseconds. |
| .p90 | The p90 latency of this group in nanoseconds. |
| .p50 | The p50 latency of this group in nanoseconds. |
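
The field notes above can be put together in a short client-side sketch. The Python below parses a response shaped like the example, converts the string-encoded integer fields, treats a missing `errorSpanCount` as 0, and converts latencies from nanoseconds to milliseconds. The helper names `static_value` and `summarize` and the sample payload are illustrative assumptions, not part of Tempo.

```python
import json

# Maps the TraceQLStatic type enum to the field that carries the value.
TYPE_FIELDS = {3: "n", 4: "f", 5: "s", 6: "b", 7: "d", 8: "status", 9: "kind"}


def static_value(static):
    """Return the populated value of a TraceQLStatic JSON object."""
    # "type" is omitted when it is 0 (nil), so default to 0.
    field = TYPE_FIELDS.get(static.get("type", 0))
    if field is None:
        return None
    # Zero values may be omitted from the JSON; this sketch returns None then.
    value = static.get(field)
    # 64-bit integer fields (n, d) arrive as strings in protobuf JSON.
    if field in ("n", "d") and value is not None:
        return int(value)
    return value


def summarize(response_text):
    """Flatten a metrics summary response into one dict per group."""
    rows = []
    for summary in json.loads(response_text).get("summaries", []):
        spans = int(summary.get("spanCount", "0"))
        errors = int(summary.get("errorSpanCount", "0"))  # absent when 0
        series = summary.get("series", [])
        rows.append({
            "group": {kv["key"]: static_value(kv["value"]) for kv in series},
            "span_count": spans,
            "error_rate": errors / spans if spans else 0.0,
            # Latencies are reported in nanoseconds; convert to milliseconds.
            "p99_ms": int(summary.get("p99", "0")) / 1e6,
            "p50_ms": int(summary.get("p50", "0")) / 1e6,
        })
    return rows


sample = '''{"summaries": [{"spanCount": "20",
  "series": [{"key": ".attr1", "value": {"type": 5, "s": "foo"}}],
  "p99": "68719476736", "p95": "1073741824",
  "p90": "1017990479", "p50": "664499239"}]}'''

for row in summarize(sample):
    print(row)
```

Reading every field with a default keeps the sketch tolerant of fields that are left out when their value is 0.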