Version 2.2 release notes

The Tempo team is pleased to announce the release of Tempo 2.2.

This release gives you:

  • Major additions to the TraceQL language: structural operators (descendant, child, sibling), results grouped by attribute (by()), a select operator, and 3 new intrinsic attributes.
  • Faster TraceQL results, thanks to a streaming endpoint that returns partial results as the query executes as well as a multitude of performance-related improvements.
  • An experimental metrics-summary API that returns RED metrics for recently received spans grouped by your attribute of choice.

Tempo 2.2 makes vParquet2, a Parquet version designed to be more compatible with other Parquet implementations, the default block format. This block format is required for using structural operators and improves query performance relative to previous formats.

Read the Tempo 2.2 blog post for more examples and details about these improvements.

Note

For a complete list of changes, enhancements, and bug fixes, refer to the Tempo 2.2 changelog.

Features and enhancements

Some of the most important features and enhancements in Tempo 2.2 are highlighted below.

Expanding the TraceQL language

With this release, we’ve added to the TraceQL language. TraceQL now offers:

  • Structural operators: descendant (>>), child (>), and sibling (~) (documentation). These find relevant traces based on their structure and the relationships among their spans. [PRs #2625, #2660]
  • A select() operation that lets you specify arbitrary span attributes to include in the TraceQL response (documentation) [PR #2494]
  • A by() operation that groups span sets within a trace by an attribute of your choosing. This operation is not supported in the Grafana UI yet; you can only use by() when querying Tempo’s search API directly. (documentation) [PR #2490]
  • New intrinsic attributes for use in TraceQL queries: traceDuration, rootName, and rootServiceName (documentation) [PR #2503]

Read the Tempo 2.2 blog post for examples of how to use these new language additions.

To learn more about the TraceQL syntax, see the TraceQL documentation. For information on planned future extensions to the TraceQL language, see future work.
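
The language additions above can be tried directly against Tempo’s search API, which is currently the only way to use by(). The following is a minimal sketch that only builds request URLs and sends nothing. It assumes the search endpoint is /api/search with the query passed in the q parameter, and that Tempo is reachable at http://localhost:3200. The service name frontend and the attribute names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of a local Tempo instance; adjust for your deployment.
TEMPO_URL = "http://localhost:3200"

# Illustrative TraceQL queries, one per new language feature.
# The service name "frontend" and the attribute names are hypothetical.
QUERIES = {
    "descendant": '{ resource.service.name = "frontend" } >> { status = error }',
    "child": '{ resource.service.name = "frontend" } > { status = error }',
    "sibling": '{ resource.service.name = "frontend" } ~ { status = error }',
    "select": "{ status = error } | select(span.http.status_code)",
    "by": "{ status = error } | by(resource.service.name)",
    "intrinsics": '{ traceDuration > 2s && rootServiceName = "frontend" }',
}


def build_search_url(base_url: str, traceql: str) -> str:
    """Return a search URL with the TraceQL query URL-encoded in `q`."""
    return f"{base_url}/api/search?{urlencode({'q': traceql})}"


# One ready-to-request URL per example query.
SEARCH_URLS = {name: build_search_url(TEMPO_URL, q) for name, q in QUERIES.items()}

for name, url in SEARCH_URLS.items():
    print(f"{name}: {url}")
```

Any HTTP client can then issue a GET request to one of these URLs; URL-encoding matters because TraceQL queries contain braces, quotes, and spaces.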

Get TraceQL results faster

We’re always trying to reduce the time you spend waiting to get results to your TraceQL queries, and we’ve made some nice progress on this front with this release.

We’ve added a gRPC streaming endpoint to Tempo’s query frontend that allows a client to stream search results from Tempo. The Tempo CLI has been updated to use this new streaming endpoint [PR #2366]. As of version 10.1, Grafana supports it as well, though you must first enable the traceQLStreaming feature toggle [PR #72288].

By streaming results to the client, you can start to look at traces matching your query before the entire query completes. This is particularly helpful for long-running queries; while the total time to complete the query is the same, you can start looking at your first matches before the full set of matched traces is returned.
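
The difference between a blocking and a streaming search can be illustrated with a toy model. The sketch below is not Tempo’s gRPC API: stream_search is a hypothetical function, the blocks are made-up lists of trace IDs, and emitting a cumulative snapshot after each block is an assumption made for illustration.

```python
from typing import Iterable, Iterator, List


def stream_search(blocks: Iterable[List[str]]) -> Iterator[List[str]]:
    """Yield the cumulative matching trace IDs after each block is scanned.

    Toy model only: each inner list stands for the matches found in one block.
    """
    matches: List[str] = []
    for block_matches in blocks:
        matches.extend(block_matches)
        # A streaming client sees this snapshot right away instead of
        # waiting for every block to be scanned.
        yield list(matches)


# Hypothetical per-block matches for a single query.
blocks = [["trace-a"], [], ["trace-b", "trace-c"]]

partials = list(stream_search(blocks))
first_partial = partials[0]   # available after the first block
final_result = partials[-1]   # identical to what a blocking search returns

print(first_partial)
print(final_result)
```

The final snapshot matches what a non-streaming query would return; the only change is that earlier snapshots become visible while the remaining blocks are still being scanned.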

In addition to streaming partial results, we’ve merged a number of improvements to speed up TraceQL queries. Here are just a few:

  • Added support for query batching between frontend and queriers to improve throughput [PR 2677]
  • Improved performance of TraceQL regex [PR 2484]
  • Fully skipped Parquet row groups with no matches in the column dictionaries [PR 2676]
  • Added a synchronous read mode for vParquet and vParquet2 [PRs 2165, 2535]
  • Improved TraceQL throughput by creating jobs asynchronously [PR 2530]

Metrics summary API (experimental)

Tempo has added an experimental API that returns RED metrics (span count, erroring span count, and latency information) for spans of kind=server sent in the last hour, grouped by an attribute of your choice. For example, you could use this API to compare error rates of spans with different values of the namespace attribute. From here, you might see that spans from namespace=A have a significantly higher error rate than those from namespace=B. As another example, you could use this API to compare latencies of your spans broken down by the region attribute. From here, you might notice that spans from region=North-America have higher latencies than those from region=Asia-Pacific.

This API is meant to enable ad-hoc analysis of your incoming spans; by segmenting your spans by attribute and looking for differences in RED metrics, you can more quickly isolate where problems like elevated error rates or higher latencies are coming from.
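
To make the grouping concrete, here is a small sketch of the kind of computation described above: span count, erroring span count, and latency percentiles for kind=server spans, grouped by one attribute. It is not Tempo’s implementation or response format. The span dictionaries, the red_summary function, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def red_summary(spans, group_by):
    """Group server spans by one attribute and compute RED-style metrics."""
    groups = defaultdict(list)
    for span in spans:
        if span["kind"] != "server":  # only spans of kind=server are counted
            continue
        groups[span["attributes"].get(group_by)].append(span)

    summary = {}
    for key, members in groups.items():
        durations = sorted(s["duration_ms"] for s in members)
        errors = sum(1 for s in members if s["status"] == "error")
        summary[key] = {
            "span_count": len(members),
            "error_span_count": errors,
            "error_rate": errors / len(members),
            "p50_ms": percentile(durations, 50),
            "p99_ms": percentile(durations, 99),
        }
    return summary


# Hypothetical spans; a real deployment would read these from recent traces.
spans = [
    {"kind": "server", "status": "error", "duration_ms": 100, "attributes": {"namespace": "A"}},
    {"kind": "server", "status": "ok", "duration_ms": 200, "attributes": {"namespace": "A"}},
    {"kind": "server", "status": "error", "duration_ms": 300, "attributes": {"namespace": "A"}},
    {"kind": "server", "status": "ok", "duration_ms": 50, "attributes": {"namespace": "B"}},
    {"kind": "server", "status": "ok", "duration_ms": 70, "attributes": {"namespace": "B"}},
    {"kind": "client", "status": "error", "duration_ms": 999, "attributes": {"namespace": "B"}},
]

result = red_summary(spans, "namespace")
for namespace, metrics in sorted(result.items()):
    print(namespace, metrics)
```

With this sample data, namespace A shows two erroring spans out of three while namespace B shows none, which is the kind of contrast the API is meant to surface.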

Unlike RED metrics computed by Tempo’s metrics-generator, the values returned by this API are not persisted as time series. This has the advantage that you do not need to provide your own time series database for storing and querying these metrics. It also allows you to compute RED metrics broken down by high-cardinality attributes that would be too expensive to store in a time series database. Use the metrics-generator if you want to store and visualize RED metrics over multi-hour or multi-day time ranges, or if you want to alert on these metrics.

To learn more about this API, refer to the metrics summary API documentation.

This work is represented in multiple PRs: 2368, 2418, 2424, 2442, 2480, 2481, 2501, 2579, and 2582.

Other enhancements

  • Tempo’s tag values and tag names APIs now support filtering [PR 2253]. This lets you retrieve all valid attribute values and names given certain criteria. For example, you can get a list of values for the attribute namespace seen on spans with attribute resource=A. This feature is off by default; to enable it, configure autocomplete_filtering_enabled (documentation). Starting in v10.2, Grafana’s autocomplete can use this filtering capability to provide better suggestions [PR 67845].

  • Tempo’s metrics-generator now supports span filtering. Setting up filters allows you to compute metrics over the specific spans you care about, excluding others. It also can reduce the cardinality of generated metrics, and therefore the cost of storing those metrics in a Prometheus-compatible TSDB. (documentation) [PR 2274]

  • Tempo’s metrics-generator can now detect virtual nodes (documentation) [PR 2365]. As a result, you’ll now see these virtual nodes represented in your service graph. For more information, refer to the virtual nodes documentation.
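
As an illustration of the tag value filtering described in the first item above, the sketch below builds a filtered tag values request. It rests on assumptions: that the endpoint path is /api/v2/search/tag/<tag>/values, that the filter is passed as a TraceQL query in the q parameter, and that Tempo is reachable at http://localhost:3200. The attribute names span.namespace and span.resource are hypothetical and mirror the namespace and resource=A example.

```python
from urllib.parse import quote, urlencode


def build_tag_values_url(base_url: str, tag: str, traceql_filter: str) -> str:
    """Return a tag values URL restricted to spans matching a TraceQL filter.

    The path layout and the `q` parameter are assumptions; check the API
    documentation for your Tempo version before relying on them.
    """
    path = f"/api/v2/search/tag/{quote(tag, safe='')}/values"
    return f"{base_url}{path}?{urlencode({'q': traceql_filter})}"


# Values of the (hypothetical) namespace attribute, seen only on spans
# whose (hypothetical) resource attribute equals "A".
url = build_tag_values_url(
    "http://localhost:3200",
    "span.namespace",
    '{ span.resource = "A" }',
)
print(url)
```

The filter only takes effect when autocomplete_filtering_enabled is turned on, since the feature is off by default.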

Upgrade considerations

When upgrading to Tempo 2.2, be aware of these breaking changes:

  • Jsonnet users only: We’ve converted the metrics-generator component from a Kubernetes Deployment to a Kubernetes StatefulSet. Refer to the PR for seamless migration instructions. [PRs #2533, #2467]
  • Removed or renamed configuration parameters (see the following section)

While not a breaking change, upgrading to Tempo 2.2 will by default change Tempo’s block format to vParquet2. To stay on a previous block format, read the Parquet configuration documentation. We strongly encourage upgrading to vParquet2 as soon as possible as this is required for using structural operators in your TraceQL queries and provides query performance improvements, in particular on queries using the duration intrinsic.

Removed or renamed configuration parameters

The following fields were removed or renamed.

Parameter:

    query_frontend:
      tolerate_failed_blocks: <int>

Comments: Removed support for `tolerate_failed_blocks` [PR 2416]

Parameter:

    storage:
      trace:
        s3:
          insecure_skip_verify: true  # renamed to tls_insecure_skip_verify

Comments: Renamed `insecure_skip_verify` to `tls_insecure_skip_verify` [PR 2407]

Bug fixes

For a complete list, refer to the Tempo changelog.

2.2.4

2.2.3

  • Fixed S3 credentials providers configuration [PR 2889]

2.2.2

  • Fixed node role auth IMDSv1 [PR 2760]

2.2.1

  • Fixed incorrect metrics for index failures [PR 2781]
  • Fixed a panic in the metrics-generator when using multiple tenants with default overrides [PR 2786]
  • Restored tenant_header_key, which was removed in PR 2414 [PR 2786]
  • Disabled streaming over HTTP by default [PR 2803]

2.2

  • Fixed an issue in the metrics-generator that prevented scaling up parallelism when remote writing of metrics was lagging behind [PR 2463]
  • Fixed an issue where metrics-generator was setting wrong labels for traces_target_info [PR 2546]
  • Fixed an issue where matches and other spanset-level attributes were not persisted to the TraceQL results [PR 2490]
  • Fixed an issue where ingester search could occasionally fail with a "file does not exist" error [PR 2534]