Important: This documentation is about an older version. It’s relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.
Version 2.2 release notes
The Tempo team is pleased to announce the release of Tempo 2.2.
This release gives you:
- Major additions to the TraceQL language: structural operators (descendant, child, sibling), results grouped by attribute (`by()`), a `select` operator, and 3 new intrinsic attributes.
- Faster TraceQL results, thanks to a streaming endpoint that returns partial results as the query executes, as well as a multitude of performance-related improvements.
- An experimental metrics-summary API that returns RED metrics for recently received spans grouped by your attribute of choice.
Tempo 2.2 makes vParquet2, a Parquet version designed to be more compatible with other Parquet implementations, the default block format. This block format is required for using structural operators and improves query performance relative to previous formats.
Read the Tempo 2.2 blog post for more examples and details about these improvements.
Note
For a complete list of changes, enhancements, and bug fixes refer to the Tempo 2.2 changelog.
Features and enhancements
Some of the most important features and enhancements in Tempo 2.2 are highlighted below.
Expanding the TraceQL language
With this release, we’ve added to the TraceQL language. TraceQL now offers:
- Structural operators: descendant (`>>`), child (`>`), and sibling (`~`) (documentation). Find relevant traces based on their structure and relationships among spans. [PR #2625 #2660]
- A `select()` operation that allows you to specify arbitrary span attributes that you want included in the TraceQL response (documentation) [PR 2494]
- A `by()` operation that groups span sets within a trace by an attribute of your choosing. This operation is not supported in the Grafana UI yet; you can only use `by()` when querying Tempo’s search API directly. (documentation) [PR 2490]
- New intrinsic attributes for use in TraceQL queries: `traceDuration`, `rootName`, and `rootServiceName` (documentation) [PR #2503]
Read the Tempo 2.2 blog post for examples of how to use these new language additions.
To learn more about the TraceQL syntax, see the TraceQL documentation. For information on planned future extensions to the TraceQL language, see future work.
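As a rough illustration of the language additions in this section, the following Python sketch builds request URLs for Tempo’s search API, one per new feature. The base URL, service names, span names, and attribute names are assumptions chosen for illustration. The sketch only constructs URLs and does not contact a Tempo instance.

```python
from urllib.parse import urlencode

# Assumption: a Tempo instance reachable at this address.
TEMPO_URL = "http://localhost:3200"

# One illustrative query per feature. All service, span, and attribute
# values are hypothetical.
EXAMPLE_QUERIES = {
    # Descendant: error spans anywhere below a "frontend" span.
    "descendant": '{ resource.service.name = "frontend" } >> { status = error }',
    # Child: GET spans that are direct children of a "frontend" span.
    "child": '{ resource.service.name = "frontend" } > { span.http.method = "GET" }',
    # Sibling: two spans that share the same parent.
    "sibling": '{ name = "db.query" } ~ { name = "cache.get" }',
    # select(): include an extra span attribute in the response.
    "select": "{ status = error } | select(span.http.status_code)",
    # by(): group matching spans within a trace by an attribute.
    "by": "{ status = error } | by(resource.service.name)",
    # New intrinsics: filter on trace-level duration and root service name.
    "intrinsics": '{ traceDuration > 2s && rootServiceName = "frontend" }',
}


def search_url(traceql: str, base_url: str = TEMPO_URL, limit: int = 20) -> str:
    """Return a URL-encoded search API request for a TraceQL query."""
    return f"{base_url}/api/search?{urlencode({'q': traceql, 'limit': limit})}"


if __name__ == "__main__":
    for feature, query in EXAMPLE_QUERIES.items():
        print(f"{feature}: {search_url(query)}")
```

Because `by()` is not yet supported in the Grafana UI, sending a request like the `by` example directly to the search API is the way to try it out.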
Get TraceQL results faster
We’re always trying to reduce the time you spend waiting to get results to your TraceQL queries, and we’ve made some nice progress on this front with this release.
We’ve added a gRPC streaming endpoint to Tempo’s query frontend that allows a client to stream search results from Tempo. The Tempo CLI has been updated to use this new streaming endpoint [PR #2366]. As of version 10.1, Grafana supports it as well, though you must first enable the `traceQLStreaming` feature toggle [PR #72288].
By streaming results to the client, you can start to look at traces matching your query before the entire query completes. This is particularly helpful for long-running queries; while the total time to complete the query is the same, you can start looking at your first matches before the full set of matched traces is returned.
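The difference between waiting for the full result set and receiving partial results can be sketched with a toy model in Python. This is not Tempo’s gRPC API: `search_blocking`, `search_streaming`, and the block contents are hypothetical names used only to show why the first matches arrive sooner while the total work stays the same.

```python
from typing import Iterable, Iterator, List

# Hypothetical data: each inner list holds the trace IDs matched in one block.
Blocks = Iterable[List[str]]


def search_blocking(blocks: Blocks) -> List[str]:
    """Scan every block, then return all matches at once."""
    results: List[str] = []
    for block in blocks:
        results.extend(block)
    return results


def search_streaming(blocks: Blocks) -> Iterator[List[str]]:
    """Yield the matches from each block as soon as that block is scanned."""
    for block in blocks:
        if block:  # skip blocks with no matches
            yield list(block)


if __name__ == "__main__":
    blocks = [["trace-a"], [], ["trace-b", "trace-c"]]
    # The caller can inspect the first batch before later blocks are scanned.
    for batch in search_streaming(blocks):
        print("partial result:", batch)
    print("full result:", search_blocking(blocks))
```

Both functions do the same amount of scanning. The streaming variant simply hands matches to the caller earlier.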
In addition to streaming partial results, we’ve merged a number of improvements to speed up TraceQL queries. Here are just a few:
- Add support for query batching between frontend and queriers to improve throughput [PR 2677]
- Improve performance of TraceQL regex [PR 2484]
- Fully skip over Parquet row groups with no matches in the column dictionaries [PR 2676]
- New synchronous read mode for vParquet and vParquet2 [PRs 2165, 2535]
- Improved TraceQL throughput by asynchronously creating jobs. [PR 2530]
Metrics summary API (experimental)
Tempo has added an experimental API that returns RED metrics (span count, erroring span count, and latency information) for spans of `kind=server` sent in the last hour, grouped by an attribute of your choice.
For example, you could use this API to compare error rates of spans with different values of the `namespace` attribute.
From here, you might see that spans from `namespace=A` have a significantly higher error rate than those from `namespace=B`.
As another example, you could use this API to compare latencies of your spans broken down by the `region` attribute.
From here, you might notice that spans from `region=North-America` have higher latencies than those from `region=Asia-Pacific`.
This API is meant to enable ad-hoc analysis of your incoming spans; by segmenting your spans by attribute and looking for differences in RED metrics, you can more quickly isolate where problems like elevated error rates or higher latencies are coming from.
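To make the idea of segmenting spans by attribute concrete, here is a minimal Python sketch that computes RED-style figures per attribute value over an in-memory list of spans. It illustrates the concept only: it is not Tempo’s implementation or its response format, and the span dictionary layout, the `red_summary` function, and the sample values are assumptions.

```python
from collections import defaultdict
from statistics import median
from typing import Any, Dict, List

# Assumed span shape for this sketch:
# {"kind": str, "status": str, "duration_ms": float, "attributes": {...}}
Span = Dict[str, Any]


def red_summary(spans: List[Span], group_by: str) -> Dict[str, Dict[str, float]]:
    """Group server spans by one attribute and compute per-group metrics."""
    groups: Dict[str, List[Span]] = defaultdict(list)
    for span in spans:
        if span.get("kind") != "server":
            continue  # only server spans are considered, as described above
        key = span["attributes"].get(group_by, "<unset>")
        groups[key].append(span)

    summary: Dict[str, Dict[str, float]] = {}
    for value, members in groups.items():
        errors = sum(1 for s in members if s["status"] == "error")
        summary[value] = {
            "span_count": len(members),
            "error_span_count": errors,
            "error_rate": errors / len(members),
            "p50_ms": median(s["duration_ms"] for s in members),
        }
    return summary


if __name__ == "__main__":
    sample = [
        {"kind": "server", "status": "error", "duration_ms": 100,
         "attributes": {"namespace": "A"}},
        {"kind": "server", "status": "ok", "duration_ms": 300,
         "attributes": {"namespace": "A"}},
        {"kind": "server", "status": "ok", "duration_ms": 50,
         "attributes": {"namespace": "B"}},
    ]
    for namespace, metrics in red_summary(sample, "namespace").items():
        print(namespace, metrics)
```

Comparing the `error_rate` or latency values across groups is the kind of ad-hoc analysis the API is intended to support.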
Unlike RED metrics computed by Tempo’s [metrics-generator](/docs/tempo/v2.4.x/metrics-generator/), the values returned by this API are not persisted as time series. This has the advantage that you do not need to provide your own time series database for storing and querying these metrics. It also allows you to compute RED metrics broken down by high cardinality attributes that would be too expensive to store in a time series database. Use the metrics-generator if you want to store and visualize RED metrics over multi-hour or multi-day time ranges, or if you want to alert on these metrics.
To learn more about this API, refer to the metrics summary API documentation.
This work is represented in multiple PRs: 2368, 2418, 2424, 2442, 2480, 2481, 2501, 2579, and 2582.
Other enhancements
Tempo’s tag values and tag names APIs now support filtering [PR 2253]. This lets you retrieve all valid attribute values and names given certain criteria. For example, you can get a list of values for the attribute `namespace` seen on spans with attribute `resource=A`. This feature is off by default; to enable it, configure `autocomplete_filtering_enabled` (documentation). Grafana’s autocomplete can make use of this filtering capability to provide better suggestions starting in v10.2 [PR 67845].
Tempo’s metrics-generator now supports span filtering. Setting up filters allows you to compute metrics over the specific spans you care about, excluding others. It can also reduce the cardinality of generated metrics, and therefore the cost of storing those metrics in a Prometheus-compatible TSDB. (documentation) [PR 2274]
Tempo’s metrics-generator can now detect virtual nodes (documentation) [PR 2365]. As a result, you’ll now see these virtual nodes represented in your service graph. For more information, refer to the virtual nodes documentation.
Upgrade considerations
When upgrading to Tempo 2.2, be aware of these breaking changes:
- Jsonnet users only: We’ve converted the metrics-generator component from a Kubernetes Deployment to a Kubernetes StatefulSet. Refer to the PR for seamless migration instructions. [PRs #2533, #2467]
- Removed or renamed configuration parameters (see section below)
While not a breaking change, upgrading to Tempo 2.2 will by default change Tempo’s block format to vParquet2.
To stay on a previous block format, read the Parquet configuration documentation.
We strongly encourage upgrading to vParquet2 as soon as possible, as this is required for using structural operators in your TraceQL queries and provides query performance improvements, in particular on queries using the `duration` intrinsic.
Removed or renamed configuration parameters
The following fields were removed or renamed.
| Parameter | Comments |
| --- | --- |
| `tolerant_failed_blocks` | Removed support for `tolerant_failed_blocks` [PR 2416] |
| `insecure_skip_verify` | Renamed `insecure_skip_verify` to `tls_insecure_skip_verify` [PR 2407] |
Bug fixes
For a complete list, refer to the Tempo changelog.
2.2.4
- Updated Alpine image version to 3.18 to patch CVE-2022-48174 PR 3046
- Bumped Jaeger query docker image to 1.50.0 PR 2998
2.2.3
- Fixed S3 credentials providers configuration PR 2889
2.2.2
- Fixed node role auth IMDSv1 PR 2760
2.2.1
- Fixed incorrect metrics for index failures PR 2781
- Fixed a panic in the metrics-generator when using multiple tenants with default overrides PR 2786
- Restored `tenant_header_key` (removed in PR 2414) PR 2786
- Disabled streaming over HTTP by default PR 2803
2.2
- Fixed an issue in the metrics-generator that prevented scaling up parallelism when remote writing of metrics was lagging behind [PR 2463]
- Fixed an issue where metrics-generator was setting wrong labels for `traces_target_info` [PR 2546]
- Fixed an issue where matches and other spanset level attributes were not persisted to the TraceQL results. [PR 2490]
- Fixed an issue where ingester search could occasionally fail with a `file does not exist` error [PR 2534]