Important: This documentation is about an older version. It's relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.
Version 2.4 release notes
The Tempo team is pleased to announce the release of Tempo 2.4.
This release gives you:
- New features, including multi-tenant queries and experimental TraceQL metrics queries
- Performance enhancements, thanks to the addition of new caching tiers
- Cost savings, thanks to polling improvements that reduce calls to object storage
As part of this release, vParquet3 has also been promoted to the new default storage format for traces. For more about why we’re so excited about vParquet3, refer to Accelerate TraceQL queries at scale with dedicated attribute columns in Grafana Tempo.
Read the Tempo 2.4 blog post for more examples and details about these improvements.
These release notes highlight the most important features and bugfixes. For a complete list, refer to the Tempo changelog.
Features and enhancements
The most important features and enhancements in Tempo 2.4 are highlighted below.
Multi-tenant queries
Tempo now allows you to query multiple tenants at once. We’ve made multi-tenant queries compatible with streaming (first released in v2.2) so you can get query results as fast as possible. To learn more, refer to Cross-tenant federation and Enable multi-tenancy. [PRs 3262, 3087]
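As a rough illustration, a multi-tenant search request might be assembled as shown below. This sketch assumes that tenant IDs are passed in the X-Scope-OrgID header separated by a | character, as described in the Cross-tenant federation documentation. The host name, tenant names, and the build_multi_tenant_search helper are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_multi_tenant_search(base_url, tenants, traceql):
    """Return (url, headers) for a search that spans several tenants.

    Assumption: tenant IDs are joined with "|" in the X-Scope-OrgID header.
    """
    if not tenants:
        raise ValueError("at least one tenant is required")
    query = urlencode({"q": traceql})
    url = f"{base_url}/api/search?{query}"
    headers = {"X-Scope-OrgID": "|".join(tenants)}
    return url, headers


# Hypothetical host and tenant names, for illustration only.
url, headers = build_multi_tenant_search(
    "http://tempo.example:3200",
    ["tenant-a", "tenant-b"],
    '{ resource.service.name = "foo" }',
)
print(url)
print(headers)
```

Keeping the tenant list in a single header means the same query string can be reused unchanged for single-tenant and multi-tenant requests.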
TraceQL metrics (experimental)
We’re excited to announce the addition of metrics queries to the TraceQL language. Metric queries extend trace queries by applying a function to trace query results. This powerful feature creates metrics from traces, much in the same way that LogQL metric queries create metrics from logs.
The following query calculates the rate of erroring spans coming from the service foo. Rate is a spans/sec quantity.
{ resource.service.name = "foo" && status = error } | rate()
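To make the spans/sec unit concrete, the arithmetic behind a rate can be sketched as the number of matching spans divided by the length of the time window in seconds. The span_rate function and the sample timestamps below are hypothetical and only illustrate the unit; they are not how Tempo evaluates rate() internally.

```python
def span_rate(span_start_times, window_start, window_end):
    """Return matching spans per second within [window_start, window_end)."""
    duration = window_end - window_start
    if duration <= 0:
        raise ValueError("window must have positive length")
    matching = sum(1 for t in span_start_times if window_start <= t < window_end)
    return matching / duration


# Start times (in seconds) of spans that are assumed to have already matched
# the filter { resource.service.name = "foo" && status = error }.
error_span_starts = [0.5, 1.2, 3.9, 7.0, 9.9, 12.4]

# Five of the six spans start inside the 10-second window.
print(span_rate(error_span_starts, 0, 10))  # 0.5 spans/sec
```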
In addition, you can use Grafana Explore to query and visualize the metrics with the Tempo data source in Grafana or Grafana Cloud.
For more information, refer to the TraceQL metrics documentation. [PRs 3227, 3252, 3258]
To learn more about the TraceQL syntax, see the TraceQL documentation. For information on planned future extensions to the TraceQL language, refer to future work.
TraceQL performance improvements
We continue to make query performance improvements so you spend less time waiting on results to your TraceQL queries. Below are some notable PRs that made it into this release:
- Improve TraceQL regex performance in certain queries. [PR 3139]
- Improve TraceQL performance in complex queries. [PR 3113]
- TraceQL/Structural operators performance improvement. [PR 3088]
vParquet3 is now the default block format
Tempo 2.4 makes vParquet3 the default storage format.
We’re excited about vParquet3 relative to prior formats because of its support for dedicated attribute columns, which help speed up queries on your largest and most queried attributes. We’ve seen excellent performance improvements when running it ourselves, and by promoting it to the default, we’re signaling that it is ready for broad adoption.
Dedicated attribute columns, available using vParquet3, improve query performance by storing the largest and most frequently used attributes in their own columns, rather than in the generic attribute key-value list. For more information, refer to Accelerate TraceQL queries at scale with dedicated attribute columns in Grafana Tempo. [PR 2526]
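The difference between the two layouts can be sketched conceptually as follows. This is a simplified in-memory model, not the Parquet schema or Tempo's implementation; the sample attributes and the helper names (find_generic, promote, find_dedicated) are hypothetical.

```python
# Generic layout: every attribute is one entry in a key-value list, so
# finding one key means scanning the entries of each span.
generic_spans = [
    [("http.method", "GET"), ("http.url", "/cart"), ("db.system", "redis")],
    [("http.method", "POST"), ("http.url", "/checkout")],
]


def find_generic(spans, key, value):
    """Return indexes of spans whose key-value list contains (key, value)."""
    return [i for i, attrs in enumerate(spans) if (key, value) in attrs]


def promote(spans, key):
    """Split spans into a dedicated column for key and the remaining lists.

    The column holds one slot per span (None when the attribute is absent).
    """
    column, rest = [], []
    for attrs in spans:
        column.append(next((v for k, v in attrs if k == key), None))
        rest.append([(k, v) for k, v in attrs if k != key])
    return column, rest


def find_dedicated(column, value):
    """Return indexes of spans whose dedicated column equals value."""
    return [i for i, v in enumerate(column) if v == value]


url_column, remaining = promote(generic_spans, "http.url")
print(find_generic(generic_spans, "http.url", "/checkout"))  # [1]
print(find_dedicated(url_column, "/checkout"))               # [1]
```

Both lookups return the same spans, but the dedicated column reads a single value per span instead of walking every key-value pair, which is the intuition behind the query speedup.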
If you had manually configured vParquet3, we recommend removing it to move forward with Tempo defaults.
To read more about the design of vParquet3, refer to the design proposal. For general information, refer to the Apache Parquet schema.
Additional caching layers
Tempo has added two new caches to improve TraceQL query performance. The frontend-search cache handles job search caching. The parquet-page cache handles page-level caching. Refer to the Cache section of the Configuration documentation for how to configure these new caching layers. As part of adding these new caching layers, we’ve refactored our caching interface. This includes breaking changes described in the Upgrade considerations section. [PRs 3166, 3225, 3196]
Polling improvements for cost reduction
We’ve improved how Tempo polls object storage, ensuring that we reuse previous results. This has dramatically reduced the number of requests Tempo makes to the object store. This not only reduces the load on your object store but, for many users, also saves money, since most hosted object storage solutions charge per request.
We’ve also added the list_blocks_concurrency parameter to allow you to tune the number of list calls Tempo makes in parallel to object storage so you can select the value that works best for your environment. We’ve set the default value to 3, which should work well for the average Tempo cluster. [PR 2652]
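The two ideas in this section, reusing previous polling results and capping parallel list calls, can be sketched as follows. This is a conceptual model only: the poll function, the prefix-based split of the listing, and the in-memory store are assumptions made for illustration and do not reflect Tempo's actual poller.

```python
from concurrent.futures import ThreadPoolExecutor


def poll(prefixes, list_prefix, fetch_meta, previous, list_blocks_concurrency=3):
    """Return {block_id: meta} for the current contents of object storage.

    list_prefix and fetch_meta are hypothetical callables standing in for
    object storage requests; previous is the result of the last cycle.
    """
    # At most list_blocks_concurrency list calls are in flight at once.
    with ThreadPoolExecutor(max_workers=list_blocks_concurrency) as pool:
        listed = [b for ids in pool.map(list_prefix, prefixes) for b in ids]
    # Reuse metadata from the previous cycle; fetch only blocks that are new.
    return {b: previous[b] if b in previous else fetch_meta(b) for b in listed}


# A fake object store keyed by prefix, plus a log of metadata requests.
store = {"0": ["0a", "0b"], "1": ["1a"], "2": []}
meta_calls = []


def list_prefix(prefix):
    return list(store[prefix])


def fetch_meta(block_id):
    meta_calls.append(block_id)
    return {"id": block_id}


first = poll(store, list_prefix, fetch_meta, previous={})
store["2"].append("2a")  # one new block appears between polling cycles
second = poll(store, list_prefix, fetch_meta, previous=first)
print(meta_calls)  # ['0a', '0b', '1a', '2a']
```

In the second cycle only the new block triggers a metadata request, which is where the savings on per-request storage charges would come from.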
Other enhancements and improvements
In addition, the following improvements have been made in Tempo 2.4:
- Improved Tempo error handling on writes, so that one erroring trace doesn’t result in an entire batch of traces being dropped. PR 2571
- Added per-tenant compaction window. PR 3129
- Added --max-start-time and --min-start-time flags to the tempo-cli command analyse blocks. PR 3250
- Added per-tenant configurable remote_write headers to metrics-generator. PR 3175
- Added variable expansion support to overrides configuration. PR 3175
- Added HTML pages /status/overrides and /status/overrides/{tenant}. PRs 3244, 3332
- Precalculate and reuse the vParquet3 schema before opening blocks. PR 3367
- Made the trace ID label name configurable for remote written exemplars. PR 3074
- Performance improvements in span filtering. PR 3025
- Introduced localblocks process configuration option to select only server spans. PR 3303
Upgrade considerations
When upgrading to Tempo 2.4, be aware of these considerations and breaking changes.
Transition to vParquet3
vParquet3 format is now the default block format. It is production ready and we highly recommend switching to it for improved query performance and dedicated attribute columns.
Upgrading to Tempo 2.4 modifies the Parquet block format. Although you can use Tempo 2.3 with vParquet2 or vParquet3, you can only use Tempo 2.4 with vParquet3.
With this release, the first version of our Parquet backend, vParquet, is being deprecated. Tempo 2.4 still reads vParquet blocks. However, Tempo exits with an error if vParquet is manually configured as the block format. [PR 3377]
For information on upgrading, refer to Upgrade to Tempo 2.4 and Choose a different block format.
Updated, removed, or renamed configuration parameters
| Parameter | Comments |
| --- | --- |
| autocomplete_filtering_enabled | Set to true by default. [PR 3178] |
| distributor.log_received_traces | Use the distributor.log_received_spans configuration block instead. [PR 3008] |
| tempo_query_frontend_queries_total{op="searchtags\|metrics"} | Removed deprecated frontend metrics configuration option. |
| | These fields have been removed in favor of the new cache configuration. Refer to Cache configuration refactored. |
The distributor now returns 200 for any batch containing only trace_too_large and max_live_traces errors. The number of discarded spans is still reflected in the tempo_discarded_spans_total metric.
Removed experimental websockets support for search streaming
GRPC is now the supported method for streaming results. Websockets support for search streaming has been removed. Websockets support was initially added due to conflicts with GRPC, HTTP, and TLS. Those issues were corrected in PR 3300. [PR 3307]
Cache configuration refactored
The major cache refactor allows multiple role-based caches to be configured. [PR 3166]
This change resulted in the following fields being deprecated.
These have all been migrated to a top-level cache: field.
For more information about the configuration, refer to the Cache section.
The old configuration block looked like this:
storage:
  trace:
    cache:
    search:
      cache_control:
    background_cache:
    memcached:
    redis:
With the new configuration, you create your list of caches, each backed by either a redis or memcached cluster with your configuration, and then define the types of data and roles each cache serves.
cache:
  caches:
  - memcached:
      host: <some memcached cluster>
    roles:
    - bloom
    - parquet-footer
  - memcached:
      host: <some memcached cluster>
    roles:
    - frontend-search
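One way to read the new layout is as a mapping from each role to the cache that serves it. The sketch below mirrors the structure of the configuration above as a Python dictionary and resolves a role to its backend. The host names and the cache_for_role helper are hypothetical and are not part of Tempo.

```python
# Mirrors the shape of the top-level cache block; host names are placeholders.
cache_config = {
    "caches": [
        {
            "memcached": {"host": "memcached-a.example"},
            "roles": ["bloom", "parquet-footer"],
        },
        {
            "memcached": {"host": "memcached-b.example"},
            "roles": ["frontend-search"],
        },
    ]
}


def cache_for_role(config, role):
    """Return (backend, settings) of the first cache listing role, or None."""
    for cache in config["caches"]:
        if role in cache.get("roles", []):
            # The backend is the only key in the entry other than "roles".
            backend = next(k for k in cache if k != "roles")
            return backend, cache[backend]
    return None


print(cache_for_role(cache_config, "bloom"))
print(cache_for_role(cache_config, "frontend-search"))
print(cache_for_role(cache_config, "parquet-page"))  # no cache lists this role
```

Because roles are attached to caches rather than the other way around, moving a role to a different cluster only requires editing the roles lists.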
Security fixes
The following vulnerabilities have been addressed:
- Addressed CVE-2023-5363.
- Updated the memcached default image in Jsonnet for multiple CVEs. PR 3310
- Updated golang.org/x/net package to 0.24.0 to fix CVE-2023-45288. PR 3613
Bugfixes
For a complete list, refer to the Tempo changelog.
2.4.2
- Update golang.org/x/net package to 0.24.0 to fix CVE-2023-45288. PR 3613
2.4.1
- Fixed compaction/retention in AWS S3 and GCS when a prefix is configured. PR 3465
2.4.0
- Prevent building parquet iterators that would loop forever. PR 3159
- Sanitize name in mapped dimensions in span-metrics processor. PR 3171
- Fixed an issue where cached footers were requested then ignored. PR 3196
- Fixed a panic in autocomplete when the query condition had the wrong type. PR 3277
- Fixed TLS when GRPC is enabled on HTTP. PR 3300
- Correctly return 400 when max limit is requested on search. PR 3340
- Fixed autocomplete filters sometimes returning erroneous results. PR 3339
- Fixed trace context propagation between query-frontend and querier. PR 3387
- Fixed parsing of span.resource.xyz attributes in TraceQL. PR 3284
- Changed exit code if config is successfully verified. PR 3174
- The tempo-cli analyze blocks command no longer fails on compacted blocks. PR 3183
- Moved waitgroup handling for poller error condition. PR 3224
- Fixed head block excessive locking in ingester search. PR 3328
- Fixed an issue where the ingester failed to write traces to disk after a crash or unclean restart. PR 3346