Version 2.6 release notes

The Tempo team is pleased to announce the release of Tempo 2.6.

This release gives you:

  • Additions to the TraceQL language, including the ability to search by span events, links, and arrays
  • Additions to TraceQL metrics query types, including a compare function and instant queries, which return faster than range queries
  • Performance and stability enhancements

Read the Tempo 2.6 blog post for more examples and details about these improvements.

These release notes highlight the most important features and bugfixes. For a complete list, refer to the Tempo changelog.

Features and enhancements

The most important features and enhancements in Tempo 2.6 are highlighted below.

Additional TraceQL metrics (experimental)

In this release, we’ve extended TraceQL metrics. As noted in the summary above, the additions include a compare function and instant queries.

Additionally, we’re working on refactoring the replication factor. Refer to the Operational change for TraceQL metrics section for details.

Note that using TraceQL metrics may require additional system resources.

For more information, refer to the TraceQL metrics queries and Configure TraceQL metrics.
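As a rough illustration of the query shapes these notes mention, the sketch below shows a plain rate query, a rate query grouped with by, and a compare query. The service name checkout and the choice of grouping attribute are hypothetical placeholders, not values from this release. Check the TraceQL metrics documentation for the functions available in your version.

```
{ } | rate()

{ status = error } | rate() by (resource.service.name)

{ resource.service.name = "checkout" } | compare({ status = error })
```

Each line is a separate query. The compare form takes a second span selector and contrasts the spans that match it against the rest of the spans selected by the first filter.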

TraceQL improvements

Unique to Tempo, TraceQL is a query language that lets you perform custom queries into your tracing data. To learn more about the TraceQL syntax, refer to the TraceQL documentation.

We’ve added event and link scopes. Like spans, both have intrinsics and attributes.

The event scope lets you query events that happen within a span. A span event is a unique point in time during the span’s duration. While spans help build the structural hierarchy of your services, span events can provide a deeper level of granularity to help debug your application faster and maintain optimal performance. To learn more about how you can use span events, read the What are span events? blog post. [PRs 3708, 3908]
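An event-scope query follows the same pattern as the link example later in this section. The following is a minimal sketch, assuming your spans record exception events that carry the OpenTelemetry exception.message attribute. The matched text timeout is a placeholder.

```
{ event.exception.message =~ ".*timeout.*" }
```

This returns spans that contain at least one event whose exception.message attribute matches the regular expression.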

If you’ve instrumented your traces for span links, you can use the link scope to search for an attribute within a span link. A span link associates one span with one or more other spans. [PRs 3814, 3741]

For more information on span links, refer to the Span Links documentation in the OpenTelemetry project.

You can search for an attribute in your link:

{ link.opentracing.ref_type = "child_of" }

A TraceQL example showing the link scope

We’ve also added autocomplete support for events and links. [PR 3846]

Tempo 2.6 improves TraceQL performance with these updates:

  • Performance improvement for rate() by () queries [PR 3719]
  • Add caching to query range queries [PR 3796]
  • Only stream diffs on metrics queries [PR 3808]
  • Tag value lookup uses protobuf internally for improved latency [PR 3731]
  • TraceQL metrics queries use protobuf internally for improved latency [PR 3745]
  • TraceQL search and other endpoints use protobuf internally for improved latency and resource usage [PR 3944]
  • Add local disk caching of metrics queries in local-blocks processor [PR 3799]
  • Performance improvement for queries using trace-level intrinsics [PR 3920]
  • Use multiple goroutines to unmarshal responses in parallel in the query frontend. [PR 3713]

Native histogram support

The metrics-generator can produce native histograms for high-resolution data. [PR 3789]

Native histograms are a Prometheus data type that lets you produce, store, and query high-resolution histograms of observations. They usually offer higher resolution and more straightforward instrumentation than classic histograms.

To learn more, refer to the Native histogram documentation.

Performance improvements

One of our major improvements in Tempo 2.6 is the reduction of memory usage due to polling improvements. [PRs 3950, 3951, 3952]

Comparison graph showing the reduction of memory usage due to the polling improvements

This improvement is a result of some of these changes:

  • Add data quality metric to measure traces without a root [PR 3812]
  • Reduce memory consumption of query-frontend [PR 3888]
  • Reduce allocs of caching middleware [PR 3976]
  • Reduce allocs building queriers sharded requests [PR 3932]
  • Improve trace id lookup from Tempo Vulture by selecting a date range [PR 3874]

Other enhancements and improvements

This release also has these notable updates:

Upgrade considerations

When upgrading to Tempo 2.6, be aware of these considerations and breaking changes.

Operational change for TraceQL metrics

We’ve changed to an RF1 (Replication Factor 1) pattern for TraceQL metrics as we were unable to hit performance goals for RF3 de-duplication. This requires some operational changes to query TraceQL metrics.

TraceQL metrics are still considered experimental. We hope to mark them GA soon when we productionize a complete RF1 write-read path. [PRs 3628, 3691, 3723, 3995]

For recent data

The local-blocks processor must be enabled before you can run metrics queries such as { } | rate(). If it isn’t enabled, metrics queries fail with the error localblocks processor not found. You can enable the local-blocks processor either per tenant or for all tenants.

  • Per-tenant in the per-tenant overrides:

    yaml
    overrides:
      'tenantID':
        metrics_generator_processors:
          - local-blocks
  • By default, for all tenants in the main config:

    yaml
    overrides:
      defaults:
        metrics_generator:
          processors: [local-blocks]

Add this configuration to run TraceQL metrics queries against all spans (and not just server spans):

yaml
metrics_generator:
  processor:
    local_blocks:
      filter_server_spans: false

For historical data

To run metrics queries on historical data, you must configure the local-blocks processor to flush RF1 blocks to object storage:

yaml
metrics_generator:
  processor:
    local_blocks:
      flush_to_storage: true
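
Taken together, the settings from the recent-data and historical-data subsections might be combined as in the following sketch. It assumes you want the processor enabled for all tenants and queries to cover all spans. Omit filter_server_spans if querying server spans is enough.

```yaml
# Enable the local-blocks processor for every tenant.
overrides:
  defaults:
    metrics_generator:
      processors: [local-blocks]

metrics_generator:
  processor:
    local_blocks:
      # Query all spans, not just server spans.
      filter_server_spans: false
      # Flush RF1 blocks to object storage so historical data can be queried.
      flush_to_storage: true
```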

Transition to vParquet4

vParquet4 format is now the default block format. It’s production ready and we highly recommend switching to it for improved query performance. [PR 3810]

Upgrading to Tempo 2.6 modifies the Parquet block format. You don’t need to do anything with Parquet to go from 2.5 to 2.6. If you used vParquet2 or vParquet3, all of your old blocks remain and can be read by Tempo 2.6. Tempo 2.6 creates vParquet4 blocks by default, which enables the new TraceQL features.

Although you can use Tempo 2.6 with vParquet2 or vParquet3, you can only use vParquet4 with Tempo 2.5 and later. If you are using 2.5 with vParquet4, you’ll need to upgrade to Tempo 2.6 to use the new TraceQL features.

You can also use the tempo-cli analyse blocks command to query vParquet4 blocks. [PR 3868] Refer to the Tempo CLI documentation for more information.

For information on upgrading, refer to Upgrade to Tempo 2.6 and Choose a different block format.
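
If you need to keep writing an older block format during the upgrade, the format is typically pinned in the storage block. The following is a minimal sketch, assuming the storage.trace.block.version setting. Confirm the exact key and the supported values in the Choose a different block format documentation.

```yaml
storage:
  trace:
    block:
      # Tempo 2.6 writes vParquet4 by default; set this only to stay on an older format.
      version: vParquet3
```

Blocks written as vParquet3 don't enable the new TraceQL features described above, so remove the setting once you are ready to move to vParquet4.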

Updated, removed, or renamed configuration parameters

| Parameter | Comments |
| --- | --- |
| storage.azure.use_v2_sdk | Removed. Azure v2 is the only and primary Azure backend [PR 3875] |
| autocomplete_filtering_enabled | The feature flag option has been removed. The feature is always enabled. [PR 3729] |
| completedfilepath and blocksfilepath | Removed unused WAL configuration options. [PR 3911] |
| compaction_disabled | New. Allow compaction disablement per-tenant. [PR 3965, documentation] |
| storage.s3.enable_dual_stack: <bool> | Boolean flag to activate or deactivate dualstack mode on the Storage block configuration for S3. [PR 3721, documentation] |

Bugfixes

For a complete list, refer to the Tempo changelog.

  • Fix panic in certain metrics queries using rate() with by. [PR 3847]
  • Fix metrics queries when grouping by attributes that may not exist. [PR 3734]
  • Fix metrics query histograms and quantiles on traceDuration. [PR 3879]
  • Fix divide by 0 bug in query frontend exemplar calculations. [PR 3936]
  • Fix autocomplete of a query using scoped intrinsics. [PR 3865]
  • Improved handling of complete blocks in localblocks processor after enabling flushing. [PR 3805]
  • Fix double appending the primary iterator on second pass with event iterator. [PR 3903]
  • Fix frontend parsing error on cached responses. [PR 3759]
  • max_global_traces_per_user: take into account ingestion.tenant_shard_size when converting to local limit. [PR 3618]
  • Fix HTTP connection reuse on GCP and AWS by reading io.EOF through the http body. [PR 3760]
  • Handle out-of-bounds span kinds. [PR 3861]
  • Maintain previous tenant blocklist on tenant errors. [PR 3860]
  • Fix prefix handling in Azure backend Find() call. [PR 3875]
  • Correct block end time when the ingested traces are outside the ingestion slack. [PR 3954]
  • Fix race condition where a streaming response could be marshaled while being modified in the combiner resulting in a panic. [PR 3961]
  • Pass search options to the backend for SearchTagValuesBlocksV2 requests. [PR 3971]