<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Components on Grafana Labs</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/</link><description>Recent content in Components on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/tempo/v3.0.x/reference-tempo-architecture/components/index.xml" rel="self" type="application/rss+xml"/><item><title>Distributor</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/distributor/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/distributor/</guid><content><![CDATA[&lt;h1 id=&#34;distributor&#34;&gt;Distributor&lt;/h1&gt;
&lt;p&gt;The distributor is the entry point for all trace data into Tempo.
It receives spans from instrumented applications and validates them against configured limits.&lt;/p&gt;
&lt;p&gt;How the distributor forwards data depends on the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/&#34;&gt;deployment mode&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Microservices mode: The distributor shards traces by trace ID and writes them to Kafka. Downstream components, including block-builders, live-stores, and metrics-generators, each consume from Kafka independently.&lt;/li&gt;
&lt;li&gt;Monolithic mode: The distributor pushes data in-process directly to the live-store and metrics-generator. No Kafka is required.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;receiving-traces&#34;&gt;Receiving traces&lt;/h2&gt;
&lt;p&gt;The distributor uses the receiver layer from the &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry Collector&lt;/a&gt; and accepts spans in multiple formats:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OpenTelemetry Protocol (OTLP) over gRPC and HTTP, which is the recommended format&lt;/li&gt;
&lt;li&gt;Jaeger (Thrift and gRPC)&lt;/li&gt;
&lt;li&gt;Zipkin&lt;/li&gt;
&lt;li&gt;Kafka&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We recommend using OTLP over gRPC when possible.
Both &lt;a href=&#34;https://github.com/grafana/alloy/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Grafana Alloy&lt;/a&gt; and the OpenTelemetry Collector support OTLP export natively.&lt;/p&gt;
&lt;h2 id=&#34;validation-and-rate-limiting&#34;&gt;Validation and rate limiting&lt;/h2&gt;
&lt;p&gt;Before forwarding data, the distributor validates incoming data against configured ingestion limits.
These are the only limits enforced synchronously at ingestion time.&lt;/p&gt;
&lt;p&gt;The ingestion rate limit sets the maximum bytes per second per tenant.
Exceeding this limit returns a &lt;code&gt;RATE_LIMITED&lt;/code&gt; error to the client.
The ingestion burst size controls the maximum burst allowed above the sustained rate.
For details on which settings honor the global strategy and which are always local, refer to 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#ingestion-rate-strategy&#34;&gt;Ingestion rate strategy&lt;/a&gt;.&lt;/p&gt;
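&lt;p&gt;The interaction between the sustained rate and the burst size can be sketched as follows, assuming a token-bucket model. The &lt;code&gt;TokenBucket&lt;/code&gt; class and its parameter names are hypothetical and only illustrate the concept. They are not Tempo code.&lt;/p&gt;

```python
class TokenBucket:
    """Illustrative per-tenant limiter: rate in bytes per second, burst in bytes."""

    def __init__(self, rate_bytes_per_sec: float, burst_bytes: float, now: float = 0.0):
        self.rate = rate_bytes_per_sec
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = now

    def allow(self, size_bytes: int, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes > self.tokens:
            return False  # the caller would report a rate-limit error here
        self.tokens -= size_bytes
        return True


bucket = TokenBucket(rate_bytes_per_sec=1000, burst_bytes=2000)
first = bucket.allow(2000, now=0.0)   # spends the whole burst allowance
second = bucket.allow(500, now=0.1)   # only about 100 bytes refilled so far
third = bucket.allow(500, now=1.0)    # enough tokens have accumulated again
```

&lt;p&gt;In this sketch, the burst size bounds how much a tenant can send at once, and the rate sets how quickly that allowance recovers.&lt;/p&gt;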
&lt;p&gt;Other limits are enforced asynchronously downstream. For example, live-stores enforce &lt;code&gt;max_live_traces_bytes&lt;/code&gt;, and downstream components, including block-builders in microservices mode, enforce &lt;code&gt;max_bytes_per_trace&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When the distributor refuses spans due to rate limits,
it increments the &lt;code&gt;tempo_discarded_spans_total&lt;/code&gt; metric with a &lt;code&gt;reason&lt;/code&gt; label indicating why.&lt;/p&gt;
&lt;h3 id=&#34;logging-discarded-spans&#34;&gt;Logging discarded spans&lt;/h3&gt;
&lt;p&gt;To log individual discarded spans for debugging:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;distributor:
  log_discarded_spans:
    enabled: true
    include_all_attributes: false&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Setting &lt;code&gt;include_all_attributes: true&lt;/code&gt; produces more verbose logs that include span attributes,
which can help identify misbehaving clients.&lt;/p&gt;
&lt;h2 id=&#34;writing-to-kafka-microservices-mode&#34;&gt;Writing to Kafka (microservices mode)&lt;/h2&gt;
&lt;p&gt;In microservices mode, after validation, the distributor shards traces by hashing the trace ID,
looks up the partition ring to determine which Kafka partitions are active,
and writes records to the appropriate partitions.
It waits for Kafka to acknowledge the write before returning a response to the client.&lt;/p&gt;
&lt;p&gt;A write counts as successful only after Kafka acknowledges it.
This ensures that once the client gets a success response, the data is durably stored.&lt;/p&gt;
&lt;h3 id=&#34;partitioning&#34;&gt;Partitioning&lt;/h3&gt;
&lt;p&gt;The distributor shards traces by trace ID, meaning all spans for the same trace go to the same Kafka partition.
This has two benefits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Block-builders can build blocks where all spans for a trace are co-located within a single consumption cycle.&lt;/li&gt;
&lt;li&gt;Live-stores can serve complete traces from a single partition without cross-partition coordination.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The distributor uses the partition ring, not Kafka&amp;rsquo;s partition routing, to determine target partitions.
This allows Tempo to control the partition lifecycle independently of Kafka.&lt;/p&gt;
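&lt;p&gt;A minimal sketch of trace-ID sharding against a set of active partitions. The hash function (SHA-256), the modulo selection, and the &lt;code&gt;pick_partition&lt;/code&gt; helper are assumptions for illustration. The actual hashing and ring lookup in Tempo may differ.&lt;/p&gt;

```python
import hashlib


def pick_partition(trace_id: bytes, active_partitions: list[int]) -> int:
    """Map a trace ID to one of the active partitions (illustrative only)."""
    if not active_partitions:
        raise ValueError("no active partitions in the ring")
    # Hash the trace ID so every span of a trace yields the same value.
    digest = hashlib.sha256(trace_id).digest()
    token = int.from_bytes(digest[:8], "big")
    # Choose among partitions the ring reports as active.
    ordered = sorted(active_partitions)
    return ordered[token % len(ordered)]


ring_active = [0, 1, 2, 5]  # hypothetical ring state
trace = bytes.fromhex("0af7651916cd43dd8448eb211c80319c")
a = pick_partition(trace, ring_active)
b = pick_partition(trace, ring_active)  # same trace ID, same partition
```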
&lt;h2 id=&#34;in-process-push-monolithic-mode&#34;&gt;In-process push (monolithic mode)&lt;/h2&gt;
&lt;p&gt;In monolithic mode, the distributor pushes trace data directly to the live-store and metrics-generator within the same process. No Kafka producer is initialized, and the distributor doesn&amp;rsquo;t use the partition ring for routing. The write is acknowledged to the client after the live-store accepts the data.&lt;/p&gt;
&lt;h2 id=&#34;key-metrics&#34;&gt;Key metrics&lt;/h2&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Metric&lt;/th&gt;
              &lt;th&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_distributor_spans_received_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Total spans received by the distributor&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_discarded_spans_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Spans discarded, labeled by &lt;code&gt;reason&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_distributor_bytes_received_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Total bytes received&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;rate(tempo_distributor_spans_received_total[5m])&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Current ingestion rate in spans per second, derived in PromQL from the received spans counter&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#distributor&#34;&gt;distributor configuration&lt;/a&gt; for the full list of options.&lt;/p&gt;
]]></content><description>&lt;h1 id="distributor">Distributor&lt;/h1>
&lt;p>The distributor is the entry point for all trace data into Tempo.
It receives spans from instrumented applications and validates them against configured limits.&lt;/p></description></item><item><title>Kafka</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/kafka/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/kafka/</guid><content><![CDATA[&lt;h1 id=&#34;kafka&#34;&gt;Kafka&lt;/h1&gt;
&lt;p&gt;In 
    &lt;a href=&#34;/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/&#34;&gt;microservices mode&lt;/a&gt;, Tempo uses a Kafka-compatible message queue as the backbone of its write path. Any Kafka-compatible system works.&lt;/p&gt;
&lt;p&gt;Kafka isn&amp;rsquo;t used in 
    &lt;a href=&#34;/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/&#34;&gt;monolithic mode&lt;/a&gt;. In monolithic mode, the distributor pushes data in-process directly to the live-store and metrics-generator.&lt;/p&gt;
&lt;h2 id=&#34;role-in-the-architecture&#34;&gt;Role in the architecture&lt;/h2&gt;
&lt;p&gt;Kafka serves as a durable write-ahead log (WAL) between distributors and downstream consumers (block-builders, live-stores, and metrics-generators).&lt;/p&gt;
&lt;p&gt;With Kafka, durability is centralized. Once Kafka acknowledges a write, the data is safe regardless of what happens to any Tempo component. Consumers are stateless: block-builders and live-stores can crash and restart, replaying from their last committed Kafka offset to rebuild state without data loss. Because Kafka provides durability, Tempo doesn&amp;rsquo;t need to replicate data across multiple instances on the write path. This allows a replication factor of 1, which significantly reduces storage costs.&lt;/p&gt;
&lt;h2 id=&#34;partitioning&#34;&gt;Partitioning&lt;/h2&gt;
&lt;p&gt;Kafka topics are divided into partitions. Distributors hash the trace ID to determine the target partition. Each Kafka partition is consumed by exactly one block-builder instance and one live-store instance (per availability zone).&lt;/p&gt;
&lt;p&gt;Tempo maintains its own partition ring that maps Tempo partitions to Kafka partitions. While these are typically 1:1, the partition ring is logically independent from Kafka&amp;rsquo;s partition metadata. Refer to the &lt;a href=&#34;../partition-ring/&#34;&gt;partition ring&lt;/a&gt; documentation for details.&lt;/p&gt;
&lt;h3 id=&#34;scaling-partitions&#34;&gt;Scaling partitions&lt;/h3&gt;
&lt;p&gt;The number of Kafka partitions determines the maximum parallelism for block-builders and live-stores. Each partition is owned by exactly one instance of each consumer type.&lt;/p&gt;
&lt;p&gt;To scale block-builders or live-stores horizontally, you need at least as many partitions as instances. Adding Kafka partitions is a Kafka-side operation. Block-builders and live-stores use static partition assignment based on their instance ordinal, so scaling them requires adding both Kafka partitions and StatefulSet replicas together.&lt;/p&gt;
&lt;h2 id=&#34;consumer-groups&#34;&gt;Consumer groups&lt;/h2&gt;
&lt;p&gt;Tempo runs multiple independent consumer groups against the same Kafka topic:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Consumer group&lt;/th&gt;
              &lt;th&gt;Component&lt;/th&gt;
              &lt;th&gt;Purpose&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;block-builder&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Block-builder&lt;/td&gt;
              &lt;td&gt;Builds blocks for long-term storage&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;live-store&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Live-store&lt;/td&gt;
              &lt;td&gt;Serves recent data for queries&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;metrics-generator&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Metrics-generator&lt;/td&gt;
              &lt;td&gt;Derives metrics from trace data&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;Each consumer group tracks its own offsets. Block-builders and live-stores consume the same data independently and at their own pace. A slow block-builder doesn&amp;rsquo;t affect live-store availability, and vice versa.&lt;/p&gt;
&lt;h2 id=&#34;retention-and-offset-management&#34;&gt;Retention and offset management&lt;/h2&gt;
&lt;p&gt;Kafka&amp;rsquo;s retention policy determines how far back consumers can replay. Set it high enough to cover the block-builder&amp;rsquo;s consumption cycle time (plus a buffer for failures and restarts) and the live-store&amp;rsquo;s replay window on startup.&lt;/p&gt;
&lt;p&gt;If a consumer falls behind Kafka&amp;rsquo;s retention window, it loses the ability to replay missed data. Monitor consumer lag to avoid this situation.&lt;/p&gt;
&lt;h3 id=&#34;key-metrics-for-monitoring-consumer-lag&#34;&gt;Key metrics for monitoring consumer lag&lt;/h3&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;tempo_ingest_group_partition_lag{group=&amp;#34;&amp;lt;consumer-group&amp;gt;&amp;#34;}
tempo_ingest_group_partition_lag_seconds{group=&amp;#34;&amp;lt;consumer-group&amp;gt;&amp;#34;}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;code&gt;tempo_ingest_group_partition_lag&lt;/code&gt; tracks lag in number of records per partition. &lt;code&gt;tempo_ingest_group_partition_lag_seconds&lt;/code&gt; tracks lag in wall-clock seconds.&lt;/p&gt;
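&lt;p&gt;One way to reason about lag in seconds is to compare it with the Kafka retention window. The following sketch is only arithmetic. The helper names and the 50% alert threshold are arbitrary assumptions, not recommended values.&lt;/p&gt;

```python
def lag_budget(retention_seconds: float, lag_seconds: float) -> float:
    """Fraction of the retention window that consumer lag has already used."""
    if not retention_seconds > 0:
        raise ValueError("retention must be positive")
    return lag_seconds / retention_seconds


def should_alert(retention_seconds: float, lag_seconds: float, threshold: float = 0.5) -> bool:
    # Alert well before lag reaches the retention window, where replay is lost.
    return lag_budget(retention_seconds, lag_seconds) >= threshold


# Hypothetical numbers: 24 hours of retention, 30 minutes of lag.
ok = should_alert(24 * 3600, 30 * 60)
# The same retention with 13 hours of lag crosses the threshold.
late = should_alert(24 * 3600, 13 * 3600)
```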
&lt;h2 id=&#34;configuration&#34;&gt;Configuration&lt;/h2&gt;
&lt;p&gt;Kafka connection settings are configured under the &lt;code&gt;ingest&lt;/code&gt; section:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;ingest:
  kafka:
    address: kafka:9092
    topic: tempo-traces&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#ingest&#34;&gt;ingest configuration&lt;/a&gt; for Kafka connection settings.&lt;/p&gt;
]]></content><description>&lt;h1 id="kafka">Kafka&lt;/h1>
&lt;p>In
&lt;a href="/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/">microservices mode&lt;/a>, Tempo uses a Kafka-compatible message queue as the backbone of its write path. Any Kafka-compatible system works.&lt;/p>
&lt;p>Kafka isn&amp;rsquo;t used in
&lt;a href="/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/">monolithic mode&lt;/a>. In monolithic mode, the distributor pushes data in-process directly to the live-store and metrics-generator.&lt;/p></description></item><item><title>Block-builder</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/block-builder/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/block-builder/</guid><content><![CDATA[&lt;h1 id=&#34;block-builder&#34;&gt;Block-builder&lt;/h1&gt;
&lt;p&gt;The block-builder is the write-path component responsible for building Parquet blocks and flushing them to object storage.
It consumes trace data from Kafka and organizes it into blocks suitable for long-term retention and efficient querying.&lt;/p&gt;
&lt;p&gt;The block-builder only runs in 
    &lt;a href=&#34;/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/&#34;&gt;microservices mode&lt;/a&gt;. In monolithic mode, the live-store handles flushing trace data to object storage directly.&lt;/p&gt;
&lt;p&gt;For a configuration block example, refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#block-builder&#34;&gt;block-builder section&lt;/a&gt; of the Configuration documentation.&lt;/p&gt;
&lt;h2 id=&#34;consumption-cycle&#34;&gt;Consumption cycle&lt;/h2&gt;
&lt;p&gt;The block-builder operates on a cyclical consumption model.&lt;/p&gt;
&lt;p&gt;On each cycle, the block-builder rewinds to the last committed Kafka offset to ensure any partially processed data from a previous cycle is re-consumed.
It reads records from Kafka up to a configured boundary (time-based), organizes the consumed spans by tenant,
writes them into Parquet blocks on local disk, uploads the blocks to object storage, and commits the Kafka offset.&lt;/p&gt;
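&lt;p&gt;The cycle described above can be sketched with an in-memory list standing in for a Kafka partition. The &lt;code&gt;run_cycle&lt;/code&gt; function, the record layout, and the &lt;code&gt;storage&lt;/code&gt; list are hypothetical. Real block-builders write Parquet and upload to object storage.&lt;/p&gt;

```python
def run_cycle(log, committed_offset, boundary_ts, storage):
    """One illustrative cycle over a list of (timestamp, tenant, span) records."""
    offset = committed_offset  # rewind to the last committed offset
    per_tenant = {}
    # Read records up to the time-based boundary.
    while offset != len(log) and boundary_ts >= log[offset][0]:
        _, tenant, span = log[offset]
        per_tenant.setdefault(tenant, []).append(span)
        offset += 1
    for tenant, spans in per_tenant.items():
        # Stand-in for writing a block locally and uploading it.
        storage.append({"tenant": tenant, "spans": spans,
                        "offsets": (committed_offset, offset)})
    return offset  # the caller commits this only after the uploads succeed


log = [(1, "a", "s1"), (2, "b", "s2"), (3, "a", "s3"), (10, "a", "s4")]
storage = []
committed = run_cycle(log, 0, boundary_ts=5, storage=storage)
```

&lt;p&gt;Because the offset is committed last, a crash at any earlier step causes the next cycle to re-consume the same records.&lt;/p&gt;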
&lt;h3 id=&#34;hard-cuts&#34;&gt;Hard cuts&lt;/h3&gt;
&lt;p&gt;The block-builder performs a hard cut at the end of each consumption cycle.
All spans consumed during that cycle are flushed into blocks, regardless of whether the traces they belong to are complete.
If a trace has spans arriving across two consumption cycles, those spans end up in separate blocks.&lt;/p&gt;
&lt;p&gt;This is by design. The block-builder has no concept of &amp;ldquo;live traces&amp;rdquo; or trace completion.
Trace assembly is handled at query time by the querier, which merges spans from multiple blocks.&lt;/p&gt;
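&lt;p&gt;A simplified sketch of query-time assembly, assuming each block is a mapping from trace ID to a list of spans. The &lt;code&gt;assemble_trace&lt;/code&gt; helper and the span fields are hypothetical.&lt;/p&gt;

```python
def assemble_trace(trace_id, blocks):
    """Collect the spans of one trace from every block that holds part of it."""
    spans = []
    for block in blocks:
        spans.extend(block.get(trace_id, []))
    # Order by start time so the merged trace reads chronologically.
    return sorted(spans, key=lambda s: s["start"])


# Two blocks from consecutive cycles, each holding part of trace "t1".
blocks = [
    {"t1": [{"span_id": "b", "start": 5}]},
    {"t1": [{"span_id": "a", "start": 1}], "t2": [{"span_id": "c", "start": 2}]},
]
merged = assemble_trace("t1", blocks)
```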
&lt;h2 id=&#34;block-creation&#34;&gt;Block creation&lt;/h2&gt;
&lt;p&gt;Each consumption cycle produces one or more blocks per tenant per partition.
Blocks are written in Apache Parquet format and contain the span data (&lt;code&gt;data.parquet&lt;/code&gt;),
block metadata (&lt;code&gt;meta.json&lt;/code&gt;) including time range, tenant, and a &lt;code&gt;replaces&lt;/code&gt; field for atomic block replacement,
as well as bloom filters and indexes for efficient querying.&lt;/p&gt;
&lt;h3 id=&#34;span-deduplication&#34;&gt;Span deduplication&lt;/h3&gt;
&lt;p&gt;During block creation, the block-builder deduplicates spans within each trace.
Because the block-builder rewinds to the last committed Kafka offset on each cycle, replicated or re-consumed records can produce duplicate spans.
The block-builder identifies duplicates using a combination of span ID and span kind, and removes them before writing the block.&lt;/p&gt;
&lt;p&gt;Use the &lt;code&gt;tempo_block_builder_spans_deduped_total&lt;/code&gt; metric (labeled by &lt;code&gt;tenant&lt;/code&gt;) to track how many duplicate spans are removed.&lt;/p&gt;
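&lt;p&gt;Deduplication keyed on span ID and span kind can be sketched as follows. The &lt;code&gt;dedupe_spans&lt;/code&gt; function and the dictionary layout are assumptions for illustration.&lt;/p&gt;

```python
def dedupe_spans(spans):
    """Drop spans whose (span ID, kind) pair was already seen, keeping the first."""
    seen = set()
    unique = []
    for span in spans:
        key = (span["span_id"], span["kind"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(span)
    # The second value is the number of duplicates removed.
    return unique, len(spans) - len(unique)


spans = [
    {"span_id": "01", "kind": "server"},
    {"span_id": "01", "kind": "server"},  # re-consumed duplicate
    {"span_id": "01", "kind": "client"},  # same ID, different kind: kept
]
unique, removed = dedupe_spans(spans)
```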
&lt;h3 id=&#34;deterministic-block-ids&#34;&gt;Deterministic block IDs&lt;/h3&gt;
&lt;p&gt;The block-builder generates block IDs deterministically based on the partition, tenant, and Kafka offset range.
This is critical for crash recovery: if a block-builder crashes mid-flush and restarts,
it produces the same block IDs on retry, safely overwriting any partial data from the previous attempt.&lt;/p&gt;
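&lt;p&gt;One way to derive a stable ID from those inputs is a name-based UUID. This is a sketch under assumptions: the namespace value, the name format, and the use of UUID version 5 are illustrative and may not match how Tempo derives block IDs.&lt;/p&gt;

```python
import uuid

# Hypothetical namespace; any fixed UUID works for the illustration.
NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000001")


def block_id(partition: int, tenant: str, start_offset: int, end_offset: int) -> uuid.UUID:
    """Derive a stable ID from partition, tenant, and Kafka offset range."""
    name = f"{partition}/{tenant}/{start_offset}-{end_offset}"
    return uuid.uuid5(NAMESPACE, name)


before_crash = block_id(3, "tenant-a", 1000, 2000)
after_restart = block_id(3, "tenant-a", 1000, 2000)  # same inputs, same ID
next_range = block_id(3, "tenant-a", 2000, 3000)
```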
&lt;h2 id=&#34;flush-and-recovery&#34;&gt;Flush and recovery&lt;/h2&gt;
&lt;p&gt;The flush process supports safe replay at every stage.&lt;/p&gt;
&lt;h3 id=&#34;flush-order&#34;&gt;Flush order&lt;/h3&gt;
&lt;p&gt;The block-builder flushes blocks to object storage in a specific order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Bloom filters and indexes&lt;/li&gt;
&lt;li&gt;&lt;code&gt;data.parquet&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;nocompact.flg&lt;/code&gt; (a flag file that prevents compaction during the flush)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;meta.json&lt;/code&gt; (the block becomes &amp;ldquo;live&amp;rdquo; at this point)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A block isn&amp;rsquo;t visible to the read path until its &lt;code&gt;meta.json&lt;/code&gt; is written.
Before that point, any crash is fully recoverable—the block-builder rewinds and overwrites.&lt;/p&gt;
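&lt;p&gt;The visibility rule can be sketched with a local directory standing in for object storage. The &lt;code&gt;bloom-0&lt;/code&gt; and &lt;code&gt;index&lt;/code&gt; file names, and the &lt;code&gt;flush&lt;/code&gt; and &lt;code&gt;is_visible&lt;/code&gt; helpers, are hypothetical.&lt;/p&gt;

```python
import tempfile
from pathlib import Path

# meta.json is written last so that a partial block is never visible.
FLUSH_ORDER = ["bloom-0", "index", "data.parquet", "nocompact.flg", "meta.json"]


def flush(block_dir: Path, files: dict, stop_before: str = "") -> None:
    """Write block files in order; stop_before simulates a crash."""
    block_dir.mkdir(parents=True, exist_ok=True)
    for name in FLUSH_ORDER:
        if name == stop_before:
            return
        (block_dir / name).write_bytes(files[name])


def is_visible(block_dir: Path) -> bool:
    # Readers treat the block as live only once meta.json exists.
    return (block_dir / "meta.json").exists()


files = {name: b"x" for name in FLUSH_ORDER}
root = Path(tempfile.mkdtemp())
flush(root / "block-a", files, stop_before="meta.json")  # crash before meta.json
visible_after_crash = is_visible(root / "block-a")
flush(root / "block-a", files)  # the retry overwrites the partial files
visible_after_retry = is_visible(root / "block-a")
```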
&lt;h3 id=&#34;recovering-from-partial-flushes&#34;&gt;Recovering from partial flushes&lt;/h3&gt;
&lt;p&gt;If the block-builder crashes before writing &lt;code&gt;meta.json&lt;/code&gt;, the block is invisible to readers.
On restart, it rewinds to the last committed offset, regenerates the same block ID, and overwrites the partial data.&lt;/p&gt;
&lt;p&gt;If the crash happens after &lt;code&gt;meta.json&lt;/code&gt; is written, the block is already live.
On restart, the block-builder detects the existing block and advances to the next ID in sequence,
using the &lt;code&gt;replaces&lt;/code&gt; field to atomically replace the old block.&lt;/p&gt;
&lt;h3 id=&#34;the-replaces-field&#34;&gt;The &lt;code&gt;replaces&lt;/code&gt; field&lt;/h3&gt;
&lt;p&gt;When a block-builder retries a flush and finds that a previous block already exists (its &lt;code&gt;meta.json&lt;/code&gt; was written),
the new block includes a &lt;code&gt;replaces&lt;/code&gt; field in its &lt;code&gt;meta.json&lt;/code&gt; listing the old block ID.
This tells the read path to ignore the old block once the new one is visible,
preventing duplicate data from appearing in query results.&lt;/p&gt;
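&lt;p&gt;A sketch of how a reader might apply this rule, assuming each &lt;code&gt;meta.json&lt;/code&gt; is loaded as a dictionary. The &lt;code&gt;block_id&lt;/code&gt; key and the &lt;code&gt;visible_blocks&lt;/code&gt; helper are hypothetical names.&lt;/p&gt;

```python
def visible_blocks(metas):
    """Return the block IDs a reader should query, skipping replaced blocks."""
    replaced = set()
    for meta in metas:
        replaced.update(meta.get("replaces", []))
    return [m["block_id"] for m in metas if m["block_id"] not in replaced]


metas = [
    {"block_id": "old"},
    {"block_id": "new", "replaces": ["old"]},  # written by the retried flush
    {"block_id": "other"},
]
to_query = visible_blocks(metas)
```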
&lt;h3 id=&#34;the-nocompactflg-file&#34;&gt;The &lt;code&gt;nocompact.flg&lt;/code&gt; file&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;nocompact.flg&lt;/code&gt; file is written before &lt;code&gt;meta.json&lt;/code&gt; to prevent backend workers from touching the block while it&amp;rsquo;s still being built.
After the block-builder finishes its cycle, it removes this flag.
This prevents a race condition where a backend worker might try to compact a block that&amp;rsquo;s about to be replaced.&lt;/p&gt;
&lt;h2 id=&#34;scaling&#34;&gt;Scaling&lt;/h2&gt;
&lt;p&gt;Each block-builder instance consumes from one or more Kafka partitions.
The maximum number of block-builder instances equals the number of Kafka partitions.&lt;/p&gt;
&lt;p&gt;Block-builders use static partition assignment, so Kafka doesn&amp;rsquo;t rebalance partitions between consumers in the block-builder consumer group.
There are two ways to assign partitions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;partitions_per_instance&lt;/code&gt;: Each instance computes which partitions it owns based on its ordinal ID.
This is the default and works well with &lt;code&gt;StatefulSets&lt;/code&gt; where the block-builder mirrors its replica count from the live-store, scaling in lockstep.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;assigned_partitions&lt;/code&gt;: An explicit mapping of instance IDs to partition lists. This gives full manual control over which instance handles which partitions.&lt;/li&gt;
&lt;/ul&gt;
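&lt;p&gt;Ordinal-based assignment can be sketched as follows, assuming each instance owns a contiguous range of partitions and that the ordinal is the numeric suffix of a StatefulSet pod name. Both helpers are hypothetical, and the exact formula Tempo uses may differ.&lt;/p&gt;

```python
def ordinal_from_hostname(hostname: str) -> int:
    """Extract the trailing ordinal from a name such as block-builder-3."""
    return int(hostname.rsplit("-", 1)[1])


def owned_partitions(ordinal: int, partitions_per_instance: int) -> list[int]:
    # Each instance owns a contiguous range (an assumption of this sketch).
    start = ordinal * partitions_per_instance
    return list(range(start, start + partitions_per_instance))


ordinal = ordinal_from_hostname("block-builder-3")
mine = owned_partitions(ordinal, partitions_per_instance=2)
```

&lt;p&gt;With this scheme, no two instances compute the same partition, which is why replicas and partitions need to grow together.&lt;/p&gt;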
&lt;p&gt;Size the scratch disk to hold at least one full consumption cycle&amp;rsquo;s worth of data across all assigned partitions and tenants.&lt;/p&gt;
&lt;h2 id=&#34;key-metrics&#34;&gt;Key metrics&lt;/h2&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Metric&lt;/th&gt;
              &lt;th&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_block_builder_flushed_blocks&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Number of blocks flushed to object storage&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_block_builder_spans_deduped_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Duplicate spans removed during block creation, by tenant&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_block_builder_fetch_errors_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Kafka fetch errors encountered&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_ingest_group_partition_lag{group=&amp;quot;block-builder&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Consumer lag per partition&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#block-builder&#34;&gt;block-builder section of the Tempo configuration&lt;/a&gt; for the full list of block-builder options.&lt;/p&gt;
]]></content><description>&lt;h1 id="block-builder">Block-builder&lt;/h1>
&lt;p>The block-builder is the write-path component responsible for building Parquet blocks and flushing them to object storage.
It consumes trace data from Kafka and organizes it into blocks suitable for long-term retention and efficient querying.&lt;/p></description></item><item><title>Live-store</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/live-store/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/live-store/</guid><content><![CDATA[&lt;h1 id=&#34;live-store&#34;&gt;Live-store&lt;/h1&gt;
&lt;p&gt;The live-store is the read-path component responsible for serving recent trace data.
It holds traces in memory, making them available for queries during the window between ingestion and block availability in object storage.&lt;/p&gt;
&lt;p&gt;How the live-store receives data depends on the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/&#34;&gt;deployment mode&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Microservices mode: The live-store consumes trace data from Kafka independently of block-builders.&lt;/li&gt;
&lt;li&gt;Monolithic mode: The live-store receives trace data directly from the distributor in-process. No Kafka consumption is involved.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;why-live-stores-exist&#34;&gt;Why live-stores exist&lt;/h2&gt;
&lt;p&gt;In microservices mode, there&amp;rsquo;s a gap between when trace data is written to Kafka and when the block-builder flushes it to object storage. During this window, the only way to query that data is through the live-store.&lt;/p&gt;
&lt;p&gt;In monolithic mode, the live-store serves the same role of providing immediate query access to recently ingested data, but it receives data directly from the distributor rather than from Kafka.&lt;/p&gt;
&lt;p&gt;In both modes, the live-store holds traces in memory organized by trace ID, responds to queries from queriers for recent data, and periodically flushes traces to a local WAL in Parquet format for TraceQL search and metrics queries.&lt;/p&gt;
&lt;h2 id=&#34;trace-lifecycle&#34;&gt;Trace lifecycle&lt;/h2&gt;
&lt;p&gt;When the live-store receives spans, it assembles them into traces in memory.
Each trace goes through three stages.&lt;/p&gt;
&lt;p&gt;First, the trace is active—it&amp;rsquo;s receiving spans, remains in memory, and is queryable.
Then, when no new spans have arrived within the configured &lt;code&gt;max_trace_idle&lt;/code&gt; period,
the trace becomes idle and is flushed to the local WAL.
Once flushed, the trace data is written in Parquet format and becomes available for TraceQL search.
Eventually, the WAL data is cut into complete blocks.&lt;/p&gt;
&lt;h3 id=&#34;trace-idle-period&#34;&gt;Trace idle period&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;max_trace_idle&lt;/code&gt; setting controls how long the live-store waits after the last span arrives before considering a trace idle and flushing it to the WAL.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;live_store:
  max_trace_idle: 10s&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Increasing this value keeps traces in memory longer, which improves the chances that all spans for a trace are co-located when flushed.
This is beneficial for 
    &lt;a href=&#34;/docs/tempo/v3.0.x/troubleshooting/querying/long-running-traces/&#34;&gt;long-running traces&lt;/a&gt;. However, it also increases memory usage.&lt;/p&gt;
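&lt;p&gt;The idle check can be sketched as a comparison between the time since the last span and &lt;code&gt;max_trace_idle&lt;/code&gt;. The &lt;code&gt;cut_idle_traces&lt;/code&gt; helper and its inputs are hypothetical.&lt;/p&gt;

```python
def cut_idle_traces(last_span_at, now, max_trace_idle):
    """Split trace IDs into (idle, active) by the time since their last span."""
    idle, active = [], []
    for trace_id, seen in last_span_at.items():
        if now - seen >= max_trace_idle:
            idle.append(trace_id)    # would be flushed to the local WAL
        else:
            active.append(trace_id)  # stays in memory and keeps accepting spans
    return idle, active


# Times in seconds; "t1" last saw a span 12 seconds ago, "t2" 4 seconds ago.
idle, active = cut_idle_traces({"t1": 0.0, "t2": 8.0}, now=12.0, max_trace_idle=10.0)
```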
&lt;h2 id=&#34;partition-ownership&#34;&gt;Partition ownership&lt;/h2&gt;
&lt;p&gt;Live-stores own the partition lifecycle within Tempo.
Each live-store instance consumes from one or more Tempo partitions, and each partition is owned by exactly one live-store per availability zone.&lt;/p&gt;
&lt;h3 id=&#34;partition-ring&#34;&gt;Partition ring&lt;/h3&gt;
&lt;p&gt;The live-store maintains a partition ring that tracks which Tempo partitions exist,
which live-stores own each partition, and the state of each partition (pending, active, or inactive).&lt;/p&gt;
&lt;p&gt;This ring is propagated via memberlist gossip.
Refer to the &lt;a href=&#34;../partition-ring/&#34;&gt;partition ring&lt;/a&gt; documentation for details on partition states and transitions.&lt;/p&gt;
&lt;h3 id=&#34;startup&#34;&gt;Startup&lt;/h3&gt;
&lt;p&gt;When a live-store starts, it checks the partition ring for its assigned partition.
If the partition exists, the live-store joins as an owner.
If it doesn&amp;rsquo;t exist, the live-store creates it in pending state and waits for enough owners to register before automatically promoting it to active.
In microservices mode, the live-store then replays from its last committed Kafka offset to rebuild in-memory state.&lt;/p&gt;
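&lt;p&gt;The join-or-create behavior can be sketched with a dictionary standing in for the partition ring. The &lt;code&gt;join_partition&lt;/code&gt; helper and the &lt;code&gt;required_owners&lt;/code&gt; parameter are assumptions. How many owners Tempo waits for is not specified here.&lt;/p&gt;

```python
def join_partition(ring, partition_id, owner, required_owners=1):
    """Register an owner; create the partition as pending if it is missing."""
    part = ring.setdefault(partition_id, {"state": "pending", "owners": set()})
    part["owners"].add(owner)
    # Promote to active once enough owners have registered.
    if part["state"] == "pending" and len(part["owners"]) >= required_owners:
        part["state"] = "active"
    return part["state"]


ring = {}
state_one = join_partition(ring, 0, "live-store-zone-a-0", required_owners=2)
state_two = join_partition(ring, 0, "live-store-zone-b-0", required_owners=2)
```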
&lt;h3 id=&#34;shutdown-and-scaling-down&#34;&gt;Shutdown and scaling down&lt;/h3&gt;
&lt;p&gt;Scaling down live-stores requires marking the partition as inactive while the live-store is still running.
This transitions the partition to read-only mode.
After enough time passes for the data to be flushed to object storage,
you can safely remove the partition and live-store.&lt;/p&gt;
&lt;p&gt;Abruptly removing a live-store without marking its partition inactive makes that partition&amp;rsquo;s recent data temporarily unavailable until another live-store picks it up. In a zone-aware setup, the other zone&amp;rsquo;s live-store continues serving the partition in the meantime.&lt;/p&gt;
&lt;h2 id=&#34;zone-aware-high-availability&#34;&gt;Zone-aware high availability&lt;/h2&gt;
&lt;p&gt;For production deployments, live-stores are typically deployed across multiple availability zones.
Each Tempo partition is owned by one live-store per zone.&lt;/p&gt;
&lt;p&gt;If a live-store in one zone becomes unavailable, the live-store in the other zone continues serving queries for the same partitions.
Queriers only need a response from one live-store per partition (read quorum of 1),
so queries succeed as long as at least one zone is healthy.
This provides high availability without requiring data deduplication on the read path.&lt;/p&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/operations/manage-advanced-systems/zone-aware-live-stores/&#34;&gt;zone-aware live-stores&lt;/a&gt; documentation for configuration details.&lt;/p&gt;
&lt;h2 id=&#34;local-wal&#34;&gt;Local WAL&lt;/h2&gt;
&lt;p&gt;When traces are flushed from memory, they&amp;rsquo;re written to a local WAL in Parquet format. This serves two purposes.&lt;/p&gt;
&lt;p&gt;First, it provides search availability: after data is in the WAL,
trace data is available for TraceQL search queries, not just trace ID lookups.
Second, it aids recovery on restart. In microservices mode, if the live-store restarts,
it replays from Kafka, and the WAL provides a way to serve queries during replay.&lt;/p&gt;
&lt;p&gt;The WAL is eventually cut into complete blocks that are also stored locally.
These blocks are queryable until the data ages out of the live-store&amp;rsquo;s retention window.&lt;/p&gt;
&lt;h2 id=&#34;key-metrics&#34;&gt;Key metrics&lt;/h2&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Metric&lt;/th&gt;
              &lt;th&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_live_store_traces_created_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Total number of traces created in the live-store&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_live_store_lagged_requests_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Requests where the live-store could not guarantee complete results due to Kafka lag, labeled by &lt;code&gt;route&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_warnings_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Warnings during trace processing, labeled by &lt;code&gt;reason&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_ingest_group_partition_lag{group=&amp;quot;live-store&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Consumer lag per partition&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#live-store&#34;&gt;live-store configuration&lt;/a&gt; for the full list of options.&lt;/p&gt;
]]></content><description>&lt;h1 id="live-store">Live-store&lt;/h1>
&lt;p>The live-store is the read-path component responsible for serving recent trace data.
It holds traces in memory, making them available for queries during the window between ingestion and block availability in object storage.&lt;/p></description></item><item><title>Query frontend</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/query-frontend/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/query-frontend/</guid><content><![CDATA[&lt;h1 id=&#34;query-frontend&#34;&gt;Query frontend&lt;/h1&gt;
&lt;p&gt;The query frontend is the entry point for all queries in Tempo.
It receives TraceQL queries and trace ID lookups, shards them into parallel jobs,
and distributes those jobs to queriers for execution.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;The query frontend handles the full lifecycle of a query.
It shards a single query into many smaller jobs, each covering a subset of the data (for example, a subset of blocks or a time range).
Jobs are placed in a per-tenant queue and dispatched to queriers in batches, reducing round-trip overhead.
As queriers return partial results, the frontend merges and deduplicates them into a final response.
If a querier fails to process a job, the frontend retries it on another querier.
For search queries with a result limit, the frontend cancels remaining jobs as soon as enough results are collected.&lt;/p&gt;
&lt;h2 id=&#34;job-sharding&#34;&gt;Job sharding&lt;/h2&gt;
&lt;p&gt;The frontend uses &lt;code&gt;target_bytes_per_job&lt;/code&gt; to estimate how large each job should be.
Smaller values create more, smaller jobs (higher parallelism but more overhead).
Larger values create fewer, bigger jobs (less overhead but lower parallelism).&lt;/p&gt;
&lt;p&gt;The total number of jobs for a query depends on the time range,
the volume of data in that range, and the &lt;code&gt;target_bytes_per_job&lt;/code&gt; setting.&lt;/p&gt;
&lt;h3 id=&#34;concurrent-jobs&#34;&gt;Concurrent jobs&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;concurrent_jobs&lt;/code&gt; setting controls how many jobs for a single query are dispatched to the queue at once.
If a query produces 5,000 jobs and &lt;code&gt;concurrent_jobs&lt;/code&gt; is 1,000, only 1,000 jobs are active at a time.
As jobs complete, new ones are dispatched.&lt;/p&gt;
&lt;p&gt;This limits the blast radius of a single large query.
In shared clusters, keeping this value lower ensures fair scheduling across tenants.&lt;/p&gt;
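&lt;p&gt;As a rough sketch of how these two settings interact, the fragment below annotates each value with a back-of-the-envelope estimate. It assumes, for illustration only, that the job count is approximately the data volume divided by &lt;code&gt;target_bytes_per_job&lt;/code&gt; and that the query range covers 1 TiB of block data. The actual sharding logic may differ, and the values are placeholders rather than recommendations.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;query_frontend:
  search:
    # Assumed target size per job: 100 MiB
    target_bytes_per_job: 104857600
    # Assumed estimate for a query covering 1 TiB of block data:
    #   1099511627776 / 104857600 = 10485.76, so roughly 10,486 jobs
    # Illustrative cap: at most 1,000 of those jobs are dispatched at a time
    concurrent_jobs: 1000&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;p&gt;Under these assumptions, lowering &lt;code&gt;target_bytes_per_job&lt;/code&gt; raises the total job count, while &lt;code&gt;concurrent_jobs&lt;/code&gt; bounds how many of those jobs a single query can have in the queue at any moment.&lt;/p&gt;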
&lt;h2 id=&#34;querier-connections&#34;&gt;Querier connections&lt;/h2&gt;
&lt;p&gt;Queriers connect to the query frontend over streaming gRPC.
Each connection processes one batch at a time synchronously.
The number of concurrent connections from a querier determines how many batches it can process in parallel.&lt;/p&gt;
&lt;p&gt;This is controlled by either &lt;code&gt;querier.max_concurrent_queries&lt;/code&gt; (maximum total concurrent jobs per querier) or &lt;code&gt;querier.frontend_worker.parallelism&lt;/code&gt; (number of connections per query frontend).&lt;/p&gt;
&lt;h2 id=&#34;key-configuration&#34;&gt;Key configuration&lt;/h2&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;query_frontend:
  max_outstanding_per_tenant: 2000  # Max jobs in queue per tenant
  max_batch_size: 7                 # Jobs per batch sent to querier
  max_retries: 2                    # Retry count for failed jobs
  search:
    concurrent_jobs: 2000           # Max concurrent jobs per query
    target_bytes_per_job: 104857600 # 100 MiB per job&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Refer to 
    &lt;a href=&#34;/docs/tempo/v3.0.x/operations/backend_search/&#34;&gt;Tune search performance&lt;/a&gt; for detailed tuning guidance.&lt;/p&gt;
&lt;h2 id=&#34;key-metrics&#34;&gt;Key metrics&lt;/h2&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Metric&lt;/th&gt;
              &lt;th&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_query_frontend_queries_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Total queries received&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_query_frontend_queue_length&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Current queue depth per tenant&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#query-frontend&#34;&gt;query-frontend configuration&lt;/a&gt; for the full list of options.&lt;/p&gt;
]]></content><description>&lt;h1 id="query-frontend">Query frontend&lt;/h1>
&lt;p>The query frontend is the entry point for all queries in Tempo.
It receives TraceQL queries and trace ID lookups, shards them into parallel jobs,
and distributes those jobs to queriers for execution.&lt;/p></description></item><item><title>Querier</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/querier/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/querier/</guid><content><![CDATA[&lt;h1 id=&#34;querier&#34;&gt;Querier&lt;/h1&gt;
&lt;p&gt;The querier is the worker component that executes query jobs dispatched by the query frontend. It fetches trace data from both live-stores (for recent data) and object storage (for historical data), then returns results to the query frontend for merging.&lt;/p&gt;
&lt;h2 id=&#34;why-the-querier-exists&#34;&gt;Why the querier exists&lt;/h2&gt;
&lt;p&gt;Trace data in Tempo lives in two places: recent data in live-stores and historical data in object storage blocks.
The querier bridges both sources, fetching and merging data so that the query frontend doesn&amp;rsquo;t need to know where data lives.
This separation lets you scale query execution independently from query planning and result merging.&lt;/p&gt;
&lt;h2 id=&#34;query-execution&#34;&gt;Query execution&lt;/h2&gt;
&lt;p&gt;When a querier receives a batch of jobs from the query frontend, it processes each job by determining where the relevant data lives.&lt;/p&gt;
&lt;p&gt;For recent data, the querier contacts live-stores that own the partitions covering the query&amp;rsquo;s time range. Live-stores respond with any matching spans held in memory or their local WAL.&lt;/p&gt;
&lt;p&gt;For historical data, the querier reads block metadata from the blocklist, identifies which blocks may contain matching data, and fetches the relevant portions from object storage. Bloom filters efficiently skip blocks that don&amp;rsquo;t contain the requested trace IDs.&lt;/p&gt;
&lt;p&gt;Results from both sources are combined and returned to the query frontend.&lt;/p&gt;
&lt;h2 id=&#34;live-store-queries&#34;&gt;Live-store queries&lt;/h2&gt;
&lt;p&gt;The querier uses the partition ring to determine which live-stores to contact for a given query. For zone-aware deployments, the querier only needs a response from one live-store per partition (read quorum of 1).&lt;/p&gt;
&lt;p&gt;If a live-store is unavailable, the querier falls back to the live-store in the other availability zone. If no live-store is available for a partition, recent data for that partition is temporarily unavailable, but historical queries still work.&lt;/p&gt;
&lt;h2 id=&#34;backend-queries&#34;&gt;Backend queries&lt;/h2&gt;
&lt;p&gt;For historical data, the querier consults the blocklist (maintained by backend workers) to find blocks in the relevant time range. It uses bloom filters to quickly eliminate blocks that don&amp;rsquo;t contain the target trace ID, fetches matching block data from object storage (using caching where configured), reads the Parquet data, and applies any TraceQL filters.&lt;/p&gt;
&lt;h3 id=&#34;caching&#34;&gt;Caching&lt;/h3&gt;
&lt;p&gt;Queriers benefit significantly from caching. Tempo supports multiple cache tiers.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Frontend search cache: caches query results at the frontend level. It has a low hit rate and is mainly useful for repeated queries.&lt;/li&gt;
&lt;li&gt;Parquet page cache: caches individual Parquet pages. It has a high hit rate and is useful across many different queries.&lt;/li&gt;
&lt;li&gt;Bloom filter cache: caches bloom filters used for trace ID lookups, also with a high hit rate.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Lower-level caches (bloom, Parquet page) have higher hit rates and should be sized more generously than higher-level caches.&lt;/p&gt;
&lt;h2 id=&#34;concurrency&#34;&gt;Concurrency&lt;/h2&gt;
&lt;p&gt;The number of jobs a querier processes concurrently is controlled by &lt;code&gt;max_concurrent_queries&lt;/code&gt; (the maximum number of jobs processed at once) or &lt;code&gt;frontend_worker.parallelism&lt;/code&gt; (the number of connections to each query frontend, which determines concurrent batch processing).&lt;/p&gt;
&lt;p&gt;Increasing concurrency makes queriers process more jobs in parallel but increases memory usage. If queriers run out of memory, reduce concurrency and scale horizontally instead.&lt;/p&gt;
&lt;h3 id=&#34;memory-sizing&#34;&gt;Memory sizing&lt;/h3&gt;
&lt;p&gt;Querier memory usage roughly scales with: &lt;code&gt;job_size * querier_concurrency &#43; buffer&lt;/code&gt;. You can tune this by adjusting &lt;code&gt;target_bytes_per_job&lt;/code&gt; (at the frontend), &lt;code&gt;max_concurrent_queries&lt;/code&gt; (at the querier), and &lt;code&gt;frontend_worker.parallelism&lt;/code&gt; (which affects how many batches the querier processes at once).&lt;/p&gt;
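&lt;p&gt;The fragment below is a minimal sizing sketch for the formula above, not a recommended configuration. The key paths are the ones named on this page. Every value is an illustrative assumption, and the buffer term is left out because it depends on the workload.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;query_frontend:
  search:
    target_bytes_per_job: 104857600  # job_size: 100 MiB (assumed)
querier:
  max_concurrent_queries: 20         # querier_concurrency (assumed)
  frontend_worker:
    parallelism: 2                   # connections per query frontend (assumed)
# Estimated working set before the buffer term:
#   104857600 * 20 = 2097152000 bytes, or about 2 GB per querier&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;p&gt;With these placeholder numbers, halving either the job size or the querier concurrency halves the estimate, which is why reducing concurrency and adding queriers is the usual response to memory pressure.&lt;/p&gt;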
&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#querier&#34;&gt;querier configuration&lt;/a&gt; for the full list of options.&lt;/p&gt;
]]></content><description>&lt;h1 id="querier">Querier&lt;/h1>
&lt;p>The querier is the worker component that executes query jobs dispatched by the query frontend. It fetches trace data from both live-stores (for recent data) and object storage (for historical data), then returns results to the query frontend for merging.&lt;/p></description></item><item><title>Compaction</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/compaction/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/compaction/</guid><content><![CDATA[&lt;h1 id=&#34;compaction&#34;&gt;Compaction&lt;/h1&gt;
&lt;p&gt;The backend scheduler and worker replace the legacy compactor.
Together, they handle compaction, retention, and blocklist maintenance for data in object storage.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;The backend scheduler creates jobs and assigns them to workers.
Workers connect to the scheduler via gRPC, request jobs, execute them, and report results back.
This split makes compaction horizontally scalable: you can add workers to increase throughput without changing the scheduler.&lt;/p&gt;
&lt;h3 id=&#34;job-types&#34;&gt;Job types&lt;/h3&gt;
&lt;p&gt;The scheduler produces three types of jobs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compaction: merges small blocks into larger ones to reduce the number of blocks queriers need to scan and improve query performance.&lt;/li&gt;
&lt;li&gt;Retention: deletes blocks older than the configured retention period.&lt;/li&gt;
&lt;li&gt;Redaction: rewrites blocks to remove matching trace data from object storage.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;job-lifecycle&#34;&gt;Job lifecycle&lt;/h3&gt;
&lt;p&gt;The scheduler uses providers to generate jobs.
Each provider runs independently and feeds jobs into a shared channel.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The compaction provider periodically measures tenants and produces compaction jobs based on the blocklist.&lt;/li&gt;
&lt;li&gt;The retention provider produces retention jobs on a schedule.&lt;/li&gt;
&lt;li&gt;The redaction provider drains a persistent queue of pending redaction requests. The scheduler&amp;rsquo;s rescan logic waits for any compaction jobs that were active at submission time to complete before the rewritten blocks become eligible for querying.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When a worker calls &lt;code&gt;Next&lt;/code&gt;, the scheduler assigns an available job and persists the assignment to a local work cache.
The worker executes the job and calls &lt;code&gt;UpdateJob&lt;/code&gt; with a success or failure status.
On success, the scheduler applies the results to the in-memory blocklist (for example, marking compacted blocks as removed).
The work cache is periodically flushed to object storage for crash recovery.&lt;/p&gt;
&lt;h2 id=&#34;backend-scheduler&#34;&gt;Backend scheduler&lt;/h2&gt;
&lt;p&gt;The scheduler is a singleton: only one instance should run at a time.
It maintains the work cache, which tracks all active and completed jobs,
and polls object storage to keep the blocklist up to date.&lt;/p&gt;
&lt;p&gt;The scheduler exposes an HTTP status endpoint that lists all known jobs with their status, tenant, worker assignment,
and timestamps.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;backend_scheduler:
  maintenance_interval: 1m
  backend_flush_interval: 1m&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h2 id=&#34;backend-worker&#34;&gt;Backend worker&lt;/h2&gt;
&lt;p&gt;Workers are stateless job executors. Each worker connects to the scheduler, requests a job, processes it, and reports back.
Multiple workers can run in parallel.&lt;/p&gt;
&lt;p&gt;Workers also maintain the blocklist for all tenants.
When the worker ring is enabled, it shards tenants across workers: the ring determines which worker is responsible for polling each tenant&amp;rsquo;s blocklist, which distributes the load of scanning object storage across all workers.
By default the ring is disabled, so each worker polls all tenants without sharding.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;backend_worker:
  backend_scheduler_addr: backend-scheduler:9095
  finish_on_shutdown_timeout: 30s&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;graceful-shutdown&#34;&gt;Graceful shutdown&lt;/h3&gt;
&lt;p&gt;When a worker receives a shutdown signal,
it has a configurable timeout (&lt;code&gt;finish_on_shutdown_timeout&lt;/code&gt;) to complete the current job before being terminated.
This prevents partially completed jobs from being left in an inconsistent state.&lt;/p&gt;
&lt;h2 id=&#34;scheduler-status-api&#34;&gt;Scheduler status API&lt;/h2&gt;
&lt;p&gt;The backend scheduler exposes an HTTP endpoint that shows the current state of all jobs:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;GET /status/backendscheduler&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The response is a plain-text table with two sections:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Active Jobs: all jobs in the scheduler work cache, sorted by creation time. This section includes jobs in any state, so use the &lt;code&gt;status&lt;/code&gt; column to interpret each row. A non-empty &lt;code&gt;worker&lt;/code&gt; field indicates the job is currently assigned to a worker.&lt;/li&gt;
&lt;li&gt;Pending Jobs: redaction jobs in the pending queue. Some may already be eligible to run; others may still be waiting for the rescan or compaction preconditions to clear.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This endpoint is useful for diagnosing stalled jobs, verifying that workers are consuming work, and checking whether a redaction request has been processed.&lt;/p&gt;
&lt;h2 id=&#34;key-metrics&#34;&gt;Key metrics&lt;/h2&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Metric&lt;/th&gt;
              &lt;th&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempodb_compaction_blocks_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Blocks compacted&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempodb_compaction_bytes_written_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Bytes written during compaction&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempodb_retention_marked_for_deletion_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Blocks marked for deletion by retention&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempodb_retention_deleted_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Blocks deleted by retention&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_backend_scheduler_jobs_created_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Jobs created&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_backend_scheduler_jobs_completed_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Jobs completed successfully&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_backend_scheduler_jobs_failed_total&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Jobs that failed&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_backend_scheduler_jobs_active&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Jobs currently assigned to a worker&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempo_backend_scheduler_job_duration_seconds&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Job execution duration histogram&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempodb_blocklist_length&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Number of live blocks per tenant; high values indicate compaction is falling behind&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;tempodb_compaction_outstanding_blocks&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;Outstanding blocks awaiting compaction per tenant; the primary autoscaling signal&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;Most scheduler job metrics carry &lt;code&gt;tenant&lt;/code&gt; and &lt;code&gt;job_type&lt;/code&gt; labels; &lt;code&gt;tempo_backend_scheduler_job_duration_seconds&lt;/code&gt; carries only &lt;code&gt;job_type&lt;/code&gt;.
The &lt;code&gt;job_type&lt;/code&gt; label uses protobuf enum string values: &lt;code&gt;JOB_TYPE_COMPACTION&lt;/code&gt;, &lt;code&gt;JOB_TYPE_RETENTION&lt;/code&gt;, and &lt;code&gt;JOB_TYPE_REDACTION&lt;/code&gt;.
The duration histogram measures elapsed time from job creation to completion, not execution time alone.&lt;/p&gt;
&lt;h2 id=&#34;monitoring&#34;&gt;Monitoring&lt;/h2&gt;
&lt;p&gt;The Tempo mixin ships a pre-built Grafana dashboard, &lt;strong&gt;Tempo - Backend Work&lt;/strong&gt;, that covers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Blocklist length and poll duration&lt;/li&gt;
&lt;li&gt;Active, completed, failed, and retried job counts&lt;/li&gt;
&lt;li&gt;Compaction throughput (objects written, bytes written, blocks compacted)&lt;/li&gt;
&lt;li&gt;Outstanding blocks per tenant&lt;/li&gt;
&lt;li&gt;CPU and memory for both the backend scheduler and backend workers&lt;/li&gt;
&lt;li&gt;A backend-worker autoscaling panel&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To use the dashboard, install the Tempo mixin from &lt;code&gt;operations/tempo-mixin/&lt;/code&gt; and import the generated dashboard into your Grafana instance.&lt;/p&gt;
&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v3.0.x/operations/compaction/&#34;&gt;Compaction operations&lt;/a&gt; for timing requirements and block selection details.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v3.0.x/configuration/#backend-scheduler&#34;&gt;Configuration reference&lt;/a&gt; for the full list of options.&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="compaction">Compaction&lt;/h1>
&lt;p>The backend scheduler and worker replace the legacy compactor.
Together, they handle compaction, retention, and blocklist maintenance for data in object storage.&lt;/p>
&lt;h2 id="how-it-works">How it works&lt;/h2>
&lt;p>The backend scheduler creates jobs and assigns them to workers.
Workers connect to the scheduler via gRPC, request jobs, execute them, and report results back.
This split makes compaction horizontally scalable—you can add workers to increase throughput without changing the scheduler.&lt;/p></description></item><item><title>Metrics-generator</title><link>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/metrics-generator/</link><pubDate>Thu, 28 May 2026 17:50:33 +0100</pubDate><guid>https://grafana.com/docs/tempo/v3.0.x/reference-tempo-architecture/components/metrics-generator/</guid><content><![CDATA[&lt;h1 id=&#34;metrics-generator&#34;&gt;Metrics-generator&lt;/h1&gt;
&lt;p&gt;The metrics-generator is an optional component that derives metrics from trace data,
which are then remote-written to a metrics backend, for example, Prometheus or Grafana Mimir.&lt;/p&gt;
&lt;p&gt;How the metrics-generator receives data depends on the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/reference-tempo-architecture/deployment-modes/&#34;&gt;deployment mode&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Microservices mode: The metrics-generator consumes trace data from Kafka as an independent consumer group.&lt;/li&gt;
&lt;li&gt;Monolithic mode: The metrics-generator receives trace data directly from the distributor in-process. No Kafka consumption is involved.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;why-it-matters&#34;&gt;Why it matters&lt;/h2&gt;
&lt;p&gt;Traces contain rich information about service interactions, latencies, and error rates.
The metrics-generator extracts this information and produces time-series metrics,
enabling alerting and Grafana dashboards without requiring separate instrumentation.&lt;/p&gt;
&lt;p&gt;It supports two types of metric generation.
Span metrics produce request rate, error rate, and duration (RED) metrics from individual spans.
These can be broken down by service, operation, status code, and custom dimensions extracted from span attributes.
Service graphs build a graph of service-to-service communication by matching client and server spans,
producing metrics for request rates, error rates, and latencies between service pairs.&lt;/p&gt;
&lt;h2 id=&#34;kafka-consumption&#34;&gt;Kafka consumption&lt;/h2&gt;
&lt;p&gt;In microservices mode, the metrics-generator consumes trace data directly from Kafka, like live-stores and block-builders.
It runs as an independent consumer group, tracking its own offsets separately.&lt;/p&gt;
&lt;h3 id=&#34;monitoring-consumption&#34;&gt;Monitoring consumption&lt;/h3&gt;
&lt;p&gt;Use the following metrics to verify the generator is consuming data:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;tempo_ingest_group_partition_lag{group=&amp;#34;metrics-generator&amp;#34;}
tempo_ingest_group_partition_lag_seconds{group=&amp;#34;metrics-generator&amp;#34;}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;High or growing lag indicates the generator is falling behind.
The &lt;code&gt;tempo_ingest_storage_reader&lt;/code&gt; family of metrics exposes detailed information about fetch operations and errors from the Kafka client library.&lt;/p&gt;
&lt;h2 id=&#34;active-series-limiting&#34;&gt;Active series limiting&lt;/h2&gt;
&lt;p&gt;The generator protects itself and downstream metrics storage with configurable limits.&lt;/p&gt;
&lt;h3 id=&#34;series-based-limiting&#34;&gt;Series-based limiting&lt;/h3&gt;
&lt;p&gt;You can cap the total number of active time series the generator produces:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
  defaults:
    metrics_generator:
      max_active_series: 0  # 0 = unlimited&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This limit applies to each metrics-generator instance, so the effective maximum across the cluster is &lt;code&gt;&amp;lt;instances&amp;gt; * max_active_series&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When the limit is reached, the generator produces overflow series with the label &lt;code&gt;metric_overflow=&amp;quot;true&amp;quot;&lt;/code&gt; instead of dropping data entirely.
As existing series become stale and free up capacity, new series split out of the overflow bucket and get their own series.&lt;/p&gt;
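&lt;p&gt;As a worked example of the per-instance arithmetic, the sketch below sets an illustrative limit of 50000. This is an assumed value, not a recommendation. With three metrics-generator instances, the cluster could hold up to 3 * 50000 = 150000 active series.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
  defaults:
    metrics_generator:
      # Illustrative value: each instance may track up to 50000 active series.
      # With 3 instances, the cluster-wide ceiling is 3 * 50000 = 150000.
      max_active_series: 50000&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;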
&lt;h3 id=&#34;entity-based-limiting&#34;&gt;Entity-based limiting&lt;/h3&gt;
&lt;p&gt;Entity-based limiting is an alternative to series-based limiting.
An entity is a unique label combination (excluding external labels) across multiple metrics.
Entity limiting ensures the generator always produces the full set of metrics for a given entity, rather than cutting off an arbitrary subset of its series when the limit is reached.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;metrics_generator:
  limiter_type: entity&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;per-label-cardinality-limiting&#34;&gt;Per-label cardinality limiting&lt;/h3&gt;
&lt;p&gt;You can cap the number of distinct values a single label can have.
When the limit is exceeded, new values of that label are replaced with &lt;code&gt;__cardinality_overflow__&lt;/code&gt;, while other labels remain unaffected.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
  defaults:
    metrics_generator:
      max_cardinality_per_label: 0  # 0 = disabled&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
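&lt;p&gt;To illustrate the replacement behavior, assume &lt;code&gt;max_cardinality_per_label&lt;/code&gt; is set to 2 and a label has already seen two distinct values. The metric name, label names, and label values below are made up for this example and are not taken from Tempo.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;# Values seen before the limit was reached keep their own series:
requests_total{service=&amp;#34;checkout&amp;#34;, http_route=&amp;#34;/cart&amp;#34;}
requests_total{service=&amp;#34;checkout&amp;#34;, http_route=&amp;#34;/pay&amp;#34;}
# A third distinct value is replaced, and the service label is left as-is:
requests_total{service=&amp;#34;checkout&amp;#34;, http_route=&amp;#34;__cardinality_overflow__&amp;#34;}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;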
&lt;h2 id=&#34;remote-write&#34;&gt;Remote write&lt;/h2&gt;
&lt;p&gt;The generator writes metrics to one or more remote-write endpoints. Monitor write health with:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;prometheus_remote_storage_samples_failed_total
prometheus_remote_storage_samples_dropped_total&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
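&lt;p&gt;Because both metrics are counters, one possible starting point is to watch their rate rather than their raw value. The 5-minute window in this sketch is an assumption that you should adjust to your scrape interval.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;# Samples per second that failed to send to the remote-write endpoint.
sum(rate(prometheus_remote_storage_samples_failed_total[5m]))
# Samples per second that were dropped before being sent.
sum(rate(prometheus_remote_storage_samples_dropped_total[5m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;A sustained non-zero failure rate suggests the remote-write endpoint is rejecting or not receiving samples.&lt;/p&gt;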
&lt;h2 id=&#34;related-resources&#34;&gt;Related resources&lt;/h2&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v3.0.x/metrics-from-traces/metrics-generator/&#34;&gt;metrics-generator documentation&lt;/a&gt; for configuration and usage details.&lt;/p&gt;
]]></content><description>&lt;h1 id="metrics-generator">Metrics-generator&lt;/h1>
&lt;p>The metrics-generator is an optional component that derives metrics from trace data
and remote-writes them to a metrics backend such as Prometheus or Grafana Mimir.&lt;/p></description></item></channel></rss>