<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Troubleshoot Tempo on Grafana Labs</title><link>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/</link><description>Recent content in Troubleshoot Tempo on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/tempo/v2.10.x/troubleshooting/index.xml" rel="self" type="application/rss+xml"/><item><title>Issues with sending traces</title><link>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/send-traces/</link><pubDate>Thu, 09 Apr 2026 14:59:14 +0000</pubDate><guid>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/send-traces/</guid><content><![CDATA[&lt;h1 id=&#34;issues-with-sending-traces&#34;&gt;Issues with sending traces&lt;/h1&gt;
&lt;p&gt;Learn about issues related to sending traces.&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/send-traces/max-trace-limit-reached/&#34;&gt;Distributor refusing spans&lt;/a&gt;&lt;br&gt;Troubleshoot distributor refusing spans&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/send-traces/alloy/&#34;&gt;Troubleshoot Grafana Alloy&lt;/a&gt;&lt;br&gt;Gain visibility on how many traces are being pushed to Grafana Alloy and if they are making it to the Tempo backend.&lt;/li&gt;&lt;/ul&gt;
]]></content><description>&lt;h1 id="issues-with-sending-traces">Issues with sending traces&lt;/h1>
&lt;p>Learn about issues related to sending traces.&lt;/p>
&lt;ul>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/send-traces/max-trace-limit-reached/">Distributor refusing spans&lt;/a>&lt;br>Troubleshoot distributor refusing spans&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/send-traces/alloy/">Troubleshoot Grafana Alloy&lt;/a>&lt;br>Gain visibility on how many traces are being pushed to Grafana Alloy and if they are making it to the Tempo backend.&lt;/li>&lt;/ul></description></item><item><title>Issues with querying</title><link>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/querying/</link><pubDate>Thu, 09 Apr 2026 14:59:14 +0000</pubDate><guid>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/querying/</guid><content><![CDATA[&lt;h1 id=&#34;issues-with-querying&#34;&gt;Issues with querying&lt;/h1&gt;
&lt;p&gt;Learn about issues related to querying.&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/unable-to-see-trace/&#34;&gt;Unable to find traces&lt;/a&gt;&lt;br&gt;Troubleshoot missing traces&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/too-many-jobs-in-queue/&#34;&gt;Too many jobs in the queue&lt;/a&gt;&lt;br&gt;Troubleshoot too many jobs in the queue&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/bad-blocks/&#34;&gt;Bad blocks&lt;/a&gt;&lt;br&gt;Troubleshoot queries failing with an error message indicating bad blocks.&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/search-tag/&#34;&gt;Tag search&lt;/a&gt;&lt;br&gt;Troubleshoot No options found in Grafana tag search&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/response-too-large/&#34;&gt;Response larger than the max&lt;/a&gt;&lt;br&gt;Troubleshoot response larger than the max error message&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/long-running-traces/&#34;&gt;Long-running traces&lt;/a&gt;&lt;br&gt;Troubleshoot search results when using long-running traces&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.10.x/troubleshooting/querying/too-many-requests-error/&#34;&gt;Too many requests error&lt;/a&gt;&lt;br&gt;Troubleshoot Too many requests error for a Tempo query&lt;/li&gt;&lt;/ul&gt;
]]></content><description>&lt;h1 id="issues-with-querying">Issues with querying&lt;/h1>
&lt;p>Learn about issues related to querying.&lt;/p>
&lt;ul>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/unable-to-see-trace/">Unable to find traces&lt;/a>&lt;br>Troubleshoot missing traces&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/too-many-jobs-in-queue/">Too many jobs in the queue&lt;/a>&lt;br>Troubleshoot too many jobs in the queue&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/bad-blocks/">Bad blocks&lt;/a>&lt;br>Troubleshoot queries failing with an error message indicating bad blocks.&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/search-tag/">Tag search&lt;/a>&lt;br>Troubleshoot No options found in Grafana tag search&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/response-too-large/">Response larger than the max&lt;/a>&lt;br>Troubleshoot response larger than the max error message&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/long-running-traces/">Long-running traces&lt;/a>&lt;br>Troubleshoot search results when using long-running traces&lt;/li>&lt;li>
&lt;a href="/docs/tempo/v2.10.x/troubleshooting/querying/too-many-requests-error/">Too many requests error&lt;/a>&lt;br>Troubleshoot Too many requests error for a Tempo query&lt;/li>&lt;/ul></description></item><item><title>Troubleshoot metrics-generator</title><link>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/metrics-generator/</link><pubDate>Thu, 09 Apr 2026 14:59:14 +0000</pubDate><guid>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/metrics-generator/</guid><content><![CDATA[&lt;h1 id=&#34;troubleshoot-metrics-generator&#34;&gt;Troubleshoot metrics-generator&lt;/h1&gt;
&lt;p&gt;If you&amp;rsquo;re concerned with data quality issues in the metrics-generator, consider:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reviewing your telemetry pipeline to determine the number of dropped spans. You are only looking for major issues here.&lt;/li&gt;
&lt;li&gt;Reviewing the 
    &lt;a href=&#34;/docs/tempo/v2.10.x/metrics-generator/service_graphs/&#34;&gt;service graph documentation&lt;/a&gt; to understand how they are built.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If everything seems acceptable from these two perspectives, consider the following topics to help resolve general issues with all metrics and with service graph metrics specifically.&lt;/p&gt;
&lt;h2 id=&#34;all-metrics&#34;&gt;All metrics&lt;/h2&gt;
&lt;p&gt;This section covers issues that can affect all metrics produced by the metrics-generator.&lt;/p&gt;
&lt;h3 id=&#34;dropped-spans-in-the-distributor&#34;&gt;Dropped spans in the distributor&lt;/h3&gt;
&lt;p&gt;The distributor has a queue of outgoing spans to the metrics-generators.
If the queue is full, then the distributor drops spans before they reach the generator. Use the following metric to determine if that&amp;rsquo;s happening:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_distributor_queue_pushes_failures_total{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;failed-pushes-to-the-generator&#34;&gt;Failed pushes to the generator&lt;/h3&gt;
&lt;p&gt;For any number of reasons, the distributor can fail a push to the generators. Use the following metric to
determine if that&amp;rsquo;s happening:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_distributor_metrics_generator_pushes_failures_total{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;discarded-spans-in-the-generator&#34;&gt;Discarded spans in the generator&lt;/h3&gt;
&lt;p&gt;The metrics-generator rejects spans that fall outside a configurable ingestion slack time, as well as spans excluded by
user-configurable filters. You can see the number of spans rejected by reason using this metric:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_metrics_generator_spans_discarded_total{}[1m])) by (reason)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;If many spans are discarded in the metrics-generator because of your filters, adjust the filters. If spans are discarded
because of the ingestion slack time, consider adjusting this setting:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;metrics_generator:
  metrics_ingestion_time_range_slack: 30s&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;If spans regularly exceed this value, review your tracing pipeline for excessive buffering.
Increasing this value lets the generator consume more spans, but it reduces the accuracy of metrics because spans farther
away from &amp;ldquo;now&amp;rdquo; are included.&lt;/p&gt;
&lt;p&gt;Spans can also be discarded if their attributes contain invalid UTF-8 when those attributes are converted to metric labels.&lt;/p&gt;
&lt;h3 id=&#34;max-active-series&#34;&gt;Max active series&lt;/h3&gt;
&lt;p&gt;The generator protects itself and your remote-write target by having a maximum number of series the generator produces.
Use the following query to determine if series are being dropped due to this limit:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_metrics_generator_registry_series_limited_total{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Use the following setting to update the limit:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
  defaults:
    metrics_generator:
      max_active_series: 0&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This value applies to each metrics-generator instance. The maximum number of series written to the remote-write target is &lt;code&gt;&amp;lt;# of metrics generators&amp;gt; * &amp;lt;metrics_generator.max_active_series&amp;gt;&lt;/code&gt;.&lt;/p&gt;
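&lt;p&gt;To compare current usage against that total, one option is to sum the &lt;code&gt;tempo_metrics_generator_registry_active_series&lt;/code&gt; gauge across all generators. This is a minimal sketch that assumes a single-tenant cluster. The numbers in the comment are hypothetical and only illustrate the multiplication.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;# Hypothetical example: 3 generators * max_active_series of 100000 = 300000 series at most.
# Total active series currently held across all generators:
sum(tempo_metrics_generator_registry_active_series{})&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;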
&lt;h3 id=&#34;overflow-series&#34;&gt;Overflow series&lt;/h3&gt;
&lt;p&gt;When the active series limit is reached, the metrics-generator produces overflow series instead of dropping new data. These series have the label &lt;code&gt;metric_overflow=&amp;quot;true&amp;quot;&lt;/code&gt; and capture all data that would otherwise be lost.&lt;/p&gt;
&lt;p&gt;To identify overflow series in your metrics:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;{metric_overflow=&amp;#34;true&amp;#34;}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;As existing series become stale and are removed, new series are split out from the overflow bucket until the limit is reached again. To reduce overflow, either increase &lt;code&gt;max_active_series&lt;/code&gt; or reduce cardinality by adjusting dimensions or filters.&lt;/p&gt;
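&lt;p&gt;To see which metrics are overflowing, it may help to group the overflow series by metric name. The following sketch relies only on the &lt;code&gt;metric_overflow&lt;/code&gt; label described in this section and on &lt;code&gt;__name__&lt;/code&gt;, the standard Prometheus label that holds the metric name.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;# Number of overflow series, grouped by metric name
count by (__name__) ({metric_overflow=&amp;#34;true&amp;#34;})&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;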
&lt;h3 id=&#34;entity-based-limiting&#34;&gt;Entity-based limiting&lt;/h3&gt;
&lt;p&gt;You can configure entity-based limiting as an alternative to series-based limiting.
An entity is a unique label combination (excluding external labels) across multiple metrics.
Entity-based limiting ensures the generator always produces the full set of metrics for a given entity, rather than limiting randomly once the series limit is triggered.&lt;/p&gt;
&lt;p&gt;To enable entity-based limiting, set &lt;code&gt;limiter_type&lt;/code&gt; to &lt;code&gt;entity&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;metrics_generator:
  limiter_type: entity&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Use the following metric to determine if entities are being limited:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_metrics_generator_registry_entities_limited_total{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Configure the entity limit with:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
  defaults:
    metrics_generator:
      max_active_entities: 0&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;estimate-active-series-demand&#34;&gt;Estimate active series demand&lt;/h3&gt;
&lt;p&gt;When the active series limit is reached, the &lt;code&gt;tempo_metrics_generator_registry_active_series&lt;/code&gt; metric no longer reflects the true demand. Use the &lt;code&gt;tempo_metrics_generator_registry_active_series_demand_estimate&lt;/code&gt; metric to estimate what the active series count would be without the limit:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;tempo_metrics_generator_registry_active_series_demand_estimate{}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This metric uses HyperLogLog estimation and has approximately 3% deviation from the actual cardinality. Use this to determine if you need to increase limits or reduce cardinality.&lt;/p&gt;
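&lt;p&gt;One way to read the estimate is as a ratio against the current active series count. The following sketch uses only the two metrics named in this section and aggregates across all generators. Because the estimate deviates by roughly 3%, treat ratios close to 1 as inconclusive.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;# Estimated demand divided by series actually held.
# A value well above 1 suggests that demand exceeds the configured limit.
sum(tempo_metrics_generator_registry_active_series_demand_estimate{})
  /
sum(tempo_metrics_generator_registry_active_series{})&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;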
&lt;h3 id=&#34;remote-write-failures&#34;&gt;Remote write failures&lt;/h3&gt;
&lt;p&gt;For any number of reasons, the generator may fail a write to the remote write target. Use the following metrics to
determine if that&amp;rsquo;s happening:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(prometheus_remote_storage_samples_failed_total{}[1m]))
sum(rate(prometheus_remote_storage_samples_dropped_total{}[1m]))
sum(rate(prometheus_remote_storage_exemplars_failed_total{}[1m]))
sum(rate(prometheus_remote_storage_exemplars_dropped_total{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h2 id=&#34;service-graph-metrics&#34;&gt;Service graph metrics&lt;/h2&gt;
&lt;p&gt;Service graphs have additional configuration which can impact the quality of the output metrics.&lt;/p&gt;
&lt;h3 id=&#34;expired-edges&#34;&gt;Expired edges&lt;/h3&gt;
&lt;p&gt;Use the following metrics to determine how many edges fail to find a match.
The expired edges metric counts only edges that expired without the matching information needed to generate a service graph edge.&lt;/p&gt;
&lt;p&gt;Rate of edges that have expired without a match:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_metrics_generator_processor_service_graphs_expired_edges{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Rate of all edges:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_metrics_generator_processor_service_graphs_edges{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
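&lt;p&gt;Dividing the first rate by the second gives the fraction of edges that expire without a match, which can be easier to reason about than either rate alone. A minimal sketch that uses only the two metrics above:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;# Fraction of edges that expired without a match (0 to 1)
sum(rate(tempo_metrics_generator_processor_service_graphs_expired_edges{}[1m]))
  /
sum(rate(tempo_metrics_generator_processor_service_graphs_edges{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;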
&lt;p&gt;If a large number of edges expire without a match, consider adjusting the &lt;code&gt;wait&lt;/code&gt; setting. This setting
controls how long the metrics-generator waits to find a match before it gives up.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;metrics_generator:
  processor:
    service_graphs:
      wait: 10s&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;service-graph-max-items&#34;&gt;Service graph max items&lt;/h3&gt;
&lt;p&gt;The service graph processor has a maximum number of edges it tracks at once to limit the total amount of memory the processor uses.
To determine if edges are being dropped due to this limit, check:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum(rate(tempo_metrics_generator_processor_service_graphs_dropped_spans{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Use &lt;code&gt;max_items&lt;/code&gt; to adjust the maximum number of edges tracked:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;metrics_generator:
  processor:
    service_graphs:
      max_items: 10000&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
]]></content><description>&lt;h1 id="troubleshoot-metrics-generator">Troubleshoot metrics-generator&lt;/h1>
&lt;p>If you&amp;rsquo;re concerned with data quality issues in the metrics-generator, consider:&lt;/p>
&lt;ul>
&lt;li>Reviewing your telemetry pipeline to determine the number of dropped spans. You are only looking for major issues here.&lt;/li>
&lt;li>Reviewing the
&lt;a href="/docs/tempo/v2.10.x/metrics-generator/service_graphs/">service graph documentation&lt;/a> to understand how they are built.&lt;/li>
&lt;/ul>
&lt;p>If everything seems acceptable from these two perspectives, consider the following topics to help resolve general issues with all metrics and span metrics specifically.&lt;/p></description></item><item><title>Troubleshoot out-of-memory errors</title><link>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/out-of-memory-errors/</link><pubDate>Thu, 09 Apr 2026 14:59:14 +0000</pubDate><guid>https://grafana.com/docs/tempo/v2.10.x/troubleshooting/out-of-memory-errors/</guid><content><![CDATA[&lt;h1 id=&#34;troubleshoot-out-of-memory-errors&#34;&gt;Troubleshoot out-of-memory errors&lt;/h1&gt;
&lt;p&gt;Learn about out-of-memory (OOM) issues and how to troubleshoot them.&lt;/p&gt;
&lt;h2 id=&#34;set-the-max-attribute-size-to-help-control-out-of-memory-errors&#34;&gt;Set the max attribute size to help control out of memory errors&lt;/h2&gt;
&lt;p&gt;Tempo queriers can run out of memory when fetching traces that have spans with very large attributes.
This issue has been observed when trying to fetch a single trace using the &lt;a href=&#34;/docs/tempo/latest/api_docs/#query&#34;&gt;&lt;code&gt;tracebyID&lt;/code&gt; endpoint&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To avoid these out-of-memory crashes, use &lt;code&gt;max_attribute_bytes&lt;/code&gt; to limit the maximum allowable size of any individual attribute.
Any key or value that exceeds the configured limit is truncated before storing.&lt;/p&gt;
&lt;p&gt;Use the &lt;code&gt;tempo_distributor_attributes_truncated_total&lt;/code&gt; metric to track how many attributes are truncated.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;    # Optional
    # Configures the max size an attribute can be. Any key or value that exceeds this limit will be truncated before storing
    # Setting this parameter to &amp;#39;0&amp;#39; would disable this check against attribute size
    [max_attribute_bytes: &amp;lt;int&amp;gt; | default = &amp;#39;2048&amp;#39;]&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
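&lt;p&gt;One way to watch the truncation metric is as a per-second rate, in the same style as the other queries in this guide. This sketch assumes no label filtering. A sustained non-zero rate suggests that some instrumentation regularly emits attributes larger than the configured limit.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;# Attributes truncated per second across all distributors
sum(rate(tempo_distributor_attributes_truncated_total{}[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;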
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v2.10.x/configuration/#set-max-attribute-size-to-help-control-out-of-memory-errors&#34;&gt;configuration for distributors&lt;/a&gt; documentation for more information.&lt;/p&gt;
&lt;h2 id=&#34;max-trace-size&#34;&gt;Max trace size&lt;/h2&gt;
&lt;p&gt;Long-running traces (minutes or hours) and large traces (100K to 1M spans) can spike the memory usage of each component that encounters them.
Tempo treats traces as single units, and keeps all data for a trace together to enable features like structural queries and analysis.&lt;/p&gt;
&lt;p&gt;Reading a large trace can spike the memory usage of the read components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;query-frontend&lt;/li&gt;
&lt;li&gt;querier&lt;/li&gt;
&lt;li&gt;ingester&lt;/li&gt;
&lt;li&gt;metrics-generator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Writing a large trace can spike the memory usage of the write components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ingester&lt;/li&gt;
&lt;li&gt;compactor&lt;/li&gt;
&lt;li&gt;metrics-generator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Start with a smaller trace size limit of 15MB, and increase it as needed.
With an average span size of 300 bytes, this allows for 50K spans per trace.&lt;/p&gt;
&lt;p&gt;Verify that you&amp;rsquo;ve configured a limit in &lt;code&gt;max_bytes_per_trace&lt;/code&gt;.
The largest recommended limit is 60MB.&lt;/p&gt;
&lt;p&gt;Configure the limit in the per-tenant overrides:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
    &amp;#39;tenant123&amp;#39;:
        max_bytes_per_trace: 1.5e&amp;#43;07&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Refer to the 
    &lt;a href=&#34;/docs/tempo/v2.10.x/configuration/#standard-overrides&#34;&gt;Standard overrides&lt;/a&gt; documentation for more information.&lt;/p&gt;
&lt;p&gt;If you have long-running batch job traces, consider using span links to break them apart.&lt;/p&gt;
&lt;h2 id=&#34;large-attributes&#34;&gt;Large attributes&lt;/h2&gt;
&lt;p&gt;Very large attributes, 10KB or larger, can spike the memory usage of each component that encounters them.
Tempo&amp;rsquo;s Parquet format uses dictionary-encoded columns, which work well for repeated values.
However, for very large, high-cardinality attributes, dictionary encoding can require a large amount of memory.&lt;/p&gt;
&lt;p&gt;A common source of large attributes is auto-instrumentation in these areas:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;HTTP
&lt;ul&gt;
&lt;li&gt;Request or response bodies&lt;/li&gt;
&lt;li&gt;Large headers
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/attributes-registry/http/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;http.request.header.&amp;lt;key&amp;gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Large URLs
&lt;ul&gt;
&lt;li&gt;http.url&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/attributes-registry/url/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;url.full&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Databases
&lt;ul&gt;
&lt;li&gt;Full query statements
&lt;ul&gt;
&lt;li&gt;db.statement&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/attributes-registry/db/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;db.query.text&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Queues
&lt;ul&gt;
&lt;li&gt;Message bodies&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Reading these attributes can spike the memory usage of the read components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;query-frontend&lt;/li&gt;
&lt;li&gt;querier&lt;/li&gt;
&lt;li&gt;ingester&lt;/li&gt;
&lt;li&gt;metrics-generator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Writing these attributes can spike the memory usage of the write components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ingester&lt;/li&gt;
&lt;li&gt;compactor&lt;/li&gt;
&lt;li&gt;metrics-generator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can &lt;a href=&#34;https://github.com/grafana/tempo/pull/4335&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;automatically limit attribute sizes&lt;/a&gt; using 
    &lt;a href=&#34;/docs/tempo/v2.10.x/configuration/#set-max-attribute-size-to-help-control-out-of-memory-errors&#34;&gt;&lt;code&gt;max_attribute_bytes&lt;/code&gt;&lt;/a&gt;.
You can also use these options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Manually update application instrumentation to remove or limit these attributes&lt;/li&gt;
&lt;li&gt;Drop the attributes in the tracing pipeline using the OpenTelemetry Collector &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/attributesprocessor&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;attributes processor&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
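&lt;p&gt;As a minimal sketch of the pipeline option, the following OpenTelemetry Collector fragment deletes a few of the attributes listed earlier before spans reach Tempo.
The processor name &lt;code&gt;attributes/drop-large&lt;/code&gt;, the choice of keys, and the &lt;code&gt;otlp&lt;/code&gt; receiver and exporter are assumptions for illustration. Adapt them to your own pipeline, and keep any attributes you still need for queries.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;processors:
  # drop-large is an illustrative name, not a required one.
  attributes/drop-large:
    actions:
      # Delete attributes that often carry large payloads.
      - key: db.statement
        action: delete
      - key: db.query.text
        action: delete
      - key: http.url
        action: delete

service:
  pipelines:
    traces:
      # Assumes an otlp receiver and exporter are defined elsewhere in this file.
      receivers: [otlp]
      processors: [attributes/drop-large]
      exporters: [otlp]&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Deleting an attribute removes it from every span that passes through the pipeline, so this approach fits attributes that are rarely used in searches.&lt;/p&gt;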
]]></content><description>&lt;h1 id="troubleshoot-out-of-memory-errors">Troubleshoot out-of-memory errors&lt;/h1>
&lt;p>Learn about out-of-memory (OOM) issues and how to troubleshoot them.&lt;/p>
&lt;h2 id="set-the-max-attribute-size-to-help-control-out-of-memory-errors">Set the max attribute size to help control out of memory errors&lt;/h2>
&lt;p>Tempo queriers can run out of memory when fetching traces that have spans with very large attributes.
This issue has been observed when trying to fetch a single trace using the &lt;a href="/docs/tempo/latest/api_docs/#query">&lt;code>tracebyID&lt;/code> endpoint&lt;/a>.&lt;/p></description></item></channel></rss>