<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Set up your collector on Grafana Labs</title><link>https://grafana.com/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/</link><description>Recent content in Set up your collector on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/index.xml" rel="self" type="application/rss+xml"/><item><title>Grafana Alloy</title><link>https://grafana.com/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/grafana-alloy/</link><pubDate>Sat, 04 Apr 2026 09:35:34 +0000</pubDate><guid>https://grafana.com/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/grafana-alloy/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-alloy&#34;&gt;Grafana Alloy&lt;/h1&gt;
&lt;p&gt;Grafana Alloy offers native pipelines for OTel, Prometheus, Pyroscope, Loki, and many other tools across metrics, logs, traces, and profiles.
In addition, you can use Alloy pipelines to do other tasks, such as configure alert rules in Loki and Mimir. Alloy is fully compatible with the OTel Collector, Prometheus Agent, and Promtail.&lt;/p&gt;
&lt;p&gt;You can use Alloy to collect and forward traces to Tempo.
Using Alloy provides a hassle-free option, especially when dealing with multiple applications or microservices, allowing you to centralize the tracing process without changing your application&amp;rsquo;s codebase.&lt;/p&gt;
&lt;p&gt;You can use Alloy as an alternative to either of these solutions or combine it into a hybrid system of multiple collectors and agents.
You can deploy Alloy anywhere within your IT infrastructure and pair it with your Grafana LGTM stack, a telemetry backend from Grafana Cloud, or any other compatible backend from any other vendor.
Alloy is flexible, and you can easily configure it to fit your needs for on-premise, cloud-only, or a mix of both.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;&lt;img src=&#34;/media/docs/tempo/intro/tempo-auto-log.svg&#34; alt=&#34;Automatic logging overview&#34;&gt;&lt;/p&gt;
&lt;p&gt;Alloy is commonly used as a tracing pipeline, offloading traces from the
application and forwarding them to a storage backend.&lt;/p&gt;
&lt;p&gt;Grafana Alloy configuration files are written in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/get-started/configuration-syntax/&#34;&gt;Alloy configuration syntax&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For more information, refer to the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/introduction/&#34;&gt;Introduction to Grafana Alloy&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;architecture&#34;&gt;Architecture&lt;/h2&gt;
&lt;p&gt;Grafana Alloy can run a set of tracing pipelines to collect data from your applications and write it to Tempo.
Pipelines are built using OpenTelemetry, and consist of &lt;code&gt;receivers&lt;/code&gt;, &lt;code&gt;processors&lt;/code&gt;, and &lt;code&gt;exporters&lt;/code&gt;.
The architecture mirrors that of the OTel Collector&amp;rsquo;s &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector/blob/846b971758c92b833a9efaf742ec5b3e2fbd0c89/docs/design.md&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;design&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Refer to the &lt;a href=&#34;/docs/alloy/latest/reference/components/&#34;&gt;components reference&lt;/a&gt; for all available configuration options.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;&lt;img src=&#34;https://raw.githubusercontent.com/open-telemetry/opentelemetry-collector/846b971758c92b833a9efaf742ec5b3e2fbd0c89/docs/images/design-pipelines.png&#34; alt=&#34;Tracing pipeline architecture&#34;&gt;&lt;/p&gt;
&lt;p&gt;This lets you configure multiple distinct tracing
pipelines, each of which collects separate spans and sends them to different
backends.&lt;/p&gt;
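&lt;p&gt;As an illustrative sketch, a minimal pipeline can wire an OTLP receiver directly to an OTLP exporter; the endpoint is a placeholder for your Tempo instance:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;// Receive spans over OTLP on the default gRPC and HTTP ports.
otelcol.receiver.otlp &amp;#34;default&amp;#34; {
  grpc {}
  http {}

  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

// Forward received spans to Tempo. The endpoint is an example value.
otelcol.exporter.otlp &amp;#34;tempo&amp;#34; {
  client {
    endpoint = &amp;#34;tempo.example.com:4317&amp;#34;
  }
}&lt;/code&gt;&lt;/pre&gt;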
&lt;h2 id=&#34;set-up-alloy-to-receive-traces&#34;&gt;Set up Alloy to receive traces&lt;/h2&gt;
&lt;!-- vale Grafana.Parentheses = NO --&gt;
&lt;p&gt;Grafana Alloy supports multiple ingestion receivers:
OTLP (OpenTelemetry), Jaeger, Zipkin, OpenCensus, and Kafka.&lt;/p&gt;
&lt;!-- vale Grafana.Parentheses = YES --&gt;
&lt;p&gt;Each tracing pipeline can be configured to receive traces in any of these formats.
Traces that arrive at a pipeline go through the receivers, processors, and exporters defined in that pipeline.&lt;/p&gt;
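&lt;p&gt;For example, a single pipeline can accept traces in several formats at once by pointing multiple receivers at the same downstream component. This sketch assumes a downstream &lt;code&gt;otelcol.processor.batch&lt;/code&gt; component named &lt;code&gt;default&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;// Accept Jaeger Thrift-over-HTTP traffic.
otelcol.receiver.jaeger &amp;#34;default&amp;#34; {
  protocols {
    thrift_http {}
  }

  output {
    traces = [otelcol.processor.batch.default.input]
  }
}

// Accept Zipkin traffic and feed it into the same pipeline.
otelcol.receiver.zipkin &amp;#34;default&amp;#34; {
  output {
    traces = [otelcol.processor.batch.default.input]
  }
}&lt;/code&gt;&lt;/pre&gt;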
&lt;p&gt;To use Alloy for tracing, you need to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/alloy/v2.9.x/set-up/&#34;&gt;Set up Grafana Alloy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/alloy/v2.9.x/configure/&#34;&gt;Configure Grafana Alloy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Set up any additional features&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Refer to 
    &lt;a href=&#34;/docs/alloy/v2.9.x/collect/&#34;&gt;Collect and forward data with Grafana Alloy&lt;/a&gt; for examples of collecting data.&lt;/p&gt;
&lt;h2 id=&#34;set-up-pipeline-processing&#34;&gt;Set up pipeline processing&lt;/h2&gt;
&lt;p&gt;Grafana Alloy processes tracing data as it flows through the pipeline to make the distributed tracing system more reliable, and to leverage the data for other purposes such as trace discovery, tail-based sampling, and generating metrics.&lt;/p&gt;
&lt;h3 id=&#34;batching&#34;&gt;Batching&lt;/h3&gt;
&lt;p&gt;Alloy supports batching of traces.
Batching helps better compress the data, reduces the number of outgoing connections, and is a recommended best practice.
To configure it, refer to the &lt;code&gt;otelcol.processor.batch&lt;/code&gt; block in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.processor.batch/&#34;&gt;components reference&lt;/a&gt;.&lt;/p&gt;
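&lt;p&gt;As a sketch, a batch processor can be tuned with a timeout and a batch size; the values below are illustrative rather than recommendations, and the downstream exporter name is assumed:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.processor.batch &amp;#34;default&amp;#34; {
  // Flush a batch after this interval, even if it is not full.
  timeout = &amp;#34;2s&amp;#34;

  // Target number of spans per batch.
  send_batch_size = 8192

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}&lt;/code&gt;&lt;/pre&gt;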
&lt;h3 id=&#34;attributes-manipulation&#34;&gt;Attributes manipulation&lt;/h3&gt;
&lt;p&gt;Grafana Alloy allows for general manipulation of attributes on spans that pass through it.
A common use may be to add an environment or cluster variable.
Several processors can manipulate attributes, for example the &lt;code&gt;otelcol.processor.attributes&lt;/code&gt; block in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.processor.attributes/&#34;&gt;component reference&lt;/a&gt; and the &lt;code&gt;otelcol.processor.transform&lt;/code&gt; block in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.processor.transform/&#34;&gt;component reference&lt;/a&gt;.&lt;/p&gt;
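&lt;p&gt;For instance, to add an environment attribute to every span, an attributes processor might look like the following sketch; the key and value are hypothetical, as is the downstream exporter name:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.processor.attributes &amp;#34;default&amp;#34; {
  // Insert a hypothetical &amp;#34;env&amp;#34; attribute on every span.
  action {
    key    = &amp;#34;env&amp;#34;
    value  = &amp;#34;production&amp;#34;
    action = &amp;#34;insert&amp;#34;
  }

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}&lt;/code&gt;&lt;/pre&gt;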
&lt;h3 id=&#34;attach-metadata-with-prometheus-service-discovery&#34;&gt;Attach metadata with Prometheus Service Discovery&lt;/h3&gt;
&lt;p&gt;Prometheus Service Discovery mechanisms enable you to attach the same metadata to your traces as your metrics.
For example, for Kubernetes users this means that you can dynamically attach metadata for namespace, Pod, and name of the container sending spans.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.receiver.otlp &amp;#34;default&amp;#34; {
  http {}
  grpc {}

  output {
    traces = [otelcol.processor.k8sattributes.default.input]
  }
}

otelcol.processor.k8sattributes &amp;#34;default&amp;#34; {
  extract {
    metadata = [
      &amp;#34;k8s.namespace.name&amp;#34;,
      &amp;#34;k8s.pod.name&amp;#34;,
      &amp;#34;k8s.container.name&amp;#34;
    ]
  }

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp &amp;#34;default&amp;#34; {
  client {
    endpoint = env(&amp;#34;OTLP_ENDPOINT&amp;#34;)
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Refer to the &lt;code&gt;otelcol.processor.k8sattributes&lt;/code&gt; block in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.processor.k8sattributes/&#34;&gt;components reference&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;trace-discovery-through-automatic-logging&#34;&gt;Trace discovery through automatic logging&lt;/h3&gt;
&lt;p&gt;Automatic logging writes well-formatted log lines to help with trace discovery.&lt;/p&gt;
&lt;p&gt;For a closer look into the feature, visit 
    &lt;a href=&#34;/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/grafana-alloy/automatic-logging/&#34;&gt;Automatic logging&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;tail-based-sampling&#34;&gt;Tail-based sampling&lt;/h3&gt;
&lt;p&gt;Alloy implements tail-based sampling for distributed tracing systems and multi-instance Alloy deployments.
With this feature, you can make sampling decisions based on data from a trace, rather than exclusively with probabilistic methods.&lt;/p&gt;
&lt;p&gt;For a detailed description, refer to 
    &lt;a href=&#34;/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/tail-sampling/&#34;&gt;Tail sampling&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;generate-metrics-from-spans&#34;&gt;Generate metrics from spans&lt;/h3&gt;
&lt;p&gt;Alloy can take advantage of the span data flowing through the pipeline to generate Prometheus metrics.&lt;/p&gt;
&lt;p&gt;Refer to 
    &lt;a href=&#34;/docs/tempo/v2.9.x/metrics-from-traces/span-metrics/&#34;&gt;Span metrics&lt;/a&gt; for a more detailed explanation of the feature.&lt;/p&gt;
&lt;h3 id=&#34;service-graph-metrics&#34;&gt;Service graph metrics&lt;/h3&gt;
&lt;p&gt;Service graph metrics represent the relationships between services within a distributed system.&lt;/p&gt;
&lt;p&gt;The service graphs processor builds a map of services by analyzing traces, with the objective of finding &lt;em&gt;edges&lt;/em&gt;.
Edges are spans with a parent-child relationship that represent a jump, such as a request, between two services.
The number of requests and their duration are recorded as metrics, which are used to represent the graph.&lt;/p&gt;
&lt;p&gt;To read more about this processor, refer to 
    &lt;a href=&#34;/docs/tempo/v2.9.x/metrics-from-traces/service_graphs/&#34;&gt;Service graphs&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;export-spans&#34;&gt;Export spans&lt;/h2&gt;
&lt;p&gt;Alloy can export traces to multiple different backends for every tracing pipeline.
Exporting is built using OpenTelemetry Collector&amp;rsquo;s &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector/blob/846b971758c92b833a9efaf742ec5b3e2fbd0c89/exporter/otlpexporter/README.md&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OTLP exporter&lt;/a&gt;.
Alloy supports exporting traces in OTLP format.&lt;/p&gt;
&lt;p&gt;Aside from the endpoint and authentication settings, the exporter provides a retry-on-failure mechanism
and a queue buffer to handle transient failures, such as networking issues.&lt;/p&gt;
&lt;p&gt;To see all available options,
refer to the &lt;code&gt;otelcol.exporter.otlp&lt;/code&gt; block in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.exporter.otlp/&#34;&gt;Alloy configuration reference&lt;/a&gt; and the &lt;code&gt;otelcol.exporter.otlphttp&lt;/code&gt; block in the 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.exporter.otlphttp/&#34;&gt;Alloy configuration reference&lt;/a&gt;.&lt;/p&gt;
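&lt;p&gt;A sketch combining these options might look like the following; the block names follow the components reference, and the queue and retry values are illustrative:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.exporter.otlp &amp;#34;default&amp;#34; {
  client {
    endpoint = env(&amp;#34;OTLP_ENDPOINT&amp;#34;)
  }

  // Buffer batches in memory while the backend is unreachable.
  sending_queue {
    queue_size = 5000
  }

  // Retry failed exports with exponential backoff.
  retry_on_failure {
    initial_interval = &amp;#34;5s&amp;#34;
    max_elapsed_time = &amp;#34;1m&amp;#34;
  }
}&lt;/code&gt;&lt;/pre&gt;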
]]></content><description>&lt;h1 id="grafana-alloy">Grafana Alloy&lt;/h1>
&lt;p>Grafana Alloy offers native pipelines for OTel, Prometheus, Pyroscope, Loki, and many other metrics, logs, traces, and profile tools.
In addition, you can use Alloy pipelines to do other tasks, such as configure alert rules in Loki and Mimir. Alloy is fully compatible with the OTel Collector, Prometheus Agent, and Promtail.&lt;/p></description></item><item><title>Sampling</title><link>https://grafana.com/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/tail-sampling/</link><pubDate>Sat, 04 Apr 2026 09:35:34 +0000</pubDate><guid>https://grafana.com/docs/tempo/v2.9.x/set-up-for-tracing/instrument-send/set-up-collector/tail-sampling/</guid><content><![CDATA[&lt;h1 id=&#34;sampling&#34;&gt;Sampling&lt;/h1&gt;
&lt;p&gt;Grafana Tempo is a cost-effective solution that ingests and stores traces that provide maximum observability across your application estate.
However, sometimes constraints mean that storing all of your traces is not desirable, for example because of runtime or egress traffic costs.
There are a number of ways to lower trace volume, including varying sampling strategies.&lt;/p&gt;
&lt;p&gt;Sampling is the process of determining which traces to store (in Tempo or Grafana Cloud Traces) and which to discard. Sampling comes in two different strategy types: head and tail sampling.&lt;/p&gt;
&lt;p&gt;Sampling functionality exists in both &lt;a href=&#34;/docs/alloy/&#34;&gt;Grafana Alloy&lt;/a&gt; and the OpenTelemetry Collector. Alloy can collect, process, and export telemetry signals, with configuration files written in 
    &lt;a href=&#34;/docs/alloy/v2.9.x/get-started/configuration-syntax/&#34;&gt;Alloy configuration syntax&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;head-and-tail-sampling&#34;&gt;Head and tail sampling&lt;/h2&gt;
&lt;p&gt;When sampling, you can use a head or tail sampling strategy.&lt;/p&gt;
&lt;p&gt;With a head sampling strategy, the decision to sample the trace is usually made as early as possible and doesn’t need to take into account the whole trace.
It’s a simple but effective sampling strategy.&lt;/p&gt;
&lt;p&gt;With a tail sampling strategy, the decision to sample a trace is made after considering all or most of the spans. For example, tail sampling is a good option to sample only traces that have errors or traces with long request duration.
Tail sampling is more complex to configure, implement, and maintain but is the recommended sampling strategy for large systems with a high telemetry volume.&lt;/p&gt;
&lt;p&gt;You can use sampling with Tempo using Grafana or Grafana Cloud.&lt;/p&gt;
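&lt;p&gt;As a sketch of the head strategy, a probabilistic sampler keeps a fixed percentage of traces without inspecting their contents; the percentage and the downstream exporter name here are illustrative:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.processor.probabilistic_sampler &amp;#34;default&amp;#34; {
  // Keep roughly 10% of traces, decided up front by trace ID.
  sampling_percentage = 10

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}&lt;/code&gt;&lt;/pre&gt;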
&lt;p&gt;&lt;img src=&#34;/media/docs/tempo/sampling/tempo-tail-based-sampling.svg&#34; alt=&#34;Tail sampling overview and components with Tempo, Alloy, and Grafana&#34;/&gt;&lt;/p&gt;
&lt;h3 id=&#34;resources&#34;&gt;Resources&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/concepts/sampling/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry Sampling documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Sampling in Grafana Cloud Traces with a collector: &lt;a href=&#34;/docs/opentelemetry/collector/sampling/head/&#34;&gt;Head sampling&lt;/a&gt; and &lt;a href=&#34;/docs/opentelemetry/collector/sampling/tail/&#34;&gt;Tail sampling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.9.x/configuration/grafana-alloy/tail-sampling/enable-tail-sampling/&#34;&gt;Enable tail sampling in Tempo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/tempo/v2.9.x/configuration/grafana-alloy/tail-sampling/policies-strategies/&#34;&gt;Sampling policies and strategies&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;sampling-and-telemetry-correlation&#34;&gt;Sampling and telemetry correlation&lt;/h2&gt;
&lt;p&gt;Sampling is a decision on whether or not to keep (and then store) a trace, or whether to discard it.
These decisions have implications when it comes to correlating trace data with other signals.&lt;/p&gt;
&lt;p&gt;For example, many services that are instrumented also produce logs, metrics, or profiles.
These signals can reference each other.
In the case of a trace, this reference can be via a trace ID embedded into a 
    &lt;a href=&#34;/docs/grafana/next/datasources/tempo/traces-in-grafana/link-trace-id/&#34;&gt;log line&lt;/a&gt;, an 
    &lt;a href=&#34;/docs/grafana/next/fundamentals/exemplars/&#34;&gt;exemplar&lt;/a&gt; embedded into a metric value, or a profile ID &lt;a href=&#34;/docs/grafana-cloud/monitor-applications/profiles/traces-to-profiles/&#34;&gt;embedded into a trace&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;When a trace is not sampled, another signal can still reference the dropped trace&amp;rsquo;s ID.
This can lead to a situation in Grafana where following a link to a trace ID from a log line, or an exemplar from a metric value, results in a query for that trace ID failing because the trace was not sampled.
Profiles may not show up without specifically querying for them, because a trace that would have included the profile&amp;rsquo;s flame graph hasn&amp;rsquo;t been stored.&lt;/p&gt;
&lt;p&gt;This isn&amp;rsquo;t usually a significant issue, because sampling policies are typically chosen to capture non-normative behavior, for example errors being thrown or long request latencies.
An observer is more likely to be investigating traces that show these issues than traces that show expected behavior.
Understanding how signals correlate with each other helps determine how to choose these policies.&lt;/p&gt;
&lt;h2 id=&#34;how-tail-sampling-works-in-the-opentelemetry-tail-sampling-processor&#34;&gt;How tail sampling works in the OpenTelemetry Tail Sampling Processor&lt;/h2&gt;
&lt;p&gt;In tail sampling, sampling decisions are made at the end of the workflow, allowing for a more accurate sampling decision.
Alloy uses the &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/tailsamplingprocessor/README.md&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry Tail Sampling Processor&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Alloy organizes spans by trace ID and evaluates each trace&amp;rsquo;s data to see if it meets one of the defined policy types (for example, &lt;code&gt;latency&lt;/code&gt; or &lt;code&gt;status_code&lt;/code&gt;).
For instance, a policy can check if a trace contains an error or the trace duration was longer than a specified threshold.&lt;/p&gt;
&lt;p&gt;A trace is sampled if it meets the conditions of at least one policy.&lt;/p&gt;
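&lt;p&gt;The following sketch shows two such policies, sampling traces that contain an error or exceed an illustrative latency threshold; the policy names and downstream exporter are assumptions:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.processor.tail_sampling &amp;#34;default&amp;#34; {
  // Sample any trace that contains a span with an error status.
  policy {
    name = &amp;#34;sample-errors&amp;#34;
    type = &amp;#34;status_code&amp;#34;

    status_code {
      status_codes = [&amp;#34;ERROR&amp;#34;]
    }
  }

  // Sample any trace longer than an example 5 second threshold.
  policy {
    name = &amp;#34;sample-slow&amp;#34;
    type = &amp;#34;latency&amp;#34;

    latency {
      threshold_ms = 5000
    }
  }

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}&lt;/code&gt;&lt;/pre&gt;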
&lt;h3 id=&#34;decision-periods&#34;&gt;Decision periods&lt;/h3&gt;
&lt;p&gt;To group spans by trace ID, Alloy buffers spans for a configurable amount of time, after which it considers the trace complete.
This configurable amount of time is known as the decision period. Traces that run longer than the decision period are split across more than one decision.&lt;/p&gt;
&lt;p&gt;In situations where a specific trace is longer in duration than the decision period, multiple decisions might be made for any future spans that fall outside of the decision period window.
This can result in some spans for a trace being sampled, while others are not.&lt;/p&gt;
&lt;p&gt;For example, consider a situation where the tail sampler decision period is 10 seconds, and a single policy exists to sample traces where an error is set on at least one span.
One of the traces is 20 seconds in duration and a single span at time offset 15 seconds exhibits an error status.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/media/docs/tempo/sampling/tempo-decision-point-sampling.svg&#34; alt=&#34;Trace Policy: Error when status exists&#34;/&gt;&lt;/p&gt;
&lt;p&gt;When the first span for the trace is observed, the decision period time of 10 seconds is initiated.
After the decision period has expired, the tail sampler won&amp;rsquo;t have observed any spans with an error status, and will therefore discard the trace spans.&lt;/p&gt;
&lt;p&gt;When the next span for the trace arrives, a new decision period of 10 seconds begins.
In this period, one of the observed spans has an error set on it. When the decision period expires, all of the spans for the trace in that period will be sampled.&lt;/p&gt;
&lt;p&gt;This leads to a fragmented trace being stored in Tempo, where only the spans for the last 10 seconds of the trace will be available to query.
While this is still a potentially useful trace, careful determination of how to set the decision period is key to ensuring that trace spans are sampled correctly.&lt;/p&gt;
&lt;p&gt;However, using longer decision periods increases the memory overhead of buffering the spans required to make a decision for each trace.&lt;/p&gt;
&lt;p&gt;For this reason, enabling a decision cache can ensure that previous sampling decisions for a specific trace ID are honored even after the expiration of the decision period.
For more details, refer to the Caches section.&lt;/p&gt;
&lt;h3 id=&#34;caches&#34;&gt;Caches&lt;/h3&gt;
&lt;p&gt;The OpenTelemetry tail sampling processor includes two separate caches: the sampled and non-sampled caches.
The sampled cache keeps a list of all trace IDs where a prior decision to keep spans has been made.
The non-sampled cache keeps a list of all trace IDs where a prior decision to drop spans has been made.
Both caches are configured with the maximum number of traces to store, and you can enable them independently or together.&lt;/p&gt;
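&lt;p&gt;As a sketch, the decision period and caches are configured alongside the sampling policies. The &lt;code&gt;decision_cache&lt;/code&gt; block and its attribute names are assumed from the tail sampling processor&amp;rsquo;s options, and the sizes are illustrative:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.processor.tail_sampling &amp;#34;default&amp;#34; {
  // How long to buffer spans before making a decision.
  decision_wait = &amp;#34;10s&amp;#34;

  // Remember prior decisions by trace ID (illustrative sizes).
  decision_cache {
    sampled_cache_size     = 100000
    non_sampled_cache_size = 100000
  }

  policy {
    name = &amp;#34;sample-errors&amp;#34;
    type = &amp;#34;status_code&amp;#34;

    status_code {
      status_codes = [&amp;#34;ERROR&amp;#34;]
    }
  }

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}&lt;/code&gt;&lt;/pre&gt;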
&lt;p&gt;&lt;img src=&#34;/media/docs/tempo/sampling/tempo-alloy-sampling-policies.svg&#34; alt=&#34;Decision points and caches workflow&#34;/&gt;&lt;/p&gt;
&lt;p&gt;In the preceding diagram, if both caches are enabled, a decision to drop spans for the trace is made after 10 seconds and the trace ID is stored in the non-sampled cache.
This means that even spans with an error status for that trace are dropped after the initial decision period, because the non-sampled cache matches the trace ID and pre-emptively drops the span.
The same applies when a sampled decision has been made: future spans that do not match any policies are still kept when their trace ID is found in the sampled cache.&lt;/p&gt;
&lt;p&gt;Understanding how these caches work ensures that you still keep decisions that have previously been made.
For example, you could use the sampled cache to short-circuit future decisions for a trace, immediately sampling the incoming span.
This allows a decision to be made without having to buffer any other spans.&lt;/p&gt;
&lt;p&gt;Here are some general guidelines for using caches.
Every installation is different.
Using the caches can impact the amount of data generated.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Cache type&lt;/th&gt;
          &lt;th&gt;Use case&lt;/th&gt;
          &lt;th&gt;Benefits/Considerations&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Sampled&lt;/td&gt;
          &lt;td&gt;Keep any future spans from traces that have been sampled.&lt;/td&gt;
          &lt;td&gt;&lt;ul&gt;&lt;li&gt;Cuts down span storage per trace to only those matching policies.&lt;/li&gt;&lt;li&gt;Can cause fragmented traces.&lt;/li&gt;&lt;/ul&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Non-sampled&lt;/td&gt;
          &lt;td&gt;Drop any future spans from traces where a decision to not sample those traces has explicitly occurred.&lt;/td&gt;
          &lt;td&gt;&lt;ul&gt;&lt;li&gt;Lowers the chance of storing traces after the initial decision period.&lt;/li&gt;&lt;li&gt;Misses any trace whose spans match policy criteria later on.&lt;/li&gt;&lt;/ul&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Both&lt;/td&gt;
          &lt;td&gt;Make a decision once during the initial decision period and use that decision going forward.&lt;/td&gt;
          &lt;td&gt;&lt;ul&gt;&lt;li&gt;Guarantees capture of full traces.&lt;/li&gt;&lt;li&gt;Lower chance of capturing useful traces with a long duration.&lt;/li&gt;&lt;li&gt;Can lose spans if they are longer than the decision period.&lt;/li&gt;&lt;/ul&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;Enabling both the sampled and non-sampled caches behaves similarly to not enabling caches at all, except that any future decision making is short-circuited once an initial decision period has expired. Enabling both caches therefore lowers the memory required for buffering spans.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;h3 id=&#34;tail-sampling-load-balancing&#34;&gt;Tail sampling load balancing&lt;/h3&gt;
&lt;p&gt;In multi-instance Alloy deployments, spans belonging to the same trace can arrive at different instances.
In most cases, sampling decisions rely on all the spans for a specific trace ID being received by a single instance.&lt;/p&gt;
&lt;p&gt;You can configure Alloy to load balance traces across instances by exporting spans belonging to a specific trace ID to the same instance.
For example, if 10 traces arrive and there are four Alloy instances, each instance receives two or three traces.
The load balancing maintains consistent hashing across all instances.&lt;/p&gt;
&lt;p&gt;Tail sampling load balancing is usually carried out by running two layers of collectors.
The first layer receives the telemetry data (in this case trace spans), and then distributes these to the second layer that carries out the sampling policies.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/media/docs/tempo/sampling/tempo-alloy-sampling-loadbalancing.svg&#34; alt=&#34;Load balancing incoming traces using Alloy&#34;/&gt;&lt;/p&gt;
&lt;p&gt;Alloy includes a 
    &lt;a href=&#34;/docs/alloy/v2.9.x/reference/components/otelcol/otelcol.exporter.loadbalancing/&#34;&gt;load-balancing exporter&lt;/a&gt; that can carry out routing to further collector targets based on a set number of keys (in the case of trace sampling, usually the &lt;code&gt;traceID&lt;/code&gt; key).
Alloy uses the &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/loadbalancingexporter/README.md&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry load balancing exporter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The routing key ensures that a specific collector in the second layer always handles spans from the same trace ID, guaranteeing that sampling decisions are made correctly.
You can configure the exporter with targets using static IP addresses, multi-IP DNS A record entries, and a Kubernetes headless service resolver.
Using this configuration lets you scale up or down the number of layer two collectors.&lt;/p&gt;
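&lt;p&gt;A first-layer sketch might route by trace ID to a headless Kubernetes service; the service name is hypothetical, and the TLS setting is only for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.exporter.loadbalancing &amp;#34;default&amp;#34; {
  // Route all spans with the same trace ID to the same backend.
  routing_key = &amp;#34;traceID&amp;#34;

  resolver {
    kubernetes {
      // Hypothetical headless service fronting the sampling layer.
      service = &amp;#34;alloy-sampling.monitoring&amp;#34;
    }
  }

  protocol {
    otlp {
      client {
        tls {
          insecure = true
        }
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;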
&lt;p&gt;There are some important points to note with the load balancer exporter around scaling and resilience, mostly around its eventual consistency model. For more information, refer to &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/loadbalancingexporter/README.md#resilience-and-scaling-considerations&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Resilience and scaling considerations&lt;/a&gt;.
The most important consideration for tail sampling is that routing is based on an algorithm that takes into account the number of backends available to the load balancer.
This can affect the target for a trace ID&amp;rsquo;s spans before eventual consistency occurs.&lt;/p&gt;
&lt;p&gt;For an example manifest for a two layer OpenTelemetry Collector deployment based around Kubernetes services, refer to the &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/loadbalancingexporter/example/k8s-resolver/README.md&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Kubernetes service resolver README&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;pipeline-workflows&#34;&gt;Pipeline workflows&lt;/h2&gt;
&lt;p&gt;When implementing tail sampling in your telemetry collection pipeline, there are some considerations to keep in mind.
The act of sampling reduces the amount of tracing telemetry data that&amp;rsquo;s sent to Tempo.
This can affect how you observe that data inside Grafana.&lt;/p&gt;
&lt;p&gt;The following is a suggested pipeline that can be applied to both &lt;a href=&#34;/docs/alloy/latest/&#34;&gt;Grafana Alloy&lt;/a&gt; and the &lt;a href=&#34;https://opentelemetry.io/docs/collector/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry Collector&lt;/a&gt;, to carry out tail sampling, but also ensure that other telemetry signals are still captured for observation from within Grafana and Grafana Cloud.&lt;/p&gt;
&lt;p&gt;This pipeline exists in the second layer of collectors, which is sent data by the load balancing layer, and is commonly deployed as a Kubernetes &lt;code&gt;StatefulSet&lt;/code&gt; to ensure that each instance has a consistent identity. A realistic example pipeline could be made up of the following components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;OTLP Receiver&lt;/strong&gt; is the &lt;a href=&#34;https://opentelemetry.io/docs/specs/otel/protocol/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry Protocol&lt;/a&gt; (OTLP) receiver in this pipeline, and receives traces from the load balancing exporter. This receiver is responsible for initiating the processing pipeline within this collector layer.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Transform Processor&lt;/strong&gt; is used to modify any incoming trace spans before they are exported to other components in the pipeline. This allows the mutation of attributes (for example, deletion, mutation, or insertion), as well as any other required &lt;a href=&#34;https://opentelemetry.io/docs/collector/transforming-telemetry/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OpenTelemetry Transform Language&lt;/a&gt; (OTTL)-based operations. This component must come before metric generation Connectors or the tail sampling Processor so that required changes can be used for label names (for metrics) or policy matching (for tail sampling).&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;SpanMetrics Connector&lt;/strong&gt; is responsible for extracting metrics from the incoming traces, and can be used as a fork in the pipeline. These metrics include crucial information such as trace latency, error rates, and other performance indicators, which are essential for understanding the health and performance of your services. It&amp;rsquo;s important to ensure that this Connector is configured to receive span data before any tail sampling occurs.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;ServiceGraph Connector&lt;/strong&gt; generates service dependency graphs from the traces, and can be used as a fork in the pipeline or chained together with the span metrics connector. These graphs visually represent the interactions between various services in your system, helping to identify bottlenecks and understand the flow of requests. It&amp;rsquo;s important to ensure that this Connector is configured to receive span data before any tail sampling occurs.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Tail Sampling Processor&lt;/strong&gt; is the core of the secondary collector layer. It applies the sampling policies you’ve configured to decide which traces should be retained and further processed. The sampling decision is made after the entire trace has been observed, or the decision wait time has elapsed, allowing the processor to make more informed choices based on the full context of the trace.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;OTLP Exporter&lt;/strong&gt; exports the sampled traces (or generated span and service metrics) to Grafana Cloud via OTLP.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Prometheus Exporter&lt;/strong&gt; is optional. If metrics aren&amp;rsquo;t sent via OTLP, then you can use this component to send Prometheus compatible metrics to &lt;a href=&#34;/oss/mimir/&#34;&gt;Mimir&lt;/a&gt; or &lt;a href=&#34;/products/cloud/metrics/&#34;&gt;Grafana Cloud Metrics&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
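&lt;p&gt;The components above can be sketched as an Alloy pipeline for the second layer. The wiring, policy, and argument values are illustrative assumptions, not a definitive deployment:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;// Receive spans forwarded by the load balancing layer.
otelcol.receiver.otlp &amp;#34;default&amp;#34; {
  grpc {}

  output {
    traces = [otelcol.processor.transform.default.input]
  }
}

// Mutate spans before metrics generation and sampling.
otelcol.processor.transform &amp;#34;default&amp;#34; {
  trace_statements {
    context    = &amp;#34;span&amp;#34;
    // Example OTTL statement; replace with your own operations.
    statements = [&amp;#34;truncate_all(attributes, 4096)&amp;#34;]
  }

  output {
    // Fork: the connectors see all spans, the sampler decides what to keep.
    traces = [
      otelcol.connector.spanmetrics.default.input,
      otelcol.connector.servicegraph.default.input,
      otelcol.processor.tail_sampling.default.input,
    ]
  }
}

// Generate RED-style metrics from spans before sampling occurs.
otelcol.connector.spanmetrics &amp;#34;default&amp;#34; {
  histogram {
    explicit {}
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

// Generate service dependency graph metrics before sampling occurs.
otelcol.connector.servicegraph &amp;#34;default&amp;#34; {
  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

// Apply the sampling policies (example: keep traces with errors).
otelcol.processor.tail_sampling &amp;#34;default&amp;#34; {
  decision_wait = &amp;#34;10s&amp;#34;

  policy {
    name = &amp;#34;sample-errors&amp;#34;
    type = &amp;#34;status_code&amp;#34;

    status_code {
      status_codes = [&amp;#34;ERROR&amp;#34;]
    }
  }

  output {
    traces = [otelcol.exporter.otlp.default.input]
  }
}

// Send sampled traces and generated metrics to the backend via OTLP.
otelcol.exporter.otlp &amp;#34;default&amp;#34; {
  client {
    endpoint = env(&amp;#34;OTLP_ENDPOINT&amp;#34;)
  }
}&lt;/code&gt;&lt;/pre&gt;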
]]></content><description>&lt;h1 id="sampling">Sampling&lt;/h1>
&lt;p>Grafana Tempo is a cost-effective solution that ingests and stores traces that provide maximum observability across your application estate.
However, sometimes constraints mean that storing all of your traces is not desirable, for example runtime or egress traffic related costs.
There are a number of ways to lower trace volume, including varying sampling strategies.&lt;/p></description></item></channel></rss>