Important: This documentation is about an older version of Tempo. It applies only to the release noted; many features and functions have since been updated or replaced. Please view the current version.
Configuration
This document explains the configuration options for Tempo as well as the details of what they impact. It includes:
- server
- distributor
- ingester
- metrics-generator
- query-frontend
- querier
- compactor
- storage
- memberlist
- overrides
- search
- usage-report
Use environment variables in the configuration
You can use environment variable references in the configuration file to set values that need to be configurable during deployment. To do this, pass the `--config.expand-env` option and use:

`${VAR}`

where `VAR` is the name of the environment variable.
Each variable reference is replaced at startup by the value of the environment variable. The replacement is case-sensitive and occurs before the YAML file is parsed. References to undefined variables are replaced by empty strings unless you specify a default value or custom error text.
To specify a default value, use:
`${VAR:-default_value}`

where `default_value` is the value to use if the environment variable is undefined.
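For example, the following illustrative fragment (the bucket and endpoint values are placeholders, not recommendations) reads an S3 bucket name from the environment and falls back to a default endpoint when `S3_ENDPOINT` is unset, assuming Tempo is started with `--config.expand-env`:

```yaml
# Illustrative tempo.yaml fragment; bucket and endpoint values are placeholders.
storage:
  trace:
    backend: s3
    s3:
      bucket: ${S3_BUCKET}                        # replaced at startup; empty if unset
      endpoint: ${S3_ENDPOINT:-s3.amazonaws.com}  # falls back to the default when unset
```

Because substitution happens before the YAML file is parsed, the rendered file looks exactly as if the values had been written literally.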
You can find more about other supported syntax here.
Server
Tempo uses the Weaveworks/common server. For more information on configuration options, see here.
Distributor
For more information on configuration options, see here.
Distributors receive spans and forward them to the appropriate ingesters.
The following configuration enables all available receivers with their default configuration. For a production deployment, enable only the receivers you need. Additional documentation and more advanced configuration options are available in the receiver README.
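As a partial sketch of what a trimmed-down receiver configuration can look like, the fragment below enables only the OTLP and Jaeger receivers with their default settings (receiver and protocol names follow the OpenTelemetry Collector conventions Tempo uses; consult the receiver README for the full set of options):

```yaml
# Partial sketch: enable only the receivers you need, with default settings.
distributor:
  receivers:
    otlp:
      protocols:
        grpc:          # OTLP over gRPC on the default port
        http:          # OTLP over HTTP on the default port
    jaeger:
      protocols:
        thrift_http:
        grpc:
```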
Ingester
For more information on configuration options, see here.
The ingester is responsible for batching up traces and pushing them to TempoDB.
```yaml
# Ingester configuration block
ingester:

  # Lifecycler is responsible for managing the lifecycle of entries in the ring.
  # For a complete list of config options check the lifecycler section under the ingester config at the following link -
  # https://cortexmetrics.io/docs/configuration/configuration-file/#ingester_config
  lifecycler:
    ring:
      # number of replicas of each span to make while pushing to the backend
      replication_factor: 3

  # amount of time a trace must be idle before flushing it to the wal.
  # (default: 10s)
  [trace_idle_period: <duration>]

  # how often to sweep all tenants and move traces from live -> wal -> completed blocks.
  # (default: 10s)
  [flush_check_period: <duration>]

  # maximum size of a block before cutting it
  # (default: 1073741824 = 1GB)
  [max_block_bytes: <int>]

  # maximum length of time before cutting a block
  # (default: 1h)
  [max_block_duration: <duration>]

  # duration to keep blocks in the ingester after they have been flushed
  # (default: 15m)
  [complete_block_timeout: <duration>]
```
Metrics-generator
For more information on configuration options, see here.
The metrics-generator processes spans and writes metrics using the Prometheus remote write protocol.
The metrics-generator is an optional component. It can be enabled with the following top-level setting. In microservices mode, the setting must be applied to the distributors and the metrics-generators.

```yaml
metrics_generator_enabled: true
```
Metrics-generator processors are disabled by default. To enable a processor for a specific tenant, set `metrics_generator_processors` in the overrides section.
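As a sketch, enabling the service-graphs and span-metrics processors for all tenants through the standard overrides section could look like the following (processor names as used by the Tempo metrics-generator; verify them against the overrides reference before use):

```yaml
# Sketch: enable both metrics-generator processors for every tenant.
overrides:
  metrics_generator_processors:
    - service-graphs   # builds service dependency metrics from spans
    - span-metrics     # aggregates RED-style metrics per span
```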
Query-frontend
For more information on configuration options, see here.
The Query Frontend is responsible for sharding incoming requests for faster processing in parallel (by the queriers).
Querier
For more information on configuration options, see here.
The Querier is responsible for querying the backends/cache for the traceID.
It also queries compacted blocks that fall within the (2 * BlocklistPoll) range, where the blocklist poll duration is defined in the storage section below.
Compactor
For more information on configuration options, see here.
Compactors stream blocks from the storage backend, combine them and write them back. Values shown below are the defaults.
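The full compactor block is not reproduced here. As a partial sketch (option names believed accurate for this release, but check the linked configuration reference before relying on them), retention-related options live under `compactor.compaction`:

```yaml
# Partial sketch of compactor options; values shown are illustrative.
compactor:
  compaction:
    # length of time to keep blocks before deleting them
    block_retention: 336h
    # length of time to keep blocks that have already been compacted elsewhere
    compacted_block_retention: 1h
```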
Storage
Tempo supports Amazon S3, GCS, Azure, and local file system for storage. In addition, you can use Memcached or Redis for increased query performance.
For more information on configuration options, see here.
Local storage recommendations
While you can use local storage, object storage is recommended for production workloads. A local backend will not correctly retrieve traces with a distributed deployment unless all components have access to the same disk. Tempo is designed for object storage more than local storage.
At Grafana Labs, we have run Tempo with SSDs when using local storage. Hard drives have not been tested.
You can estimate how much storage space you need from ingestion volume and retention: ingested bytes per day × retention in days ≈ stored bytes.
You can not use both local and object storage in the same Tempo deployment.
Storage block configuration example
The storage block is used to configure TempoDB. The following example shows common options. For platform-specific details, refer to the documentation for your storage backend.
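As an illustration (the bucket name and paths below are placeholders, not recommendations), a minimal S3-backed storage block might look like:

```yaml
# Illustrative storage block; bucket name and paths are placeholders.
storage:
  trace:
    backend: s3                  # one of: s3, gcs, azure, local
    s3:
      bucket: tempo-traces
      endpoint: s3.amazonaws.com
    wal:
      path: /var/tempo/wal       # where traces are buffered before being flushed
```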
Memberlist
Memberlist is the default mechanism for all of the Tempo pieces to coordinate with each other.
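In a typical Kubernetes deployment, every component is pointed at a shared gossip address; the hostname below is an assumption for illustration only:

```yaml
# Illustrative memberlist block; the join address is a placeholder.
memberlist:
  join_members:
    - gossip-ring.tempo.svc.cluster.local:7946   # 7946 is the default gossip port
```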
Overrides
Tempo provides an overrides module for users to set global or per-tenant override settings.
Ingestion limits
The default limits in Tempo may not be sufficient in high-volume tracing environments.
Errors such as `RATE_LIMITED`, `TRACE_TOO_LARGE`, and `LIVE_TRACES_EXCEEDED` occur when these limits are exceeded.
See below for how to override these limits globally or per tenant.
Standard overrides
You can create an `overrides` section in your `config.yaml` to configure new ingestion limits that apply to all tenants of the cluster.
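A sketch of such a section, using the limit fields listed under tenant-specific overrides below (the numeric values are arbitrary examples, not recommended limits):

```yaml
# Illustrative global limits applied to all tenants; values are arbitrary.
overrides:
  ingestion_burst_size_bytes: 20000000   # 20MB burst
  ingestion_rate_limit_bytes: 15000000   # 15MB/s sustained
  max_bytes_per_trace: 5000000           # reject traces larger than 5MB
  max_traces_per_user: 10000             # live traces per tenant
```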
Tenant-specific overrides
You can set tenant-specific override settings in a separate file and point `per_tenant_override_config` to it.
This overrides file is dynamically loaded.
It can be changed at runtime and reloaded by Tempo without restarting the application.
These override settings can be set per tenant.
```yaml
# /conf/tempo.yaml
# Overrides configuration block
overrides:
  per_tenant_override_config: /conf/overrides.yaml

---
# /conf/overrides.yaml
# Tenant-specific overrides configuration
overrides:
  "<tenant id>":
    [ingestion_burst_size_bytes: <int>]
    [ingestion_rate_limit_bytes: <int>]
    [max_bytes_per_trace: <int>]
    [max_traces_per_user: <int>]

  # A "wildcard" override can be used that will apply to all tenants
  # if a match is not found otherwise.
  "*":
    [ingestion_burst_size_bytes: <int>]
    [ingestion_rate_limit_bytes: <int>]
    [max_bytes_per_trace: <int>]
    [max_traces_per_user: <int>]
```
Override strategies
The trace limits specified by the various parameters are, by default, applied as per-distributor limits.
For example, a `max_traces_per_user` setting of 10000 means that each distributor within the cluster has a limit of 10000 traces per user.
This is known as a `local` strategy in that the specified trace limits are local to each distributor.
A local limit helps ensure that each distributor can independently process traces up to the limit without affecting the limits on other distributors.
However, as a cluster grows, the combined total across all distributors can become very large.
An alternative strategy is to set a `global` trace limit that establishes a total budget for all traces across all distributors in the cluster.
The global limit is averaged across all distributors by using the distributor ring.
```yaml
# /conf/tempo.yaml
overrides:
  [ingestion_rate_strategy: <global|local> | default = local]
```
For example, this configuration specifies that each instance of the distributor will apply a limit of `15MB/s`.

```yaml
overrides:
  ingestion_rate_strategy: local
  ingestion_rate_limit_bytes: 15000000
```
This configuration specifies that all distributor instances together will apply a limit of `15MB/s`.
So if there are 5 instances, each instance will apply a local limit of `(15MB/s / 5) = 3MB/s`.

```yaml
overrides:
  ingestion_rate_strategy: global
  ingestion_rate_limit_bytes: 15000000
```
Search
Tempo search can be enabled with the following top-level setting. In microservices mode, the setting must be applied to the distributors and queriers.

```yaml
search_enabled: true
```
Additional search-related settings are available in the distributor and ingester sections.
Usage-report
By default, Tempo reports anonymous usage data about the shape of a deployment to Grafana Labs. This data is used to determine how commonly certain features are deployed, whether a feature flag has been enabled, and which replication factors or compression levels are used.
By providing information on how people use Tempo, usage reporting helps the Tempo team decide where to focus their development and documentation efforts. No private information is collected, and all reports are completely anonymous.
Reporting is controlled by a configuration option.
The following configuration values are used:
- Receivers enabled
- Frontend concurrency and version
- Storage cache, backend, wal and block encodings
- Ring replication factor and `kvstore`
- Feature toggles enabled
No performance data is collected.
You can disable the automatic reporting of this generic information using the following configuration:
```yaml
usage_report:
  reporting_enabled: false
```