Deployment modes

Thu, 28 May 2026 17:50:33 +0100

Tempo supports two deployment modes: monolithic and microservices. All components are compiled into a single binary, and the -target flag determines which mode runs.

Monolithic mode

In monolithic mode, the required components run in a single process using -target=all, which is the default. No Kafka is required. The distributor pushes trace data in-process directly to the live-store and metrics-generator, and traces are flushed to the configured storage backend. Object storage is recommended for production deployments.

Use monolithic mode when:

You are getting started with Tempo or evaluating it
You need a development or testing environment
Your trace volume is below roughly 25-35 MB/s (about 55k-80k spans/s)
Operational simplicity matters more than independent scaling

Monolithic mode has some trade-offs to be aware of. All components share the same resource pool, so a spike in query load can affect write throughput and vice versa. There is no independent scaling: you can scale vertically or run multiple identical instances, but you cannot scale individual components separately. At higher volumes, memory pressure from collocated components can cause issues.

Microservices mode

In microservices mode, each component runs as a separate process with its own -target flag. For example, -target=distributor or -target=querier. This mode requires a Kafka-compatible system, such as Apache Kafka, Redpanda, or WarpStream, as the durable queue between the distributor and downstream consumers.
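
As a rough illustration of how the -target flag selects what a process runs, the following Python sketch builds the command line for a monolithic process and for two microservices components. The binary name, the configuration path, and the -config.file flag are assumptions made for this example; check the CLI help for your Tempo version before relying on them.

```python
import shlex

# Hypothetical values for illustration only; adjust to your environment.
TEMPO_BINARY = "tempo"                 # assumed binary name on PATH
CONFIG_PATH = "/etc/tempo/tempo.yaml"  # assumed configuration file location


def tempo_command(target: str = "all") -> list[str]:
    """Return an argv list that starts Tempo with the given -target.

    "all" corresponds to monolithic mode; a component name such as
    "distributor" or "querier" starts only that component.
    """
    # -config.file is assumed here; confirm it against your Tempo version.
    return [TEMPO_BINARY, f"-target={target}", f"-config.file={CONFIG_PATH}"]


if __name__ == "__main__":
    # One monolithic process, then two separate microservices processes.
    for target in ("all", "distributor", "querier"):
        print(shlex.join(tempo_command(target)))
```

In a real microservices deployment, each of these commands would typically run in its own container or pod, and the shared configuration would also point at the Kafka-compatible queue.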

Use microservices mode when:

You are running a production deployment
You have high trace volumes that require independent scaling
You need high availability and isolated failure domains
You want to scale write throughput, query performance, and recent-data capacity independently

Microservices mode provides independent scaling for each component and isolated failure domains. A querier crash doesn’t affect ingestion, and a block-builder restart doesn’t affect query availability. Live-stores can be deployed across availability zones for high availability.

Choosing a mode

Consideration	Monolithic	Microservices
Kafka required	No	Yes
Scaling	Single process; scale vertically or run multiple identical instances	Each component scales independently
Failure isolation	All components share resources	Isolated failure domains per component
Operational complexity	Low	Higher, with more processes to manage
Best for	Getting started, development, up to 25-35 MB/s	Production, high volume, high availability
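
The table can be read as a simple decision rule. The sketch below is one possible encoding of it, not an official sizing tool: the function name and parameters are hypothetical, and the 25 MB/s threshold is the conservative lower bound of the 25-35 MB/s guidance.

```python
# Conservative lower bound of the 25-35 MB/s monolithic guidance.
MONOLITHIC_MAX_MB_PER_S = 25.0


def recommend_mode(
    ingest_mb_per_s: float,
    needs_high_availability: bool = False,
    needs_independent_scaling: bool = False,
) -> str:
    """Return "monolithic" or "microservices" for the given requirements."""
    # High availability and per-component scaling are only offered by
    # microservices mode, regardless of volume.
    if needs_high_availability or needs_independent_scaling:
        return "microservices"
    # Above the monolithic guidance, shared resources become a risk.
    if ingest_mb_per_s >= MONOLITHIC_MAX_MB_PER_S:
        return "microservices"
    return "monolithic"


if __name__ == "__main__":
    print(recommend_mode(5))                                # monolithic
    print(recommend_mode(40))                               # microservices
    print(recommend_mode(5, needs_high_availability=True))  # microservices
```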

Next steps

For detailed architecture, component descriptions, scaling guidelines, and migration guidance, refer to the Deployment modes reference.
To size your cluster, refer to Size the cluster.
To deploy Tempo, refer to Deploy your Tempo instance.

Size the cluster

Thu, 28 May 2026 17:50:33 +0100

Resource requirements for your Grafana Tempo cluster depend on the amount and rate of data processed, retained, and queried.

This document provides basic configuration guidelines that you can use as a starting point to help size your own deployment.

Note
Tempo is under continuous development. These requirements can change with each release.

Factors impacting cluster sizing

The size of the cluster you deploy depends on the resources it needs for a given ingestion rate and retention period. The main factors are the number of spans per second, the average span size in bytes, the query rate, and the number of days of retention.
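
To turn these factors into numbers, you can multiply the span rate by the average span size to get an ingestion rate, and then multiply by the retention period to get a rough upper bound on retained data. The sketch below shows only that arithmetic. The function names and the 500-byte span size are hypothetical, and the result ignores compression and any other storage overhead, so treat it as an order-of-magnitude estimate.

```python
SECONDS_PER_DAY = 86_400


def ingest_rate_mb_per_s(spans_per_second: int, avg_span_bytes: int) -> float:
    """Ingestion rate in MB/s (1 MB = 1,000,000 bytes)."""
    return spans_per_second * avg_span_bytes / 1_000_000


def raw_retained_gb(
    spans_per_second: int, avg_span_bytes: int, retention_days: int
) -> float:
    """Uncompressed volume ingested over the retention period, in GB."""
    bytes_per_day = spans_per_second * avg_span_bytes * SECONDS_PER_DAY
    return bytes_per_day * retention_days / 1_000_000_000


if __name__ == "__main__":
    # Hypothetical workload: 50,000 spans/s at 500 bytes per span, 14 days.
    print(ingest_rate_mb_per_s(50_000, 500))  # 25.0
    print(raw_retained_gb(50_000, 500, 14))   # 30240.0
```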

Tracing instrumentation also affects your Tempo cluster requirements. Refer to Best practices for suggestions on determining where to add spans, span length, and attributes.

Example cluster sizing

Distributor:

1 replica for every 10 MB/s of received traffic
CPU: 2 cores
Mem: 2 GB

Live-store:

1 replica for every 6-10 MB/s of received traffic
CPU: 1 core
Mem: 4-20 GB, determined by trace composition
Typically deployed across multiple availability zones for high availability

Block-builder:

1 replica per partition
CPU: 0.5 cores
Mem: 5-10 GB, determined by trace composition

Querier:

1 replica for every 1-2 MB/s of received traffic
CPU: dependent on trace size and queries
Mem: 4-20 GB, determined by trace composition and queries
This number of queriers should give good performance for typical search patterns and time ranges. You can scale the querier count up or down to fit your specific workload.

Query-Frontend:

2 replicas, for high availability
CPU: dependent on trace size and queries
Mem: 4-20 GB, dependent on trace size and queries

Backend-scheduler:

1 replica (only one scheduler should be running at a time)
CPU: 0.5 cores
Mem: 1-2 GB

Backend-worker:

Replica count: to be determined; it depends on the compaction workload
CPU: 0.5 cores
Mem: 1-2 GB
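
The per-component ratios above can be applied mechanically to a target ingestion rate. The following sketch is a starting-point calculator under the guidelines in this section, not a definitive sizing method. The function name and the dictionary layout are hypothetical, and the backend-worker is omitted because its replica guideline is not yet defined.

```python
import math


def size_cluster(ingest_mb_per_s: float, partitions: int) -> dict[str, tuple[int, int]]:
    """Return (min, max) replica counts per component.

    ingest_mb_per_s: expected received traffic in MB/s.
    partitions: number of partitions consumed by block-builders.
    """

    def replicas(mb_per_s_per_replica: float) -> int:
        # Round up, and always run at least one replica.
        return max(1, math.ceil(ingest_mb_per_s / mb_per_s_per_replica))

    return {
        "distributor": (replicas(10), replicas(10)),  # 1 per 10 MB/s
        "live-store": (replicas(10), replicas(6)),    # 1 per 6-10 MB/s
        "block-builder": (partitions, partitions),    # 1 per partition
        "querier": (replicas(2), replicas(1)),        # 1 per 1-2 MB/s
        "query-frontend": (2, 2),                     # fixed, for availability
        "backend-scheduler": (1, 1),                  # only one at a time
    }


if __name__ == "__main__":
    # Hypothetical workload: 30 MB/s of traffic across 4 partitions.
    for component, (low, high) in size_cluster(30, partitions=4).items():
        count = str(low) if low == high else f"{low}-{high}"
        print(f"{component}: {count} replica(s)")
```

For 30 MB/s and 4 partitions, this yields 3 distributors, 3-5 live-stores, 4 block-builders, and 15-30 queriers. Where a range is returned, start near the upper end and scale down once you observe real memory and CPU usage.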

Performance tuning resources

Refer to these documents for additional information on tuning your Tempo cluster:

For information on more advanced system options, refer to Manage advanced systems.
