Important: This documentation is about an older version. It is relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.
Apache Parquet backend
This is an experimental feature released with Tempo 1.5. For more information about how to enable it, continue reading.
Tempo now has a columnar block format based on Apache Parquet. A columnar block format may improve search performance, and it gives a large ecosystem of existing Parquet tools access to the underlying trace data.
For more information, refer to the Parquet design document and Issue 1480.
Considerations
The new Parquet block format can be used as a drop-in replacement for Tempo’s existing block format. No data conversion or upgrade process is necessary. As soon as the Parquet format is enabled, Tempo starts writing data in that format, leaving existing data as-is.
Please note, however, that enabling the Parquet block format means Tempo will require more CPU and memory resources than it previously did.
Enable Parquet
To use Parquet, set the block format option to vParquet in the Storage section of the configuration file.
# block format version. options: v2, vParquet
[version: vParquet | default = v2]
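As a minimal sketch of where this option lives, the fragment below assumes the version key sits under storage.trace.block, as in Tempo's storage configuration reference. Check the nesting against the reference for your release before using it.

```yaml
storage:
  trace:
    block:
      # Write new blocks in the Parquet format.
      # Existing v2 blocks are left as-is and remain readable.
      version: vParquet
```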
The following adjustments are recommended for your configuration:
querier:
  max_concurrent_queries: 100
  search:
    prefer_self: 50 # only if you're using external endpoints
query_frontend:
  max_outstanding_per_tenant: 2000
  search:
    concurrent_jobs: 2000
    target_bytes_per_job: 400_000_000
storage:
  trace:
    <gcs|s3|azure>:
      hedge_requests_at: 1s
      hedge_requests_up_to: 2
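The <gcs|s3|azure> placeholder stands for the key of whichever object store backend you use. As an illustration only, this is how the hedging settings might look with S3 as the backend. The choice of S3 is an assumption, and connection settings are left out.

```yaml
storage:
  trace:
    backend: s3   # assumption: S3 is the object store in use
    s3:
      # bucket, endpoint, and credentials omitted; they are deployment-specific
      hedge_requests_at: 1s      # issue a second request if the first takes longer than 1s
      hedge_requests_up_to: 2    # never have more than 2 requests in flight per read
```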
Parquet configuration parameters
Some parameters in the Tempo configuration are specific to Parquet.
For more information, refer to the usage-report configuration documentation.
Trace search parameters
These configuration options impact trace search.
The cache_control section contains the following parameters for Parquet metadata objects: