Important: This documentation is about an older version. It's relevant only to the release noted, many of the features and functions have been updated or replaced. Please view the current version.

Configuration

Parquet

Open source

Apache Parquet backend

This is an experimental feature released with Tempo 1.5. For more information about how to enable it, continue reading.

Tempo now has a columnar block format based on Apache Parquet. A columnar block format may result in improved search performance and also enables a large ecosystem of tools access to the underlying trace data.

For more information, refer to the Parquet design document and Issue 1480.

Considerations

The new Parquet block format can be used as a drop-in replacement for Tempo’s existing block format. No data conversion or upgrade process is necessary. As soon as the Parquet format is enabled, Tempo starts writing data in that format, leaving existing data as-is.

Please note, however, that enabling the Parquet block format means Tempo will require more CPU and memory resources than it previously did.

Enable Parquet

To use Parquet, set the block format option to vParquet in the Storage section of the configuration file.

# block format version. options: v2, vParquet
[version: vParquet | default = v2]

The following adjustments are recommended for your configuration:

querier:
  max_concurrent_queries: 100
  search:
    prefer_self: 50   # only if you're using external endpoints

query_frontend:
  max_outstanding_per_tenant: 2000
  search:
    concurrent_jobs: 2000
    target_bytes_per_job: 400_000_000

storage:
  trace:
    <gcs|s3|azure>:
      hedge_requests_at: 1s
      hedge_requests_up_to: 2

Parquet configuration parameters

Some parameters in the Tempo configuration are specific to Parquet.
For more information, refer to the usage-report configuration documentation.

Trace search parameters

These configuration options impact trace search.

Parameter	Default value	Description
`[read_buffer_size_bytes: <int>]`	`4194304`	Size of read buffers used when performing search on a vParquet block. This value times the `read_buffer_count` is the total amount of bytes used for buffering when performing search on a Parquet block.

`[read_buffer_count: <int>]`	8	Number of read buffers used when performing search on a vParquet block. This value times the `read_buffer_size_bytes` is the total amount of bytes used for buffering when performing search on a Parquet block.

The cache_control section contains the follow parameters for Parquet metadata objects:

Parameter	Default value	Description
`[footer: \| default = false]`	`false`	Specifies if the footer should be cached
`[column_index: <bool> \| default = false]`	`false`	Specifies if the column index should be cached
`[offset_index: <bool> \| default = false]`	`false`	Specifies if the offset index should be cached