Apache Parquet block format
Tempo has a default columnar block format based on Apache Parquet. This format is required for tags-based search as well as TraceQL, the query language for traces. The columnar block format improves search performance and enables an ecosystem of tools, including Tempo CLI, to access the underlying trace data.
For more information, refer to the Parquet design document and Issue 1480. Additionally, there is now a Parquet v3 design document.
Considerations
The Parquet block format has been enabled by default since Tempo 2.0.
If you install using the Tempo Helm charts, then Parquet is enabled by default. No data conversion or upgrade process is necessary. As soon as a block format is enabled, Tempo starts writing data in that format, leaving existing data as-is.
Block formats based on Parquet require more CPU and memory resources than the previous v2 format but provide search and TraceQL functionality.
Choose a different block format
The default block format is vParquet4, which is the latest iteration of the Parquet-based columnar block format in Tempo.
vParquet4 introduces new columns which enable querying for data in array attributes as well as events and links.
For more information, refer to Dedicated attribute columns.
You can still use the previous format vParquet3.
To enable it, set the block version option to vParquet3 in the Storage section of the configuration file.
# block format version. options: v2, vParquet2, vParquet3, vParquet4
[version: vParquet4]
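As a rough sketch of where this option sits, the fragment below assumes the block settings nest under storage > trace > block, following the storage configuration documentation. That path is not stated on this page, so verify it against the configuration reference for your Tempo version.

```yaml
# Sketch only: the storage > trace > block nesting is an assumption,
# not something stated on this page.
storage:
  trace:
    block:
      # Write new blocks in the previous Parquet format.
      # Existing blocks are left as-is.
      version: vParquet3
```

Replacing vParquet3 with any other listed option selects that format in the same way.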
In some cases, you may choose to disable Parquet and use the old v2 block format.
Using the v2 block format disables all forms of search, but also reduces resource consumption, and may be desired for a high-throughput cluster that doesn’t need these capabilities.
To make this change, set the block version option to v2 in the Storage section of the configuration file.
# block format version. options: v2, vParquet2, vParquet3, vParquet4
[version: v2]
To re-enable the default vParquet4 format, remove the block version option from the Storage section of the configuration file or set the option to vParquet4.
Parquet configuration parameters
Some parameters in the Tempo configuration are specific to Parquet. For more information, refer to the storage configuration documentation.
Trace search parameters
These configuration options impact trace search.
Parameter | Default value | Description |
---|---|---|
[read_buffer_size_bytes: <int>] | 10485676 | Size, in bytes, of each read buffer used when searching a Parquet block. This value multiplied by read_buffer_count is the total number of bytes used for buffering when searching a Parquet block. |
[read_buffer_count: <int>] | 32 | Number of read buffers used when searching a Parquet block. This value multiplied by read_buffer_size_bytes is the total number of bytes used for buffering when searching a Parquet block. |
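To make the buffer arithmetic concrete, the fragment below spells out the defaults from the table and the total they imply. The storage > trace > search nesting is an assumption based on the storage configuration documentation, so check it against your Tempo version.

```yaml
# Sketch only: the storage > trace > search nesting is an assumption.
storage:
  trace:
    search:
      # Size of each read buffer, in bytes (default from the table above).
      read_buffer_size_bytes: 10485676
      # Number of read buffers (default from the table above).
      read_buffer_count: 32
      # Total buffering when searching a Parquet block:
      #   10485676 bytes * 32 = 335541632 bytes (roughly 320 MiB)
```

Because the total is a simple product, halving either value halves the memory used for search buffering.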
The cache_control section contains the following parameters for Parquet metadata objects:
Parameter | Default value | Description |
---|---|---|
[footer: <bool>] | false | Specifies whether the footer should be cached. |
[column_index: <bool>] | false | Specifies whether the column index should be cached. |
[offset_index: <bool>] | false | Specifies whether the offset index should be cached. |
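As an illustration, the fragment below turns on caching for two of the three metadata objects. The values are arbitrary examples, not recommendations, and placing cache_control under storage > trace > search is an assumption to confirm against the storage configuration documentation.

```yaml
# Sketch only: the placement of cache_control under
# storage > trace > search is an assumption.
storage:
  trace:
    search:
      cache_control:
        # Cache the Parquet footer.
        footer: true
        # Cache the column index.
        column_index: true
        # Leave the offset index uncached (the default).
        offset_index: false
```

All three options default to false, so omitting the cache_control section leaves caching of these objects disabled.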