Grafana Pyroscope bucket index
The bucket index is a per-tenant file that contains the list of blocks and block deletion marks in the storage. The bucket index is stored in the backend object storage, is periodically updated by the compactor, and used by store-gateways to discover blocks in the storage.
Store-gateways need an almost up-to-date view of the storage bucket. Without the bucket index, they must periodically scan the bucket to look for new blocks uploaded by ingesters or compactors, and for blocks deleted (or marked for deletion) by compactors.
When the bucket index is enabled, store-gateways periodically look up the per-tenant bucket index instead of scanning the bucket via “list objects” operations.
This provides the following benefits:
- Reduced number of API calls to the object storage by store-gateway
- No “list objects” storage API calls performed by store-gateway
Structure of the index
The bucket index contains:
- A list of the complete blocks of a tenant, including blocks marked for deletion. Partial blocks are excluded from the index.
- A list of block deletion marks.
- A Unix timestamp, with precision in seconds, of the last time the index was updated and written to the storage.
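The three parts of the index can be pictured as a small data structure. The following is a minimal sketch, not the actual Pyroscope format: the class and field names (`Block`, `DeletionMark`, `BucketIndex`, `updated_at`, and so on) and the gzip-compressed JSON encoding are assumptions made for illustration.

```python
import gzip
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Block:
    # Hypothetical fields; a real index entry may carry more metadata.
    block_id: str
    min_time: int
    max_time: int


@dataclass
class DeletionMark:
    block_id: str
    deletion_time: int  # Unix seconds at which the block was marked


@dataclass
class BucketIndex:
    blocks: list = field(default_factory=list)          # complete blocks only
    deletion_marks: list = field(default_factory=list)  # blocks marked for deletion
    updated_at: int = 0                                 # Unix seconds of the last write


def encode_index(index: BucketIndex) -> bytes:
    """Serialize the index as gzip-compressed JSON (an assumed encoding)."""
    return gzip.compress(json.dumps(asdict(index)).encode("utf-8"))


def decode_index(data: bytes) -> BucketIndex:
    """Rebuild a BucketIndex from bytes produced by encode_index."""
    raw = json.loads(gzip.decompress(data).decode("utf-8"))
    return BucketIndex(
        blocks=[Block(**b) for b in raw["blocks"]],
        deletion_marks=[DeletionMark(**m) for m in raw["deletion_marks"]],
        updated_at=raw["updated_at"],
    )
```

A block that is marked for deletion appears in both lists: once in `blocks` and once in `deletion_marks`.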
How it gets updated
The compactor periodically scans the bucket and uploads an updated bucket index to the storage.
You can configure the frequency with which the bucket index is updated via
The use of the bucket index is optional, but the index is built and updated by the compactor even if
This behavior ensures that the bucket index for any tenant exists and that query result consistency is guaranteed if a Grafana Pyroscope cluster operator enables the bucket index in a live cluster.
The overhead introduced by keeping the bucket index updated is not significant.
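The update step can be sketched as a function that turns one bucket listing into a fresh index. This is an illustrative sketch only: the object layout (`<block_id>/meta.json` for a complete block, `<block_id>/deletion-mark.json` for a deletion mark) and the function name `build_bucket_index` are assumptions, not the compactor's actual implementation.

```python
import time


def build_bucket_index(object_keys, now=None):
    """Build a per-tenant index from a flat listing of object keys.

    Assumed (hypothetical) layout: "<block_id>/meta.json" signals a complete
    block and "<block_id>/deletion-mark.json" signals a deletion mark.
    """
    complete = set()
    marked = set()
    for key in object_keys:
        block_id, _, name = key.partition("/")
        if name == "meta.json":
            complete.add(block_id)
        elif name == "deletion-mark.json":
            marked.add(block_id)

    return {
        # Partial blocks (no meta.json) are left out; blocks marked for
        # deletion stay in the list as long as they are complete.
        "blocks": sorted(complete),
        "deletion_marks": sorted(marked),
        # Unix timestamp in seconds of this update.
        "updated_at": int(now if now is not None else time.time()),
    }
```

For example, a listing of `a/meta.json`, `b/data.bin`, `c/meta.json`, and `c/deletion-mark.json` would yield blocks `a` and `c`, with `c` also listed under the deletion marks, while the partial block `b` is skipped.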
How it’s used by the store-gateway
The store-gateway, at startup and periodically, fetches the bucket index for each tenant that belongs to its shard, and uses it as the source of truth for the blocks and deletion marks in the storage. This removes the need to periodically scan the bucket to discover blocks belonging to its shard.
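As a rough illustration of index-based discovery, the sketch below selects the blocks and deletion marks for one store-gateway instance from an already fetched index, without listing the bucket. The sharding rule (a CRC32 hash of the block ID modulo the number of instances) and the names `owns_block` and `discover_blocks` are hypothetical; the real sharding logic may differ.

```python
import zlib


def owns_block(block_id, instance, num_instances):
    """Hypothetical sharding rule: stable hash of the block ID modulo instances."""
    return zlib.crc32(block_id.encode("utf-8")) % num_instances == instance


def discover_blocks(index, instance, num_instances):
    """Return (blocks, deletion_marks) owned by this instance, using only the index."""
    mine = [b for b in index["blocks"] if owns_block(b, instance, num_instances)]
    owned = set(mine)
    marks = [m for m in index["deletion_marks"] if m in owned]
    return mine, marks


# Example index, shaped as a plain dict for illustration.
index = {"blocks": ["a", "b", "c", "d"], "deletion_marks": ["c"], "updated_at": 1}
blocks, marks = discover_blocks(index, instance=0, num_instances=1)
```

With a single instance, every block in the index belongs to that instance, so the whole view of the tenant's blocks comes from one index read.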
Ingesters regularly add new blocks to the bucket as they offload data to long-term storage, and compactors subsequently compact these blocks and mark the original blocks for deletion. Actual deletion happens after the delay configured by -compactor.deletion-delay. An attempt to fetch a deleted block causes the query to fail. Therefore, in this context, an almost up-to-date view is a view that’s outdated by less than the value of