Important: This documentation is about an older version. It's relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.
About the Pyroscope architecture
Pyroscope has a microservices-based architecture. The system has multiple horizontally scalable microservices that can run separately and in parallel. Pyroscope microservices are called components.
Pyroscope’s design compiles the code for all components into a single binary.
The -target parameter controls which component or components that single binary behaves as. For a simple way to get started, you can also run Pyroscope in monolithic mode, with all components running simultaneously in one process.
For more information, refer to Deployment modes.
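The idea of one binary that takes on different roles can be illustrated with a minimal sketch. This is not Pyroscope's actual code: the component list, the comma-separated syntax, and the "all" value used here for monolithic mode are assumptions made for the example, and resolve_target is a hypothetical helper.

```python
# Illustrative component names; the real set is defined by Pyroscope itself.
ALL_COMPONENTS = (
    "distributor",
    "ingester",
    "querier",
    "query-frontend",
    "query-scheduler",
    "store-gateway",
    "compactor",
)


def resolve_target(target="all"):
    """Return the components a single process should start for a target string.

    Assumption: the target is a comma-separated list of component names,
    and the value "all" (or an empty string) selects every component.
    """
    requested = [name.strip() for name in target.split(",") if name.strip()]
    if not requested or "all" in requested:
        return list(ALL_COMPONENTS)
    unknown = [name for name in requested if name not in ALL_COMPONENTS]
    if unknown:
        raise ValueError("unknown component(s): " + ", ".join(unknown))
    # Keep a stable start order and drop duplicates.
    return [name for name in ALL_COMPONENTS if name in requested]
```

In this sketch, resolve_target("all") corresponds to monolithic mode, while resolve_target("ingester") corresponds to a process that runs only one component.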
Pyroscope components
Most components are stateless and do not require any data persisted between process restarts. Some components are stateful and rely on non-volatile storage to prevent data loss between process restarts. For details about each component, see its page in Components.
The write path
Ingesters receive incoming profiles from the distributors. Each push request belongs to a tenant, and the ingester appends the received profiles to the specific per-tenant Pyroscope database that is stored on the local disk.
The per-tenant Pyroscope database is lazily created in each ingester as soon as the first profiles are received for that tenant.
The in-memory profiles are periodically flushed to disk, and a new block is created.
For more information, refer to Ingester.
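The write path described above can be sketched as follows. This is a simplified model, not the ingester's implementation: the TenantDB and Ingester classes, their method names, and the use of in-memory lists as a stand-in for on-disk blocks are all assumptions for illustration.

```python
class TenantDB:
    """Hypothetical per-tenant database: an in-memory head plus flushed blocks."""

    def __init__(self, tenant_id):
        self.tenant_id = tenant_id
        self.head = []    # profiles received since the last flush
        self.blocks = []  # stand-in for blocks written to local disk

    def append(self, profile):
        self.head.append(profile)

    def flush(self):
        """Turn the in-memory profiles into a new block, if there are any."""
        if not self.head:
            return None
        block = tuple(self.head)
        self.blocks.append(block)
        self.head = []
        return block


class Ingester:
    """Hypothetical ingester that creates tenant databases lazily."""

    def __init__(self):
        self.dbs = {}

    def push(self, tenant_id, profiles):
        # The database for a tenant is created on the first push for that tenant.
        db = self.dbs.get(tenant_id)
        if db is None:
            db = self.dbs[tenant_id] = TenantDB(tenant_id)
        for profile in profiles:
            db.append(profile)

    def flush_all(self):
        # In a real system this would run periodically; here it is called by hand.
        return {tenant_id: db.flush() for tenant_id, db in self.dbs.items()}
```

A caller would push profiles for a tenant and then trigger a flush, after which the tenant's head is empty and one new block exists.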
Series sharding and replication
By default, each profile series is replicated to three ingesters, and each ingester writes its own block to the long-term storage. The compactor merges blocks from multiple ingesters into a single block and removes duplicate samples. Block compaction significantly reduces storage utilization.
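To make the effect of replication and compaction concrete, here is a minimal sketch. It is not how Pyroscope places series or compacts blocks: pick_ingesters uses a simple hash-and-walk over a sorted list as a stand-in for a real hash ring, and compact treats a block as a list of dictionaries with hypothetical "series" and "timestamp" keys.

```python
import hashlib


def pick_ingesters(series_key, ingesters, replication_factor=3):
    """Choose distinct ingesters for a series (simplified stand-in for a hash ring)."""
    ordered = sorted(ingesters)
    digest = hashlib.sha256(series_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ordered)
    count = min(replication_factor, len(ordered))
    # Walk forward from the hashed position so the replicas are distinct.
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


def compact(blocks):
    """Merge blocks into one and keep a single copy of each duplicated sample."""
    merged = {}
    for block in blocks:
        for sample in block:
            key = (sample["series"], sample["timestamp"])
            merged.setdefault(key, sample)  # first copy wins, later copies are dropped
    return [merged[key] for key in sorted(merged)]
```

With a replication factor of three, the same sample appears in three source blocks, and the merged block in this sketch keeps only one copy of it.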
The read path
Queries coming into Pyroscope arrive at the query-frontend component, which is responsible for accelerating queries and dispatching them to the query-scheduler.
The query-scheduler maintains a queue of queries and ensures that each tenant’s queries are fairly executed.
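One common way to get per-tenant fairness is to keep a separate queue for each tenant and serve the tenants in round-robin order. The sketch below shows that technique under stated assumptions: FairQueue and its methods are hypothetical, and the source does not say which fairness algorithm the query-scheduler actually uses.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Hypothetical round-robin queue: one FIFO per tenant, tenants served in turn."""

    def __init__(self):
        self._queues = OrderedDict()

    def enqueue(self, tenant, query):
        self._queues.setdefault(tenant, deque()).append(query)

    def dequeue(self):
        """Return (tenant, query) for the next tenant in turn, or None if empty."""
        if not self._queues:
            return None
        tenant, queue = next(iter(self._queues.items()))
        query = queue.popleft()
        del self._queues[tenant]
        if queue:
            # Tenants with remaining work go to the back of the rotation.
            self._queues[tenant] = queue
        return tenant, query
```

In this sketch, a tenant that submits many queries at once cannot starve another tenant: after each dequeue, the served tenant moves behind every other tenant that still has queued work.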
The queriers act as workers, pulling queries from the queue in the query-scheduler. The queriers connect to the ingesters to fetch all the data needed to execute a query. For more information about how the query is executed, refer to querier.
Depending on the time window selected, the querier involves ingesters for recent data and store-gateways for data from long-term storage.
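The time-window decision can be sketched as a split of the query range at a cutoff. This is an assumption-heavy simplification: plan_query, the ingester_window parameter, and the single hard cutoff are hypothetical, and a real deployment may query both sources over an overlapping range.

```python
def plan_query(start, end, now, ingester_window):
    """Split a query range between ingesters (recent) and store-gateways (older).

    All arguments are numbers in the same time unit. Data newer than
    now - ingester_window is assumed to be served by ingesters.
    """
    if start > end:
        raise ValueError("start must not be after end")
    boundary = now - ingester_window
    plan = {}
    if start < boundary:
        plan["store-gateway"] = (start, min(end, boundary))
    if end >= boundary:
        plan["ingester"] = (max(start, boundary), end)
    return plan
```

For example, with now=1000 and ingester_window=100, a query over 800 to 1000 is split at 900 in this sketch, while a query over 0 to 500 goes only to the store-gateways.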
Long-term storage
The Pyroscope storage format is described in detail on the block format page. The format stores each tenant's profiles in their own on-disk blocks. Each on-disk block directory contains an index file, a file containing metadata, and the Parquet tables.
Pyroscope requires one of the following object stores for the block files:
- Amazon S3
- Google Cloud Storage
- Microsoft Azure Storage
- OpenStack Swift
- Local Filesystem (single node only)
For more information, refer to configure object storage and configure disk storage.