pyroscope.scrape
pyroscope.scrape
collects pprof performance profiles for a given set of HTTP targets
.
pyroscope.scrape mimics the scraping behavior of prometheus.scrape.
Similarly to how Prometheus scrapes metrics via HTTP, pyroscope.scrape
collects profiles via HTTP requests.
Unlike Prometheus, which usually only scrapes one /metrics
endpoint per target, pyroscope.scrape
may need to scrape multiple endpoints for the same target.
This is because different types of profiles are scraped on different endpoints.
For example, “mutex” profiles may be scraped on a /debug/pprof/delta_mutex
HTTP endpoint, whereas memory consumption may be scraped on a /debug/pprof/allocs
HTTP endpoint.
The profile paths, protocol scheme, scrape interval, scrape timeout, query parameters, as well as any other settings can be configured within pyroscope.scrape
.
The pyroscope.scrape component regards a scrape as successful if the target responded with an HTTP 200 OK status code and returned the body of a valid pprof profile.
If a scrape request fails, the debug UI for pyroscope.scrape
will show:
- Detailed information about the failure.
- The time of the last successful scrape.
- The labels last used for scraping.
The scraped performance profiles can be forwarded to components such as pyroscope.write
via the forward_to
argument.
Multiple pyroscope.scrape
components can be specified by giving them different labels.
Usage
pyroscope.scrape "LABEL" {
  targets    = TARGET_LIST
  forward_to = RECEIVER_LIST
}
Arguments
pyroscope.scrape
starts a new scrape job to scrape all of the input targets.
Multiple scrape jobs can be started for a single input target
when scraping multiple profile types.
The list of arguments that can be used to configure the block is presented below.
Any omitted arguments take on their default values.
If conflicting arguments are being passed (for example, configuring both bearer_token
and bearer_token_file
), then pyroscope.scrape
will fail to start and will report an error.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
targets | list(map(string)) | List of targets to scrape. | | yes |
forward_to | list(ProfilesReceiver) | List of receivers to send scraped profiles to. | | yes |
job_name | string | The job name to override the job label with. | component name | no |
params | map(list(string)) | A set of query parameters with which the target is scraped. | | no |
scrape_interval | duration | How frequently to scrape the targets of this scrape configuration. | "15s" | no |
scrape_timeout | duration | The timeout for scraping targets of this configuration. Must be larger than scrape_interval. | "18s" | no |
delta_profiling_duration | duration | The duration for a delta profiling to be scraped. Must be larger than 1 second. | "14s" | no |
scheme | string | The URL scheme with which to fetch profiles from targets. | "http" | no |
bearer_token_file | string | File containing a bearer token to authenticate with. | | no |
bearer_token | secret | Bearer token to authenticate with. | | no |
enable_http2 | bool | Whether HTTP2 is supported for requests. | true | no |
follow_redirects | bool | Whether redirects returned by the server should be followed. | true | no |
proxy_url | string | HTTP proxy to send requests through. | | no |
no_proxy | string | Comma-separated list of IP addresses, CIDR notations, and domain names to exclude from proxying. | | no |
proxy_from_environment | bool | Use the proxy URL indicated by environment variables. | false | no |
proxy_connect_header | map(list(secret)) | Specifies headers to send to proxies during CONNECT requests. | | no |
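At most, one of the following can be provided:
- bearer_token argument.
- bearer_token_file argument.
- basic_auth block.
- authorization block.
- oauth2 block.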
no_proxy
can contain IPs, CIDR notations, and domain names. IP and domain names can contain port numbers.
proxy_url
must be configured if no_proxy
is configured.
proxy_from_environment
uses the environment variables HTTP_PROXY, HTTPS_PROXY, and NO_PROXY (or the lowercase versions thereof).
Requests use the proxy from the environment variable matching their scheme, unless excluded by NO_PROXY.
proxy_url
and no_proxy
must not be configured if proxy_from_environment
is configured.
proxy_connect_header
should only be configured if proxy_url
or proxy_from_environment
are configured.
job_name
argument
job_name
defaults to the component’s unique identifier.
For example, the job_name
of pyroscope.scrape "local" { ... }
will be "pyroscope.scrape.local"
.
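As a minimal sketch, the default can be overridden by setting job_name explicitly. The target address, the job name alloy-profiles, and the pyroscope.write.local receiver (assumed to be defined elsewhere in the configuration) are illustrative assumptions, not values required by the component.

```alloy
// Hypothetical example: override the default job label.
pyroscope.scrape "local" {
  targets    = [{"__address__" = "localhost:12345", "service_name" = "alloy"}]
  forward_to = [pyroscope.write.local.receiver]

  // Without this line, job_name would default to "pyroscope.scrape.local".
  job_name = "alloy-profiles"
}
```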
targets
argument
The list of targets
can be provided statically, dynamically, or a combination of both.
The special __address__
label must always be present and corresponds to the
<host>:<port>
that is used for the scrape request.
Labels starting with a double underscore (__
) are treated as internal, and are removed prior to scraping.
The special label service_name
is required and must always be present.
If it’s not specified, pyroscope.scrape will attempt to infer it from one of the following sources, in this order:
- __meta_kubernetes_pod_annotation_pyroscope_io_service_name, which is a pyroscope.io/service_name pod annotation.
- __meta_kubernetes_namespace and __meta_kubernetes_pod_container_name
- __meta_docker_container_name
- __meta_dockerswarm_container_label_service_name or __meta_dockerswarm_service_name
If service_name
is not specified and could not be inferred, then it is set to unspecified
.
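The following sketch shows a static target that sets both special labels explicitly, so that no inference of service_name is needed. The component label demo, the address localhost:6060, and the service name checkout are hypothetical, and pyroscope.write.local is assumed to be defined elsewhere in the configuration.

```alloy
pyroscope.scrape "demo" {
  targets = [
    {
      // Required: the <host>:<port> used for the scrape request.
      "__address__"  = "localhost:6060",
      // Set explicitly, so the __meta_* fallbacks are not consulted.
      "service_name" = "checkout",
    },
  ]
  forward_to = [pyroscope.write.local.receiver]
}
```

If service_name were omitted from this static target, none of the __meta_* labels would be available either, so the value would be expected to fall back to unspecified.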
The following labels are automatically injected into the scraped profiles so that they can be linked to a scrape target:
Label | Description |
---|---|
"job" | The job_name that the target belongs to. |
"instance" | The __address__ or <host>:<port> of the scrape target’s URL. |
"service_name" | The inferred Pyroscope service name. |
scrape_interval
argument
The scrape_interval
typically refers to the frequency with which Alloy collects performance profiles from the monitored targets.
It represents the time interval between consecutive scrapes or data collection events.
This parameter is important for controlling the trade-off between resource usage and the freshness of the collected data.
If scrape_interval
is short:
- Advantages:
- Fewer profiles may be lost if the application being scraped crashes.
- Disadvantages:
- Greater consumption of CPU, memory, and network resources during scrapes and remote writes.
- The backend database (Pyroscope) will consume more storage space.
If scrape_interval
is long:
- Advantages:
- Lower resource consumption.
- Disadvantages:
- More profiles may be lost if the application being scraped crashes.
- If the delta argument is set to true, the batch size of each remote write to Pyroscope may be bigger. The Pyroscope database may need to be tuned with higher limits.
- If the delta argument is set to true, there is a larger risk of reaching the HTTP server timeouts of the application being scraped.
For example, consider this situation:
- pyroscope.scrape is configured with a scrape_interval of "60s".
- The application being scraped is running an HTTP server with a timeout of 30 seconds.
- Any scrape HTTP requests where the delta argument is set to true will fail, because they will attempt to run for 59 seconds.
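One possible way to avoid this situation, sketched below, is to keep the long scrape_interval but shorten the delta request with delta_profiling_duration. The target address, the service name, and the specific durations are assumptions chosen for illustration, and pyroscope.write.local is assumed to be defined elsewhere in the configuration.

```alloy
// Sketch: keep delta scrapes shorter than the target's 30-second HTTP server timeout.
pyroscope.scrape "long_interval" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]

  scrape_interval = "60s"
  // Per the arguments table, scrape_timeout must be larger than scrape_interval.
  scrape_timeout  = "65s"
  // Delta profiles are then expected to request seconds=20 instead of seconds=59.
  delta_profiling_duration = "20s"
}
```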
Blocks
The following blocks are supported inside the definition of pyroscope.scrape
:
Hierarchy | Block | Description | Required |
---|---|---|---|
basic_auth | basic_auth | Configure basic_auth for authenticating to targets. | no |
authorization | authorization | Configure generic authorization to targets. | no |
oauth2 | oauth2 | Configure OAuth2 for authenticating to targets. | no |
oauth2 > tls_config | tls_config | Configure TLS settings for connecting to targets via OAuth2. | no |
tls_config | tls_config | Configure TLS settings for connecting to targets. | no |
profiling_config | profiling_config | Configure profiling settings for the scrape job. | no |
profiling_config > profile.memory | profile.memory | Collect memory profiles. | no |
profiling_config > profile.block | profile.block | Collect profiles on blocks. | no |
profiling_config > profile.goroutine | profile.goroutine | Collect goroutine profiles. | no |
profiling_config > profile.mutex | profile.mutex | Collect mutex profiles. | no |
profiling_config > profile.process_cpu | profile.process_cpu | Collect CPU profiles. | no |
profiling_config > profile.fgprof | profile.fgprof | Collect fgprof profiles. | no |
profiling_config > profile.godeltaprof_memory | profile.godeltaprof_memory | Collect godeltaprof memory profiles. | no |
profiling_config > profile.godeltaprof_mutex | profile.godeltaprof_mutex | Collect godeltaprof mutex profiles. | no |
profiling_config > profile.godeltaprof_block | profile.godeltaprof_block | Collect godeltaprof block profiles. | no |
profiling_config > profile.custom | profile.custom | Collect custom profiles. | no |
clustering | clustering | Configure the component for when Alloy is running in clustered mode. | no |
The >
symbol indicates deeper levels of nesting.
For example, oauth2 > tls_config
refers to a tls_config
block defined inside an oauth2
block.
Any omitted blocks take on their default values.
For example, if profile.mutex
is not specified in the config, the defaults documented in profile.mutex will be used.
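The nesting notation can be sketched as follows. This example assumes a hypothetical OAuth2 provider at auth.example.com and hypothetical file paths; only the argument and block names come from the tables in this document, and pyroscope.write.local is assumed to be defined elsewhere.

```alloy
pyroscope.scrape "with_oauth2" {
  targets    = [{"__address__" = "app.example.com:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]
  scheme     = "https"

  oauth2 {
    client_id          = "example-client"                         // hypothetical
    client_secret_file = "/etc/alloy/oauth2_client_secret"        // hypothetical path
    token_url          = "https://auth.example.com/oauth2/token"  // hypothetical

    // oauth2 > tls_config: TLS settings used for the OAuth2 connection.
    tls_config {
      ca_file = "/etc/alloy/oauth2_ca.pem" // hypothetical path
    }
  }

  // Top-level tls_config: TLS settings for connecting to the targets.
  tls_config {
    ca_file = "/etc/alloy/targets_ca.pem" // hypothetical path
  }
}
```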
basic_auth block
Name | Type | Description | Default | Required |
---|---|---|---|---|
password_file | string | File containing the basic auth password. | | no |
password | secret | Basic auth password. | | no |
username | string | Basic auth username. | | no |
password
and password_file
are mutually exclusive, and only one can be provided inside a basic_auth
block.
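A minimal sketch of a basic_auth block, reading the password from a file rather than embedding it in the configuration. The username, file path, and target are hypothetical, and pyroscope.write.local is assumed to be defined elsewhere.

```alloy
pyroscope.scrape "with_basic_auth" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]

  basic_auth {
    username      = "profiler"                       // hypothetical
    // Use either password or password_file, never both.
    password_file = "/etc/alloy/basic_auth_password" // hypothetical path
  }
}
```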
authorization block
Name | Type | Description | Default | Required |
---|---|---|---|---|
credentials_file | string | File containing the secret value. | | no |
credentials | secret | Secret value. | | no |
type | string | Authorization type, for example, “Bearer”. | | no |
credentials
and credentials_file
are mutually exclusive, and only one can be provided inside an authorization
block.
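A minimal sketch of an authorization block using a bearer-style credential stored in a file. The file path and target are hypothetical, and pyroscope.write.local is assumed to be defined elsewhere.

```alloy
pyroscope.scrape "with_authorization" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]

  authorization {
    type             = "Bearer"
    // Use either credentials or credentials_file, never both.
    credentials_file = "/etc/alloy/scrape_token" // hypothetical path
  }
}
```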
oauth2 block
Name | Type | Description | Default | Required |
---|---|---|---|---|
client_id | string | OAuth2 client ID. | | no |
client_secret_file | string | File containing the OAuth2 client secret. | | no |
client_secret | secret | OAuth2 client secret. | | no |
endpoint_params | map(string) | Optional parameters to append to the token URL. | | no |
proxy_url | string | HTTP proxy to send requests through. | | no |
no_proxy | string | Comma-separated list of IP addresses, CIDR notations, and domain names to exclude from proxying. | | no |
proxy_from_environment | bool | Use the proxy URL indicated by environment variables. | false | no |
proxy_connect_header | map(list(secret)) | Specifies headers to send to proxies during CONNECT requests. | | no |
scopes | list(string) | List of scopes to authenticate with. | | no |
token_url | string | URL to fetch the token from. | | no |
client_secret
and client_secret_file
are mutually exclusive, and only one can be provided inside an oauth2
block.
The oauth2
block may also contain a separate tls_config
sub-block.
no_proxy
can contain IPs, CIDR notations, and domain names. IP and domain names can contain port numbers.
proxy_url
must be configured if no_proxy
is configured.
proxy_from_environment
uses the environment variables HTTP_PROXY, HTTPS_PROXY, and NO_PROXY (or the lowercase versions thereof).
Requests use the proxy from the environment variable matching their scheme, unless excluded by NO_PROXY.
proxy_url
and no_proxy
must not be configured if proxy_from_environment
is configured.
proxy_connect_header
should only be configured if proxy_url
or proxy_from_environment
are configured.
tls_config block
Name | Type | Description | Default | Required |
---|---|---|---|---|
ca_pem | string | CA PEM-encoded text to validate the server with. | | no |
ca_file | string | CA certificate to validate the server with. | | no |
cert_pem | string | Certificate PEM-encoded text for client authentication. | | no |
cert_file | string | Certificate file for client authentication. | | no |
insecure_skip_verify | bool | Disables validation of the server certificate. | | no |
key_file | string | Key file for client authentication. | | no |
key_pem | secret | Key PEM-encoded text for client authentication. | | no |
min_version | string | Minimum acceptable TLS version. | | no |
server_name | string | ServerName extension to indicate the name of the server. | | no |
The following pairs of arguments are mutually exclusive and can’t both be set simultaneously:
- ca_pem and ca_file
- cert_pem and cert_file
- key_pem and key_file
When configuring client authentication, both the client certificate (using cert_pem
or cert_file
) and the client key (using key_pem
or key_file
) must be provided.
When min_version
isn’t provided, the minimum acceptable TLS version is inherited from Go’s default minimum version, TLS 1.2.
If min_version is provided, it must be set to one of the following strings:
- "TLS10" (TLS 1.0)
- "TLS11" (TLS 1.1)
- "TLS12" (TLS 1.2)
- "TLS13" (TLS 1.3)
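The rules above can be sketched as a client-authenticated (mutual TLS) scrape. All file paths and the target are hypothetical, and pyroscope.write.local is assumed to be defined elsewhere.

```alloy
pyroscope.scrape "with_mtls" {
  targets    = [{"__address__" = "app.example.com:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]
  scheme     = "https"

  tls_config {
    // One CA source only: ca_file here, so ca_pem must not be set.
    ca_file     = "/etc/alloy/tls/ca.pem"     // hypothetical path
    // Client authentication needs both a certificate and a key.
    cert_file   = "/etc/alloy/tls/client.pem" // hypothetical path
    key_file    = "/etc/alloy/tls/client.key" // hypothetical path
    // Raise the minimum version above the TLS 1.2 default.
    min_version = "TLS13"
  }
}
```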
profiling_config block
The profiling_config
block configures the profiling settings when scraping targets.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
path_prefix | string | The path prefix to use when scraping targets. | | no |
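As a hedged sketch, path_prefix might be used when an application serves its pprof endpoints under a sub-path. The prefix /myapp and the target are hypothetical, and the exact way the prefix is joined with each profile path is an assumption here, not something this document specifies.

```alloy
pyroscope.scrape "prefixed" {
  targets    = [{"__address__" = "localhost:8080", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]

  profiling_config {
    // Assumption: the memory profile would then be requested from a path
    // such as /myapp/debug/pprof/allocs instead of /debug/pprof/allocs.
    path_prefix = "/myapp"
  }
}
```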
profile.memory block
The profile.memory
block collects profiles on memory consumption.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | true | no |
path | string | The path to the profile type on the target. | "/debug/pprof/allocs" | no |
delta | boolean | Whether to scrape the profile as a delta. | false | no |
For more information about the delta
argument, see the delta argument section.
profile.block block
The profile.block
block collects profiles on process blocking.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | true | no |
path | string | The path to the profile type on the target. | "/debug/pprof/block" | no |
delta | boolean | Whether to scrape the profile as a delta. | false | no |
For more information about the delta
argument, see the delta argument section.
profile.goroutine block
The profile.goroutine
block collects profiles on the number of goroutines.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | true | no |
path | string | The path to the profile type on the target. | "/debug/pprof/goroutine" | no |
delta | boolean | Whether to scrape the profile as a delta. | false | no |
For more information about the delta
argument, see the delta argument section.
profile.mutex block
The profile.mutex
block collects profiles on mutexes.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | true | no |
path | string | The path to the profile type on the target. | "/debug/pprof/mutex" | no |
delta | boolean | Whether to scrape the profile as a delta. | false | no |
For more information about the delta
argument, see the delta argument section.
profile.process_cpu block
The profile.process_cpu
block collects profiles on CPU consumption for the process.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | true | no |
path | string | The path to the profile type on the target. | "/debug/pprof/profile" | no |
delta | boolean | Whether to scrape the profile as a delta. | true | no |
For more information about the delta
argument, see the delta argument section.
profile.fgprof block
The profile.fgprof
block collects profiles from an fgprof endpoint.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | false | no |
path | string | The path to the profile type on the target. | "/debug/fgprof" | no |
delta | boolean | Whether to scrape the profile as a delta. | true | no |
For more information about the delta
argument, see the delta argument section.
profile.godeltaprof_memory block
The profile.godeltaprof_memory
block collects profiles from a godeltaprof memory endpoint. The delta is computed on the target.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | false | no |
path | string | The path to the profile type on the target. | "/debug/pprof/delta_heap" | no |
profile.godeltaprof_mutex block
The profile.godeltaprof_mutex
block collects profiles from a godeltaprof mutex endpoint.
The delta is computed on the target.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | false | no |
path | string | The path to the profile type on the target. | "/debug/pprof/delta_mutex" | no |
profile.godeltaprof_block block
The profile.godeltaprof_block
block collects profiles from a godeltaprof block endpoint. The delta is computed on the target.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | false | no |
path | string | The path to the profile type on the target. | "/debug/pprof/delta_block" | no |
profile.custom block
The profile.custom
block allows for collecting profiles from custom endpoints. Blocks must be specified with a label:
profile.custom "PROFILE_TYPE" {
  enabled = true
  path    = "PROFILE_PATH"
}
Multiple profile.custom
blocks can be specified.
Labels assigned to profile.custom
blocks must be unique across the component.
The following arguments are supported:
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | boolean | Enable this profile type to be scraped. | | yes |
path | string | The path to the profile type on the target. | | yes |
delta | boolean | Whether to scrape the profile as a delta. | false | no |
When the delta
argument is true
, a seconds
query parameter is automatically added to requests.
The seconds
used will be equal to scrape_interval - 1
.
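A sketch of a custom delta profile follows. The block label example_profile, the path /debug/pprof/example, and the target are hypothetical; pyroscope.write.local is assumed to be defined elsewhere.

```alloy
pyroscope.scrape "custom" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]

  profiling_config {
    profile.custom "example_profile" {
      enabled = true
      path    = "/debug/pprof/example" // hypothetical endpoint
      // With the default scrape_interval of "15s", the request is expected
      // to become /debug/pprof/example?seconds=14.
      delta   = true
    }
  }
}
```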
clustering block
Name | Type | Description | Default | Required |
---|---|---|---|---|
enabled | bool | Enables sharing targets with other cluster nodes. | false | yes |
When Alloy is using clustering, and enabled is set to true, then this pyroscope.scrape component instance opts in to participating in the cluster to distribute scrape load between all cluster nodes.
Clustering causes the set of targets to be locally filtered down to a unique subset per node, where each node is assigned roughly the same number of targets. If the state of the cluster changes, such as when a new node joins, then the subset of targets to scrape per node is recalculated.
When clustering mode is enabled, all Alloy instances participating in the cluster must use the same configuration file and have access to the same service discovery APIs.
If Alloy is not running in clustered mode, this block is a no-op.
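A minimal sketch of opting a component in to clustering. The discovery.http component label and URL are hypothetical, and pyroscope.write.local is assumed to be defined elsewhere.

```alloy
discovery.http "dynamic_targets" {
  url = "https://example.com/scrape_targets" // hypothetical URL
}

pyroscope.scrape "clustered" {
  targets    = discovery.http.dynamic_targets.targets
  forward_to = [pyroscope.write.local.receiver]

  clustering {
    // Each cluster node scrapes only its own subset of the targets.
    // This has no effect when Alloy is not running in clustered mode.
    enabled = true
  }
}
```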
Common configuration
delta
argument
When the delta
argument is false
, the pprof HTTP query will be instantaneous.
When the delta
argument is true
:
- The pprof HTTP query will run for a certain amount of time.
- A seconds parameter is automatically added to the HTTP request.
- The default value for the seconds query parameter is scrape_interval - 1. If you set delta_profiling_duration, then seconds is assigned the same value as delta_profiling_duration. However, the delta_profiling_duration cannot be larger than scrape_interval. For example, if you set scrape_interval to "15s", then seconds defaults to 14s. If you set delta_profiling_duration to 16s, then scrape_interval must be set to at least 17s. If the HTTP endpoint is /debug/pprof/profile, then the HTTP query will become /debug/pprof/profile?seconds=14.
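The 16s and 17s case mentioned above can be written out as a short sketch. The target is hypothetical, pyroscope.write.local is assumed to be defined elsewhere, and the resulting URL in the comment is what the rules above would be expected to produce.

```alloy
pyroscope.scrape "delta_duration" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "example-app"}]
  forward_to = [pyroscope.write.local.receiver]

  // delta_profiling_duration cannot be larger than scrape_interval,
  // so a 16s duration needs an interval of at least 17s.
  scrape_interval          = "17s"
  delta_profiling_duration = "16s"

  // The default CPU profile (delta = true) is then expected to be requested as:
  //   /debug/pprof/profile?seconds=16
}
```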
Exported fields
pyroscope.scrape
does not export any fields that can be referenced by other components.
Component health
pyroscope.scrape
is only reported as unhealthy if given an invalid configuration.
Debug information
pyroscope.scrape
reports the status of the last scrape for each configured scrape job on the component’s debug endpoint.
Debug metrics
pyroscope_fanout_latency
(histogram): Write latency for sending to direct and indirect components.
Examples
Default endpoints of static targets
The following example sets up a scrape job for a statically configured list of targets: Alloy itself and Pyroscope.
The scraped profiles are sent to pyroscope.write
which remote writes them to a Pyroscope database.
pyroscope.scrape "local" {
  targets = [
    {"__address__" = "localhost:4040", "service_name" = "pyroscope"},
    {"__address__" = "localhost:12345", "service_name" = "alloy"},
  ]
  forward_to = [pyroscope.write.local.receiver]
}
pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4040"
  }
}
These endpoints will be scraped every 15 seconds:
http://localhost:4040/debug/pprof/allocs
http://localhost:4040/debug/pprof/block
http://localhost:4040/debug/pprof/goroutine
http://localhost:4040/debug/pprof/mutex
http://localhost:4040/debug/pprof/profile?seconds=14
http://localhost:12345/debug/pprof/allocs
http://localhost:12345/debug/pprof/block
http://localhost:12345/debug/pprof/goroutine
http://localhost:12345/debug/pprof/mutex
http://localhost:12345/debug/pprof/profile?seconds=14
seconds=14
is added to the /debug/pprof/profile
endpoint, because:
- The delta argument of the profile.process_cpu block is true by default.
- scrape_interval is "15s" by default.
The /debug/fgprof
endpoint will not be scraped, because the enabled
argument of the profile.fgprof
block is false
by default.
Default endpoints of dynamic targets
discovery.http "dynamic_targets" {
  url              = "https://example.com/scrape_targets"
  refresh_interval = "15s"
}
pyroscope.scrape "local" {
  // discovery.http already exports a list of targets, so it is passed as-is.
  targets    = discovery.http.dynamic_targets.targets
  forward_to = [pyroscope.write.local.receiver]
}
pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4040"
  }
}
Default endpoints of static and dynamic targets
discovery.http "dynamic_targets" {
  url              = "https://example.com/scrape_targets"
  refresh_interval = "15s"
}
pyroscope.scrape "local" {
  targets = array.concat([
    {"__address__" = "localhost:4040", "service_name" = "pyroscope"},
    {"__address__" = "localhost:12345", "service_name" = "alloy"},
  ], discovery.http.dynamic_targets.targets)
  forward_to = [pyroscope.write.local.receiver]
}
pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4040"
  }
}
Enabling and disabling profiles
pyroscope.scrape "local" {
  targets = [
    {"__address__" = "localhost:12345", "service_name" = "alloy"},
  ]
  profiling_config {
    profile.fgprof {
      enabled = true
    }
    profile.block {
      enabled = false
    }
    profile.mutex {
      enabled = false
    }
  }
  forward_to = [pyroscope.write.local.receiver]
}
These endpoints will be scraped every 15 seconds:
http://localhost:12345/debug/pprof/allocs
http://localhost:12345/debug/pprof/goroutine
http://localhost:12345/debug/pprof/profile?seconds=14
http://localhost:12345/debug/fgprof?seconds=14
These endpoints will NOT be scraped because they are explicitly disabled:
http://localhost:12345/debug/pprof/block
http://localhost:12345/debug/pprof/mutex
Compatible components
pyroscope.scrape
can accept arguments from the following components:
- Components that export Targets
- Components that export Pyroscope ProfilesReceiver
Note
Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.