pyroscope.scrape

BETA: This is a beta component. Beta components are subject to breaking changes, and may be replaced with equivalent functionality that covers the same use case.

pyroscope.scrape collects pprof performance profiles for a given set of HTTP targets.

pyroscope.scrape mimics the scraping behavior of prometheus.scrape. Similarly to how Prometheus scrapes metrics via HTTP, pyroscope.scrape collects profiles via HTTP requests.

Unlike Prometheus, which usually only scrapes one /metrics endpoint per target, pyroscope.scrape may need to scrape multiple endpoints for the same target. This is because different types of profiles are scraped on different endpoints. For example, “mutex” profiles may be scraped on a /debug/pprof/delta_mutex HTTP endpoint, whereas memory consumption may be scraped on a /debug/pprof/allocs HTTP endpoint.

The profile paths, protocol scheme, scrape interval, scrape timeout, query parameters, as well as any other settings can be configured within pyroscope.scrape.

The pyroscope.scrape component regards a scrape as successful if the target responds with an HTTP 200 OK status code and returns a body containing a valid pprof profile.

If a scrape request fails, the debug UI for pyroscope.scrape will show:

  • Detailed information about the failure.
  • The time of the last successful scrape.
  • The labels last used for scraping.

The scraped performance profiles can be forwarded to components such as pyroscope.write via the forward_to argument.

Multiple pyroscope.scrape components can be specified by giving them different labels.
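
As a minimal sketch, two components with different labels can scrape different targets and forward to the same receiver. The labels "frontend" and "backend", the addresses, and the service names are hypothetical; the pyroscope.write "local" component is assumed to be defined as in the Examples section.

river
// Two independent scrape components, distinguished by their labels.
pyroscope.scrape "frontend" {
  targets    = [{"__address__" = "localhost:8080", "service_name" = "frontend"}]
  forward_to = [pyroscope.write.local.receiver]
}

pyroscope.scrape "backend" {
  targets    = [{"__address__" = "localhost:9090", "service_name" = "backend"}]
  forward_to = [pyroscope.write.local.receiver]
}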

Usage

river
pyroscope.scrape "LABEL" {
  targets    = TARGET_LIST
  forward_to = RECEIVER_LIST
}

Arguments

pyroscope.scrape starts a new scrape job to scrape all of the input targets. Multiple scrape jobs can be started for a single input target when scraping multiple profile types.

The list of arguments that can be used to configure the block is presented below.

Any omitted arguments take on their default values. If conflicting arguments are being passed (for example, configuring both bearer_token and bearer_token_file), then pyroscope.scrape will fail to start and will report an error.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| targets | list(map(string)) | List of targets to scrape. | | yes |
| forward_to | list(ProfilesReceiver) | List of receivers to send scraped profiles to. | | yes |
| job_name | string | The job name to override the job label with. | component name | no |
| params | map(list(string)) | A set of query parameters with which the target is scraped. | | no |
| scrape_interval | duration | How frequently to scrape the targets of this scrape configuration. | "15s" | no |
| scrape_timeout | duration | The timeout for scraping targets of this configuration. Must be larger than scrape_interval. | "18s" | no |
| scheme | string | The URL scheme with which to fetch profiles from targets. | "http" | no |
| bearer_token_file | string | File containing a bearer token to authenticate with. | | no |
| bearer_token | secret | Bearer token to authenticate with. | | no |
| enable_http2 | bool | Whether HTTP2 is supported for requests. | true | no |
| follow_redirects | bool | Whether redirects returned by the server should be followed. | true | no |
| proxy_url | string | HTTP proxy to send requests through. | | no |
| no_proxy | string | Comma-separated list of IP addresses, CIDR notations, and domain names to exclude from proxying. | | no |
| proxy_from_environment | bool | Use the proxy URL indicated by environment variables. | false | no |
| proxy_connect_header | map(list(secret)) | Specifies headers to send to proxies during CONNECT requests. | | no |

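At most, one of the following can be provided:

  • bearer_token argument.
  • bearer_token_file argument.
  • basic_auth block.
  • authorization block.
  • oauth2 block.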

no_proxy can contain IPs, CIDR notations, and domain names. IP and domain names can contain port numbers. proxy_url must be configured if no_proxy is configured.

proxy_from_environment uses the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY (or the lowercase versions thereof). Requests use the proxy from the environment variable matching their scheme, unless excluded by NO_PROXY. proxy_url and no_proxy must not be configured if proxy_from_environment is configured.

proxy_connect_header should only be configured if proxy_url or proxy_from_environment is configured.
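
The following sketch combines several of these arguments in a way that respects the constraints above: only one authentication method is set, and no_proxy is accompanied by proxy_url. The host names, file path, and proxy address are hypothetical, and pyroscope.write "local" is assumed to be defined as in the Examples section.

river
pyroscope.scrape "secured" {
  targets    = [{"__address__" = "app.example.com:8443", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  scheme = "https"

  // Hypothetical path. Setting bearer_token as well would be a conflict.
  bearer_token_file = "/etc/agent/bearer-token"

  // Hypothetical proxy. no_proxy requires proxy_url, and neither may be
  // combined with proxy_from_environment.
  proxy_url = "http://proxy.example.com:3128"
  no_proxy  = "localhost,10.0.0.0/8"
}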

job_name argument

job_name defaults to the component’s unique identifier.

For example, the job_name of pyroscope.scrape "local" { ... } will be "pyroscope.scrape.local".
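
To override the default, set job_name explicitly. In this sketch the value "checkout-profiles" and the target are hypothetical; the "job" label on scraped profiles would then carry that value instead of "pyroscope.scrape.local".

river
pyroscope.scrape "local" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "checkout"}]
  forward_to = [pyroscope.write.local.receiver]

  // Replaces the default job name "pyroscope.scrape.local".
  job_name = "checkout-profiles"
}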

targets argument

The list of targets can be provided statically, dynamically, or a combination of both.

The special __address__ label must always be present and corresponds to the <host>:<port> that is used for the scrape request.

Labels starting with a double underscore (__) are treated as internal, and are removed prior to scraping.

The special label service_name is required and must always be present. If it is not specified, pyroscope.scrape will attempt to infer it from one of the following sources, in this order:

  1. __meta_kubernetes_pod_annotation_pyroscope_io_service_name which is a pyroscope.io/service_name pod annotation.
  2. __meta_kubernetes_namespace and __meta_kubernetes_pod_container_name
  3. __meta_docker_container_name
  4. __meta_dockerswarm_container_label_service_name or __meta_dockerswarm_service_name

If service_name is not specified and could not be inferred, then it is set to unspecified.

The following labels are automatically injected to the scraped profiles so that they can be linked to a scrape target:

| Label | Description |
|---|---|
| "job" | The job_name that the target belongs to. |
| "instance" | The __address__ or <host>:<port> of the scrape target’s URL. |
| "service_name" | The inferred Pyroscope service name. |
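
As an illustration of dynamically provided targets, the sketch below discovers Kubernetes pods and sets service_name explicitly with a relabel rule instead of relying on inference. It assumes the discovery.kubernetes and discovery.relabel components are available, and that pods carry an "app" label; that label name is an assumption chosen for the example.

river
discovery.kubernetes "pods" {
  role = "pod"
}

discovery.relabel "named" {
  targets = discovery.kubernetes.pods.targets

  // Copy the (assumed) "app" pod label into service_name.
  rule {
    source_labels = ["__meta_kubernetes_pod_label_app"]
    target_label  = "service_name"
  }
}

pyroscope.scrape "k8s" {
  targets    = discovery.relabel.named.output
  forward_to = [pyroscope.write.local.receiver]
}

If the rule does not produce a value, the inference order listed above still applies.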

scrape_interval argument

The scrape_interval argument sets how often Grafana Agent Flow collects performance profiles from the monitored targets, that is, the time between consecutive scrapes. It controls the trade-off between resource usage and the freshness of the collected data.

If scrape_interval is short:

  • Advantages:
    • Fewer profiles may be lost if the application being scraped crashes.
  • Disadvantages:
    • Greater consumption of CPU, memory, and network resources during scrapes and remote writes.
    • The backend database (Pyroscope) will consume more storage space.

If scrape_interval is long:

  • Advantages:
    • Lower resource consumption.
  • Disadvantages:
    • More profiles may be lost if the application being scraped crashes.
    • If the delta argument is set to true, the batch size of each remote write to Pyroscope may be bigger. The Pyroscope database may need to be tuned with higher limits.
    • If the delta argument is set to true, there is a larger risk of reaching the HTTP server timeouts of the application being scraped.

For example, consider this situation:

  • pyroscope.scrape is configured with a scrape_interval of "60s".
  • The application being scraped is running an HTTP server with a timeout of 30 seconds.
  • Any scrape HTTP requests where the delta argument is set to true will fail, because they will attempt to run for 59 seconds.
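
One way to avoid this failure, sketched under the assumption that the target's HTTP server timeout stays at 30 seconds, is to keep scrape_interval well below that timeout while keeping scrape_timeout above scrape_interval. The target address and the chosen durations are illustrative.

river
pyroscope.scrape "short_interval" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  // Delta profiles run for scrape_interval - 1 = 19 seconds,
  // which stays under the target's 30-second HTTP server timeout.
  scrape_interval = "20s"

  // Must be larger than scrape_interval.
  scrape_timeout = "25s"
}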

Blocks

The following blocks are supported inside the definition of pyroscope.scrape:

| Hierarchy | Block | Description | Required |
|---|---|---|---|
| basic_auth | basic_auth | Configure basic_auth for authenticating to targets. | no |
| authorization | authorization | Configure generic authorization to targets. | no |
| oauth2 | oauth2 | Configure OAuth2 for authenticating to targets. | no |
| oauth2 > tls_config | tls_config | Configure TLS settings for connecting to targets via OAuth2. | no |
| tls_config | tls_config | Configure TLS settings for connecting to targets. | no |
| profiling_config | profiling_config | Configure profiling settings for the scrape job. | no |
| profiling_config > profile.memory | profile.memory | Collect memory profiles. | no |
| profiling_config > profile.block | profile.block | Collect profiles on blocks. | no |
| profiling_config > profile.goroutine | profile.goroutine | Collect goroutine profiles. | no |
| profiling_config > profile.mutex | profile.mutex | Collect mutex profiles. | no |
| profiling_config > profile.process_cpu | profile.process_cpu | Collect CPU profiles. | no |
| profiling_config > profile.fgprof | profile.fgprof | Collect fgprof profiles. | no |
| profiling_config > profile.godeltaprof_memory | profile.godeltaprof_memory | Collect godeltaprof memory profiles. | no |
| profiling_config > profile.godeltaprof_mutex | profile.godeltaprof_mutex | Collect godeltaprof mutex profiles. | no |
| profiling_config > profile.godeltaprof_block | profile.godeltaprof_block | Collect godeltaprof block profiles. | no |
| profiling_config > profile.custom | profile.custom | Collect custom profiles. | no |
| clustering | clustering | Configure the component for when Grafana Agent Flow is running in clustered mode. | no |

The > symbol indicates deeper levels of nesting. For example, oauth2 > tls_config refers to a tls_config block defined inside an oauth2 block.

Any omitted blocks take on their default values. For example, if profile.mutex is not specified in the config, the defaults documented in profile.mutex will be used.

basic_auth block

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| password_file | string | File containing the basic auth password. | | no |
| password | secret | Basic auth password. | | no |
| username | string | Basic auth username. | | no |

password and password_file are mutually exclusive, and only one can be provided inside a basic_auth block.
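
A minimal sketch of a basic_auth block, reading the password from a file. The username, file path, and target are hypothetical.

river
pyroscope.scrape "with_basic_auth" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  basic_auth {
    username = "profiler"

    // Hypothetical path. Do not set password at the same time.
    password_file = "/etc/agent/basic-auth-password"
  }
}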

authorization block

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| credentials_file | string | File containing the secret value. | | no |
| credentials | secret | Secret value. | | no |
| type | string | Authorization type, for example, “Bearer”. | | no |

credentials and credentials_file are mutually exclusive, and only one can be provided inside an authorization block.
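
A minimal sketch of an authorization block using the “Bearer” type with credentials read from a file. The file path and target are hypothetical.

river
pyroscope.scrape "with_authorization" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  authorization {
    type = "Bearer"

    // Hypothetical path. Do not set credentials at the same time.
    credentials_file = "/etc/agent/credentials"
  }
}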

oauth2 block

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| client_id | string | OAuth2 client ID. | | no |
| client_secret_file | string | File containing the OAuth2 client secret. | | no |
| client_secret | secret | OAuth2 client secret. | | no |
| endpoint_params | map(string) | Optional parameters to append to the token URL. | | no |
| proxy_url | string | HTTP proxy to send requests through. | | no |
| no_proxy | string | Comma-separated list of IP addresses, CIDR notations, and domain names to exclude from proxying. | | no |
| proxy_from_environment | bool | Use the proxy URL indicated by environment variables. | false | no |
| proxy_connect_header | map(list(secret)) | Specifies headers to send to proxies during CONNECT requests. | | no |
| scopes | list(string) | List of scopes to authenticate with. | | no |
| token_url | string | URL to fetch the token from. | | no |

client_secret and client_secret_file are mutually exclusive, and only one can be provided inside an oauth2 block.

The oauth2 block may also contain a separate tls_config sub-block.

no_proxy can contain IPs, CIDR notations, and domain names. IP and domain names can contain port numbers. proxy_url must be configured if no_proxy is configured.

proxy_from_environment uses the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY (or the lowercase versions thereof). Requests use the proxy from the environment variable matching their scheme, unless excluded by NO_PROXY. proxy_url and no_proxy must not be configured if proxy_from_environment is configured.

proxy_connect_header should only be configured if proxy_url or proxy_from_environment is configured.
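
The sketch below shows an oauth2 block with a nested tls_config sub-block for the connection to the token endpoint. The client ID, scope, token URL, and file paths are all hypothetical.

river
pyroscope.scrape "with_oauth2" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  oauth2 {
    client_id = "agent"

    // Hypothetical path. Do not set client_secret at the same time.
    client_secret_file = "/etc/agent/oauth2-client-secret"

    token_url = "https://auth.example.com/oauth2/token"
    scopes    = ["profiles.read"]

    // TLS settings used when talking to the token URL.
    tls_config {
      ca_file = "/etc/agent/auth-ca.pem"
    }
  }
}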

tls_config block

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| ca_pem | string | CA PEM-encoded text to validate the server with. | | no |
| ca_file | string | CA certificate to validate the server with. | | no |
| cert_pem | string | Certificate PEM-encoded text for client authentication. | | no |
| cert_file | string | Certificate file for client authentication. | | no |
| insecure_skip_verify | bool | Disables validation of the server certificate. | | no |
| key_file | string | Key file for client authentication. | | no |
| key_pem | secret | Key PEM-encoded text for client authentication. | | no |
| min_version | string | Minimum acceptable TLS version. | | no |
| server_name | string | ServerName extension to indicate the name of the server. | | no |

The following pairs of arguments are mutually exclusive and can’t both be set simultaneously:

  • ca_pem and ca_file
  • cert_pem and cert_file
  • key_pem and key_file

When configuring client authentication, both the client certificate (using cert_pem or cert_file) and the client key (using key_pem or key_file) must be provided.

When min_version is not provided, the minimum acceptable TLS version is inherited from Go’s default minimum version, TLS 1.2. If min_version is provided, it must be set to one of the following strings:

  • "TLS10" (TLS 1.0)
  • "TLS11" (TLS 1.1)
  • "TLS12" (TLS 1.2)
  • "TLS13" (TLS 1.3)
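
A sketch of a top-level tls_config block for targets that require client certificates. It uses the file-based arguments only, so none of the mutually exclusive pairs are set together. All file paths and the target address are hypothetical.

river
pyroscope.scrape "with_tls" {
  targets    = [{"__address__" = "app.example.com:8443", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  scheme = "https"

  tls_config {
    ca_file = "/etc/agent/ca.pem"

    // Client authentication needs both the certificate and the key.
    cert_file = "/etc/agent/client-cert.pem"
    key_file  = "/etc/agent/client-key.pem"

    min_version = "TLS13"
  }
}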

profiling_config block

The profiling_config block configures the profiling settings when scraping targets.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| path_prefix | string | The path prefix to use when scraping targets. | | no |
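
A minimal sketch of path_prefix, assuming the target serves its profiling endpoints under a common prefix. The prefix "/internal" and the target are hypothetical, and the sketch assumes the prefix is prepended to each profile path.

river
pyroscope.scrape "prefixed" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  profiling_config {
    // Hypothetical prefix in front of paths such as /debug/pprof/allocs.
    path_prefix = "/internal"
  }
}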

profile.memory block

The profile.memory block collects profiles on memory consumption.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | true | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/allocs" | no |
| delta | boolean | Whether to scrape the profile as a delta. | false | no |

For more information about the delta argument, see the delta argument section.

profile.block block

The profile.block block collects profiles on process blocking.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | true | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/block" | no |
| delta | boolean | Whether to scrape the profile as a delta. | false | no |

For more information about the delta argument, see the delta argument section.

profile.goroutine block

The profile.goroutine block collects profiles on the number of goroutines.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | true | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/goroutine" | no |
| delta | boolean | Whether to scrape the profile as a delta. | false | no |

For more information about the delta argument, see the delta argument section.

profile.mutex block

The profile.mutex block collects profiles on mutexes.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | true | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/mutex" | no |
| delta | boolean | Whether to scrape the profile as a delta. | false | no |

For more information about the delta argument, see the delta argument section.

profile.process_cpu block

The profile.process_cpu block collects profiles on CPU consumption for the process.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | true | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/profile" | no |
| delta | boolean | Whether to scrape the profile as a delta. | true | no |

For more information about the delta argument, see the delta argument section.

profile.fgprof block

The profile.fgprof block collects profiles from an fgprof endpoint.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | false | no |
| path | string | The path to the profile type on the target. | "/debug/fgprof" | no |
| delta | boolean | Whether to scrape the profile as a delta. | true | no |

For more information about the delta argument, see the delta argument section.

profile.godeltaprof_memory block

The profile.godeltaprof_memory block collects profiles from the godeltaprof memory endpoint. The delta is computed on the target.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | false | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/delta_heap" | no |

profile.godeltaprof_mutex block

The profile.godeltaprof_mutex block collects profiles from the godeltaprof mutex endpoint. The delta is computed on the target.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | false | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/delta_mutex" | no |

profile.godeltaprof_block block

The profile.godeltaprof_block block collects profiles from the godeltaprof block endpoint. The delta is computed on the target.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | false | no |
| path | string | The path to the profile type on the target. | "/debug/pprof/delta_block" | no |
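
If the target exposes the godeltaprof endpoints, one plausible setup is to enable the three godeltaprof blocks and disable the corresponding standard profiles so that similar data is not collected twice. This pairing is a suggestion for illustration, not something the component requires; the target is hypothetical.

river
pyroscope.scrape "godeltaprof" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  profiling_config {
    // Deltas computed on the target (disabled by default).
    profile.godeltaprof_memory {
      enabled = true
    }
    profile.godeltaprof_mutex {
      enabled = true
    }
    profile.godeltaprof_block {
      enabled = true
    }

    // Standard counterparts (enabled by default), turned off here.
    profile.memory {
      enabled = false
    }
    profile.mutex {
      enabled = false
    }
    profile.block {
      enabled = false
    }
  }
}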

profile.custom block

The profile.custom block allows for collecting profiles from custom endpoints. Blocks must be specified with a label:

river
profile.custom "PROFILE_TYPE" {
  enabled = true
  path    = "PROFILE_PATH"
}

Multiple profile.custom blocks can be specified. Labels assigned to profile.custom blocks must be unique across the component.

The following arguments are supported:

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | boolean | Enable this profile type to be scraped. | | yes |
| path | string | The path to the profile type on the target. | | yes |
| delta | boolean | Whether to scrape the profile as a delta. | false | no |

When the delta argument is true, a seconds query parameter is automatically added to requests. The seconds used will be equal to scrape_interval - 1.
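
As a concrete sketch, the following collects Go's threadcreate profile, which the standard net/http/pprof handler serves at /debug/pprof/threadcreate. The block label "threadcreate" and the target are choices made for this example. Because delta is left at its default of false, no seconds parameter is added.

river
pyroscope.scrape "custom" {
  targets    = [{"__address__" = "localhost:6060", "service_name" = "app"}]
  forward_to = [pyroscope.write.local.receiver]

  profiling_config {
    // The label must be unique across the component.
    profile.custom "threadcreate" {
      enabled = true
      path    = "/debug/pprof/threadcreate"
    }
  }
}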

clustering (beta)

| Name | Type | Description | Default | Required |
|---|---|---|---|---|
| enabled | bool | Enables sharing targets with other cluster nodes. | false | yes |

When Grafana Agent Flow is using clustering, and enabled is set to true, then this pyroscope.scrape component instance opts in to participating in the cluster to distribute scrape load between all cluster nodes.

Clustering causes the set of targets to be locally filtered down to a unique subset per node, where each node is assigned roughly the same number of targets. If the state of the cluster changes, such as when a new node joins, then the subset of targets to scrape per node will be recalculated.

When clustering mode is enabled, all Grafana Agents participating in the cluster must use the same configuration file and have access to the same service discovery APIs.

If Grafana Agent Flow is not running in clustered mode, this block is a no-op.
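
A minimal sketch of opting a component in to clustering. The discovery.http component and its URL mirror the Examples section; every node in the cluster is assumed to run this same configuration.

river
discovery.http "dynamic_targets" {
  url = "https://example.com/scrape_targets"
}

pyroscope.scrape "clustered" {
  targets    = discovery.http.dynamic_targets.targets
  forward_to = [pyroscope.write.local.receiver]

  // Each node scrapes only its own subset of the discovered targets.
  clustering {
    enabled = true
  }
}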

Common configuration

delta argument

When the delta argument is false, the pprof HTTP query will be instantaneous.

When the delta argument is true:

  • The pprof HTTP query will run for a certain amount of time.
  • A seconds parameter is automatically added to the HTTP request.
  • The seconds used will be equal to scrape_interval - 1. For example, if scrape_interval is "15s", seconds will be 14 seconds. If the HTTP endpoint is /debug/pprof/profile, then the HTTP query will become /debug/pprof/profile?seconds=14

Exported fields

pyroscope.scrape does not export any fields that can be referenced by other components.

Component health

pyroscope.scrape is only reported as unhealthy if given an invalid configuration.

Debug information

pyroscope.scrape reports the status of the last scrape for each configured scrape job on the component’s debug endpoint.

Debug metrics

  • pyroscope_fanout_latency (histogram): Write latency for sending to direct and indirect components.

Examples

Default endpoints of static targets

The following example sets up a scrape job for a statically configured list of targets: Grafana Agent itself and Pyroscope. The scraped profiles are sent to pyroscope.write, which remote writes them to a Pyroscope database.

river
pyroscope.scrape "local" {
  targets = [
    {"__address__" = "localhost:4100", "service_name"="pyroscope"},
    {"__address__" = "localhost:12345", "service_name"="agent"},
  ]

  forward_to = [pyroscope.write.local.receiver]
}

pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4100"
  }
}

These endpoints will be scraped every 15 seconds:

http://localhost:4100/debug/pprof/allocs
http://localhost:4100/debug/pprof/block
http://localhost:4100/debug/pprof/goroutine
http://localhost:4100/debug/pprof/mutex
http://localhost:4100/debug/pprof/profile?seconds=14

http://localhost:12345/debug/pprof/allocs
http://localhost:12345/debug/pprof/block
http://localhost:12345/debug/pprof/goroutine
http://localhost:12345/debug/pprof/mutex
http://localhost:12345/debug/pprof/profile?seconds=14

Note that seconds=14 is added to the /debug/pprof/profile endpoint, because:

  • The delta argument of the profile.process_cpu block is true by default.
  • scrape_interval is "15s" by default.

Also note that the /debug/fgprof endpoint will not be scraped, because the enabled argument of the profile.fgprof block is false by default.

Default endpoints of dynamic targets

river
discovery.http "dynamic_targets" {
  url = "https://example.com/scrape_targets"
  refresh_interval = "15s"
}

pyroscope.scrape "local" {
  targets = discovery.http.dynamic_targets.targets

  forward_to = [pyroscope.write.local.receiver]
}

pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4100"
  }
}

Default endpoints of static and dynamic targets

river
discovery.http "dynamic_targets" {
  url = "https://example.com/scrape_targets"
  refresh_interval = "15s"
}

pyroscope.scrape "local" {
  targets = concat([
    {"__address__" = "localhost:4040", "service_name"="pyroscope"},
    {"__address__" = "localhost:12345", "service_name"="agent"},
  ], discovery.http.dynamic_targets.targets)

  forward_to = [pyroscope.write.local.receiver]
}

pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4100"
  }
}

Enabling and disabling profiles

river
pyroscope.scrape "local" {
  targets = [
    {"__address__" = "localhost:12345", "service_name"="agent"},
  ]

  profiling_config {
    profile.fgprof {
      enabled = true
    }
    profile.block {
      enabled = false
    }
    profile.mutex {
      enabled = false
    }
  }

  forward_to = [pyroscope.write.local.receiver]
}

pyroscope.write "local" {
  endpoint {
    url = "http://pyroscope:4100"
  }
}

These endpoints will be scraped every 15 seconds:

http://localhost:12345/debug/pprof/allocs
http://localhost:12345/debug/pprof/goroutine
http://localhost:12345/debug/pprof/profile?seconds=14
http://localhost:12345/debug/fgprof?seconds=14

These endpoints will NOT be scraped because they are explicitly disabled:

http://localhost:12345/debug/pprof/block
http://localhost:12345/debug/pprof/mutex

Compatible components

pyroscope.scrape can accept arguments from the following components:

Note

Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.