prometheus.scrape

prometheus.scrape configures a Prometheus scraping job for a given set of targets. The scraped metrics are forwarded to the list of receivers passed in forward_to.

Multiple prometheus.scrape components can be specified by giving them different labels.

Usage

```river
prometheus.scrape "LABEL" {
  targets    = TARGET_LIST
  forward_to = RECEIVER_LIST
}
```

Arguments

The component configures and starts a new scrape job to scrape all the input targets. The list of arguments that can be used to configure the block is presented below.

The scrape job name defaults to the component’s unique identifier.

Any omitted fields take on their default values. If conflicting attributes are passed (for example, defining both bearer_token and bearer_token_file, or configuring both basic authentication and OAuth2 at the same time), the component reports an error.

The following arguments are supported:

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| targets | list(map(string)) | List of targets to scrape. | | yes |
| forward_to | list(MetricsReceiver) | List of receivers to send scraped metrics to. | | yes |
| job_name | string | The value to use for the job label if not already set. | component name | no |
| extra_metrics | bool | Whether extra metrics should be generated for scrape targets. | false | no |
| enable_protobuf_negotiation | bool | Whether to enable protobuf negotiation with the client. | false | no |
| honor_labels | bool | Indicator whether the scraped metrics should remain unmodified. | false | no |
| honor_timestamps | bool | Indicator whether the scraped timestamps should be respected. | true | no |
| track_timestamps_staleness | bool | Indicator whether to track the staleness of the scraped timestamps. | false | no |
| params | map(list(string)) | A set of query parameters with which the target is scraped. | | no |
| scrape_classic_histograms | bool | Whether to scrape a classic histogram that is also exposed as a native histogram. | false | no |
| scrape_interval | duration | How frequently to scrape the targets of this scrape configuration. | "60s" | no |
| scrape_timeout | duration | The timeout for scraping targets of this configuration. | "10s" | no |
| metrics_path | string | The HTTP resource path on which to fetch metrics from targets. | /metrics | no |
| scheme | string | The URL scheme with which to fetch metrics from targets. | | no |
| body_size_limit | int | An uncompressed response body larger than this many bytes causes the scrape to fail. 0 means no limit. | | no |
| sample_limit | uint | More than this many samples post metric-relabeling causes the scrape to fail. | | no |
| target_limit | uint | More than this many targets after the target relabeling causes the scrapes to fail. | | no |
| label_limit | uint | More than this many labels post metric-relabeling causes the scrape to fail. | | no |
| label_name_length_limit | uint | More than this label name length post metric-relabeling causes the scrape to fail. | | no |
| label_value_length_limit | uint | More than this label value length post metric-relabeling causes the scrape to fail. | | no |
| bearer_token_file | string | File containing a bearer token to authenticate with. | | no |
| bearer_token | secret | Bearer token to authenticate with. | | no |
| enable_http2 | bool | Whether HTTP2 is supported for requests. | true | no |
| follow_redirects | bool | Whether redirects returned by the server should be followed. | true | no |
| proxy_url | string | HTTP proxy to send requests through. | | no |
| no_proxy | string | Comma-separated list of IP addresses, CIDR notations, and domain names to exclude from proxying. | | no |
| proxy_from_environment | bool | Use the proxy URL indicated by environment variables. | false | no |
| proxy_connect_header | map(list(secret)) | Specifies headers to send to proxies during CONNECT requests. | | no |

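At most, one of the following can be provided:

  • bearer_token argument
  • bearer_token_file argument
  • basic_auth block
  • authorization block
  • oauth2 block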

no_proxy can contain IPs, CIDR notations, and domain names. IP and domain names can contain port numbers. proxy_url must be configured if no_proxy is configured.

proxy_from_environment uses the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY (or the lowercase versions thereof). Requests use the proxy from the environment variable matching their scheme, unless excluded by NO_PROXY. proxy_url and no_proxy must not be configured if proxy_from_environment is configured.

proxy_connect_header should only be configured if proxy_url or proxy_from_environment are configured.
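As an illustration of the proxy rules above, the following sketch routes scrape requests through an explicit proxy while excluding some destinations. The proxy address, the excluded hosts, the target address, and the prometheus.remote_write.default component are hypothetical placeholders that are assumed to exist in your environment.

```river
prometheus.scrape "via_proxy" {
  // Hypothetical target and receiver; replace with your own.
  targets    = [{"__address__" = "app.example.internal:8080"}]
  forward_to = [prometheus.remote_write.default.receiver]

  // proxy_url must be set because no_proxy is set.
  proxy_url = "http://proxy.example.internal:3128"
  no_proxy  = "localhost,10.0.0.0/8,metrics.example.internal:9090"

  // proxy_from_environment is left at its default (false), because it
  // must not be combined with proxy_url or no_proxy.
}
```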

track_timestamps_staleness controls whether Prometheus tracks staleness of metrics that have an explicit timestamp present in the scraped data.

  • An “explicit timestamp” is an optional timestamp in the Prometheus metrics exposition format. For example, this sample has a timestamp of 1395066363000:
    http_requests_total{method="post",code="200"} 1027 1395066363000
  • If track_timestamps_staleness is set to true, a staleness marker will be inserted when a metric is no longer present or the target is down.
  • A “staleness marker” is just a sample with a specific NaN value which is reserved for internal use by Prometheus.
  • It is recommended to set track_timestamps_staleness to true if the database that metrics are written to has out-of-order ingestion enabled.
  • If track_timestamps_staleness is set to false, samples with explicit timestamps will only be labeled as stale after a certain time period, which in Prometheus is 5 minutes by default.
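A minimal sketch of enabling staleness tracking for targets that expose explicit timestamps might look like the following. The target address and the prometheus.remote_write.default component are hypothetical, and the sketch assumes the downstream database accepts out-of-order samples.

```river
prometheus.scrape "timestamped" {
  // Hypothetical target that exposes samples with explicit timestamps.
  targets    = [{"__address__" = "gateway.example.internal:9091"}]
  forward_to = [prometheus.remote_write.default.receiver]

  // Keep the timestamps from the scraped data (this is the default).
  honor_timestamps = true

  // Insert a staleness marker as soon as a series disappears or the
  // target goes down, instead of waiting for the staleness period.
  track_timestamps_staleness = true
}
```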

Blocks

The following blocks are supported inside the definition of prometheus.scrape:

| Hierarchy | Block | Description | Required |
| --------- | ----- | ----------- | -------- |
| basic_auth | basic_auth | Configure basic_auth for authenticating to targets. | no |
| authorization | authorization | Configure generic authorization to targets. | no |
| oauth2 | oauth2 | Configure OAuth2 for authenticating to targets. | no |
| oauth2 > tls_config | tls_config | Configure TLS settings for connecting to targets via OAuth2. | no |
| tls_config | tls_config | Configure TLS settings for connecting to targets. | no |
| clustering | clustering | Configure the component for when the Agent is running in clustered mode. | no |

The > symbol indicates deeper levels of nesting. For example, oauth2 > tls_config refers to a tls_config block defined inside an oauth2 block.

basic_auth block

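| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| password_file | string | File containing the basic auth password. | | no |
| password | secret | Basic auth password. | | no |
| username | string | Basic auth username. | | no |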

password and password_file are mutually exclusive, and only one can be provided inside a basic_auth block.
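For example, a basic_auth block that reads the password from a file could be sketched as follows. The username, file path, target address, and the prometheus.remote_write.default component are illustrative assumptions.

```river
prometheus.scrape "with_basic_auth" {
  // Hypothetical target and receiver.
  targets    = [{"__address__" = "app.example.internal:8080"}]
  forward_to = [prometheus.remote_write.default.receiver]

  basic_auth {
    username = "scraper"

    // Only one of password and password_file may be set.
    password_file = "/etc/agent/scrape-password"
  }
}
```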

authorization block

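| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| credentials_file | string | File containing the secret value. | | no |
| credentials | secret | Secret value. | | no |
| type | string | Authorization type, for example, “Bearer”. | | no |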

credentials and credentials_file are mutually exclusive, and only one can be provided inside an authorization block.
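A sketch of an authorization block that sends a bearer token read from a file is shown below. The file path, target address, and the prometheus.remote_write.default component are hypothetical.

```river
prometheus.scrape "with_authorization" {
  // Hypothetical target and receiver.
  targets    = [{"__address__" = "app.example.internal:8080"}]
  forward_to = [prometheus.remote_write.default.receiver]

  authorization {
    type = "Bearer"

    // Only one of credentials and credentials_file may be set.
    credentials_file = "/var/run/secrets/scrape-token"
  }
}
```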

oauth2 block

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| client_id | string | OAuth2 client ID. | | no |
| client_secret_file | string | File containing the OAuth2 client secret. | | no |
| client_secret | secret | OAuth2 client secret. | | no |
| endpoint_params | map(string) | Optional parameters to append to the token URL. | | no |
| proxy_url | string | HTTP proxy to send requests through. | | no |
| no_proxy | string | Comma-separated list of IP addresses, CIDR notations, and domain names to exclude from proxying. | | no |
| proxy_from_environment | bool | Use the proxy URL indicated by environment variables. | false | no |
| proxy_connect_header | map(list(secret)) | Specifies headers to send to proxies during CONNECT requests. | | no |
| scopes | list(string) | List of scopes to authenticate with. | | no |
| token_url | string | URL to fetch the token from. | | no |

client_secret and client_secret_file are mutually exclusive, and only one can be provided inside an oauth2 block.

The oauth2 block may also contain a separate tls_config sub-block.

no_proxy can contain IPs, CIDR notations, and domain names. IP and domain names can contain port numbers. proxy_url must be configured if no_proxy is configured.

proxy_from_environment uses the environment variables HTTP_PROXY, HTTPS_PROXY and NO_PROXY (or the lowercase versions thereof). Requests use the proxy from the environment variable matching their scheme, unless excluded by NO_PROXY. proxy_url and no_proxy must not be configured if proxy_from_environment is configured.

proxy_connect_header should only be configured if proxy_url or proxy_from_environment are configured.
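Putting the oauth2 arguments together, a configuration with a nested tls_config sub-block might be sketched like this. The client ID, token URL, scope, endpoint parameter, file paths, target address, and the prometheus.remote_write.default component are all placeholders chosen for illustration.

```river
prometheus.scrape "with_oauth2" {
  // Hypothetical target and receiver.
  targets    = [{"__address__" = "app.example.internal:8080"}]
  forward_to = [prometheus.remote_write.default.receiver]

  oauth2 {
    client_id = "agent-scraper"

    // Only one of client_secret and client_secret_file may be set.
    client_secret_file = "/etc/agent/oauth2-client-secret"

    token_url       = "https://auth.example.com/oauth2/token"
    scopes          = ["metrics.read"]
    endpoint_params = {"audience" = "metrics"}

    // TLS settings used for the OAuth2 connection (oauth2 > tls_config).
    tls_config {
      ca_file = "/etc/agent/auth-ca.pem"
    }
  }
}
```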

tls_config block

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| ca_pem | string | CA PEM-encoded text to validate the server with. | | no |
| ca_file | string | CA certificate to validate the server with. | | no |
| cert_pem | string | Certificate PEM-encoded text for client authentication. | | no |
| cert_file | string | Certificate file for client authentication. | | no |
| insecure_skip_verify | bool | Disables validation of the server certificate. | | no |
| key_file | string | Key file for client authentication. | | no |
| key_pem | secret | Key PEM-encoded text for client authentication. | | no |
| min_version | string | Minimum acceptable TLS version. | | no |
| server_name | string | ServerName extension to indicate the name of the server. | | no |

The following pairs of arguments are mutually exclusive and can’t both be set simultaneously:

  • ca_pem and ca_file
  • cert_pem and cert_file
  • key_pem and key_file

When configuring client authentication, both the client certificate (using cert_pem or cert_file) and the client key (using key_pem or key_file) must be provided.

When min_version is not provided, the minimum acceptable TLS version is inherited from Go’s default minimum version, TLS 1.2. If min_version is provided, it must be set to one of the following strings:

  • "TLS10" (TLS 1.0)
  • "TLS11" (TLS 1.1)
  • "TLS12" (TLS 1.2)
  • "TLS13" (TLS 1.3)
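As a sketch of how these rules combine, the following configuration scrapes over HTTPS with client certificate authentication and a raised minimum TLS version. The file paths, server name, target address, and the prometheus.remote_write.default component are assumptions made for the example.

```river
prometheus.scrape "with_mtls" {
  // Hypothetical target and receiver.
  targets    = [{"__address__" = "app.example.internal:8443"}]
  forward_to = [prometheus.remote_write.default.receiver]

  scheme = "https"

  tls_config {
    // Use either the *_file or the *_pem variant of each pair, not both.
    ca_file = "/etc/agent/tls/ca.pem"

    // Client authentication needs both the certificate and the key.
    cert_file = "/etc/agent/tls/client.pem"
    key_file  = "/etc/agent/tls/client-key.pem"

    server_name = "app.example.internal"
    min_version = "TLS13"
  }
}
```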

clustering block

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| enabled | bool | Enables sharing targets with other cluster nodes. | false | yes |

When Grafana Agent Flow is using clustering and enabled is set to true, this prometheus.scrape component instance opts in to participating in the cluster to distribute the scrape load between all cluster nodes.

Clustering assumes that all cluster nodes run with the same configuration file and have access to the same service discovery APIs. It also assumes that, over the course of a scrape interval, all prometheus.scrape components that have opted in to clustering converge on the same target set from the upstream components in their targets argument.

All prometheus.scrape component instances that opt in to clustering use target labels and a consistent hashing algorithm to determine which cluster peer owns each target. Each peer then scrapes only the subset of targets it is responsible for, so that the scrape load is distributed. When a node joins or leaves the cluster, every peer recalculates ownership and continues scraping with the new target set. This performs better than hashmod sharding, where all targets have to be redistributed, because only 1/N of the target ownership is transferred. The trade-off is that it is eventually consistent, rather than fully consistent like hashmod sharding.

If Grafana Agent Flow is not running in clustered mode, then the block is a no-op and prometheus.scrape scrapes every target it receives in its arguments.
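A minimal sketch of opting a scrape component in to clustering follows. It assumes that every cluster node loads this same configuration, that targets come from a discovery.kubernetes component, and that a prometheus.remote_write.default component is defined elsewhere; the component labels are hypothetical.

```river
// Every node discovers the same pods, so all peers see the same target set.
discovery.kubernetes "pods" {
  role = "pod"
}

prometheus.scrape "clustered" {
  targets    = discovery.kubernetes.pods.targets
  forward_to = [prometheus.remote_write.default.receiver]

  // Opt in to sharing targets with the other cluster nodes. This is a
  // no-op when the Agent is not running in clustered mode.
  clustering {
    enabled = true
  }
}
```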

Exported fields

prometheus.scrape does not export any fields that can be referenced by other components.

Component health

prometheus.scrape is only reported as unhealthy if given an invalid configuration.

Debug information

prometheus.scrape reports the status of the last scrape for each configured scrape job on the component’s debug endpoint.

Debug metrics

  • agent_prometheus_fanout_latency (histogram): Write latency for sending to direct and indirect components.
  • agent_prometheus_scrape_targets_gauge (gauge): Number of targets this component is configured to scrape.
  • agent_prometheus_forwarded_samples_total (counter): Total number of samples sent to downstream components.

Scraping behavior

The prometheus.scrape component borrows the scraping behavior of Prometheus. Prometheus, and by extension this component, uses a pull model for scraping metrics from a given set of targets. Each scrape target is defined as a set of key-value pairs called labels. The set of targets can either be static, or dynamically provided periodically by a service discovery component such as discovery.kubernetes. The special label __address__ must always be present and corresponds to the <host>:<port> that is used for the scrape request.

By default, the scrape job tries to scrape all available targets’ /metrics endpoints using HTTP, with a scrape interval of 1 minute and scrape timeout of 10 seconds. The metrics path, protocol scheme, scrape interval and timeout, query parameters, as well as any other settings can be configured using the component’s arguments.
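To illustrate overriding these defaults, the sketch below changes the scheme, path, interval, timeout, and query parameters for a single job. The target address, path, parameter name, and the prometheus.remote_write.default component are hypothetical.

```river
prometheus.scrape "custom_defaults" {
  // Hypothetical target and receiver.
  targets    = [{"__address__" = "app.example.internal:8443"}]
  forward_to = [prometheus.remote_write.default.receiver]

  scheme          = "https"             // instead of HTTP
  metrics_path    = "/internal/metrics" // instead of /metrics
  scrape_interval = "30s"               // instead of "60s"
  scrape_timeout  = "5s"                // instead of "10s"
  params          = {"format" = ["prometheus"]}
}
```

With these settings, the component would request https://app.example.internal:8443/internal/metrics?format=prometheus every 30 seconds.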

If a target is hosted at the in-memory traffic address specified by the run command, prometheus.scrape will scrape the metrics in-memory, bypassing the network.

The scrape job expects the metrics exposed by the endpoint to follow the OpenMetrics format. All metrics are then propagated to each receiver listed in the component’s forward_to argument.

Labels coming from targets that start with a double underscore (__) are treated as internal and are removed prior to scraping.

The prometheus.scrape component regards a scrape as successful if the target responded with an HTTP 200 OK status code and returned a body of valid metrics.

If the scrape request fails, the component’s debug UI section contains more detailed information about the failure, the last successful scrape, as well as the labels last used for scraping.

The following labels are automatically injected into the scraped time series and can help pin down a scrape target.

| Label | Description |
| ----- | ----------- |
| job | The configured job name that the target belongs to. Defaults to the fully formed component name. |
| instance | The __address__ or <host>:<port> of the scrape target’s URL. |

Similarly, these metrics that record the behavior of the scrape targets are also automatically available.

| Metric Name | Description |
| ----------- | ----------- |
| up | 1 if the instance is healthy and reachable, or 0 if the scrape failed. |
| scrape_duration_seconds | Duration of the scrape in seconds. |
| scrape_samples_scraped | The number of samples the target exposed. |
| scrape_samples_post_metric_relabeling | The number of samples remaining after metric relabeling was applied. |
| scrape_series_added | The approximate number of new series in this scrape. |
| scrape_timeout_seconds | The configured scrape timeout for a target. Useful for measuring how close a target was to timing out using scrape_duration_seconds / scrape_timeout_seconds. |
| scrape_sample_limit | The configured sample limit for a target. Useful for measuring how close a target was to reaching the sample limit using scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0). |
| scrape_body_size_bytes | The uncompressed size of the most recent scrape response, if successful. Scrapes failing because the body_size_limit is exceeded report -1, other scrape failures report 0. |

The up metric is particularly useful for monitoring and alerting on the health of a scrape job. It is set to 0 in case anything goes wrong with the scrape target, either because it is not reachable, because the connection times out while scraping, or because the samples from the target could not be processed. When the target is behaving normally, the up metric is set to 1.

To enable scraping of Prometheus native histograms, enable_protobuf_negotiation must be set to true, because native histograms are exposed through the protobuf exposition format rather than the text format. The scrape_classic_histograms argument controls whether the component should also scrape the ‘classic’ histogram equivalent of a native histogram, if it is present.
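A sketch of a scrape job that collects native histograms and keeps their classic equivalents could look like this. The target address and the prometheus.remote_write.default component are hypothetical, and the sketch assumes the target actually exposes native histograms.

```river
prometheus.scrape "native_histograms" {
  // Hypothetical target and receiver.
  targets    = [{"__address__" = "app.example.internal:8080"}]
  forward_to = [prometheus.remote_write.default.receiver]

  // Required for native histograms to be scraped.
  enable_protobuf_negotiation = true

  // Also keep the classic histogram when a native one is exposed.
  scrape_classic_histograms = true
}
```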

Example

The following example sets up the scrape job with certain attributes (scrape endpoint, scrape interval, query parameters) and lets it scrape two instances of the blackbox exporter. The exposed metrics are sent over to the provided list of receivers, as defined by other components.

```river
prometheus.scrape "blackbox_scraper" {
  targets = [
    {"__address__" = "blackbox-exporter:9115", "instance" = "one"},
    {"__address__" = "blackbox-exporter:9116", "instance" = "two"},
  ]

  forward_to = [prometheus.remote_write.grafanacloud.receiver, prometheus.remote_write.onprem.receiver]

  scrape_interval = "10s"
  params          = { "target" = ["grafana.com"], "module" = ["http_2xx"] }
  metrics_path    = "/probe"
}
```

Here are the endpoints that are being scraped every 10 seconds:

http://blackbox-exporter:9115/probe?target=grafana.com&module=http_2xx
http://blackbox-exporter:9116/probe?target=grafana.com&module=http_2xx

Technical details

prometheus.scrape supports gzip compression.

The following special labels can change the behavior of prometheus.scrape:

  • __address__ is the name of the label that holds the <host>:<port> address of a scrape target.
  • __metrics_path__ is the name of the label that holds the path on which to scrape a target.
  • __scheme__ is the name of the label that holds the scheme (http,https) on which to scrape a target.
  • __scrape_interval__ is the name of the label that holds the scrape interval used to scrape a target.
  • __scrape_timeout__ is the name of the label that holds the scrape timeout used to scrape a target.
  • __param_<name> is a prefix for labels that provide URL parameters <name> used to scrape a target.
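As a sketch of how these special labels can be used, the targets below carry per-target overrides instead of relying only on the component's arguments. This assumes the labels are set directly on static targets (in practice they are often set by an upstream relabeling or discovery component); the addresses, paths, and the prometheus.remote_write.default component are hypothetical.

```river
prometheus.scrape "per_target_overrides" {
  targets = [
    // Uses the component-level settings unchanged.
    {"__address__" = "app-a.example.internal:8080"},

    // Overrides the scheme, path, interval, and timeout for this target only.
    {
      "__address__"         = "app-b.example.internal:8443",
      "__scheme__"          = "https",
      "__metrics_path__"    = "/internal/metrics",
      "__scrape_interval__" = "30s",
      "__scrape_timeout__"  = "5s",
    },

    // Adds the URL parameters module and target for this target only.
    {
      "__address__"      = "blackbox-exporter:9115",
      "__metrics_path__" = "/probe",
      "__param_module"   = "http_2xx",
      "__param_target"   = "grafana.com",
    },
  ]

  forward_to = [prometheus.remote_write.default.receiver]
}
```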

Special labels added after a scrape

  • __name__ is the label name indicating the metric name of a timeseries.
  • job is the label name indicating the job from which a timeseries was scraped.
  • instance is the label name used for the instance label.

Compatible components

prometheus.scrape can accept arguments from the following components:

  • Components that export targets
  • Components that export a Prometheus MetricsReceiver

Note

Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.