Instrument and send data

Grafana Alloy

Reference

Components

prometheus

prometheus.write.queue

Grafana Cloud

Experimental

prometheus.write.queue

prometheus.write.queue collects metrics sent from other components into a Write-Ahead Log (WAL) and forwards them over the network to a series of user-supplied endpoints. Metrics are sent over the network using the [Prometheus Remote Write protocol][remote_write-spec].

You can specify multiple prometheus.write.queue components by giving them different labels.

You should consider everything here extremely experimental and highly subject to change. [remote_write-spec]: https://prometheus.io/docs/specs/remote_write_spec/

Usage

prometheus.write.queue "LABEL" {
  endpoint "default "{
    url = REMOTE_WRITE_URL

    ...
  }

  ...
}

Arguments

The following arguments are supported:

Name	Type	Description	Default	Required
`ttl`	`time`	`duration`	How long the samples can be queued for before they are discarded.	`2h`

Blocks

The following blocks are supported inside the definition of prometheus.write.queue:

Hierarchy	Block	Description	Required
persistence	persistence	Configuration for persistence	no
endpoint	endpoint	Location to send metrics to.	no
endpoint > basic_auth	basic_auth	Configure basic_auth for authenticating to the endpoint.	no

The > symbol indicates deeper levels of nesting. For example, endpoint > basic_auth refers to a basic_auth block defined inside an endpoint block.

persistence block

The persistence block describes how often and at what limits to write to disk. Persistence settings are shared for each endpoint.

The following arguments are supported:

Name	Type	Description	Default	Required
`max_signals_to_batch`	`uint`	The maximum number of signals before they are batched to disk.	`10000`	no
`batch_interval`	`duration`	How often to batch signals to disk if `max_signals_to_batch` is not reached.	`5s`	no

endpoint block

The endpoint block describes a single location to send metrics to. Multiple endpoint blocks can be provided to send metrics to multiple locations. Each endpoint will have its own WAL folder.

The following arguments are supported:

Name	Type	Description	Default	Required
`url`	`string`	Full URL to send metrics to.		yes
`bearer_token`	`secret`	Bearer token to authenticate with.		no
`write_timeout`	`duration`	Timeout for requests made to the URL.	`"30s"`	no
`retry_backoff`	`duration`	How often to wait between retries.	`1s`	no
`max_retry_attempts`	Maximum number of retries before dropping the batch.	`0`	no
`batch_count`	`uint`	How many series to queue in each queue.	`1000`	no
`flush_interval`	`duration`	How often to wait until sending if `batch_count` is not triggered.	`1s`	no
`parallelism`	`uint`	How many parallel batches to write.	10	no
`external_labels`	`map(string)`	Labels to add to metrics sent over the network.		no

basic_auth block

Name	Type	Description	Required
`password_file`	`string`	File containing the basic auth password.	no
`password`	`secret`	Basic auth password.	no
`username`	`string`	Basic auth username.	no

password and password_file are mutually exclusive, and only one can be provided inside a basic_auth block.

Exported fields

The following fields are exported and can be referenced by other components:

Name	Type	Description
`receiver`	`MetricsReceiver`	A value that other components can use to send metrics to.

Component health

prometheus.write.queue is only reported as unhealthy if given an invalid configuration. In those cases, exported fields are kept at their last healthy values.

Debug information

prometheus.write.queue does not expose any component-specific debug information.

Debug metrics

The following metrics are provided for backward compatibility. They generally behave the same, but there are likely edge cases where they differ.

prometheus_remote_write_wal_storage_created_series_total (counter): Total number of created series appended to the WAL.
prometheus_remote_write_wal_storage_removed_series_total (counter): Total number of series removed from the WAL.
prometheus_remote_write_wal_samples_appended_total (counter): Total number of samples appended to the WAL.
prometheus_remote_write_wal_exemplars_appended_total (counter): Total number of exemplars appended to the WAL.
prometheus_remote_storage_samples_total (counter): Total number of samples sent to remote storage.
prometheus_remote_storage_exemplars_total (counter): Total number of exemplars sent to remote storage.
prometheus_remote_storage_metadata_total (counter): Total number of metadata entries sent to remote storage.
prometheus_remote_storage_samples_failed_total (counter): Total number of samples that failed to send to remote storage due to non-recoverable errors.
prometheus_remote_storage_exemplars_failed_total (counter): Total number of exemplars that failed to send to remote storage due to non-recoverable errors.
prometheus_remote_storage_metadata_failed_total (counter): Total number of metadata entries that failed to send to remote storage due to non-recoverable errors.
prometheus_remote_storage_samples_retries_total (counter): Total number of samples that failed to send to remote storage but were retried due to recoverable errors.
prometheus_remote_storage_exemplars_retried_total (counter): Total number of exemplars that failed to send to remote storage but were retried due to recoverable errors.
prometheus_remote_storage_metadata_retried_total (counter): Total number of metadata entries that failed to send to remote storage but were retried due to recoverable errors.
prometheus_remote_storage_samples_dropped_total (counter): Total number of samples which were dropped after being read from the WAL before being sent to remote_write because of an unknown reference ID.
prometheus_remote_storage_exemplars_dropped_total (counter): Total number of exemplars that were dropped after being read from the WAL before being sent to remote_write because of an unknown reference ID.
prometheus_remote_storage_enqueue_retries_total (counter): Total number of times enqueue has failed because a shard’s queue was full.
prometheus_remote_storage_sent_batch_duration_seconds (histogram): Duration of send calls to remote storage.
prometheus_remote_storage_queue_highest_sent_timestamp_seconds (gauge): Unix timestamp of the latest WAL sample successfully sent by a queue.
prometheus_remote_storage_samples_in_total (counter): Samples read into remote storage.
prometheus_remote_storage_exemplars_in_total (counter): Exemplars read into remote storage.

Metrics that are new to prometheus.write.queue. These are highly subject to change.

alloy_queue_series_serializer_incoming_signals (counter): Total number of series written to serialization.
alloy_queue_metadata_serializer_incoming_signals (counter): Total number of metadata written to serialization.
alloy_queue_series_serializer_incoming_timestamp_seconds (gauge): Highest timestamp of incoming series.
alloy_queue_series_serializer_errors (gauge): Number of errors for series written to serializer.
alloy_queue_metadata_serializer_errors (gauge): Number of errors for metadata written to serializer.
alloy_queue_series_network_timestamp_seconds (gauge): Highest timestamp written to an endpoint.
alloy_queue_series_network_sent (counter): Number of series sent successfully.
alloy_queue_metadata_network_sent (counter): Number of metadata sent successfully.
alloy_queue_network_series_failed (counter): Number of series failed.
alloy_queue_network_metadata_failed (counter): Number of metadata failed.
alloy_queue_network_series_retried (counter): Number of series retried due to network issues.
alloy_queue_network_metadata_retried (counter): Number of metadata retried due to network issues.
alloy_queue_network_series_retried_429 (counter): Number of series retried due to status code 429.
alloy_queue_network_metadata_retried_429 (counter): Number of metadata retried due to status code 429.
alloy_queue_network_series_retried_5xx (counter): Number of series retried due to status code 5xx.
alloy_queue_network_metadata_retried_5xx (counter): Number of metadata retried due to status code 5xx.
alloy_queue_network_series_network_duration_seconds (histogram): Duration writing series to endpoint.
alloy_queue_network_metadata_network_duration_seconds (histogram): Duration writing metadata to endpoint.
alloy_queue_network_series_network_errors (counter): Number of errors writing series to network.
alloy_queue_network_metadata_network_errors (counter): Number of errors writing metadata to network.

Examples

The following examples show you how to create prometheus.write.queue components that send metrics to different destinations.

Send metrics to a local Mimir instance

You can create a prometheus.write.queue component that sends your metrics to a local Mimir instance:

prometheus.write.queue "staging" {
  // Send metrics to a locally running Mimir.
  endpoint "mimir" {
    url = "http://mimir:9009/api/v1/push"

    basic_auth {
      username = "example-user"
      password = "example-password"
    }
  }
}

// Configure a prometheus.scrape component to send metrics to
// prometheus.write.queue component.
prometheus.scrape "demo" {
  targets = [
    // Collect metrics from the default HTTP listen address.
    {"__address__" = "127.0.0.1:12345"},
  ]
  forward_to = [prometheus.write.queue.staging.receiver]
}

Technical details

prometheus.write.queue uses snappy for compression. prometheus.write.queue sends native histograms by default. Any labels that start with __ will be removed before sending to the endpoint.

Data retention

Data is written to disk in blocks utilizing snappy compression. These blocks are read on startup and resent if they are still within the TTL. Any data that has not been written to disk, or that is in the network queues is lost if Alloy is restarted.

Retries

prometheus.write.queue will retry sending data if the following errors or HTTP status codes are returned:

Network errors.
HTTP 429 errors.
HTTP 5XX errors.

prometheus.write.queue will not retry sending data if any other unsuccessful status codes are returned.

Memory

prometheus.write.queue is meant to be memory efficient. You can adjust the max_signals_to_batch, parallelism, and batch_size to control how much memory is used. A higher max_signals_to_batch allows for more efficient disk compression. A higher parallelism allows more parallel writes, and batch_size allows more data sent at one time. This can allow greater throughput at the cost of more memory on both Alloy and the endpoint. The defaults are suitable for most common usages.

Compatible components

prometheus.write.queue has exports that can be consumed by the following components:

Components that consume Prometheus MetricsReceiver

Note
Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.

Was this page helpful?

Email docs@grafana.com

Help and support

Community

Feedback

Relevant sources:

Feedback

prometheus.write.queue

Usage

Arguments