Experimental

prometheus.write.queue

prometheus.write.queue collects metrics sent from other components into a Write-Ahead Log (WAL) and forwards them over the network to a series of user-supplied endpoints. Metrics are sent over the network using the [Prometheus Remote Write protocol][remote_write-spec].

You can specify multiple prometheus.write.queue components by giving them different labels.
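
As a minimal sketch of two independently labeled components, assuming placeholder labels and URLs that are not taken from this page:

```alloy
// Two separate components, distinguished only by their labels.
prometheus.write.queue "staging" {
  endpoint "default" {
    url = "https://staging.example.com/api/v1/push"
  }
}

prometheus.write.queue "production" {
  endpoint "default" {
    url = "https://production.example.com/api/v1/push"
  }
}
```

Each component exports its own receiver, for example prometheus.write.queue.staging.receiver.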

You should consider everything here extremely experimental and highly subject to change.

[remote_write-spec]: https://prometheus.io/docs/specs/remote_write_spec/

Usage

```alloy
prometheus.write.queue "LABEL" {
  endpoint "default" {
    url = REMOTE_WRITE_URL

    ...
  }

  ...
}
```

Arguments

The following arguments are supported:

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| ttl | duration | How long the samples can be queued for before they are discarded. | 2h | no |
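
For illustration, the following sketch shortens the TTL to one hour. The label and URL are placeholders, and the value is an example rather than a recommendation.

```alloy
prometheus.write.queue "example" {
  // Discard samples that have been queued for longer than one hour.
  ttl = "1h"

  endpoint "default" {
    url = "https://example.com/api/v1/push"
  }
}
```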

Blocks

The following blocks are supported inside the definition of prometheus.write.queue:

| Hierarchy | Block | Description | Required |
| --------- | ----- | ----------- | -------- |
| persistence | persistence | Configuration for persistence. | no |
| endpoint | endpoint | Location to send metrics to. | no |
| endpoint > basic_auth | basic_auth | Configure basic_auth for authenticating to the endpoint. | no |

The > symbol indicates deeper levels of nesting. For example, endpoint > basic_auth refers to a basic_auth block defined inside an endpoint block.

persistence block

The persistence block describes how often and at what limits to write to disk. Persistence settings are shared for each endpoint.

The following arguments are supported:

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| max_signals_to_batch | uint | The maximum number of signals before they are batched to disk. | 10000 | no |
| batch_interval | duration | How often to batch signals to disk if max_signals_to_batch is not reached. | 5s | no |
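
A persistence block might look like the following sketch. The label, URL, and values are illustrative assumptions. The comment describes the behavior as it reads from the table above.

```alloy
prometheus.write.queue "example" {
  persistence {
    // Batch to disk once 20000 signals have accumulated, or every
    // 10 seconds if that limit is not reached.
    max_signals_to_batch = 20000
    batch_interval       = "10s"
  }

  endpoint "default" {
    url = "https://example.com/api/v1/push"
  }
}
```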

endpoint block

The endpoint block describes a single location to send metrics to. Multiple endpoint blocks can be provided to send metrics to multiple locations. Each endpoint will have its own WAL folder.

The following arguments are supported:

| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| url | string | Full URL to send metrics to. | | yes |
| bearer_token | secret | Bearer token to authenticate with. | | no |
| write_timeout | duration | Timeout for requests made to the URL. | 30s | no |
| retry_backoff | duration | How long to wait between retries. | 1s | no |
| max_retry_attempts | uint | Maximum number of retries before dropping the batch. | 0 | no |
| batch_count | uint | How many series to queue in each queue. | 1000 | no |
| flush_interval | duration | How long to wait before sending if batch_count is not reached. | 1s | no |
| parallelism | uint | How many parallel batches to write. | 10 | no |
| external_labels | map(string) | Labels to add to metrics sent over the network. | | no |
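
The following sketch shows two endpoint blocks in one component, using arguments from the table above. The endpoint labels, URLs, token, and label values are placeholders chosen for illustration.

```alloy
prometheus.write.queue "example" {
  // First destination, authenticated with a bearer token.
  endpoint "primary" {
    url             = "https://primary.example.com/api/v1/push"
    bearer_token    = "EXAMPLE_TOKEN"
    external_labels = { cluster = "dev" }
  }

  // Second destination with a shorter request timeout.
  endpoint "secondary" {
    url           = "https://secondary.example.com/api/v1/push"
    write_timeout = "10s"
  }
}
```

Because each endpoint has its own WAL folder, the two destinations are queued independently.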

basic_auth block

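| Name | Type | Description | Default | Required |
| ---- | ---- | ----------- | ------- | -------- |
| password_file | string | File containing the basic auth password. | | no |
| password | secret | Basic auth password. | | no |
| username | string | Basic auth username. | | no |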

password and password_file are mutually exclusive, and only one can be provided inside a basic_auth block.
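
As a sketch of the file-based variant, the following block reads the password from disk instead of setting password inline. The file path, username, label, and URL are hypothetical.

```alloy
prometheus.write.queue "example" {
  endpoint "default" {
    url = "https://example.com/api/v1/push"

    basic_auth {
      username = "example-user"
      // Use password_file or password, never both.
      password_file = "/etc/alloy/secrets/remote-write-password"
    }
  }
}
```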

Exported fields

The following fields are exported and can be referenced by other components:

| Name | Type | Description |
| ---- | ---- | ----------- |
| receiver | MetricsReceiver | A value that other components can use to send metrics to. |

Component health

prometheus.write.queue is only reported as unhealthy if given an invalid configuration. In those cases, exported fields are kept at their last healthy values.

Debug information

prometheus.write.queue does not expose any component-specific debug information.

Debug metrics

The following metrics are provided for backward compatibility. They generally behave the same, but there are likely edge cases where they differ.

  • prometheus_remote_write_wal_storage_created_series_total (counter): Total number of created series appended to the WAL.
  • prometheus_remote_write_wal_storage_removed_series_total (counter): Total number of series removed from the WAL.
  • prometheus_remote_write_wal_samples_appended_total (counter): Total number of samples appended to the WAL.
  • prometheus_remote_write_wal_exemplars_appended_total (counter): Total number of exemplars appended to the WAL.
  • prometheus_remote_storage_samples_total (counter): Total number of samples sent to remote storage.
  • prometheus_remote_storage_exemplars_total (counter): Total number of exemplars sent to remote storage.
  • prometheus_remote_storage_metadata_total (counter): Total number of metadata entries sent to remote storage.
  • prometheus_remote_storage_samples_failed_total (counter): Total number of samples that failed to send to remote storage due to non-recoverable errors.
  • prometheus_remote_storage_exemplars_failed_total (counter): Total number of exemplars that failed to send to remote storage due to non-recoverable errors.
  • prometheus_remote_storage_metadata_failed_total (counter): Total number of metadata entries that failed to send to remote storage due to non-recoverable errors.
  • prometheus_remote_storage_samples_retries_total (counter): Total number of samples that failed to send to remote storage but were retried due to recoverable errors.
  • prometheus_remote_storage_exemplars_retried_total (counter): Total number of exemplars that failed to send to remote storage but were retried due to recoverable errors.
  • prometheus_remote_storage_metadata_retried_total (counter): Total number of metadata entries that failed to send to remote storage but were retried due to recoverable errors.
  • prometheus_remote_storage_samples_dropped_total (counter): Total number of samples which were dropped after being read from the WAL before being sent to remote_write because of an unknown reference ID.
  • prometheus_remote_storage_exemplars_dropped_total (counter): Total number of exemplars that were dropped after being read from the WAL before being sent to remote_write because of an unknown reference ID.
  • prometheus_remote_storage_enqueue_retries_total (counter): Total number of times enqueue has failed because a shard’s queue was full.
  • prometheus_remote_storage_sent_batch_duration_seconds (histogram): Duration of send calls to remote storage.
  • prometheus_remote_storage_queue_highest_sent_timestamp_seconds (gauge): Unix timestamp of the latest WAL sample successfully sent by a queue.
  • prometheus_remote_storage_samples_in_total (counter): Samples read into remote storage.
  • prometheus_remote_storage_exemplars_in_total (counter): Exemplars read into remote storage.

The following metrics are new to prometheus.write.queue and are highly subject to change.

  • alloy_queue_series_serializer_incoming_signals (counter): Total number of series written to serialization.
  • alloy_queue_metadata_serializer_incoming_signals (counter): Total number of metadata written to serialization.
  • alloy_queue_series_serializer_incoming_timestamp_seconds (gauge): Highest timestamp of incoming series.
  • alloy_queue_series_serializer_errors (gauge): Number of errors for series written to serializer.
  • alloy_queue_metadata_serializer_errors (gauge): Number of errors for metadata written to serializer.
  • alloy_queue_series_network_timestamp_seconds (gauge): Highest timestamp written to an endpoint.
  • alloy_queue_series_network_sent (counter): Number of series sent successfully.
  • alloy_queue_metadata_network_sent (counter): Number of metadata sent successfully.
  • alloy_queue_network_series_failed (counter): Number of series failed.
  • alloy_queue_network_metadata_failed (counter): Number of metadata failed.
  • alloy_queue_network_series_retried (counter): Number of series retried due to network issues.
  • alloy_queue_network_metadata_retried (counter): Number of metadata retried due to network issues.
  • alloy_queue_network_series_retried_429 (counter): Number of series retried due to status code 429.
  • alloy_queue_network_metadata_retried_429 (counter): Number of metadata retried due to status code 429.
  • alloy_queue_network_series_retried_5xx (counter): Number of series retried due to status code 5xx.
  • alloy_queue_network_metadata_retried_5xx (counter): Number of metadata retried due to status code 5xx.
  • alloy_queue_network_series_network_duration_seconds (histogram): Duration writing series to endpoint.
  • alloy_queue_network_metadata_network_duration_seconds (histogram): Duration writing metadata to endpoint.
  • alloy_queue_network_series_network_errors (counter): Number of errors writing series to network.
  • alloy_queue_network_metadata_network_errors (counter): Number of errors writing metadata to network.

Examples

The following examples show you how to create prometheus.write.queue components that send metrics to different destinations.

Send metrics to a local Mimir instance

You can create a prometheus.write.queue component that sends your metrics to a local Mimir instance:

```alloy
prometheus.write.queue "staging" {
  // Send metrics to a locally running Mimir.
  endpoint "mimir" {
    url = "http://mimir:9009/api/v1/push"

    basic_auth {
      username = "example-user"
      password = "example-password"
    }
  }
}

// Configure a prometheus.scrape component to send metrics to
// the prometheus.write.queue component.
prometheus.scrape "demo" {
  targets = [
    // Collect metrics from the default HTTP listen address.
    {"__address__" = "127.0.0.1:12345"},
  ]
  forward_to = [prometheus.write.queue.staging.receiver]
}
```

Technical details

prometheus.write.queue uses snappy for compression. prometheus.write.queue sends native histograms by default. Any labels that start with __ will be removed before sending to the endpoint.

Data retention

Data is written to disk in blocks using snappy compression. These blocks are read on startup and resent if they are still within the TTL. Any data that has not been written to disk, or that is still in the network queues, is lost if Alloy is restarted.

Retries

prometheus.write.queue will retry sending data if the following errors or HTTP status codes are returned:

  • Network errors.
  • HTTP 429 errors.
  • HTTP 5XX errors.

prometheus.write.queue will not retry sending data if any other unsuccessful status codes are returned.
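
The retry behavior can be adjusted per endpoint with the retry_backoff and max_retry_attempts arguments. The sketch below uses illustrative values, with a placeholder label and URL.

```alloy
prometheus.write.queue "example" {
  endpoint "default" {
    url = "https://example.com/api/v1/push"

    // Wait 5 seconds between retries.
    retry_backoff = "5s"
    // Drop the batch after 5 retries.
    max_retry_attempts = 5
  }
}
```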

Memory

prometheus.write.queue is meant to be memory efficient. You can adjust max_signals_to_batch, parallelism, and batch_count to control how much memory is used. A higher max_signals_to_batch allows for more efficient disk compression. A higher parallelism allows more parallel writes, and a higher batch_count allows more data to be sent at one time. Raising these values can increase throughput at the cost of more memory on both Alloy and the endpoint. The defaults are suitable for most common use cases.
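
As a rough sketch of trading throughput for lower memory use, the following configuration sets the three tuning arguments below their defaults: max_signals_to_batch in the persistence block, and parallelism and batch_count in the endpoint block. The specific values, label, and URL are assumptions for illustration, not tested recommendations.

```alloy
prometheus.write.queue "low_memory" {
  persistence {
    // Default is 10000. Smaller batches hold fewer signals in memory
    // but compress less efficiently on disk.
    max_signals_to_batch = 5000
  }

  endpoint "default" {
    url = "https://example.com/api/v1/push"

    // Default is 10. Fewer parallel writes.
    parallelism = 4
    // Default is 1000. Less data sent at one time.
    batch_count = 500
  }
}
```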

Compatible components

prometheus.write.queue has exports that can be consumed by the following components:

Note

Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.