Configure Grafana Mimir high-availability deduplication

You can have more than one Prometheus instance that scrapes the same metrics for redundancy. Grafana Mimir already performs replication for redundancy, so you do not need to ingest the same data twice. In Grafana Mimir, you can deduplicate the data that you receive from HA pairs of Prometheus instances.

Assume that there are two teams, each running their own Prometheus instance, which monitors different services: Prometheus team-1 and Prometheus team-2. If the teams are running Prometheus HA pairs, the individual Prometheus instances would be team-1.a and team-1.b, and team-2.a and team-2.b.

Grafana Mimir ingests samples from only one replica of each HA pair: either team-1.a or team-1.b, and either team-2.a or team-2.b. It does this by electing a leader replica for each Prometheus HA cluster. For example, in the case of team-1, the leader replica might be team-1.a. As long as team-1.a is the leader, Grafana Mimir drops the samples that team-1.b sends. If Grafana Mimir does not see any new samples from team-1.a for a short period of time (30 seconds by default), it switches the leader to team-1.b.

If team-1.a goes down for more than 30 seconds, Grafana Mimir’s HA sample handling elects team-1.b as the new leader. The failover timeout limits how much data is dropped before Grafana Mimir fails over to the other replica.

Note
In a scenario where the default scrape period is 15 seconds, and the timeouts in Grafana Mimir are set to the default values, when a leader-election failover occurs, you’ll likely only lose a single scrape of data.
For any query that uses the rate() function, make the rate time interval at least four times the scrape period to account for these failover scenarios.
For example, with the default scrape period of 15 seconds, use a rate time interval of at least 1 minute.
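
As an illustration of this guidance, the following Prometheus rule-file sketch computes a rate over a 1-minute window, which is four times a 15-second scrape period. The group name, the recorded series name, and the http_requests_total metric are hypothetical placeholders, not names taken from this page.

```yaml
groups:
  - name: ha-failover-safe-rates   # hypothetical group name
    rules:
      # 1m window = 4 x 15s scrape period, so losing a single scrape
      # during a leader failover still leaves enough samples for rate().
      - record: job:http_requests:rate1m   # hypothetical recorded series
        expr: sum by (job) (rate(http_requests_total[1m]))
```

If your scrape period differs from 15 seconds, scale the window accordingly, for example [2m] for a 30-second scrape period.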

Distributor high-availability (HA) tracker

The distributor includes a high-availability (HA) tracker.

The HA tracker deduplicates incoming samples based on a cluster and replica label expected on each incoming series. The cluster label uniquely identifies the cluster of redundant Prometheus servers for a given tenant. The replica label uniquely identifies the replica within the Prometheus cluster. Incoming samples are considered duplicated (and thus dropped) if they are received from any replica that is not the currently elected leader within a cluster.

If the HA tracker is enabled but incoming samples contain only one or none of the cluster and replica labels, these samples are accepted by default and never deduplicated.

Note: for performance reasons, the HA tracker only checks the cluster and replica label of the first series in the request to determine whether all series in the request should be deduplicated. This assumes that all series inside the request have the same cluster and replica labels, which is typically true when Prometheus is configured with external labels. Ensure this requirement is honored if you have a non-standard Prometheus setup (for example, you’re using Prometheus federation or have a metrics proxy in between).
If you can’t guarantee that all series inside a single write request share the same cluster and replica labels, enable the experimental per-series HA deduplication by setting -distributor.ha-tracker.per-sample-dedupe (or the ha_tracker_per_sample_dedupe per-tenant limit). With this setting enabled, each time series in the request is evaluated independently, so series from non-elected replicas are dropped even when they are mixed with elected-replica or non-HA series in the same request.
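
The following is a minimal sketch of enabling per-series deduplication for a single tenant through the runtime configuration. It assumes that the ha_tracker_per_sample_dedupe limit is accepted in the overrides section like other per-tenant limits, and tenant-a is a placeholder tenant ID. Because the feature is experimental, verify the option name against the configuration reference for your Mimir version.

```yaml
overrides:
  tenant-a:                              # hypothetical tenant ID
    accept_ha_samples: true              # HA deduplication must be on for this tenant
    ha_tracker_per_sample_dedupe: true   # experimental: evaluate each series independently
```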

Error responses

When the HA tracker drops samples, Mimir returns one of the following errors depending on the reason:

Replicas did not match: When samples are received from a non-elected replica, Mimir returns an HTTP 202 Accepted response with the message replicas did not match, rejecting sample: replica=<replica>, elected=<elected>. This indicates that the samples were successfully deduplicated and can be safely ignored.
Too many HA clusters: When the number of HA clusters for a tenant exceeds the configured limit, Mimir returns an HTTP 400 Bad Request response with the error ID err-mimir-tenant-too-many-ha-clusters. To adjust this limit, configure -distributor.ha-tracker.max-clusters or contact your service administrator.
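
As a sketch, you could raise the limit for one tenant in the runtime configuration. This assumes that ha_max_clusters is the YAML option corresponding to -distributor.ha-tracker.max-clusters; the tenant ID and the value are placeholders, so check both against the configuration reference for your version.

```yaml
overrides:
  tenant-a:                # hypothetical tenant ID
    # Assumed YAML equivalent of -distributor.ha-tracker.max-clusters.
    ha_max_clusters: 200   # illustrative value, not a recommendation
```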

Configuration

This section includes information about how to configure Prometheus and Grafana Mimir.

How to configure Prometheus

To configure Prometheus, set two identifiers for each Prometheus server: one that identifies the cluster, for example team-1 or team-2, and one that identifies the replica within the cluster, for example a or b. The easiest approach is to set external labels. The default label names are cluster and __replica__.

The following example shows how to set identifiers in Prometheus:

global:
  external_labels:
    cluster: prom-team1
    __replica__: replica1

and

global:
  external_labels:
    cluster: prom-team1
    __replica__: replica2

Note
The preceding labels are external labels and have nothing to do with remote_write configuration.

These two label names are configurable on a per-tenant basis within Grafana Mimir. For example, if some workloads already use the cluster label name for another purpose, configure a different label name that uniquely identifies the Prometheus HA cluster.

Set the replica label so that its value is unique for each Prometheus instance within the same cluster.

Note
Grafana Mimir drops the replica label when ingesting data, but preserves the cluster label. This way, your time series won’t change when the elected replica changes.

How to configure Grafana Mimir

The minimal configuration required is as follows:

Enable the HA tracker.
Configure the HA tracker KV store.
Configure expected label names for each cluster and its replica.

Enable the HA tracker

To enable the HA tracker feature, set the -distributor.ha-tracker.enable=true CLI flag (or its YAML configuration option) in the distributor.

Next, decide whether you want to enable it for all tenants or just a subset of tenants. To enable it for all tenants, set -distributor.ha-tracker.enable-for-all-users=true. Alternatively, to enable the HA tracker only for specific tenants, keep the default -distributor.ha-tracker.enable-for-all-users=false and override it for each of those tenants by setting accept_ha_samples in the overrides section of the runtime configuration.
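
The per-tenant approach can be sketched as the following runtime configuration fragment. The tenant IDs are placeholders; the distributor must still run with -distributor.ha-tracker.enable=true for the override to have any effect.

```yaml
overrides:
  tenant-a:                    # hypothetical tenant that runs Prometheus HA pairs
    accept_ha_samples: true
  tenant-b:                    # hypothetical tenant without HA pairs
    accept_ha_samples: false   # same as the default; shown only for contrast
```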

Configure the HA tracker KV store

The HA tracker requires a key-value (KV) store to coordinate which replica is currently elected. Starting from Mimir 3.0, memberlist is the recommended and default KV store backend for the HA tracker.

To migrate from Consul or etcd to memberlist without downtime, see Migrate HA tracker from Consul or etcd to memberlist.

The following CLI flags (and their respective YAML configuration options) are available for configuring the HA tracker KV store:

-distributor.ha-tracker.store: Backend storage to use (default: memberlist).
-memberlist.*: Memberlist client configuration. This is shared by multiple components.

The memberlist configuration is typically shared across multiple Mimir components (distributors, ingesters, etc.), so if you already have memberlist configured for hash ring synchronization, no additional configuration is required for the HA tracker.

Configure expected label names for each Prometheus cluster and replica

The HA tracker deduplicates incoming series that have cluster and replica labels. You can configure the name of these labels either globally or on a per-tenant basis.

Configure the default cluster and replica label names using the following CLI flags (or their respective YAML configuration options):

-distributor.ha-tracker.cluster: Name of the label whose value uniquely identifies a Prometheus HA cluster (defaults to cluster).
-distributor.ha-tracker.replica: Name of the label whose value uniquely identifies a Prometheus replica within the HA cluster (defaults to __replica__).

Note
The HA label names can be overridden on a per-tenant basis by setting ha_cluster_label and ha_replica_label in the overrides section of the runtime configuration.
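
A minimal sketch of such an override follows. The tenant ID and the label names prometheus_cluster and prometheus_replica are hypothetical choices for illustration.

```yaml
overrides:
  tenant-a:                                # hypothetical tenant ID
    accept_ha_samples: true
    ha_cluster_label: prometheus_cluster   # hypothetical label name
    ha_replica_label: prometheus_replica   # hypothetical label name
```

With this override in place, the Prometheus servers that write to tenant-a need to set external labels with these same names rather than the defaults. Otherwise their samples won’t carry both expected labels and are accepted without deduplication.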

Example configuration

The following example configuration enables the HA tracker for all tenants using a YAML configuration file:

limits:
  accept_ha_samples: true

distributor:
  ha_tracker:
    enable_ha_tracker: true
    kvstore:
      store: memberlist

memberlist:
  # Memberlist configuration (typically shared with other components)
  join_members:
    - <IP_OR_DNS:PORT>
    - <IP_OR_DNS:PORT>

Note
If memberlist is already configured for other Mimir components, such as the hash ring, the HA tracker automatically uses that configuration. In most deployments, you don’t need any additional memberlist configuration.

For more information, see distributor. The HA tracker flags are prefixed with -distributor.ha-tracker.*.
