Migrate from single zone to zone-aware replication with Helm

This document explains how to migrate stateful components from single zone to zone-aware replication with Helm. The three components in question are the alertmanager, the store-gateway and the ingester.

The migration path for the alertmanager and the store-gateway is straightforward; migrating ingesters is more complicated.

This document is applicable to both Grafana Mimir and Grafana Enterprise Metrics.


Prerequisites

  1. Zone-aware replication is turned off for the component in question.

  2. The installation is already upgraded to mimir-distributed Helm chart version 4.0.0 or later.

  3. If you have modified the mimir.config value, make sure to merge in the latest version from the chart. Alternatively, consider using mimir.structuredConfig instead; see Manage the configuration of Grafana Mimir with Helm.

Migrate alertmanager to zone-aware replication

Using zone-aware replication for alertmanager is optional and is only available if alertmanager is deployed as a StatefulSet.

Configure zone-aware replication for alertmanagers

This section is about planning and configuring the availability zones defined under the alertmanager.zoneAwareReplication Helm value.

There are two use cases in general:

  1. Speeding up rollout of alertmanagers in case there are more than 3 replicas. In this case use the default value in the small.yaml, large.yaml, capped-small.yaml or capped-large.yaml. The default value defines 3 “virtual” zones and sets affinity rules so that alertmanagers from different zones do not mix, but it allows multiple alertmanagers of the same zone on the same node:

    alertmanager:
      zoneAwareReplication:
        topologyKey: "kubernetes.io/hostname" # Triggers creating anti-affinity rules
  2. Geographical redundancy. In this case you need to set a suitable nodeSelector value to choose where the pods of each zone are to be placed. Setting topologyKey will instruct the Helm chart to create anti-affinity rules so that alertmanagers from different zones do not mix, but it allows multiple alertmanagers of the same zone on the same node. For example:

    alertmanager:
      zoneAwareReplication:
        topologyKey: "kubernetes.io/hostname" # Triggers creating anti-affinity rules
        zones:
          - name: zone-a
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-a
          - name: zone-b
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-b
          - name: zone-c
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-c

Note: Because the zones value is an array, you must copy the whole array and modify the copy to change it; there is no way to override only parts of the array.
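
The array behavior in the note above can be illustrated with a small Python sketch. This is not Helm's implementation: merge_values is a hypothetical helper that only mimics the behavior that maps are merged key by key while arrays are replaced as a whole, and chart_defaults is a simplified stand-in rather than the real chart defaults.

```python
from copy import deepcopy


def merge_values(base, override):
    """Merge two values trees: maps merge key by key, anything else is replaced."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = deepcopy(base)
        for key, value in override.items():
            merged[key] = merge_values(base.get(key), value)
        return merged
    # Lists and scalars are not merged; the override wins as a whole.
    return deepcopy(override)


# Simplified stand-in for the chart defaults (assumption, for illustration only).
chart_defaults = {
    "zoneAwareReplication": {
        "topologyKey": "kubernetes.io/hostname",
        "zones": [{"name": "zone-a"}, {"name": "zone-b"}, {"name": "zone-c"}],
    }
}

# A custom values file that tries to change only zone-a.
custom = {
    "zoneAwareReplication": {
        "zones": [
            {
                "name": "zone-a",
                "nodeSelector": {"topology.kubernetes.io/zone": "us-central1-a"},
            }
        ]
    }
}

result = merge_values(chart_defaults, custom)
zone_names = [zone["name"] for zone in result["zoneAwareReplication"]["zones"]]
print(zone_names)
```

In this sketch only zone-a survives the merge, while the sibling key topologyKey is kept. That is why all zones have to be listed in the custom values, even the ones that stay unchanged.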

Set the chosen configuration in your custom values (e.g. custom.yaml).

Note: The number of alertmanager pods that will be started is derived from alertmanager.replicas. Each zone will start alertmanager.replicas / number of zones pods, rounded up to the nearest integer value. For example, if you have 3 zones, then alertmanager.replicas=3 will yield 1 alertmanager per zone, but alertmanager.replicas=4 will yield 2 per zone, 6 in total.
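
The rounding rule in the note above can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names pods_per_zone and total_pods are made up for this example and are not part of the Helm chart.

```python
import math


def pods_per_zone(replicas: int, zones: int) -> int:
    """Pods started in each zone: replicas / zones, rounded up."""
    return math.ceil(replicas / zones)


def total_pods(replicas: int, zones: int) -> int:
    """Total pods across all zones after rounding up per zone."""
    return pods_per_zone(replicas, zones) * zones


for replicas in (3, 4, 6, 7):
    print(replicas, pods_per_zone(replicas, 3), total_pods(replicas, 3))
```

The same arithmetic applies to the store_gateway.replicas and ingester.replicas values described later in this document. Choosing a replica count that is a multiple of the number of zones avoids starting more pods than requested.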

Migrate alertmanager

Before starting this procedure, set up your zones according to Configure zone-aware replication for alertmanagers.

  1. Create a new empty YAML file called migrate.yaml.

  2. Start the migration.

    Copy the following into the migrate.yaml file:

    alertmanager:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
    rollout_operator:
      enabled: true
  3. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    In this step zone-awareness is enabled with the default zone and new StatefulSets are created for zone-aware alertmanagers, but no new pods are started.

  4. Wait until all alertmanagers are restarted and are ready.

  5. Scale up zone-aware alertmanagers.

    Replace the contents of the migrate.yaml file with:

    alertmanager:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          writePath: true
    rollout_operator:
      enabled: true
  6. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  7. Wait until all new zone-aware alertmanagers are started and are ready.

  8. Set the final configuration.

    Merge the following values into your custom Helm values file:

    alertmanager:
      zoneAwareReplication:
        enabled: true
    rollout_operator:
      enabled: true
  9. Upgrade the installation with the helm command using your regular command line flags.

  10. Wait until old non zone-aware alertmanagers are terminated.
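
The upgrade steps above ask for -f migrate.yaml to be the last flag because Helm gives precedence to the values file that appears last on the command line, so the temporary migration values override the regular ones. The following Python sketch assembles such a command. The release name mimir, the namespace mimir, the chart reference grafana/mimir-distributed and the file name custom.yaml are assumptions for illustration; substitute the ones used by your installation.

```python
def helm_upgrade_command(release, chart, values_files, namespace=None):
    """Build a 'helm upgrade' argument list; later values files take precedence."""
    command = ["helm", "upgrade", release, chart]
    if namespace:
        command += ["--namespace", namespace]
    for path in values_files:
        command += ["-f", path]
    return command


# Hypothetical release, namespace and file names; migrate.yaml goes last.
cmd = helm_upgrade_command(
    "mimir",
    "grafana/mimir-distributed",
    ["custom.yaml", "migrate.yaml"],
    namespace="mimir",
)
print(" ".join(cmd))
```

For the final steps that use "your regular command line flags", the same helper would be called without migrate.yaml in the list.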

Migrate store-gateways to zone-aware replication

Configure zone-aware replication for store-gateways

This section is about planning and configuring the availability zones defined under the store_gateway.zoneAwareReplication Helm value.

There are two use cases in general:

  1. Speeding up rollout of store-gateways in case there are more than 3 replicas. In this case use the default value in the small.yaml, large.yaml, capped-small.yaml or capped-large.yaml. The default value defines 3 “virtual” zones and sets affinity rules so that store-gateways from different zones do not mix, but it allows multiple store-gateways of the same zone on the same node:

    store_gateway:
      zoneAwareReplication:
        enabled: false # Do not turn on zone-awareness without migration because of potential query errors
        topologyKey: "kubernetes.io/hostname" # Triggers creating anti-affinity rules
  2. Geographical redundancy. In this case you need to set a suitable nodeSelector value to choose where the pods of each zone are to be placed. Setting topologyKey will instruct the Helm chart to create anti-affinity rules so that store-gateways from different zones do not mix, but it allows multiple store-gateways of the same zone on the same node. For example:

    store_gateway:
      zoneAwareReplication:
        enabled: false # Do not turn on zone-awareness without migration because of potential query errors
        topologyKey: "kubernetes.io/hostname" # Triggers creating anti-affinity rules
        zones:
          - name: zone-a
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-a
          - name: zone-b
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-b
          - name: zone-c
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-c

Note: Because the zones value is an array, you must copy the whole array and modify the copy to change it; there is no way to override only parts of the array.

Set the chosen configuration in your custom values (e.g. custom.yaml).

Note: The number of store-gateway pods that will be started is derived from store_gateway.replicas. Each zone will start store_gateway.replicas / number of zones pods, rounded up to the nearest integer value. For example if you have 3 zones, then store_gateway.replicas=3 will yield 1 store-gateway per zone, but store_gateway.replicas=4 will yield 2 per zone, 6 in total.

Decide which migration path to take for store-gateways

There are two ways to do the migration:

  1. With downtime. In this procedure the old non zone-aware store-gateways are stopped, which causes queries that look back more than 12 hours (or whatever the querier.query_store_after Mimir parameter is set to) to fail. Ingestion is not impacted. This is the quicker and simpler way.
  2. Without downtime. This is a multi-step procedure that requires additional hardware resources because the old and new store-gateways run in parallel for some time.

Migrate store-gateways with downtime

Before starting this procedure, set up your zones according to Configure zone-aware replication for store-gateways.

  1. Create a new empty YAML file called migrate.yaml.

  2. Scale the current store-gateways to 0.

    Copy the following into the migrate.yaml file:

    store_gateway:
      replicas: 0
      zoneAwareReplication:
        enabled: false
  3. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  4. Wait until all store-gateways have terminated.

  5. Set the final configuration.

    Merge the following values into your custom Helm values file:

    store_gateway:
      zoneAwareReplication:
        enabled: true
    rollout_operator:
      enabled: true

    These values are actually the default, which means that removing the values store_gateway.zoneAwareReplication.enabled and rollout_operator.enabled is also a valid step.

  6. Upgrade the installation with the helm command using your regular command line flags.

  7. Wait until all store-gateways are running and ready.

Migrate store-gateways without downtime

Before starting this procedure, set up your zones according to Configure zone-aware replication for store-gateways.

  1. Create a new empty YAML file called migrate.yaml.

  2. Create the new zone-aware store-gateways.

    Copy the following into the migrate.yaml file:

    store_gateway:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
    rollout_operator:
      enabled: true
  3. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  4. Wait for all new store-gateways to start up and be ready.

  5. Make the read path use the new zone-aware store-gateways.

    Replace the contents of the migrate.yaml file with:

    store_gateway:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          readPath: true
    rollout_operator:
      enabled: true
  6. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  7. Wait for all queriers and rulers to restart and become ready.

  8. Set the final configuration.

    Merge the following values into your custom Helm values file:

    store_gateway:
      zoneAwareReplication:
        enabled: true
    rollout_operator:
      enabled: true

    These values are actually the default, which means that removing the values store_gateway.zoneAwareReplication.enabled and rollout_operator.enabled is also a valid step.

  9. Upgrade the installation with the helm command using your regular command line flags.

  10. Wait for non zone-aware store-gateways to terminate.

Migrate ingesters to zone-aware replication

Configure zone-aware replication for ingesters

This section is about planning and configuring the availability zones defined under the ingester.zoneAwareReplication Helm value.

There are two use cases in general:

  1. Speeding up rollout of ingesters in case there are more than 3 replicas. In this case use the default value in the small.yaml, large.yaml, capped-small.yaml or capped-large.yaml. The default value defines 3 “virtual” zones and sets affinity rules so that ingesters from different zones do not mix, but it allows multiple ingesters of the same zone on the same node:

    ingester:
      zoneAwareReplication:
        enabled: false # Do not turn on zone-awareness without migration because of potential data loss
        topologyKey: "kubernetes.io/hostname" # Triggers creating anti-affinity rules
  2. Geographical redundancy. In this case you need to set a suitable nodeSelector value to choose where the pods of each zone are to be placed. Setting topologyKey will instruct the Helm chart to create anti-affinity rules so that ingesters from different zones do not mix, but it allows multiple ingesters of the same zone on the same node. For example:

    ingester:
      zoneAwareReplication:
        enabled: false # Do not turn on zone-awareness without migration because of potential data loss
        topologyKey: "kubernetes.io/hostname" # Triggers creating anti-affinity rules
        zones:
          - name: zone-a
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-a
          - name: zone-b
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-b
          - name: zone-c
            nodeSelector:
              topology.kubernetes.io/zone: us-central1-c

Note: Because the zones value is an array, you must copy the whole array and modify the copy to change it; there is no way to override only parts of the array.

Set the chosen configuration in your custom values (e.g. custom.yaml).

Note: The number of ingester pods that will be started is derived from ingester.replicas. Each zone will start ingester.replicas / number of zones pods, rounded up to the nearest integer value. For example if you have 3 zones, then ingester.replicas=3 will yield 1 ingester per zone, but ingester.replicas=4 will yield 2 per zone, 6 in total.

Decide which migration path to take for ingesters

There are two ways to do the migration:

  1. With downtime. In this procedure incoming traffic to the cluster is stopped while ingesters are migrated. This is the quicker and simpler way. The time it takes to execute this migration depends on how fast ingesters restart and upload their data to object storage, but in general it should finish within an hour.
  2. Without downtime. This is a multi-step procedure that requires additional hardware resources because the old and new ingesters run in parallel for some time. This is a complex migration that can take days and requires monitoring for increased resource utilization. The minimum time it takes to do this migration can be calculated as (querier.query_store_after) + (2h TSDB block range period + blocks_storage.tsdb.head_compaction_idle_timeout) * (1 + number_of_ingesters / 21). With the default values this means 12h + 3h * (1 + number_of_ingesters / 21) = 15h + 3h * (number_of_ingesters / 21). Add an extra 12 hours if shuffle sharding is enabled.
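
As a rough planning aid, the minimum duration formula for the migration without downtime can be evaluated in Python. This sketch applies the formula exactly as written above, with the defaults implied by the text (12h for querier.query_store_after, 2h block range, 1h head compaction idle timeout). The function name min_migration_hours and its parameters are made up for this example, and treating the shuffle sharding surcharge as one more querier.query_store_after period is an assumption based on the 12 hour figure.

```python
def min_migration_hours(
    number_of_ingesters,
    query_store_after_h=12.0,
    block_range_h=2.0,
    head_compaction_idle_timeout_h=1.0,
    shuffle_sharding=False,
):
    """Lower bound in hours for the ingester migration without downtime."""
    # Time ingesters need to drop stale series after each change (3h by default).
    settle_h = block_range_h + head_compaction_idle_timeout_h
    hours = query_store_after_h + settle_h * (1 + number_of_ingesters / 21)
    if shuffle_sharding:
        # Assumption: the extra 12 hours equals one more query_store_after period.
        hours += query_store_after_h
    return hours


for count in (21, 42, 63):
    print(count, min_migration_hours(count), min_migration_hours(count, shuffle_sharding=True))
```

Because replicas are added in whole batches of at most 21, rounding number_of_ingesters / 21 up to the next integer may give a more realistic figure when the count is not a multiple of 21. The result is a lower bound in either case; waiting for pods to restart and become ready adds to it.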

Migrate ingesters with downtime

Before starting this procedure, set up your zones according to Configure zone-aware replication for ingesters.

  1. Create a new empty YAML file called migrate.yaml.

  2. Enable flushing data from ingesters to storage on shutdown.

    Copy the following into the migrate.yaml file:

    mimir:
      structuredConfig:
        blocks_storage:
          tsdb:
            flush_blocks_on_shutdown: true
        ingester:
          ring:
            unregister_on_shutdown: true
    ingester:
      zoneAwareReplication:
        enabled: false
  3. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  4. Wait for all ingesters to restart and be ready.

  5. Turn off traffic to the installation.

    Replace the contents of the migrate.yaml file with:

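    mimir:
      structuredConfig:
        blocks_storage:
          tsdb:
            flush_blocks_on_shutdown: true
        ingester:
          ring:
            unregister_on_shutdown: true
    ingester:
      zoneAwareReplication:
        enabled: false
    nginx:
      replicas: 0
    gateway:
      replicas: 0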
  6. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  7. Wait until there is no nginx or gateway running.

  8. Scale the current ingesters to 0.

    Replace the contents of the migrate.yaml file with:

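    mimir:
      structuredConfig:
        blocks_storage:
          tsdb:
            flush_blocks_on_shutdown: true
        ingester:
          ring:
            unregister_on_shutdown: true
    ingester:
      replicas: 0
      zoneAwareReplication:
        enabled: false
    nginx:
      replicas: 0
    gateway:
      replicas: 0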
  9. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  10. Wait until no ingesters are running.

  11. Start the new zone-aware ingesters.

    Replace the contents of the migrate.yaml file with:

    ingester:
      zoneAwareReplication:
        enabled: true
    nginx:
      replicas: 0
    gateway:
      replicas: 0
    rollout_operator:
      enabled: true
  12. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  13. Wait until all requested ingesters are running and are ready.

  14. Enable traffic to the installation.

    Merge the following values into your custom Helm values file:

    ingester:
      zoneAwareReplication:
        enabled: true
    rollout_operator:
      enabled: true

    These values are actually the default, which means that removing the values ingester.zoneAwareReplication.enabled and rollout_operator.enabled is also a valid step.

  15. Upgrade the installation with the helm command using your regular command line flags.

Migrate ingesters without downtime

Before starting this procedure, set up your zones according to Configure zone-aware replication for ingesters.

  1. Double the series limits for tenants and the ingesters.

    Explanation: while new ingesters are being added, some series will start to be written to new ingesters. Because those series also still exist on the old ingesters, they count twice towards limits. If you do not update the limits, writes might be refused because the limits are exceeded.

    The limits.max_global_series_per_user Mimir configuration parameter has a non-zero default value of 150000. Double the default or your value by setting:

    mimir:
      structuredConfig:
        limits:
          max_global_series_per_user: 300000 # <-- or your value doubled

    If you have set the Mimir configuration parameter ingester.instance_limits.max_series via mimir.config or mimir.structuredConfig or via runtime overrides, double it for the duration of the migration.

    If you have set per-tenant limits in the Mimir configuration parameters limits.max_global_series_per_user or limits.max_global_series_per_metric via mimir.config, mimir.structuredConfig or runtime overrides, double the set limits. For example:

    runtimeConfig:
      ingester_limits:
        max_series: X # <-- double it
      overrides:
        tenantA: # <-- example tenant ID
          max_global_series_per_metric: Y # <-- double it
          max_global_series_per_user: Z # <-- double it
  2. Create a new empty YAML file called migrate.yaml.

  3. Start the migration.

    Copy the following into the migrate.yaml file:

    ingester:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          replicas: 0
    rollout_operator:
      enabled: true
  4. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    In this step, new zone-aware StatefulSets are created, but no new pods are started yet. The parameter ingester.ring.zone_awareness_enabled: true is set in the Mimir configuration via the mimir.config value. The flag -ingester.ring.zone-awareness-enabled=false is set on distributors, rulers and queriers. The flags -blocks-storage.tsdb.flush-blocks-on-shutdown and -ingester.ring.unregister-on-shutdown are set to true for the ingesters.

  5. Wait for all Mimir components to restart and be ready.

  6. Add zone-aware ingester replicas, maximum 21 at a time.

    Explanation: while new ingesters are being added, some series will start to be written to new ingesters. Because those series also still exist on the old ingesters, they count twice towards limits. Adding only 21 replicas at a time reduces the number of series affected and thus the likelihood of breaching maximum series limits.

    1. Replace the contents of the migrate.yaml file with:

      ingester:
        zoneAwareReplication:
          enabled: true
          migration:
            enabled: true
            replicas: <N>
      rollout_operator:
        enabled: true

      Note: replace <N> with the number of replicas in each step until <N> reaches the same number as in ingester.replicas. Do not increase <N> by more than 21 in each step.

    2. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    3. Once the new ingesters are started and are ready, wait at least 3 hours.

      The 3 hours is calculated from 2h TSDB block range period + blocks_storage.tsdb.head_compaction_idle_timeout Grafana Mimir parameters to give enough time for ingesters to remove stale series from memory. Stale series will be there due to series being moved between ingesters.

    4. If the current <N> above in ingester.zoneAwareReplication.migration.replicas is less than ingester.replicas, go back, increase <N> by at most 21, and repeat these four steps.

  7. If you are using shuffle sharding, it must be turned off on the read path at this point.

    1. Update your configuration with these values and keep them until otherwise instructed.

      querier:
        extraArgs:
          "querier.shuffle-sharding-ingesters-enabled": "false"
      ruler:
        extraArgs:
          "querier.shuffle-sharding-ingesters-enabled": "false"
    2. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    3. Wait until queriers and rulers have restarted and are ready.

    4. Monitor resource utilization of queriers and rulers and scale up if necessary. Turning off shuffle sharding may increase resource utilization.

  8. Enable zone-awareness on the write path.

    Replace the contents of the migrate.yaml file with:

    ingester:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          writePath: true
    rollout_operator:
      enabled: true
  9. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    In this step the flag -ingester.ring.zone-awareness-enabled=false is removed from distributors and rulers.

  10. Once all distributors and rulers have restarted and are ready, wait 12 hours.

    The 12 hours is calculated from the querier.query_store_after Grafana Mimir parameter.

  11. Enable zone-awareness on the read path.

    Replace the contents of the migrate.yaml file with:

    ingester:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          writePath: true
          readPath: true
    rollout_operator:
      enabled: true
  12. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    In this step the flag -ingester.ring.zone-awareness-enabled=false is removed from queriers.

  13. Wait until all queriers have restarted and are ready.

  14. Exclude non zone-aware ingesters from the write path.

    Replace the contents of the migrate.yaml file with:

    ingester:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          writePath: true
          readPath: true
          excludeDefaultZone: true
    rollout_operator:
      enabled: true
  15. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

    In this step the flag -ingester.ring.excluded-zones=zone-default is added to distributors and rulers.

  16. Wait until all distributors and rulers have restarted and are ready.

  17. Scale down non zone-aware ingesters to 0.

    Replace the contents of the migrate.yaml file with:

    ingester:
      zoneAwareReplication:
        enabled: true
        migration:
          enabled: true
          writePath: true
          readPath: true
          excludeDefaultZone: true
          scaleDownDefaultZone: true
    rollout_operator:
      enabled: true
  18. Upgrade the installation with the helm command and make sure to provide the flag -f migrate.yaml as the last flag.

  19. Wait until all non zone-aware ingesters are terminated.

  20. Delete the default zone.

    Merge the following values into your custom Helm values file:

    ingester:
      zoneAwareReplication:
        enabled: true
    rollout_operator:
      enabled: true

    These values are actually the default, which means that removing the values ingester.zoneAwareReplication.enabled and rollout_operator.enabled from your custom.yaml is also a valid step.

  21. Upgrade the installation with the helm command using your regular command line flags.

  22. Wait at least 3 hours.

    The 3 hours is calculated from 2h TSDB block range period + blocks_storage.tsdb.head_compaction_idle_timeout Grafana Mimir parameters to give enough time for ingesters to remove stale series from memory. Stale series will be there due to series being moved between ingesters.

  23. If you are using shuffle sharding:

    1. Wait an extra 12 hours.

      The 12 hours is calculated from the querier.query_store_after Grafana Mimir parameter. After this time, no series are stored outside their dedicated shard, meaning that shuffle sharding on the read path can be safely enabled.

    2. Remove these values from your configuration:

      querier:
        extraArgs:
          "querier.shuffle-sharding-ingesters-enabled": "false"
      ruler:
        extraArgs:
          "querier.shuffle-sharding-ingesters-enabled": "false"
    3. Upgrade the installation with the helm command using your regular command line flags.

    4. Wait until queriers and rulers have restarted and are ready.

    5. The resource utilization of queriers and rulers should return to pre-migration levels and you can scale them down to previous numbers.

  24. Undo the doubling of series limits done in the first step.

  25. Upgrade the installation with the helm command using your regular command line flags.
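
The incremental scale-up in step 6 of the migration without downtime can be planned ahead of time. The following Python sketch lists the successive values to use for <N> in ingester.zoneAwareReplication.migration.replicas, given the target ingester.replicas and the limit of 21 additional replicas per step. The function name replica_steps is made up for this example.

```python
def replica_steps(target_replicas, max_step=21):
    """Successive <N> values, growing by at most max_step until the target is reached."""
    steps = []
    current = 0
    while current < target_replicas:
        current = min(current + max_step, target_replicas)
        steps.append(current)
    return steps


# Example: an installation with ingester.replicas=50 (hypothetical number).
plan = replica_steps(50)
print(plan)
```

Each value in the plan corresponds to one pass through the four sub-steps of step 6, including the wait of at least 3 hours, so the length of the list gives a sense of how long that step will take.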