Migrate from SSD to distributed
This guide provides instructions for migrating from a simple scalable deployment (SSD) to a distributed microservices deployment of Loki. Before starting the migration, make sure you have read the considerations section.
Note
In this guide, an AWS deployment is used as an example. However, the migration process is the same for other cloud providers, because no changes are required to the underlying data storage.
Considerations
Migrating from a simple scalable deployment to a distributed deployment with zero downtime is possible but requires careful planning. The following considerations should be taken into account:
- Helm Deployment: This guide assumes that you have deployed Loki using Helm. Other migration methods are possible but are not covered in this guide.
- Kubernetes Resources: This migration method requires you to spin up distributed Loki pods before shutting down the SSD pods. This means that you need to have enough resources in your Kubernetes cluster to run both the SSD and distributed Loki pods at the same time.
- Data: No changes are required to your underlying data storage. Although data loss or corruption is unlikely, it is always recommended to back up your data before starting the migration process. If you are using a cloud provider you can take a snapshot/backup.
- Configuration: This guide does not account for all configuration parameters. It covers only the parameters that need to be changed; all other parameters can remain the same. However, if pattern_ingesters=true, you will need to spin up patternIngesters before shutting down the SSD ingesters. This is primarily needed for the Grafana Logs Drilldown feature.
- Zone Aware Ingesters: This guide does not currently account for Zone Aware Ingesters. Our current recommendation is to either disable Zone Aware Ingesters or to consult the Mimir migration guide. Note that not all parameters are equivalent between Mimir and Loki.
Prerequisites
Before starting the migration process, make sure you have the following prerequisites:
- Access to your Kubernetes cluster via kubectl.
- Helm installed.
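Before starting, it can help to confirm that both tools are available on your PATH. The following is a minimal sketch; require_cmds is a hypothetical helper name introduced here for illustration and is not part of kubectl or Helm.

```shell
# Hypothetical helper (not part of kubectl or Helm): checks that every
# command named in its arguments can be found on PATH.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (uncomment to run):
#   require_cmds kubectl helm && kubectl get pods -n loki
```

The function returns a non-zero status if any tool is missing, so it can guard the later helm upgrade commands in a script.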
Example SSD deployment
This example will use the following SSD deployment as a reference:
Note
This example is only a reference for the parameters that need to be changed. Your own config will contain other parameters, such as limits_config, gateway, and compactor. These can remain the same.
---
loki:
   schemaConfig:
     configs:
       - from: "2024-04-01"
         store: tsdb
         object_store: s3
         schema: v13
         index:
           prefix: loki_index_
           period: 24h
   storage_config:
     aws:
       region: eu-central-1
       bucketnames: aws-chunks-bucket
       s3forcepathstyle: false
   ingester:
       chunk_encoding: snappy
   ruler:
    enable_api: true
    storage:
      type: s3
      s3:
        region: eu-central-1
        bucketnames: aws-ruler-bucket
        s3forcepathstyle: false
    alertmanager_url: http://prom:9093
   querier:
      max_concurrent: 4
   storage:
      type: s3
      bucketNames:
        chunks: "aws-chunks-bucket"
        ruler: "aws-ruler-bucket"
      s3:
        region: eu-central-1
deploymentMode: SimpleScalable
# SSD
backend:
 replicas: 2
read:
 replicas: 3
write:
 replicas: 3
# Distributed Loki
ingester:
 replicas: 0
 zoneAwareReplication:
  enabled: false
querier:
 replicas: 0
 maxUnavailable: 0
queryFrontend:
 replicas: 0
 maxUnavailable: 0
queryScheduler:
 replicas: 0
distributor:
 replicas: 0
 maxUnavailable: 0
compactor:
 replicas: 0
indexGateway:
 replicas: 0
 maxUnavailable: 0
ruler:
 replicas: 0
 maxUnavailable: 0
# Single binary Loki
singleBinary:
 replicas: 0
minio:
 enabled: false
Stage 1: Deploying the Loki distributed components
In this stage, we will deploy the distributed Loki components alongside the SSD components. We will also change the deploymentMode to SimpleScalable<->Distributed. The SimpleScalable<->Distributed migration mode allows for a zero-downtime transition between Simple Scalable and fully distributed architectures. During migration, both deployment types run simultaneously, sharing the same object storage backend.
The following table outlines which components take over the responsibilities of the SSD components:
| Simple Scalable Components | Distributed Components | 
|---|---|
| write (StatefulSet) | Distributor + Ingester | 
| read (Deployment) | Query Frontend + Querier | 
| backend (StatefulSet) | Compactor + Ruler + Index Gateway | 
How Loki handles request routing during the migration:
The Gateway (nginx) handles request routing based on endpoint type:
- Write Path (/loki/api/v1/push):
  - Initially routes to the Simple Scalable write component
  - Gradually shifted to the Distributor
  - Both write paths share the same object storage, ensuring data consistency
- Read Path (/loki/api/v1/query):
  - Routes to either the Simple Scalable read component or the distributed Query Frontend
  - Query results are consistent since both architectures read from the same storage
- Admin/Background Operations:
  - Compaction, retention, and rule evaluation are handled by either the backend or the respective distributed components
  - Operations are coordinated through object storage locks
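One way to confirm that the write and read paths stay healthy during the migration is to push a test line through the gateway and query it back. The sketch below rests on several assumptions: the helper name push_payload, the job label migration-smoke-test, the loki-gateway Service name, and the local port 3100 are illustrative and may differ in your deployment. If multi-tenancy is enabled, you will also need to send an X-Scope-OrgID header.

```shell
# Hypothetical helper: builds a JSON body for POST /loki/api/v1/push.
# $1 = Unix timestamp in seconds, $2 = log line (no quotes or backslashes).
# Loki expects nanosecond timestamps, so nine zeros are appended.
push_payload() {
  printf '{"streams":[{"stream":{"job":"migration-smoke-test"},"values":[["%s000000000","%s"]]}]}' "$1" "$2"
}

# Example usage (uncomment to run; assumes the gateway Service is named
# loki-gateway and listens on port 80 in the loki namespace):
#   kubectl port-forward -n loki svc/loki-gateway 3100:80 &
#   curl -s -X POST -H "Content-Type: application/json" \
#     --data "$(push_payload "$(date +%s)" "ssd-to-distributed smoke test")" \
#     http://localhost:3100/loki/api/v1/push
#   curl -s -G http://localhost:3100/loki/api/v1/query_range \
#     --data-urlencode 'query={job="migration-smoke-test"}'
```

Running the push and the query before, during, and after each stage gives a quick signal that the gateway is still routing both paths, whichever set of components is serving them.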
 
To start the migration process:
- Create a copy of your existing values.yaml file and name it values-migration.yaml:
  cp values.yaml values-migration.yaml
- Next, modify the following parameters: deploymentMode, ingester, and the component replicas, based on the annotations below.
  ---
  loki:
    schemaConfig:
      configs:
        - from: "2024-04-01"
          store: tsdb
          object_store: s3
          schema: v13
          index:
            prefix: loki_index_
            period: 24h
    storage_config:
      aws:
        region: eu-central-1
        bucketnames: aws-chunks-bucket
        s3forcepathstyle: false
    ingester:
      chunk_encoding: snappy
      # Add this to ingester; this will force ingesters to flush before shutting down
      wal:
        flush_on_shutdown: true
    ruler:
      enable_api: true
      storage:
        type: s3
        s3:
          region: eu-central-1
          bucketnames: aws-ruler-bucket
          s3forcepathstyle: false
      alertmanager_url: http://prom:9093
    querier:
      max_concurrent: 4
    storage:
      type: s3
      bucketNames:
        chunks: "aws-chunks-bucket"
        ruler: "aws-ruler-bucket"
      s3:
        region: eu-central-1
  # Important: Make sure to change this to SimpleScalable<->Distributed
  deploymentMode: SimpleScalable<->Distributed
  # SSD
  backend:
    replicas: 2
  read:
    replicas: 3
  write:
    replicas: 3
  # Distributed Loki
  # Spin up the distributed components
  ingester:
    replicas: 3
    zoneAwareReplication:
      enabled: false
  querier:
    replicas: 3
    maxUnavailable: 0
  queryFrontend:
    replicas: 2
    maxUnavailable: 0
  queryScheduler:
    replicas: 2
  distributor:
    replicas: 2
    maxUnavailable: 0
  compactor:
    replicas: 1
  indexGateway:
    replicas: 2
    maxUnavailable: 0
  ruler:
    replicas: 1
    maxUnavailable: 0
  # Single binary Loki
  singleBinary:
    replicas: 0
  minio:
    enabled: false
  Here is a breakdown of the changes:
  - ingester.wal.flush_on_shutdown: true: This forces the ingesters to flush before shutting down, which is important to prevent data loss.
  - deploymentMode: SimpleScalable<->Distributed: This allows the SSD and distributed components to run simultaneously.
  - Spin up all distributed components with the desired replicas.
 
- Deploy the distributed components using the following command:
  helm upgrade --values values-migration.yaml loki grafana/loki -n loki
  Caution
  It is important to allow all components to fully spin up before proceeding to the next stage. You can check the status of the components using the following command:
  kubectl get pods -n loki
  Let all components reach the Running state before proceeding to the next stage.
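Rather than reading the pod list by eye, the check can be scripted. The following is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...); not_ready_pods is a hypothetical helper name introduced here.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# (columns: NAME READY STATUS ...) and prints the name of every pod that is
# not in the Running state or whose READY count (for example 0/1) is incomplete.
not_ready_pods() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Example usage (uncomment to run): poll until nothing is reported.
#   while [ -n "$(kubectl get pods -n loki --no-headers | not_ready_pods)" ]; do
#     sleep 5
#   done
```

Note that an empty result can also mean that kubectl itself failed, and that pods from finished jobs (status Completed) are reported as not ready, so treat the output as a hint and not as proof. The built-in kubectl wait --for=condition=Ready pods --all -n loki --timeout=10m is an alternative with similar caveats.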
Stage 2: Transitioning to distributed components
The final stage of the migration involves transitioning all traffic to the distributed components. This is done by scaling down the SSD components and swapping the deploymentMode to Distributed. To do this:
- Create a copy of values-migration.yaml and name it values-distributed.yaml:
  cp values-migration.yaml values-distributed.yaml
- Next, modify the following parameters: deploymentMode and the component replicas, based on the annotations below.
  ---
  loki:
    schemaConfig:
      configs:
        - from: "2024-04-01"
          store: tsdb
          object_store: s3
          schema: v13
          index:
            prefix: loki_index_
            period: 24h
    storage_config:
      aws:
        region: eu-central-1
        bucketnames: aws-chunks-bucket
        s3forcepathstyle: false
    ingester:
      chunk_encoding: snappy
      wal:
        flush_on_shutdown: true
    ruler:
      enable_api: true
      storage:
        type: s3
        s3:
          region: eu-central-1
          bucketnames: aws-ruler-bucket
          s3forcepathstyle: false
      alertmanager_url: http://prom:9093
    querier:
      max_concurrent: 4
    storage:
      type: s3
      bucketNames:
        chunks: "aws-chunks-bucket"
        ruler: "aws-ruler-bucket"
      s3:
        region: eu-central-1
  # Important: Make sure to change this to Distributed
  deploymentMode: Distributed
  # SSD
  # Scale down the SSD components
  backend:
    replicas: 0
  read:
    replicas: 0
  write:
    replicas: 0
  # Distributed Loki
  ingester:
    replicas: 3
    zoneAwareReplication:
      enabled: false
  querier:
    replicas: 3
    maxUnavailable: 0
  queryFrontend:
    replicas: 2
    maxUnavailable: 0
  queryScheduler:
    replicas: 2
  distributor:
    replicas: 2
    maxUnavailable: 0
  compactor:
    replicas: 1
  indexGateway:
    replicas: 2
    maxUnavailable: 0
  ruler:
    replicas: 1
    maxUnavailable: 0
  # Single binary Loki
  singleBinary:
    replicas: 0
  minio:
    enabled: false
  Here is a breakdown of the changes:
  - deploymentMode: Distributed: This allows the distributed components to run in isolation.
  - Scale down all SSD components to 0.
 
- Deploy the final configuration using the following command:
  helm upgrade --values values-distributed.yaml loki grafana/loki -n loki
- Once the deployment is complete, you can verify that all components are running using the following command:
  kubectl get pods -n loki
You should see all distributed components running, and the SSD components should have been removed.
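To confirm that no SSD pods are left, the pod list can be filtered by name. This is a minimal sketch that assumes the pods follow a naming pattern such as loki-write-0 or loki-read-..., which may not hold if you have overridden names in your values file; ssd_pods_remaining is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints pods whose names contain -read-, -write-, or -backend-.
# Assumes SSD pods are named like loki-write-0 (an assumption, not verified
# against your release name or overrides).
ssd_pods_remaining() {
  awk '$1 ~ /-(read|write|backend)-/ { print $1 }'
}

# Example usage (uncomment to run):
#   kubectl get pods -n loki --no-headers | ssd_pods_remaining
```

If the command prints nothing, no pods matching the SSD naming pattern remain in the namespace.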
What’s next?
Loki in distributed mode is inherently more complex than SSD mode. It is recommended to meta-monitor your Loki deployment to ensure that everything is running smoothly. You can do this by following the meta-monitoring guide.







