# Migrate from loki-distributed Helm chart
This guide will walk you through migrating to the `loki` Helm chart, v3.0 or higher, from the `loki-distributed` Helm chart (v0.63.2 at the time of writing). The process consists of deploying the new `loki` Helm chart alongside the existing `loki-distributed` installation. By joining the new cluster to the existing cluster's ring, you will create one large cluster. This will allow you to manually bring down the `loki-distributed` components in a safe way to avoid any data loss.
## Before you begin
We recommend having a Grafana instance available to monitor both the existing and new clusters, to make sure there is no data loss during the migration process. The `loki` chart ships with self-monitoring features, including dashboards, which are useful for monitoring the health of the new cluster as it spins up.
Start by updating your existing Grafana Agent or Promtail config (whatever is scraping logs from your environment) to exclude the new deployment. The new `loki` chart ships with its own self-monitoring mechanisms, and we want to make sure it's not scraped twice, which would produce duplicate logs. The best way to do this is via a relabel config that will drop logs from the new deployment, for example something like:
```yaml
- source_labels:
    - "__meta_kubernetes_pod_label_app_kubernetes_io_component"
  regex: "(canary|read|write)"
  action: "drop"
```

This leverages the fact that the new deployment adds an `app.kubernetes.io/component` label of `read` for the read pods, `write` for the write pods, and `canary` for the Loki Canary pods. These labels are not present in the `loki-distributed` deployment, so this rule should only match logs from the new deployment.
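For context, here is a minimal sketch of how that rule might sit inside a full Promtail `scrape_configs` entry. The job name `kubernetes-pods` is an illustrative assumption; use the scrape job(s) your existing config already defines:

```yaml
scrape_configs:
  - job_name: kubernetes-pods   # hypothetical job name, for illustration only
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Drop pods from the new loki chart so their logs are not ingested twice.
      - source_labels:
          - "__meta_kubernetes_pod_label_app_kubernetes_io_component"
        regex: "(canary|read|write)"
        action: "drop"
      # ...keep whatever other relabel rules your existing config already has.
```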
## To migrate from `loki-distributed` to `loki`
### Deploy the new Loki cluster
Next, you will deploy the new `loki` chart in the same namespace as your existing `loki-distributed` installation. Make sure to use the same buckets and storage configuration as your existing installation. You will need to set some special `migrate` values as well:

```yaml
migrate:
  fromDistributed:
    enabled: true
    memberlistService: loki-loki-distributed-memberlist
```

The value of `memberlistService` must be the Kubernetes service created for the purpose of memberlist DNS in the `loki-distributed` deployment. It should match the value of `memberlist.join_members` in the config of the `loki-distributed` deployment. This is what will make the new cluster join the existing cluster's ring. It is important to join all existing rings: if you are using different memberlist DNS for different rings, you must manually set the value of each applicable `join_members` configuration for each ring. If you are using the same memberlist DNS for all rings, as is the default in the `loki-distributed` chart, setting `migrate.memberlistService` should be enough.

Once the new cluster is up, add the appropriate data source in Grafana for the new cluster. Check that the following queries return results:
- Confirm new and old logs are in the new deployment. Using the new deployment's Loki data source in Grafana, look for:
  - Logs with a `job` label that is unique to your existing Promtail or Grafana Agent (the one you adjusted above to exclude logs from the new deployment), which is not yet pushing logs to the new deployment. If you can query those via the new deployment, it shows that no historical logs have been lost.
  - Logs with the label `job="loki/loki-read"`. The read component does not exist in `loki-distributed`, so this shows that the new Loki cluster's self-monitoring is working correctly.
- Confirm new logs are in the old deployment. Using the old deployment's Loki data source in Grafana, look for:
  - Logs with the label `job="loki/loki-read"`. Since you have excluded logs from the new deployment from going to the `loki-distributed` deployment, being able to query them through the `loki-distributed` Loki data source shows that the ingesters have joined the same ring and are queryable from the `loki-distributed` queriers.
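The self-monitoring check is a simple label-matcher query you can run in either data source. The `job` value below assumes the new chart was installed as `loki` in the `loki` namespace; adjust it if you used different names:

```logql
{job="loki/loki-read"}
```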
### Convert all clients to push logs to the new `loki` deployment

Assuming everything is working as expected, you can now modify the `clients` section of your Grafana Agent or Promtail configuration to push logs to the new deployment. After this change is made, the `loki-distributed` cluster will no longer receive new logs and can be carefully scaled down.

Once this change has deployed, query the new `loki` cluster's Loki data source for new logs to make sure they are still being ingested.

### Tear down the old Loki Canary
If you had previously deployed the canary via the `loki-canary` Helm chart, you can now tear it down. The new chart ships the canary by default and is automatically configured to scrape it.

### Update the flush config on the `loki-distributed` deployment

You are almost ready to start scaling down the old `loki-distributed` cluster. Before you start, however, it is important to make sure the `loki-distributed` ingesters are configured to flush on shutdown. Since these ingesters will not be coming back, there will be no opportunity for them to replay their WALs, so they need to flush all in-memory logs before shutting down to prevent data loss.

The easiest way to do this is via the `extraArgs` argument in the `ingester` section of the `loki-distributed` chart. You may also want to set the ingester's log level to `debug` to see messages about the flushing process:

```yaml
ingester:
  replicas: 3
  extraArgs:
    - '-ingester.flush-on-shutdown=true'
    - '-log.level=debug'
```

Deploy this change, and make sure all ingesters restart and are running the latest configuration.

### Scale down the ingesters one at a time
It is now time to start scaling down `loki-distributed`. Scale down the ingester StatefulSet or Deployment (depending on how your `loki-distributed` chart is deployed) one replica at a time. If `debug` logs were enabled, you can monitor the logs of each ingester as it terminates to make sure the flushing process was successful. Once the ingester pod is fully terminated, decrement by another replica. Continue until there are zero instances of the ingester running.

You can use the following command to edit the ingester StatefulSet in order to change the number of replicas (making sure to replace `<NAMESPACE>` with the correct namespace):

```bash
kubectl -n <NAMESPACE> edit statefulsets.apps loki-distributed-ingester
```

### Remove the `loki-distributed` cluster

Double-check that logs are still coming in to the new cluster. If something is wrong, it will be much easier to quickly scale the `loki-distributed` ingesters back up before tearing down the whole cluster, so you can investigate. If everything looks good, you can tear down `loki-distributed` using `helm uninstall`. For example:

```bash
helm uninstall -n loki loki-distributed
```

Now edit the new `loki` cluster to remove the `migrate` options you added when first installing. Remove all of the following from your `values.yaml`:

```yaml
migrate:
  fromDistributed:
    enabled: true
    memberlistService: loki-loki-distributed-memberlist
```

To apply the new configuration (assuming you installed the new chart as `loki` in the `loki` namespace):

```bash
helm upgrade -n loki loki grafana/loki --values values.yaml
```

The `migrate.fromDistributed.memberlistService` was added as an additional memberlist join member, so once this new config is pushed, the components should roll without interruption.

### Check the dashboards
Now that the migration is finished, you can check the dashboards included with the new `loki` chart to make sure everything is working as expected. You can also check the Loki Canary metrics to make sure nothing was lost during the migration. Assuming everything was in the `loki` namespace, the following query, if run over a time period that starts before the migration and ends after it was complete, should clearly illustrate when metrics started coming from the new canary, and whether either canary detected missing data during the process:

```logql
(
  sum(increase(loki_canary_missing_entries_total{namespace="loki"}[$__range])) by (job)
  /
  sum(increase(loki_canary_entries_total{namespace="loki"}[$__range])) by (job)
) * 100
```
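If you also want to see when the new canary's series first appeared, you can graph the raw entry counts per canary job over the same range. This uses the same `loki_canary_entries_total` metric and again assumes everything ran in the `loki` namespace:

```logql
sum(increase(loki_canary_entries_total{namespace="loki"}[$__range])) by (job)
```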



