Migrating from Thanos or Prometheus to Grafana Mimir

This document guides an operator through the process of migrating a deployment of Thanos or Prometheus to Grafana Mimir.

Overview

Grafana Mimir stores series in TSDB blocks uploaded in an object storage bucket. These blocks are the same as those used by Prometheus and Thanos. Each project stores blocks in different places and uses slightly different block metadata files.

Configuring remote write to Grafana Mimir

For configuration of remote write to Grafana Mimir, refer to Configuring Prometheus remote write.
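
For reference, a minimal remote_write stanza in prometheus.yml pointing at Mimir's push endpoint looks like the following sketch. The hostname is a placeholder, and the X-Scope-OrgID header is only needed when multi-tenancy is enabled; adjust for your authentication setup:

remote_write:
  - url: http://<mimir-hostname>/api/v1/push
    headers:
      X-Scope-OrgID: <tenant>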

Uploading historic TSDB blocks to Grafana Mimir

Grafana Mimir supports uploading historic TSDB blocks, notably from Prometheus. To enable this functionality, either for all tenants or for a specific tenant, refer to Configure TSDB block upload.

Prometheus stores TSDB blocks in the path specified in the --storage.tsdb.path flag.

To find all block directories in the TSDB <STORAGE TSDB PATH>, run the following command:

find <STORAGE TSDB PATH> -name chunks -exec dirname {} \;

Grafana Mimir supports multiple tenants and stores blocks per tenant. With multi-tenancy disabled, there is a single tenant called anonymous.

Use Grafana mimirtool to upload each block identified by the previous command to Grafana Mimir:

mimirtool backfill --address=http://<mimir-hostname> --id=<tenant> <block1> <block2>...

Note: If you need to authenticate against Grafana Mimir, you can provide an API key via the --key flag, for example --key=$(cat token.txt).
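
To upload every block found by the earlier find command in one pass, you can combine the two commands; a sketch, assuming the same placeholders as above:

mimirtool backfill --address=http://<mimir-hostname> --id=<tenant> \
    $(find <STORAGE TSDB PATH> -name chunks -exec dirname {} \;)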

Grafana Mimir performs some sanitization and validation of each block’s metadata. As a result, it rejects Thanos blocks due to unsupported labels. As a workaround, if you need to upload Thanos blocks, upload the blocks directly to the Grafana Mimir blocks bucket, prefixed by <tenant>/<block ID>/.
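
After a direct upload, the objects in the Mimir bucket follow the standard TSDB block layout under the tenant prefix, for example:

<MIMIR-BUCKET>/<tenant>/<block ID>/meta.json
<MIMIR-BUCKET>/<tenant>/<block ID>/index
<MIMIR-BUCKET>/<tenant>/<block ID>/chunks/000001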

Block metadata

Each block has a meta.json metadata file that is used by Grafana Mimir, Prometheus, and Thanos to identify the block contents. Each project has its own metadata conventions.

In the Grafana Mimir 2.1 (or earlier) release, the ingesters added an external label to the meta.json file to identify the tenant that owns the block.

In the Grafana Mimir 2.2 (or later) release, blocks no longer have a label that identifies the tenant.

Note: Blocks from Prometheus do not have any external labels stored in them; only blocks from Thanos use labels.
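
For illustration, a trimmed meta.json from a Thanos block might contain a thanos section like the following; the label names and values are examples only:

{
  "ulid": "01GEGWPME2187SVFH63G8DH7KH",
  "thanos": {
    "labels": {
      "cluster": "prod-cluster",
      "prometheus_replica": "prometheus-0"
    },
    "downsample": {
      "resolution": 0
    },
    "source": "sidecar"
  }
}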

Considerations on Thanos-specific features

Thanos requires that Prometheus is configured with external labels. When the Thanos sidecar uploads blocks, it includes the external labels from Prometheus in the meta.json file inside the block. When you query the block, Thanos injects Prometheus' external labels into the series returned in the query result. Thanos also uses labels for the deduplication of replicated data.

If you want to use existing Thanos blocks with Grafana Mimir, there are some considerations:

Grafana Mimir doesn't inject external labels into query results. This means that blocks originally created by Thanos will not include their external labels in the results when queried through Grafana Mimir. If you need external labels in your query results, this is currently not possible to achieve in Grafana Mimir.

Grafana Mimir does not respect deduplication labels configured in Thanos when querying the blocks. For best query performance, only upload Thanos blocks from a single Prometheus replica of each HA pair. If you upload blocks from both replicas, the query results returned by Mimir will include samples from both replicas.

Grafana Mimir does not support Thanos' downsampling feature. To guarantee correct query results, upload only original (raw) Thanos blocks to Mimir's storage. If you also upload blocks with downsampled data (that is, blocks with a non-zero Resolution field in the meta.json file), Grafana Mimir merges raw samples and downsampled samples together at query time, which can produce incorrect query results.

Migrate historic TSDB blocks from Thanos to Grafana Mimir

  1. Copy the blocks from Thanos's bucket to an intermediate bucket.

    Create an intermediate object storage bucket (such as Amazon S3 or GCS) within your cloud provider, where you can copy the historical blocks and work on them before uploading them to the Mimir bucket.

    Tip: Run the commands within a screen or tmux session to avoid interruptions, because the steps can take a long time depending on the amount of data being processed.

    For Amazon S3, use the aws tool:

    aws s3 cp --recursive s3://<THANOS-BUCKET> s3://<INTERMEDIATE-MIMIR-BUCKET>/
    

    For Google Cloud Storage (GCS), use the gsutil tool:

    gsutil -m cp -r gs://<THANOS-BUCKET>/* gs://<INTERMEDIATE-MIMIR-BUCKET>/
    

    After the copy process completes, inspect the blocks in the bucket to make sure that they are valid from a Thanos perspective.

    thanos tools bucket inspect \
        --objstore.config-file bucket.yaml
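
    Here, bucket.yaml is a Thanos object storage configuration file for the intermediate bucket. A minimal GCS example; S3 and other providers use the same type/config structure:

    type: GCS
    config:
      bucket: <INTERMEDIATE-MIMIR-BUCKET>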
    
  2. Remove the downsampled blocks.

    Mimir doesn't understand the downsampled blocks from Thanos, that is, blocks with a non-zero Resolution field in the meta.json file. Therefore, you need to remove the 5m and 1h downsampled blocks from this bucket.

    Mark the downsampled blocks for deletion:

    thanos tools bucket retention \
        --objstore.config-file bucket.yaml \
        --retention.resolution-1h=1s \
        --retention.resolution-5m=1s \
        --retention.resolution-raw=0s
    

    Clean up the blocks marked for deletion:

    thanos tools bucket cleanup \
        --objstore.config-file bucket.yaml \
        --delete-delay=0
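
    To verify that no downsampled blocks remain, you can check the resolution recorded in each block's meta.json. The following sketch is for GCS, mirroring the gsutil and jq approach used later in this guide; raw blocks have a thanos.downsample.resolution of 0:

    for FILE in $(gsutil ls "gs://<INTERMEDIATE-MIMIR-BUCKET>/*/meta.json"); do
        RES=$(gsutil cat "$FILE" | jq .thanos.downsample.resolution)
        [ "$RES" != "0" ] && echo "Still downsampled: $FILE (resolution $RES)"
    done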
    
  3. Remove the duplicated blocks.

    If two Prometheus replicas are deployed for high availability, upload the blocks from only one of the replicas and drop the blocks from the other.

    # Get list of all blocks in the bucket
    thanos tools bucket inspect \
        --objstore.config-file bucket.yaml \
        --output=tsv > blocks.tsv
    
    # Find blocks from replica that we will drop
    cat blocks.tsv | grep prometheus_replica=<PROMETHEUS-REPLICA-TO-DROP> \
        | awk '{print $1}' > blocks_to_drop.tsv
    
    # Mark found blocks for deletion
    for ID in $(cat blocks_to_drop.tsv)
    do
        thanos tools bucket mark \
           --marker="deletion-mark.json" \
           --objstore.config-file bucket.yaml \
           --details="Removed as duplicate" \
           --id $ID
    done
    

    Note: Replace prometheus_replica with the label that differentiates Prometheus replicas in your setup.

    Clean up the duplicate blocks marked for deletion again:

    thanos tools bucket cleanup \
        --objstore.config-file bucket.yaml \
        --delete-delay=0
    

    Tip: If you want to visualize exactly what is happening in the blocks, with respect to the source of blocks, external labels, compaction levels, and more, you can use the following command to get the output as CSV and import it into a Google spreadsheet:

    thanos tools bucket inspect \
        --objstore.config-file bucket-prod.yaml \
        --output=csv > thanos-blocks.csv
    
  4. Relabel the blocks with external labels.

    Mimir doesn’t inject external labels from the meta.json file into query results. Therefore, you need to relabel the blocks with the required external labels in the meta.json file.

    Note: You can get the external labels from each block's meta.json in the CSV file exported in the preceding tip, and build the rewrite configuration accordingly.

    Create a rewrite configuration that is similar to this:

    # relabel-config.yaml
    - action: replace
      target_label: "<LABEL-KEY>"
      replacement: "<LABEL-VALUE>"
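
    For example, to keep a Thanos external label cluster=prod-cluster in the stored series (these are the values that appear in the sample log output later in this step), the configuration would be:

    # relabel-config.yaml
    - action: replace
      target_label: "cluster"
      replacement: "prod-cluster"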
    

    Perform a dry run of the rewrite to confirm that everything works as expected:

    # Get the list of remaining block IDs after removing the duplicated and downsampled blocks.
    # awk extracts the block ID column and skips the header row of the TSV output.
    thanos tools bucket inspect \
        --objstore.config-file bucket.yaml \
        --output=tsv \
        | awk 'NR>1 {print $1}' > blocks-to-rewrite.tsv
    
    # Check if rewrite of the blocks with external labels is working as expected.
    for ID in $(cat blocks-to-rewrite.tsv)
    do
        thanos tools bucket rewrite \
            --objstore.config-file bucket.yaml \
            --rewrite.to-relabel-config-file relabel-config.yaml \
            --dry-run \
            --id $ID
    done
    

    After you confirm that the rewrite is working as expected via --dry-run, apply the changes with the --no-dry-run flag. Remember to include --delete-blocks, otherwise the original blocks will not be marked for deletion.

    # Rewrite the blocks with external labels and mark the original blocks for deletion.
    for ID in $(cat blocks-to-rewrite.tsv)
    do
        thanos tools bucket rewrite  \
            --objstore.config-file bucket.yaml \
            --rewrite.to-relabel-config-file relabel-config.yaml \
            --no-dry-run \
            --delete-blocks \
            --id $ID
    done
    

    The output of relabeling each block looks similar to the following:

    level=info ts=2022-10-10T13:03:32.032820262Z caller=factory.go:50 msg="loading bucket configuration"
    level=info ts=2022-10-10T13:03:32.516953867Z caller=tools_bucket.go:1160 msg="downloading block" source=01GEGWPME2187SVFH63G8DH7KH
    level=info ts=2022-10-10T13:03:35.825009556Z caller=tools_bucket.go:1197 msg="changelog will be available" file=/tmp/thanos-rewrite/01GF0ZWPWGEPHG5NV79NH9KMPV/change.log
    level=info ts=2022-10-10T13:03:35.836953593Z caller=tools_bucket.go:1212 msg="starting rewrite for block" source=01GEGWPME2187SVFH63G8DH7KH new=01GF0ZWPWGEPHG5NV79NH9KMPV toDelete= toRelabel="- action: replace\n  target_label: \"cluster\"\n  replacement: \"prod-cluster\"\n"
    level=info ts=2022-10-10T13:04:47.57624244Z caller=compactor.go:42 msg="processed 10.00% of 701243 series"
    level=info ts=2022-10-10T13:04:53.4046885Z caller=compactor.go:42 msg="processed 20.00% of 701243 series"
    level=info ts=2022-10-10T13:04:59.649337602Z caller=compactor.go:42 msg="processed 30.00% of 701243 series"
    level=info ts=2022-10-10T13:05:02.986219042Z caller=compactor.go:42 msg="processed 40.00% of 701243 series"
    level=info ts=2022-10-10T13:05:05.990498497Z caller=compactor.go:42 msg="processed 50.00% of 701243 series"
    level=info ts=2022-10-10T13:05:09.349918024Z caller=compactor.go:42 msg="processed 60.00% of 701243 series"
    level=info ts=2022-10-10T13:05:12.040895624Z caller=compactor.go:42 msg="processed 70.00% of 701243 series"
    level=info ts=2022-10-10T13:05:15.253899238Z caller=compactor.go:42 msg="processed 80.00% of 701243 series"
    level=info ts=2022-10-10T13:05:18.471471014Z caller=compactor.go:42 msg="processed 90.00% of 701243 series"
    level=info ts=2022-10-10T13:05:21.536267363Z caller=compactor.go:42 msg="processed 100.00% of 701243 series"
    level=info ts=2022-10-10T13:05:21.536466158Z caller=tools_bucket.go:1222 msg="wrote new block after modifications; flushing" source=01GEGWPME2187SVFH63G8DH7KH new=01GF0ZWPWGEPHG5NV79NH9KMPV
    level=info ts=2022-10-10T13:05:28.675240198Z caller=tools_bucket.go:1231 msg="uploading new block" source=01GEGWPME2187SVFH63G8DH7KH new=01GF0ZWPWGEPHG5NV79NH9KMPV
    level=info ts=2022-10-10T13:05:38.922348564Z caller=tools_bucket.go:1241 msg=uploaded source=01GEGWPME2187SVFH63G8DH7KH new=01GF0ZWPWGEPHG5NV79NH9KMPV
    level=info ts=2022-10-10T13:05:38.979696873Z caller=block.go:203 msg="block has been marked for deletion" block=01GEGWPME2187SVFH63G8DH7KH
    level=info ts=2022-10-10T13:05:38.979832767Z caller=tools_bucket.go:1249 msg="rewrite done" IDs=01GEGWPME2187SVFH63G8DH7KH
    level=info ts=2022-10-10T13:05:38.980197796Z caller=main.go:161 msg=exiting
    

    Clean up the original blocks that are marked for deletion:

    thanos tools bucket cleanup \
        --objstore.config-file bucket.yaml \
        --delete-delay=0
    

    Note: If there are multiple Prometheus clusters, relabeling them in parallel can speed up the entire process. Get the list of blocks for each cluster and process each list separately:

    thanos tools bucket inspect \
        --objstore.config-file bucket.yaml \
        --output=tsv \
        | grep <PROMETHEUS-CLUSTER-NAME> \
        | awk '{print $1}' > prod-blocks.tsv
    
    for ID in $(cat prod-blocks.tsv)
    do
        thanos tools bucket rewrite  \
           --objstore.config-file bucket.yaml \
           --rewrite.to-relabel-config-file relabel-config.yaml \
           --delete-blocks \
           --no-dry-run \
           --id $ID
    done
    
  5. Remove external labels from meta.json.

    The Mimir compactor cannot compact blocks that have external labels together with Mimir's own blocks, which don't have any such labels in their meta.json. Therefore, remove these external labels before copying the blocks to the Mimir bucket.

    Use the following script to remove the labels from each meta.json file. The script works for GCS buckets; an S3 variant follows after it.

    #!/bin/bash
    
    BUCKET="<INTERMEDIATE-MIMIR-BUCKET>"
    
    echo "Fetching list of meta.json files (this can take a while if there are many blocks)"
    gsutil ls "gs://$BUCKET/*/meta.json" > meta-files.txt
    
    echo "Processing meta.json files"
    for FILE in $(cat meta-files.txt); do
       echo "Removing Thanos labels from $FILE"
       ORIG_META_JSON=$(gsutil cat "$FILE")
       UPDATED_META_JSON=$(echo "$ORIG_META_JSON" | jq "del(.thanos.labels)")
    
       if ! diff -u <( echo "$ORIG_META_JSON" | jq . ) <( echo "$UPDATED_META_JSON" | jq .) > /dev/null; then
          echo "Backing up $FILE to $FILE.orig"
          gsutil cp "$FILE" "$FILE.orig"
          echo "Uploading modified $FILE"
          echo "$UPDATED_META_JSON" | gsutil cp - "$FILE"
       else
          echo "No diff for $FILE"
       fi
    done
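
    If your intermediate bucket is on Amazon S3, the following is an equivalent sketch using the aws CLI and jq. The key extraction assumes the default aws s3 ls output format, where the object key is the fourth column:

    #!/bin/bash

    BUCKET="<INTERMEDIATE-MIMIR-BUCKET>"

    echo "Fetching list of meta.json files (this can take a while if there are many blocks)"
    aws s3 ls "s3://$BUCKET/" --recursive | awk '{print $4}' | grep '/meta.json$' > meta-files.txt

    echo "Processing meta.json files"
    for KEY in $(cat meta-files.txt); do
       echo "Removing Thanos labels from $KEY"
       ORIG_META_JSON=$(aws s3 cp "s3://$BUCKET/$KEY" -)
       UPDATED_META_JSON=$(echo "$ORIG_META_JSON" | jq "del(.thanos.labels)")

       if ! diff -u <( echo "$ORIG_META_JSON" | jq . ) <( echo "$UPDATED_META_JSON" | jq . ) > /dev/null; then
          echo "Backing up $KEY to $KEY.orig"
          aws s3 cp "s3://$BUCKET/$KEY" "s3://$BUCKET/$KEY.orig"
          echo "Uploading modified $KEY"
          echo "$UPDATED_META_JSON" | aws s3 cp - "s3://$BUCKET/$KEY"
       else
          echo "No diff for $KEY"
       fi
    done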
    
  6. Copy the blocks from the intermediate bucket to the Mimir bucket.

    For Amazon S3, use the aws tool:

    aws s3 cp --recursive s3://<INTERMEDIATE-MIMIR-BUCKET> s3://<MIMIR-BUCKET>/<TENANT>/
    

    For Google Cloud Storage (GCS), use the gsutil tool:

    gsutil -m cp -r gs://<INTERMEDIATE-MIMIR-BUCKET>/* gs://<MIMIR-BUCKET>/<TENANT>/
    

    Historical blocks are not available for querying immediately after they are uploaded, because the bucket index with the list of all available blocks first needs to be updated by the compactor. The compactor typically performs such an update every 15 minutes. After an update completes, other components such as the querier and store-gateway can work with the historical blocks, and the blocks become available for querying through Grafana.
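
    After the bucket index has been updated, you can confirm that the historical data is queryable through Mimir's Prometheus-compatible HTTP API. A sketch, assuming the default /prometheus HTTP prefix, multi-tenancy enabled, and a timestamp that falls inside the backfilled range:

    curl -H "X-Scope-OrgID: <TENANT>" \
        "http://<mimir-hostname>/prometheus/api/v1/query?query=up&time=<UNIX-TIMESTAMP>"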

  7. Check the store-gateway HTTP endpoint at http://<STORE-GATEWAY-ENDPOINT>/store-gateway/tenant/<TENANT-NAME>/blocks to verify that the uploaded blocks are listed.
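
    From the command line, assuming direct access to the store-gateway, you can fetch the same page and search for a specific block ULID (the <BLOCK-ID> placeholder is one of the uploaded block IDs):

    curl -s http://<STORE-GATEWAY-ENDPOINT>/store-gateway/tenant/<TENANT-NAME>/blocks \
        | grep <BLOCK-ID>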