
Migrating a Kube-Prometheus Helm stack to Grafana Cloud

Introduction

In this guide you’ll learn how to:

  • Install the Kube-Prometheus stack Helm chart into a Kubernetes (K8s) cluster using the Helm package manager
  • Configure your local Prometheus instance to ship metrics to Grafana Cloud using remote_write
  • Import the Kube-Prometheus Grafana Dashboards into your managed Grafana instance
  • Import the Kube-Prometheus recording and alerting rules into your Cloud Prometheus instance (optional)
  • Limit which metrics you ship from your local cluster to reduce your active series usage (optional)
  • Turn off local stack components like Grafana and Alertmanager (optional)
  • Enable multi-cluster support for the Kube-Prometheus rules and dashboards (optional)

By the end of this guide, you’ll have set up the Kube-Prometheus stack in your K8s cluster and configured it to ship its core set of metrics to Grafana Cloud for long-term storage, querying, visualization, and alerting. You’ll also have migrated the stack’s core assets (dashboards, recording rules, and alerting rules) to Grafana Cloud to leverage its scalability, availability, and efficient performance and reduce load on your local Prometheus instances.

Note: You may also wish to ship metrics to Grafana Cloud using the Grafana Agent, a lightweight telemetry collector based on Prometheus that only performs the scraping and remote_write functions. To get started with the Grafana Agent and Cloud, please see the Kubernetes Integration, which is available from the walkthrough in your hosted Grafana instance. This integration bundles a set of prebuilt dashboards and preconfigured K8s manifests to deploy the Agent into your cluster(s). You can find additional deployment manifests for the Agent in its GitHub repository, which also contains the Agent documentation. The Agent Operator (beta) can also help you get up and running with the Agent and Cloud.

Prerequisites

Before you begin you should have the following available:

  • A Kubernetes cluster with role-based access control (RBAC) enabled.
  • A Grafana Cloud Pro account or trial. To create an account, please see Grafana Cloud and click on Start for free. A Pro tier account is necessary due to the number of dashboards, rules, and metrics imported from Kube-Prometheus. You can use a Free tier account with this guide if you import 10 or fewer dashboards, 100 or fewer rules, and keep your metrics usage under 10,000 active series.
  • The kubectl command-line tool installed on your local machine, configured to connect to your cluster. You can read more about installing kubectl in the official documentation.
  • The helm K8s package manager installed on your local machine. To learn how to install Helm, please see Installing Helm.

Step 1: Install the Kube-Prometheus stack into your cluster

In this step, you’ll use Helm to install the Kube-Prometheus stack into your K8s cluster.

The Kube-Prometheus stack installs the following observability components:

  • The Prometheus Operator
  • Prometheus
  • Alertmanager
  • Grafana
  • kube-state-metrics
  • node-exporter

In addition, Helm and Kube-Prometheus preconfigure these components to scrape several endpoints in your cluster by default, like the cadvisor, kubelet, and node-exporter /metrics endpoints on K8s Nodes, the K8s API server metrics endpoint, and kube-state-metrics endpoints, among others. To see a full list of configured scrape targets, please see the Kube-Prometheus Helm chart’s values.yaml. You can find scrape targets by searching for serviceMonitor objects. Configuring the Kube-Prometheus stack’s scrape targets goes beyond the scope of this guide, but to learn more, please see the ServiceMonitor spec in the Prometheus Operator GitHub repo.
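For illustration, a scrape target is declared with a ServiceMonitor object along the lines of the following sketch (the app name, port name, and label values here are hypothetical; the stack's actual ServiceMonitors are templated by the chart):

```yaml
# Hypothetical ServiceMonitor: scrape /metrics from Services labeled app: my-app
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    release: foo            # label the Operator's selector matches on, by default the Helm release name
spec:
  selector:
    matchLabels:
      app: my-app           # select Services carrying this label
  endpoints:
  - port: http-metrics      # name of the Service port exposing /metrics
    interval: 30s
```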

The Kube-Prometheus stack also provisions several monitoring mixins. A mixin is a collection of prebuilt Grafana dashboards, Prometheus recording rules, and Prometheus alerting rules.

Mixins are written in Jsonnet, a data templating language, and generate JSON dashboard files and rules YAML files. Configuring and modifying the underlying mixins goes beyond the scope of this guide; they are imported as-is into Grafana Cloud. To learn more, please see Generate config files from the monitoring-mixins repo and Grizzly, a tool for working with Jsonnet-defined assets against the Grafana Cloud API. Note that Grizzly is currently in alpha.

To begin, add the prometheus-community Helm repo and update Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Next, install the kube-prometheus-stack chart using Helm:

helm install foo prometheus-community/kube-prometheus-stack

Replace foo with your desired release name.

Note that this command installs the Kube-Prometheus stack into the default Namespace. To modify this, use a values.yaml file to override the defaults or pass in a --set flag. To learn more, please see Values Files from the Helm docs.
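For example, to install the release into a dedicated monitoring Namespace instead (the Namespace name here is illustrative):

```shell
# Install the chart into a "monitoring" Namespace, creating it if necessary
helm install foo prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```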

Once Helm has finished installing the chart, you should see the following:

NAME: foo
LAST DEPLOYED: Fri Jun 25 15:30:30 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace default get pods -l "release=foo"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.

You can use kubectl to inspect what’s been installed into the cluster:

kubectl get pod
alertmanager-foo-kube-prometheus-stack-alertmanager-0   2/2     Running   0          7m3s
foo-grafana-8547c9db6-vp8pf                             2/2     Running   0          7m6s
foo-kube-prometheus-stack-operator-6888bf88f9-26c42     1/1     Running   0          7m6s
foo-kube-state-metrics-76fbc7d6ff-vj872                 1/1     Running   0          7m6s
foo-prometheus-node-exporter-8qbrz                      1/1     Running   0          7m6s
foo-prometheus-node-exporter-d4dk4                      1/1     Running   0          7m6s
foo-prometheus-node-exporter-xplv4                      1/1     Running   0          7m6s
prometheus-foo-kube-prometheus-stack-prometheus-0       2/2     Running   1          7m3s

Here we observe that Alertmanager, Grafana, Prometheus Operator, kube-state-metrics, node-exporter, and Prometheus are all running in our cluster. In addition to these Pods, the stack installed several CRDs (K8s Custom Resource Definitions). To see these, run kubectl get crd.

To access your Prometheus instance, use the kubectl port-forward command to forward a local port into the cluster:

kubectl port-forward svc/foo-kube-prometheus-stack-prometheus 9090

Be sure to replace foo-kube-prometheus-stack-prometheus with the appropriate service name.

In your browser, visit http://localhost:9090. You should see the Prometheus web interface. Click on Status and then Targets to see a list of preconfigured scrape targets.

You can use a similar procedure to access the Grafana and Alertmanager web interfaces.
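For example, assuming the default Service names generated for the foo release (verify yours with kubectl get svc, and run each command in a separate terminal):

```shell
# Grafana's Service listens on port 80; Alertmanager's on 9093
kubectl port-forward svc/foo-grafana 8080:80
kubectl port-forward svc/foo-kube-prometheus-stack-alertmanager 9093
```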

Now that you’ve installed the stack in your cluster, you can begin shipping scraped metrics to Grafana Cloud.

Step 2: Ship metrics to Grafana Cloud

In this step you’ll configure Prometheus to ship scraped metrics to Grafana Cloud.

Active Series Warning: Shipping your K8s Prometheus metrics to Grafana Cloud using remote_write can result in a significant increase in your active series usage and monthly bill. To estimate the number of series you’ll ship, head to the Prometheus web UI in your cluster, then click on Status, and TSDB Status to see your Prometheus instance’s stats. Number of series describes the rough number of active series you’ll be shipping over to Grafana Cloud. In a later step we’ll configure Prometheus to drop many of these to control our active series usage. Since you are only billed at the 95th percentile of active series usage, temporary spikes should not result in any cost increase. To learn more, please see 95th percentile billing from the Grafana Cloud docs.
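You can also read this figure straight from PromQL; prometheus_tsdb_head_series is a standard Prometheus TSDB gauge:

```promql
# Number of series currently in the TSDB head block (roughly, active series)
prometheus_tsdb_head_series
```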

We’ll configure Prometheus using the remoteWrite configuration section of the Helm chart’s values.yaml file. We’ll then update the release using helm upgrade.

Begin by creating a Kubernetes Secret to store your Grafana Cloud Prometheus username and password.

You can find your username by navigating to your stack in the Cloud Portal and clicking Details next to the Prometheus panel.

Your password corresponds to an API key that you can generate by clicking on Generate now in this same panel. To learn how to create a Grafana Cloud API key, please see Create a Grafana Cloud API key.

Once you’ve noted your Cloud Prometheus username and password, create the Kubernetes Secret. You can create a Secret by using a manifest file or create it directly using kubectl. In this guide we’ll create it directly using kubectl. To learn more about Kubernetes Secrets, please consult Secrets from the Kubernetes docs.

Run the following command to create a Secret called kubepromsecret:

kubectl create secret generic kubepromsecret \
  --from-literal=username=<your_grafana_cloud_prometheus_username> \
  --from-literal=password='<your_grafana_cloud_API_key>' \
  -n default

If you deployed your monitoring stack in a namespace other than default, change the -n default flag to the appropriate namespace in the above command. To learn more about this command, please see Managing Secrets using kubectl from the official Kubernetes docs.
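Alternatively, if you prefer the manifest approach mentioned above, an equivalent Secret looks like this sketch (placeholders included; the stringData field spares you from base64-encoding the values yourself):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kubepromsecret
  namespace: default
type: Opaque
stringData:
  username: "<your_grafana_cloud_prometheus_username>"
  password: "<your_grafana_cloud_API_key>"
```

Apply it with kubectl apply -f secret.yaml.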

Now that you’ve created a Secret to store your Grafana Cloud credentials, you can move on to modifying Prometheus’s configuration using a Helm values file. A Helm values file allows you to set configuration variables that are passed in to Helm’s chart templates. To see the default values file for Kube-Prometheus stack, consult values.yaml from the Kube-Prometheus stack’s GitHub repository.

We’ll create a values.yaml file defining Prometheus’s remote_write configuration, and then apply the new configuration to the Kube-Prometheus release.

Open a file named values.yaml in your favorite editor. Paste in the following values:

prometheus:
  prometheusSpec:
    remoteWrite:
    - url: "<Your Cloud Prometheus instance remote_write endpoint>"
      basicAuth:
        username:
          name: kubepromsecret
          key: username
        password:
          name: kubepromsecret
          key: password
    replicaExternalLabelName: "__replica__"
    externalLabels: {cluster: "test"}

Here we set the remote_write URL and basic_auth username and password using the Secret created in the previous step.

We also configure two additional parameters: replicaExternalLabelName and externalLabels. Replace test with an appropriate name for your K8s cluster. Prometheus will add the cluster: test and __replica__: prometheus-foo-kube-prometheus-stack-prometheus-0 labels to any samples shipped to Grafana Cloud.

Configuring these parameters enables automatic metric deduplication in Grafana Cloud so that you can spin up additional Prometheus instances in a high-availability configuration without storing duplicate samples in your Grafana Cloud Prometheus instance. To learn more, please see Sending data from multiple high-availability Prometheus instances.

If you’re shipping data from multiple K8s clusters, setting the cluster external label also identifies the source cluster, letting you take advantage of multi-cluster support in many of the Kube-Prometheus dashboards, recording rules, and alerting rules.

When you’re done editing the file, save and close it.

Roll out the changes with helm upgrade:

helm upgrade -f values.yaml your_release_name prometheus-community/kube-prometheus-stack

Replace your_release_name with the name of the release you used to install Kube-Prometheus. You can get a list of installed releases using helm list.

Once the changes have been rolled out, use port-forward to navigate to the Prometheus UI:

kubectl port-forward svc/foo-kube-prometheus-stack-prometheus 9090

Navigate to http://localhost:9090 in your browser, and then Status and Configuration. Verify that the remote_write block you appended above has propagated to your running Prometheus instance.

Finally, log in to your managed Grafana instance to begin querying your cluster data. You can use the Billing/Usage dashboard to inspect incoming data rates in the last 5 minutes to confirm the flow of data to Grafana Cloud. To learn more about the difference between Active Series and DPM, please see What are active series and DPM.
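You can also confirm delivery from the cluster side: in your local Prometheus UI, query the remote-write counters (exact metric names vary slightly across Prometheus versions):

```promql
# Per-second rate of samples sent to the remote_write endpoint
rate(prometheus_remote_storage_samples_total[5m])
```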

Now that you’re shipping metrics to Grafana Cloud and have configured the appropriate external labels, you’re ready to import your Kube-Prometheus dashboards into your hosted Grafana instance.

Step 3: Import Dashboards

In this step you’ll import the prebuilt Kube-Prometheus dashboards from your local Grafana instance into your managed Grafana instance.

Note: To learn how to enable multi-cluster support for Kube-Prometheus dashboards, please see Enabling multi-cluster support (optional).

This quickstart uses Grafana’s HTTP API to bulk export and import dashboards, which you can also do using Grafana’s Web UI. We’ll use a lightweight bash script to perform the dump and load. Note that the script does not preserve folder hierarchy and naively downloads all dashboards from a source Grafana instance and uploads them to a target Grafana instance.

To begin, navigate to Exporting and importing dashboards to hosted Grafana using the HTTP API and save the bash script into a file called dash_migrate.sh.

Create a temporary directory called temp_dir:

mkdir temp_dir

Make the script executable:

chmod +x dash_migrate.sh

Next, forward a local port to the Grafana service running in your cluster:

kubectl port-forward svc/foo-grafana 8080:80

Replace foo-grafana with the name of the Grafana service. You can find this using kubectl get svc.

With a port forwarded, log in to your Grafana instance by visiting http://localhost:8080 and entering admin as the username and the value configured for the adminPassword parameter. If you did not modify this value, you can find the default in the values.yaml file.

Create an API key by clicking on the cog in the left-hand navigation menu, and then API keys.

Note down the API key and local Grafana URL, and fill in the variables at the top of the bash script with the appropriate values:

SOURCE_GRAFANA_ENDPOINT='http://localhost:8080'
SOURCE_GRAFANA_API_KEY='your_api_key_here'
. . .

Repeat this process for your hosted Grafana instance, which you can access by navigating to the Cloud Portal. Click on Details next to your stack, and then Log In in the Grafana card. Ensure the API key has the Admin role. Once you’ve noted the endpoint URL and API key, modify the remaining values in the bash script:

. . .
DEST_GRAFANA_API_KEY='your_hosted_grafana_api_key_here'
DEST_GRAFANA_ENDPOINT='https://your_stack_name.grafana.net'
TEMP_DIR=temp_dir

Save and close the file.

Run the script:

./dash_migrate.sh -ei

The -e flag exports all dashboards from the source Grafana and saves them in temp_dir, and the -i flag imports the dashboards in temp_dir into the destination Grafana instance.
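Under the hood, a script like this only needs three Grafana HTTP API endpoints. The following is a rough sketch of the equivalent calls, not the script itself; it assumes the variables from the script above are set and that jq is installed:

```shell
# Export: list dashboard UIDs, then fetch each dashboard's JSON
curl -s -H "Authorization: Bearer $SOURCE_GRAFANA_API_KEY" \
  "$SOURCE_GRAFANA_ENDPOINT/api/search?type=dash-db" | jq -r '.[].uid' |
while read -r uid; do
  curl -s -H "Authorization: Bearer $SOURCE_GRAFANA_API_KEY" \
    "$SOURCE_GRAFANA_ENDPOINT/api/dashboards/uid/$uid" > "temp_dir/$uid.json"
done

# Import: null out the instance-specific id, then POST each dashboard
for f in temp_dir/*.json; do
  jq '{dashboard: (.dashboard | .id = null), overwrite: true}' "$f" |
  curl -s -X POST -H "Authorization: Bearer $DEST_GRAFANA_API_KEY" \
    -H "Content-Type: application/json" -d @- \
    "$DEST_GRAFANA_ENDPOINT/api/dashboards/db"
done
```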

Now that you’ve imported the Kube-Prometheus dashboards, navigate to your managed Grafana instance, click on Dashboards in the left-hand nav, and then Manage. From here you can access the default Kube-Prometheus dashboards that you’ve just imported.

There are several open-source tools that can help you manage dashboards with Grafana using the HTTP API. One tool (currently in alpha) is Grizzly, which also allows you to work directly with the Jsonnet source used to generate the Kube-Prometheus stack configuration, as well as the generated JSON dashboard files. You can also use the Grafana Terraform provider.

Note that this quickstart uses the Helm version of the Kube-Prometheus stack, which templates manifest files generated from the underlying Kube-Prometheus project.

Step 4: Disable local components (optional)

Now that you’ve imported the Kube-Prometheus dashboards to Grafana Cloud, you may wish to shut down some of the stack’s components locally. In this step we’ll turn off the following Kube-Prometheus components:

  • Alertmanager, given that Grafana Cloud provisions a hosted Alertmanager instance integrated into the Grafana UI
  • Grafana

To disable these components, add the following to your values.yaml Helm configuration file:

grafana:
  enabled: false
alertmanager:
  enabled: false

Roll out the changes with helm upgrade:

helm upgrade -f values.yaml your_release_name prometheus-community/kube-prometheus-stack

You can learn how to disable recording and alerting rule evaluation using the configuration in Step 3 of Importing Recording and Alerting rules.

Conclusion

At this point you’ve rolled out the Kube-Prometheus stack in your cluster using Helm, configured Prometheus to remote_write metrics to Grafana Cloud for long-term storage and efficient querying, and have migrated Kube-Prometheus’s core set of dashboards to Grafana Cloud. Your Grafana Cloud dashboards will now query your Grafana Cloud Prometheus datasource directly. Note that your cluster-local Prometheus instance continues to evaluate alerting rules and recording rules. You can optionally migrate these by following Importing Recording and Alerting rules.

By default, Kube-Prometheus will scrape almost every available endpoint in your cluster, shipping tens of thousands (possibly hundreds of thousands) of active series to Grafana Cloud. In the next guide we’ll configure Prometheus to ship only the metrics referenced in the dashboards we’ve just uploaded. You will lose long-term retention for these series; however, they will still be available locally for Prometheus’s default configured retention period.

Reducing your Prometheus active series usage.