Installing Prometheus Operator with Grafana Cloud for Kubernetes

Note: The Grafana Cloud Kubernetes Integration is now publicly available. To get started with Grafana Agent and the Kubernetes Integration, please click on the lightning bolt in the left-hand navigation panel of your hosted Grafana instance and follow the instructions for installing the Kubernetes Integration. The following guide does not use the Grafana Agent and instead rolls out Prometheus into your cluster and configures it to remote_write scraped samples to Grafana Cloud.

In this guide you’ll learn how to install Prometheus Operator in a Kubernetes (K8s) cluster, configure it to scrape an endpoint, and ship scraped metrics to Grafana Cloud.

Prometheus Operator implements the Kubernetes Operator pattern for managing a Prometheus-based Kubernetes monitoring stack. A Kubernetes Operator consists of Kubernetes custom resources and Kubernetes controller code. Together, these abstract away the management and implementation details of running a given service on Kubernetes. To learn more about Kubernetes Operators, please see Operator pattern from the Kubernetes docs.

The Prometheus Operator installs a set of Kubernetes Custom Resources that simplify Prometheus deployment and configuration. For example, using the ServiceMonitor Custom Resource, you can define how Kubernetes Services should be monitored in K8s YAML manifests instead of in Prometheus configuration files. The Operator controller then watches the K8s API server for matching Services' /metrics endpoints and automatically generates the required Prometheus scrape configuration for them. To learn more about Prometheus Operator, please see the Prometheus Operator GitHub repository.

We’ll begin by installing Prometheus Operator into the Kubernetes cluster. Next, we’ll launch a 2-replica high-availability (HA) Prometheus Deployment into the cluster using the Operator, and then expose the Prometheus server as a Service. Finally, we’ll create a ServiceMonitor to instruct Prometheus to scrape itself, and then ship those scraped metrics to Grafana Cloud.

Prerequisites

Before you begin, you should have the following available to you:

  • A Kubernetes cluster, with the kubectl command-line tool installed and configured to communicate with it
  • A Grafana Cloud account, along with the Grafana Cloud Metrics (Prometheus) username and an API key for your stack, both available from the Cloud Portal and used later in this guide

Once you have this ready to go, you can begin.

Step 1 — Install Prometheus Operator

We’ll start by installing Prometheus Operator into the Kubernetes cluster. We’ll install all of Prometheus Operator’s Kubernetes custom resource definitions (CRDs) that define the Prometheus, Alertmanager, and ServiceMonitor abstractions used to configure the monitoring stack. We’ll also deploy a Prometheus Operator controller into the cluster.

Install the Operator using the bundle.yaml file in the Prometheus Operator GitHub repository:

kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/bundle.yaml

You should see the following output:

customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
serviceaccount/prometheus-operator created
service/prometheus-operator created

bundle.yaml installs CRDs for Prometheus objects as well as a Prometheus Operator controller and Service.

Note: In this guide we’ll install everything into the default Namespace. To learn how to install Prometheus Operator into another Namespace, please see the Prometheus Operator docs.

Verify that the Prometheus Operator installation succeeded using kubectl:

kubectl get deploy
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
prometheus-operator   1/1     1            1           3m21s
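
You can also confirm that the Operator's custom resource definitions were registered with the API server by listing the CRDs in the monitoring.coreos.com API group:

kubectl get crd | grep monitoring.coreos.com

You should see the alertmanagers, podmonitors, prometheuses, servicemonitors, and other CRDs created by bundle.yaml.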

With Prometheus Operator installed, you can move on to configuring RBAC permissions for the Prometheus server.

Step 2 — Configure Prometheus RBAC Permissions

Before rolling out Prometheus, we’ll configure its RBAC privileges using a ClusterRole, and bind this ClusterRole to a ServiceAccount using a ClusterRoleBinding object.

Prometheus needs Kubernetes API access to discover targets and pull ConfigMaps. To learn more about permissions granted in this section, please see RBAC from the Prometheus Operator docs.

First, create a directory in which you’ll store any K8s manifests used for this guide, and cd into it:

mkdir operator_k8s
cd operator_k8s

Create a manifest file called prom_rbac.yaml using your favorite editor. Paste in the following Kubernetes manifest:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default

This creates a ServiceAccount called prometheus and binds it to the prometheus ClusterRole. The manifest grants the ClusterRole get, list, and watch privileges on the listed Kubernetes API resources.

When you’re done editing the manifest, save and close it.

Create the objects using kubectl:

kubectl apply -f prom_rbac.yaml
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
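
As an optional sanity check, you can use kubectl's impersonation support to confirm that the new ServiceAccount now holds one of the permissions granted by the ClusterRole. This assumes the objects were created in the default Namespace and that your own kubectl user is allowed to impersonate ServiceAccounts:

kubectl auth can-i list pods --as=system:serviceaccount:default:prometheus

If the binding is in place, the command should print yes.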

Now that Prometheus has K8s API access, you can deploy it into the cluster.

Step 3 — Deploy Prometheus

In this step you’ll launch a 2-replica HA Prometheus deployment into your Kubernetes cluster using a Prometheus resource defined by Prometheus Operator.

This Prometheus resource encodes domain-specific Prometheus configuration into a set of readily configurable YAML fields. Instead of having to manage Prometheus configuration files and learn Prometheus config syntax, you can toggle many important configuration parameters by modifying Prometheus object variables in a K8s manifest.

Begin by creating a file called prometheus.yaml. Paste in the following manifest:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  image: quay.io/prometheus/prometheus:v2.22.1
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus
  version: v2.22.1
  serviceMonitorSelector: {}

Notice that we set kind to Prometheus and not Deployment or Pod. The Prometheus resource abstracts away the underlying controllers and ConfigMaps into streamlined objects used to manipulate Prometheus infrastructure components.

We define the following parameters:

  • Name the Prometheus resource prometheus
  • Give it an app: prometheus label
  • Set the container image used to run Prometheus
  • Restrict its deployment to Linux nodes
  • Request 400Mi of memory for each replica
  • Configure its Security Context. To learn more, please see Configure a Security Context for a Pod or Container
  • Set the ServiceAccount it will use
  • Set the Prometheus version
  • Instruct Prometheus to automatically pick up all configured ServiceMonitor resources by setting serviceMonitorSelector to the empty selector {}. In this guide we’ll create a ServiceMonitor to get Prometheus to scrape its own metrics.

When you’re done, save and close the file.

Deploy it into your cluster using kubectl apply -f:

kubectl apply -f prometheus.yaml
prometheus.monitoring.coreos.com/prometheus created

Note: We’ve deployed Prometheus into the default Namespace. To deploy Prometheus in another Namespace, use the -n namespace_name flag with kubectl or set the namespace field for the resource in a Kubernetes manifest file.
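
For example, assuming a monitoring Namespace already exists, setting the namespace field in the resource's metadata would look like the following sketch. The ServiceAccount and ClusterRoleBinding subject from Step 2, as well as the Service and ServiceMonitor created later, would need to target the same Namespace:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
. . .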

Verify the deployment using kubectl get:

kubectl get prometheus
NAME         VERSION   REPLICAS   AGE
prometheus   v2.22.1   2          32s

You can check the underlying Pods using get pod:

kubectl get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-79cd654746-mdfp6   1/1     Running   0          33m
prometheus-prometheus-0                2/2     Running   1          57s
prometheus-prometheus-1                2/2     Running   1          57s
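
If either Prometheus Pod gets stuck in a non-Running state, you can inspect it and check the logs of its prometheus container (the container name used by the Operator) to troubleshoot:

kubectl describe pod prometheus-prometheus-0
kubectl logs prometheus-prometheus-0 -c prometheus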

Now that Prometheus is up and running in our cluster, we can expose it using a Service.

Step 4 — Create a Prometheus Service

To create the Prometheus Service, open a manifest file called prom_svc.yaml and paste in the following definitions:

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  ports:
  - name: web
    port: 9090
    targetPort: web
  selector:
    app: prometheus
  sessionAffinity: ClientIP

This configures the following:

  • Set the Service name to prometheus
  • Create an app: prometheus Service label
  • Expose port 9090 on a cluster-wide stable IP address and forward it to the two Prometheus Pods at their default web port (9090)
  • Select the Prometheus Pods as targets using the app: prometheus label
  • Use sessionAffinity: ClientIP to ensure that connections from a particular client get forwarded to the same Pod

When you’re done, save and close the file.

Create the Service using kubectl apply -f:

kubectl apply -f prom_svc.yaml
service/prometheus created

Check your work using kubectl get:

kubectl get service
NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
kubernetes            ClusterIP   10.245.0.1       <none>        443/TCP    27h
prometheus            ClusterIP   10.245.106.105   <none>        9090/TCP   26h
prometheus-operated   ClusterIP   None             <none>        9090/TCP   8m52s
prometheus-operator   ClusterIP   None             <none>        8080/TCP   41m

The prometheus Service is up and running.
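
As a quick optional check, you can also confirm that the Service has selected the two Prometheus Pods as endpoints:

kubectl get endpoints prometheus

You should see two Pod IP addresses listed on port 9090.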

Let’s access the Prometheus server by forwarding a local port to the Prometheus Service running inside of the Kubernetes cluster:

kubectl port-forward svc/prometheus 9090
Forwarding from 127.0.0.1:9090 -> 9090
Forwarding from [::1]:9090 -> 9090

Navigate to http://localhost:9090 to access the Prometheus UI:

Prometheus UI

From here, click on Status then Targets to see any configured scrape targets.

This should be empty, as we haven’t yet configured anything to scrape. We’ll do this in the next step.

Step 5 — Create a Prometheus ServiceMonitor

In this step you’ll create a ServiceMonitor so that the Prometheus server scrapes its own metrics endpoint.

A ServiceMonitor defines a set of targets for Prometheus to monitor and scrape. Prometheus Operator abstracts away the implementation details of configuring Kubernetes service discovery and scrape targets using this ServiceMonitor resource.

Instead of having to modify a Prometheus configuration file, update a ConfigMap object, and roll out the new configuration, Prometheus Operator will automatically hook in new ServiceMonitors to your running Prometheus deployment.

Begin by creating a file called prometheus_servicemonitor.yaml. Paste in the following ServiceMonitor resource definition:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-self
  labels:
    app: prometheus
spec:
  endpoints:
  - interval: 30s
    port: web
  selector:
    matchLabels:
      app: prometheus

Here we define the following parameters:

  • Name the ServiceMonitor prometheus-self
  • Give it an app: prometheus label
  • Set the scrape interval to 30s, and scrape the web port (defined in the prometheus Service)
  • Select the prometheus Service to scrape using the matchLabels selector with app: prometheus

When you’re done, save and close the file.

Deploy it using kubectl apply -f:

kubectl apply -f prometheus_servicemonitor.yaml
servicemonitor.monitoring.coreos.com/prometheus-self created

Prometheus Operator should generally update your Prometheus configuration immediately. In some cases, you may need to wait a minute or so for changes to propagate.

Verify your changes by forwarding a port to your Prometheus server and checking its configuration:

kubectl port-forward svc/prometheus 9090

Navigate to Status and then Targets in the Prometheus UI:

Prometheus Targets

You should see the two Prometheus replicas as scrape targets.

Navigate to Graph to test metrics collection:

Prometheus Graph UI

In the Expression box, type prometheus_http_requests_total and hit ENTER. You should see a list of scraped metrics and their values. These are HTTP request counts for various Prometheus server endpoints.

If you’re having trouble configuring a ServiceMonitor, please see Troubleshooting ServiceMonitor Changes.
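
If targets still don’t appear, it can also help to inspect the scrape configuration that Prometheus Operator actually generated. For a Prometheus resource named prometheus, the rendered configuration is typically stored in a Secret named prometheus-prometheus in the same Namespace, under a gzipped prometheus.yaml.gz key; this layout is an implementation detail of the Operator, so treat the following as a debugging aid rather than a stable interface:

kubectl get secret prometheus-prometheus -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gunzip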

At this point, you’ve configured Prometheus to scrape itself and store metrics locally. In the following steps, you’ll configure Prometheus to ship these metrics to Grafana Cloud.

Step 6 — Create a Kubernetes Secret to store Grafana Cloud credentials

Before configuring Prometheus’s remote_write feature to ship metrics to Grafana Cloud, you’ll need to create a Kubernetes Secret to store your Grafana Cloud Metrics username and password.

You can find your username by navigating to your stack in the Cloud Portal and clicking Details next to the Prometheus panel.

Your password corresponds to the API key that you generated in the prerequisites section. You can also generate one in this same panel by clicking on Generate now.

Once you’ve noted your Cloud Prometheus username and password, create the Kubernetes Secret. You can create a Secret using a manifest file or directly using kubectl. In this guide we’ll create it directly using kubectl. To learn more about Kubernetes Secrets, please consult Secrets from the Kubernetes docs.

Run the following command to create a Secret called kubepromsecret:

kubectl create secret generic kubepromsecret \
  --from-literal=username=<your_grafana_cloud_prometheus_username> \
  --from-literal=password='<your_grafana_cloud_API_key>'

Note: If you deployed your monitoring stack in a namespace other than default, append the -n flag with the appropriate namespace to the above command.

To learn more about this command, please see Managing Secret using kubectl from the official Kubernetes docs.
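
If you prefer to manage the Secret declaratively alongside the other manifests in this guide, an equivalent manifest using the stringData field might look like the following sketch (take care not to commit real credentials to version control):

apiVersion: v1
kind: Secret
metadata:
  name: kubepromsecret
type: Opaque
stringData:
  username: <your_grafana_cloud_prometheus_username>
  password: <your_grafana_cloud_API_key>

You would then create it with kubectl apply -f as with the other manifests in this guide.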

Now that you’ve created a Secret to store your Grafana Cloud credentials, you can move on to modifying your Prometheus configuration.

Step 7 — Configure Prometheus remote_write and metrics deduplication

In this step, you’ll configure remote_write to ship cluster metrics to Grafana Cloud.

Prometheus’s remote_write feature allows you to ship metrics to remote endpoints for long-term storage and aggregation. Grafana Cloud’s deduplication feature allows you to deduplicate metrics sent from high-availability Prometheus pairs, reducing your active series usage.

We’ll enable remote_write by modifying the prometheus resource created in Step 3. Open the prometheus.yaml manifest using your editor:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  image: quay.io/prometheus/prometheus:v2.22.1
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus
  version: v2.22.1
  serviceMonitorSelector: {}

Add the following section to the bottom of the resource definition:

. . .
  remoteWrite:
  - url: "<Your Metrics instance remote_write endpoint>"
    basicAuth:
      username:
        name: kubepromsecret
        key: username
      password:
        name: kubepromsecret
        key: password
  replicaExternalLabelName: "__replica__"
  externalLabels:
    cluster: "<choose_a_prom_cluster_name>"

Here we configure the following parameters:

  • Set the remote_write URL corresponding to Grafana Cloud’s Prometheus metrics endpoint. You can find the /api/prom/push URL, username, and password for your metrics endpoint by clicking on Details in the Prometheus card of the Cloud Portal.
  • Configure a basicAuth username and password referencing the Secret created in the previous step named kubepromsecret. Select the username and password keys of this Secret.
  • Configure Grafana Cloud metrics deduplication using the replicaExternalLabelName and externalLabels fields. Set cluster to a value that identifies your Prometheus HA cluster. To learn more, please see Deduplicating metrics data sent from high-availability Prometheus pairs.

Save and close the file when you’re done editing.

Roll out the changes using kubectl apply -f:

kubectl apply -f prometheus.yaml
prometheus.monitoring.coreos.com/prometheus configured

At this point, you’ve successfully configured your Prometheus instances to remote_write scraped metrics to Grafana Cloud. You can verify that your changes have propagated to your running Prometheus instances using port-forward:

kubectl port-forward svc/prometheus 9090

Navigate to http://localhost:9090 in your browser, and then Status and Configuration. Verify that the remote_write and external_labels blocks you appended above have propagated to your running Prometheus instances. It may take a minute or two for Prometheus Operator to pick up the new configuration.
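
As an additional check, Prometheus exposes metrics about its own remote_write queues, which you can query from the local Prometheus UI to confirm that samples are leaving the cluster. Depending on your Prometheus version, the relevant counter is named prometheus_remote_storage_succeeded_samples_total or prometheus_remote_storage_samples_total; a non-zero, increasing rate indicates that samples are being shipped:

rate(prometheus_remote_storage_succeeded_samples_total[5m])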

In the final step, you’ll query your cluster metrics in Grafana Cloud.

Step 8 — Access your Prometheus metrics in Grafana Cloud

Now that your Prometheus instances ship their scraped metrics to Grafana Cloud, you can query and visualize these metrics from the Grafana Cloud platform.

From the Cloud Portal, click Log In next to the Grafana card to log in to Grafana.

From here, click on Explore (compass) in the sidebar:

Grafana Explore Icon

In the PromQL query box, enter the same metric you tested in Step 5, prometheus_http_requests_total, and hit SHIFT + ENTER.

You should see a graph of time series data corresponding to different labels of the prometheus_http_requests_total metric. Grafana queries this data from the Grafana Cloud Metrics data store and not your local cluster.

From here you can create dashboards and panels to quickly visualize and alert on this data. To learn more, please see Dashboard overview and Panel overview from the Grafana docs.

Conclusion

In this guide you installed Prometheus Operator into a Kubernetes cluster, deployed a 2-replica HA Prometheus setup into the cluster, and configured it to scrape its own /metrics endpoint. You then configured Prometheus to ship this data to Grafana Cloud and deduplicate the two sets of series to reduce metrics usage.

From here you can add additional endpoints to scrape, like the Kubernetes API, or kubelet metrics from the Kubernetes nodes. To see a fully configured Prometheus Kubernetes stack in action, please see the kube-prometheus project.
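
As a hypothetical example, a ServiceMonitor for an application Service labelled app: my-app that exposes a metrics port named http-metrics might look like the following sketch; the label, port name, and path are placeholders you would adapt to your own workloads:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  endpoints:
  - interval: 30s
    port: http-metrics
    path: /metrics
  selector:
    matchLabels:
      app: my-app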