
Configure Kubernetes Monitoring with easy deployment and Helm chart

The easy deployment process using the Kubernetes Monitoring GUI works in conjunction with the most recent version of the Grafana Kubernetes Monitoring Helm chart. To learn how the Helm chart configuration works, refer to Helm chart overview.

Note

Configuration with the Grafana Kubernetes Monitoring Helm chart is the recommended method.

Configuration of the Grafana Kubernetes Monitoring Helm chart includes the steps shown in the following diagram:

Steps of configuration process

Available platforms

You can install Kubernetes Monitoring on these platforms:

Platform | Notes
Kubernetes | Includes Kubernetes, Amazon Elastic Kubernetes Service (EKS) on Amazon Elastic Compute Cloud (EC2), Google Kubernetes Engine (GKE), and Tanzu
Amazon EKS on AWS Fargate | Node Exporter metrics are disabled on this platform. Kubelet resources are used instead to gather CPU and memory usage, and Pod logs are gathered via the Kubernetes API.
Azure Kubernetes Service (AKS) | N/A
Google Kubernetes Engine (GKE) Autopilot | Node Exporter metrics are disabled on this platform. Kubelet resources are used instead to gather CPU and memory usage, and Pod logs are gathered via the Kubernetes API.
IBM Cloud Kubernetes Service | N/A
Red Hat OpenShift | N/A

Activate and send data from your account

To configure and use Kubernetes Monitoring, you must activate the product. Activation ensures you have access and begins billing for usage. Next, you must send data to your account. To do so, complete the following steps:

  1. Go to your Grafana Cloud account, and click My Account.
  2. Select the stack where you want to install Kubernetes Monitoring.
  3. Click Details next to the name of your stack.
  4. At your stack page, click Launch to open Kubernetes Monitoring.
  5. Click Activate to activate Kubernetes Monitoring.
  6. In the Activate Kubernetes Monitoring dialog box, select the checkbox and click Activate.
  7. Click Start sending data.

Kubernetes Monitoring displays the Cluster configuration tab of the Configuration page.

Configuration process

Complete the configuration process on the Cluster configuration tab.

Give it a try using Grafana Play

With Grafana Play, you can explore Kubernetes Monitoring and see how it works, learning from practical examples to accelerate your development. A link to Grafana Play is available on the Configuration page.

Before you begin

Make sure you have met the prerequisites required for these configuration steps.

Note

Ensure that you are familiar with the components installed by the Helm chart and how they relate to the configuration options you can switch on or off.

To deploy Kubernetes Monitoring with the Helm chart, you need:

  • The Admin role to install alerts
  • A Kubernetes Cluster, environment, or fleet you want to monitor
  • The following tools installed on your local machine or CI/CD environment where you’ll run the deployment commands (a quick verification example follows this list):
    • kubectl: Kubernetes command-line tool to interact with your cluster
    • Helm: Package manager for Kubernetes to deploy the monitoring stack
  • Appropriate versions of items related to:
    • kube-state-metrics: Uses client-go to communicate with Clusters. For Kubernetes client-go version compatibility and any other related details, refer to kube-state-metrics.
    • OpenCost: Requires Kubernetes 1.8+ clusters.
    • Storage visualizations: Require Helm chart release v1.5.1 or greater.
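
Before you continue, you can optionally verify the tooling from a terminal. This is a minimal sanity check, assuming your current kubectl context points at the Cluster you want to monitor:

  # Confirm the CLI tools are installed
  kubectl version --client
  helm version
  # Confirm kubectl can reach the Cluster you plan to monitor
  kubectl cluster-info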

Backend installation (required)

Click Install to install the required, preconfigured alert rules and recording rules.

Note

These rules are required for Kubernetes Monitoring to function properly. Recording rules are the source of the workload data in Kubernetes Monitoring. If you aren’t seeing workload data, the most likely cause is that the recording rules and alert rules haven’t been installed.

Select features and enter Cluster information

  1. Cluster name: In the Cluster name box, enter the name of your Cluster. This name identifies your Cluster in Grafana dashboards and appears as the cluster label across all your metrics and logs.

  2. Namespace: In the Namespace box, replace default with the namespace where you want to deploy your monitoring infrastructure. This is the namespace for Grafana Alloy and other dependencies such as kube-state-metrics.

  3. Platform: Select the platform you are using.

  4. Kubernetes Cluster monitoring: Data collection for infrastructure metrics and logs. Switch on or off the following options.

    Note

    These options are independent of each other. For example, disabling cost or energy metrics does not disable any other option. Refer to additional information for each option by following the links, and review Manage your Kubernetes configuration.

    Options for monitoring Cluster and applications
    • Cluster metrics on the infrastructure

      Node resource usage; metrics about Pod health; Persistent volume usage; and Deployment, StatefulSet, and DaemonSet status.
      Essential for Cluster health monitoring; powers the Kubernetes Overview dashboard; tracks resource utilization and capacity planning; and detects Pod crashes, OOM kills, and so on.

    • Cost metrics, which use OpenCost

      Resource cost attribution by namespace, workload, and Pod; CPU and memory cost breakdowns; Cloud provider pricing data integration; and cost efficiency metrics.
      Shows cost data; helps identify expensive workloads; enables FinOps and cost optimization; and tracks spending trends over time.

    • Energy metrics, which use Kepler to obtain energy data

    • Node logs

      Logs from: the operating system (kernel logs, systemd, network drivers); Kubernetes node agents (like kubelet); container runtimes (Docker, containerd); system daemons (journald, syslog).
      Diagnosing node instability (memory exhaustion, CPU throttling, or disk space issues); debugging scheduling or startup failures (when Pods can’t start, the issue may be at the Node level); investigating network or storage problems; determining driver or volume mount failures; auditing system changes (Node reboots, kubelet restarts, or OS updates).

    • Cluster events

      Generated from: the scheduler (assigning Pods to Nodes); kubelet (managing Pods on Nodes); the controller manager (handling scaling, deployments, and so on); the API server (processing requests).
      Troubleshooting scheduling and deployment issues; seeing why a Pod isn’t starting (for example, no Nodes with enough memory); tracking resource lifecycle changes (Pod created → scheduled → pulled → started → ready → terminated); detecting transient or recurring failures (repeated image pull errors, failed probes, or Node taints); auditing Cluster activity (to identify which controller or user triggered changes).

    • Automatic discovery with annotations, to collect Prometheus metrics from Pods and Services using annotations that define their scrape target

      Automatically discovers running Pods and Services; dynamically scrapes Prometheus metrics from annotated Pods; detects new workloads without manual configuration; and uses Kubernetes annotations to find metrics endpoints.
      Enable for: zero-configuration metrics collection; automatically monitor new applications as they deploy; support microservices architectures with dynamic scaling; and find application-specific metrics beyond system metrics.
      Disable when: You want to explicitly define every scrape target; you’re concerned about discovering unintended metrics endpoints; your cluster has very strict network policies.

    • Prometheus Operator objects, for collecting metrics from PodMonitors, Probes, and ServiceMonitors

      Discovers and monitors Prometheus Operator CRDs: ServiceMonitor, which defines how to scrape metrics from Kubernetes services; PodMonitor, which defines how to scrape metrics from Pods; and Probe, which defines blackbox probing of endpoints.
      Enable when: you’re already using Prometheus Operator in your Cluster; you have existing ServiceMonitor/PodMonitor definitions; you want to leverage existing Prometheus configurations; you want to enable migration from Prometheus Operator to Grafana Alloy.
      Disable when: You’re not using Prometheus Operator Objects; you prefer using Grafana Alloy’s native configuration.

  5. Containerized application monitoring: Data collection for containerized applications. Switch on or off the following options.

    • Pod logs

      Standard output (stdout) and standard error (stderr) Pod logs from the processes in the container, such as initialization messages, API request logs, warnings or errors, and application-specific information.
      Enable for: debugging issues, such as when an application crashes or behaves unexpectedly, logs reveal what went wrong; monitoring behavior, such as tracking normal operational messages (startup confirmation, API requests, or job completion); auditing events, to view logs that show what actions were taken by your app or scripts running inside containers; gaining performance insight, such as tracing slow operations or bottlenecks using timestamps and log levels.

    • Metrics and traces of inbound and outbound calls, which deploys Grafana Beyla for zero-code instrumentation of applications on the Cluster

      Caution

      Enabling instrumentation with Beyla may affect your billing due to additional telemetry ingestion.

      Correlates Pod metrics with application traces; links infrastructure metrics to application performance; enables unified views of resource usage and request patterns; powers the correlation features in Pod detail views.
      Enable when you want to: See how Pod resource constraints affect application latency; correlate OOM kills with specific requests; understand resource consumption per endpoint; and connect infrastructure issues to user impact.
      Disable when: You only need basic infrastructure monitoring; you haven’t enabled tracing.

    • CPU profiling, to collect profiles from applications on the Cluster

      Enables continuous profiling using eBPF; collects CPU flamegraphs from running applications; captures function-level performance data; identifies code issues and performance bottlenecks.
      Enable when you want to: find expensive functions in your code; optimize application performance; debug CPU-intensive operations; identify memory allocation patterns.
      Disable when: You don’t need code-level profiling; you’re concerned about profiling overhead (~1-5% CPU); your applications are already well-optimized.
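
The switches above correspond to values in the Helm chart, and the configuration GUI generates the exact values for your selections, so you rarely need to write them by hand. As a rough, hedged sketch of what a generated values file can look like (key names vary between chart versions, and the destination and authentication sections are omitted here):

  # Illustrative only: key names depend on your Helm chart version.
  # Use the values generated by the configuration GUI as the source of truth.
  cat > values.yaml <<'EOF'
  cluster:
    name: my-cluster     # matches the Cluster name entered in the GUI
  clusterMetrics:
    enabled: true        # Cluster metrics on the infrastructure
  clusterEvents:
    enabled: true        # Cluster events
  nodeLogs:
    enabled: true        # Node logs
  podLogs:
    enabled: true        # Pod logs
  EOF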

Use a Grafana.com access policy token

You can create a new access policy token or use an existing token. Refer to Grafana Cloud Access Policies for more information.

To use an existing token:

  1. Click Use an existing token.

  2. Paste the token into the Access policy token box.

To create a new token:

  1. Click Create a new token.

  2. In the Access policy token name box, enter a name for your token.

  3. In the Expiration date box, select an option for the expiration date.

    The permission scope for the token appears.

    Options for configuration, including expiration date
  4. Click Create token.

    The token is generated and appears in the token box. This token is automatically copied into the ConfigMap file.

  5. Click the copy icon in the token box to copy the token. Make sure to save it in a secure place. It is not shown again.
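
If you prefer not to keep the token in plain text on disk, one option is to store it in a Kubernetes Secret. This is a hypothetical sketch with illustrative names; the command generated by the GUI embeds the token directly, and whether the chart can reference an existing Secret depends on your chart version, so check the Helm chart documentation before relying on this:

  # Hypothetical: store the access policy token in a Secret.
  # The namespace and Secret names are examples only.
  kubectl create namespace monitoring --dry-run=client -o yaml | kubectl apply -f -
  kubectl create secret generic grafana-cloud-access-policy-token \
    --namespace monitoring \
    --from-literal=token='<paste-your-token-here>'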

Deploy monitoring resources on the Cluster

  1. Select the method of deployment.
  2. Optionally select Enable Remote Configuration to have Grafana Fleet Management centrally manage your Alloy deployments.
  3. Use the code or files for deployment, following the on-screen instructions.
Choices for deployment method and Fleet Management option

Helm client

To use the Helm client to deploy the Kubernetes Monitoring Helm chart to the Cluster:

  1. Copy the command.

  2. Paste and run it in your terminal.
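
The generated command is specific to your selections, but it generally follows the shape below. Treat this as a hedged sketch; prefer the command shown on screen, which includes your Cluster name, namespace, token, and feature selections:

  # Add the Grafana Helm repository and install or upgrade the chart.
  # Release name, namespace, and values file are examples; use the on-screen command.
  helm repo add grafana https://grafana.github.io/helm-charts
  helm repo update
  helm upgrade --install grafana-k8s-monitoring grafana/k8s-monitoring \
    --namespace default --create-namespace \
    --values values.yaml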

Terraform

To use Terraform to deploy the Kubernetes Monitoring Helm chart to the Cluster:

  1. Copy, modify, and save the following files to the system where you run Terraform:

    • provider.tf
    • grafana-k8s-monitoring.tf
    • vars.tf
  2. Deploy by using the commands terraform init and terraform apply.
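
For reference, run the commands from the directory that contains the three files; terraform plan is optional but lets you preview the changes first:

  # Run from the directory containing provider.tf, grafana-k8s-monitoring.tf, and vars.tf
  terraform init     # download the providers declared in provider.tf
  terraform plan     # optional: preview the resources before creating them
  terraform apply    # deploy the Kubernetes Monitoring Helm chart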

Remote configuration

Select Enable Remote Configuration to use Fleet Management to monitor, configure, and manage your Alloy collectors. You can update your collectors without having to update the Cluster.

Configure application instrumentation

If you chose to include receivers in the Select features and enter Cluster information section, a list of endpoints appears. In each application that generates metrics, logs, or traces, enter the appropriate OTLP or Zipkin address.

Endpoints available for configuration

Note

If you change the deployment name to something other than grafana-k8s-monitoring, the endpoint address is updated as well. Be sure to update your applications to point to the correct endpoint.
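
For example, an application instrumented with an OpenTelemetry SDK can be pointed at the OTLP endpoint through the standard OTEL environment variables. This is a hypothetical sketch: the Deployment name and receiver address are illustrative, so use the exact endpoint shown on the configuration screen:

  # Hypothetical: point an instrumented Deployment at the in-Cluster OTLP/HTTP receiver.
  # Replace the address with the endpoint shown on screen; it depends on your
  # release name and namespace.
  kubectl set env deployment/my-app \
    OTEL_EXPORTER_OTLP_ENDPOINT=http://grafana-k8s-monitoring-alloy-receiver.default.svc.cluster.local:4318 \
    OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf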

Done

Click the Metrics status tab to view the status of data collection. The view populates as the system components begin scraping and sending data to Grafana Cloud. It shows the health of the different sources of metrics, Pod logs, and Cluster events, as well as any applicable version numbers.

Troubleshoot

Refer to Troubleshooting for any issues that occur after configuration.

Install any integrations

You can use Grafana integrations to monitor the health and status of services and applications running in your Kubernetes clusters.

To install a Kubernetes integration to begin scraping metrics:

  1. From the main menu, navigate to Connections, and filter for Kubernetes.

  2. Select the integration for the service you want to monitor.

  3. Follow the instructions on the screen to copy and use the configuration snippet and install the integration.

  4. After installing an integration, redeploy the configuration using the method you originally used.

Retrieve Helm values

If you installed Kubernetes Monitoring with the Helm CLI, you can retrieve the values for your configuration by using the helm get values command.
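
For example, assuming the default release name and an installation in the default namespace:

  # Show the values supplied at install time (release name and namespace are examples)
  helm get values grafana-k8s-monitoring -n default
  # Include computed chart defaults as well
  helm get values grafana-k8s-monitoring -n default --all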