Upgrade Kubernetes Monitoring
Choose the following instructions appropriate for your existing configuration of Kubernetes Monitoring.
Update Kubernetes Monitoring features
When a new version of the alerting and recording rules becomes available, an Update button appears on the Settings tab. Click this button to install the latest alerting and recording rules in your Grafana instance.
If you receive an error that the upgrade failed, refer to the troubleshooting instructions.
Upgrade Kubernetes Monitoring Helm chart
If you previously configured Kubernetes Monitoring with the Helm chart, including when it deployed Grafana Agent in Flow mode, and want to upgrade to the latest configuration, copy and run the following to upgrade to the newer version of the Helm chart:
helm repo update
helm upgrade grafana-k8s-monitoring grafana/k8s-monitoring --namespace "${NAMESPACE}" --values <your values file>
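To confirm which chart version is available and which version you are running, you can compare the published chart versions with your deployed release. This is a minimal sketch; it assumes the release name grafana-k8s-monitoring used above:
# List the latest available chart versions
helm search repo grafana/k8s-monitoring --versions | head -n 5
# Show the chart version of the currently deployed release
helm list --namespace "${NAMESPACE}" --filter grafana-k8s-monitoring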
Also refer to the breaking change announcements for the Helm chart.
Migrate from static Grafana Agent configuration
If you previously deployed Grafana Agent in static mode and want to migrate to the latest configuration for Kubernetes Monitoring, which uses Grafana Alloy, complete the following steps to remove the agent and its supporting systems before deploying the Kubernetes Monitoring Helm chart.
Note
If you have customized your Agent configuration (including adding Kubernetes Integrations to scrape local services or adding scrape targets for your own application metrics, logs, or traces), you must add the customizations again after deploying the Kubernetes Monitoring Helm chart. Save your existing configuration and refer to instructions on deploying integrations and how to set up Application Observability.
Save custom configuration
If you customized your Grafana Agent configuration to add metric sources, log sources, relabeling rules, or any other changes, save your config file outside of your cluster.
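For example, you can dump the Agent ConfigMaps to local files before removing anything. This is a sketch that assumes the default ConfigMap names (grafana-agent and grafana-agent-logs) and the default data key agent.yaml; adjust these if you renamed them, and set NAMESPACE as described in the next step:
# Save the metrics and logs configurations outside of the cluster
kubectl get configmap grafana-agent -n "${NAMESPACE}" -o jsonpath='{.data.agent\.yaml}' > agent-config-backup.yaml
kubectl get configmap grafana-agent-logs -n "${NAMESPACE}" -o jsonpath='{.data.agent\.yaml}' > agent-logs-config-backup.yaml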
Note
If you have written custom relabeling rules for the static mode of Grafana Agent, you must rewrite these rules for Grafana Agent in Flow mode. Refer to Filter Prometheus metrics for more information.
Export common variables
Copy and run the following to use these variables throughout the remaining steps.
# Set this to the namespace where Grafana Agent was deployed
export NAMESPACE="default"
# This will extract the installed Grafana Agent version (e.g. "v0.34.1")
export AGENT_VERSION=$(kubectl get statefulset grafana-agent -n "${NAMESPACE}" -o jsonpath='{$.spec.template.spec.containers[:1].image}' | sed -e 's/grafana\/agent:\(v[0-9.]*\)/\1/')
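Before continuing, you can confirm that the version was extracted as expected; the output should be a version string such as v0.34.1:
echo "${AGENT_VERSION}"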
Remove Grafana Agent for Metrics
Copy and run the following to remove the grafana-agent StatefulSet and the associated ConfigMap that was used for scraping metrics.
MANIFEST_URL=https://raw.githubusercontent.com/grafana/agent/${AGENT_VERSION}/production/kubernetes/agent-bare.yaml /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/grafana/agent/${AGENT_VERSION}/production/kubernetes/install-bare.sh)" | kubectl delete -f -
kubectl delete configmap grafana-agent -n "${NAMESPACE}"
Remove Grafana Agent for Logs
Copy and run the following to remove the grafana-agent-logs DaemonSet and the associated ConfigMap that was used for collecting logs.
MANIFEST_URL=https://raw.githubusercontent.com/grafana/agent/${AGENT_VERSION}/production/kubernetes/agent-loki.yaml /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/grafana/agent/${AGENT_VERSION}/production/kubernetes/install-bare.sh)" | kubectl delete -f -
kubectl delete configmap grafana-agent-logs -n "${NAMESPACE}"
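To verify that both Agent workloads and their ConfigMaps are gone, list any remaining matching resources; the command should produce no output:
kubectl get statefulset,daemonset,configmap -n "${NAMESPACE}" | grep grafana-agent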
Remove supporting metric systems
Copy and run the following to remove the supporting metric systems, which are redeployed by the Kubernetes Monitoring Helm chart:
helm delete ksm -n "${NAMESPACE}"
helm delete nodeexporter -n "${NAMESPACE}"
helm delete opencost -n "${NAMESPACE}"
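The release names ksm, nodeexporter, and opencost assume the names used during the original installation. If helm delete reports that a release is not found, list the releases in the namespace to find the names used in your cluster:
helm list -n "${NAMESPACE}"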
Check cardinality
As a best practice after upgrading, check the current metrics usage and associated costs from the billing and usage dashboard in your Grafana instance to ensure the gathered metrics are what you expect.
Refer to Metrics control and management for more details.
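If you prefer the command line, you can also query your hosted Prometheus for an active series count. This is a sketch only: PROM_USER, PROM_TOKEN, and the query endpoint URL are placeholders you can find on the Prometheus details page of your Grafana Cloud stack:
# Count active series (credentials and endpoint are placeholders)
curl -s -u "${PROM_USER}:${PROM_TOKEN}" \
  "https://<your-prometheus-query-endpoint>/api/v1/query" \
  --data-urlencode 'query=count({__name__=~".+"})'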
Migrate from Grafana Agent Operator configuration
If you previously deployed Grafana Agent in Operator mode and want to migrate to the latest configuration for Kubernetes Monitoring with Alloy, complete the following steps to clean up before deploying the Kubernetes Monitoring Helm chart.
Note
If you have deployed additional ServiceMonitors, PodMonitors, PodLogs, or integrations objects to customize your Grafana Agent configuration (including Kubernetes integrations to scrape local services or adding scrape targets for your own application metrics, logs, or traces), you must add those again after deploying the Kubernetes Monitoring Helm chart. Save your existing configuration and refer to instructions on deploying integrations and how to set up Application Observability.
Persist a custom configuration
If you customized your Agent configuration with additional Integration objects:
- Refer to the Agent Flow mode documentation for the Flow-equivalent components.
- Add the components to the .extraConfig value in the Kubernetes Monitoring Helm chart.
If you are using additional PodMonitor or ServiceMonitor objects, no change is necessary. The Grafana Alloy instance deployed by the Kubernetes Monitoring Helm chart still detects and uses those objects.
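To see which of these objects exist in your cluster, and therefore which ones to verify after the migration, you can list them across all namespaces:
kubectl get servicemonitors,podmonitors,podlogs --all-namespaces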
Export common variables
Copy and run the following to use this variable throughout the remaining steps.
# Set this to the namespace where Grafana Agent was deployed
export NAMESPACE="default"
Remove monitoring objects
Next, you must remove all the objects deployed during the Agent Operator deployment process:
Kind | Name |
---|---|
ServiceAccount | grafana-agent |
ClusterRole | grafana-agent |
ClusterRoleBinding | grafana-agent |
GrafanaAgent | grafana-agent |
Secret | metrics-secret |
Integration | node-exporter |
MetricsInstance | grafana-agent-metrics |
ServiceMonitor | cadvisor-monitor |
ServiceMonitor | kubelet-monitor |
ServiceAccount | kube-state-metrics |
ClusterRole | kube-state-metrics |
ClusterRoleBinding | kube-state-metrics |
Service | kube-state-metrics |
Deployment | kube-state-metrics |
ServiceMonitor | ksm-monitor |
Secret | logs-secret |
LogsInstance | grafana-agent-logs |
PodLogs | kubernetes-logs |
PersistentVolumeClaim | agent-eventhandler |
Integration | agent-eventhandler |
ServiceAccount | opencost |
Secret | opencost |
ClusterRole | opencost |
ClusterRoleBinding | opencost |
Service | opencost |
Deployment | opencost |
ServiceMonitor | opencost |
Copy and run the following:
kubectl delete -n "${NAMESPACE}" \
serviceaccount/grafana-agent \
clusterrole/grafana-agent \
clusterrolebinding/grafana-agent \
grafanaagent/grafana-agent \
secret/metrics-secret \
integration/node-exporter \
metricsinstance/grafana-agent-metrics \
servicemonitor/cadvisor-monitor \
servicemonitor/kubelet-monitor \
serviceaccount/kube-state-metrics \
clusterrole/kube-state-metrics \
clusterrolebinding/kube-state-metrics \
service/kube-state-metrics \
deployment/kube-state-metrics \
servicemonitor/ksm-monitor \
secret/logs-secret \
logsinstance/grafana-agent-logs \
podlogs/kubernetes-logs \
persistentvolumeclaim/agent-eventhandler \
integration/agent-eventhandler \
serviceaccount/opencost \
secret/opencost \
clusterrole/opencost \
clusterrolebinding/opencost \
service/opencost \
deployment/opencost \
servicemonitor/opencost
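Afterwards, you can confirm that the custom resources were removed; each of these lists should come back empty:
kubectl get grafanaagents,metricsinstances,logsinstances,integrations,podlogs -n "${NAMESPACE}"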
Remove Grafana Agent Operator and associated CRDs
Follow the instructions for the deployment method you used.
Configured with Helm
If you previously deployed Grafana Agent Operator with Helm, copy and run the following to remove Grafana Agent Operator along with the associated Operator CRDs:
helm delete -n "${NAMESPACE}" grafana-agent-operator
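Note that helm delete does not remove CRDs that a chart installed from its crds directory. You can check whether any of the Operator CRDs remain; if this command returns results, remove them with the manual steps in the next section:
kubectl get crds | grep -E 'monitoring\.(coreos|grafana)\.com'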
Configured without Helm
If you deployed without Helm, you must manually remove Grafana Agent Operator and the associated Operator CRDs:
Kind | Name |
---|---|
ServiceAccount | grafana-agent-operator |
ClusterRole | grafana-agent-operator |
ClusterRoleBinding | grafana-agent-operator |
Deployment | grafana-agent-operator |
To perform this manual removal, copy and run the following:
kubectl delete -n "${NAMESPACE}" \
deployment/grafana-agent-operator serviceaccount/grafana-agent-operator \
clusterrole/grafana-agent-operator clusterrolebinding/grafana-agent-operator
kubectl delete \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.coreos.com_podmonitors.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.coreos.com_probes.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.coreos.com_servicemonitors.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.grafana.com_grafanaagents.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.grafana.com_integrations.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.grafana.com_logsinstances.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.grafana.com_metricsinstances.yaml \
-f https://raw.githubusercontent.com/grafana/agent/main/production/operator/crds/monitoring.grafana.com_podlogs.yaml
Check cardinality
As a best practice after upgrading, check the current metrics usage and associated costs from the billing and usage dashboard in your Grafana instance to ensure the gathered metrics are what you expect.
Refer to Metrics control and management for more details.
Upgrade to use cost monitoring
If you already deployed Kubernetes Monitoring using Grafana Agent or Agent Operator, first follow the preceding instructions on this page to upgrade from Agent in static mode or from Agent Operator.
Create an access policy token
OpenCost needs the ability to read metrics from your hosted Prometheus. Create an access policy token that has the metrics:read scope.
Deploy OpenCost
Deploy OpenCost via its Helm chart. To do so, copy this code into your terminal and run it:
helm repo add opencost https://opencost.github.io/opencost-helm-chart && \
helm repo update && \
helm install opencost opencost/opencost -n "default" -f - <<EOF
fullnameOverride: opencost
opencost:
exporter:
defaultClusterId: "REPLACE_WITH_CLUSTER_NAME"
extraEnv:
CLOUD_PROVIDER_API_KEY: AIzaSyD29bGxmHAVEOBYtgd8sYM2gM2ekfxQX4U
EMIT_KSM_V1_METRICS: "false"
EMIT_KSM_V1_METRICS_ONLY: "true"
PROM_CLUSTER_ID_LABEL: cluster
prometheus:
password: "REPLACE_WITH_ACCESS_POLICY_TOKEN"
username: "REPLACE_WITH_GRAFANA_CLOUD_PROMETHEUS_USER_ID"
external:
enabled: true
url: "REPLACE_WITH_GRAFANA_CLOUD_PROMETHEUS_QUERY_ENDPOINT"
internal: { enabled: false }
ui: { enabled: false }
EOF
Value | Description |
---|---|
CLOUD_PROVIDER_API_KEY | Supplied for evaluation. For example, the GCP pricing API requires a key. |
password | The access policy token that was just created |
username | The Prometheus user ID (which is a number) |
url | The Prometheus query endpoint |
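After the installation completes, you can verify that the OpenCost exporter is running and serving metrics. This sketch assumes the default namespace used above and the standard OpenCost chart labels:
# Check that the OpenCost pod is ready
kubectl get pods -n default -l app.kubernetes.io/name=opencost
# Port-forward and inspect the exported metrics
kubectl port-forward -n default deployment/opencost 9003:9003 &
curl -s http://localhost:9003/metrics | head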