Customize the Kubernetes Monitoring Helm chart
After you complete the deployment process of the Kubernetes Monitoring Helm chart, you can further customize your configuration. For example, you may want to collect more data or add more destinations, or you may need guidance on authentication settings.
The examples in the Kubernetes Monitoring Helm chart are complete, working examples designed to help you correctly alter your configuration by editing the values.yaml file. The following categories help you find the example you need.
After you make any customizations in values.yaml, redeploy the Helm chart to apply your changes.
Helm chart version
Each Kubernetes Monitoring Helm chart version adds functionality. To take advantage of the features in an updated version:
- Check the Helm chart release notes for the updates available in each version.
- Install the latest version of the Helm chart by running:
```
helm upgrade --install grafana-k8s-monitoring grafana/k8s-monitoring -f values.yaml
```
If you need to migrate from version 1.x or 2.x, refer to the steps in Version migration.
Authentication
Use the following examples to configure authentication.
| Example | Description |
|---|---|
| Bearer token | Use a bearer token for a Prometheus, Loki, or OTLP destination. |
| Embedded secret | Embed the secret data directly into the destination configuration. |
| External secret | Use pre-existing secrets to authenticate to external services. |
| OAuth2 | Use OAuth2 for authentication. |
| SigV4 | Configure a Prometheus destination using the AWS Signature Version 4 authentication method. |
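For instance, a bearer token destination might look like the following values.yaml fragment. This is a minimal sketch: the destination name, URL, and token are placeholders, and the exact auth keys are assumptions based on the chart's destination settings, so check the linked examples for the authoritative syntax.

```yaml
destinations:
  - name: otlp-gateway                   # hypothetical destination name
    type: otlp
    url: https://otlp.example.com/otlp   # placeholder endpoint
    auth:
      type: bearerToken                  # assumed auth type key, per the bearer token example
      bearerToken: "<your-token>"        # or reference a pre-existing secret instead
```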
Data collection
Alloy
Gather metrics from Grafana Alloy.
Application data
The Application Observability feature encompasses receiving data through various receivers, processing that data, and delivering it to the specified destinations. This example shows the settings to collect telemetry data from an application.
Grafana
Gather metrics and logs from Grafana.
Grafana Loki
Gather metrics and logs from Grafana Loki.
Kubernetes infrastructure data
| Example | Description |
|---|---|
| cert-manager | Gather metrics from cert-manager. |
| Cluster events | Gather Kubernetes Cluster events from the Kubernetes API server, and deliver them to a logs destination. |
| Cluster metrics | Gather metrics about the Kubernetes Cluster and deliver them to a metrics destination. This includes using services and tools such as Node Exporter, kube-state-metrics, kubelet, and cAdvisor. |
| Cluster metrics with Istio Service Mesh | Gather metrics from Alloy clustering when Istio Service Mesh is enabled and has deployed the Istio sidecar to the Pods in the Cluster. |
| Cluster and control plane metrics | Gather metrics about the Kubernetes Cluster, including its control plane components, and deliver them to a metrics destination. |
| etcd | Gather metrics from etcd. |
| Log metrics | Generate metrics captured from Pod logs. |
| Node logs | Gather logs from the Nodes in your Kubernetes Cluster. This is useful when you create your own Kubernetes Cluster with kubeadm, because kubelet runs as a systemd service on Linux. This example shows gathering logs from journald. |
| Pod logs | Gather logs from the Pods in your Kubernetes Cluster. |
| Automatically discovered Pods and Services | Kubernetes Pods or Services are automatically discovered and scraped by the collector. |
| PodMonitors, ServiceMonitors, and Probes | Enable discovery of PodMonitors, ServiceMonitors, and Probes in your Kubernetes Cluster, and use them to scrape metrics. |
| Profiles | Gather profiles from your Kubernetes Cluster, and deliver them to Pyroscope. |
| Tolerations | Allow workloads to run on Nodes with specific taints, or prevent workloads from running on Nodes with specific taints. |
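As a sketch of how several of these features combine, the following values.yaml fragment enables Cluster metrics, Cluster events, and Pod logs along with the Alloy collectors they rely on. The feature and collector keys reflect the chart's v2/v3 layout; treat them as assumptions to verify against the examples above.

```yaml
cluster:
  name: my-cluster

# Features (assumed v2/v3 feature keys)
clusterMetrics:
  enabled: true      # Node Exporter, kube-state-metrics, kubelet, cAdvisor
clusterEvents:
  enabled: true      # Kubernetes Cluster events from the API server
podLogs:
  enabled: true      # logs from Pods in the Cluster

# Collectors that gather and deliver the data
alloy-metrics:
  enabled: true
alloy-singleton:
  enabled: true      # runs single-instance collection such as Cluster events
alloy-logs:
  enabled: true
```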
Monitoring databases that store metrics, logs, or traces
With meta monitoring, you can monitor these databases and send the data to Grafana Cloud:
- Enterprise databases: Grafana Enterprise Metrics, Grafana Enterprise Logs, and Grafana Enterprise Traces
- OSS databases: Mimir, Loki, and Tempo
MySQL
Gather metrics and logs from MySQL.
Destinations and proxies
Specify one or more destinations, whether a local service deployed on the same cluster or a remote SaaS service. Use proxy URLs and TLS settings to send data to external services.
| Example | Description |
|---|---|
| Loki | Send logs using the loki protocol to a logs destination. |
| OTLP endpoint | Send all your telemetry data to a single destination using an OTLP destination. |
| OTLP or OTLPHTTP | Send metrics, logs, or traces using the OTLP protocol to an OTLP destination. |
| Proxies for external services | Use proxy URLs and TLS settings to send data to external services. |
| Prometheus | Send metrics using the remote write protocol to a metrics destination. |
| Pyroscope | Send profiles using the pyroscope protocol to a profiles destination. |
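For illustration, the following values.yaml fragment defines a local Prometheus destination alongside a remote Loki destination reached through a proxy. The URLs are placeholders, and the proxyURL key is an assumption based on the proxies example.

```yaml
destinations:
  - name: localPrometheus
    type: prometheus
    # in-Cluster service URL (placeholder)
    url: http://prometheus-server.monitoring.svc:9090/api/v1/write
  - name: hostedLogs
    type: loki
    url: https://loki.example.com/loki/api/v1/push   # placeholder
    proxyURL: http://proxy.example.com:8080          # assumed key, per the proxies example
```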
Discovery
You can customize data collection during the discovery phase.
| Example | Description |
|---|---|
| Annotations | Use annotations to automatically discover and gather metrics from Kubernetes Pods and Services. You can use these annotations to further customize by job, instance, path, port number or name, scheme, and scrape interval. Also refer to Kubernetes annotations for more information. |
| Automatic discovery using Prometheus annotations | Use Prometheus-style annotations to enable Alloy to discover metrics, and customize the path and port number. |
| Extra discovery rules and labels | Refine which services are discovered and control target labels using extra rules and labels. |
| MongoDB Atlas databases | Gather metrics from MongoDB Atlas databases with Alloy using the discovery.http component. |
| MySQL | Gather metrics and logs from MySQL. |
| Namespace exclusion | Define which namespaces to exclude from data discovery, and include all other namespaces. |
| Namespace inclusion | Define which namespaces to include in data discovery, and exclude all other namespaces. |
| Pod labels and annotations | Add labels and annotations set on the Kubernetes Pod to the telemetry data for that Pod. |
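As an example of namespace filtering, the following values.yaml fragment restricts Pod log collection to selected namespaces. The namespaces and excludeNamespaces keys are assumptions drawn from the namespace inclusion and exclusion examples.

```yaml
podLogs:
  enabled: true
  # assumed keys, per the namespace inclusion/exclusion examples
  namespaces:
    - production        # only gather logs from these namespaces
  # excludeNamespaces:
  #   - kube-system     # or instead, gather from all namespaces except these
```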
For more information on Grafana Alloy labels and relabeling, refer to:

- Alloy `discovery.relabel`, to control metrics collection or standardize target labels
- Alloy `__meta` labels, to refine data collection of Kubernetes resources. An example of this is shown in etcd.
For examples of rules and labeling after the discovery phase, refer to Processing and labeling.
Helm chart deployment
Deploy the Kubernetes Monitoring Helm chart using Terraform.
Instrumentation for applications
Automatically instrument your applications for telemetry collection.
| Example | Description |
|---|---|
| Metrics | Deploy Grafana Beyla to instrument your application using zero code for metrics collection. |
| Metrics and traces | Deploy Grafana Beyla to instrument your application using zero code for metrics collection, and generate traces for your application. |
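A minimal sketch of enabling zero-code instrumentation, assuming the chart exposes Beyla through an autoInstrumentation feature toggle; refer to the linked examples for the exact keys.

```yaml
autoInstrumentation:
  enabled: true   # assumed feature key; deploys Grafana Beyla for zero-code metrics
```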
Platforms
Customize your platform to work correctly with Kubernetes Monitoring.
| Example | Description |
|---|---|
| Azure AKS | Enable Kubernetes Monitoring to work correctly with Azure AKS Clusters. |
| EKS Fargate | Gather Pod logs on an EKS Fargate Cluster so that Kubernetes Monitoring works correctly. |
| GKE Autopilot | Enable Kubernetes Monitoring to work correctly on GKE Autopilot Clusters. |
| OpenShift | Enable Kubernetes Monitoring to work correctly on OpenShift Clusters. |
Processing and labeling
During the processing phase, after data has been collected, you can enable additional processing for telemetry data, refine which metrics to keep or drop, and change existing labels or add new ones.
| Example | Description |
|---|---|
| Additional labels | Use extraDiscoveryRules to further refine data collection. |
| Additional processing | Enable additional processing for logs and metrics, such as extraMetricProcessingRules and extraLogProcessingStages. |
| Metrics tuning | Include or exclude metrics sent to a metrics destination. The default ConfigMap that results from the configuration process creates allowlists, which keep a subset of the metrics used by Kubernetes Monitoring. With metrics tuning, you can add metrics to, or exclude metrics from, the default allowlist. |
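For instance, a drop rule might be attached to a metric source as in this sketch. The placement of extraMetricProcessingRules and the Alloy rule syntax shown are assumptions to validate against the additional processing example.

```yaml
clusterMetrics:
  enabled: true
  kube-state-metrics:
    # assumed key; appends Alloy prometheus.relabel rules to this source
    extraMetricProcessingRules: |
      rule {
        source_labels = ["namespace"]
        regex         = "temp-.*"
        action        = "drop"   // drop metrics from temporary namespaces
      }
```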
For examples of labels and annotations during the discovery phase, refer to Discovery.
Private image registry
To support environments that are air-gapped or that must not use public image registries, you can do either of the following:
- Globally override the container image registries for every subchart by using a global object.
- Override the container image registry for each individual subchart by setting its image values.
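A sketch of the global override, assuming the chart's global.image.registry value; the registry hostname is a placeholder.

```yaml
global:
  image:
    registry: registry.example.com   # private mirror of the public registries
```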
Remote configuration
Enable Grafana Alloy to fetch and load the configuration from a remote endpoint.
Scaling and reliability
Enhance scaling and reliability where needed.
| Example | Description |
|---|---|
| Alloy auto scaling | Enable an Alloy instance to scale up and down based on CPU and memory use. |
| Collector storage | Enable metric scraping to use a Write-Ahead Log (WAL) to store metrics in case of a scrape failure. Enable log gathering to use a volume to store log file positions, which provides a starting point for reading logs after a restart. |
| Sharded kube-state-metrics | Shard kube-state-metrics to improve scaling. |
Kubernetes annotations
You can target specific namespaces or Pods for data collection with Kubernetes annotations. Annotations are often used to control service discovery, but you can also use them to configure how data is collected.
Annotation autodiscovery
Use the Annotation Autodiscovery feature to discover and scrape Prometheus-style metrics from Pods and Services on your Cluster. You can apply these default annotations to a Pod or Service:
- `k8s.grafana.com/scrape`: Scrapes the Pod or Service for metrics.
- `k8s.grafana.com/job`: The value to use for the `job` label.
- `k8s.grafana.com/instance`: The value to use for the `instance` label.
- `k8s.grafana.com/metrics.container`: The name of the container within the Pod to scrape for metrics. Use this to target a specific container within a Pod that has multiple containers.
- `k8s.grafana.com/metrics.path`: The path to scrape for metrics. Defaults to `/metrics`.
- `k8s.grafana.com/metrics.portNumber`: The port on the Pod or Service to scrape for metrics. Use this to target a specific port by its number rather than all ports.
- `k8s.grafana.com/metrics.portName`: The named port on the Pod or Service to scrape for metrics. Use this to target a specific port by its name rather than all ports.
- `k8s.grafana.com/metrics.scheme`: The scheme to use when scraping metrics. Defaults to `http`.
- `k8s.grafana.com/metrics.param`: Allows for setting HTTP parameters when calling the scrape endpoint. Use with `k8s.grafana.com/metrics.param_<key>="<value>"`.
- `k8s.grafana.com/metrics.scrapeInterval`: The scrape interval to use when scraping metrics. Defaults to `60s`.
- `k8s.grafana.com/metrics.scrapeTimeout`: The scrape timeout to use when scraping metrics. Defaults to `10s`.
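Applied to a Pod, the annotations look like the following sketch. The Pod name, image, and port are placeholders; the annotation keys are those listed above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                                 # placeholder
  annotations:
    k8s.grafana.com/scrape: "true"             # enable scraping for this Pod
    k8s.grafana.com/metrics.portNumber: "8080" # scrape only this port
    k8s.grafana.com/metrics.path: "/custom-metrics"
    k8s.grafana.com/metrics.scrapeInterval: "30s"
spec:
  containers:
    - name: app
      image: my-app:latest                     # placeholder
      ports:
        - containerPort: 8080
```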
Profiling
The Profiling feature allows you to collect profiling data from your applications. This feature can collect profiles using eBPF, Java, or pprof.
eBPF profiling
To use eBPF to collect CPU profiles from a Pod, apply the `profiles.grafana.com/cpu.ebpf.enabled` annotation.
Java profiling
To collect Java profiles from a Pod, apply the `profiles.grafana.com/java.enabled` annotation.
pprof profiling
You can use the following annotations to control profiling for each enabled type (`memory`, `block`, `goroutine`, `mutex`, `cpu`, `fgprof`, `godeltaprof_memory`, `godeltaprof_mutex`, and `godeltaprof_block`):

- `profiles.grafana.com/<type>.scrape`: Collects pprof profiles for the specified type from this Pod.
- `profiles.grafana.com/<type>.port`: Collects profiles for the specified type from this port number.
- `profiles.grafana.com/<type>.port_name`: Collects profiles for the specified type from this named port.
- `profiles.grafana.com/<type>.path`: Collects profiles for the specified type from this path.
- `profiles.grafana.com/<type>.scheme`: The scheme to use when scraping profiles for the specified type. Defaults to `http`.
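For example, a Pod could combine eBPF CPU profiling with pprof memory profiling as in this sketch; the annotation values shown (such as `"true"`) and the port are assumptions.

```yaml
metadata:
  annotations:
    profiles.grafana.com/cpu.ebpf.enabled: "true"   # eBPF CPU profiles
    profiles.grafana.com/memory.scrape: "true"      # pprof memory profiles
    profiles.grafana.com/memory.port: "8080"        # port serving the pprof endpoint
```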
Pod logs
Use the following annotation to control the collection of Pod logs:
- `k8s.grafana.com/logs.job`: The value to use for the `job` label.
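For example (the job name is a placeholder):

```yaml
metadata:
  annotations:
    k8s.grafana.com/logs.job: "checkout-service"   # placeholder job label value
```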
Extra config
You can use the `extraConfig` sections to supply additional configuration to the Grafana Alloy instances. Anything you put in these sections is added to the existing configuration that this chart creates.
Helm provides multiple ways to set these additional configuration values. You can either keep the values in the same file as the rest of your Kubernetes Monitoring configuration, or store them separately as their own files and include them during Helm chart installation.
Set as values
You can set the contents of your extra configuration into your values file:
```
$ ls
values.yaml

$ cat values.yaml
cluster:
  name: my-cluster
...
alloy-metrics:
  extraConfig: |-
    // Any arbitrary Alloy configuration can be placed here.
    logging {
      level = "debug"
    }
...
alloy-logs:
  ...
  extraConfig: |
    // Any arbitrary Alloy configuration can be placed here.
    logging {
      level = "debug"
    }
...

$ helm upgrade grafana-k8s-monitoring grafana/k8s-monitoring --values values.yaml
```
Set as files
You can save the contents of your extra configuration as files and use Helm's `--set-file` argument:

```
$ ls
values.yaml  metricsConfig.alloy  logsConfig.alloy

$ helm upgrade grafana-k8s-monitoring --atomic --timeout 300s grafana/k8s-monitoring \
    --values values.yaml \
    --set-file "alloy-metrics.extraConfig=metricsConfig.alloy" \
    --set-file "alloy-logs.extraConfig=logsConfig.alloy"
```
This method can be beneficial once your extra configuration grows to a certain size.