Caution
Grafana Agent has reached End-of-Life (EOL) on November 1, 2025. Agent is no longer receiving vendor support and will no longer receive security or bug fixes. Current users of Agent Static mode, Agent Flow mode, and Agent Operator should proceed with migrating to Grafana Alloy. If you have already migrated to Alloy, no further action is required. Read more about why we recommend migrating to Grafana Alloy.

Important: This documentation is about an older version. It's relevant only to the release noted, many of the features and functions have been updated or replaced. Please view the current version.

Grafana Agent

Grafana Agent Operator

Deploy the Agent Operator resources

Open source

Deploy the Grafana Agent Operator resources

To start collecting telemetry data, you need to roll out Grafana Agent Operator custom resources into your Kubernetes cluster. Before you can create the custom resources, you must first apply the Agent Custom Resource Definitions (CRDs) and install Agent Operator, with or without Helm. If you haven’t yet taken these steps, follow the instructions in one of the following topics:

Follow the steps in this guide to roll out the Grafana Agent Operator custom resources to:

Scrape and ship cAdvisor and kubelet metrics to a Prometheus-compatible metrics endpoint.
Collect and ship your Pods’ container logs to a Loki-compatible logs endpoint.

The hierarchy of custom resources is as follows:

GrafanaAgent
- MetricsInstance
  - PodMonitor
  - Probe
  - ServiceMonitor
- LogsInstance
  - PodLogs

To learn more about the custom resources Agent Operator provides and their hierarchy, see Grafana Agent Operator architecture.

Note: Agent Operator is currently in beta and its custom resources are subject to change.

Before you begin

Before you begin, make sure that you have deployed the Grafana Agent Operator CRDs and installed Agent Operator into your cluster. See Install Grafana Agent Operator with Helm or Install Grafana Agent Operator for instructions.

Deploy the GrafanaAgent resource

In this section, you’ll roll out a GrafanaAgent resource. See Grafana Agent Operator architecture for a discussion of the resources in the GrafanaAgent resource hierarchy.

Note: Due to the variety of possible deployment architectures, the official Agent Operator Helm chart does not provide built-in templates for the custom resources described in this guide. You must configure and deploy these manually as described in this section. We recommend templating and adding the following manifests to your own in-house Helm charts and GitOps flows.

To deploy the GrafanaAgent resource:

Copy the following manifests to a file:

apiVersion: monitoring.grafana.com/v1alpha1
kind: GrafanaAgent
metadata:
  name: grafana-agent
  namespace: default
  labels:
    app: grafana-agent
spec:
  image: grafana/agent:v0.31.3
  logLevel: info
  serviceAccountName: grafana-agent
  metrics:
    instanceSelector:
      matchLabels:
        agent: grafana-agent-metrics
    externalLabels:
      cluster: cloud

  logs:
    instanceSelector:
      matchLabels:
        agent: grafana-agent-logs

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: grafana-agent
  namespace: default

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-agent
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - nodes/metrics
  - services
  - endpoints
  - pods
  - events
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  - /metrics/cadvisor
  verbs:
  - get

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: grafana-agent
subjects:
- kind: ServiceAccount
  name: grafana-agent
  namespace: default

In the first manifest, the GrafanaAgent resource:

Specifies an Agent image version.
Specifies MetricsInstance and LogsInstance selectors. These search for MetricsInstances and LogsInstances in the same namespace with labels matching agent: grafana-agent-metrics and agent: grafana-agent-logs, respectively.
Sets a cluster: cloud label for all metrics shipped to your Prometheus-compatible endpoint. Change this label to your cluster name. To search for MetricsInstances or LogsInstances in a different namespace, use the instanceNamespaceSelector field. To learn more about this field, see the GrafanaAgent CRD specification.

Customize the manifests as needed and roll them out to your cluster using kubectl apply -f followed by the filename.
This step creates a ServiceAccount, ClusterRole, and ClusterRoleBinding for the GrafanaAgent resource.
Deploying a GrafanaAgent resource on its own does not spin up Agent Pods. Agent Operator creates Agent Pods once MetricsInstance and LogsIntance resources have been created. Follow the instructions in the Deploy a MetricsInstance resource and Deploy LogsInstance and PodLogs resources sections to create these resources.

Disable feature flags reporting

To disable the reporting usage of feature flags to Grafana, set disableReporting field to true.

Disable support bundle generation

To disable the support bundles functionality, set the disableSupportBundle field to true.

Deploy a MetricsInstance resource

Next, you’ll roll out a MetricsInstance resource. MetricsInstance resources define a remote_write sink for metrics and configure one or more selectors to watch for creation and updates to *Monitor objects. These objects allow you to define Agent scrape targets via Kubernetes manifests:

To deploy a MetricsInstance resource:

Copy the following manifest to a file:

apiVersion: monitoring.grafana.com/v1alpha1
kind: MetricsInstance
metadata:
  name: primary
  namespace: default
  labels:
    agent: grafana-agent-metrics
spec:
  remoteWrite:
  - url: your_remote_write_URL
    basicAuth:
      username:
        name: primary-credentials-metrics
        key: username
      password:
        name: primary-credentials-metrics
        key: password

  # As an alternative authentication method, Grafana Agent also supports OAuth2.
  # - url: your_remote_write_URL
  #   oauth2:
  #     clientId:
  #       secret:
  #         key: username # Kubernetes Secret Key
  #         name: primary-credentials-metrics # Kubernetes Secret Name
  #     clientSecret:
  #       key: password # Kubernetes Secret Key
  #       name: primary-credentials-metrics # Kubernetes Secret Name
  #     tokenUrl: https://auth.example.com/realms/master/protocol/openid-connect/token


  # Supply an empty namespace selector to look in all namespaces. Remove
  # this to only look in the same namespace as the MetricsInstance CR
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      instance: primary

  # Supply an empty namespace selector to look in all namespaces. Remove
  # this to only look in the same namespace as the MetricsInstance CR.
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      instance: primary

  # Supply an empty namespace selector to look in all namespaces. Remove
  # this to only look in the same namespace as the MetricsInstance CR.
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      instance: primary

Replace the remote_write URL and customize the namespace and label configuration as necessary.
This step associates the MetricsInstance resource with the agent: grafana-agent GrafanaAgent resource deployed in the previous step. The MetricsInstance resource watches for creation and updates to *Monitors with the instance: primary label.

Once you’ve rolled out the manifest, create the basicAuth credentials using a Kubernetes Secret:

apiVersion: v1
kind: Secret
metadata:
  name: primary-credentials-metrics
  namespace: default
stringData:
  username: 'your_cloud_prometheus_username'
  password: 'your_cloud_prometheus_API_key'

If you’re using Grafana Cloud, you can find your hosted Loki endpoint username and password by clicking Details on the Loki tile on the Grafana Cloud Portal. If you want to base64-encode these values yourself, use data instead of stringData.

Once you’ve rolled out the MetricsInstance and its Secret, you can confirm that the MetricsInstance Agent is up and running using kubectl get pod. Since you haven’t defined any monitors yet, this Agent doesn’t have any scrape targets defined. In the next section, you’ll create scrape targets for the cAdvisor and kubelet endpoints exposed by the kubelet service in the cluster.

Create ServiceMonitors for kubelet and cAdvisor endpoints

Next, you’ll create ServiceMonitors for kubelet and cAdvisor metrics exposed by the kubelet service. Every Node in your cluster exposes kubelet and cAdvisor metrics at /metrics and /metrics/cadvisor, respectively. Agent Operator creates a kubelet service that exposes these Node endpoints so that they can be scraped using ServiceMonitors.

To scrape the kubelet and cAdvisor endpoints:

Copy the following kubelet ServiceMonitor manifest to a file, then roll it out in your cluster using kubectl apply -f followed by the filename.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    instance: primary
  name: kubelet-monitor
  namespace: default
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 60s
    metricRelabelings:
    - action: keep
      regex: kubelet_cgroup_manager_duration_seconds_count|go_goroutines|kubelet_pod_start_duration_seconds_count|kubelet_runtime_operations_total|kubelet_pleg_relist_duration_seconds_bucket|volume_manager_total_volumes|kubelet_volume_stats_capacity_bytes|container_cpu_usage_seconds_total|container_network_transmit_bytes_total|kubelet_runtime_operations_errors_total|container_network_receive_bytes_total|container_memory_swap|container_network_receive_packets_total|container_cpu_cfs_periods_total|container_cpu_cfs_throttled_periods_total|kubelet_running_pod_count|node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate|container_memory_working_set_bytes|storage_operation_errors_total|kubelet_pleg_relist_duration_seconds_count|kubelet_running_pods|rest_client_request_duration_seconds_bucket|process_resident_memory_bytes|storage_operation_duration_seconds_count|kubelet_running_containers|kubelet_runtime_operations_duration_seconds_bucket|kubelet_node_config_error|kubelet_cgroup_manager_duration_seconds_bucket|kubelet_running_container_count|kubelet_volume_stats_available_bytes|kubelet_volume_stats_inodes|container_memory_rss|kubelet_pod_worker_duration_seconds_count|kubelet_node_name|kubelet_pleg_relist_interval_seconds_bucket|container_network_receive_packets_dropped_total|kubelet_pod_worker_duration_seconds_bucket|container_start_time_seconds|container_network_transmit_packets_dropped_total|process_cpu_seconds_total|storage_operation_duration_seconds_bucket|container_memory_cache|container_network_transmit_packets_total|kubelet_volume_stats_inodes_used|up|rest_client_requests_total
      sourceLabels:
      - __name__
    port: https-metrics
    relabelings:
    - sourceLabels:
      - __metrics_path__
      targetLabel: metrics_path
    - action: replace
      targetLabel: job
      replacement: integrations/kubernetes/kubelet
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app.kubernetes.io/name: kubelet

Copy the following cAdvisor ServiceMonitor manifest to a file, then roll it out in your cluster using kubectl apply -f followed by the filename.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    instance: primary
  name: cadvisor-monitor
  namespace: default
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    honorTimestamps: false
    interval: 60s
    metricRelabelings:
    - action: keep
      regex: kubelet_cgroup_manager_duration_seconds_count|go_goroutines|kubelet_pod_start_duration_seconds_count|kubelet_runtime_operations_total|kubelet_pleg_relist_duration_seconds_bucket|volume_manager_total_volumes|kubelet_volume_stats_capacity_bytes|container_cpu_usage_seconds_total|container_network_transmit_bytes_total|kubelet_runtime_operations_errors_total|container_network_receive_bytes_total|container_memory_swap|container_network_receive_packets_total|container_cpu_cfs_periods_total|container_cpu_cfs_throttled_periods_total|kubelet_running_pod_count|node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate|container_memory_working_set_bytes|storage_operation_errors_total|kubelet_pleg_relist_duration_seconds_count|kubelet_running_pods|rest_client_request_duration_seconds_bucket|process_resident_memory_bytes|storage_operation_duration_seconds_count|kubelet_running_containers|kubelet_runtime_operations_duration_seconds_bucket|kubelet_node_config_error|kubelet_cgroup_manager_duration_seconds_bucket|kubelet_running_container_count|kubelet_volume_stats_available_bytes|kubelet_volume_stats_inodes|container_memory_rss|kubelet_pod_worker_duration_seconds_count|kubelet_node_name|kubelet_pleg_relist_interval_seconds_bucket|container_network_receive_packets_dropped_total|kubelet_pod_worker_duration_seconds_bucket|container_start_time_seconds|container_network_transmit_packets_dropped_total|process_cpu_seconds_total|storage_operation_duration_seconds_bucket|container_memory_cache|container_network_transmit_packets_total|kubelet_volume_stats_inodes_used|up|rest_client_requests_total
      sourceLabels:
      - __name__
    path: /metrics/cadvisor
    port: https-metrics
    relabelings:
    - sourceLabels:
      - __metrics_path__
      targetLabel: metrics_path
    - action: replace
      targetLabel: job
      replacement: integrations/kubernetes/cadvisor
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app.kubernetes.io/name: kubelet

These two ServiceMonitors configure Agent to scrape all the kubelet and cAdvisor endpoints in your Kubernetes cluster (one of each per Node). In addition, it defines a job label which you can update (it is preset here for compatibility with Grafana Cloud’s Kubernetes integration). It also provides an allowlist containing a core set of Kubernetes metrics to reduce remote metrics usage. If you don’t need this allowlist, you can omit it, however, your metrics usage will increase significantly.

When you’re done, Agent should now be shipping kubelet and cAdvisor metrics to your remote Prometheus endpoint. To check this in Grafana Cloud, go to your dashboards, select Integration - Kubernetes, then select Kubernetes / Kubelet.

Deploy LogsInstance and PodLogs resources

Next, you’ll deploy a LogsInstance resource to collect logs from your cluster Nodes and ship these to your remote Loki endpoint. Agent Operator deploys a DaemonSet of Agents in your cluster that will tail log files defined in PodLogs resources.

To deploy the LogsInstance resource into your cluster:

Copy the following manifest to a file, then roll it out in your cluster using kubectl apply -f followed by the filename.
YAML
```
apiVersion: monitoring.grafana.com/v1alpha1
kind: LogsInstance
metadata:
  name: primary
  namespace: default
  labels:
    agent: grafana-agent-logs
spec:
  clients:
  - url: your_remote_logs_URL
    basicAuth:
      username:
        name: primary-credentials-logs
        key: username
      password:
        name: primary-credentials-logs
        key: password

  # Supply an empty namespace selector to look in all namespaces. Remove
  # this to only look in the same namespace as the LogsInstance CR
  podLogsNamespaceSelector: {}
  podLogsSelector:
    matchLabels:
      instance: primary
```
This LogsInstance picks up PodLogs resources with the instance: primary label. Be sure to set the Loki URL to the correct push endpoint. For Grafana Cloud, this will look similar to logs-prod-us-central1.grafana.net/loki/api/v1/push, however check the Grafana Cloud Portal to confirm by clicking Details on the Loki tile.
Also note that this example uses the agent: grafana-agent-logs label, which associates this LogsInstance with the GrafanaAgent resource defined earlier. This means that it will inherit requests, limits, affinities and other properties defined in the GrafanaAgent custom resource.
To create the Secret for the LogsInstance resource, copy the following Secret manifest to a file, then roll it out in your cluster using kubectl apply -f followed by the filename.
YAML
```
apiVersion: v1
kind: Secret
metadata:
  name: primary-credentials-logs
  namespace: default
stringData:
  username: 'your_username_here'
  password: 'your_password_here'
```
If you’re using Grafana Cloud, you can find your hosted Loki endpoint username and password by clicking Details on the Loki tile on the Grafana Cloud Portal. If you want to base64-encode these values yourself, use data instead of stringData.
Copy the following PodLogs manifest to a file, then roll it to your cluster using kubectl apply -f followed by the filename. The manifest defines your logging targets. Agent Operator turns this into Agent configuration for the logs subsystem, and rolls it out to the DaemonSet of logging Agents.
Note: The following is a minimal working example which you should adapt to your production needs.
YAML
```
apiVersion: monitoring.grafana.com/v1alpha1
kind: PodLogs
metadata:
  labels:
    instance: primary
  name: kubernetes-pods
  namespace: default
spec:
  pipelineStages:
    - docker: {}
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels: {}
```
This example tails container logs for all Pods in the default namespace. You can restrict the set of matched Pods by using the matchLabels selector. You can also set additional pipelineStages and create relabelings to add or modify log line labels. To learn more about the PodLogs specification and available resource fields, see the PodLogs CRD.
The above PodLogs resource adds the following labels to log lines:
- namespace
- service
- pod
- container
- job (set to PodLogs_namespace/PodLogs_name)
- __path__ (the path to log files, set to /var/log/pods/*$1/*.log where $1 is __meta_kubernetes_pod_uid/__meta_kubernetes_pod_container_name)
To learn more about this configuration format and other available labels, see the Promtail Scraping documentation. Agent Operator loads this configuration into the LogsInstance agents automatically.

The DaemonSet of logging agents should be tailing your container logs, applying default labels to the log lines, and shipping them to your remote Loki endpoint.

Summary

You’ve now rolled out the following into your cluster:

A GrafanaAgent resource that discovers one or more MetricsInstance and LogsInstances resources.
A MetricsInstance resource that defines where to ship collected metrics.
A ServiceMonitor resource to collect cAdvisor and kubelet metrics.
A LogsInstance resource that defines where to ship collected logs.
A PodLogs resource to collect container logs from Kubernetes Pods.

What’s next

You can verify that everything is working correctly by navigating to your Grafana instance and querying your Loki and Prometheus data sources.

Tip: You can deploy multiple GrafanaAgent resources to isolate allocated resources to the agent pods. By default, the GrafanaAgent resource determines the resources of all deployed agent containers. However, you might want different memory limits for metrics versus logs.

Was this page helpful?

Create a GitHub issue

Email docs@grafana.com

Help and support

Community

Deploy the Grafana Agent Operator resources

Before you begin

Deploy the GrafanaAgent resource

Disable feature flags reporting

Disable support bundle generation

Deploy a MetricsInstance resource

Create ServiceMonitors for kubelet and cAdvisor endpoints

Deploy LogsInstance and PodLogs resources

Summary

What’s next

Was this page helpful?

Still have questions?

Get every update

Deploy the Grafana Agent Operator resources

Before you begin

Deploy the GrafanaAgent resource

Disable feature flags reporting

Disable support bundle generation

Deploy a MetricsInstance resource

Create ServiceMonitors for kubelet and cAdvisor endpoints

Deploy LogsInstance and PodLogs resources

Summary

What’s next

Was this page helpful?

Related resources from Grafana Labs