Set up Kubernetes Event monitoring (beta)

Grafana Agent bundles an eventhandler integration that watches for Kubernetes Events in your clusters and ships them to Grafana Cloud Loki. Kubernetes controllers emit Events as they perform operations in your cluster (such as starting containers or scheduling Pods), and these Events can be a rich source of logging information to help you debug, monitor, and alert on your Kubernetes workloads. Normally, you query these Events using kubectl get event or kubectl describe; with the eventhandler integration enabled, you can query them directly from Grafana Cloud.

Before you start

To begin, you’ll need the following:

  • A Kubernetes cluster
  • The kubectl command-line tool installed and available on your machine
  • A Grafana Cloud account or Loki instance that will receive log entries

Deployment options

The eventhandler integration is one of several integrations embedded directly into Grafana Agent. You can run the integration in one of two ways:

  • As a dedicated Grafana Agent StatefulSet running only the eventhandler integration
  • As part of an existing Agent Deployment or StatefulSet

NOTE: Although you can run the integration without persistent storage, we recommend running it with dedicated disk storage (a StatefulSet, or a Deployment with a PersistentVolume and PersistentVolumeClaim) to take advantage of its caching feature. Kubernetes Events have a lifespan of an hour; after an hour, they are deleted from the cluster’s internal key-value store. If you restart the integration within an hour of it going down, eventhandler will re-ship any Events still present in the cluster’s internal store unless the cache file from the previous run is available.
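
As a rough illustration of the recommendation above, the following sketch shows how a StatefulSet can request dedicated disk storage through volumeClaimTemplates. It is not the sample manifest from the Agent documentation: the resource names, the single replica, and the 1Gi size are assumptions chosen for illustration, and the Agent arguments and configuration volume are omitted.

```yaml
# Hypothetical sketch: names, replica count, and storage size are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: grafana-agent-eventhandler
spec:
  serviceName: grafana-agent-eventhandler
  replicas: 1
  selector:
    matchLabels:
      name: grafana-agent-eventhandler
  template:
    metadata:
      labels:
        name: grafana-agent-eventhandler
    spec:
      containers:
        - name: agent
          image: grafana/agent:latest
          ## Agent args and the config volume are omitted from this sketch
          volumeMounts:
            ## The Event cache file lives on the persistent volume
            - name: eventhandler-cache
              mountPath: /etc/eventhandler
  ## Kubernetes creates one PersistentVolumeClaim per replica from this template
  volumeClaimTemplates:
    - metadata:
        name: eventhandler-cache
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

Because the claim outlives individual Pods, a restarted Pod mounts the same volume and can find the cache file written by the previous run.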

Option 1: Run a dedicated eventhandler

To run a dedicated eventhandler StatefulSet and for full documentation and configuration instructions, see eventhandler_config from the Grafana Agent documentation. These docs provide sample manifests and configuration for an Agent StatefulSet running only the eventhandler integration.

You can also use a Deployment with a PersistentVolume and PersistentVolumeClaim or use node-local storage, but these methods are outside the scope of this guide and require modifying the provided manifests and instructions.

Option 2: Enable eventhandler in an existing Agent Deployment or StatefulSet

To enable the eventhandler integration in an existing Grafana Agent setup or to avoid running another Agent in your cluster, you can modify your existing Agent’s configuration to enable the integration.

Note: If you’re using a Deployment, you should attach persistent disk storage and set the integration’s cache_path to a location on that storage to take advantage of eventhandler’s Event caching. This isn’t required, but it prevents double-shipping cluster Events to Loki if the Agent restarts. To learn more about configuring a PersistentVolume for storage, please see Configure a Pod to Use a PersistentVolume for Storage from the Kubernetes documentation.
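
A minimal sketch of what attaching persistent storage to a Deployment might look like is shown below. The claim name and the 1Gi size are assumptions, and the cluster is assumed to have a default StorageClass that can satisfy the claim. The mount path is chosen so that it contains the cache_path used in this guide (/etc/eventhandler/eventhandler.cache).

```yaml
# Hypothetical sketch: the claim name and storage size are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: eventhandler-cache
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Fragment of the Deployment's Pod template (spec.template.spec), not a complete manifest
volumes:
  - name: eventhandler-cache
    persistentVolumeClaim:
      ## must match the claim's metadata.name above
      claimName: eventhandler-cache
containers:
  - name: agent
    volumeMounts:
      ## the directory that contains the integration's cache_path
      - name: eventhandler-cache
        mountPath: /etc/eventhandler
```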

  1. Enable the integration

    Modify your existing Agent configuration by adding the following stanza to your Agent’s agent.yaml or ConfigMap:

    server:
      . . .
    metrics:
      . . .
    integrations:
      eventhandler:
        cache_path: "/etc/eventhandler/eventhandler.cache"
        logs_instance: "default"
      . . .
    

    This block enables the integration and instructs it to cache the last Event shipped at the path provided by cache_path. For a full configuration reference, please see eventhandler_config from the Agent documentation.

  2. Enable the logs instance

    Add the following block of Agent logs config:

    server: . . .
    metrics: . . .
    integrations:
      ## see above
      . . .
    logs:
      configs:
        - name: default
          clients:
            ## you may need to replace this with a different endpoint
            - url: https://logs-prod-us-central1.grafana.net/api/prom/push
              basic_auth:
                username: YOUR_LOKI_USER
                password: YOUR_LOKI_API_KEY
              external_labels:
                cluster: 'cloud'
                job: 'integrations/kubernetes/eventhandler'
          positions:
            filename: /tmp/positions0.yaml
    

    This block enables an instance of Agent’s logs subsystem (embedded promtail) and configures it with the appropriate Loki credentials:

    • default is the name of the logs instance. Its clients section determines where Events get shipped as Loki log lines, and the name must match logs_instance in the integrations config block. You can also set default labels on log lines using the external_labels parameter.

    For full logs_config reference, please see logs_config from the Agent docs.

    You can find your Loki credentials in your org’s Grafana Cloud Portal.

  3. Run eventhandler

    To run eventhandler, you need to pass in the following flag when you run Agent:

    -enable-features=integrations-next
    

    This enables the latest version of the Agent integration subsystem. To learn more, please see Integrations Revamp from the Agent documentation.

    A full Kubernetes container spec should be similar to this one:

    containers:
      - name: agent
        image: grafana/agent:latest
        imagePullPolicy: IfNotPresent
        args:
          - -config.file=/etc/agent/agent.yaml
          - -enable-features=integrations-next
        command:
          - /bin/grafana-agent
        env:
          - name: HOSTNAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
        ports:
          - containerPort: 12345
            name: http-metrics
        volumeMounts:
          ## Should use a ConfigMap volume, stores Agent config
          - name: grafana-agent
            mountPath: /etc/agent
          ## Optional, but should use a persistent volume, stores Event cache
          - name: eventhandler-cache
            mountPath: /etc/eventhandler
    

    You should modify these parameters depending on your architecture and configured PersistentVolumes and ConfigMaps.

  4. Add ClusterRole events permission

    You also need to allow Agent’s ClusterRole to access the events resource from the Kubernetes API:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: grafana-agent
    rules:
      - apiGroups:
          - ''
        resources:
          - nodes
          - nodes/proxy
          - services
          - endpoints
          - pods
          ## added "events" here
          - events
        verbs:
          - get
          - list
          - watch
      - nonResourceURLs:
          - /metrics
        verbs:
          - get
    

    eventhandler only requires the get, list, and watch verbs on the events resource, but for clarity we’ve appended the required permission to the default ClusterRole provided by the Kubernetes integration (which also allows Prometheus service discovery).
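
    Since eventhandler itself needs only read access to events, an Agent that runs nothing but this integration could in principle use a narrower role. The following is a minimal sketch under that assumption; the role name is hypothetical, and the role must still be bound to the Agent's ServiceAccount with a ClusterRoleBinding.

    ```yaml
    # Hypothetical sketch: the role name is an assumption.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: grafana-agent-eventhandler
    rules:
      - apiGroups:
          - ''
        resources:
          ## only the resource eventhandler reads
          - events
        verbs:
          - get
          - list
          - watch
    ```

    If the same Agent also scrapes metrics, keep the broader default ClusterRole shown above, because service discovery relies on the other resources.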

Please report any issues with this integration in the Grafana Agent GitHub repository or on the Grafana Labs Community Slack (in #agent).

eventhandler is enabled by default in the latest version of the Kubernetes Monitoring agent manifests.