Collect logs in Kubernetes with the OpenTelemetry Collector

In Kubernetes, when applications log to stdout, the container runtime captures the output and stores it as files on the node running the application. It is recommended to run a log collector on each node to collect these logs and send them to centralized log storage. In this guide we will show you how to use the Collector to collect these logs from the nodes.

Kubernetes Logging

In general, Kubernetes stores the logs in /var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/<run_id>.log. We will use the filelog receiver in the Collector to collect the logs from this directory. The Collector needs to run as a DaemonSet so that it can collect the logs from every node. The format in which the logs are stored depends on the container runtime being used. For example, if the runtime is containerd, the format is ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$.
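
If you deploy the Collector yourself rather than through the Helm chart described below, running it as a DaemonSet also means mounting the node's log directory into the Collector pods so the filelog receiver can read it. The following is a minimal sketch of the relevant DaemonSet pieces; the name is a placeholder and the contrib image is assumed because the filelog receiver ships in the contrib distribution.

yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector # placeholder name
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          # The filelog receiver is included in the contrib distribution.
          image: otel/opentelemetry-collector-contrib:latest
          volumeMounts:
            # Read-only access to the node's pod log files.
            - name: varlogpods
              mountPath: /var/log/pods
              readOnly: true
      volumes:
        - name: varlogpods
          hostPath:
            path: /var/log/pods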

Configure the Collector

The different formats are confusing and not trivial to navigate, so we recommend using the following configuration for the filelog receiver. It automatically detects the format and parses the logs accordingly.

yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    include_file_name: false
    include_file_path: true
    operators:
      - id: get-format
        routes:
          - expr: body matches "^\\{"
            output: parser-docker
          - expr: body matches "^[^ Z]+ "
            output: parser-crio
          - expr: body matches "^[^ Z]+Z"
            output: parser-containerd
        type: router
      - id: parser-crio
        output: extract_metadata_from_filepath
        regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        timestamp:
          layout: 2006-01-02T15:04:05.999999999Z07:00
          layout_type: gotime
          parse_from: attributes.time
        type: regex_parser
      - id: parser-containerd
        output: extract_metadata_from_filepath
        regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        timestamp:
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          parse_from: attributes.time
        type: regex_parser
      - id: parser-docker
        output: extract_metadata_from_filepath
        timestamp:
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          parse_from: attributes.time
        type: json_parser
      - id: extract_metadata_from_filepath
        parse_from: attributes["log.file.path"]
        regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
        type: regex_parser
      - from: attributes.stream
        to: attributes["log.iostream"]
        type: move
      - from: attributes.container_name
        to: resource["k8s.container.name"]
        type: move
      - from: attributes.namespace
        to: resource["k8s.namespace.name"]
        type: move
      - from: attributes.pod_name
        to: resource["k8s.pod.name"]
        type: move
      - from: attributes.restart_count
        to: resource["k8s.container.restart_count"]
        type: move
      - from: attributes.uid
        to: resource["k8s.pod.uid"]
        type: move
      - from: attributes.log
        to: body
        type: move
    start_at: beginning
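
The filelog receiver on its own does not export anything; it has to be wired into a logs pipeline together with an exporter. A minimal sketch follows, using the debug exporter to verify that logs are flowing; for a real backend you would swap in an exporter such as otlp pointing at your log storage.

yaml
exporters:
  # Prints received logs to the Collector's own output; useful for verification only.
  debug: {}

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]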

Helm chart

Configuring this might seem daunting, but it has been made very easy if you are using the Helm chart. You can find the Helm chart at https://github.com/open-telemetry/opentelemetry-helm-charts. To enable log collection, set the following values:

yaml
mode: daemonset

presets:
  logsCollection:
    enabled: true
    includeCollectorLogs: true
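
With this preset enabled, the chart generates a filelog receiver configuration similar to the one shown above and mounts the node's log directories into the Collector pods; includeCollectorLogs controls whether the Collector's own logs are collected as well. To send the collected logs to your backend, you can extend the same values file, for example with an OTLP exporter. This is a sketch only: the endpoint is a placeholder, and the exact way the chart merges this with its default configuration can vary between chart versions, so check the chart's documentation.

yaml
config:
  exporters:
    otlp:
      # Placeholder endpoint; replace with your log backend.
      endpoint: my-logs-backend.example.com:4317
  service:
    pipelines:
      logs:
        exporters: [otlp]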