Collect logs through Kubernetes stdout with the OpenTelemetry Collector

While the best practice for sending logs with OpenTelemetry is to use the OTLP protocol, some use cases prevent this pattern and require writing logs to files or stdout. Common use cases that prevent using the OTLP protocol include:

  • Lack of OTLP log support in the OpenTelemetry SDK. The OpenTelemetry SDKs for Go, Python, Ruby, JavaScript/Node.js, and PHP don't provide a stable implementation of OTLP for logs.
  • Organizational constraints, often related to reliability practices, that require the use of files for logs

You can collect file-based logs with the OpenTelemetry Collector. The following example collects logs emitted through Kubernetes stdout, but you can apply the same pattern to logs written to files.

Architecture to collect logs through Kubernetes stdout with the OpenTelemetry Collector

For proper correlation with traces and metrics, you should contextualize logs with the same resource attributes and with the trace and span IDs, which means:

  • Enrich logs with the same identifying resource attributes, for example service.name, service.namespace, service.instance.id, and deployment.environment, and with trace_id and span_id
  • Send logs through the same metadata enrichment pipeline in the OpenTelemetry Collector, for example the Kubernetes Attributes Processor or the Resource Detection Processor

While this common enrichment is provided out of the box when exporting logs through OTLP, you must add these attributes to the log lines yourself when they're collected through files or stdout.

This Kubernetes architecture diagram shows containerized application logs emitted through stdout, collected with the OpenTelemetry Collector, and sent to Grafana Cloud.
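
For illustration, here is a minimal sketch of a Collector logs pipeline that reuses this enrichment, assuming a filelog receiver and an otlphttp exporter as placeholders for whatever components your pipeline actually uses:

yaml
# Minimal sketch: shared enrichment processors applied to a logs pipeline
processors:
  # Add Kubernetes metadata (pod, namespace, deployment, ...) to each log record
  k8sattributes: {}
  # Detect host and environment resource attributes
  resourcedetection:
    detectors: [env, system]

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [k8sattributes, resourcedetection]
      exporters: [otlphttp]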

Pros and cons of JSON and unstructured text to enrich logs with contextualization metadata

To carry the resource attributes over into the log lines, use one of the following patterns:

  1. Export unstructured logs and parse them with regular expressions, for example:

    2024-09-17T11:29:54  INFO [nio-8080-exec-1] c.e.OrderController  : Order completed - service.name=order-processor, service.instance.id=i-123456, span_id=1d5f8ca3f9366fac...
  2. Export logs in a structured format such as JSON and parse them with the native parser of the chosen format, for example:

    json
    {"timestamp": "2024-09-17T11:29:54", "level": "INFO", "body":"Order completed", "logger": "c.e.OrderController", "service_name": "order-processor", "service_instance_id": "i-123456", "span_id":"1d5f8ca3f9366fac"...}

Both patterns have pros and cons:

|                            | JSON logs | Unstructured logs |
| -------------------------- | --------- | ----------------- |
| Correlation                | +++       | +++               |
| Human readability          | The verbosity of JSON can seriously erode readability | Contextualization attributes can be appended at the end of the log line, preserving readability |
| Reliability of the parsing | It's simple to define robust JSON parsing rules | Parsing unstructured text with regular expressions is fragile, particularly due to multi-line log messages like stack traces, to the point where it requires monitoring parsing failures |
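
To make the parsing comparison concrete, the following sketch shows how the unstructured example line above could be parsed with the filelog receiver's regex_parser operator. The regular expression and field names are assumptions derived from that single example line; the JSON alternative uses the json_parser operator and is shown in the Collector configuration later in this guide.

yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    operators:
      # Undo the container runtime log wrapping before parsing the application line
      - type: container
      # Parse the unstructured example line with named capture groups.
      # Multi-line messages such as stack traces don't match this pattern,
      # and the appended key=value attributes still require further parsing.
      - type: regex_parser
        regex: '^(?P<time>\S+)\s+(?P<level>\w+)\s+\[(?P<thread>[^\]]+)\]\s+(?P<logger>\S+)\s+:\s+(?P<message>.*)$'
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%dT%H:%M:%S'
        severity:
          parse_from: attributes.level

Any change to the log layout, or a message that spans several lines, silently breaks this extraction, which is why the rest of this guide uses JSON logs.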

Emit contextualized JSON logs with Java

Most popular logging frameworks, such as Log4j or SLF4J/Logback in Java, support emitting JSON-formatted logs. Integrating OpenTelemetry with logging libraries requires specifying which resource attributes should be available in the log line. For readability and to limit verbosity, we recommend adding only the attributes required to filter and correlate logs, such as service.name, service.namespace, deployment.environment, and service.instance.id.

The following example uses Spring Boot to enrich logs with OpenTelemetry attributes and emit them formatted in JSON through stdout.

Note

You can find the full code of the example in the Docker LGTM repository.
  • Create a Spring Boot application (3.3.0 or newer) that uses the default logging configuration with the Logback library and stdout output
  • Instrument the Spring Boot application with the OpenTelemetry Java Agent following the setup flow defined by the Grafana Cloud integration for Java:
    • Open the Grafana Cloud home page in a web browser
    • Navigate to Connections > Add new Connection
    • Select Java OpenTelemetry and follow the setup instructions
  • Enrich Logback logs with OpenTelemetry resource attributes and log attributes using:
export OTEL_INSTRUMENTATION_COMMON_MDC_RESOURCE_ATTRIBUTES=service.namespace,service.name,service.instance.id,service.version,deployment.environment
  • Change the Logback Spring Boot configuration to emit JSON logs with OpenTelemetry contextualization by adding the following logback-spring.xml under src/main/resources:
xml
<!-- tested with Logback 1.5.0 and Spring Boot 3.3.0 -->
<configuration>
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <include resource="org/springframework/boot/logging/logback/console-appender.xml"/>

    <root level="INFO">
        <appender-ref ref="CONSOLE_JSON"/>
    </root>

    <appender name="CONSOLE_JSON" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.classic.encoder.JsonEncoder">
            <withFormattedMessage>true</withFormattedMessage>
            <withMessage>false</withMessage>
            <withArguments>false</withArguments>
            <withSequenceNumber>false</withSequenceNumber>
            <withNanoseconds>false</withNanoseconds>
        </encoder>
    </appender>
</configuration>
  • Deploy the Spring Boot application on Kubernetes and verify that the application logs are written to the container stdout stream with JSON formatting and OpenTelemetry contextualization

    • Example of kubectl logs my_pod_name | jq:

      json
      {
        "timestamp": 1727346005788,
        "level": "INFO",
        "threadName": "http-nio-8080-exec-5",
        "loggerName": "com.grafana.example.RollController",
        "context": {
          "name": "default",
          "birthdate": 1727345887787,
          "properties": {
          }
        },
        "mdc": {
          "trace_id": "97a39974ba2dfc9275e4d31dc2730ee4",
          "trace_flags": "01",
          "service.name": "dice",
          "service.instance.id": "0fb18318-06c0-4893-9a8b-353c00b45227",
          "service.version": "1.1",
          "span_id": "08d6d83d645e3ad2",
          "service.namespace": "shop",
          "deployment.environment": "staging"
        },
        "formattedMessage": "Anonymous player is rolling the dice: 6",
        "throwable": null
      }
  • Configure the OpenTelemetry Collector instance with the file log receiver

    • Add a file log receiver
    • Add the container parser operator
    • Add the json parser operator
      • Map body to attributes.formattedMessage
      • Map timestamp.parse_from to attributes.timestamp
      • Map severity.parse_from to attributes.level
      • Map trace.trace_id.parse_from to attributes.mdc.trace_id
      • Map trace.span_id.parse_from to attributes.mdc.span_id
      • Map trace.trace_flags.parse_from to attributes.mdc.trace_flags
    • Add the move operator
      • Move top level fields, such as threadName, to their corresponding OTel attributes, such as thread.name
      • Move entries from the mdc field to the resource entry of the log entry
    • Remove all unnecessary fields using the attributes operator
    • You can find the full configuration in the reference OTel collector configuration; a partial sketch of the receiver section follows below
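
As an outline of these steps, here is a partial, hedged sketch of what the filelog receiver section could look like. The file paths, attribute names, and the exact set of move and remove operators are assumptions based on the JSON log example above; refer to the reference configuration for the authoritative version:

yaml
receivers:
  filelog:
    # Container stdout logs written by the kubelet
    include:
      - /var/log/pods/*/*/*.log
    operators:
      # Parse the container log format and attach container metadata
      - type: container
      # Parse the JSON log line emitted by the Logback JsonEncoder
      - type: json_parser
        parse_from: body
        timestamp:
          parse_from: attributes.timestamp
          layout_type: epoch
          layout: ms
        severity:
          parse_from: attributes.level
        trace:
          trace_id:
            parse_from: attributes.mdc.trace_id
          span_id:
            parse_from: attributes.mdc.span_id
          trace_flags:
            parse_from: attributes.mdc.trace_flags
      # Use the formatted message as the log body
      - type: move
        from: attributes.formattedMessage
        to: body
      # Rename top-level fields to their OTel attribute names
      - type: move
        from: attributes.threadName
        to: attributes["thread.name"]
      # Promote MDC entries to resource attributes
      - type: move
        from: attributes.mdc["service.name"]
        to: resource["service.name"]
      - type: move
        from: attributes.mdc["service.namespace"]
        to: resource["service.namespace"]
      - type: move
        from: attributes.mdc["service.instance.id"]
        to: resource["service.instance.id"]
      - type: move
        from: attributes.mdc["deployment.environment"]
        to: resource["deployment.environment"]
      # Drop the fields that are no longer needed after the mappings above
      - type: remove
        field: attributes.mdc

With this in place, the log records exported by the Collector carry the resource attributes and trace context needed to correlate them with traces and metrics in Grafana Cloud.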
