Menu

This is documentation for the next version of Agent. For the latest stable release, go to the latest version.

Experimental

otelcol.processor.transform

EXPERIMENTAL: This is an experimental component. Experimental components are subject to frequent breaking changes, and may be removed with no equivalent replacement.

otelcol.processor.transform accepts telemetry data from other otelcol components and modifies it using the OpenTelemetry Transformation Language (OTTL). OTTL statements consist of OTTL functions, which act on paths. A path is a reference to a telemetry data such as:

  • Resource attributes.
  • Instrumentation scope name.
  • Span attributes.

In addition to the standard OTTL functions, there is also a set of metrics-only functions:

OTTL statements can also contain constructs such as:

  • Booleans:
    • not true
    • not IsMatch(name, "http_.*")
  • Boolean Expressions consisting of a where followed by one or more booleans:
    • set(attributes["whose_fault"], "ours") where attributes["http.status"] == 500
    • set(attributes["whose_fault"], "theirs") where attributes["http.status"] == 400 or attributes["http.status"] == 404
  • Math expressions:
    • 1 + 1
    • end_time_unix_nano - start_time_unix_nano
    • sum([1, 2, 3, 4]) + (10 / 1) - 1

Note

There are two ways of inputting strings in River configuration files:

  • Using quotation marks (normal River strings). Characters such as \ and " must be escaped by preceding them with a \ character.
  • Using backticks (raw River strings). No characters must be escaped. However, it’s not possible to have backticks inside the string.

For example, the OTTL statement set(description, "Sum") where type == "Sum" can be written as:

  • A normal River string: "set(description, \"Sum\") where type == \"Sum\"".
  • A raw River string: `set(description, "Sum") where type == "Sum"`.

Raw strings are generally more convenient for writing OTTL statements.

Note

otelcol.processor.transform is a wrapper over the upstream OpenTelemetry Collector transform processor. If necessary, bug reports or feature requests will be redirected to the upstream repository.

You can specify multiple otelcol.processor.transform components by giving them different labels.

Warning

otelcol.processor.transform allows you to modify all aspects of your telemetry. Some specific risks are given below, but this is not an exhaustive list. It is important to understand your data before using this processor.

  • Unsound Transformations: Transformations between metric data types are not defined in the metrics data model. To use these functions, you must understand the incoming data and know that it can be meaningfully converted to a new metric data type or can be used to create new metrics.
    • Although OTTL allows you to use the set function with metric.data_type, its implementation in the transform processor is a no-op. To modify a data type, you must use a specific function such as convert_gauge_to_sum.
  • Identity Conflict: Transformation of metrics can potentially affect a metric’s identity, leading to an Identity Crisis. Be especially cautious when transforming a metric name and when reducing or changing existing attributes. Adding new attributes is safe.
  • Orphaned Telemetry: The processor allows you to modify span_id, trace_id, and parent_span_id for traces and span_id, and trace_id logs. Modifying these fields could lead to orphaned spans or logs.

Usage

river
otelcol.processor.transform "LABEL" {
  output {
    metrics = [...]
    logs    = [...]
    traces  = [...]
  }
}

Arguments

otelcol.processor.transform supports the following arguments:

NameTypeDescriptionDefaultRequired
error_modestringHow to react to errors if they occur while processing a statement."propagate"no

The supported values for error_mode are:

  • ignore: Ignore errors returned by statements and continue on to the next statement. This is the recommended mode.
  • propagate: Return the error up the pipeline. This will result in the payload being dropped from the Agent.

Blocks

The following blocks are supported inside the definition of otelcol.processor.transform:

HierarchyBlockDescriptionRequired
trace_statementstrace_statementsStatements which transform traces.no
metric_statementsmetric_statementsStatements which transform metrics.no
log_statementslog_statementsStatements which transform logs.no
outputoutputConfigures where to send received telemetry data.yes

trace_statements block

The trace_statements block specifies statements which transform trace telemetry signals. Multiple trace_statements blocks can be specified.

NameTypeDescriptionDefaultRequired
contextstringOTTL Context to use when interpreting the associated statements.yes
statementslist(string)A list of OTTL statements.yes

The supported values for context are:

  • resource: Use when interacting only with OTLP resources (for example, resource attributes).
  • scope: Use when interacting only with OTLP instrumentation scope (for example, the name of the instrumentation scope).
  • span: Use when interacting only with OTLP spans.
  • spanevent: Use when interacting only with OTLP span events.

See OTTL Context for more information about how ot use contexts.

metric_statements block

The metric_statements block specifies statements which transform metric telemetry signals. Multiple metric_statements blocks can be specified.

NameTypeDescriptionDefaultRequired
contextstringOTTL Context to use when interpreting the associated statements.yes
statementslist(string)A list of OTTL statements.yes

The supported values for context are:

  • resource: Use when interacting only with OTLP resources (for example, resource attributes).
  • scope: Use when interacting only with OTLP instrumentation scope (for example, the name of the instrumentation scope).
  • metric: Use when interacting only with individual OTLP metrics.
  • datapoint: Use when interacting only with individual OTLP metric data points.

Refer to OTTL Context for more information about how to use contexts.

log_statements block

The log_statements block specifies statements which transform log telemetry signals. Multiple log_statements blocks can be specified.

NameTypeDescriptionDefaultRequired
contextstringOTTL Context to use when interpreting the associated statements.yes
statementslist(string)A list of OTTL statements.yes

The supported values for context are:

  • resource: Use when interacting only with OTLP resources (for example, resource attributes).
  • scope: Use when interacting only with OTLP instrumentation scope (for example, the name of the instrumentation scope).
  • log: Use when interacting only with OTLP logs.

See OTTL Context for more information about how ot use contexts.

OTTL Context

Each context allows the transformation of its type of telemetry. For example, statements associated with a resource context will be able to transform the resource’s attributes and dropped_attributes_count.

Each type of context defines its own paths and enums specific to that context. Refer to the OpenTelemetry documentation for a list of paths and enums for each context:

Contexts NEVER supply access to individual items “lower” in the protobuf definition.

  • This means statements associated to a resource WILL NOT be able to access the underlying instrumentation scopes.
  • This means statements associated to a scope WILL NOT be able to access the underlying telemetry slices (spans, metrics, or logs).
  • Similarly, statements associated to a metric WILL NOT be able to access individual datapoints, but can access the entire datapoints slice.
  • Similarly, statements associated to a span WILL NOT be able to access individual SpanEvents, but can access the entire SpanEvents slice.

For practical purposes, this means that a context cannot make decisions on its telemetry based on telemetry “lower” in the structure. For example, the following context statement is not possible because it attempts to use individual datapoint attributes in the condition of a statement associated to a metric:

river
metric_statements {
  context = "metric"
  statements = [
    "set(description, \"test passed\") where datapoints.attributes[\"test\"] == \"pass\"",
  ]
}

Context ALWAYS supply access to the items “higher” in the protobuf definition that are associated to the telemetry being transformed.

  • This means that statements associated to a datapoint have access to a datapoint’s metric, instrumentation scope, and resource.
  • This means that statements associated to a spanevent have access to a spanevent’s span, instrumentation scope, and resource.
  • This means that statements associated to a span/metric/log have access to the telemetry’s instrumentation scope, and resource.
  • This means that statements associated to a scope have access to the scope’s resource.

For example, the following context statement is possible because datapoint statements can access the datapoint’s metric.

river
metric_statements {
  context = "datapoint"
  statements = [
    "set(metric.description, \"test passed\") where attributes[\"test\"] == \"pass\"",
  ]
}

The protobuf definitions for OTLP signals are maintained on GitHub:

Whenever possible, associate your statements to the context which the statement intens to transform. The contexts are nested, and the higher-level contexts don’t have to iterate through any of the contexts at a lower level. For example, although you can modify resource attributes associated to a span using the span context, it is more efficient to use the resource context.

output block

The output block configures a set of components to forward resulting telemetry data to.

The following arguments are supported:

NameTypeDescriptionDefaultRequired
logslist(otelcol.Consumer)List of consumers to send logs to.[]no
metricslist(otelcol.Consumer)List of consumers to send metrics to.[]no
traceslist(otelcol.Consumer)List of consumers to send traces to.[]no

You must specify the output block, but all its arguments are optional. By default, telemetry data is dropped. Configure the metrics, logs, and traces arguments accordingly to send telemetry data to other components.

Exported fields

The following fields are exported and can be referenced by other components:

NameTypeDescription
inputotelcol.ConsumerA value that other components can use to send telemetry data to.

input accepts otelcol.Consumer data for any telemetry signal (metrics, logs, or traces).

Component health

otelcol.processor.transform is only reported as unhealthy if given an invalid configuration.

Debug information

otelcol.processor.transform does not expose any component-specific debug information.

Debug metrics

otelcol.processor.transform does not expose any component-specific debug metrics.

Examples

Perform a transformation if an attribute does not exist

This example sets the attribute test to pass if the attribute test does not exist.

river
otelcol.processor.transform "default" {
  error_mode = "ignore"

  trace_statements {
    context = "span"
    statements = [
      // Accessing a map with a key that does not exist will return nil. 
      `set(attributes["test"], "pass") where attributes["test"] == nil`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

Each statement is enclosed in backticks instead of quotation marks. This constitutes a raw string, and lets us avoid the need to escape each " with a \" inside a normal River string.

Rename a resource attribute

The are two ways to rename an attribute key. One way is to set a new attribute and delete the old one:

river
otelcol.processor.transform "default" {
  error_mode = "ignore"

  trace_statements {
    context = "resource"
    statements = [
      `set(attributes["namespace"], attributes["k8s.namespace.name"])`,
      `delete_key(attributes, "k8s.namespace.name")`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

Another way is to update the key using regular expressions:

river
otelcol.processor.transform "default" {
  error_mode = "ignore"

  trace_statements {
    context = "resource"
    statements = [
     `replace_all_patterns(attributes, "key", "k8s\\.namespace\\.name", "namespace")`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

Each statement is enclosed in backticks instead of quotation marks. This constitutes a raw string, and lets us avoid the need to escape each " with a \", and each \ with a \\ inside a normal River string.

Create an attribute from the contents of a log body

This example sets the attribute body to the value of the log body:

river
otelcol.processor.transform "default" {
  error_mode = "ignore"

  log_statements {
    context = "log"
    statements = [
      `set(attributes["body"], body)`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

Each statement is enclosed in backticks instead of quotation marks. This constitutes a raw string, and lets us avoid the need to escape each " with a \" inside a normal River string.

Combine two attributes

This example sets the attribute test to the value of attributes service.name and service.version combined.

river
otelcol.processor.transform "default" {
  error_mode = "ignore"

  trace_statements {
    context = "resource"
    statements = [
      // The Concat function combines any number of strings, separated by a delimiter.
      `set(attributes["test"], Concat([attributes["foo"], attributes["bar"]], " "))`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

Each statement is enclosed in backticks instead of quotation marks. This constitutes a raw string, and lets us avoid the need to escape each " with a \" inside a normal River string.

Parsing JSON logs

Given the following JSON body:

json
{
  "name": "log",
  "attr1": "example value 1",
  "attr2": "example value 2",
  "nested": {
    "attr3": "example value 3"
  }
}

You can add specific fields as attributes on the log:

river
otelcol.processor.transform "default" {
  error_mode = "ignore"

  log_statements {
    context = "log"

    statements = [
      // Parse body as JSON and merge the resulting map with the cache map, ignoring non-json bodies.
      // cache is a field exposed by OTTL that is a temporary storage place for complex operations.
      `merge_maps(cache, ParseJSON(body), "upsert") where IsMatch(body, "^\\{")`,
  
      // Set attributes using the values merged into cache.
      // If the attribute doesn't exist in cache then nothing happens.
      `set(attributes["attr1"], cache["attr1"])`,
      `set(attributes["attr2"], cache["attr2"])`,
  
      // To access nested maps you can chain index ([]) operations.
      // If nested or attr3 do no exist in cache then nothing happens.
      `set(attributes["nested.attr3"], cache["nested"]["attr3"])`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

Each statement is enclosed in backticks instead of quotation marks. This constitutes a raw string, and lets us avoid the need to escape each " with a \", and each \ with a \\ inside a normal River string.

Various transformations of attributes and status codes

The example takes advantage of context efficiency by grouping transformations with the context which it intends to transform.

river
otelcol.receiver.otlp "default" {
  http {}
  grpc {}

  output {
    metrics = [otelcol.processor.transform.default.input]
    logs    = [otelcol.processor.transform.default.input]
    traces  = [otelcol.processor.transform.default.input]
  }
}

otelcol.processor.transform "default" {
  error_mode = "ignore"

  trace_statements {
    context = "resource"
    statements = [
      `keep_keys(attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])`,
      `replace_pattern(attributes["process.command_line"], "password\\=[^\\s]*(\\s?)", "password=***")`,
      `limit(attributes, 100, [])`,
      `truncate_all(attributes, 4096)`,
    ]
  }

  trace_statements {
    context = "span"
    statements = [
      `set(status.code, 1) where attributes["http.path"] == "/health"`,
      `set(name, attributes["http.route"])`,
      `replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}")`,
      `limit(attributes, 100, [])`,
      `truncate_all(attributes, 4096)`,
    ]
  }

  metric_statements {
    context = "resource"
    statements = [
      `keep_keys(attributes, ["host.name"])`,
      `truncate_all(attributes, 4096)`,
    ]
  }

  metric_statements {
    context = "metric"
    statements = [
      `set(description, "Sum") where type == "Sum"`,
    ]
  }

  metric_statements {
    context = "datapoint"
    statements = [
      `limit(attributes, 100, ["host.name"])`,
      `truncate_all(attributes, 4096)`,
      `convert_sum_to_gauge() where metric.name == "system.processes.count"`,
      `convert_gauge_to_sum("cumulative", false) where metric.name == "prometheus_metric"`,
    ]
  }

  log_statements {
    context = "resource"
    statements = [
      `keep_keys(attributes, ["service.name", "service.namespace", "cloud.region"])`,
    ]
  }

  log_statements {
    context = "log"
    statements = [
      `set(severity_text, "FAIL") where body == "request failed"`,
      `replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")`,
      `replace_all_patterns(attributes, "value", "/account/\\d{4}", "/account/{accountId}")`,
      `set(body, attributes["http.route"])`,
    ]
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
    logs    = [otelcol.exporter.otlp.default.input]
    traces  = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  client {
    endpoint = env("OTLP_ENDPOINT")
  }
}

Each statement is enclosed in backticks instead of quotation marks. This constitutes a raw string, and lets us avoid the need to escape each " with a \", and each \ with a \\ inside a normal River string.

Compatible components

otelcol.processor.transform can accept arguments from the following components:

otelcol.processor.transform has exports that can be consumed by the following components:

Note

Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.