otelcol.connector.spanmetrics

otelcol.connector.spanmetrics accepts span data from other otelcol components and aggregates Request, Error and Duration (R.E.D) OpenTelemetry metrics from the spans:

  • Request counts are computed as the number of spans seen per unique set of dimensions, including Errors. Multiple metrics can be aggregated if, for instance, a user wishes to view call counts just on service.name and span.name.

    Requests are tracked using a calls metric with a status.code datapoint attribute set to Ok:

    calls { service.name="shipping", span.name="get_shipping/{shippingId}", span.kind="SERVER", status.code="Ok" }
  • Error counts are computed from the number of spans with an Error status code.

    Errors are tracked using a calls metric with a status.code datapoint attribute set to Error:

    calls { service.name="shipping", span.name="get_shipping/{shippingId}", span.kind="SERVER", status.code="Error" }
  • Duration is computed from the difference between the span start and end times and inserted into the relevant duration histogram time bucket for each unique set of dimensions.

    Span durations are tracked using a duration histogram metric:

    duration { service.name="shipping", span.name="get_shipping/{shippingId}", span.kind="SERVER", status.code="Ok" }

Note

otelcol.connector.spanmetrics is a wrapper over the upstream OpenTelemetry Collector spanmetrics connector. Bug reports or feature requests will be redirected to the upstream repository, if necessary.

Multiple otelcol.connector.spanmetrics components can be specified by giving them different labels.

Usage

alloy
otelcol.connector.spanmetrics "LABEL" {
  histogram {
    ...
  }

  output {
    metrics = [...]
  }
}

Arguments

otelcol.connector.spanmetrics supports the following arguments:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| aggregation_temporality | string | Configures whether to reset the metrics after flushing. | "CUMULATIVE" | no |
| dimensions_cache_size | number | How many dimensions to cache. | 1000 | no |
| exclude_dimensions | list(string) | List of dimensions to be excluded from the default set of dimensions. | [] | no |
| metrics_flush_interval | duration | How often to flush generated metrics. | "60s" | no |
| metrics_expiration | duration | Time period after which metrics are considered stale and are removed from the cache. | "0s" | no |
| metric_timestamp_cache_size | number | Controls the size of a cache used to keep track of the last time a metric was flushed. | 1000 | no |
| namespace | string | Metric namespace. | "traces.span.metrics" | no |
| resource_metrics_cache_size | number | The size of the cache holding metrics for a service. | 1000 | no |
| resource_metrics_key_attributes | list(string) | Limits the resource attributes used to create the metrics. | [] | no |

Adjusting dimensions_cache_size can improve the Alloy process’ memory usage.

The supported values for aggregation_temporality are:

  • "CUMULATIVE": The metrics will not be reset after they are flushed.
  • "DELTA": The metrics will be reset after they are flushed.

If namespace is set, each generated metric name is prefixed with the namespace followed by a period (namespace.).

Setting metrics_expiration to "0s" means that the metrics will never expire.

resource_metrics_cache_size is mostly relevant for cumulative temporality. It helps avoid issues with increasing memory and with incorrect metric timestamp resets.

metric_timestamp_cache_size is only relevant for delta temporality span metrics. It controls the size of a cache used to keep track of the last time a metric was flushed. When a metric is evicted from the cache, its next data point will indicate a “reset” in the series. Downstream components converting from delta to cumulative may handle these resets by setting cumulative counters back to 0.
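
As a rough sketch of a delta configuration, the snippet below sets aggregation_temporality to "DELTA" and raises metric_timestamp_cache_size above its default. The component labels, the value 5000, and the OTLP_SERVER_ENDPOINT environment variable are illustrative assumptions, not recommendations.

```alloy
otelcol.connector.spanmetrics "delta" {
  // Reset the metrics after each flush.
  aggregation_temporality = "DELTA"

  // Hypothetical value: track the last flush time of up to 5000 metrics
  // before entries start being evicted from the cache.
  metric_timestamp_cache_size = 5000

  histogram {
    explicit {}
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  client {
    // Assumed environment variable holding the OTLP endpoint.
    endpoint = sys.env("OTLP_SERVER_ENDPOINT")
  }
}
```

A larger cache should make evictions, and therefore series resets, less frequent, at the cost of some additional memory.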

resource_metrics_key_attributes can be used to avoid situations where resource attributes may change across service restarts, causing metric counters to break (and duplicate). A resource does not need to have all of the attributes. The list must include enough attributes to properly identify unique resources or risk aggregating data from more than one service and span. For example, ["service.name", "telemetry.sdk.language", "telemetry.sdk.name"].
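
The attribute list from the example above could be applied as in the following sketch. The component labels and the OTLP_SERVER_ENDPOINT environment variable are assumptions made for illustration, and the right set of key attributes depends on which resource attributes stay stable in your environment.

```alloy
otelcol.connector.spanmetrics "stable_resources" {
  // Only these resource attributes are used to identify a resource,
  // so attributes that change across restarts do not create new series.
  resource_metrics_key_attributes = [
    "service.name",
    "telemetry.sdk.language",
    "telemetry.sdk.name",
  ]

  histogram {
    explicit {}
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  client {
    // Assumed environment variable holding the OTLP endpoint.
    endpoint = sys.env("OTLP_SERVER_ENDPOINT")
  }
}
```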

Blocks

The following blocks are supported inside the definition of otelcol.connector.spanmetrics:

| Hierarchy | Block | Description | Required |
| --- | --- | --- | --- |
| dimension | dimension | Dimensions to be added in addition to the default ones. | no |
| events | events | Configures the events metric. | no |
| events > dimension | dimension | Span event attributes to add as dimensions to the events metric, on top of the default ones and the ones configured in the top-level dimension block. | no |
| exemplars | exemplars | Configures how to attach exemplars to histograms. | no |
| histogram | histogram | Configures the histogram derived from spans durations. | yes |
| histogram > explicit | explicit | Configuration for a histogram with explicit buckets. | no |
| histogram > exponential | exponential | Configuration for a histogram with exponential buckets. | no |
| output | output | Configures where to send telemetry data. | yes |
| debug_metrics | debug_metrics | Configures the metrics that this component generates to monitor its state. | no |

The histogram block must contain either an exponential or an explicit block.

dimension block

The dimension block configures dimensions to be added in addition to the default ones.

The default dimensions are:

  • service.name
  • span.name
  • span.kind
  • status.code

The default dimensions are always added. If no additional dimensions are specified, only the default ones will be added.

The following attributes are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| default | string | Value to use if the attribute is missing. | null | no |
| name | string | Span attribute or resource attribute to look up. | | yes |

otelcol.connector.spanmetrics will look for the name attribute in the span’s collection of attributes. If it is not found, the resource attributes will be checked.

If the attribute is missing in both the span and resource attributes:

  • If default is not set, the dimension will be omitted.
  • If default is set, the dimension will be added and its value will be set to the value of default.

events block

The events block configures the events metric, which tracks span events.

The following attributes are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| enabled | bool | Enables all events metric. | false | no |

At least one dimension block is required if enabled is set to true.
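
A minimal sketch of an enabled events block follows. It assumes that the incoming span events carry an exception.type attribute, and the component labels and the OTLP_SERVER_ENDPOINT environment variable are likewise illustrative.

```alloy
otelcol.connector.spanmetrics "with_events" {
  events {
    enabled = true

    // At least one dimension is required when events are enabled.
    // "exception.type" is an assumed span event attribute.
    dimension {
      name = "exception.type"
    }
  }

  histogram {
    explicit {}
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  client {
    // Assumed environment variable holding the OTLP endpoint.
    endpoint = sys.env("OTLP_SERVER_ENDPOINT")
  }
}
```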

histogram block

The histogram block configures the histogram derived from spans’ durations.

The following attributes are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| disable | bool | Disable all histogram metrics. | false | no |
| unit | string | Configures the histogram units. | "ms" | no |

The supported values for unit are:

  • "ms": milliseconds
  • "s": seconds

exponential block

The exponential block configures a histogram with exponential buckets.

The following attributes are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| max_size | number | Maximum number of buckets per positive or negative number range. | 160 | no |
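
For comparison with the explicit bucket configuration, a histogram with exponential buckets might be configured as sketched below. The component labels and the OTLP_SERVER_ENDPOINT environment variable are assumptions, and max_size is set to its documented default only to show where the attribute goes.

```alloy
otelcol.connector.spanmetrics "exponential" {
  histogram {
    unit = "ms"

    // Use exponential buckets instead of an explicit bucket list.
    exponential {
      max_size = 160
    }
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  client {
    // Assumed environment variable holding the OTLP endpoint.
    endpoint = sys.env("OTLP_SERVER_ENDPOINT")
  }
}
```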

explicit block

The explicit block configures a histogram with explicit buckets.

The following attributes are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| buckets | list(duration) | List of histogram buckets. | ["2ms", "4ms", "6ms", "8ms", "10ms", "50ms", "100ms", "200ms", "400ms", "800ms", "1s", "1400ms", "2s", "5s", "10s", "15s"] | no |

exemplars block

The exemplars block configures how to attach exemplars to histograms.

The following attributes are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| enabled | bool | Configures whether to add exemplars to histograms. | false | no |
| max_per_data_point | number | Limits the number of exemplars that can be added to a unique dimension set. | null | no |

max_per_data_point can help with reducing memory consumption.
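
The sketch below shows one way the exemplars block could be combined with a histogram. The limit of 5 exemplars, the component labels, and the OTLP_SERVER_ENDPOINT environment variable are hypothetical choices for illustration.

```alloy
otelcol.connector.spanmetrics "with_exemplars" {
  exemplars {
    enabled = true

    // Hypothetical limit: keep at most 5 exemplars per unique dimension set.
    max_per_data_point = 5
  }

  histogram {
    explicit {}
  }

  output {
    metrics = [otelcol.exporter.otlp.default.input]
  }
}

otelcol.exporter.otlp "default" {
  client {
    // Assumed environment variable holding the OTLP endpoint.
    endpoint = sys.env("OTLP_SERVER_ENDPOINT")
  }
}
```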

output block

The output block configures a set of components to forward resulting telemetry data to.

The following arguments are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| metrics | list(otelcol.Consumer) | List of consumers to send metrics to. | [] | no |

You must specify the output block, but all its arguments are optional. By default, telemetry data is dropped. Configure the metrics argument accordingly to send telemetry data to other components.

debug_metrics block

The debug_metrics block configures the metrics that this component generates to monitor its state.

The following arguments are supported:

| Name | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| disable_high_cardinality_metrics | boolean | Whether to disable certain high cardinality metrics. | true | no |
| level | string | Controls the level of detail for metrics emitted by the wrapped collector. | "detailed" | no |

disable_high_cardinality_metrics is the Grafana Alloy equivalent to the telemetry.disableHighCardinalityMetrics feature gate in the OpenTelemetry Collector. It removes attributes that could cause high cardinality metrics. For example, attributes with IP addresses and port numbers in metrics about HTTP and gRPC connections are removed.

Note

If configured, disable_high_cardinality_metrics only applies to otelcol.exporter.* and otelcol.receiver.* components.

level is the Alloy equivalent to the telemetry.metrics.level feature gate in the OpenTelemetry Collector. Possible values are "none", "basic", "normal" and "detailed".
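
As a small illustration, the following sketch lowers the detail of the component's own metrics to "basic". The component label is an assumption, and the empty output block means the generated span metrics are dropped, which is only acceptable in a sketch.

```alloy
otelcol.connector.spanmetrics "quiet" {
  debug_metrics {
    // One of "none", "basic", "normal", or "detailed".
    level = "basic"
  }

  histogram {
    explicit {}
  }

  // No consumers are configured here, so span metrics are dropped.
  output {}
}
```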

Exported fields

The following fields are exported and can be referenced by other components:

| Name | Type | Description |
| --- | --- | --- |
| input | otelcol.Consumer | A value that other components can use to send telemetry data to. |

input accepts otelcol.Consumer traces telemetry data. It does not accept metrics or logs.

Handling of resource attributes

otelcol.connector.spanmetrics is an OTLP-native component. As such, it aims to preserve the resource attributes of spans.

  1. For example, let’s assume that there are two incoming resource spans with the same service.name and k8s.pod.name resource attributes.

  2. otelcol.connector.spanmetrics will preserve the incoming service.name and k8s.pod.name resource attributes by attaching them to the output metrics resource. Only one metric resource will be created, because both span resources have identical resource attributes.

  3. Now assume that otelcol.connector.spanmetrics receives two incoming resource spans, each with a different value for the k8s.pod.name resource attribute.

  4. To preserve the values of all resource attributes, otelcol.connector.spanmetrics will produce two resource metrics. Each resource metric will have a different value for the k8s.pod.name resource attribute. This way none of the resource attributes will be lost during the generation of metrics.

Component health

otelcol.connector.spanmetrics is only reported as unhealthy if given an invalid configuration.

Debug information

otelcol.connector.spanmetrics does not expose any component-specific debug information.

Examples

Explicit histogram and extra dimensions

In the example below, http.status_code and http.method are additional dimensions on top of:

  • service.name
  • span.name
  • span.kind
  • status.code
alloy
otelcol.receiver.otlp "default" {
  http {}
  grpc {}

  output {
    traces  = [otelcol.connector.spanmetrics.default.input]
  }
}

otelcol.connector.spanmetrics "default" {
  // Since a default is not provided, the http.status_code dimension will be omitted
  // if the span does not contain http.status_code.
  dimension {
    name = "http.status_code"
  }

  // If the span is missing http.method, the connector will insert
  // the http.method dimension with value 'GET'.
  dimension {
    name = "http.method"
    default = "GET"
  }

  dimensions_cache_size = 333

  aggregation_temporality = "DELTA"

  histogram {
    unit = "s"
    explicit {
      buckets = ["333ms", "777s", "999h"]
    }
  }

  // The period on which all metrics (whose dimension keys remain in cache) will be emitted.
  metrics_flush_interval = "33s"

  namespace = "test.namespace"

  output {
    metrics = [otelcol.exporter.otlp.production.input]
  }
}

otelcol.exporter.otlp "production" {
  client {
    endpoint = sys.env("OTLP_SERVER_ENDPOINT")
  }
}

Sending metrics via a Prometheus remote write

The generated metrics can be sent to a Prometheus-compatible database such as Grafana Mimir. However, extra steps are required in order to make sure all metric samples are received. This is because otelcol.connector.spanmetrics aims to preserve resource attributes in the metrics which it outputs.

Unfortunately, the Prometheus data model has no notion of resource attributes. This means that if otelcol.connector.spanmetrics outputs metrics with identical metric attributes, but different resource attributes, otelcol.exporter.prometheus will convert the metrics into the same metric series. This problem can be solved by doing either of the following:

  • Recommended approach: Prior to otelcol.connector.spanmetrics, remove all resource attributes from the incoming spans which are not needed by otelcol.connector.spanmetrics.

  • Or, after otelcol.connector.spanmetrics, copy each of the resource attributes as a metric datapoint attribute. This has the advantage that the resource attributes will be visible as metric labels. However, the cardinality of the metrics may be much higher, which could increase the cost of storing and querying them. The example below uses the merge_maps OTTL function.

If the resource attributes are not treated in either of the ways described above, an error such as this one could be logged by prometheus.remote_write: the sample has been rejected because another sample with the same timestamp, but a different value, has already been ingested (err-mimir-sample-duplicate-timestamp).
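
A sketch of the second approach follows, using otelcol.processor.transform after the connector to copy resource attributes onto each metric datapoint with merge_maps. The component labels and the remote write URL are placeholders, and the statement is written as an escaped double-quoted string on the assumption that this form is accepted by your Alloy version.

```alloy
otelcol.receiver.otlp "default" {
  http {}
  grpc {}

  output {
    traces = [otelcol.connector.spanmetrics.default.input]
  }
}

otelcol.connector.spanmetrics "default" {
  histogram {
    explicit {}
  }

  output {
    metrics = [otelcol.processor.transform.default.input]
  }
}

// Copy each resource attribute to the metric datapoint attributes so that
// series from different resources stay distinct after conversion.
otelcol.processor.transform "default" {
  error_mode = "ignore"

  metric_statements {
    context    = "datapoint"
    statements = [
      // "insert" only adds a datapoint attribute if the key does not already exist.
      "merge_maps(attributes, resource.attributes, \"insert\")",
    ]
  }

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}

otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.mimir.receiver]
}

prometheus.remote_write "mimir" {
  endpoint {
    // Placeholder URL; replace with your own remote write endpoint.
    url = "http://mimir:9009/api/v1/push"
  }
}
```

For the recommended approach, a similar otelcol.processor.transform could instead be placed before the connector, using trace_statements with the "resource" context and the keep_keys OTTL function to retain only the resource attributes that the connector needs.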

Note

For a Prometheus target_info metric to be generated, the resource scope attributes of the incoming spans must contain the service.name and service.instance.id attributes.

The target_info metric will be generated for each resource scope, while OpenTelemetry metric names and attributes will be normalized to be compliant with Prometheus naming rules.

Compatible components

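otelcol.connector.spanmetrics can accept arguments from the following components:

  • Components that export OpenTelemetry otelcol.Consumer

otelcol.connector.spanmetrics has exports that can be consumed by the following components:

  • Components that consume OpenTelemetry otelcol.Consumer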

Note

Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.