otelcol.connector.servicegraph
otelcol.connector.servicegraph accepts span data from other otelcol components and outputs metrics representing the relationship between various services in a system.
A metric represents an edge in the service graph.
Those metrics can then be used by a data visualization application (e.g. Grafana) to draw the service graph.
NOTE:
otelcol.connector.servicegraph is a wrapper over the upstream OpenTelemetry Collector servicegraph connector. Bug reports or feature requests will be redirected to the upstream repository, if necessary.
Multiple otelcol.connector.servicegraph components can be specified by giving them
different labels.
This component is based on Grafana Tempo’s service graph processor.
Service graphs are useful for a number of use-cases:
- Infer the topology of a distributed system. As distributed systems grow, they become more complex. Service graphs can help you understand the structure of the system.
- Provide a high level overview of the health of your system. Service graphs show error rates, latencies, and other relevant data.
- Provide a historic view of a system’s topology. Distributed systems change very frequently, and service graphs offer a way of seeing how these systems have evolved over time.
Since otelcol.connector.servicegraph has to process both sides of an edge,
it needs to process all spans of a trace to function properly.
If spans of a trace are spread out over multiple Alloy instances, spans cannot be paired reliably.
A solution to this problem is using otelcol.exporter.loadbalancing
in front of Alloy instances running otelcol.connector.servicegraph.
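As a minimal sketch of that setup, a first tier of Alloy instances could route spans by trace ID so that every span of a trace reaches the same downstream instance. The hostnames below are hypothetical placeholders, and the empty client block assumes default OTLP client settings are acceptable in your environment.

```alloy
// First-tier Alloy: spread traces across the instances that run
// otelcol.connector.servicegraph, keeping each trace on one instance.
otelcol.exporter.loadbalancing "default" {
  // Route by trace ID so both sides of an edge land together.
  routing_key = "traceID"

  resolver {
    static {
      // Hypothetical addresses of the second-tier Alloy instances.
      hostnames = ["alloy-servicegraph-1:4317", "alloy-servicegraph-2:4317"]
    }
  }

  protocol {
    otlp {
      client {}
    }
  }
}
```

The second-tier instances would then receive these spans with an OTLP receiver and forward them to otelcol.connector.servicegraph.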
Usage
otelcol.connector.servicegraph "LABEL" {
  output {
    metrics = [...]
  }
}
Arguments
otelcol.connector.servicegraph supports the following arguments:
| Name | Type | Description | Default | Required | 
|---|---|---|---|---|
| latency_histogram_buckets | list(duration) | Buckets for latency histogram metrics. | ["2ms", "4ms", "6ms", "8ms", "10ms", "50ms", "100ms", "200ms", "400ms", "800ms", "1s", "1400ms", "2s", "5s", "10s", "15s"] | no | 
| dimensions | list(string) | A list of dimensions to add with the default dimensions. | [] | no | 
| cache_loop | duration | Configures how often to delete series which have not been updated. | "1m" | no | 
| store_expiration_loop | duration | The time to expire old entries from the store periodically. | "2s" | no | 
| metrics_flush_interval | duration | The interval at which metrics are flushed to downstream components. | "0s" | no | 
| database_name_attribute | string | The attribute name used to identify the database name from span attributes. | "db.name" | no | 
Service graphs work by inspecting traces and looking for spans with a parent-child relationship that represents a request.
otelcol.connector.servicegraph uses OpenTelemetry semantic conventions to detect a myriad of requests.
The following requests are currently supported:
- A direct request between two services, where the outgoing and the incoming span
must have a Span Kind value of client and server respectively.
- A request across a messaging system, where the outgoing and the incoming span
must have a Span Kind value of producer and consumer respectively.
- A database request, where spans have a Span Kind with a value of client, as well as an attribute with a key of db.name.
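If your instrumentation records the database name under a different attribute key, the database_name_attribute argument can point the connector at it. The sketch below assumes, purely for illustration, that spans carry the name under db.namespace and that an otelcol.exporter.prometheus component labeled "default" is defined elsewhere.

```alloy
otelcol.connector.servicegraph "databases" {
  // Assumed attribute key for illustration; the default is "db.name".
  database_name_attribute = "db.namespace"

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}
```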
Every span which can be paired up to form a request is kept in an in-memory store:
- If the TTL of the span expires before it can be paired, it is deleted from the store. TTL is configured in the store block.
- If the span is paired prior to its expiration, a metric is recorded and the span is deleted from the store.
The following metrics are emitted by the connector:
| Metric | Type | Labels | Description | 
|---|---|---|---|
| traces_service_graph_request_total | Counter | client, server, connection_type | Total count of requests between two nodes | 
| traces_service_graph_request_failed_total | Counter | client, server, connection_type | Total count of failed requests between two nodes | 
| traces_service_graph_request_server_seconds | Histogram | client, server, connection_type | Time for a request between two nodes as seen from the server | 
| traces_service_graph_request_client_seconds | Histogram | client, server, connection_type | Time for a request between two nodes as seen from the client | 
| traces_service_graph_unpaired_spans_total | Counter | client, server, connection_type | Total count of unpaired spans | 
| traces_service_graph_dropped_spans_total | Counter | client, server, connection_type | Total count of dropped spans | 
Duration is measured both from the client and the server sides.
The latency_histogram_buckets argument controls the buckets for
traces_service_graph_request_server_seconds and traces_service_graph_request_client_seconds.
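For instance, a deployment that only cares about coarse latency bands might replace the default buckets with a shorter list. The bucket values here are illustrative, not recommendations, and the exporter reference assumes a component labeled "default" exists elsewhere in the configuration.

```alloy
otelcol.connector.servicegraph "coarse_latency" {
  // Fewer buckets mean fewer histogram series per edge.
  latency_histogram_buckets = ["10ms", "100ms", "500ms", "1s", "5s"]

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}
```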
Each emitted metric series has a client and a server label corresponding to the service making the request and the service receiving it.
The value of the label is derived from the service.name resource attribute of the two spans.
The connection_type label may not be set. If it is set, its value will be either messaging_system or database.
Additional labels can be included using the dimensions configuration option:
- Those labels will have a prefix to mark where they originate (client or server span kinds).
The client_ prefix relates to the dimensions coming from spans with a Span Kind of client. The server_ prefix relates to the dimensions coming from spans with a Span Kind of server.
- The resource attributes are searched first. If the attribute is not found there, the span attributes are searched.
When metrics_flush_interval is set to 0s, metrics will be flushed on every received batch of traces.
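To flush on a fixed schedule instead of on every batch, set a non-zero interval. The following is a sketch with an arbitrarily chosen 15-second interval; the exporter label is assumed to be defined elsewhere.

```alloy
otelcol.connector.servicegraph "periodic_flush" {
  // Illustrative value: flush accumulated metrics every 15 seconds
  // rather than on each received batch of traces (the "0s" default).
  metrics_flush_interval = "15s"

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}
```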
Blocks
The following blocks are supported inside the definition of
otelcol.connector.servicegraph:
| Hierarchy | Block | Description | Required | 
|---|---|---|---|
| store | store | Configures the in-memory store for spans. | no | 
| output | output | Configures where to send telemetry data. | yes | 
| debug_metrics | debug_metrics | Configures the metrics that this component generates to monitor its state. | no | 
store block
The store block configures the in-memory store for spans.
| Name | Type | Description | Default | Required | 
|---|---|---|---|---|
| max_items | number | Maximum number of items to keep in the store. | 1000 | no | 
| ttl | duration | The time to live for spans in the store. | "2s" | no | 
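As an example, the store might be enlarged and its TTL lengthened when the two sides of a request tend to arrive far apart. The values below are illustrative assumptions only; a larger store and longer TTL keep more unpaired spans in memory.

```alloy
otelcol.connector.servicegraph "larger_store" {
  store {
    // Keep up to 5000 unpaired spans (default: 1000).
    max_items = 5000
    // Wait up to 10 seconds for the matching span (default: "2s").
    ttl       = "10s"
  }

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}
```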
output block
The output block configures a set of components to forward resulting telemetry data to.
The following arguments are supported:
| Name | Type | Description | Default | Required | 
|---|---|---|---|---|
| metrics | list(otelcol.Consumer) | List of consumers to send metrics to. | [] | no | 
You must specify the output block, but all its arguments are optional.
By default, telemetry data is dropped.
Configure the metrics argument accordingly to send telemetry data to other components.
debug_metrics block
The debug_metrics block configures the metrics that this component generates to monitor its state.
The following arguments are supported:
| Name | Type | Description | Default | Required | 
|---|---|---|---|---|
| disable_high_cardinality_metrics | boolean | Whether to disable certain high cardinality metrics. | true | no | 
| level | string | Controls the level of detail for metrics emitted by the wrapped collector. | "detailed" | no | 
disable_high_cardinality_metrics is the Grafana Alloy equivalent to the telemetry.disableHighCardinalityMetrics feature gate in the OpenTelemetry Collector.
It removes attributes that could cause high cardinality metrics.
For example, attributes with IP addresses and port numbers in metrics about HTTP and gRPC connections are removed.
Note
If configured, disable_high_cardinality_metrics only applies to otelcol.exporter.* and otelcol.receiver.* components.
level is the Alloy equivalent to the telemetry.metrics.level feature gate in the OpenTelemetry Collector.
Possible values are "none", "basic", "normal" and "detailed".
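A short sketch of the block in context, lowering the detail level from the default. The choice of "basic" is only an example, and the exporter label is assumed to be defined elsewhere.

```alloy
otelcol.connector.servicegraph "default" {
  debug_metrics {
    // One of "none", "basic", "normal", or "detailed".
    level = "basic"
  }

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}
```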
Exported fields
The following fields are exported and can be referenced by other components:
| Name | Type | Description | 
|---|---|---|
| input | otelcol.Consumer | A value that other components can use to send telemetry data to. | 
input accepts otelcol.Consumer traces telemetry data. It does not accept metrics or logs.
Component health
otelcol.connector.servicegraph is only reported as unhealthy if given an invalid
configuration.
Debug information
otelcol.connector.servicegraph does not expose any component-specific debug
information.
Example
The example below accepts traces, creates service graph metrics from them, and writes the metrics to Mimir. The traces are written to Tempo.
otelcol.connector.servicegraph also adds a label to each metric with the value of the “http.method” span/resource attribute.
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4320"
  }
  output {
    traces  = [otelcol.connector.servicegraph.default.input, otelcol.exporter.otlp.grafana_cloud_traces.input]
  }
}
otelcol.connector.servicegraph "default" {
  dimensions = ["http.method"]
  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}
otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.mimir.receiver]
}
prometheus.remote_write "mimir" {
  endpoint {
    url = "https://prometheus-xxx.grafana.net/api/prom/push"
    basic_auth {
      username = sys.env("PROMETHEUS_USERNAME")
      password = sys.env("GRAFANA_CLOUD_API_KEY")
    }
  }
}
otelcol.exporter.otlp "grafana_cloud_traces" {
  client {
    endpoint = "https://tempo-xxx.grafana.net/tempo"
    auth     = otelcol.auth.basic.grafana_cloud_traces.handler
  }
}
otelcol.auth.basic "grafana_cloud_traces" {
  username = sys.env("TEMPO_USERNAME")
  password = sys.env("GRAFANA_CLOUD_API_KEY")
}
Some of the metrics in Mimir may look like this:
traces_service_graph_request_total{client="shop-backend",failed="false",server="article-service",client_http_method="DELETE",server_http_method="DELETE"}
traces_service_graph_request_failed_total{client="shop-backend",client_http_method="POST",failed="false",server="auth-service",server_http_method="POST"}
Compatible components
otelcol.connector.servicegraph can accept arguments from the following components:
- Components that export OpenTelemetry otelcol.Consumer
otelcol.connector.servicegraph has exports that can be consumed by the following components:
- Components that consume OpenTelemetry otelcol.Consumer
Note
Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.







