---
title: "otelcol.exporter.loadbalancing | Grafana Alloy documentation"
description: "Learn about otelcol.exporter.loadbalancing"
---

# `otelcol.exporter.loadbalancing`

`otelcol.exporter.loadbalancing` accepts logs and traces from other `otelcol` components and writes them over the network using the OpenTelemetry Protocol (OTLP).

> Note
> 
> `otelcol.exporter.loadbalancing` is a wrapper over the upstream OpenTelemetry Collector [`loadbalancing`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.147.0/exporter/loadbalancingexporter) exporter. Bug reports or feature requests will be redirected to the upstream repository, if necessary.

You can specify multiple `otelcol.exporter.loadbalancing` components by giving them different labels.

The decision about which backend to use depends on the trace ID or the service name. The backend load doesn't influence the choice. Even though this load balancer won't do round-robin balancing of the batches, the load distribution should be very similar among backends, with a standard deviation under 5% at the current configuration.

`otelcol.exporter.loadbalancing` is especially useful for backends configured with tail-based samplers which choose a backend based on the view of the full trace.

When a list of backends is updated, some of the signals will be rerouted to different backends. Around R/N of the “routes” will be rerouted differently, where:

- A “route” is either a trace ID or a service name mapped to a certain backend.
- “R” is the total number of routes.
- “N” is the total number of backends.

This should be stable enough for most cases, and the larger the number of backends, the less disruption it should cause.

## Usage

```alloy
otelcol.exporter.loadbalancing "<LABEL>" {
  resolver {
    ...
  }
  protocol {
    otlp {
      client {}
    }
  }
}
```

## Arguments

You can use the following arguments with `otelcol.exporter.loadbalancing`:

| Name          | Type       | Description                                                                        | Default     | Required |
|---------------|------------|------------------------------------------------------------------------------------|-------------|----------|
| `routing_key` | `string`   | Routing strategy for load balancing.                                               | `"traceID"` | no       |
| `timeout`     | `duration` | Time to wait before marking a request to the `otlp > protocol` exporter as failed. | `"0s"`      | no       |

The `routing_key` attribute determines how to route signals across endpoints. Its value can be one of the following:

- `"service"`: spans, logs, and metrics with the same `service.name` will be exported to the same backend.

This is useful when using processors like the span metrics, so all spans for each service are sent to consistent Alloy instances for metric collection. Otherwise, metrics for the same services would be sent to different instances, making aggregations inaccurate.

- `"traceID"`: Spans and logs belonging to the same `traceID` will be exported to the same backend.
- `"resource"`: Metrics belonging to the same resource will be exported to the same backend.
- `"metric"`: Metrics with the same name will be exported to the same backend.
- `"streamID"`: Metrics with the same `streamID` will be exported to the same backend.

The load balancer configures the exporter for the signal types supported by the `routing_key`.
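
For example, a minimal sketch that routes by service name (the hostnames are placeholders):

```alloy
otelcol.exporter.loadbalancing "by_service" {
    // Spans, logs, and metrics that share a service.name go to the same backend.
    routing_key = "service"

    resolver {
        static {
            hostnames = ["backend-1:4317", "backend-2:4317"]
        }
    }

    protocol {
        otlp {
            client {}
        }
    }
}
```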

The `timeout` argument, like the top-level `queue` and `retry` [blocks](#blocks) for `otelcol.exporter.loadbalancing` itself, helps re-route data to a new set of healthy backends. This is especially useful in highly elastic environments like Kubernetes, where the list of resolved endpoints changes frequently due to deployments and scaling events.

> **EXPERIMENTAL**: Metrics support in `otelcol.exporter.loadbalancing` is an [experimental](/docs/release-life-cycle/) feature. Experimental features are subject to frequent breaking changes, and may be removed with no equivalent replacement. To enable and use an experimental feature, you must set the `stability.level` [flag](/docs/alloy/latest/reference/cli/run/) to `experimental`.

## Blocks

You can use the following blocks with `otelcol.exporter.loadbalancing`:

- [`resolver`](#resolver) (required), with its nested [`aws_cloud_map`](#aws_cloud_map), [`dns`](#dns), [`kubernetes`](#kubernetes), and [`static`](#static) blocks.
- [`protocol`](#protocol), with its nested [`otlp`](#otlp), [`client`](#client), [`keepalive`](#keepalive), [`tls`](#tls), and [`tpm`](#tpm) blocks.
- [`queue`](#queue), [`batch`](#batch), and [`retry`](#retry).
- [`debug_metrics`](#debug_metrics).

Only the `resolver` block is required. All other blocks are optional.

There are two types of [`queue`](#queue) and [`retry`](#retry) blocks:

- The queue and retry blocks under `protocol > otlp`. This is useful for temporary problems with a specific backend, like transient network issues.
- The top-level queue and retry blocks for `otelcol.exporter.loadbalancing`. Those configuration options provide capability to re-route data into a new set of healthy backends. This is useful for highly elastic environments like Kubernetes, where the list of resolved endpoints changes frequently due to deployments and scaling events.
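
The following sketch shows both levels side by side, based on the block layout described above. The hostnames and durations are illustrative:

```alloy
otelcol.exporter.loadbalancing "default" {
    resolver {
        static {
            hostnames = ["backend-1:4317", "backend-2:4317"]
        }
    }

    protocol {
        otlp {
            // Per-backend retry, for transient problems with a specific endpoint.
            retry {
                max_elapsed_time = "1m"
            }
            client {}
        }
    }

    // Top-level queue and retry for the load balancer itself, so data can be
    // re-routed to a new set of healthy backends when the endpoint list changes.
    queue {
        queue_size = 1000
    }
    retry {
        max_elapsed_time = "5m"
    }
}
```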

### `resolver`

The `resolver` block is required.

The `resolver` block configures how to retrieve the endpoint to which this exporter sends data.

Inside the `resolver` block, specify either the [`dns`](#dns) block or the [`static`](#static) block. If both `dns` and `static` are specified, `dns` takes precedence.

### `aws_cloud_map`

The `aws_cloud_map` block configures a resolver that discovers endpoints through AWS Cloud Map. This is useful when running ECS rather than EKS in an AWS infrastructure.

The following arguments are supported:

| Name            | Type       | Description                                                                        | Default     | Required |
|-----------------|------------|------------------------------------------------------------------------------------|-------------|----------|
| `namespace`     | `string`   | The CloudMap namespace where the service is registered.                            |             | yes      |
| `service_name`  | `string`   | The name of the service which was specified when registering the instance.         |             | yes      |
| `health_status` | `string`   | Health status of the instances returned by the resolver.                          | `"HEALTHY"` | no       |
| `interval`      | `duration` | Resolver interval.                                                                 | `"30s"`     | no       |
| `port`          | `number`   | Port to be used for exporting the traces to the addresses resolved from `service`. | `null`      | no       |
| `timeout`       | `duration` | Resolver timeout.                                                                  | `"5s"`      | no       |

`health_status` can be set to one of the following:

- `HEALTHY`: Only return instances that are healthy.
- `UNHEALTHY`: Only return instances that are unhealthy.
- `ALL`: Return all instances, regardless of their health status.
- `HEALTHY_OR_ELSE_ALL`: Return healthy instances, unless none report a healthy state, in which case return all instances. This is also known as failing open.

If `port` isn’t set, a default port defined in CloudMap will be used.

> Note
> 
> The `aws_cloud_map` resolver returns a maximum of 100 hosts. A [feature request](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/29771) aims to cover pagination for this scenario.
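
For example, a sketch of a Cloud Map resolver (the namespace and service name are placeholders):

```alloy
otelcol.exporter.loadbalancing "default" {
    resolver {
        aws_cloud_map {
            namespace     = "my-namespace"        // placeholder Cloud Map namespace
            service_name  = "alloy-sampling"      // placeholder registered service
            health_status = "HEALTHY_OR_ELSE_ALL" // fail open if nothing reports healthy
            port          = 4317
        }
    }

    protocol {
        otlp {
            client {}
        }
    }
}
```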

### `dns`

The `dns` block periodically resolves an IP address via the DNS `hostname` attribute. The gRPC exporter then uses this IP address, together with the port specified via the `port` attribute, as the endpoint to export data to.

The following arguments are supported:

| Name       | Type       | Description                                                           | Default  | Required |
|------------|------------|-----------------------------------------------------------------------|----------|----------|
| `hostname` | `string`   | DNS hostname to resolve.                                              |          | yes      |
| `interval` | `duration` | Resolver interval.                                                    | `"5s"`   | no       |
| `port`     | `string`   | Port to be used with the IP addresses resolved from the DNS hostname. | `"4317"` | no       |
| `timeout`  | `duration` | Resolver timeout.                                                     | `"1s"`   | no       |

### `kubernetes`

You can use the `kubernetes` block to load balance across the pods of a Kubernetes service. The Kubernetes API notifies Alloy whenever a new Pod is added or removed from the service. The `kubernetes` resolver has a much faster response time than the `dns` resolver because it doesn’t require polling.

The following arguments are supported:

| Name               | Type           | Description                                                 | Default  | Required |
|--------------------|----------------|-------------------------------------------------------------|----------|----------|
| `service`          | `string`       | Kubernetes service to resolve.                              |          | yes      |
| `ports`            | `list(number)` | Ports to use with the IP addresses resolved from `service`. | `[4317]` | no       |
| `return_hostnames` | `bool`         | Return hostnames instead of IPs.                            | `false`  | no       |
| `timeout`          | `duration`     | Resolver timeout.                                           | `"1s"`   | no       |

If no namespace is specified inside `service`, an attempt will be made to infer the namespace for this Alloy instance. If this fails, the `default` namespace will be used.

Each of the ports listed in `ports` will be used with each of the IPs resolved from `service`.

The “get”, “list”, and “watch” [roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#role-example) must be granted in Kubernetes for the resolver to work.

`return_hostnames` is useful in certain situations like using Istio in sidecar mode. To use this feature, the `service` argument must be a headless `Service`, pointing at a `StatefulSet`. Also, the `service` argument must be what’s specified under `.spec.serviceName` in the `StatefulSet`.
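
For example, a sketch that resolves hostnames from a headless Service (the service name is a placeholder):

```alloy
otelcol.exporter.loadbalancing "default" {
    resolver {
        kubernetes {
            // Placeholder name. For return_hostnames to work, this must be a
            // headless Service whose name matches .spec.serviceName of the
            // StatefulSet it points at.
            service          = "alloy-sampling-headless"
            ports            = [4317]
            return_hostnames = true
        }
    }

    protocol {
        otlp {
            client {}
        }
    }
}
```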

### `static`

The `static` block configures a list of endpoints which this exporter will send data to.

The following arguments are supported:

| Name        | Type           | Description                     | Default | Required |
|-------------|----------------|---------------------------------|---------|----------|
| `hostnames` | `list(string)` | List of endpoints to export to. |         | yes      |

### `protocol`

The `protocol` block configures protocol-related settings for exporting. At the moment only the OTLP protocol is supported.

### `otlp`

The `otlp` block configures OTLP-related settings for exporting.

### `client`

The `client` block configures the gRPC client used by the component. The endpoints used by the client block are the ones from the `resolver` block.

The following arguments are supported:

| Name                | Type                       | Description                                                                      | Default       | Required |
|---------------------|----------------------------|----------------------------------------------------------------------------------|---------------|----------|
| `auth`              | `capsule(otelcol.Handler)` | Handler from an `otelcol.auth` component to use for authenticating requests.     |               | no       |
| `authority`         | `string`                   | Overrides the default `:authority` header in gRPC requests from the gRPC client. |               | no       |
| `balancer_name`     | `string`                   | Which gRPC client-side load balancer to use for requests.                        | `round_robin` | no       |
| `compression`       | `string`                   | Compression mechanism to use for requests.                                       | `"gzip"`      | no       |
| `headers`           | `map(string)`              | Additional headers to send with the request.                                     | `{}`          | no       |
| `read_buffer_size`  | `string`                   | Size of the read buffer the gRPC client uses for reading server responses.       |               | no       |
| `wait_for_ready`    | `boolean`                  | Waits for gRPC connection to be in the `READY` state before sending data.        | `false`       | no       |
| `write_buffer_size` | `string`                   | Size of the write buffer the gRPC client uses for writing requests.              | `"512KiB"`    | no       |

By default, requests are compressed with Gzip. The `compression` argument controls which compression mechanism to use. Supported strings are:

- `"gzip"`
- `"zlib"`
- `"deflate"`
- `"snappy"`
- `"zstd"`

If you set `compression` to `"none"` or an empty string `""`, the requests aren’t compressed.

The supported values for `balancer_name` are listed in the gRPC documentation on [Load balancing](https://github.com/grpc/grpc-go/blob/master/examples/features/load_balancing/README.md):

- `pick_first`: Tries to connect to the first address, and uses that address for all RPCs if the connection succeeds. If it fails, it tries the next address, and keeps trying until a connection is successful. Because of this, all the RPCs are sent to the same backend.
- `round_robin`: Connects to all the addresses it sees and sends an RPC to each backend one at a time in order. For example, the first RPC is sent to backend-1, the second RPC is sent to backend-2, and the third RPC is sent to backend-1.
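
For example, a sketch of a `client` block that uses `zstd` compression and pins all RPCs to a single backend (values are illustrative):

```alloy
otelcol.exporter.loadbalancing "default" {
    ...
    protocol {
        otlp {
            client {
                // zstd trades a little CPU for a better compression ratio than
                // the default gzip.
                compression   = "zstd"
                // pick_first sends all RPCs to one backend instead of spreading
                // them with the default round_robin balancer.
                balancer_name = "pick_first"
            }
        }
    }
}
```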

The `:authority` header in gRPC specifies the host to which the request is being sent. It’s similar to the `Host` [header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Host) in HTTP requests. By default, the value for `:authority` is derived from the endpoint URL used for the gRPC call. Overriding `:authority` could be useful when routing traffic using a proxy like Envoy, which [makes routing decisions](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/route_matching) based on the value of the `:authority` header.

You can configure an HTTP proxy with the following environment variables:

- `HTTPS_PROXY`
- `NO_PROXY`

The `HTTPS_PROXY` environment variable specifies a URL to use for proxying requests. Connections to the proxy are established via [the `HTTP CONNECT` method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/CONNECT).

The `NO_PROXY` environment variable is an optional list of comma-separated hostnames for which the HTTPS proxy should *not* be used. Each hostname can be provided as an IP address (`1.2.3.4`), an IP address in CIDR notation (`1.2.3.4/8`), a domain name (`example.com`), or `*`. A domain name matches that domain and all subdomains. A domain name with a leading “.” (`.example.com`) matches subdomains only. `NO_PROXY` is only read when `HTTPS_PROXY` is set.

Because `otelcol.exporter.loadbalancing` uses gRPC, the configured proxy server must be able to handle and proxy HTTP/2 traffic.

### `keepalive`

The `keepalive` block configures keepalive settings for gRPC client connections.

The following arguments are supported:

| Name                    | Type       | Description                                                                               | Default | Required |
|-------------------------|------------|-------------------------------------------------------------------------------------------|---------|----------|
| `ping_wait`             | `duration` | How often to ping the server after no activity.                                           |         | no       |
| `ping_response_timeout` | `duration` | Time to wait before closing inactive connections if the server doesn’t respond to a ping. |         | no       |
| `ping_without_stream`   | `boolean`  | Send pings even if there is no active stream request.                                     |         | no       |
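
For example, a sketch with illustrative values:

```alloy
otelcol.exporter.loadbalancing "default" {
    ...
    protocol {
        otlp {
            client {
                keepalive {
                    ping_wait             = "30s"
                    ping_response_timeout = "10s"
                    ping_without_stream   = true
                }
            }
        }
    }
}
```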

### `tls`

The `tls` block configures TLS settings used for the connection to the gRPC server.

The following arguments are supported:

| Name                           | Type           | Description                                                                                  | Default     | Required |
|--------------------------------|----------------|----------------------------------------------------------------------------------------------|-------------|----------|
| `ca_file`                      | `string`       | Path to the CA file.                                                                         |             | no       |
| `ca_pem`                       | `string`       | CA PEM-encoded text to validate the server with.                                             |             | no       |
| `cert_file`                    | `string`       | Path to the TLS certificate.                                                                 |             | no       |
| `cert_pem`                     | `string`       | Certificate PEM-encoded text for client authentication.                                      |             | no       |
| `cipher_suites`                | `list(string)` | A list of TLS cipher suites that the TLS transport can use.                                  | `[]`        | no       |
| `curve_preferences`            | `list(string)` | Set of elliptic curves to use in a handshake.                                                | `[]`        | no       |
| `include_system_ca_certs_pool` | `boolean`      | Whether to load the system certificate authorities pool alongside the certificate authority. | `false`     | no       |
| `insecure_skip_verify`         | `boolean`      | Ignores insecure server TLS certificates.                                                    |             | no       |
| `insecure`                     | `boolean`      | Disables TLS when connecting to the configured server.                                       |             | no       |
| `key_file`                     | `string`       | Path to the TLS certificate key.                                                             |             | no       |
| `key_pem`                      | `secret`       | Key PEM-encoded text for client authentication.                                              |             | no       |
| `max_version`                  | `string`       | Maximum acceptable TLS version for connections.                                              | `"TLS 1.3"` | no       |
| `min_version`                  | `string`       | Minimum acceptable TLS version for connections.                                              | `"TLS 1.2"` | no       |
| `reload_interval`              | `duration`     | The duration after which the certificate is reloaded.                                        | `"0s"`      | no       |
| `server_name`                  | `string`       | Verifies the hostname of server certificates when set.                                       |             | no       |

If the server doesn’t support TLS, you must set the `insecure` argument to `true`.

To disable `tls` for connections to the server, set the `insecure` argument to `true`.

If you set `reload_interval` to `"0s"`, the certificate is never reloaded.

The following pairs of arguments are mutually exclusive and can’t both be set simultaneously:

- `ca_pem` and `ca_file`
- `cert_pem` and `cert_file`
- `key_pem` and `key_file`

If `cipher_suites` is left blank, a safe default list is used. Refer to the [Go TLS documentation](https://go.dev/src/crypto/tls/cipher_suites.go) for a list of supported cipher suites.

The `curve_preferences` argument determines the set of [elliptic curves](https://go.dev/src/crypto/tls/common.go#L138) to prefer during a handshake, in preference order. If not provided, a default list is used. The available elliptic curves are `X25519`, `P521`, `P256`, and `P384`.
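
For example, a sketch that uses certificate files (the paths are placeholders):

```alloy
otelcol.exporter.loadbalancing "default" {
    ...
    protocol {
        otlp {
            client {
                tls {
                    // Placeholder paths. The *_file and *_pem arguments are
                    // mutually exclusive, as described above.
                    ca_file     = "/etc/alloy/certs/ca.crt"
                    cert_file   = "/etc/alloy/certs/client.crt"
                    key_file    = "/etc/alloy/certs/client.key"
                    min_version = "TLS 1.3"
                }
            }
        }
    }
}
```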

### `tpm`

The `tpm` block configures retrieving the TLS `key_file` from a trusted device.

The following arguments are supported:

| Name         | Type     | Description                                                        | Default | Required |
|--------------|----------|--------------------------------------------------------------------|---------|----------|
| `auth`       | `string` | The authorization value used to authenticate the TPM device.       | `""`    | no       |
| `enabled`    | `bool`   | Load the `tls.key_file` from TPM.                                  | `false` | no       |
| `owner_auth` | `string` | The owner authorization value used to authenticate the TPM device. | `""`    | no       |
| `path`       | `string` | Path to the TPM device or Unix domain socket.                      | `""`    | no       |

The [trusted platform module](https://trustedcomputinggroup.org/resource/trusted-platform-module-tpm-summary/) (TPM) configuration can be used for loading the TLS key from a TPM. Currently, only the TSS2 format is supported.

The `path` attribute is not supported on Windows.

In the following example, the private key `my-tss2-key.key` in TSS2 format is loaded from the TPM device `/dev/tpmrm0`:

```alloy
otelcol.example.component "<LABEL>" {
    ...
    tls {
        ...
        key_file = "my-tss2-key.key"
        tpm {
            enabled = true
            path = "/dev/tpmrm0"
        }
    }
}
```

### `queue`

The `queue` block configures an in-memory buffer of batches before data is sent to the gRPC server.

The following arguments are supported:

| Name                | Type                       | Description                                                                                | Default      | Required |
|---------------------|----------------------------|--------------------------------------------------------------------------------------------|--------------|----------|
| `block_on_overflow` | `boolean`                  | The behavior when the component’s `TotalSize` limit is reached.                            | `false`      | no       |
| `enabled`           | `boolean`                  | Enables a buffer before sending data to the client.                                        | `true`       | no       |
| `num_consumers`     | `number`                   | Number of readers to send batches written to the queue in parallel.                        | `10`         | no       |
| `queue_size`        | `number`                   | Maximum number of unwritten batches allowed in the queue at the same time.                 | `1000`       | no       |
| `sizer`             | `string`                   | How the queue and batching is measured.                                                    | `"requests"` | no       |
| `wait_for_result`   | `boolean`                  | Whether incoming requests are blocked until the request has been processed.                | `false`      | no       |
| `storage`           | `capsule(otelcol.Handler)` | Handler from an `otelcol.storage` component to use to enable a persistent queue mechanism. |              | no       |

The `blocking` argument is deprecated in favor of the `block_on_overflow` argument.

When `block_on_overflow` is `true`, the component will wait for space. Otherwise, operations will immediately return a retryable error.

When `enabled` is `true`, data is first written to an in-memory buffer before sending it to the configured server. Batches sent to the component’s `input` exported field are added to the buffer as long as the number of unsent batches doesn’t exceed the configured `queue_size`.

`queue_size` determines how long an endpoint outage is tolerated. Assuming 100 requests/second, the default queue size `1000` provides about 10 seconds of outage tolerance. To calculate the correct value for `queue_size`, multiply the average number of outgoing requests per second by the time in seconds that outages are tolerated. A very high value can cause Out Of Memory (OOM) kills.
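
For example, a sketch sized for roughly 300 outgoing requests per second with 10 seconds of tolerated outage (the numbers are illustrative):

```alloy
otelcol.exporter.loadbalancing "default" {
    ...
    queue {
        // 300 requests/second * 10 seconds of tolerated outage = 3000 batches.
        queue_size    = 3000
        num_consumers = 10
    }
}
```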

The `sizer` argument can be set to one of the following:

- `requests`: number of incoming batches of metrics, logs, traces (the most performant option).
- `items`: number of the smallest parts of each signal (spans, metric data points, log records).
- `bytes`: the size of serialized data in bytes (the least performant option).

The `num_consumers` argument controls how many readers read from the buffer and send data in parallel. Larger values of `num_consumers` allow data to be sent more quickly at the expense of increased network traffic.

If an `otelcol.storage.*` component is configured and provided in the queue’s `storage` argument, the queue uses the provided storage extension to provide a persistent queue and the queue is no longer stored in memory. Any data persisted will be processed on startup if Alloy is killed or restarted. Refer to the [exporterhelper documentation](https://github.com/open-telemetry/opentelemetry-collector/blob/v0.147.0/exporter/exporterhelper/README.md#persistent-queue) in the OpenTelemetry Collector repository for more details.
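
As a sketch, a persistent queue might look like the following. The `otelcol.storage.file_storage` component, its `directory` argument, and its `handler` export are assumptions here; check the `otelcol.storage.*` component reference before using them.

```alloy
// Assumed storage component; verify the name and arguments against the
// otelcol.storage.* reference documentation.
otelcol.storage.file_storage "queue" {
    directory = "/var/lib/alloy/otel-queue" // placeholder path
}

otelcol.exporter.loadbalancing "default" {
    ...
    queue {
        storage = otelcol.storage.file_storage.queue.handler
    }
}
```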

### `batch`

The `batch` block configures batching requests based on a timeout and a minimum number of items.

Batching is disabled by default. To enable it, explicitly include `batch {}` in your Alloy configuration. You don't need to include a `batch {}` block in your `otelcol.exporter` component if you already use an `otelcol.processor.batch` component, although batching in the exporter is the preferred method because it's more flexible.

The following arguments are supported:

| Name            | Type       | Description                                                                                                | Default   | Required |
|-----------------|------------|------------------------------------------------------------------------------------------------------------|-----------|----------|
| `flush_timeout` | `duration` | Time after which a batch will be sent regardless of its size. Must be a non-zero value.                    | `"200ms"` | no       |
| `min_size`      | `number`   | The minimum size of a batch.                                                                               | `2000`    | no       |
| `max_size`      | `number`   | The maximum size of a batch, enables batch splitting.                                                      | `3000`    | no       |
| `sizer`         | `string`   | How the queue and batching is measured. Overrides the sizer set at the `sending_queue` level for batching. | `"items"` | no       |

If configured, `max_size` must be greater than or equal to `min_size`.

The `sizer` argument can be set to:

- `items`: the number of the smallest parts of each signal (spans, metric data points, log records).
- `bytes`: the size of serialized data in bytes (the least performant option).
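
For example, an illustrative `batch` block is shown below. The values are placeholders, and the block goes alongside the queue settings described above:

```alloy
batch {
    flush_timeout = "500ms"
    min_size      = 1000
    max_size      = 2000
    sizer         = "items"
}
```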

### `retry`

The `retry` block configures how failed requests to the gRPC server are retried.

The following arguments are supported:

| Name                   | Type       | Description                                            | Default | Required |
|------------------------|------------|--------------------------------------------------------|---------|----------|
| `enabled`              | `boolean`  | Enables retrying failed requests.                      | `true`  | no       |
| `initial_interval`     | `duration` | Initial time to wait before retrying a failed request. | `"5s"`  | no       |
| `max_elapsed_time`     | `duration` | Maximum time to wait before discarding a failed batch. | `"5m"`  | no       |
| `max_interval`         | `duration` | Maximum time to wait between retries.                  | `"30s"` | no       |
| `multiplier`           | `number`   | Factor to grow wait time before retrying.              | `1.5`   | no       |
| `randomization_factor` | `number`   | Factor to randomize wait time before retrying.         | `0.5`   | no       |

When `enabled` is `true`, failed batches are retried after a given interval. The `initial_interval` argument specifies how long to wait before the first retry attempt. If requests continue to fail, the time to wait before retrying increases by the factor specified by the `multiplier` argument, which must be greater than `1.0`. The `max_interval` argument specifies the upper bound of how long to wait between retries.

The `randomization_factor` argument is useful for adding jitter between retrying Alloy instances. If `randomization_factor` is greater than `0`, the wait time before retries is multiplied by a random factor in the range `[ I - randomization_factor * I, I + randomization_factor * I]`, where `I` is the current interval.

If a batch hasn’t been sent successfully, it’s discarded after the time specified by `max_elapsed_time` elapses. If `max_elapsed_time` is set to `"0s"`, failed requests are retried forever until they succeed.
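
For example, a sketch with illustrative values:

```alloy
otelcol.exporter.loadbalancing "default" {
    ...
    retry {
        enabled              = true
        initial_interval     = "2s"
        multiplier           = 2.0
        max_interval         = "1m"
        max_elapsed_time     = "10m"
        randomization_factor = 0.5
    }
}
```

With these values, the first retry waits roughly 1 to 3 seconds (2 seconds jittered by ±50%), the next roughly 2 to 6 seconds, and so on, doubling until the interval reaches one minute. A batch that still fails after 10 minutes is discarded.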

### `debug_metrics`

The `debug_metrics` block configures the metrics that this component generates to monitor its state.

The following arguments are supported:

| Name                               | Type      | Description                                          | Default | Required |
|------------------------------------|-----------|------------------------------------------------------|---------|----------|
| `disable_high_cardinality_metrics` | `boolean` | Whether to disable certain high cardinality metrics. | `true`  | no       |

`disable_high_cardinality_metrics` is the Alloy equivalent to the `telemetry.disableHighCardinalityMetrics` feature gate in the OpenTelemetry Collector. It removes attributes that could cause high cardinality metrics. For example, attributes with IP addresses and port numbers in metrics about HTTP and gRPC connections are removed.
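
For example, a sketch that keeps the high-cardinality attributes:

```alloy
otelcol.exporter.loadbalancing "default" {
    ...
    debug_metrics {
        disable_high_cardinality_metrics = false
    }
}
```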

> Note
> 
> If configured, `disable_high_cardinality_metrics` only applies to `otelcol.exporter.*` and `otelcol.receiver.*` components.

## Exported fields

The following fields are exported and can be referenced by other components:

| Name    | Type               | Description                                                      |
|---------|--------------------|------------------------------------------------------------------|
| `input` | `otelcol.Consumer` | A value that other components can use to send telemetry data to. |

`input` accepts OTLP-formatted data for telemetry signals of the following types:

- logs
- traces

## Choose a load balancing strategy

Different Alloy components require different load-balancing strategies. The use of `otelcol.exporter.loadbalancing` is only necessary for [stateful components](../../../../set-up/deploy/#stateful-and-stateless-components).

### `otelcol.processor.tail_sampling`

All spans for a given trace ID must go to the same tail sampling Alloy instance.

- This can be done by configuring `otelcol.exporter.loadbalancing` with `routing_key = "traceID"`.
- If you don’t configure `routing_key = "traceID"`, the sampling decision may be incorrect. The tail sampler must have a full view of the trace when making a sampling decision. For example, a `rate_limiting` tail sampling strategy may incorrectly pass through more spans than expected if the spans for the same trace are spread out to more than one Alloy instance.

### `otelcol.connector.spanmetrics`

All spans for a given `service.name` must go to the same spanmetrics Alloy instance.

- This can be done by configuring `otelcol.exporter.loadbalancing` with `routing_key = "service"`.
- If you do not configure `routing_key = "service"`, metrics generated from spans might be incorrect. For example, if similar spans for the same `service.name` end up on different Alloy instances, the two Alloy instances will have identical metric series for calculating span latency, errors, and number of requests. When both Alloy instances attempt to write the metrics to a database such as Mimir, the series may clash with each other. At best, this will lead to an error in Alloy and a rejected write to the metrics database. At worst, it could lead to inaccurate data due to overlapping samples for the metric series.

However, there are ways to scale `otelcol.connector.spanmetrics` without the need for a load balancer:

1. Each Alloy instance could add an attribute such as `collector.id` to make its series unique. Then, for example, you could use a `sum by` PromQL query to aggregate the metrics from different Alloy instances. Unfortunately, an extra `collector.id` attribute has the downside that the metrics stored in the database will have higher cardinality.
2. Spanmetrics could be generated in the backend database instead of in Alloy. For example, span metrics can be [generated](/docs/tempo/latest/metrics-generator/span_metrics/) in Grafana Cloud by the Tempo traces database.

### `otelcol.connector.servicegraph`

It’s challenging to scale `otelcol.connector.servicegraph` over multiple Alloy instances. For `otelcol.connector.servicegraph` to work correctly, each “client” span must be paired with a “server” span to calculate metrics such as span duration. If a “client” span goes to one Alloy, but a “server” span goes to another Alloy, then no single Alloy will be able to pair the spans and a metric won’t be generated.

`otelcol.exporter.loadbalancing` can solve this problem partially if it is configured with `routing_key = "traceID"`. Each Alloy instance will then be able to calculate a service graph for each “client”/“server” pair in a trace. It’s possible to have a span with similar “server”/“client” values in a different trace, processed by another Alloy instance. If two different Alloy instances process similar “server”/“client” spans, they will generate the same service graph metric series. If the series from two Alloy instances are the same, this will lead to issues when writing them to the backend database. You could differentiate the series by adding an attribute such as `"collector.id"`. The series from different Alloy instances can be aggregated using PromQL queries on the backend metrics database. If the metrics are stored in Grafana Mimir, cardinality issues due to `"collector.id"` labels can be solved using [Adaptive Metrics](/docs/grafana-cloud/cost-management-and-billing/reduce-costs/metrics-costs/control-metrics-usage-via-adaptive-metrics/).

A simpler, more scalable alternative to generating service graph metrics in Alloy is to generate them entirely in the backend database. For example, service graphs can be [generated](/docs/tempo/latest/metrics-generator/service_graphs/) in Grafana Cloud by the Tempo traces database.

### Mix stateful components

Different Alloy components may require a different `routing_key` for `otelcol.exporter.loadbalancing`. For example, `otelcol.processor.tail_sampling` requires `routing_key = "traceID"` whereas `otelcol.connector.spanmetrics` requires `routing_key = "service"`. To load balance both types of components, two different sets of load balancers have to be set up:

- One set of `otelcol.exporter.loadbalancing` with `routing_key = "traceID"`, sending spans to Alloys doing tail sampling and no span metrics.
- Another set of `otelcol.exporter.loadbalancing` with `routing_key = "service"`, sending spans to Alloys doing span metrics and no service graphs.

Unfortunately, this can also lead to side effects. For example, if `otelcol.connector.spanmetrics` is configured to generate exemplars, the tail sampling Alloys might drop the trace that the exemplar points to. There is no coordination between the tail sampling Alloys and the span metrics Alloys to make sure trace IDs for exemplars are kept.

## Component health

`otelcol.exporter.loadbalancing` is only reported as unhealthy if given an invalid configuration.

## Debug information

`otelcol.exporter.loadbalancing` doesn’t expose any component-specific debug information.

## Examples

### Static resolver

This example accepts OTLP logs and traces over gRPC. It then sends them in a load-balanced way to `"localhost:55690"` or `"localhost:55700"`.

```alloy
otelcol.receiver.otlp "default" {
    grpc {}
    output {
        traces  = [otelcol.exporter.loadbalancing.default.input]
        logs    = [otelcol.exporter.loadbalancing.default.input]
    }
}

otelcol.exporter.loadbalancing "default" {
    resolver {
        static {
            hostnames = ["localhost:55690", "localhost:55700"]
        }
    }
    protocol {
        otlp {
            client {}
        }
    }
}
```

### DNS resolver

When configured with a `dns` resolver, `otelcol.exporter.loadbalancing` performs a DNS lookup at regular intervals. Spans are exported to the addresses returned by the DNS lookup.

```alloy
otelcol.exporter.loadbalancing "default" {
    resolver {
        dns {
            hostname = "alloy-traces-sampling.grafana-cloud-monitoring.svc.cluster.local"
            port     = "34621"
            interval = "5s"
            timeout  = "1s"
        }
    }
    protocol {
        otlp {
            client {}
        }
    }
}
```

The following example shows a Kubernetes configuration that configures two groups of Alloy instances:

- A pool of load-balancer Alloy instances:
  
  - Spans are received from instrumented applications via `otelcol.receiver.otlp`.
  - Spans are exported via `otelcol.exporter.loadbalancing`.
- A pool of sampling Alloy instances:
  
  - The sampling Alloys run behind a headless service to enable the load-balancer Alloys to discover them.
  - Spans are received from the load-balancer Alloys via `otelcol.receiver.otlp`.
  - Traces are sampled via `otelcol.processor.tail_sampling`.
  - The traces are exported via `otelcol.exporter.otlp` to an OTLP-compatible database such as Tempo.

Example Kubernetes configuration

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: grafana-cloud-monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k6-trace-generator
  namespace: grafana-cloud-monitoring
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: k6-trace-generator
  template:
    metadata:
      labels:
        name: k6-trace-generator
    spec:
      containers:
      - env:
        - name: ENDPOINT
          value: alloy-traces-lb.grafana-cloud-monitoring.svc.cluster.local:9411
        image: ghcr.io/grafana/xk6-client-tracing:v0.0.2
        imagePullPolicy: IfNotPresent
        name: k6-trace-generator
---
apiVersion: v1
kind: Service
metadata:
  name: alloy-traces-lb
  namespace: grafana-cloud-monitoring
spec:
  clusterIP: None
  ports:
  - name: alloy-traces-otlp-grpc
    port: 9411
    protocol: TCP
    targetPort: 9411
  selector:
    name: alloy-traces-lb
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alloy-traces-lb
  namespace: grafana-cloud-monitoring
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: alloy-traces-lb
  template:
    metadata:
      labels:
        name: alloy-traces-lb
    spec:
      containers:
      - args:
        - run
        - /etc/alloy/alloy_lb.alloy
        command:
        - /bin/alloy
        image: grafana/alloy:v1.0
        imagePullPolicy: IfNotPresent
        name: alloy-traces
        ports:
        - containerPort: 9411
          name: otlp-grpc
          protocol: TCP
        - containerPort: 34621
          name: alloy-lb
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/alloy
          name: alloy-traces
      volumes:
      - configMap:
          name: alloy-traces
        name: alloy-traces
---
apiVersion: v1
kind: Service
metadata:
  name: alloy-traces-sampling
  namespace: grafana-cloud-monitoring
spec:
  clusterIP: None
  ports:
  - name: alloy-lb
    port: 34621
    protocol: TCP
    targetPort: alloy-lb
  selector:
    name: alloy-traces-sampling
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alloy-traces-sampling
  namespace: grafana-cloud-monitoring
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: alloy-traces-sampling
  template:
    metadata:
      labels:
        name: alloy-traces-sampling
    spec:
      containers:
      - args:
        - run
        - /etc/alloy/alloy_sampling.alloy
        command:
        - /bin/alloy
        image: grafana/alloy:v1.0
        imagePullPolicy: IfNotPresent
        name: alloy-traces
        ports:
        - containerPort: 9411
          name: otlp-grpc
          protocol: TCP
        - containerPort: 34621
          name: alloy-lb
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/alloy
          name: alloy-traces
      volumes:
      - configMap:
          name: alloy-traces
        name: alloy-traces
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: alloy-traces
  namespace: grafana-cloud-monitoring
data:
  alloy_lb.alloy: |
    otelcol.receiver.otlp "default" {
      grpc {
        endpoint = "0.0.0.0:9411"
      }
      output {
        traces = [otelcol.exporter.loadbalancing.default.input,otelcol.exporter.debug.default.input]
      }
    }

    otelcol.exporter.debug "default" {
      verbosity = "detailed"
    }

    otelcol.exporter.loadbalancing "default" {
      resolver {
        dns {
          hostname = "alloy-traces-sampling.grafana-cloud-monitoring.svc.cluster.local"
          port = "34621"
        }
      }
      protocol {
        otlp {
          client {
            tls {
              insecure = true
            }
          }
        }
      }
    }

  alloy_sampling.alloy: |
    otelcol.receiver.otlp "default" {
      grpc {
        endpoint = "0.0.0.0:34621"
      }
      output {
        traces = [otelcol.exporter.otlp.default.input,otelcol.exporter.debug.default.input]
      }
    }

    otelcol.exporter.debug "default" {
      verbosity = "detailed"
    }

    otelcol.exporter.otlp "default" {
      client {
        endpoint = "tempo-prod-06-prod-gb-south-0.grafana.net:443"
        auth     = otelcol.auth.basic.creds.handler
      }
    }

    otelcol.auth.basic "creds" {
      username = "111111"
      password = "pass"
    }
```

You must fill in the correct OTLP credentials prior to running the example. You can use [k3d](https://k3d.io/v5.6.0/) to start the example:

```bash
k3d cluster create alloy-lb-test
kubectl apply -f kubernetes_config.yaml
```

To delete the cluster, run:

```bash
k3d cluster delete alloy-lb-test
```

### Kubernetes resolver

When you configure `otelcol.exporter.loadbalancing` with a `kubernetes` resolver, the Kubernetes API notifies Alloy whenever a new Pod is added or removed from the service. Spans are exported to the addresses from the Kubernetes API, combined with all the possible `ports`.

```alloy
otelcol.exporter.loadbalancing "default" {
    resolver {
        kubernetes {
            service = "alloy-traces-headless"
            ports   = [ 34621 ]
        }
    }
    protocol {
        otlp {
            client {}
        }
    }
}
```

The following example shows a Kubernetes configuration that sets up two groups of Alloy instances:

- A pool of load-balancer Alloys:
  
  - Spans are received from instrumented applications via `otelcol.receiver.otlp`.
  - Spans are exported via `otelcol.exporter.loadbalancing`.
  - The load-balancer Alloys will get notified by the Kubernetes API any time a Pod is added or removed from the pool of sampling Alloys.
- A pool of sampling Alloy instances:
  
  - The sampling Alloy instances don’t need to run behind a headless service.
  - Spans are received from the load-balancer Alloys via `otelcol.receiver.otlp`.
  - Traces are sampled via `otelcol.processor.tail_sampling`.
  - The traces are exported via `otelcol.exporter.otlp` to an OTLP-compatible database such as Tempo.

Example Kubernetes configuration

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: grafana-cloud-monitoring
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alloy-traces
  namespace: grafana-cloud-monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: alloy-traces-role
  namespace: grafana-cloud-monitoring
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  verbs:
  - list
  - watch
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: alloy-traces-rolebinding
  namespace: grafana-cloud-monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: alloy-traces-role
subjects:
- kind: ServiceAccount
  name: alloy-traces
  namespace: grafana-cloud-monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k6-trace-generator
  namespace: grafana-cloud-monitoring
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: k6-trace-generator
  template:
    metadata:
      labels:
        name: k6-trace-generator
    spec:
      containers:
      - env:
        - name: ENDPOINT
          value: alloy-traces-lb.grafana-cloud-monitoring.svc.cluster.local:9411
        image: ghcr.io/grafana/xk6-client-tracing:v0.0.2
        imagePullPolicy: IfNotPresent
        name: k6-trace-generator
---
apiVersion: v1
kind: Service
metadata:
  name: alloy-traces-lb
  namespace: grafana-cloud-monitoring
spec:
  clusterIP: None
  ports:
  - name: alloy-traces-otlp-grpc
    port: 9411
    protocol: TCP
    targetPort: 9411
  selector:
    name: alloy-traces-lb
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alloy-traces-lb
  namespace: grafana-cloud-monitoring
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: alloy-traces-lb
  template:
    metadata:
      labels:
        name: alloy-traces-lb
    spec:
      containers:
      - args:
        - run
        - /etc/alloy/alloy_lb.alloy
        command:
        - /bin/alloy
        image: grafana/alloy:v1.0
        imagePullPolicy: IfNotPresent
        name: alloy-traces
        ports:
        - containerPort: 9411
          name: otlp-grpc
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/alloy
          name: alloy-traces
      serviceAccount: alloy-traces
      volumes:
      - configMap:
          name: alloy-traces
        name: alloy-traces
---
apiVersion: v1
kind: Service
metadata:
  name: alloy-traces-sampling
  namespace: grafana-cloud-monitoring
spec:
  ports:
  - name: alloy-lb
    port: 34621
    protocol: TCP
    targetPort: alloy-lb
  selector:
    name: alloy-traces-sampling
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alloy-traces-sampling
  namespace: grafana-cloud-monitoring
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: alloy-traces-sampling
  template:
    metadata:
      labels:
        name: alloy-traces-sampling
    spec:
      containers:
      - args:
        - run
        - /etc/alloy/alloy_sampling.alloy
        command:
        - /bin/alloy
        image: grafana/alloy:v1.0
        imagePullPolicy: IfNotPresent
        name: alloy-traces
        ports:
        - containerPort: 34621
          name: alloy-lb
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/alloy
          name: alloy-traces
      volumes:
      - configMap:
          name: alloy-traces
        name: alloy-traces
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: alloy-traces
  namespace: grafana-cloud-monitoring
data:
  alloy_lb.alloy: |
    otelcol.receiver.otlp "default" {
      grpc {
        endpoint = "0.0.0.0:9411"
      }
      output {
        traces = [otelcol.exporter.loadbalancing.default.input,otelcol.exporter.debug.default.input]
      }
    }

    otelcol.exporter.debug "default" {
      verbosity = "detailed"
    }

    otelcol.exporter.loadbalancing "default" {
      resolver {
        kubernetes {
          service = "alloy-traces-sampling"
          ports = [34621]
        }
      }
      protocol {
        otlp {
          client {
            tls {
              insecure = true
            }
          }
        }
      }
    }

  alloy_sampling.alloy: |
    otelcol.receiver.otlp "default" {
      grpc {
        endpoint = "0.0.0.0:34621"
      }
      output {
        traces = [otelcol.exporter.otlp.default.input,otelcol.exporter.debug.default.input]
      }
    }

    otelcol.exporter.debug "default" {
      verbosity = "detailed"
    }

    otelcol.exporter.otlp "default" {
      client {
        endpoint = "tempo-prod-06-prod-gb-south-0.grafana.net:443"
        auth     = otelcol.auth.basic.creds.handler
      }
    }

    otelcol.auth.basic "creds" {
      username = "111111"
      password = "pass"
    }
```

You must fill in the correct OTLP credentials prior to running the example. You can use [k3d](https://k3d.io/v5.6.0/) to start the example:

```bash
k3d cluster create alloy-lb-test
kubectl apply -f kubernetes_config.yaml
```

To delete the cluster, run:

```bash
k3d cluster delete alloy-lb-test
```

## Compatible components

`otelcol.exporter.loadbalancing` has exports that can be consumed by the following components:

- Components that consume [OpenTelemetry `otelcol.Consumer`](../../../compatibility/#opentelemetry-otelcolconsumer-consumers)

> Note
> 
> Connecting some components may not be sensible or components may require further configuration to make the connection work correctly. Refer to the linked documentation for more details.
