Pattern 4 - Load balancing

In the general case, a regular off-the-shelf Layer 4 load balancer works for OpenTelemetry when OTLP payloads are sent over HTTP, and gRPC load balancers work for OTLP over gRPC. Generic load balancing strategies won’t work in some cases, such as when tail sampling is needed: all spans of a trace must reach the same collector so that the sampling decision is made on the complete trace. For those cases, the OpenTelemetry Collector contrib distribution includes a load balancing exporter that inspects the payload, extracts the trace ID, and consistently picks the same backend for the same trace ID. This use case involves two layers of collectors: the load balancing layer and the processing layer.

A Layer 4 load balancer can be placed in front of the OpenTelemetry Collector load balancing layer to achieve a high-availability setup. The processing layer of OpenTelemetry Collectors serves as the set of backends for the load balancing layer. It processes the data and makes the sampling decision before sending the sampled data to the telemetry backend.
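As a rough illustration of the processing layer described above, the following sketch shows a collector that receives OTLP from the load balancing layer, makes a tail-sampling decision, and forwards the sampled traces. The policy names, the sampling percentage, the listen address, and the `telemetry-backend.example.com` endpoint are assumptions chosen for the example, not values from this document.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        # Assumption: listen on all interfaces so the load balancing
        # layer can reach this collector.
        endpoint: 0.0.0.0:4317

processors:
  tail_sampling:
    # How long to wait for the spans of a trace before deciding.
    decision_wait: 10s
    policies:
      # Hypothetical policy: keep every trace that contains an error.
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      # Hypothetical policy: keep a fraction of the traces.
      - name: sample-some
        type: probabilistic
        probabilistic:
          sampling_percentage: 10

exporters:
  otlp:
    # Placeholder address of the telemetry backend.
    endpoint: telemetry-backend.example.com:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```

Because the load balancing layer sends all spans of a trace ID to the same backend, each processing collector sees complete traces and can apply these policies on its own.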

The load balancing exporter supports two sources of backend addresses: a static list provided directly as part of the configuration, or a DNS A record that is queried periodically. Whenever the list of backends changes, at most 30% of the trace IDs are allocated a new backend. This keeps the load similar across all backends.
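For comparison with the DNS-based setup, a static list of backends might look like the fragment below. The `processing-collector-N` hostnames are placeholders for the example and need to be replaced with the addresses of your own processing collectors.

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
    resolver:
      static:
        # Placeholder addresses of the processing layer collectors.
        hostnames:
          - processing-collector-1:4317
          - processing-collector-2:4317
          - processing-collector-3:4317
```

A static list suits a processing layer that rarely changes size, since every change to the list requires a configuration update on each load balancing collector.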

When using the DNS A record resolver, each load balancing collector might perform the A query at a different time, so the instances can temporarily have different views of the cluster. As a result, spans for the same trace ID might arrive at different collectors for a moment, until all load balancers have the same cluster view again. If you have a highly elastic cluster of processing collectors, set a low interval for the DNS query.
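A minimal sketch of tuning the DNS resolver for an elastic processing layer follows. The `interval` and `timeout` values are illustrative assumptions, not recommendations from this document. Pick values that match how quickly your processing collectors scale up and down.

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
    resolver:
      dns:
        hostname: my-otelcol.observability.svc.cluster.local
        # Illustrative value: how often the A record is queried again.
        interval: 2s
        # Illustrative value: how long to wait for each DNS query.
        timeout: 1s
```

A shorter interval narrows the window in which load balancing collectors disagree about the list of backends, at the cost of more DNS queries.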

Example of a load balancing OpenTelemetry Collector

yaml
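# Receive traces over OTLP gRPC.
receivers:
  otlp:
    protocols:
      grpc:

# No processing in this layer; it only routes data.
processors:

exporters:
  loadbalancing:
    # Send data to the backends over OTLP.
    protocol:
      otlp:
    # Discover the backends through the A records of this hostname.
    resolver:
      dns:
        hostname: my-otelcol.observability.svc.cluster.local

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [loadbalancing]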