Grafana Beyla 1.0 release: zero-code instrumentation for application telemetry using eBPF

• 14 Nov, 2023 • 7 min

Just two months after introducing the public preview of Grafana Beyla, we are excited to announce the general availability of the open source project with the release of Grafana Beyla 1.0 at ObservabilityCON 2023 today.

We’ve worked hard in the last two months to stabilize, stress test, and refine the features that were part of the public preview of this open source eBPF auto-instrumentation tool. But we’ve also added a couple of noteworthy improvements, which we’d like to showcase today.

Before we go into describing the new major features in Beyla 1.0, let’s start by reviewing the basics about Grafana Beyla.

What is Grafana Beyla?

Grafana Beyla is a vendor-agnostic, open source eBPF auto-instrumentation tool for OpenTelemetry and Prometheus applications that lets you easily get started with application observability. We use eBPF to automatically inspect application executables and the OS networking layer, allowing us to capture essential application observability events for HTTP/S and gRPC services. From these captured eBPF events, we produce OpenTelemetry spans and Rate-Errors-Duration (RED) metrics.
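
To make the RED idea concrete, here is a minimal Python sketch of how Rate, Errors, and Duration could be aggregated from a batch of captured request events. It is not Beyla's implementation: the RequestEvent fields, the red_metrics helper, and the choice to count only 5xx responses as errors are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RequestEvent:
    """One captured request; field names are illustrative, not Beyla's schema."""
    route: str
    status_code: int
    duration_ms: float


def red_metrics(events, window_seconds):
    """Aggregate Rate, Errors, and Duration per route over one time window."""
    by_route = {}
    for ev in events:
        by_route.setdefault(ev.route, []).append(ev)

    summary = {}
    for route, evs in by_route.items():
        # Assumption: only 5xx responses count as errors.
        errors = sum(1 for ev in evs if ev.status_code >= 500)
        summary[route] = {
            "rate_per_s": len(evs) / window_seconds,
            "error_ratio": errors / len(evs),
            "avg_duration_ms": sum(ev.duration_ms for ev in evs) / len(evs),
        }
    return summary


events = [
    RequestEvent("/users/{id}", 200, 12.0),
    RequestEvent("/users/{id}", 500, 48.0),
    RequestEvent("/health", 200, 1.0),
]
summary = red_metrics(events, window_seconds=10)
```

A real pipeline would typically export counters and histograms and let the metrics backend compute rates and percentiles, rather than averaging in process as this sketch does.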

As with most eBPF tools, all data capture and instrumentation occurs without any modifications to your application. This means that you can quickly instrument your applications without changing a single line of code. There’s also no need to add an application technology instrumentation agent or an instrumentation SDK. All you have to do is deploy Beyla in your environment, and you’ll instantly get telemetry data for your application.

You will need backend databases to store your telemetry data, and while we hope that you’ll choose our Grafana tools for storage — such as Grafana Tempo for traces and Grafana Mimir for metrics — Beyla is compatible with OpenTelemetry and Prometheus, so you are free to use whatever database backend works best for your stack.

What’s new in Grafana Beyla 1.0?

Multi-process support

The most notable feature of our first general availability release is multi-process support. When we released the public preview of Grafana Beyla, we required a 1:1 mapping between an application service and a Beyla instance. This meant that you needed a separate Beyla instance, deployed as a “sidecar,” for each of the applications you were instrumenting. While this worked in certain scenarios, it could also cause unnecessary resource consumption, especially in microservice environments.

With Beyla 1.0, you can configure a single Beyla instance to monitor many different applications, and in fact, they can be a diverse set of applications. There’s no restriction on the programming language these applications are built with; a single Beyla instance can monitor applications written in Go, C++, Rust, Python, and others.

To support this, we’ve introduced a powerful new way to express which applications you want to monitor in your cluster, and we’ve provided a rich set of pattern-matching options on executable name and port ranges. The new discovery section of the configuration file lets you define which services you’d like to instrument, for example:

discovery:
 services:
   - namespace: user-management
     name: signup-service
     exe_path_regexp: node
   - namespace: payment-processing
     name: card-service
     open_ports: 8080-8090
   - namespace: user-management
     exe_path_regexp: ((rust-service)|(go-service)|(python))
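
The selection logic behind a configuration like this can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Beyla's code: the ServiceSelector class and the matches and first_match helpers are hypothetical, and the rule that every criterion set on an entry must match is an assumption rather than something this post states.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceSelector:
    """Mirrors one entry under discovery.services in the example above."""
    namespace: str
    name: Optional[str] = None
    exe_path_regexp: Optional[str] = None
    open_ports: Optional[str] = None  # e.g. "8080-8090"


def port_in_spec(port, spec):
    """Check a port against a spec such as "8080" or "8080-8090"."""
    low, _, high = spec.partition("-")
    return int(low) <= port <= int(high or low)


def matches(selector, exe_path, listening_ports):
    """Assumed semantics: every criterion that is set must match."""
    if selector.exe_path_regexp is not None:
        if re.search(selector.exe_path_regexp, exe_path) is None:
            return False
    if selector.open_ports is not None:
        if not any(port_in_spec(p, selector.open_ports) for p in listening_ports):
            return False
    return True


selectors = [
    ServiceSelector("user-management", "signup-service", exe_path_regexp="node"),
    ServiceSelector("payment-processing", "card-service", open_ports="8080-8090"),
    ServiceSelector("user-management",
                    exe_path_regexp="((rust-service)|(go-service)|(python))"),
]


def first_match(exe_path, listening_ports):
    """Return the first selector that accepts the process, or None."""
    for selector in selectors:
        if matches(selector, exe_path, listening_ports):
            return selector
    return None
```

In this sketch, first_match("/usr/bin/node", [3000]) selects the signup-service entry, while a process that matches no entry is left uninstrumented.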

Since Beyla now has multi-process support, you can easily deploy it as a DaemonSet if you are using Kubernetes. This avoids much of the extra resource consumption of “sidecar” deployments, and it greatly simplifies Kubernetes deployments. (In other words, you don’t need to restart your services anymore when deploying Beyla.)

Another smaller addition in this space is that Beyla now detects the programming language of the applications it’s instrumenting. The public preview release was able to detect if an application is written in Go, and now we’ve expanded this language detection to applications written in Rust, Node.js, Java, Python, .NET, and Ruby. The detected programming language is then set as part of the trace resource attributes, so that application observability solutions can correctly display this information for the end users.

Heuristic HTTP path decorator

One of the concerns with monitoring solutions is always cost. How much would it cost to collect and store all of the application metrics? How much would it cost in compute resources to retrieve this information? One topic that often comes up when discussing monitoring costs is “cardinality explosion.” Certain pieces of the collected metrics information can have high cardinality, such as the HTTP URL path, which severely increases the cost of the monitoring solution, both in storage and in retrieval efficiency.

To combat cardinality explosion in metrics storage solutions, OpenTelemetry defines two separate representations of certain attributes: a high cardinality version and a low cardinality version. With the HTTP URL path, the low cardinality version, named http.route, is usually derived from the original URL path with its high cardinality segments removed. For example, an http.route of /users/{id} can represent the URL paths /users/123 and /users/456.

Various OpenTelemetry language-specific instrumentations handle the transformation of the URL path to a low cardinality route differently. Some infer this information from the available metadata in the application. Some use an asterisk (*) when they can’t infer it, while others simply use the original URL path.

By default, Beyla will use an asterisk (*) for the low cardinality HTTP route, which is a safe choice with regard to keeping costs under control. However, not having a user-friendly HTTP route can make monitoring less useful, which is why in Beyla 1.0 we are introducing the heuristic routes decorator. This new routes decorator can automatically turn high cardinality URL paths into low cardinality routes, without any extensive user configuration. The heuristic routes decorator has a purpose-built classifier, which is able to detect if a URL segment is likely an ID or a meaningful monitoring path component.

For example, these two Google Docs style URL paths below:

document/d/CfMkAGbE_aivhFydEpaRafPuGWbmHfG/edit (no numbers in the ID)
document/d/C2fMkAGb3E_aivhFyd5EpaRafP123uGWbmHfG/edit

will be automatically converted into a low cardinality route:

document/d/*/edit
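
To give a feel for what such a classifier does, here is a deliberately simple Python sketch. It is not Beyla's purpose-built classifier: the looks_like_id rules (digits, or long mixed-case tokens) and the 16-character threshold are assumptions chosen only so that the two example paths above collapse to the same route.

```python
import re


def looks_like_id(segment):
    """Toy rules for deciding whether a path segment is an identifier."""
    if re.fullmatch(r"v\d+", segment):
        return False  # version markers such as "v1" are meaningful
    if any(ch.isdigit() for ch in segment):
        return True  # numeric or alphanumeric IDs, UUIDs, hashes
    has_upper = any(ch.isupper() for ch in segment)
    has_lower = any(ch.islower() for ch in segment)
    # Assumption: long mixed-case tokens are rarely human-chosen path names.
    return len(segment) >= 16 and has_upper and has_lower


def heuristic_route(path):
    """Replace ID-like segments with "*" and keep the rest unchanged."""
    return "/".join("*" if looks_like_id(s) else s for s in path.split("/"))
```

Rules this crude misclassify plenty of real paths (a short, all-lowercase slug used as an ID passes straight through), which is exactly why a purpose-built classifier is worth having.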

You can enable this new routes decorator by changing the unmatched mode of the URL route decorator section. For example:

routes:
 patterns:
   - /users/{id}
 ignored_patterns:
   - /metrics
   - /health
 unmatched: heuristic
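
One way to read this configuration is as a lookup order: ignored paths first, then explicit patterns, then the unmatched fallback. The Python sketch below illustrates that reading under stated assumptions. The compile_pattern and resolve_route helpers are hypothetical, treating ignored paths as dropped (returning None) is an assumption, and digits_to_star is only a crude stand-in for a real heuristic mode.

```python
import re


def compile_pattern(pattern):
    """Turn "/users/{id}" into a regex where {name} matches one segment."""
    parts = re.split(r"\{[^/{}]+\}", pattern)
    return re.compile("[^/]+".join(re.escape(p) for p in parts))


def resolve_route(path, patterns, ignored_patterns, unmatched):
    """Return the route for a path, or None when the path is ignored."""
    for pattern in ignored_patterns:
        if compile_pattern(pattern).fullmatch(path):
            return None  # assumption: ignored paths produce no route at all
    for pattern in patterns:
        if compile_pattern(pattern).fullmatch(path):
            return pattern
    return unmatched(path)


def wildcard(path):
    """Stand-in for the default behavior described earlier: always "*"."""
    return "*"


def digits_to_star(path):
    """Crude stand-in for a heuristic mode: only numeric segments collapse."""
    return "/".join("*" if s.isdigit() else s for s in path.split("/"))


config = {
    "patterns": ["/users/{id}"],
    "ignored_patterns": ["/metrics", "/health"],
}
```

With the wildcard fallback, /orders/42/items collapses to *, while a heuristic-style fallback keeps the readable parts and yields /orders/*/items.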

Built-in support for Grafana Cloud

While Beyla is vendor agnostic, we have made it very easy for you to get started with Grafana Cloud. We’ve added new ease-of-use options to push telemetry data directly to your Grafana Cloud OTLP endpoint.

Given a zone, instance ID, and API token, telemetry data can be sent directly to Grafana Cloud without involving an agent or a collector. These values can be set via the environment variables GRAFANA_CLOUD_ZONE, GRAFANA_CLOUD_INSTANCE_ID, and GRAFANA_CLOUD_API_KEY.

For example:

export GRAFANA_CLOUD_ZONE=prod-us-east-0
export GRAFANA_CLOUD_INSTANCE_ID=123456
export GRAFANA_CLOUD_API_KEY=a-secret-token

For details on how to obtain those values, refer to Send data using OpenTelemetry Protocol.
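
As a rough picture of what these three values amount to, the Python sketch below derives an OTLP endpoint and an authorization header from them. The URL template and the Basic auth scheme (instance ID as the user, API token as the password) are assumptions based on how Grafana Cloud's OTLP gateway is commonly configured, not a description of Beyla's internals, and grafana_cloud_otlp is a hypothetical helper.

```python
import base64


def grafana_cloud_otlp(env):
    """Derive an OTLP endpoint and auth header from the three variables.

    The URL template and Basic auth scheme are assumptions for illustration,
    not taken from Beyla's source code.
    """
    zone = env["GRAFANA_CLOUD_ZONE"]
    instance_id = env["GRAFANA_CLOUD_INSTANCE_ID"]
    api_key = env["GRAFANA_CLOUD_API_KEY"]
    credentials = f"{instance_id}:{api_key}".encode("utf-8")
    return {
        "endpoint": f"https://otlp-gateway-{zone}.grafana.net/otlp",
        "headers": {
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        },
    }


settings = grafana_cloud_otlp({
    "GRAFANA_CLOUD_ZONE": "prod-us-east-0",
    "GRAFANA_CLOUD_INSTANCE_ID": "123456",
    "GRAFANA_CLOUD_API_KEY": "a-secret-token",
})
```

In practice you would pass os.environ instead of a literal dictionary and keep the API token out of source control.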

Beyond Grafana Beyla 1.0: distributed traces

There are many new features we see coming soon after our GA release, ranging from small usability improvements to major undertakings, like adding support for monitoring different protocols other than HTTP and gRPC.

One thing that’s immediately on our radar is distributed traces. Currently, Beyla can only produce single-span traces, which are limited in utility compared to distributed traces. Even so, the spans Beyla produces can be used with the “span metrics processor” and with the “service graph processor” in Grafana Tempo and the OpenTelemetry Collector.

Beyla works out-of-the-box with Grafana Cloud’s Application Observability solution, which is based on Grafana Tempo’s span metrics. However, these trace spans produced by Beyla are not yet a complete tracing solution. We inherit the trace IDs from incoming server calls, but we don’t propagate trace IDs to outgoing calls yet, so you will only see a partial trace in your tracing product of choice.
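
The difference between inheriting and propagating a trace ID can be sketched in a few lines of Python. This assumes the W3C Trace Context traceparent header as the carrier, which this post does not specify, and that header names arrive lowercased. The start_server_span and outgoing_headers helpers are hypothetical and do not reflect Beyla's code.

```python
import re
import secrets

# W3C traceparent layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def start_server_span(headers):
    """Create a server span, inheriting the trace ID when one arrives."""
    match = TRACEPARENT.match(headers.get("traceparent", ""))
    if match and match.group(2) != "0" * 32:
        trace_id, parent_span_id = match.group(2), match.group(3)
    else:
        # No usable incoming context: start a brand-new trace.
        trace_id, parent_span_id = secrets.token_hex(16), None
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "span_id": secrets.token_hex(8),
    }


def outgoing_headers(span, propagate):
    """Without propagation, downstream services start unrelated traces."""
    if not propagate:
        return {}
    return {"traceparent": f"00-{span['trace_id']}-{span['span_id']}-01"}
```

With propagate=False, which corresponds to the situation described above, the downstream service receives no context and its spans land in a separate trace. That is why only a partial trace shows up.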

We already have a distributed traces prototype working for Go applications, but we want to see what we can accomplish with other programming languages. Automatically propagating trace context from incoming requests to outgoing calls can be challenging in certain language technologies, or downright impossible with certain reactive programming frameworks. However, we’re willing to give this all a try, and if we can solve some of these challenges, it would be nothing short of exhilarating.

To learn more, you can find Grafana Beyla on GitHub and check out our Grafana Beyla documentation, which includes a guide to deploying Beyla in Kubernetes and more.
