
Building a synthetic monitoring solution for Jaeger with Grafana k6
Wilfried Roset is an engineering manager who leads an SRE team and a Grafana Champion. Wilfried currently works at OVHcloud, where he focuses on prioritizing sustainability, resilience, and industrialization to guarantee customer satisfaction.
As an SRE Engineering Manager and a Grafana Champion, I believe a resilient and sustainable cloud experience begins with strong observability. That effort starts by ensuring internal services meet the highest operational standards — which is why my engineering team heavily relies on Grafana, Mimir, Pyroscope, OpenSearch, and Jaeger to drive our observability strategy.
A foundational component of this practice is distributed tracing. Tools like Jaeger allow us to map the lifecycle of a request across our complex microservices architecture, instantly pinpointing latency bottlenecks. With an ever-increasing number of microservices, our teams have rapidly adopted tracing, making instrumentation a standard practice. This success, however, created a new challenge: significant pressure on the tracing backend itself.
Our tracing platform, which relies on multiple Jaeger backends supported by OpenSearch clusters managed by our in-house experts, currently handles over 700,000 spans per second on the main cluster. While service availability is excellent (>99.99%), our users were reporting noticeable slowness on the read path. This is a critical issue: the deep visibility that tracing offers is useless if the platform providing it is itself slow. When the 75th percentile read latency makes triaging incidents cumbersome, our ability to maintain a high Quality of Service (QoS) is fundamentally compromised.
The challenge wasn't just fixing the slowness; it was accurately measuring it. Our standard service metrics showed the backend was healthy, but the user experience was clearly poor. The key question was: How do we get an unambiguous, verifiable measure of the true user experience to set reliable Service Level Objectives (SLOs) and prioritize engineering effort?
Synthetic monitoring and Grafana k6
This is precisely where synthetic monitoring enters the picture. It flips the script — instead of passively collecting metrics from live, potentially compromised traffic, synthetic monitoring allows you to simulate a user's journey to gather unambiguous facts about the service's quality. It helps you continuously monitor your production environment from the outside world, using checks that run at frequent intervals, such as every five minutes, with a virtual user.
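In k6 terms, "a virtual user running a check at a fixed interval" can be expressed with scenario options. The sketch below uses k6's standard `constant-arrival-rate` executor; the scenario name `jaeger_read_path` is illustrative, not something from our actual setup:

```javascript
// Sketch: k6 scenario options for "one check every five minutes, 24/7".
// In a real k6 script this object would be `export const options = {...}`.
const options = {
  scenarios: {
    jaeger_read_path: {            // illustrative scenario name
      executor: 'constant-arrival-rate', // fixed iteration rate
      rate: 1,                     // one iteration...
      timeUnit: '5m',              // ...every five minutes
      duration: '24h',             // keep probing for a day
      preAllocatedVUs: 1,          // a single virtual user is enough
    },
  },
};
```

An arrival-rate executor decouples the check frequency from how long each check takes, which keeps the probing cadence steady even when the service under test is slow.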
For this task, we turned to Grafana k6, the open source performance testing tool that helps you verify system performance under extreme load. k6 also works well to enable synthetic monitoring as a distinct, continuous practice.
This approach allows us to be alerted the instant a real-world, single-user issue occurs. By deploying lightweight k6 probes, we can continuously query the Jaeger read path and establish an independent, consistent signal of its true latency, which is essential to set reliable SLOs and prioritize engineering improvements.
Note: While this post explores how to use Grafana k6 to support synthetic monitoring use cases, Grafana Labs also offers Grafana Cloud Synthetic Monitoring, a fully managed, cloud-hosted solution for proactively monitoring critical user journeys. Grafana Cloud Synthetic Monitoring is powered by Grafana k6 OSS.
Key benefits for the SRE team
Using k6 provided three immediate benefits:
- Unambiguous data: We moved from anecdotal user reports to fact-based metrics for latency, allowing us to quantify the exact user-facing problem.
- SLO validation: The clean, reliable data from the synthetic checks allowed us to set and monitor realistic SLOs for Jaeger's read path.
- Independent signal: By running the k6 probes outside of the core observability platform, we gained a fully independent measure of health, ensuring our monitoring wasn't compromised by the very service it was meant to monitor.
Architecture and tech setup
To achieve a clean and reliable signal, our synthetic monitoring setup is straightforward, prioritizing high fidelity to the user experience.
1. k6 agent: We deploy k6 within a Docker container. This container is strategically positioned to interact directly with the Jaeger backend we are investigating.
2. The check: The k6 script is designed to mimic a real-world user interaction: it first executes an HTTP call to search for traces (the most common operation), and then performs subsequent GET requests to retrieve the full trace data. This two-step process on Jaeger's read path accurately assesses performance from an end-user point of view. Depending on your setup, you might trigger the job manually, schedule it cron-style, or simply run it continuously with a long duration. Here's a simplified look at the k6 script logic:
import http from 'k6/http';
import { check } from 'k6';
import { URL } from 'https://jslib.k6.io/url/1.0.0/index.js';
import { randomItem } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';
// ... other imports
// ... environment variables and constants

export const options = {
  duration: '24h',
};

export default function () {
  const service = randomItem(SERVICES);
  const lookback = randomItem(LOOKBACKS);
  searchTraces(service, lookback);
}

function searchTraces(service, lookback) {
  // ... logic for searching traces and performing HTTP GET
  // ... error checks

  // This loop fetches individual traces from the search results
  const responses = http.batch(batchGetTraceByID);
  for (let i = 0; i < responses.length; i++) {
    check(responses[i], {
      'get traces by ID was 200': (r) => r.status === 200,
    });
  }
}
Note: In Grafana Cloud Synthetic Monitoring, you can simply paste in a script like the one above and select your probe location to complete your setup.
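For the curious, the elided search logic boils down to calling Jaeger's query API, which exposes GET /api/traces for searches and GET /api/traces/{traceID} for single-trace lookups. Below is a plain-JavaScript sketch (runnable outside k6) of how such URLs can be built; the base URL, service name, and defaults are illustrative assumptions, not values from our setup:

```javascript
// Sketch of URL construction for Jaeger's query-service HTTP API.
// JAEGER_BASE_URL is an illustrative assumption; adjust to your deployment.
const JAEGER_BASE_URL = 'http://jaeger-query:16686';

// Search endpoint: GET /api/traces?service=...&lookback=...&limit=...
function buildSearchUrl(service, lookback, limit = 20) {
  const url = new URL('/api/traces', JAEGER_BASE_URL);
  url.searchParams.set('service', service);
  url.searchParams.set('lookback', lookback);
  url.searchParams.set('limit', String(limit));
  return url.toString();
}

// Single-trace endpoint: GET /api/traces/{traceID}
function buildTraceUrl(traceID) {
  return new URL(`/api/traces/${traceID}`, JAEGER_BASE_URL).toString();
}

console.log(buildSearchUrl('frontend', '1h'));
// http://jaeger-query:16686/api/traces?service=frontend&lookback=1h&limit=20
```

In the k6 script itself, the jslib `URL` import shown above fills the same role as the WHATWG `URL` global used here.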
3. Data ingestion: k6 is configured to output its resulting metrics, including latency histograms and error rates, via Prometheus remote write. The command to launch k6 must specify the output extension to use, and looks similar to this:
export K6_PROMETHEUS_RW_SERVER_URL=http://prometheus:9090/api/v1/write
export K6_PROMETHEUS_RW_TREND_AS_NATIVE_HISTOGRAM=true
k6 run --out=experimental-prometheus-rw synthetic-monitoring.js
4. Storage and visualization: This data is then ingested and stored in Mimir, providing us with a highly available, long-term source of truth for the Jaeger service's quality of experience. Finally, we visualize the results using Grafana.
This diagram also illustrates how the Jaeger servers themselves are instrumented, and how we observe the underlying infrastructure and services mentioned above.

Key findings and next steps
With the synthetic monitoring data flowing into Mimir, the results were immediate and unambiguous. By using the k6 Prometheus dashboard in Grafana, the problem was quantified: the 75th percentile read latency was validated to be nearly 8 seconds.
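As a refresher on what that percentile means, here is a tiny, illustrative JavaScript sketch of a nearest-rank p75 computation over raw latency samples. In practice this aggregation is done by Prometheus/Mimir histogram queries over the k6 metrics, not hand-rolled code, and the sample values below are made up:

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p% of all samples are less than or equal to it.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Illustrative latencies (seconds) for 8 synthetic search requests.
const latencies = [0.9, 1.2, 2.5, 3.1, 4.8, 6.0, 7.9, 8.2];
console.log(percentile(latencies, 75)); // 6.0 — 75% of requests were at or below 6s
```

A p75 near 8 seconds therefore means one in four user-facing trace searches took longer than that, which is exactly the kind of tail behavior averages hide.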
This solid, fact-based evidence—derived from our synthetic agent—provided the necessary baseline to prioritize performance improvements over simply chasing anecdotal reports.

This exercise demonstrated that you are just a handful of JavaScript lines away from deploying a high-fidelity synthetic monitoring agent. Grafana k6 provides a powerful way to flip the script on monitoring, giving you the clean signals necessary to improve your services. Synthetic monitoring is the essential bridge between internal service metrics and the actual user experience.
What's next
One thing is certain: we’ll continue to deepen our adoption of, and reliance on, distributed tracing. Armed with these solid facts and this monitoring, we will invest time to improve the user experience and lower the read latency.
We already have a couple of solutions underway, such as deploying a tail-sampling processor (using an agent like Alloy) or fine-tuning the underlying OpenSearch cluster to optimize index querying.
To get started with this approach, check out the Grafana k6 documentation and the official k6 Prometheus dashboard for a template to quickly visualize your own synthetic checks.
Want to chat about SRE, observability, or getting started with synthetic monitoring? Feel free to reach out to me on LinkedIn or on Grafana’s Slack!


