## What you get
| Feature | Description |
|---|---|
| Flame graphs | Visualize where code spends CPU/memory |
| Continuous | Always-on profiling, not just during incidents |
| Pyroscope | Continuous profiling backend |
| Trace correlation | Link profiles to specific requests |
| Diff analysis | Compare profiles across deployments |
## Questions answered
| With continuous profiling, you can answer… |
|---|
| Which function is consuming the most CPU time? |
| What code path is causing excessive memory allocations? |
| Did this deployment make the service slower or faster? |
| Where exactly should we optimize to reduce cloud costs? |
| What changed between yesterday’s baseline and today? |
## Problems solved
| Problem | Solution |
|---|---|
| “Something is slow” but you don’t know where | Flame graphs show the exact code paths. |
| High CPU/memory but unclear cause | Continuous profiles reveal hot spots. |
| Profiling only during incidents | Always-on catches regressions early. |
| Can’t compare before/after deployments | Diff profiles show what changed. |
## The shift from Level 2
| Level 2 (Traces) | Level 3 (Profiles) |
|---|---|
| “The database call is slow” | “This function processing the results is slow” |
| Service-level timing | Code-level timing |
| Where in the architecture | Where in the code |
## Script
Distributed tracing tells you which operation is slow. Continuous profiling tells you which line of code is slow.
You’ve probably used a profiler before, maybe during development when hunting down a performance issue. Continuous profiling is that, but running all the time in production. It captures flame graphs (visualizations of where your code spends CPU and memory) continuously, not just when you remember to turn it on.
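The mechanism behind those flame graphs can be illustrated with a toy sampling profiler. This is a minimal sketch, not how Pyroscope is actually implemented: a background thread periodically captures the main thread's call stack and counts how often each collapsed stack appears, which is exactly the raw data a flame graph renders.

```python
import collections
import sys
import threading
import time

def sample_main_thread(counts, interval=0.01, duration=0.25):
    """Toy sampler: every `interval` seconds, record the main thread's
    collapsed call stack ("<module>;handler;parse"). Stacks sampled most
    often become the widest bars in a flame graph."""
    main_id = threading.main_thread().ident
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        frame = sys._current_frames().get(main_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        counts[";".join(reversed(stack))] += 1
        time.sleep(interval)

def slow_endpoint():
    # Stand-in for a hot code path (hypothetical name); a real sampler
    # catches CPU-bound work the same way.
    time.sleep(0.3)

counts = collections.Counter()
sampler = threading.Thread(target=sample_main_thread, args=(counts,), daemon=True)
sampler.start()
slow_endpoint()
sampler.join()
# Most samples land inside slow_endpoint, so its stack dominates the counts.
```

Production profilers do this continuously at low overhead and ship the aggregated stacks to a backend instead of keeping them in memory.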
The backend for this is Pyroscope, which Grafana Labs acquired and integrated into Grafana Cloud. You can correlate profiles to specific traces, so you can see the flame graph for that exact slow request.
This answers questions that were previously impossible to answer without reproducing the issue. Which function is consuming CPU? What code path is allocating all that memory? Did this deployment make things slower? You can even diff profiles between deployments to see exactly what changed.
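The diffing idea can be sketched in a few lines. Assuming two profiles exported in collapsed-stack form (stack string mapped to sample count; the stacks and numbers here are invented), a diff is a per-stack subtraction sorted by regression size:

```python
from collections import Counter

def diff_profiles(before, after):
    """Return (stack, change_in_samples) pairs, biggest regressions first.
    Positive change = the stack got more samples after the deployment."""
    delta = Counter(after)
    delta.subtract(before)
    return sorted(delta.items(), key=lambda kv: -kv[1])

# Hypothetical sample counts from two deployments of the same service.
before = {"main;handle;query_db": 120, "main;handle;render": 40}
after  = {"main;handle;query_db": 125, "main;handle;render": 210}

for stack, change in diff_profiles(before, after):
    print(f"{stack} {change:+d}")
# main;handle;render +170   <- the regression to investigate
# main;handle;query_db +5
```

Grafana's diff view does the same comparison visually, coloring the flame graph by which frames grew or shrank between the two profiles.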
The precision here is remarkable. Instead of “the database call is slow,” you’re saying “this function that processes the database results is slow because it’s doing inefficient string concatenation.”
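That kind of finding translates directly into a fix. A hypothetical before/after of the string-concatenation hot spot (function and variable names invented for illustration):

```python
def build_report_slow(rows):
    # The hot spot a flame graph would surface: each += copies the
    # entire string built so far, so this is O(n^2) in total characters.
    report = ""
    for row in rows:
        report += f"{row}\n"
    return report

def build_report_fast(rows):
    # O(n): collect the pieces and join once.
    return "".join(f"{row}\n" for row in rows)
```

Both functions produce identical output; after the change, a deployment diff would show `build_report_slow`'s frame disappearing from the flame graph.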
