2025 observability predictions and trends from Grafana Labs
From AI to eBPF, 2024 reshaped the observability landscape. As we peer into 2025, Grafana Labs’ experts predict another year of innovation that will redefine how teams understand and optimize their systems, from profiling to platform engineering.
Their insights align with what the community is saying, according to early responses from our third annual Observability Survey. Do you agree or disagree with the trends our team believes will transform the world of observability next year? Let us know by making your voice heard before the survey closes on Dec. 31!
And when you’re done, come back here to get an early look at the survey results and to find out what our own observability experts see on the horizon for 2025.
Profiles and traces converge
Profiling had a big year in 2024 with the addition of continuous profiling to OpenTelemetry. And despite being a fairly nascent technology, about 13% of our survey respondents are already using profiling tools in production. We expect that number to grow in 2025 as engineers realize the full potential of profiling when they use it in tandem with traces.
Traces have unique benefits, but expect to see increased convergence with profiles as organizations seek deeper insights into application performance, said Ryan Perry, Principal Product Manager. That’s because traces excel at showing end-to-end request flows, while profiles reveal detailed system resource usage.
“By combining these tools, teams gain visibility into their applications that manually added spans never could,” Perry said. “For example, when a trace shows a 400ms span, corresponding profile data can reveal exactly which code executed during that time period, down to the specific functions and their resource consumption. This allows teams to pinpoint performance bottlenecks with surgical precision, leading to more efficient optimization efforts and reduced operational costs.”
As profiling becomes stable in OpenTelemetry, forward-thinking organizations will do more than simply collect traces and profiles. “They’ll be treating them as interconnected, contextual data streams that provide a holistic view of system performance and efficiency,” he added.
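To make Perry's 400ms example concrete, here is a minimal sketch of the idea: take the time window of a span and total up the profile samples that fall inside it, grouped by function. The `Span` and `ProfileSample` types, the function names, and the sample data are all hypothetical and chosen for illustration. This is not the OpenTelemetry data model or any Grafana API, just one simple way to picture the correlation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical, simplified span: just an ID and a time window.
    span_id: str
    start_ms: int
    end_ms: int


@dataclass
class ProfileSample:
    # Hypothetical, simplified profile sample: when it was taken,
    # which function was running, and how much CPU time it accounts for.
    timestamp_ms: int
    function: str
    cpu_ms: int


def cpu_by_function(span, samples):
    """Sum CPU time per function for samples that fall inside the span."""
    totals = Counter()
    for sample in samples:
        if span.start_ms <= sample.timestamp_ms < span.end_ms:
            totals[sample.function] += sample.cpu_ms
    # Heaviest functions first.
    return totals.most_common()


span = Span("checkout", start_ms=1_000, end_ms=1_400)  # a 400 ms span
samples = [
    ProfileSample(1_010, "parse_request", 10),
    ProfileSample(1_100, "serialize_cart", 120),
    ProfileSample(1_250, "serialize_cart", 90),
    ProfileSample(1_390, "write_response", 15),
    ProfileSample(1_500, "gc_sweep", 40),  # outside the span, so ignored
]

hotspots = cpu_by_function(span, samples)
print(hotspots)
```

In this made-up data, `serialize_cart` accounts for most of the CPU time inside the span, which is the kind of detail a trace alone would not show. Real tooling typically links the two signals through shared identifiers rather than timestamps alone, so treat the time-window match here as a simplifying assumption.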
AI/ML won’t replace engineers but give them superpowers
Despite the hype, AI/ML has not yet proven to be the silver bullet for observability challenges. While only 18% of surveyed participants consider AI/ML capabilities crucial when evaluating new observability solutions, those embracing AI technologies recognize their promise for things like accelerating root cause analysis and intelligent alerting.
AI/ML continues to be an important emerging tool, but it should no longer be seen as a catch-all solution, said Quynton Johnson, Product Marketing Lead. Instead, expect AI/ML efforts to home in on specific use cases that deliver tangible value.
“AI/ML excels at pattern recognition, enabling the automation of time-consuming tasks and identifying cost-saving opportunities,” Johnson said. “Rather than replacing engineers, our solutions will augment their capabilities by reducing noise and providing well-reasoned recommendations. This frees up human experts to focus on complex decision-making where their expertise is most valuable.”
Cloud repatriation won’t be the key to cost savings (for most)
Year-over-year, cost continues to be one of the most important criteria when selecting a new observability tool and, according to our data, cost will still be top of mind for organizations of all sizes and across all industries in 2025. One approach to cutting costs that people are talking about — but not executing on — is moving workloads from cloud to on-premises. However, that won’t be feasible (or strategic) for most.
Yes, certain organizations, like large-scale social media networks with predictable workloads, might benefit from hybrid or on-prem solutions. However, for most organizations, the time, money, resources, and overall complexity of full-scale cloud repatriation will outweigh any savings, said Richard “RichiH” Hartmann, Director of Community & Office of the CTO.
“They should look into implementing a targeted optimization approach,” Hartmann said. “Instead of abandoning their cloud infrastructure, they can optimize it for cost, performance, and scalability.”
“This requires a mix of FinOps, leveraging the right tools, and continuous monitoring of infrastructure economics,” he added, “but teams that lean into this approach will see meaningful cost savings without sacrificing the agility and scalability that drew them to cloud platforms in the first place.”
Platform engineering’s next frontier: eBPF
Platform teams are experiencing significant growth, with nearly 25% of those surveyed working in this role. As the importance of platform teams increases, their responsibilities will expand to encompass emerging tools and technologies — like eBPF.
“What started as a trendy technology will become the backbone of modern platform engineering, fundamentally reshaping how organizations handle observability and security,” said Nikola Grcevski, Principal Software Engineer, adding that eBPF is on “the cusp of a major transformation.”
“One significant shift will be the transition of instrumentation responsibility from application teams to platform teams. We’re already seeing OpenTelemetry integrate with eBPF, with updates like the OpenTelemetry eBPF Profiling donation, which is already helping drive adoption of eBPF,” Grcevski said. “Moving forward we’ll see more opportunities for eBPF to create a seamless bridge between system-level data and application telemetry while standardizing how platforms collect and process observability data.”
Open source will continue to be the cornerstone of observability
OpenTelemetry and Prometheus continue to gain traction, with more than 50% of respondents reporting that they’ve increased their usage of both projects over the past year. With similar growth in years past, open source observability shows no signs of slowing down.
Open source isn’t just a cost-saving strategy; it’s becoming the primary vehicle for technological innovation in observability, according to Marylia Gutierrez, Staff Software Engineer. OpenTelemetry, in particular, is transforming how organizations approach instrumentation by providing a vendor-neutral, unified approach to collecting telemetry data across different systems and programming languages.
“As more organizations recognize the strategic value of OTel, we’ll see continued investment, deeper integrations with tools like Prometheus and Grafana, and even wider spread adoption. There are a few promising areas that OTel is poised to impact in the coming months,” Gutierrez said.
“One is streamlined troubleshooting — as OpenTelemetry enables teams to correlate metrics, logs, and traces seamlessly, we’ll see accelerated root cause analysis and improved system reliability,” Gutierrez added. “Another is developer productivity — as standardized instrumentation eliminates the overhead of maintaining custom telemetry solutions, teams will be free to focus on building features. And last is the creation of libraries that provide users with the observability they’ve been seeking, such as databases, mobile applications, profiling, among many others.”
Join us to see how 2025 plays out
What do you think of our predictions? Obviously no one knows what the future holds, but we’re excited to see what 2025 has in store for observability — for us and for you! If you want to join us on this journey, we’ll be covering these topics and more at GrafanaCON 2025, which will run from May 6 to May 8 in Seattle.
And if you can’t make it to the Pacific Northwest, keep an eye out for one of our ObservabilityCON on the Road events potentially coming to a city near you.