Observability Survey 2025

The 3rd annual Observability Survey, brought to you by Grafana Labs

In this free report, which is based on 1,255 responses collected by Grafana Labs through outreach to our community and at industry events around the world, you’ll get a snapshot of how organizations approach observability today and where they want the industry to go.

TL;DR

Tool overload is real
101 different observability technologies cited as being currently in use

Observability runs on OSS
71% of organizations say they’re using Prometheus and OpenTelemetry in some capacity†

Cost dominates buying decisions
74% say it’s a top priority for selecting tools, ahead of ease of use and interoperability

Complexity chaos
39% say complexity/overhead is their biggest observability obstacle, the most frequently cited impediment

Full stack to the fore
85% of companies are using unified infrastructure and application observability in some capacity†

A view from the top
49% say the “CTO/C-level” or “VP-level” is the highest level at which observability is considered critical to the business

More ways to explore the survey

In-depth analysis

Take a closer look at the biggest takeaways and the latest trends impacting the observability space.

Read the full report
Observability Survey 2024

See the data in Grafana

See how your organization stacks up and dive into the numbers in our interactive Grafana Play dashboard. 

Explore the data



The many tools of the trade

Traces are growing in popularity, with more than half (57%) of organizations including them in their stacks

Organizations still depend on logs and metrics, but traces and profiles are increasingly being included in more stacks. However, adoption of the two newer pillars of telemetry can vary widely across industries. Financial services companies (65%) are the most likely to use traces, while telecommunications companies are the least likely (37%). Similarly, 18% of retail/e-commerce companies use profiles, compared to just 9% of those in the automotive or manufacturing industries.

Telemetry organizations use to observe their systems

Metrics: 95%
Logs: 87%
Traces: 57%
Profiles: 16%

Teams are juggling lots of observability tools—and lots of context switching

Companies use an average of eight observability technologies. That’s down slightly from last year (nine), which suggests they’re trying to scale back the number of tools used to observe their systems. Still, the volume of tools they have to choose from is staggering.

101

The astounding number of different observability technologies respondents cite as currently in use, including Prometheus, OpenTelemetry, the ELK Stack, Datadog, and the Grafana LGTM Stack

Number of observability technologies in use


Grafana users configure an average of 16 data sources in the platform

But larger organizations tend to have more data sources. Companies with more than 5,000 employees average 24 data sources, compared to 6 at companies with 10 or fewer employees. A respondent’s role matters too: developers average just 10 data sources, while SREs average 18.

Data sources in Grafana, by company size


The ways teams tackle observability

Cost is by far the biggest criterion for companies when they select new observability tools

Cost is the top criterion across the board, but other buying priorities diverge depending on the needs of people’s work. For example, 61% of developers cite ease of use, compared to 53% of SREs. Meanwhile, 27% of engineering directors and managers prioritize AI/ML features, compared to just 19% of respondents overall.

Most important* criteria for selecting new observability tools, by role

Cost
Ease of use/learning curve for new users
Interoperability with other tools used at my organization
Based on open source software/technologies
Ease of switching to another tool in the future
Familiarity/adoption within your organization
AI/ML capabilities
Other

*Respondents could select multiple criteria
†Base of fewer than 50 respondents

Most organizations have centralized observability

Organizations overwhelmingly follow a DevOps model (centralized teams or embedded experts) as their approach to observability.

38%
Centralized observability (support)
A centralized observability team runs the observability platform and provides best practices and support to product teams, but does not directly manage observability for individual services.

23%
Centralized observability (operations)
A centralized observability team runs the observability platform and handles implementation, onboarding, instrumentation, building dashboards, and setting up alerts/SLOs for product teams.

18%
Observability experts
Observability experts (e.g., SREs) are embedded within each product team and are responsible for implementing and managing observability practices specific to that team’s services.

15%
Operations team
An operations team, separate from the product development team, implements observability and is responsible for the uptime and performance of the application in production.

4%
No observability yet
We haven’t yet stood up observability for our production applications.

3%
Other

What the community says about centralized observability

“Centralized observability reduced our mean time to resolution (MTTR) by 40%, saving an average of 15 engineer hours per incident. This has translated to cost savings of approximately $25,000 per quarter.”
— Developer at a large European software and technology company
“Implementing centralized observability allowed us to shift from reactive problem solving to proactive optimization. For instance, by centralizing log analysis and performance metrics across teams, we identified recurring latency patterns in a critical service. Addressing these reduced downtime by 30% and saved over $100,000 annually in SLA breach penalties.”
— Platform team member at a large Asian software and technology company
“We have multiple tenant accounts and some observability tools were deployed per tenant, which had become unmaintainable from the operations perspective. By adopting OTEL and centralizing the tools in a single account (apart from the collector), the maintenance complexity dropped significantly (at least 5x). Costs also tumbled by at least 35%. Once we complete the transition, we expect a total cost reduction of 67%.”
— Platform team member at a small European IoT company
“It has been tremendously helpful across multiple incidents to be able to deduce the cause of system problems which would have either taken longer to find information or even been impossible to track down due to information like traces not being captured (or even logs, in an autoscaled environment). Additionally it has enabled us to proactively solve errors which we may not have been aware of but caused real user impact, or affected the integrity of our systems had they not been caught early.”
— Engineering manager at a small North American applied sciences company

Moving to SaaS is a growing trend

37% of respondents say they “mostly” or “only” use SaaS, up 42% year over year. There’s also been a big drop in organizations splitting equally between SaaS and self-managed, falling from 22% in 2024 to 6% in 2025—a possible sign that those that previously hedged their bets have been won over by SaaS. But the majority of organizations in this survey still manage their own observability setup, with 57% describing their observability setup as “mostly” or “only” self-managed. Most likely to manage their own setup? European organizations (69%) and those in government (77%) and telecommunications (77%).
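The “up 42%” figure is a relative change in share, not a percentage-point jump; given the 2025 share of 37%, it implies a 2024 share of roughly 26% (37 / 1.42), though the report does not state that number directly. The arithmetic:

```python
# Year-over-year relative change, illustrated with the SaaS-adoption figure.
# The 2024 share below is inferred from the stated 42% increase, not taken
# directly from the report.
share_2025 = 0.37
yoy_increase = 0.42

share_2024 = share_2025 / (1 + yoy_increase)
print(f"implied 2024 share: {share_2024:.0%}")  # implied 2024 share: 26%
```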

Which best describes your observability setup?



Observability Survey 2024

The role of open source

Organizations are far more likely to use observability tools under open source licensing than commercial

76% of respondents are using open source licensing for observability in some capacity (“open source only,” “mostly open source,” “roughly equal”), but a growing number of organizations are prioritizing commercial tooling—those using “only” or “mostly” commercial licenses more than doubled (10% in 2024, 24% in 2025). In addition, there’s a strong correlation between licensing and observability setups (SaaS vs. self-managed).

Are you using observability tools under an open source or a commercial license?


Relationship between observability setup and licensing

Self-managed only
Mostly self-managed
Roughly equal
Mostly SaaS
SaaS only

For the second straight year, half of all organizations invested more in Prometheus and OpenTelemetry

More than two-thirds of organizations (67%) use Prometheus in production in some capacity (“in production,” “extensively,” “exclusively”), with another 19% investigating or building POCs. In comparison, OpenTelemetry has less production usage (41%), but it appears to have more momentum for future growth, with more than a third (38%) investigating it or building POCs.
71%
of all respondents are using both Prometheus and OpenTelemetry in some capacity†
34%
of all respondents are using both Prometheus and OpenTelemetry in production environments††
How much have you invested in Prometheus?

How much have you invested in OpenTelemetry?

How does your use of Prometheus and OpenTelemetry compare to last year?

Prometheus: 53% more, 30% the same, 7% less, 11% don’t know
OpenTelemetry: 50% more, 27% the same, 5% less, 17% don’t know

More companies are using OpenTelemetry, but they have different priorities for compatible backends

There are some notable differences between those using OpenTelemetry in production in some capacity (“in production,” “extensively,” “exclusively”) and those starting out with OpenTelemetry (“investigating,” “building a POC”). Those using OpenTelemetry in production put a higher premium on support for a variety of telemetry types (61% vs. 51%), compatibility with existing systems (56% vs. 51%), cost (49% vs. 41%), and scalability (44% vs. 36%).

Requirements* for OpenTelemetry backend solutions

57%
Ensure vendor-neutrality and flexibility
51%
Support for a variety of telemetry types, including metrics, logs, traces, and profiles
51%
Effective data exploration, visualization, and analysis capabilities
49%
Compatibility with existing monitoring/observability systems
42%
Cost-effectiveness
35%
Scalability to handle large volumes of telemetry data
32%
Ease of adoption and management
*Respondents could select multiple answers



The challenges that remain

Complexity, noise, and cost are named the top hurdles to observability success

And these concerns can have a direct impact on tool selection. For example, 88% of those who say observability costs too much also say cost is an important criterion for new tools. And nearly two-thirds (62%) of those who cite concerns about getting adoption in their organization also prioritize ease of use and interoperability when selecting new tools.

Biggest* observability concerns

39%
Complexity/overhead associated with setting up and maintaining tooling
38%
More noise; signal-to-noise challenge
37%
Costs too much
29%
Costs are too difficult to predict and budget for
28%
Vendor lock-in
24%
Getting adoption within my company
23%
Convincing management of the value
*Respondents could select multiple answers

Cost concerns vary widely across industries, roles, regions, and organization sizes

Percentage of respondents who cite observability costing too much as one of their biggest concerns

Role

CTO*
50%
Director of Engineering
45%
Developer
37%
Engineering Manager
37%
Platform team
36%
SRE
35%

*Base of fewer than 50 respondents

Region

Africa*
59%
South America*
48%
Oceania*
45%
Asia
41%
Middle East*
39%
North America
37%
Europe
33%

*Base of fewer than 50 respondents

Number of employees

<10
46%
11-100
33%
101-500
34%
501-1,000
36%
1,001-2,500
47%
2,501-5,000
39%
5,001+
35%

Industry

Applied sciences*
57%
Travel*
50%
Retail/e-commerce
48%
IoT*
44%
Software & technology
39%
Media
37%
Other
36%

*Base of fewer than 50 respondents

Observability costs are a fraction of infrastructure costs, but percentages are all over the map

On average, observability spend amounts to 17% of total compute infrastructure spend, though 10% was the most common response‡. However, that ratio can vary widely, with some spending next to nothing and others outpacing their compute spending entirely.

Observability spend as a percentage of total compute infrastructure spend

mean: 17%, median: 10%, mode: 10%

0% of infra spend
1-10% of infra spend
11-20% of infra spend
21-30% of infra spend
31-40% of infra spend
41-50% of infra spend
51+% of infra spend
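The gap between the mean (17%) and the median/mode (10%) is what a right-skewed distribution like this produces: a few heavy spenders pull the average up. A quick sketch with hypothetical ratios (illustrative values, not survey data) shows the effect:

```python
import statistics

# Hypothetical spend ratios (% of infra spend) for ten organizations.
# The values are illustrative, chosen so a few heavy spenders pull the
# mean above the typical value, mirroring the survey's 17% mean vs.
# 10% median/mode.
ratios = [5, 10, 10, 10, 10, 10, 15, 20, 30, 50]

print(statistics.mean(ratios))    # 17.0
print(statistics.median(ratios))  # 10.0
print(statistics.mode(ratios))    # 10
```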

Alert fatigue is the No. 1 obstacle to faster incident response at almost all levels of an organization

The lone exception is engineering managers, who are slightly more likely to cite “painful incident coordination across teams” (25%) than alert fatigue (24%). They’re also the most likely to cite “limited data across incidents” (18%).

Biggest obstacle to faster incident response, by role

Alert fatigue
Lack of incident response
Painful incident coordination across teams
Lack of culture and process that improves with each incident
Limited data across incidents
Other

*Base of fewer than 50 respondents




The future fixes to today’s problems

SLOs and full stack observability are top priorities going forward

Roughly half of all companies are “investigating” or “building a POC” for unified application and infrastructure observability (51%), service-level objectives (SLOs) (50%), and LLM observability (47%). And more than a third (39%) are doing the same for FinOps. Of this group of emerging technologies, LLM observability is used “in production,” “extensively,” or “exclusively” the least (7% combined), while unified application and infrastructure observability is used the most (34%).

Relevance of emerging technologies and techniques to your company

FinOps
LLM Observability
SLO
Unified application and infrastructure observability

AI/ML features most in demand: training-based alerts and better root cause analysis tools

However, how these two sought-after features rank diverges based on factors such as company size (smaller orgs tend to favor training-based alerts), the number of technologies and data sources in use (those with more tend to favor root cause analysis), and SaaS vs. self-managed setups (the former favor training-based alerts, the latter lean toward faster root cause analysis).

31%
Training-based alerts that fire when a metric changes from its usual pattern or is an outlier in a group
28%
Faster root cause analysis: automated targeted checks, help interpreting signals, suggestions when a service is failing

16%
Reduction in unused/underutilized resources and telemetry

11%
Guidance for setting up and understanding monitoring, SLOs, alerting, etc.

11%
Ongoing anomaly detection and warnings across services

2%
Other



The ways observability outcomes are translating to business objectives

Organizations prioritize SLOs to proactively improve MTTR

Organizations with mature observability cultures tend to rely on SLOs, so it makes sense that reduced MTTR (33%) and better accountability (25%) are the top outcomes these organizations hope to achieve. Interestingly, cost savings (14%) ranks last, suggesting these organizations prioritize getting value from observability over cutting costs.
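The MTTR and accountability focus makes sense given what an SLO buys you: a quantified error budget to track and burn down. A minimal sketch of that arithmetic (the target and downtime figures here are hypothetical, not from the survey):

```python
# Error-budget arithmetic behind a simple availability SLO.
# All numbers are illustrative.
SLO_TARGET = 0.999                  # 99.9% availability objective
WINDOW_MIN = 30 * 24 * 60           # 30-day window, in minutes

budget_min = (1 - SLO_TARGET) * WINDOW_MIN  # allowed "bad" minutes: 43.2
downtime_min = 25.0                         # observed bad minutes so far

remaining_min = budget_min - downtime_min   # budget left in the window
burn = downtime_min / budget_min            # > 1.0 means the SLO is blown

print(f"budget={budget_min:.1f}m remaining={remaining_min:.1f}m burn={burn:.2f}")
```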

If you’re actively using SLOs or are interested in doing so, what is the most important outcome you hope to achieve?


The C-suite considers observability a business priority

“CTO/C-level” is most frequently cited (33%) as the highest level at which observability is considered critical to the business. That varies by industry, with financial services (45%) being the most likely to have C-suite engagement, and healthcare (20%) and government (18%) being the least likely. There’s also a correlation to company size, as businesses with 100 or fewer employees are the most likely to have C-suite engagement (40%) and companies with more than 1,000 employees are the least likely (29%).

Highest level at which observability is considered critical to the business within your company


C-suite engagement may be connected to observability maturity

There also appears to be a correlation between those that answered “CTO/C-level” as the highest level at which observability is considered business critical and those with more mature observability practices.

Impact of C-suite support on the adoption of various technologies and techniques

CTO/C-level selected as the highest level at which observability is considered business critical
All other responses

*in production, extensively, or exclusively

How practitioners convince (or struggle to convince) their colleagues to embrace observability

“The best way to convince them is to lead them to it. When my team got the chance to build a new tool, and we instrumented that service with OTel and had visibility into every operation in that service and were able to resolve bugs in a couple of hours, the rest of the teams noticed and wanted that for themselves as well.”
— DevOps engineer at a small European software company
“Start with a minimal setup and quickly show quick wins: alerting on critical parts, dashboards offering visibility.”
— Engineering manager at a mid-sized European financial services company
“Practical results. By providing real-world examples of how observability reduced incident response time and improved overall service reliability, I helped colleagues understand its direct impact on our goals.”
— SRE at a small Asian software and technology company
“To engineers, the speech was: ‘We will know how things work under the hood,’ and it was enough. With the business side we needed to change the approach. It was more like: ‘What kind of signal do you want from your workload?’ And the discovery process has done the rest.”
— Staff engineer at a large South American financial services company
“Some are already convinced based on previous experience at other employers, but many are not. We will need to demonstrate real-world benefits before there is any chance of converting the latter group.”
— Developer at a mid-sized North American company
“The hardest part is demonstrating the value of spending effort on observability as opposed to feature development.”
— Developer at a large European retail/e-commerce company
“The main obstacle is the response: ‘Sounds great, but this isn’t a priority for us at the moment.’ Ultimately, we work on improving observability independently when we have the capacity, without requiring management’s buy-in.”
— Engineering manager at a large European software and technology company
“This is literally what I do daily. It’s always a challenge. But we work to find the early adopters, and use them as our salespeople.”
— SRE at a large North American automotive and manufacturing company



About the survey

Demographics

Survey responses were collected around the world from observability practitioners and leaders. Those individuals come from companies of all sizes and a broad range of industries.

Role

Region

Size of organization

Industry

Methodology

The survey was conducted online and in-person between Sept. 18, 2024 and Jan. 2, 2025. We promoted it through our website, newsletters, and social media channels. We also solicited responses at Grafana Labs events and third-party conferences.


† Combines “We are investigating it,” “We are building a POC,” “We are using it in production,” “We are using it extensively,” and “We are using it exclusively”

†† Combines “We are using it in production,” “We are using it extensively,” and “We are using it exclusively”

‡ This was an optional, open-ended question. Inconsistent or inaccurate responses were removed from the dataset, leaving a base of 294 responses.