Why a separate section?
Some capabilities don’t fit neatly into the hierarchy. They cut across the levels instead of belonging to a single one:
| Practice | How it spans levels |
|---|---|
| Alerting | Alert on infrastructure (L1), services (L2), transactions (L3), or custom metrics (L4) |
| Proactive testing | Test infrastructure uptime (L1), service endpoints (L2), or user journeys (L3) |
| Platform management | Govern access, costs, and scale across all levels |
| Grafana Assistant | Query, troubleshoot, and build dashboards using natural language at any level |
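
To make the Alerting row concrete, here is a minimal sketch of the idea that the alerting concept stays the same while the scope changes. It is not Grafana's alerting API: the `ThresholdRule` class, the metric names, and the thresholds are all hypothetical, chosen only to show one rule shape applied at each of the four levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """One alert rule: fire when a metric exceeds a threshold (hypothetical model)."""
    level: int        # observability level the rule belongs to (1-4)
    name: str         # human-readable alert name
    metric: str       # hypothetical metric name, not a real Grafana metric
    threshold: float  # illustrative value only

    def is_firing(self, value: float) -> bool:
        return value > self.threshold


# Same rule shape at every level; only the scope (the metric) changes.
RULES = [
    ThresholdRule(1, "High CPU on host", "host_cpu_utilization_percent", 90.0),
    ThresholdRule(2, "Service error rate", "service_error_rate_percent", 5.0),
    ThresholdRule(3, "Checkout latency", "checkout_p95_latency_seconds", 2.0),
    ThresholdRule(4, "Abandoned carts", "abandoned_carts_per_minute", 50.0),
]


def firing_alerts(rules, samples):
    """Return the names of rules whose metric is present and above its threshold."""
    return [
        rule.name
        for rule in rules
        if rule.metric in samples and rule.is_firing(samples[rule.metric])
    ]


# Made-up sample values for illustration.
samples = {
    "host_cpu_utilization_percent": 95.0,
    "service_error_rate_percent": 1.2,
    "checkout_p95_latency_seconds": 3.4,
    "abandoned_carts_per_minute": 10.0,
}

print(firing_alerts(RULES, samples))
```

In a real setup, each rule would be defined in your alerting tool against its own data source. The sketch only shows that a Level 1 rule and a Level 4 rule can share the same structure.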
The three operational areas (plus an AI accelerator)
When to focus on operations
| If you’re at… | Operational priority |
|---|---|
| Level 1 | Basic alerting on infrastructure metrics |
| Level 2 | Service-level alerting, SLOs |
| Level 3 | Synthetic tests for critical user journeys |
| Level 4 | Custom metric alerting, cost optimization |
Script
We’ve covered the four levels of observability. But there’s a set of capabilities that don’t fit neatly into any single level. They apply across all of them. These are operational practices: alerting, incident response, proactive testing, and platform management. They’re the tools that turn observability data into action.
Think about alerting. At Level 1, you alert on infrastructure metrics. At Level 2, you alert on service health. At Level 3, you alert on transaction latency. At Level 4, you alert on custom metrics. The concept is the same; the scope changes.
Same with proactive testing. You can test that servers respond, that service endpoints work, or that complete user journeys succeed. And platform management (access control, cost management, scaling) applies across everything.
We’ll cover three operational areas. Alerting and Incident Response Management (IRM) is about detecting problems and responding effectively. Proactive Testing is about finding problems before your users do. Platform Management is about governing your observability practice at scale.
And there’s one capability that accelerates all of them: Grafana Assistant. It’s an AI-powered assistant that helps you query data, troubleshoot issues, and build dashboards using natural language. Whether you’re exploring metrics at Level 1 or debugging traces at Level 3, you can ask questions instead of writing complex queries. We’ll cover it after Alerting and IRM.
What you focus on depends on where you are in the hierarchy, but these capabilities are available to you from day one.
