Operational practices overview

Why a separate section?

Some capabilities don’t fit neatly into the hierarchy. They span all four levels:

| Practice | How it spans levels |
| --- | --- |
| Alerting | Alert on infrastructure (L1), services (L2), transactions (L3), or custom metrics (L4) |
| Proactive testing | Test infrastructure uptime (L1), service endpoints (L2), or user journeys (L3) |
| Platform management | Govern access, costs, and scale across all levels |
| Grafana Assistant | Query, troubleshoot, and build dashboards using natural language at any level |

The three operational areas (plus an AI accelerator)

Three cross-cutting operational areas: Alerting and IRM, Proactive Testing, Platform Management

When to focus on operations

| If you’re at… | Operational priority |
| --- | --- |
| Level 1 | Basic alerting on infrastructure metrics |
| Level 2 | Service-level alerting, SLOs |
| Level 3 | Synthetic tests for critical user journeys |
| Level 4 | Custom metric alerting, cost optimization |

Script

We’ve covered the four levels of observability. But one set of capabilities doesn’t fit neatly into any single level, because it applies across all of them. These are the operational practices: alerting, incident response, proactive testing, and platform management. They’re the tools that turn observability data into action.

Think about alerting. At Level 1, you alert on infrastructure metrics. At Level 2, you alert on service health. At Level 3, you alert on transaction latency. At Level 4, you alert on custom metrics. The concept is the same; the scope changes.
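
To make "the concept is the same; the scope changes" concrete, here is a minimal sketch in Python. It is not Grafana's alerting API. The `AlertRule` class, the metric names, and the thresholds are all hypothetical, chosen only to show one rule shape reused at each of the four levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    """One threshold rule; every name and value here is illustrative."""
    level: int        # hierarchy level the rule is scoped to (1-4)
    metric: str       # hypothetical metric name
    threshold: float  # fire when the observed value exceeds this

    def fires(self, value: float) -> bool:
        return value > self.threshold


# The same rule shape at every level; only the metric being watched changes.
RULES = [
    AlertRule(1, "node_cpu_utilization_percent", 90.0),   # infrastructure
    AlertRule(2, "service_error_rate_percent", 5.0),      # service health
    AlertRule(3, "checkout_latency_p95_seconds", 2.0),    # transaction latency
    AlertRule(4, "orders_failed_per_minute", 10.0),       # custom metric
]


def firing_alerts(samples: dict[str, float]) -> list[str]:
    """Return the metrics whose latest sample breaches its rule, in rule order."""
    return [
        rule.metric
        for rule in RULES
        if rule.metric in samples and rule.fires(samples[rule.metric])
    ]
```

Passing a dictionary of the latest samples returns the names of the breached metrics. Metrics with no sample are skipped rather than treated as failures, which is a design choice of this sketch and not a statement about how any particular product behaves.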

Same with proactive testing. You can test that servers respond, that service endpoints work, or that complete user journeys succeed. And platform management (access control, cost management, scaling) applies across everything.
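
The proactive-testing idea can be sketched the same way. The Python example below is an illustration under stated assumptions, not a real synthetic-monitoring API. The `run_check` helper and the step functions are hypothetical stand-ins for real probes such as a ping, an HTTP request, or a scripted browser flow.

```python
from typing import Callable

# A step is any zero-argument callable that returns True on success.
Step = Callable[[], bool]


def run_check(name: str, steps: list[Step]) -> dict:
    """Run steps in order and stop at the first one that fails or raises."""
    for index, step in enumerate(steps, start=1):
        try:
            ok = step()
        except Exception:
            ok = False  # a crashed probe counts as a failed step
        if not ok:
            return {"check": name, "passed": False, "failed_step": index}
    return {"check": name, "passed": True, "failed_step": None}


# Hypothetical probes; a real check would make network calls here.
def server_responds() -> bool:
    return True


def login_endpoint_ok() -> bool:
    return True


def checkout_completes() -> bool:
    return False  # simulate a broken final step in the journey


uptime = run_check("server uptime", [server_responds])
journey = run_check(
    "checkout journey",
    [server_responds, login_endpoint_ok, checkout_completes],
)
```

A single-step check corresponds to an uptime test, and a multi-step check corresponds to a user journey. Reporting which step failed is what lets a journey test point at the broken stage rather than only saying that something is wrong.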

We’ll cover three operational areas. Alerting and Incident Response Management is about detecting problems and responding effectively. Proactive Testing is about finding problems before your users do. And Platform Management is about governing your observability practice at scale.

And there’s one capability that accelerates all of them: Grafana Assistant. It’s an AI-powered assistant that helps you query data, troubleshoot issues, and build dashboards using natural language. Whether you’re exploring metrics at Level 1 or debugging traces at Level 3, you can ask questions instead of writing complex queries. We’ll cover it after Alerting and IRM.

What you focus on depends on where you are in the hierarchy, but these capabilities are available to you from day one.