Monitor selected services

Use the entity catalog to create focused monitoring views for the services your team owns. Filter to specific services, bookmark the view, and check it daily to stay aware of service health without writing queries.

When to use this workflow

Use this workflow when you want to:

  • Monitor a specific set of services your team is responsible for
  • Create a daily “check-in” view for service health
  • Track RED metrics (request rate, error ratio, latency) across your services
  • Quickly identify which services have active insights

This is your primary workflow for proactive monitoring and early detection of issues.

Before you begin

Ensure your services are:

  • Instrumented with OpenTelemetry, Istio, or another supported APM source
  • Sending RED metrics to Grafana Cloud
  • Visible in the entity catalog

Open the entity catalog

From Grafana Cloud, navigate to Observability > Entity catalog.

Filter to your services

Create a focused view by applying filters:

Filter by entity type

  1. Under Type, select Service.

This shows only services, removing infrastructure noise.

Filter by properties

Narrow the view to your team’s services using the property filter dropdowns, which sit above the entity type filter.

Use any combination of these filters:

  • Namespace - Filter to your team’s namespace (for example, checkout, payments)
  • Region - Show only services in specific regions (for example, us-east-1, eu-west-1)
  • Env - Filter to specific environments (for example, production, staging)

You can select values from multiple dropdowns to narrow your view. For example, select production from the Env dropdown and us-east-1 from the Region dropdown to see only your production services in that region.
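Combining dropdowns as described above behaves like a logical AND across properties. The following is a minimal sketch of that idea only, not the entity catalog's implementation: the `Entity` record, the property keys, and the `filter_entities` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for a catalog entry; not a Grafana API type."""
    name: str
    type: str
    properties: dict = field(default_factory=dict)


def filter_entities(entities, entity_type=None, **selected):
    """Keep entities that match the type and every selected property value."""
    result = []
    for entity in entities:
        if entity_type is not None and entity.type != entity_type:
            continue
        # Every selected filter must match (assumed AND semantics across dropdowns).
        if all(entity.properties.get(key) == value for key, value in selected.items()):
            result.append(entity)
    return result


catalog = [
    Entity("checkout-api", "Service", {"env": "production", "region": "us-east-1"}),
    Entity("checkout-api", "Service", {"env": "staging", "region": "us-east-1"}),
    Entity("payment-api", "Service", {"env": "production", "region": "eu-west-1"}),
    Entity("node-1", "Node", {"env": "production", "region": "us-east-1"}),
]

view = filter_entities(catalog, entity_type="Service", env="production", region="us-east-1")
print([e.name for e in view])
```

Each additional filter can only shrink the result, which is why adding Env and Region together yields a tighter view than either alone.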

Search for specific services

Use the search bar to find services by name:

  • Type partial names (for example, api finds api-server, payment-api)
  • Search is case-insensitive
  • Results update as you type
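The search behavior listed above (partial names, case-insensitive) can be pictured as a case-folded substring match. This sketch assumes plain substring matching; the catalog's actual matching rules may differ, and `search_services` is a hypothetical helper.

```python
def search_services(names, query):
    """Return the names that contain the query, ignoring case."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


services = ["api-server", "payment-api", "Checkout", "API-gateway"]
print(search_services(services, "api"))
```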

Review service health

The entity catalog shows key information for each service:

Insight rings

Check the colored rings around each service:

  • Red outer ring - Critical issues on the service itself
  • Yellow outer ring - Warning-level issues
  • Blue outer ring - Informational insights (deployments, configuration changes)
  • Red/yellow inner ring - Issues on child entities (pods running the service)

Focus on services with red rings for immediate investigation.
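One way to think about the ring colors is as a triage order: red first, then yellow, then blue. The sketch below only illustrates that ordering; the ring labels and the `triage` helper are hypothetical and do not correspond to catalog field names.

```python
# Lower number means look at it sooner. Labels are illustrative only.
RING_PRIORITY = {"red": 0, "yellow": 1, "blue": 2, None: 3}


def triage(services):
    """Order (name, outer_ring) pairs: red, then yellow, then blue, then no ring."""
    return sorted(services, key=lambda item: (RING_PRIORITY[item[1]], item[0]))


rings = [
    ("cart-api", "blue"),
    ("payment-api", "red"),
    ("search-api", None),
    ("checkout-api", "yellow"),
    ("auth-api", "red"),
]

for name, ring in triage(rings):
    print(f"{name}: {ring or 'no ring'}")
```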

RED metrics

Each service displays three core metrics:

  • Request rate - Requests per second with trend sparkline
  • Error ratio - Percentage of failed requests
  • Latency (P95) - 95th percentile response time

Look for:

  • Sudden drops in request rate - May indicate traffic routing issues
  • Error ratio spikes - Often correlate with deployments or dependency failures
  • Latency increases - Can signal resource saturation or slow dependencies
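To make the three RED metrics concrete, the following sketch computes them from a list of request records for a single time window. It is arithmetic illustration only: the `Request` record and `red_metrics` helper are hypothetical, and the nearest-rank percentile used here is an assumption, not necessarily how the catalog derives P95 from your telemetry.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; real data comes from your APM source."""
    duration_ms: float
    failed: bool


def red_metrics(requests, window_seconds):
    """Compute request rate, error ratio, and P95 latency for one time window."""
    if not requests:
        return {"rate_per_s": 0.0, "error_ratio_pct": 0.0, "p95_ms": None}
    total = len(requests)
    errors = sum(1 for r in requests if r.failed)
    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank percentile: ceil(0.95 * total), done in integer arithmetic.
    rank = max(1, (95 * total + 99) // 100)
    return {
        "rate_per_s": total / window_seconds,
        "error_ratio_pct": 100.0 * errors / total,
        "p95_ms": durations[rank - 1],
    }


# 100 requests over 60 seconds: durations 1..100 ms, every 20th request fails.
sample = [Request(duration_ms=float(i), failed=(i % 20 == 0)) for i in range(1, 101)]
metrics = red_metrics(sample, window_seconds=60)
print(metrics)
```

In this sample, 5 of 100 requests fail, so the error ratio is 5%, and 95% of requests complete within 95 ms.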

Sparklines

View sparklines to see trends over the selected time range. Patterns to watch for:

  • Spikes or drops during deployment windows
  • Regular daily patterns (normal) vs irregular patterns (investigate)
  • Gradual degradation over time

Investigate a service

When you spot an issue, click the service name to open its details:

  1. Review the Service overview tab, which shows:

    • RED metrics with threshold visualization
    • Active insights with severity and timing
    • Related upstream and downstream services
  2. Check active insights:

    • Read insight descriptions to understand what was detected
    • Note the insight category (Error, Anomaly, Saturation, etc.)
    • Check timing to correlate with deployments or other changes
  3. Review connected entities:

    • See which services call this one (upstream)
    • See which dependencies this service calls (downstream)
    • Click connected entities to check if they share issues

Bookmark your view

After setting up filters, bookmark the entity catalog URL:

  1. Apply all desired filters (entity type, properties, search).
  2. Bookmark the page in your browser.
  3. Name it clearly (for example, “Production Checkout Services”).

Return to this bookmark daily for consistent monitoring.

Set up multiple views

Create different bookmarked views for different contexts:

  • Production services - Filter to production environment
  • Team services - Filter by namespace or team label
  • Critical services - Filter to high-priority services only
  • Recently deployed - Services with recent Amend insights

What to look for

During daily monitoring, focus on:

Critical insights (red rings)

  • Error ratio breaches on customer-facing services
  • Saturation warnings approaching resource limits
  • Failure insights like CrashLoopBackOff or service unavailability

Metric anomalies

  • Request rate dropping to zero (the service may be down or receiving no traffic)
  • Error ratio jumping above 1% (investigate immediately)
  • P95 latency exceeding SLO thresholds
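The three checks above can be expressed as simple threshold rules. This is a sketch under stated assumptions: `check_service` is a hypothetical helper, the 1% error ratio cutoff comes from the list above, and the 500 ms SLO is an example value that you would replace with your own.

```python
def check_service(name, rate_per_s, error_ratio_pct, p95_ms, slo_p95_ms):
    """Return a list of findings for one service snapshot."""
    findings = []
    if rate_per_s == 0:
        findings.append(f"{name}: request rate is zero (service may be down)")
    if error_ratio_pct > 1.0:
        findings.append(f"{name}: error ratio {error_ratio_pct:.1f}% is above 1%")
    if p95_ms is not None and p95_ms > slo_p95_ms:
        findings.append(
            f"{name}: P95 latency {p95_ms:.0f} ms exceeds SLO of {slo_p95_ms:.0f} ms"
        )
    return findings


# Hypothetical snapshots: (name, requests per second, error ratio %, P95 in ms).
snapshots = [
    ("checkout-api", 120.0, 0.2, 180.0),
    ("payment-api", 45.0, 3.5, 950.0),
    ("cart-api", 0.0, 0.0, None),
]

for name, rate, errors, p95 in snapshots:
    for finding in check_service(name, rate, errors, p95, slo_p95_ms=500.0):
        print(finding)
```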

Patterns across services

  • Multiple services with errors at the same time (suggests a shared dependency issue)
  • Cascading latency increases (trace back to the slowest service)
  • All services in a namespace affected (suggests an infrastructure or network issue)
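The last pattern, every service in a namespace being affected at once, can be sketched as a grouping check. The tuples and the `fully_affected_namespaces` helper are hypothetical; in practice you would read this pattern visually from the filtered catalog view.

```python
from collections import defaultdict


def fully_affected_namespaces(services):
    """Return namespaces where more than one service exists and all have errors.

    services: iterable of (name, namespace, has_errors) tuples.
    """
    by_namespace = defaultdict(list)
    for _name, namespace, has_errors in services:
        by_namespace[namespace].append(has_errors)
    return sorted(
        ns for ns, flags in by_namespace.items() if len(flags) > 1 and all(flags)
    )


observed = [
    ("checkout-api", "checkout", True),
    ("cart-api", "checkout", True),
    ("payment-api", "payments", True),
    ("ledger", "payments", False),
]

print(fully_affected_namespaces(observed))
```

Here only the checkout namespace is flagged, which would point toward something shared by its services rather than a fault in one of them.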

Next steps

When you identify an issue during monitoring: