Entity catalog

The entity catalog is your primary entry point for monitoring services and infrastructure. It shows all discovered entities with their health status, insights, and key metrics in a unified view, helping you identify what needs attention.

What is the entity catalog?

The entity catalog automatically displays every service, Pod, node, database, and infrastructure component the knowledge graph discovers from your telemetry data. Each entity appears with:

  • Health indicators - Insight rings showing critical, warning, and informational issues
  • Key metrics - Context-sensitive columns showing RED metrics for services or infrastructure metrics for nodes
  • Sparklines - Trend visualizations for quick pattern recognition
  • Entity properties - Cluster, namespace, version, and other metadata

No manual inventory work is required. The catalog stays current automatically as your environment changes.

When to use the entity catalog

Use the entity catalog to:

  • Monitor selected services - Filter to your team’s services and bookmark the view for daily monitoring
  • Identify unhealthy entities - See at a glance which services or infrastructure components have critical insights
  • Explore your environment - Discover what’s running across clusters, namespaces, and environments
  • Start troubleshooting - Click any entity to investigate metrics, logs, and traces

The entity catalog is designed for rapid situational awareness across large environments. Instead of jumping between dashboards or writing queries, you get a comprehensive health view in one place.

Open the entity catalog

From Grafana Cloud, navigate to Observability > Entity catalog.

Entity types and their metrics

The entity catalog shows different information depending on entity type. Each type displays metrics relevant to monitoring that component.

Services

Services show RED metrics (Request rate, Error rate, Duration):

  • Request rate - Requests per second with sparkline trend
  • Error ratio - Percentage of failed requests
  • Latency (P95) - 95th percentile response time

Services are discovered from OpenTelemetry instrumentation, Istio service mesh data, or other APM sources. A service represents a logical application component that handles requests.
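As a rough illustration of what the three service columns summarize, the following minimal sketch computes them from a window of raw request samples. The `RequestSample` type, the `red_metrics` function, and the nearest-rank percentile method are assumptions made for this example, not a description of how the catalog computes its values.

```python
from dataclasses import dataclass


@dataclass
class RequestSample:
    """Hypothetical record of one handled request."""
    duration_ms: float  # response time of the request
    failed: bool        # True if the request ended in an error


def red_metrics(samples: list[RequestSample], window_seconds: float) -> dict[str, float]:
    """Summarize a time window as request rate, error ratio, and P95 latency."""
    if not samples:
        return {"request_rate": 0.0, "error_ratio": 0.0, "latency_p95_ms": 0.0}
    durations = sorted(s.duration_ms for s in samples)
    # Nearest-rank percentile: the smallest duration with at least 95% of
    # samples at or below it. Integer math avoids floating-point rounding.
    rank = (95 * len(durations) + 99) // 100
    return {
        "request_rate": len(samples) / window_seconds,                 # requests per second
        "error_ratio": sum(s.failed for s in samples) / len(samples),  # fraction, 0.0 to 1.0
        "latency_p95_ms": durations[rank - 1],
    }
```

Multiply `error_ratio` by 100 to express it as a percentage.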

Service instances

Service instances represent individual running instances of a service. They show:

  • Request rate - Requests per second for this specific instance
  • Error ratio - Instance-specific error percentage
  • Latency (P95) - Response time for this instance

Service instances help you identify when one instance behaves differently from the others, which is useful for troubleshooting instance-specific issues such as memory leaks or configuration problems.

Pods

Kubernetes Pods show infrastructure metrics:

  • CPU usage - Current CPU consumption
  • Memory usage - Current memory consumption
  • Restart count - How many times the Pod has restarted

Pods are the smallest deployable units in Kubernetes and typically run one or more containers.

Nodes

Kubernetes nodes show node-level infrastructure metrics:

  • CPU capacity - Total and available CPU
  • Memory capacity - Total and available memory
  • Pod count - Number of Pods running on the node

Nodes are the worker machines (virtual or physical) that run your containerized workloads.

Node groups

Node groups represent collections of nodes with similar characteristics (for example, same instance type or availability zone):

  • Total nodes - Number of nodes in the group
  • CPU capacity - Aggregate CPU across all nodes
  • Memory capacity - Aggregate memory across all nodes

Node groups are useful for monitoring auto-scaling groups or node pools in managed Kubernetes services.

KubeClusters

Kubernetes clusters show high-level metrics:

  • Node count - Total number of nodes
  • Pod count - Total number of Pods across all nodes
  • Namespace count - Number of namespaces in the cluster

Clusters provide the highest-level view of your Kubernetes environment’s health and capacity.

Namespaces

Kubernetes namespaces show resource usage within a logical grouping:

  • Pod count - Number of Pods in this namespace
  • CPU usage - Aggregate CPU consumption
  • Memory usage - Aggregate memory consumption

Namespaces typically represent teams, environments, or applications and help you understand resource consumption by organizational unit.

Databases

Database entities show database-specific metrics:

  • Connection count - Active database connections
  • Query rate - Queries per second
  • Latency - Query response time

Database entities are discovered from database exporters or instrumentation and help you monitor database health and performance.

Understand insight rings

Each entity displays two colored rings showing its health status:

Outer ring - Insights directly on this entity:

  • Red - Critical issues requiring immediate attention
  • Yellow - Warnings that may need investigation
  • Blue - Informational insights like configuration changes

Inner ring - Insights on child entities (for example, Pods within a service):

  • Shows the most severe insight from any child entity
  • Helps identify when problems exist deeper in the stack

The insight rings provide instant visual feedback about entity health without requiring you to click into each entity.
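The inner-ring roll-up described above can be sketched as taking the maximum severity across child entities. This is only an illustrative model: the `Severity` and `CatalogEntity` types are hypothetical, and the sketch assumes that only direct children are considered.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """Insight severities, ordered so that max() picks the most severe."""
    INFO = 1      # blue
    WARNING = 2   # yellow
    CRITICAL = 3  # red


RING_COLORS = {Severity.INFO: "blue", Severity.WARNING: "yellow", Severity.CRITICAL: "red"}


@dataclass
class CatalogEntity:
    """Hypothetical entity with its own insights and a list of child entities."""
    name: str
    insights: list[Severity] = field(default_factory=list)
    children: list["CatalogEntity"] = field(default_factory=list)


def inner_ring(entity: CatalogEntity) -> Optional[Severity]:
    """Return the most severe insight on any direct child, or None if there is none."""
    child_severities = [s for child in entity.children for s in child.insights]
    return max(child_severities, default=None)
```

For example, a service whose Pods carry one warning and one critical insight would roll up to `Severity.CRITICAL`, which maps to a red inner ring.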

Filter and search entities

The entity catalog lets you filter by entity type, insight category, and property, and search by name, to narrow down what you’re viewing.

Filter by entity type

Click the entity type dropdown to show only one of the following types, or any combination of them:

  • Service
  • ServiceInstance
  • Pod
  • Node
  • NodeGroup
  • KubeCluster
  • Namespace
  • Database

This allows you to focus on a specific layer of your infrastructure or application stack.

Filter by insights

Show only entities with specific insight categories:

  • Saturation - Resource limits being approached
  • Amend - Recent configuration changes
  • Anomaly - Unusual traffic patterns
  • Failure - Critical system failures
  • Error - Request errors or latency breaches

This is especially useful for focusing on entities that need immediate attention.
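To make the filter semantics concrete, here is a minimal sketch of combining an entity type filter with an insight category filter. It assumes that values within one filter are combined with OR and that separate filters are combined with AND. The `CatalogRow` type and `filter_rows` function are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class CatalogRow:
    """Hypothetical catalog row: a name, an entity type, and active insight categories."""
    name: str
    entity_type: str  # for example "Service" or "Pod"
    insight_categories: set[str] = field(default_factory=set)


def filter_rows(
    rows: Iterable[CatalogRow],
    entity_types: Optional[set[str]] = None,
    insight_categories: Optional[set[str]] = None,
) -> list[CatalogRow]:
    """Keep rows that match every filter that is set; a filter left as None is skipped."""
    kept = []
    for row in rows:
        if entity_types is not None and row.entity_type not in entity_types:
            continue
        # A row matches if it has at least one of the selected categories.
        if insight_categories is not None and not (row.insight_categories & insight_categories):
            continue
        kept.append(row)
    return kept
```

Calling `filter_rows(rows, entity_types={"Service"}, insight_categories={"Failure", "Error"})` would keep only services that currently have a Failure or Error insight.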

Filter by properties

Filter by custom properties like:

  • Cluster or region
  • Namespace or environment
  • Version or deployment
  • Any label attached to your entities

Search by name

Use the search bar to find entities by name. Search supports partial matches and is case-insensitive.
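Partial, case-insensitive matching of this kind can be sketched as a substring test on case-folded strings. The `search_entities` function below is a hypothetical illustration of that behavior, not the catalog's actual search implementation.

```python
def search_entities(names: list[str], query: str) -> list[str]:
    """Return the names that contain the query anywhere, ignoring case."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]
```

With this sketch, the query `CHECK` would match both `checkout-service` and `Checkout-DB`, but not `payments`.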

Bookmark filtered views

After applying filters, bookmark the URL to create custom monitoring views for your team. For example, bookmark a view showing only your production services in a specific region.

Understand sparklines

Each metric column includes a sparkline, a small line chart showing the metric’s trend over the selected time range. Sparklines help you quickly spot:

  • Spikes - Sudden increases in latency or error rate
  • Drops - Traffic disappearing or services going down
  • Patterns - Regular oscillations or daily cycles
  • Stability - Flat lines indicating steady state

Hover over a sparkline to see exact values at specific times.
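To show how a series of values reduces to a compact trend line, the following sketch renders a text sparkline with Unicode block characters. It is purely illustrative: the `sparkline` function is an assumption of this example and is unrelated to how the catalog draws its charts.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def sparkline(values: list[float]) -> str:
    """Map each value to a bar height scaled between the series minimum and maximum."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series renders as a flat line, indicating steady state.
        return BLOCKS[0] * len(values)
    scale = (len(BLOCKS) - 1) / (hi - lo)
    return "".join(BLOCKS[round((v - lo) * scale)] for v in values)
```

A sudden spike shows up as a single tall bar among low ones, and a drop shows up as tall bars followed by low ones.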
