Entity catalog

The entity catalog is your primary entry point for monitoring services and infrastructure. It shows all discovered entities with their health status, insights, and key metrics in a unified view, helping you identify what needs attention.

What is the entity catalog?

The entity catalog automatically displays every service, Pod, node, database, and infrastructure component the knowledge graph discovers from your telemetry data. Each entity appears with:

Health indicators - Insight rings showing critical, warning, and informational issues
Key metrics - Context-sensitive columns showing RED metrics for services or infrastructure metrics for nodes
Sparklines - Trend visualizations for quick pattern recognition
Entity properties - Cluster, namespace, version, and other metadata

No manual inventory work is required. The catalog stays current automatically as your environment changes.

When to use the entity catalog

Use the entity catalog to:

Monitor selected services - Filter to your team’s services and bookmark the view for daily monitoring
Identify unhealthy entities - See at a glance which services or infrastructure components have critical insights
Explore your environment - Discover what’s running across clusters, namespaces, and environments
Start troubleshooting - Click any entity to investigate metrics, logs, and traces

The entity catalog is designed for rapid situational awareness across large environments. Instead of jumping between dashboards or writing queries, you get a comprehensive health view in one place.

Open the entity catalog

From Grafana Cloud, navigate to Observability > Entity catalog.

Entity types and their metrics

The entity catalog shows different information depending on entity type. Each type displays metrics relevant to monitoring that component.

Services

Services show RED (Rate, Errors, Duration) metrics:

Request rate - Requests per second with sparkline trend
Error ratio - Percentage of failed requests
Latency (P95) - 95th percentile response time

Services are discovered from OpenTelemetry instrumentation, Istio service mesh data, or other APM sources. A service represents a logical application component that handles requests.
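To make the three service columns concrete, the following minimal Python sketch derives them from a list of raw requests observed in one time window. It is an illustration only: the `Request` class and `red_metrics` function are hypothetical names, and the catalog itself derives these values from your telemetry rather than from code like this. The nearest-rank percentile method is also an assumption chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical shape, for illustration only)."""
    duration_ms: float
    failed: bool


def red_metrics(requests: list[Request], window_seconds: float) -> dict[str, float]:
    """Compute request rate, error ratio, and nearest-rank P95 latency for one window."""
    if not requests:
        return {"request_rate": 0.0, "error_ratio": 0.0, "latency_p95_ms": 0.0}

    total = len(requests)
    failures = sum(1 for r in requests if r.failed)
    durations = sorted(r.duration_ms for r in requests)

    # Nearest-rank P95: the smallest sample with at least 95% of samples
    # at or below it. Integer arithmetic avoids floating-point rounding.
    rank = (95 * total + 99) // 100

    return {
        "request_rate": total / window_seconds,   # requests per second
        "error_ratio": 100.0 * failures / total,  # percentage of failed requests
        "latency_p95_ms": durations[rank - 1],
    }


# 20 requests over a 10-second window; every tenth request fails.
sample = [Request(duration_ms=10.0 * i, failed=(i % 10 == 0)) for i in range(1, 21)]
metrics = red_metrics(sample, window_seconds=10.0)
print(metrics)
```

With this sample, the request rate is 2 requests per second, the error ratio is 10 percent, and the P95 latency is 190 ms.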

Service instances

Service instances represent individual running instances of a service. They show:

Request rate - Requests per second for this specific instance
Error ratio - Instance-specific error percentage
Latency (P95) - Response time for this instance

Service instances help you identify when a specific instance behaves differently from its peers, which is useful for troubleshooting instance-specific issues such as memory leaks or configuration problems.
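As a rough illustration of what "behaving differently" can mean, the sketch below flags instances whose P95 latency is well above the median of all instances of the same service. The function name, the instance names, and the factor of 2 are all assumptions for this example. This is not how the catalog evaluates instances, only one simple way to reason about the values you see in the instance rows.

```python
from statistics import median


def outlier_instances(latency_p95_ms: dict[str, float], factor: float = 2.0) -> list[str]:
    """Return instances whose P95 latency exceeds `factor` times the median of all instances."""
    if len(latency_p95_ms) < 2:
        return []  # nothing to compare against
    baseline = median(latency_p95_ms.values())
    return sorted(name for name, value in latency_p95_ms.items() if value > factor * baseline)


# Hypothetical P95 latencies (ms) for four instances of one service.
instances = {
    "checkout-0": 120.0,
    "checkout-1": 115.0,
    "checkout-2": 480.0,
    "checkout-3": 130.0,
}
suspects = outlier_instances(instances)
print(suspects)
```

Here the median is 125 ms, so only `checkout-2` exceeds the 250 ms threshold.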

Pods

Kubernetes Pods show infrastructure metrics:

CPU usage - Current CPU consumption
Memory usage - Current memory consumption
Restart count - How many times the Pod has restarted

Pods are the smallest deployable units in Kubernetes and typically run one or more containers.

Nodes

Kubernetes nodes show cluster-level infrastructure metrics:

CPU capacity - Total and available CPU
Memory capacity - Total and available memory
Pod count - Number of Pods running on the node

Nodes are the worker machines (virtual or physical) that run your containerized workloads.

Node groups

Node groups represent collections of nodes with similar characteristics (for example, same instance type or availability zone):

Total nodes - Number of nodes in the group
CPU capacity - Aggregate CPU across all nodes
Memory capacity - Aggregate memory across all nodes

Node groups are useful for monitoring auto-scaling groups or node pools in managed Kubernetes services.

KubeClusters

Kubernetes clusters show high-level metrics:

Node count - Total number of nodes
Pod count - Total number of Pods across all nodes
Namespace count - Number of namespaces in the cluster

Clusters provide the highest-level view of your Kubernetes environment’s health and capacity.

Namespaces

Kubernetes namespaces show resource usage within a logical grouping:

Pod count - Number of Pods in this namespace
CPU usage - Aggregate CPU consumption
Memory usage - Aggregate memory consumption

Namespaces typically represent teams, environments, or applications and help you understand resource consumption by organizational unit.

Databases

Database entities show database-specific metrics:

Connection count - Active database connections
Query rate - Queries per second
Latency - Query response time

Database entities are discovered from database exporters or instrumentation and help you monitor database health and performance.

Understand insight rings

Each entity displays two colored rings showing its health status:

Outer ring - Insights directly on this entity:

Red - Critical issues requiring immediate attention
Yellow - Warnings that may need investigation
Blue - Informational insights like configuration changes

Inner ring - Insights propagated from entities in hierarchical relationships:

Shows the most severe insight from child or controlled entities
Helps identify when problems exist in dependent components

The insight rings provide instant visual feedback about entity health without requiring you to open each entity.
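The relationship between the two rings can be sketched as a small tree walk. In this hypothetical Python model, the outer ring lists the colors of insights attached directly to an entity, and the inner ring takes the most severe insight found on any child entity. The `Entity` class, the `Severity` enum, and the recursive propagation through all descendants are assumptions made for illustration. The knowledge graph's actual propagation rules may differ.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Iterator, Optional


class Severity(IntEnum):
    INFO = 1      # blue
    WARNING = 2   # yellow
    CRITICAL = 3  # red


COLORS = {Severity.INFO: "blue", Severity.WARNING: "yellow", Severity.CRITICAL: "red"}


@dataclass
class Entity:
    """Hypothetical entity with its own insights and child entities."""
    name: str
    insights: list[Severity] = field(default_factory=list)
    children: list["Entity"] = field(default_factory=list)


def descendant_insights(entity: Entity) -> Iterator[Severity]:
    """Yield every insight found on child entities, recursively."""
    for child in entity.children:
        yield from child.insights
        yield from descendant_insights(child)


def ring_colors(entity: Entity) -> dict[str, object]:
    """Outer ring: colors of direct insights. Inner ring: most severe propagated insight."""
    outer = [COLORS[s] for s in sorted(set(entity.insights), reverse=True)]
    inner: Optional[Severity] = max(descendant_insights(entity), default=None)
    return {"outer": outer, "inner": COLORS[inner] if inner is not None else None}


namespace = Entity(
    "payments",
    insights=[Severity.INFO],
    children=[
        Entity("pod-a", [Severity.WARNING]),
        Entity("pod-b", [Severity.CRITICAL]),
    ],
)
rings = ring_colors(namespace)
print(rings)
```

In this example the namespace has only an informational insight of its own, so its outer ring is blue, while the critical insight on `pod-b` turns the inner ring red.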

Filter and search

Use filters and search to narrow the entity catalog to the entities you want to view.

Filter by entity type

Click the entity type dropdown and select one or more of the following types:

Service
ServiceInstance
Pod
Node
NodeGroup
KubeCluster
Namespace
Database

This allows you to focus on a specific layer of your infrastructure or application stack.

Filter by insights

Show only entities with specific insight categories:

Saturation - Resource limits being approached
Amend - Recent configuration changes
Anomaly - Unusual traffic patterns
Failure - Critical system failures
Error - Request errors or latency breaches

This is especially useful for focusing on entities that need immediate attention.

Filter by properties

Filter by entity properties such as:

Cluster or region
Namespace or environment
Version or deployment
Any label attached to your entities

Search

Use the search bar to find entities by name. Search supports partial matches and is case-insensitive.
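The way these filters narrow the list can be modeled in a few lines of Python. The sketch below is a mental model, not the catalog's implementation: `CatalogEntry` and `filter_entries` are hypothetical names, and it assumes that the different filters combine with AND while multiple selections within the type or insight filter match any of the selected values. The search term is matched as a case-insensitive substring of the entity name, as described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogEntry:
    """Hypothetical catalog row, for illustration only."""
    name: str
    entity_type: str
    insights: set[str] = field(default_factory=set)
    properties: dict[str, str] = field(default_factory=dict)


def filter_entries(
    entries: list[CatalogEntry],
    types: Optional[set[str]] = None,
    insights: Optional[set[str]] = None,
    properties: Optional[dict[str, str]] = None,
    search: str = "",
) -> list[CatalogEntry]:
    """Keep entries that satisfy every filter that was supplied."""
    needle = search.casefold()
    result = []
    for entry in entries:
        if types and entry.entity_type not in types:
            continue  # entity type not selected
        if insights and not (entry.insights & insights):
            continue  # none of the selected insight categories present
        if properties and any(entry.properties.get(k) != v for k, v in properties.items()):
            continue  # at least one property value differs
        if needle and needle not in entry.name.casefold():
            continue  # partial, case-insensitive name match failed
        result.append(entry)
    return result


entries = [
    CatalogEntry("checkout", "Service", {"Error"}, {"cluster": "prod-eu"}),
    CatalogEntry("checkout-db", "Database", {"Saturation"}, {"cluster": "prod-eu"}),
    CatalogEntry("Checkout-worker", "Service", set(), {"cluster": "staging"}),
]

services = filter_entries(entries, types={"Service"}, search="CHECK")
failing_in_prod = filter_entries(
    entries, insights={"Error", "Failure"}, properties={"cluster": "prod-eu"}
)
print([e.name for e in services], [e.name for e in failing_in_prod])
```

The first call returns both services because the search ignores case, and the second returns only `checkout`, the one production entity with an Error insight.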

Bookmark filtered views

After applying filters, bookmark the URL to create custom monitoring views for your team. For example, bookmark a view showing only your production services in a specific region.

Sparklines and trends

Each metric column includes a sparkline, which is a small line chart showing the metric’s trend over the selected time range. Sparklines help you quickly spot:

Spikes - Sudden increases in latency or error rate
Drops - Traffic disappearing or services going down
Patterns - Regular oscillations or daily patterns
Stability - Flat lines indicating steady state

Hover over a sparkline to see exact values at specific times.
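To show how a sparkline compresses a series into a shape you can read at a glance, the following sketch renders a list of values as a row of Unicode block characters. This is a text approximation written for this page, not the catalog's rendering code, and the `sparkline` function is a hypothetical name.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values: list[float]) -> str:
    """Render values as block characters scaled between the series minimum and maximum."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BLOCKS[0] * len(values)  # flat line: steady state
    scale = (len(BLOCKS) - 1) / (hi - lo)
    return "".join(BLOCKS[round((v - lo) * scale)] for v in values)


steady = sparkline([5, 5, 5, 5, 5])
spike = sparkline([1, 1, 1, 8, 1])
print(steady, spike)
```

A steady metric renders as a flat row of low blocks, while a single spike stands out as one tall block among low ones. Because the scale is relative to each series, the shape shows the trend rather than absolute values, which is why hovering for exact values remains useful.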

Used in these workflows

The entity catalog is the foundation for multiple use cases:

Monitor selected services - Create focused views for your team’s services
Identify unhealthy infrastructure - Find infrastructure components with critical issues
Explore service dependencies - Start dependency analysis from the entity list
Investigate incidents - Add entities to the RCA workbench for incident investigation