Grafana Cloud

Introduction to the knowledge graph

The knowledge graph is a troubleshooting and observability technology that helps you quickly find the root cause of application and infrastructure issues without needing to write queries or jump across dashboards. It connects metrics, logs, and traces into a single workflow so you can resolve problems faster.

As a result, your on-call team is no longer overwhelmed by noisy, irrelevant alerts that are difficult to manage and quickly become outdated.

Why use the knowledge graph?

  • Faster troubleshooting: Automatically surfaces anomalies and failures
  • Beginner-friendly: No PromQL expertise required
  • Context-rich: See the relationships between services, infrastructure, and applications in one place
  • Smarter insights: Go beyond symptoms and get to causes

How the knowledge graph works

The knowledge graph collects information from your telemetry data sources and uses it to build a visual representation of your application and infrastructure components. It then organizes and indexes this representation so that you can search it and see how the components fit together in real time.

The knowledge graph curates knowledge of common runtime failure patterns and potential causes, so your team doesn’t have to research and maintain these rules.

Fundamentals

The following concepts are key to understanding how the knowledge graph works:

Entities

Entities represent objects and their properties within your environment, such as services, nodes, or Pods. Refer to Entities and relationships.
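
To make the idea of "objects and their properties" concrete, the following is a minimal sketch of how an entity could be modeled in code. It is an illustration only and does not reflect the knowledge graph's internal schema or any Grafana API: the `Entity` class, its field names, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entity:
    """Hypothetical entity: a typed, named object with free-form properties."""

    entity_type: str  # for example "Service", "Node", or "Pod"
    name: str
    properties: Dict[str, str] = field(default_factory=dict)

    def key(self) -> str:
        # Combine type and name into one identifier (an assumed convention).
        return f"{self.entity_type}:{self.name}"


# Sample entities with made-up names and properties.
checkout = Entity("Service", "checkout", {"namespace": "shop"})
pod = Entity("Pod", "checkout-7d9f", {"node": "node-1"})

print(checkout.key())
print(pod.properties["node"])
```

Keeping the type and name separate from the property map mirrors the description above: the type and name identify the object, and the properties describe it.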

Entity catalog

The entity catalog provides a comprehensive list view of the elements that are automatically monitored by the knowledge graph.

Entity graph

The entity graph provides a visual representation of the relationships between infrastructure and application components in your IT environment.
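
As a rough illustration of what "relationships between components" means in practice, the sketch below stores relationships as a directed adjacency list and walks it breadth-first to list everything a given component depends on. The entity names, the `EDGES` mapping, and the `dependencies_of` helper are hypothetical and are not part of the product.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical relationships: each key calls or runs on the entities it maps to.
EDGES: Dict[str, List[str]] = {
    "Service:frontend": ["Service:checkout"],
    "Service:checkout": ["Service:payments", "Pod:checkout-7d9f"],
    "Pod:checkout-7d9f": ["Node:node-1"],
    "Service:payments": [],
    "Node:node-1": [],
}


def dependencies_of(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Return every entity reachable from `start`, excluding `start` itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in edges.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}


print(sorted(dependencies_of("Service:checkout", EDGES)))
```

A traversal like this is one way to answer "what could be affecting this service?": any entity it reaches is a candidate to inspect when the starting service misbehaves.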

Insights

Insights continuously track resource saturation; amends (changes such as deployments and scale events); request, resource, and latency anomalies; systemic failures; and errors across your golden signals and health metrics.
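
The categories listed above can be pictured as labels attached to a signal when it crosses some condition. The sketch below is a deliberately simplified assumption of how two of those labels could be assigned: a fixed-ratio check for saturation and a deviation-from-average check for anomalies. The thresholds, function names, and the `InsightCategory` enum are invented for illustration and are not the rules the knowledge graph actually applies.

```python
from enum import Enum
from typing import List, Optional


class InsightCategory(Enum):
    """Hypothetical labels mirroring the categories described in the text."""

    SATURATION = "saturation"
    AMEND = "amend"
    ANOMALY = "anomaly"
    FAILURE = "failure"
    ERROR = "error"


def check_saturation(
    used: float, limit: float, threshold: float = 0.9
) -> Optional[InsightCategory]:
    # Flag saturation when usage reaches an assumed fraction of the limit.
    if limit > 0 and used / limit >= threshold:
        return InsightCategory.SATURATION
    return None


def check_anomaly(
    value: float, baseline: List[float], tolerance: float = 0.5
) -> Optional[InsightCategory]:
    # Flag an anomaly when the value deviates from the baseline mean by more
    # than an assumed relative tolerance.
    if not baseline:
        return None
    mean = sum(baseline) / len(baseline)
    if abs(value - mean) > tolerance * abs(mean):
        return InsightCategory.ANOMALY
    return None


print(check_saturation(used=95, limit=100))
print(check_anomaly(value=300, baseline=[100, 110, 90]))
```

Real anomaly detection typically accounts for seasonality and variance; the fixed tolerance here only shows the shape of the idea.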

RCA workbench

The RCA workbench lets you explore the potential causes of a specific issue across time and across dependencies, with access to the relevant metrics, logs, and traces for efficient troubleshooting and analysis.

Workbench AI

Workbench AI helps you form hypotheses based on root cause analysis and provides recommended steps for solving performance issues.