How the knowledge graph determines causality between events
The system logs insights based on statistical anomalies (such as seasonality), multi-window error rates (5-minute, 30-minute, 1-hour, and 6-hour windows), system saturation, failures, and changes (such as new deployments and configuration changes). The knowledge graph uses the entity graph, together with correlation in time and space, to determine and display potential cause-and-effect relationships between insights.
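
As a rough illustration of what a multi-window error rate means, the following minimal sketch computes one error rate per trailing window. Only the window lengths come from the text above. The function name, the `(timestamp, is_error)` input shape, and the choice to return `None` for an empty window are assumptions made for this example, not a description of how the product computes its insights.

```python
from datetime import datetime, timedelta

# Window lengths named in the text; everything else here is illustrative.
WINDOWS = {
    "5m": timedelta(minutes=5),
    "30m": timedelta(minutes=30),
    "1h": timedelta(hours=1),
    "6h": timedelta(hours=6),
}


def multi_window_error_rates(requests, now):
    """Return {window_name: error_rate} for each trailing window ending at `now`.

    `requests` is an iterable of (timestamp, is_error) pairs (hypothetical shape).
    A window that contains no requests maps to None instead of a made-up rate.
    """
    requests = list(requests)
    rates = {}
    for name, length in WINDOWS.items():
        start = now - length
        # Keep only the requests that fall inside this trailing window.
        in_window = [is_error for ts, is_error in requests if start < ts <= now]
        rates[name] = sum(in_window) / len(in_window) if in_window else None
    return rates
```

Comparing a short window against a long one is one way to tell a sudden spike from a slow, sustained degradation: a spike shows up in the 5-minute rate first, while a long-running problem also raises the 6-hour rate.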
The following chain of events occurs when an alert fires:
- The knowledge graph performs a reverse lookup on the labels in the alert to find the target entity in the entity graph.
- The knowledge graph traverses the entity graph from the target entity to infer its first-, second-, third-, and fourth-degree connections and their corresponding insights.
  - First-degree connection examples: Service - Service, Service - Pod
  - Second-degree connection examples: Service - Pod - Node, Producer - Topic - Consumer
  - Third- and fourth-degree connection example: Service - (Container) - Node - (Container) - Service
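
The two steps above can be sketched as a label-based reverse lookup followed by a breadth-first traversal limited to four hops. This is a minimal sketch under stated assumptions, not the product's implementation: the entity names, the label sets, the `INSIGHTS` table, and all function names are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical entity graph: entity -> directly connected entities.
# It mirrors the Service - Pod - Node - Pod - Service chain from the examples.
ENTITY_GRAPH = {
    "Service:checkout": ["Pod:checkout-1"],
    "Pod:checkout-1": ["Service:checkout", "Node:node-a"],
    "Node:node-a": ["Pod:checkout-1", "Pod:payments-1"],
    "Pod:payments-1": ["Node:node-a", "Service:payments"],
    "Service:payments": ["Pod:payments-1"],
}

# Hypothetical identifying labels per entity, used for the reverse lookup.
ENTITY_LABELS = {
    "Service:checkout": {"job": "checkout", "namespace": "shop"},
    "Service:payments": {"job": "payments", "namespace": "shop"},
}

# Hypothetical insights currently active on some entities.
INSIGHTS = {
    "Node:node-a": ["saturation"],
    "Service:payments": ["error rate"],
}


def find_target_entity(alert_labels):
    """Reverse lookup: first entity whose identifying labels all appear in the alert."""
    for entity, labels in ENTITY_LABELS.items():
        if labels.items() <= alert_labels.items():
            return entity
    return None


def connections_by_degree(graph, start, max_degree=4):
    """Breadth-first traversal returning {degree: [entities]} up to max_degree hops."""
    seen = {start}
    queue = deque([(start, 0)])
    by_degree = {}
    while queue:
        entity, degree = queue.popleft()
        if degree == max_degree:
            continue  # do not expand beyond the fourth degree
        for neighbour in graph.get(entity, []):
            if neighbour not in seen:
                seen.add(neighbour)
                by_degree.setdefault(degree + 1, []).append(neighbour)
                queue.append((neighbour, degree + 1))
    return by_degree


def related_insights(alert_labels):
    """Map each connected entity that has insights to (degree, insights)."""
    target = find_target_entity(alert_labels)
    if target is None:
        return {}
    found = {}
    for degree, entities in connections_by_degree(ENTITY_GRAPH, target).items():
        for entity in entities:
            if entity in INSIGHTS:
                found[entity] = (degree, INSIGHTS[entity])
    return found
```

For an alert labeled `{"alertname": "HighErrorRate", "job": "checkout", "namespace": "shop"}`, this sketch resolves the target to `Service:checkout` and then surfaces the insight on the node at the second degree and the insight on the other service at the fourth degree. A real system would also filter these candidates by time before presenting them as potential causes.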
The knowledge graph includes built-in knowledge of failure patterns for many popular, and some less common, components and runtimes. For more information about insights, refer to Insights categories.
You can use code or the user interface to add rules for new entities, relations, insights, and inferences.