Troubleshoot data issues

This topic provides guidance for troubleshooting data issues in the knowledge graph.

Required metrics and labels

If the knowledge graph isn’t discovering entities, or if your dashboards show empty panels, Grafana Cloud Adaptive Metrics might be dropping or aggregating metrics or labels that the knowledge graph needs. If Adaptive Metrics affects any of the required metrics, exclude those metrics from Adaptive Metrics. To learn how to do this, refer to Recommendation exemptions.

Application Observability required metrics and labels

For an overview of the metrics and labels necessary for the knowledge graph to monitor your environment when using Application Observability, refer to Application Observability required metrics and labels. If the labels are present but issues persist, open a support ticket for further assistance.

For more information on how to send traces_host_info, refer to Host-hours pricing.

Kubernetes metrics

The following table lists the metrics and labels necessary for the knowledge graph to monitor your Kubernetes environment. If the labels are present but issues persist, open a support ticket for further assistance.

Metric name	Required labels
kube_pod_info	cluster, namespace, node, pod
kube_pod_owner	cluster, namespace, node, owner_kind, owner_name
kube_pod_container_resource_requests	cluster, namespace, pod, container, resource
kube_pod_status_phase	cluster, namespace, pod, phase
kube_replicaset_owner	cluster, namespace, replicaset, owner_name, owner_kind
kube_pod_container_info	cluster, namespace, container, image_id
kube_pod_container_resource_limits	cluster, namespace, pod, container, resource
kube_configmap_metadata_resource_version	cluster, namespace, configmap
kube_secret_metadata_resource_version	cluster, namespace, secret
kube_deployment_metadata_generation	cluster, namespace, deployment or statefulset or daemonset
kube_node_info	cluster, node
kubelet_node_name	cluster, node, instance
kube_node_status_allocatable	cluster, node, resource
kube_node_labels (AWS)	label_beta_kubernetes_io_instance_type, and label_eks_amazonaws_com_nodegroup or
	label_karpenter_sh_nodepool or
	label_alpha_eksctl_io_cluster_name, label_alpha_eksctl_io_nodegroup_name or
	label_ec2_amazonaws_com_Name, label_ec2_amazonaws_com_aws_autoscaling_groupName or
	label_ec2_amazonaws_com_name, label_ec2_amazonaws_com_aws_autoscaling_group_name or
	label_k8s_io_cloud_provider_aws
kube_node_labels (GCP)	label_node_kubernetes_io_instance_type, label_cluster_name, label_cloud_google_com_gke_nodepool
kube_node_labels (Azure)	label_agentpool, label_kubernetes_azure_com_cluster
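
One way to check a series against the table above is to compare its label set with the required labels. The following is a minimal sketch rather than a supported tool: the function name missing_labels and the sample series are hypothetical, only a few rows of the table are included, and treating an empty label value as missing is an assumption based on how Prometheus handles empty labels.

```python
# Required labels for a few metrics, copied from the table above (subset only).
REQUIRED_LABELS = {
    "kube_pod_info": {"cluster", "namespace", "node", "pod"},
    "kube_pod_status_phase": {"cluster", "namespace", "pod", "phase"},
    "kube_node_info": {"cluster", "node"},
}


def missing_labels(metric_name, series_labels):
    """Return the required labels that one series lacks, sorted by name.

    series_labels maps label name -> value for a single series.
    Assumption: a label with an empty value counts as missing.
    Metrics that are not in REQUIRED_LABELS yield an empty list.
    """
    required = REQUIRED_LABELS.get(metric_name, set())
    present = {name for name, value in series_labels.items() if value}
    return sorted(required - present)


if __name__ == "__main__":
    # Hypothetical series in which the node label was dropped.
    series = {"cluster": "prod", "namespace": "shop", "pod": "cart-0"}
    print(missing_labels("kube_pod_info", series))
```

If the function reports missing labels for a required metric, Adaptive Metrics aggregation is one possible cause worth checking.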

Container resource utilization observability

The following table lists metrics and labels required for Kubernetes container resource utilization observability.

Metric name	Required labels
container_cpu_cfs_throttled_periods_total	cluster, namespace, pod, container, node
container_cpu_cfs_periods_total	cluster, namespace, pod, container, node
container_memory_working_set_bytes	cluster, namespace, pod, container, node
container_memory_usage_bytes	cluster, namespace, pod, container, node
container_memory_cache	cluster, namespace, pod, container, node

RED metrics troubleshooting

For the knowledge graph to associate the RED metrics with the Kubernetes entities it identifies, the entities must have labels that specify their source. For instance, span metrics require labels such as k8s.namespace.name, k8s.cluster.name, and k8s.pod.name.

You can use the Kubernetes Attributes Processor to assign these labels. Make sure you follow the Kubernetes monitoring recommendations.
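
When you inspect span metrics in a Prometheus-compatible store, the attributes usually appear under normalized label names. The following is a minimal sketch, assuming that characters outside letters, digits, and underscores are replaced with underscores (so k8s.namespace.name becomes k8s_namespace_name). Verify the actual label names in your own metrics; the function names here are hypothetical.

```python
import re

# Span attributes named in this section as examples of source labels.
REQUIRED_SPAN_ATTRIBUTES = ["k8s.namespace.name", "k8s.cluster.name", "k8s.pod.name"]


def to_label_name(attribute):
    """Normalize an attribute name to a Prometheus-style label name.

    Assumption: every character outside [a-zA-Z0-9_] becomes an underscore.
    """
    return re.sub(r"[^a-zA-Z0-9_]", "_", attribute)


def missing_k8s_labels(series_labels):
    """Return the normalized label names that are absent or empty in one series."""
    expected = [to_label_name(attribute) for attribute in REQUIRED_SPAN_ATTRIBUTES]
    return [name for name in expected if not series_labels.get(name)]


if __name__ == "__main__":
    # Hypothetical span-metrics series without a cluster label.
    series = {"k8s_namespace_name": "shop", "k8s_pod_name": "cart-0"}
    print(missing_k8s_labels(series))
```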

If you still encounter problems, submit a support ticket for further assistance.

Prometheus troubleshooting

In addition to using Grafana Cloud Application Observability or Grafana Cloud Kubernetes Monitoring, you might use Prometheus to scrape some metrics. If you do, follow these guidelines so that the knowledge graph works correctly.

If you use a single Prometheus job to scrape multiple entities, it can create the following issues:

- The knowledge graph might not be able to detect all your entities.
- RED metrics might not get associated with entities.
- RED metrics might get aggregated across workloads that share the same job.

To avoid these issues, make your entities easily identifiable by using one of the following methods:
- Use one job per service instead of a single job that scrapes multiple services.
- Identify your entities by adding a service label to your metrics.
If you use annotation-based Kubernetes service discovery in your Prometheus configuration, you can use the following relabeling rule to derive the service label from the pod name:

source_labels: [__meta_kubernetes_pod_name]
regex: ^(.*?)([-][a-zA-Z0-9]{5,10}(-[a-zA-Z0-9]{5})?|-[0-9]+)?$
target_label: service
replacement: $1
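
To see what this rule produces before you deploy it, you can try the same regular expression against sample pod names. The following is a minimal sketch that uses Python's re module in place of the Prometheus regex engine, which anchors the whole expression. The function name service_from_pod and the pod names are hypothetical.

```python
import re

# Same expression as in the relabeling rule above.
POD_NAME_PATTERN = re.compile(
    r"^(.*?)([-][a-zA-Z0-9]{5,10}(-[a-zA-Z0-9]{5})?|-[0-9]+)?$"
)


def service_from_pod(pod_name):
    """Return capture group 1, which the rule writes to the service label ($1).

    Returns None when the expression does not match, in which case the
    relabeling rule would leave the target label unchanged.
    """
    match = POD_NAME_PATTERN.fullmatch(pod_name)
    return match.group(1) if match else None


if __name__ == "__main__":
    for pod in ("checkout-7d9f8b6c5d-x2k4q", "db-0", "web-server-abcde"):
        print(pod, "->", service_from_pod(pod))
```

In this sketch, a Deployment-style name such as checkout-7d9f8b6c5d-x2k4q yields checkout, and a StatefulSet-style name such as db-0 yields db. A name such as web-server-abcde yields web, because the segment server also fits the suffix pattern, so check the result against your own naming scheme.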

AWS troubleshooting

The following table lists the metrics and labels necessary for the knowledge graph to build AWS RDS entities and relationships. These metrics are generated from span metrics sources and help identify relationships to AWS RDS instances by matching *.rds.amazonaws.com hostname patterns in the required labels.

Metric name	Required labels
traces_service_graph_request_client_seconds_count	client_server_address
traces_service_graph_request_client_seconds_count	server
traces_span_metrics_calls_total	server_address
traces_span_metrics_calls_total	net_peer_name
traces_spanmetrics_calls_total	net_peer_name
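
To check whether a series carries an RDS hostname in one of these labels, you can apply the *.rds.amazonaws.com pattern yourself. The following is a minimal sketch, not the knowledge graph's actual matching logic: the function name rds_label_matches and the sample hostname are hypothetical, and the case-insensitive comparison is an assumption.

```python
from fnmatch import fnmatchcase

# Hostname pattern and label names taken from this section.
RDS_PATTERN = "*.rds.amazonaws.com"
RDS_LABELS = ("client_server_address", "server", "server_address", "net_peer_name")


def rds_label_matches(series_labels):
    """Return the labels of one series whose value matches the RDS pattern.

    Assumption: hostnames are compared case-insensitively.
    """
    matches = {}
    for label in RDS_LABELS:
        value = series_labels.get(label, "")
        if fnmatchcase(value.lower(), RDS_PATTERN):
            matches[label] = value
    return matches


if __name__ == "__main__":
    # Hypothetical series that points at an RDS endpoint.
    series = {"server_address": "orders.abc123.us-east-1.rds.amazonaws.com"}
    print(rds_label_matches(series))
```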