Manage availability
The Availability section on Kubernetes Overview answers one question: is your infrastructure currently able to serve user traffic? It flags things that exist on paper but aren’t actually available.
Availability checks identify workloads and nodes that are down or unable to serve traffic.

Click View detail on any tile to see the affected items listed under Detail view at the bottom of the page.
Zero replica deployments
These are deployments that are configured to run at least one replica but currently have zero available replicas, so the workload is fully down. Deployments intentionally scaled to zero are excluded.
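The rule above can be sketched as a small check against a Deployment object as returned by the Kubernetes API in JSON form. This is only an illustration under stated assumptions, not how the Availability tile is computed: the function name is hypothetical, and the input is assumed to be a plain dictionary with the standard spec.replicas and status.availableReplicas fields.

```python
def is_zero_replica_deployment(deployment: dict) -> bool:
    """Return True when at least one replica is desired but none is available.

    Hypothetical helper; `deployment` is assumed to be the JSON form of a
    Kubernetes Deployment object.
    """
    # Kubernetes defaults spec.replicas to 1 when the field is omitted.
    desired = deployment.get("spec", {}).get("replicas")
    if desired is None:
        desired = 1
    # status.availableReplicas is typically absent when it is zero.
    available = deployment.get("status", {}).get("availableReplicas") or 0
    # Deployments scaled to zero on purpose (desired == 0) are not flagged.
    return desired >= 1 and available == 0


down = {"spec": {"replicas": 3}, "status": {}}
scaled_to_zero = {"spec": {"replicas": 0}, "status": {}}
healthy = {"spec": {"replicas": 2}, "status": {"availableReplicas": 2}}

flagged = [is_zero_replica_deployment(d) for d in (down, scaled_to_zero, healthy)]
```

Only the first example is flagged: it wants three replicas and reports none available, while the second was scaled to zero deliberately.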
Deployment rollout issues
These are deployments whose rollout has one of these conditions:
Not Progressing means the deployment controller has not made progress within the deadline.
Replica Failure means at least one replica Pod could not be created or deleted.
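As a rough sketch, the two conditions can be read from the status.conditions list of a Deployment object in its Kubernetes API JSON form. The function name and the returned labels are illustrative assumptions. The mapping relies on standard Kubernetes behavior: a stalled rollout is reported as a Progressing condition with status False, and a replica failure as a ReplicaFailure condition with status True.

```python
from __future__ import annotations


def rollout_issues(deployment: dict) -> list[str]:
    """Return the rollout problems reported in a Deployment's conditions.

    Hypothetical helper; `deployment` is assumed to be the JSON form of a
    Kubernetes Deployment object.
    """
    issues = []
    for cond in deployment.get("status", {}).get("conditions", []):
        ctype, cstatus = cond.get("type"), cond.get("status")
        if ctype == "Progressing" and cstatus == "False":
            # The controller made no progress within the deadline.
            issues.append("Not Progressing")
        elif ctype == "ReplicaFailure" and cstatus == "True":
            # At least one replica Pod could not be created or deleted.
            issues.append("Replica Failure")
    return issues


stalled = {
    "status": {
        "conditions": [
            {"type": "Available", "status": "True"},
            {"type": "Progressing", "status": "False",
             "reason": "ProgressDeadlineExceeded"},
        ]
    }
}
stalled_issues = rollout_issues(stalled)
```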
Nodes not ready
These are Nodes where the Ready condition is False or Unknown. A NotReady node prevents new Pods from being scheduled and may disrupt running workloads. The Status column distinguishes a confirmed NotReady state from a transient Unknown state (meaning the node is unreachable).
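The distinction between a confirmed NotReady state and an Unknown state can be illustrated with a minimal sketch that reads the Ready condition from a Node object in its Kubernetes API JSON form. The function name is hypothetical, and treating a Node with no Ready condition at all as Unknown is an assumption made for this example.

```python
def node_ready_state(node: dict) -> str:
    """Map a Node's Ready condition to Ready, NotReady, or Unknown.

    Hypothetical helper; `node` is assumed to be the JSON form of a
    Kubernetes Node object.
    """
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            if cond.get("status") == "True":
                return "Ready"
            if cond.get("status") == "False":
                return "NotReady"
            # Status "Unknown": the control plane has lost contact with the Node.
            return "Unknown"
    # Assumption: a missing Ready condition is treated as Unknown.
    return "Unknown"


unreachable = {"status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}}
confirmed_down = {"status": {"conditions": [{"type": "Ready", "status": "False"}]}}
states = [node_ready_state(unreachable), node_ready_state(confirmed_down)]
```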
Possible causes: kubelet crash or failure to report status; Node running out of memory, disk, or PIDs; network connectivity loss between the Node and the control plane; underlying VM or hardware failure; expired Node certificates; kernel or OS-level crash.
What to do: check kubelet logs and Node events. Restart the kubelet, free up Node resources, restore network connectivity, renew certificates, or replace the failed Node.
Pods not ready
These are Pods in the Running phase that are failing their readiness probe. They are excluded from Service endpoints and are not receiving traffic.
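A minimal sketch of this check, assuming a Pod object in its Kubernetes API JSON form, looks at status.phase and the Ready condition. The function name is hypothetical. The Ready condition can also be False for reasons other than a failing readiness probe (for example, unmet readiness gates), and this sketch does not tell those cases apart.

```python
def is_running_but_not_ready(pod: dict) -> bool:
    """Return True for a Pod in the Running phase whose Ready condition is not True.

    Hypothetical helper; `pod` is assumed to be the JSON form of a
    Kubernetes Pod object.
    """
    status = pod.get("status", {})
    if status.get("phase") != "Running":
        # Pending, Succeeded, or Failed Pods are outside the scope of this check.
        return False
    for cond in status.get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") != "True"
    # Assumption: a Running Pod with no Ready condition is treated as not ready.
    return True


failing = {
    "status": {
        "phase": "Running",
        "conditions": [{"type": "Ready", "status": "False",
                        "reason": "ContainersNotReady"}],
    }
}
failing_flagged = is_running_but_not_ready(failing)
```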