Monitor non-standard workloads
Kubernetes ships with a handful of built-in workload controllers: Deployment, StatefulSet, DaemonSet, Job, and CronJob. Anything that schedules Pods through a different mechanism is “non-standard”: either a custom resource definition (CRD) managed by an operator, or Pods created with no owner reference at all.
Non-standard workloads matter for fleet visibility for two reasons:
- Blind spots in rollup views. Most dashboards and tools aggregate by well-known owner kinds. A Pod owned by an argo.argoproj.io/Rollout or kafka.strimzi.io/KafkaNodePool doesn’t appear in the Deployment or StatefulSet summary, leaving capacity, restart counts, and failure signals invisible at the fleet level.
- Different failure modes. Each controller has its own readiness model and update strategy. A canary Rollout failing its analysis looks nothing like a crashing Deployment replica. The signals and remediation paths differ.
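The distinction between built-in controllers, custom resources, and ownerless Pods can be sketched as a small classifier. This is a minimal illustration, not how Kubernetes Monitoring assigns its Type column: the function name `classify_workload`, the three return values, and the assumption that the caller already resolved the Pod's top-level owner are all hypothetical.

```python
# Built-in controller kinds named in this page, keyed by API group.
STANDARD_KINDS = {
    "apps": {"Deployment", "StatefulSet", "DaemonSet"},
    "batch": {"Job", "CronJob"},
}


def classify_workload(top_owner):
    """Classify a Pod by its top-level owner reference.

    top_owner is a dict with "apiVersion" and "kind" (the shape of a
    Kubernetes ownerReference), or None when the Pod has no owner.
    The labels "standard", "custom", and "unowned" are illustrative.
    """
    if top_owner is None:
        return "unowned"
    # "apps/v1" -> "apps"; a bare "v1" belongs to the core group.
    api_version = top_owner["apiVersion"]
    group = api_version.split("/")[0] if "/" in api_version else ""
    if top_owner["kind"] in STANDARD_KINDS.get(group, set()):
        return "standard"
    return "custom"
```

Matching on the API group as well as the kind avoids treating a custom resource that happens to be named Deployment as a built-in controller.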
Find non-standard workloads in your fleet
You can find non-standard workloads, including:
- Argo Rollouts
- Strimzi PodSets
- Unmanaged (or static) Pods
- Bare Pods
Navigate to the Workloads main page and filter the Type column.

Jobs and CronJobs have their own page with a separate Type filter. Refer to Monitor jobs.
Argo Rollouts
Argo Rollouts replaces the standard Deployment controller with a Rollout custom resource that adds canary and blue/green update strategies. The owner chain is Rollout → ReplicaSet → Pod, the same shape as a Deployment, so Pods are owned by ReplicaSet objects, not directly by the Rollout.
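Because the Pod's direct owner is a ReplicaSet, a tool has to follow one more owner reference to learn whether the parent is a Deployment or a Rollout. The sketch below walks that chain over plain dictionaries shaped like Kubernetes manifests. The helper names and the `replicasets` lookup table are assumptions made for illustration; a real tool would query the API server or a cache instead.

```python
def controller_ref(obj):
    """Return the ownerReference marked controller=true, or None."""
    for ref in obj.get("metadata", {}).get("ownerReferences", []):
        if ref.get("controller"):
            return ref
    return None


def top_level_owner(pod, replicasets):
    """Walk Pod -> ReplicaSet -> parent and return the top owner reference.

    replicasets maps (namespace, name) to ReplicaSet manifests. It is a
    hypothetical stand-in for an API lookup.
    """
    ref = controller_ref(pod)
    if ref is None or ref["kind"] != "ReplicaSet":
        return ref
    key = (pod["metadata"].get("namespace"), ref["name"])
    replicaset = replicasets.get(key)
    if replicaset is None:
        return ref  # ReplicaSet not found; report the direct owner.
    # A standalone ReplicaSet has no parent, so fall back to it.
    return controller_ref(replicaset) or ref
```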
Why it matters for monitoring:
- A Rollout can hold traffic on the stable revision while the canary revision is still running. Pod counts alone don’t tell you which revision is active.
- AnalysisRun objects emit pass or fail signals tied to metric queries. A failed analysis pauses the rollout but leaves two Pod sets running simultaneously, doubling resource consumption with no Deployment-level alert firing.
What to watch:
- The argo_rollout_phase gauge: values are Healthy, Progressing, Paused, Degraded, and Unknown. A Rollout stuck in Progressing for an extended period often signals a failed analysis or a paused canary.
- Restart counts and OOMKill events scoped to the canary Rollout name label.
- The Pod-to-revision label rollouts-pod-template-hash to distinguish stable from canary traffic.
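As a rough illustration of the last point, the following sketch groups Pods by the rollouts-pod-template-hash label. It assumes the caller already knows which hash belongs to the stable revision (the `stable_hash` parameter is hypothetical) and treats every other hash as canary.

```python
from collections import defaultdict

HASH_LABEL = "rollouts-pod-template-hash"


def pods_by_revision(pods, stable_hash):
    """Split Pod names into "stable" and "canary" using the hash label.

    stable_hash is supplied by the caller (an assumption of this sketch).
    Pods without the label are ignored.
    """
    groups = defaultdict(list)
    for pod in pods:
        meta = pod.get("metadata", {})
        pod_hash = meta.get("labels", {}).get(HASH_LABEL)
        if pod_hash is None:
            continue
        role = "stable" if pod_hash == stable_hash else "canary"
        groups[role].append(meta.get("name"))
    return dict(groups)
```

Comparing the size of the two groups over time is one way to notice a paused rollout that keeps both Pod sets running.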
Strimzi PodSets
Strimzi, the Kafka operator for Kubernetes, replaced StatefulSet with its own StrimziPodSet custom resource definition (CRD) as the default in Strimzi 0.35. KafkaNodePool, a separate CRD for managing node pools, became generally available in Strimzi 0.41. Pods are still first-class Kubernetes objects, but their owner chain runs through strimzi.io resources, not apps/v1.
Why it matters for monitoring:
- Kafka brokers and controllers require strict quorum. Losing one broker in a three-Node Cluster is a partial outage even if Kubernetes reports the Pod as Running. The JVM may be live but the broker may not have rejoined the in-sync replicas (ISR).
- Standard workload health checks (available replicas ≥ desired) don’t apply. Kafka health is expressed through Kafka-level metrics: under-replicated partitions, ISR shrink rate, and leader elections.
What to watch:
- kafka_server_replicamanager_underreplicatedpartitions: a non-zero value means data risk.
- The Strimzi operator condition Ready=False on the Kafka or KafkaNodePool custom resource.
- Pod owner labels strimzi.io/cluster and strimzi.io/name for grouping in queries.
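The first two signals can be combined into a single check. The sketch below is one possible shape, assuming the custom resource exposes a standard status.conditions list and that you have already fetched the latest under-replicated partition sample per broker. The function name and the input format are hypothetical.

```python
def kafka_cluster_findings(conditions, under_replicated_partitions):
    """Return a list of human-readable problems; empty means none found.

    conditions: the custom resource's status.conditions list.
    under_replicated_partitions: latest metric sample keyed by broker Pod
    name (a hypothetical input format for this sketch).
    """
    findings = []
    for cond in conditions:
        if cond.get("type") == "Ready" and cond.get("status") != "True":
            reason = cond.get("reason", "unknown")
            findings.append("custom resource is not Ready: " + reason)
    for broker, value in sorted(under_replicated_partitions.items()):
        if value > 0:
            findings.append(
                f"{broker} reports {int(value)} under-replicated partitions"
            )
    return findings
```

Neither input depends on the Pod phase, which matches the point that a Running Pod says little about Kafka health.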
Static and unmanaged Pods
Static Pods are defined as manifest files on a Node’s filesystem (default: /etc/kubernetes/manifests/) and managed directly by the kubelet, not the API server’s controllers. Control-plane components (kube-apiserver, kube-scheduler, kube-controller-manager, and etcd) are typically static Pods on self-managed Clusters.
Unmanaged Pods are API-created Pods with no owner reference, usually the result of a kubectl run invocation, a misconfigured operator, or a debugging session that was never cleaned up. Modern versions of kubectl run create a bare Pod by default, so it’s easy to leave one behind without realizing it.
Why it matters for monitoring:
- Static Pods aren’t rescheduled if the Node fails; they’re tightly coupled to a single Node. A NotReady Node means the static Pod is gone until the Node recovers.
- Unmanaged Pods are rescheduling orphans. If they’re evicted or OOMKilled, they disappear permanently. They also frequently represent forgotten resource consumers that escape capacity planning.
What to watch:
- Pods without owners. In Prometheus, kube_pod_owner{owner_kind=""} surfaces these Pods.
- The static Pod annotation kubernetes.io/config.source=file distinguishes them from unmanaged API Pods.
- Node-scoped restarts and eviction events for static Pods tied to control-plane health.
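To make the static versus unmanaged distinction concrete, here is a minimal sketch over a Pod manifest represented as a dictionary. It reads kubernetes.io/config.source from metadata.annotations, where the kubelet records it on the mirror Pod. The function name and the three return labels are hypothetical.

```python
CONFIG_SOURCE = "kubernetes.io/config.source"


def pod_origin(pod):
    """Return "static", "unmanaged", or "managed" for a Pod manifest.

    The config.source check runs first so that a static Pod is reported
    as static whether or not its mirror Pod carries an owner reference.
    """
    meta = pod.get("metadata", {})
    if meta.get("annotations", {}).get(CONFIG_SOURCE) == "file":
        return "static"
    if not meta.get("ownerReferences"):
        return "unmanaged"
    return "managed"
```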
Bare Pods
Bare Pods are a subset of unmanaged Pods created intentionally without a controller. They’re common in batch workloads, one-off migrations, and operator-injected sidecar bootstrapping. Unlike accidental unmanaged Pods, bare Pods are a deliberate pattern, but they carry the same observability gap.
Why it matters for monitoring:
- No controller means no automatic restart on failure and no replica health signal. A bare Pod that exits with code 0 looks identical to one that crashed, so you need to track exit codes and termination reasons explicitly.
- Bare Pods often run privileged or with elevated permissions for maintenance tasks. Tracking their lifecycle (start time, runtime, termination reason) matters for both capacity and security posture.
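Explicit exit code and reason tracking might look like the sketch below, which reads one entry of a Pod's status.containerStatuses list. The function name and the returned strings are illustrative assumptions; only the field names (state.terminated, exitCode, reason) come from the Kubernetes Pod status schema.

```python
def termination_outcome(container_status):
    """Summarize how a container ended, or None if it has not terminated.

    container_status follows the Pod status.containerStatuses[] shape.
    """
    terminated = container_status.get("state", {}).get("terminated")
    if terminated is None:
        return None
    exit_code = terminated.get("exitCode")
    reason = terminated.get("reason", "")
    if exit_code == 0:
        return "completed"
    if reason == "OOMKilled":
        return "oom-killed"
    return f"failed (exit code {exit_code}, reason {reason or 'unknown'})"
```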
What to watch:
- kube_pod_container_status_last_terminated_reason distinguishes OOMKilled, Error, and Completed.
- kube_pod_start_time combined with the absence of a matching owner reference detects long-lived bare Pods.
- Namespace and label conventions, for example app.kubernetes.io/managed-by=manual, to separate intentional from accidental bare Pods.
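The start-time check can be sketched outside Prometheus as well. The example below works on Pod manifests as dictionaries and flags ownerless Pods older than a caller-chosen age. The function name and the `max_age` threshold are hypothetical, and the parser assumes timestamps in the whole-second UTC form the API server normally emits.

```python
from datetime import datetime, timedelta, timezone


def long_lived_bare_pods(pods, now, max_age):
    """Return names of Pods with no owner that started before now - max_age."""
    names = []
    for pod in pods:
        meta = pod.get("metadata", {})
        start = pod.get("status", {}).get("startTime")
        if meta.get("ownerReferences") or start is None:
            continue
        # Assumes RFC 3339 in UTC with a trailing "Z", e.g. 2024-01-01T00:00:00Z.
        started = datetime.strptime(start, "%Y-%m-%dT%H:%M:%SZ")
        started = started.replace(tzinfo=timezone.utc)
        if now - started > max_age:
            names.append(meta.get("name"))
    return names
```

A call such as `long_lived_bare_pods(pods, datetime.now(timezone.utc), timedelta(days=1))` would list bare Pods that have been running for more than a day.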