Troubleshoot Kubernetes Monitoring

This section covers common errors encountered while installing and configuring Kubernetes Monitoring components. If you used the easy configuration with the Grafana Kubernetes Monitoring Helm chart, refer to the Helm chart overview for more information.

Resolve missing Efficiency data

If your Efficiency view shows no data, it could be due to missing Node Exporter metrics. Navigate to Configuration in the main menu, and click the Cluster status tab to determine what is not being reported.

Resolve missing metrics

If metrics are missing after configuration, even though the Metrics status tab under Configuration shows that the configuration is set up as you intended, check for an incorrectly configured label on the Node exporter instance.

Make sure the Node exporter instance label is set to the Node name. The kube-state-metrics node label and the Node exporter instance label must contain the same values.

Resolve update error

If you attempted to upgrade Kubernetes Monitoring with the Update button on the Settings tab under Configuration and received an error message, complete the following instructions.

Warning

Uninstalling Grafana Alloy deletes its associated alert and recording rule namespace. Alerts added to the default locations are also removed. If you modified a provisioned item, save a copy of your customized version before you uninstall.

  1. Click Uninstall.
  2. Click Install to reinstall.
  3. Complete the instructions in Configure with Grafana Kubernetes Monitoring Helm chart.

Resolve duplicate metrics

View the Cardinality page in the app to narrow down where your active series originate.

OpenShift support

With OpenShift’s default SecurityContextConstraints (SCC) of restricted (refer to the SCC documentation for more information), you may run into the following errors while deploying Grafana Alloy using the default generated manifests:

msg="error creating the agent server entrypoint" err="creating HTTP listener: listen tcp 0.0.0.0:80: bind: permission denied"

By default, the Alloy StatefulSet container attempts to bind to port 80, which only the root user (UID 0) and other privileged users are allowed to do. With the default restricted SCC on OpenShift, this results in the preceding error.

Events:
  Type     Reason        Age                   From                  Message
  ----     ------        ----                  ----                  -------
  Warning  FailedCreate  3m55s (x19 over 15m)  daemonset-controller  Error creating: pods "grafana-agent-logs-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000650000, 1000659999], spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]

By default, the Alloy DaemonSet attempts to run as the root user and to access directories on the host (to tail logs). With the default restricted SCC on OpenShift, this results in the preceding error.

To solve these errors, use the hostmount-anyuid SCC provided by OpenShift, which allows containers to run as root and mount directories on the host.

If this does not meet your security needs, create a new SCC with tailored permissions, or investigate running Alloy as a non-root container, which is beyond the scope of this troubleshooting guide.
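
As a rough illustration of what a tailored SCC might look like, the following sketch allows root and hostPath volumes but little else. The name alloy-logs-scc and every permission value are assumptions chosen for illustration, not settings from the Kubernetes Monitoring documentation. Review each field against your own security policy before applying anything like it.

```yaml
# Hypothetical SCC sketch. The name and all permission values are assumptions.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: alloy-logs-scc            # hypothetical name
allowHostDirVolumePlugin: true    # permit hostPath volumes, used to tail host logs
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false   # change only if the pod spec requests privileged mode
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
runAsUser:
  type: RunAsAny                  # allows the container to run as UID 0
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:                          # volume types the pod may mount
  - configMap
  - emptyDir
  - hostPath
  - projected
  - secret
```

If you adopt a custom SCC such as this one, the ClusterRole rule that grants the use verb would list its name under resourceNames in place of hostmount-anyuid.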

To use the hostmount-anyuid SCC, add the following stanza to the alloy and alloy-logs ClusterRoles:

yaml
. . .
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - hostmount-anyuid
. . .
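
For context, the following sketch shows where the stanza might sit inside a complete ClusterRole manifest. It assumes the ClusterRole is named alloy-logs, as referenced above. The actual name in your cluster may carry a Helm release prefix, and the existing rules are elided rather than reproduced.

```yaml
# Sketch only: metadata.name is an assumption, and the existing rules are elided.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alloy-logs
rules:
  # ... keep the rules already defined for this ClusterRole unchanged ...
  # Added rule: lets bound service accounts use the hostmount-anyuid SCC.
  - apiGroups:
      - security.openshift.io
    resources:
      - securitycontextconstraints
    verbs:
      - use
    resourceNames:
      - hostmount-anyuid
```

The same rule would be added to the alloy ClusterRole. The rule takes effect only for service accounts bound to the ClusterRole through a ClusterRoleBinding or RoleBinding.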