Troubleshoot Kubernetes Monitoring

This section includes common errors encountered while installing and configuring Kubernetes Monitoring components.

Resolve missing Efficiency data

If your Efficiency view shows no data, it could be due to missing Node Exporter metrics. Navigate to Configuration in the main menu, and click the Cluster status tab to determine what is not being reported.

Resolve missing metrics

If metrics are missing after configuration, even though the Metrics status tab under Configuration shows that everything is set up as you intended, check your configuration for an incorrectly configured Node exporter instance label.

Make sure the Node exporter instance label is set to the Node name. The labels for kube-state-metrics node and Node exporter instance must contain the same values.
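As an illustration of the label requirement above, the following is a minimal sketch of a Prometheus-style scrape job that sets the instance label to the Node name through relabeling. The job name and the pod label value used to select Node exporter pods are assumptions and may differ in your deployment. The `__meta_kubernetes_pod_*` labels come from Prometheus Kubernetes service discovery.

```yaml
# Sketch only: job_name and the pod label value below are assumptions.
scrape_configs:
  - job_name: integrations/node_exporter
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only Node exporter pods (label value is an assumption).
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
        regex: prometheus-node-exporter
        action: keep
      # Set instance to the name of the Node the pod runs on, so that it
      # matches the node label reported by kube-state-metrics.
      - source_labels: [__meta_kubernetes_pod_node_name]
        target_label: instance
        action: replace
```

With a rule like the last one in place, a Node exporter series for a Node named `worker-1` (a hypothetical name) would carry `instance="worker-1"`, the same value kube-state-metrics reports in its `node` label.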

Resolve update error

If you attempted to upgrade Kubernetes Monitoring with the Update button on the Settings tab and received an error message, complete the following instructions.

Warning

When you uninstall Agent, this deletes its associated dashboard folder and its alert and recording rule namespace. Custom dashboards or alerts added to the default locations are also removed. Save a copy of any customized item if you modified the provisioned version.

  1. Click Uninstall dashboards and alert rules.
  2. Click Install dashboards and alert rules to reinstall.
  3. Complete the instructions in Configure with Grafana Kubernetes Monitoring Helm chart.

OpenShift support

With OpenShift’s default SecurityContextConstraints (SCC) of restricted (refer to the SCC documentation for more information), you may run into the following errors while deploying Grafana Agent using the default generated manifests:

msg="error creating the agent server entrypoint" err="creating HTTP listener: listen tcp 0.0.0.0:80: bind: permission denied"

By default, the Agent StatefulSet container attempts to bind to port 80, which only the root user (UID 0) and other privileged users are allowed to do. With the default restricted SCC on OpenShift, this results in the preceding error.

Events:
  Type     Reason        Age                   From                  Message
  ----     ------        ----                  ----                  -------
  Warning  FailedCreate  3m55s (x19 over 15m)  daemonset-controller  Error creating: pods "grafana-agent-logs-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000650000, 1000659999], spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]

By default, the Agent DaemonSet attempts to run as the root user and to access directories on the host (to tail logs). With the default restricted SCC on OpenShift, this results in the preceding error.

To solve these errors, use the hostmount-anyuid SCC provided by OpenShift, which allows containers to run as root and mount directories on the host.

If this does not meet your security needs, create a new SCC with the required tailored permissions, or investigate running Agent as a non-root container, which goes beyond the scope of this troubleshooting guide.

To use the hostmount-anyuid SCC, add the following stanza to the grafana-agent and grafana-agent-logs ClusterRoles:

. . .
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - hostmount-anyuid
. . .
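To show where the stanza sits, here is a hedged sketch of the grafana-agent ClusterRole after the change. It assumes the generated manifest uses the standard `rbac.authorization.k8s.io/v1` ClusterRole layout. The existing rules from your manifest are elided and should be left unchanged.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-agent
rules:
  # ... existing rules from the generated manifest stay here, unchanged ...
  # Added rule: allow the bound service account to use the
  # hostmount-anyuid SCC.
  - apiGroups:
      - security.openshift.io
    resources:
      - securitycontextconstraints
    verbs:
      - use
    resourceNames:
      - hostmount-anyuid
```

The same rule would be appended to the grafana-agent-logs ClusterRole. This sketch assumes each ClusterRole is already bound to the corresponding Agent service account by the generated manifests.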