Otel - kubeletstats
This dashboard shows key information about pods running in a Kubernetes cluster. It relies on metrics collected by OpenTelemetry's Kubelet Stats receiver and Kubernetes Cluster receiver. The following snippet shows an OpenTelemetry Collector configuration that scrapes the metrics required by this dashboard:
```yaml
receivers:
  kubeletstats:
    auth_type: serviceAccount
    collection_interval: 20s
    endpoint: https://${env:OTEL_K8S_NODE_NAME}:10250
    extra_metadata_labels:
      - k8s.volume.type
    insecure_skip_verify: true
    metric_groups:
      - container
      - pod
      - volume
      - node
  k8s_cluster:
    allocatable_types_to_report:
      - cpu
      - memory
      - storage
      - ephemeral-storage
    collection_interval: 15s
    node_conditions_to_report:
      - Ready
      - MemoryPressure
processors:
  resource/remove_container_id:
    attributes:
      - action: delete
        key: container.id
      - action: delete
        key: container_id
exporters:
  prometheusremotewrite/local:
    endpoint: http://prometheus-server/api/v1/write
    resource_to_telemetry_conversion:
      enabled: true
service:
  extensions:
    - health_check
    - memory_ballast
  pipelines:
    metrics:
      exporters:
        - prometheusremotewrite/local
      processors:
        - resource/remove_container_id
      receivers:
        - kubeletstats
        - k8s_cluster
```
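The endpoint setting assumes an `OTEL_K8S_NODE_NAME` environment variable is present in the collector pod. When the collector runs as a DaemonSet, this is typically populated via the Kubernetes downward API; a minimal sketch (the container name and image are illustrative placeholders):

```yaml
# Fragment of a collector DaemonSet pod spec (names/images are assumptions).
spec:
  containers:
    - name: otel-collector
      image: otel/opentelemetry-collector-contrib:latest
      env:
        # Expose the node name so the kubeletstats receiver can build
        # its endpoint: https://${env:OTEL_K8S_NODE_NAME}:10250
        - name: OTEL_K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
```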
Two things in that config snippet deserve further explanation:
- The `container.id`/`container_id` label is removed from metrics. The main reason is that when a container restarts, it gets a new container ID, so each restart would start a new time series. That increases Prometheus's resource usage and breaks panels that expect continuous series, such as those computing a `rate()` over container restart metrics.
- `resource_to_telemetry_conversion` must be enabled on the `prometheusremotewrite` exporter so that resource attributes are exported as Prometheus labels.
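As an illustration of the first point, a restart panel can query along these lines. The exact metric and label names depend on the receiver version and on Prometheus name normalization; `k8s_container_restarts` and `k8s_pod_name` are assumptions here:

```promql
# Restarts per second over 5m, summed per pod. Without dropping the
# container_id label, each restart would begin a fresh series and
# this rate would keep resetting to zero.
sum by (k8s_pod_name) (rate(k8s_container_restarts[5m]))
```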