0005: Loki mixin configuration improvements
Documentationbreadcrumb arrow Grafana Lokibreadcrumb arrow Communitybreadcrumb arrow Loki Improvement Documents (LIDs)breadcrumb arrow 0005: Loki mixin configuration improvements
Open source

0005: Loki mixin configuration improvements

Author: Alexandre Chouinard (Daazku@gmail.com)

Date: 03/2025

Sponsor(s): N/A

Type: Feature

Status: Draft

Related issues/PRs:

Thread from mailing list: N/A


Background

There is no easy way to set up dashboards and alerts for Loki on a pre-existing Prometheus stack that does not use the Prometheus Operator with a specific configuration.

The metrics selectors are hardcoded, making the dashboard unusable without manual modifications in many cases. It is assumed that job, cluster, namespace, container and/or a combination of other labels are present on metrics and have very specific values.

Problem Statement

This renders the dashboards and alerts unusable for setups that do not conform to the current assumptions about which label(s) should be present in the metrics.

A good example of that would be the “job” label used everywhere: job=~\"$namespace/bloom-planner\"

Usually the job label refer to the task name used to scrape the targets, as per Prometheus documentation, and in k8s, if you are not using prometheus-operator with ServiceMonitor, it’s pretty common to have something like this as a scraping config:

        - job_name: "kubernetes-pods" # Can actually be anything you want.
          kubernetes_sd_configs:
            - role: pod
          relabel_configs:
            # Cluster label is "required" by kubernetes-mixin dashboards
            - target_label: cluster
              replacement: '${cluster_label}'
            ...

which would scrape all pods and yield something like:

up{job="kubernetes-pods", ...}

Right off the bat, that makes the dashboards unusable because it’s incompatible with what is hardcoded in the dashboards and alerts.

Goals

Ideally, selectors should default to the values required internally by Grafana but remain configurable so users can tailor them to their setup.

A good example of this is how kubernetes-monitoring/kubernetes-mixin did it: https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/1fa3b6731c93eac6d5b8c3c3b087afab2baabb42/config.libsonnet#L20-L33 Every possible selector is configurable and thus allow for various setup to properly work.

The structure is already there to support this. It just has not been leveraged properly.

Non-Goals (optional)

It would be desirable to create some automated checks verifying that all metrics used in dashboard and alerts are using the proper selector(s) from the configuration. There are many issues in the repository about new dashboards or dashboard updates not using the proper labels on metrics.

Proposals

Proposal 0: Do nothing

This forces the community to either manually edit the dashboards/alerts or conform to a specific metric collection approach for Loki.

Proposal 1: Allow metrics label selectors to be configurable

This will require a good amount of refactoring.

It allows easier adoption of the “official” dashboards and alerts by the community.

Define once, reuse everywhere. (Currently, updating requires extensive search and replace.)

Other Notes

If this proposal is accepted, I am willing to do the necessary work to move it forward.