Collecting metrics and logs from Grafana Mimir

You can collect logs and metrics from a Mimir or GEM cluster. To set up dashboards and alerts, see Installing Grafana Mimir dashboards and alerts or Grafana Cloud: Self-hosted Grafana Mimir integration.

Monitoring a cluster is easiest if it was installed via the Grafana Mimir Helm chart. You can also use this integration if Mimir was deployed another way. For more information, see Collect metrics and logs without the Helm chart.

Collect metrics and logs from the Helm chart

To set up the collection of metrics and logs, follow the steps that match the version of the Helm chart that you deployed:

Collect metrics and logs via the Helm chart

Starting from version 3.0.0, the Helm chart sends metrics to a Prometheus-compatible server and sends logs to a Loki cluster. The chart can also scrape additional metrics from kube-state-metrics, kubelet, and cAdvisor. The Helm chart does not collect node_exporter metrics. For more information about node_exporter, see Additional resources metrics.

The Helm chart uses the Grafana Agent operator. Due to how Helm works, you need to manually install the Custom Resource Definitions (CRDs) for the Agent operator before the chart can use the operator.

Credentials

If Prometheus and Loki are running without authentication, then you can skip this section. Metamonitoring supports multiple authentication methods for metrics and logs. If you authenticate with Prometheus or Loki by using a secret such as an API key, then you need to create a Kubernetes secret that contains it.

This is an example secret:

apiVersion: v1
kind: Secret
metadata:
  name: metamonitoring-credentials
# stringData takes plain-text values; values under data must be base64-encoded.
stringData:
  prometheus-api-key: FAKEACCESSKEY
  loki-api-key: FAKESECRETKEY

For information about how to create a Kubernetes secret, see Creating a Secret.

Helm chart values

Finally, merge the following YAML configuration into your Helm values file, and replace the values for url, username, passwordSecretName, and passwordSecretKey with the details of the Prometheus and Loki clusters, and the secret that you created. If your Prometheus and Loki servers are running without authentication, then remove the auth blocks from the YAML below.

metaMonitoring:
  serviceMonitor:
    enabled: true
  grafanaAgent:
    enabled: true
    installOperator: true

    logs:
      remote:
        url: "https://example.com/loki/api/v1/push"
        auth:
          username: "12345"
          passwordSecretName: "metamonitoring-credentials"
          passwordSecretKey: "loki-api-key"

    metrics:
      remote:
        url: "https://example.com/api/v1/push"
        auth:
          username: "54321"
          passwordSecretName: "metamonitoring-credentials"
          passwordSecretKey: "prometheus-api-key"

      scrapeK8s:
        enabled: true
        kubeStateMetrics:
          namespace: kube-system
          labelSelectors:
            app.kubernetes.io/name: kube-state-metrics
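
If your Prometheus and Loki servers run without authentication, the values above reduce to roughly the following sketch. It is the same example with the auth blocks removed; the URLs remain placeholders that you need to replace.

```yaml
metaMonitoring:
  serviceMonitor:
    enabled: true
  grafanaAgent:
    enabled: true
    installOperator: true

    logs:
      remote:
        # Placeholder URL: replace with the push endpoint of your Loki cluster.
        url: "https://example.com/loki/api/v1/push"

    metrics:
      remote:
        # Placeholder URL: replace with the push endpoint of your Prometheus-compatible server.
        url: "https://example.com/api/v1/push"

      scrapeK8s:
        enabled: true
        kubeStateMetrics:
          namespace: kube-system
          labelSelectors:
            app.kubernetes.io/name: kube-state-metrics
```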

Collect metrics and logs via Grafana Agent

Older versions of the Helm chart need to be manually instrumented. This means that you need to set up a Grafana Agent that collects logs and metrics from Mimir or GEM. To set up Grafana Agent, see Set up Grafana Agent.

In the following example Grafana Agent configuration file for collecting logs and metrics, replace url, password, and username in the logs and metrics blocks with the details of your Prometheus and Loki clusters.

logs:
  configs:
    - clients:
        - basic_auth:
            password: xxx
            username: xxx
          url: https://example.com/loki/api/v1/push
      name: integrations
      positions:
        filename: /tmp/positions.yaml
      scrape_configs:
        - job_name: integrations/grafana-mimir-logs
          kubernetes_sd_configs:
            - role: pod
          pipeline_stages:
            - cri: {}
          relabel_configs:
            - action: keep
              regex: mimir-distributed-.*
              source_labels:
                - __meta_kubernetes_pod_label_helm_sh_chart
            - source_labels:
                - __meta_kubernetes_pod_node_name
              target_label: __host__
            - action: replace
              replacement: $1
              separator: /
              source_labels:
                - __meta_kubernetes_namespace
                - __meta_kubernetes_pod_container_name
              target_label: job
            - action: replace
              regex: ""
              replacement: k8s-cluster
              separator: ""
              source_labels:
                - cluster
              target_label: cluster
            - action: replace
              source_labels:
                - __meta_kubernetes_namespace
              target_label: namespace
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_name
              target_label: pod
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_container_name
              target_label: name
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_container_name
              target_label: container
            - replacement: /var/log/pods/*$1/*.log
              separator: /
              source_labels:
                - __meta_kubernetes_pod_uid
                - __meta_kubernetes_pod_container_name
              target_label: __path__
      target_config:
        sync_period: 10s
metrics:
  configs:
    - name: integrations
      remote_write:
        - basic_auth:
            password: xxx
            username: xxx
          url: https://example.com/api/prom/push
      scrape_configs:
        - job_name: integrations/grafana-mimir/kube-state-metrics
          kubernetes_sd_configs:
            - role: pod
          metric_relabel_configs:
            - action: keep
              regex: (.*-mimir-)?alertmanager.*|(.*-mimir-)?compactor.*|(.*-mimir-)?distributor.*|(.*-mimir-)?(gateway|cortex-gw).*|(.*-mimir-)?ingester.*|(.*-mimir-)?querier.*|(.*-mimir-)?query-frontend.*|(.*-mimir-)?query-scheduler.*|(.*-mimir-)?ruler.*|(.*-mimir-)?store-gateway.*
              separator: ""
              source_labels:
                - deployment
                - statefulset
                - pod
          relabel_configs:
            - action: keep
              regex: kube-state-metrics
              source_labels:
                - __meta_kubernetes_pod_label_app_kubernetes_io_name
            - action: replace
              regex: ""
              replacement: k8s-cluster
              separator: ""
              source_labels:
                - cluster
              target_label: cluster
        - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          job_name: integrations/grafana-mimir/kubelet
          kubernetes_sd_configs:
            - role: node
          metric_relabel_configs:
            - action: keep
              regex: kubelet_volume_stats.*
              source_labels:
                - __name__
          relabel_configs:
            - replacement: kubernetes.default.svc.cluster.local:443
              target_label: __address__
            - regex: (.+)
              replacement: /api/v1/nodes/${1}/proxy/metrics
              source_labels:
                - __meta_kubernetes_node_name
              target_label: __metrics_path__
            - action: replace
              regex: ""
              replacement: k8s-cluster
              separator: ""
              source_labels:
                - cluster
              target_label: cluster
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: false
            server_name: kubernetes
        - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          job_name: integrations/grafana-mimir/cadvisor
          kubernetes_sd_configs:
            - role: node
          metric_relabel_configs:
            - action: keep
              regex: (.*-mimir-)?alertmanager.*|(.*-mimir-)?compactor.*|(.*-mimir-)?distributor.*|(.*-mimir-)?(gateway|cortex-gw).*|(.*-mimir-)?ingester.*|(.*-mimir-)?querier.*|(.*-mimir-)?query-frontend.*|(.*-mimir-)?query-scheduler.*|(.*-mimir-)?ruler.*|(.*-mimir-)?store-gateway.*
              source_labels:
                - pod
          relabel_configs:
            - replacement: kubernetes.default.svc.cluster.local:443
              target_label: __address__
            - regex: (.+)
              replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
              source_labels:
                - __meta_kubernetes_node_name
              target_label: __metrics_path__
            - action: replace
              regex: ""
              replacement: k8s-cluster
              separator: ""
              source_labels:
                - cluster
              target_label: cluster
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            insecure_skip_verify: false
            server_name: kubernetes
        - job_name: integrations/grafana-mimir/metrics
          kubernetes_sd_configs:
            - role: pod
          relabel_configs:
            - action: keep
              regex: .*metrics
              source_labels:
                - __meta_kubernetes_pod_container_port_name
            - action: keep
              regex: mimir-distributed-.*
              source_labels:
                - __meta_kubernetes_pod_label_helm_sh_chart
            - action: replace
              regex: ""
              replacement: k8s-cluster
              separator: ""
              source_labels:
                - cluster
              target_label: cluster
            - action: replace
              source_labels:
                - __meta_kubernetes_namespace
              target_label: namespace
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_name
              target_label: pod
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_container_name
              target_label: container
            - action: replace
              separator: ""
              source_labels:
                - __meta_kubernetes_pod_label_name
                - __meta_kubernetes_pod_label_app_kubernetes_io_component
              target_label: __tmp_component_name
            - action: replace
              separator: /
              source_labels:
                - __meta_kubernetes_namespace
                - __tmp_component_name
              target_label: job
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_node_name
              target_label: instance
  global:
    scrape_interval: 15s
  wal_directory: /tmp/grafana-agent-wal

Collect metrics and logs without the Helm chart

You can still use the dashboards and rules in the monitoring-mixin, even if Mimir or GEM is not deployed via the Helm chart or if you are using the deprecated enterprise-metrics Helm chart for GEM. As a starting point, use the Agent configuration from Collect metrics and logs via Grafana Agent. You might need to modify it. For more information, see dashboards and alerts requirements.
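
One modification you might need on Kubernetes without the Helm chart concerns the keep rules that match the helm_sh_chart pod label. If your pods do not carry that label, those rules drop every target. The following fragment is a minimal sketch of one possible adjustment. It assumes, hypothetically, that your Mimir pods carry the label app.kubernetes.io/name: mimir; this guide does not specify that label, so substitute whichever label your deployment actually sets.

```yaml
relabel_configs:
  # Original rule, which matches pods created by the mimir-distributed Helm chart:
  # - action: keep
  #   regex: mimir-distributed-.*
  #   source_labels:
  #     - __meta_kubernetes_pod_label_helm_sh_chart
  #
  # Hypothetical replacement: keep only pods labeled app.kubernetes.io/name=mimir.
  - action: keep
    regex: mimir
    source_labels:
      - __meta_kubernetes_pod_label_app_kubernetes_io_name
```

The same change would apply to each job that filters on the helm_sh_chart label, in both the logs and the metrics blocks.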

Service discovery

The Agent configuration relies on Kubernetes service discovery and pod labels to constrain the collected metrics and logs to ones that are strictly related to the Helm chart. If you are deploying Grafana Mimir on something other than Kubernetes, then replace the kubernetes_sd_configs block with a block from the Agent configuration that can discover the Mimir processes.
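
As a rough sketch of such a replacement for the metrics job, the fragment below swaps kubernetes_sd_configs for Prometheus-style static_configs. The host names, the port, and the label values are hypothetical placeholders, not values taken from this guide. The job label follows the namespace/component pattern that the Kubernetes relabeling rules build; check the dashboards and alerts requirements for the labels that your setup needs.

```yaml
metrics:
  wal_directory: /tmp/grafana-agent-wal
  global:
    scrape_interval: 15s
  configs:
    - name: integrations
      remote_write:
        - basic_auth:
            password: xxx
            username: xxx
          url: https://example.com/api/prom/push
      scrape_configs:
        - job_name: integrations/grafana-mimir/metrics
          # static_configs takes the place of kubernetes_sd_configs.
          static_configs:
            # Hypothetical targets: replace with the addresses where your
            # Mimir processes expose their metrics endpoint.
            - targets:
                - mimir-ingester-1.example.internal:8080
                - mimir-ingester-2.example.internal:8080
              # Labels are set by hand because no Kubernetes metadata is available.
              labels:
                cluster: my-cluster
                namespace: mimir
                job: mimir/ingester
```

With static targets, each Mimir component typically needs its own static_configs entry so that the job label names the right component.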