Grafana Mimir version 2.2 release notes
Grafana Labs is excited to announce version 2.2 of Grafana Mimir, the most scalable, most performant open source time series database in the world.
The highlights that follow include the top features, enhancements, and bugfixes in this release. If you are upgrading from Grafana Mimir 2.1, there is upgrade-related information as well. For the complete list of changes, see the Changelog.
Features and enhancements
Support for ingesting out-of-order samples: Grafana Mimir includes new, experimental support for ingesting out-of-order samples. This support is configurable, and it allows you to set how far out-of-order Mimir accepts samples on a per-tenant basis. This feature still needs additional testing; we do not recommend using it in a production environment. For more information, see Configuring out-of-order samples ingestion
Improved error messages: The error messages that Mimir reports are more human readable, and the messages include error codes that are easily searchable. For error descriptions, see the Grafana Mimir runbooks’ Errors catalog.
Configurable prefix for object storage: Mimir can now store block data, rules, and alerts in one bucket, with each under its own user-defined prefix, rather than requiring one bucket for each. You can configure the storage prefix by using
-<storage>.storage-prefixoption for corresponding storage:
Store-gateway performance optimization The store-gateway can now pre-populate the file system cache when memory-mapping index-header files. This avoids the store-gateway from appearing to be stuck while loading index-headers. This feature is experimental and disabled by default; enable it using the flag
Faster ingester startup: Ingesters now replay their WALs (write ahead logs) about 50% faster, and they also re-join the ring sooner under some conditions.
Helm Chart improvements: The Mimir Helm chart is the best way to install Mimir on Kubernetes. As part of the Mimir 2.2 release, we’re also releasing version 3.0 of the Helm chart. Notable enhancements follow. For the full list of changes, see the Helm chart changelog.
- The Helm chart now supports OpenShift.
- The Helm chart can now easily deploy Grafana Agent in order to scrape metrics and logs from all Mimir pods, and ship them to a remote store, which makes it easier to monitor the health of your Mimir installation. For more information, see Collecting metrics and logs from Grafana Mimir.
- The Helm chart now enables multi-tenancy by default. This makes it easy for you to add tenants as you grow your cluster. You can take advantage of Mimir’s per-tenant quality-of-service features, which improves stability and resilience at high scale. To learn more about how multi-tenancy in Mimir works, see Grafana Mimir authorization and authentication. This change is backwards-compatible. To read about how we implemented this, see #2117.
- We have significantly improved the configuration experience for the Helm chart, and here are a few of the most salient changes:
- We’ve added an
extraEnvFromcapability to all Mimir services to enable you to inject secrets via environment variables.
- We’ve made it possible to globally set environment variables and inject secrets across all pods in the chart using
global.extraEnvFrom. Note that the memcached and minio pods are not included.
- We’ve switched the default storage of the Mimir configuration from a
ConfigMap, which makes it easier to quickly see the differences between your Mimir configurations between upgrades. We especially like the Helm diff plugin for this purpose.
- We’ve added a
structuredConfigoption, which allows you to overwrite specific key-value pairs in the
mimir.configtemplate, which saves you from having to maintain the entire
mimir.configin your own
- We’ve added the ability to create global pod annotations. This unlocks the ability to trigger a restart of all services in response to a single event, such as the update of the secret containing Mimir’s storage credentials.
- We’ve added an
- We’ve set the chart to disable
-distributor.extend-writes, for a smoother upgrade experience. Rolling restarts of ingesters are now less likely to cause spikes in resource usage.
- We’ve improved the documentation for the Helm chart by adding a Getting started with Mimir using the Helm chart.
- We’ve added a smoke test for your Mimir cluster to help catch errors immediately after you install or upgrade Mimir via the Helm chart.
All deprecated API endpoints that are under
/prometheus/rules* have now been removed from the ruler component in favor of identical endpoints that use the prefix
In Grafana Mimir 2.2, we have updated default values and some parameters to give you a better out-of-the-box experience:
Message size limits for gRPC messages that are exchanged between internal Mimir components have increased to 100 MiB from 4 MiB. This helps to avoid internal server errors when pushing or querying large data.
-blocks-storage.bucket-store.ignore-blocks-withinparameter changed from
10h. The default value of
12h. For most-recent data, both changes improve query performance by querying only the ingesters, rather than object storage.
-querier.shuffle-sharding-ingesters-lookback-periodhas been deprecated. If you previously changed this option from its default of
trueand specify the lookback period by setting the
-memberlist.abort-if-join-failsparameter now defaults to
false. When Mimir is using memberlist as the backend store for its hash ring, and it fails to join the memberlist cluster, Mimir no longer aborts startup by default.
If you have used a previous version of the Mimir Helm chart, you must address some of the chart’s breaking changes before upgrading to helm chart version 3.0. For a detailed information about how to do this, see Upgrade the Grafana Mimir Helm chart from version 2.1 to 3.0.
- PR 1883: Fixed a bug that caused the query-frontend and querier to crash when they received a user query with a special regular expression label matcher.
- PR 1933: Fixed a bug in the ingester ring page, which showed incorrect status of entries in the ring.
- PR 2090: Ruler in remote rule evaluation mode now applies the timeout correctly. Previously the ruler could get stuck forever, which halted rule evaluation.
- PR 2036: Fixed panic at startup when Mimir is running in monolithic mode and query sharding is enabled.