Grafana Mimir version 2.15 release notes

Grafana Labs is excited to announce version 2.15 of Grafana Mimir.

The highlights that follow include the top features, enhancements, and bug fixes in this release. For the complete list of changes, refer to the CHANGELOG.

Features and enhancements

S2 compression for gRPC is now supported using the following flags:

-alertmanager.alertmanager-client.grpc-compression=s2
-ingester.client.grpc-compression=s2
-querier.frontend-client.grpc-compression=s2
-querier.scheduler-client.grpc-compression=s2
-query-frontend.grpc-client-config.grpc-compression=s2
-query-scheduler.grpc-client-config.grpc-compression=s2
-ruler.client.grpc-compression=s2
-ruler.query-frontend.grpc-client-config.grpc-compression=s2

Distributors now support lz4 OTLP compression, and you can deploy them in multiple availability zones.

The ruler’s <prometheus-http-prefix>/api/v1/rules endpoint now supports the exclude_alerts, group_limit, and group_next_token parameters.

mimirtool’s analyze ruler and analyze prometheus commands now support bearer tokens.

You can now tune HTTP client settings for GCS and Azure backends via an http block or corresponding CLI flags.

The compactor now refreshes deletion marks concurrently when updating the bucket index.

You can now set the number of Memcached replicas for each type of cache using configuration settings when using jsonnet.

Important changes

In Grafana Mimir 2.15, the following behavior has changed:

The following alertmanager metrics are not exported for a user label when the metric value is zero:

cortex_alertmanager_alerts_received_total
cortex_alertmanager_alerts_invalid_total
cortex_alertmanager_partial_state_merges_total
cortex_alertmanager_partial_state_merges_failed_total
cortex_alertmanager_state_replication_total
cortex_alertmanager_state_replication_failed_total
cortex_alertmanager_alerts
cortex_alertmanager_silences

PromQL compatibility has been upgraded from Prometheus 2.0 to 3.0. For more details, refer to the Prometheus documentation. The following changes are of note:

The . pattern in regular expressions in PromQL now matches newline characters.
Lookback and range selectors are left-open and right-closed. They were previously left-closed and right-closed.
Native histograms now use exponential interpolation.

Backwards compatibility in dashboards and alerts for thanos_memcached_-prefixed metrics has been removed. These metrics were removed in 2.12 in favor of thanos_cache_-prefixed metrics.

Experimental support for Redis as a cache backend has been removed from jsonnet.

The following deprecated configuration options were removed in this release:

-distributor.direct-otlp-translation-enabled, which has been enabled by default since 2.13 and is now considered stable.
-query-scheduler.prioritize-query-components, which is now always enabled.
-api.get-request-for-ingester-shutdown-enabled, a deprecated experimental flag which has been marked for removal in 2.15.

Experimental features

Grafana Mimir 2.15 includes some features that are experimental and disabled by default. Use these features with caution and report any issues that you encounter:

You can now enable Mimir’s experimental PromQL engine with -querier.query-engine=mimir. This new engine provides improved performance and reduced querier resource consumption. However, it only supports a subset of all PromQL features. It falls back to the regular Prometheus engine for queries containing unsupported features.

You can now use the query-frontend to cache non-transient errors using the experimental flags -query-frontend.cache-errors and -query-frontend.results-cache-ttl-for-errors.

The query-frontend and querier both support an experimental PromQL function, double_exponential_smoothing, which you can enable by setting -querier.promql-experimental-functions-enabled=true and -query-frontend.promql-experimental-functions-enabled=true.

The ingester now supports out-of-order native histogram ingestion via the flag -ingester.ooo-native-histograms-ingestion-enabled.

The ingester can now build 24h blocks for out-of-order data which is more than 24 hours old, using the setting -blocks-storage.tsdb.bigger-out-of-order-blocks-for-old-samples.

The ruler now supports caching the contents of rule groups via the setting -ruler-storage.cache.rule-group-enabled.

The distributor now supports promotion of OTel resource attributes to labels via the setting -distributor.promote-otel-resource-attributes.

Bug fixes

Alerts: Fix autoscaling metrics joins in MimirAutoscalerNotActive when series churn.
Alerts: Exclude failed cache “add” operations from alerting since failures are expected in normal operation.
Alerts: Exclude read-only replicas from IngesterInstanceHasNoTenants alert.
Alerts: Use resident set memory for the EtcdAllocatingTooMuchMemory alert so that ephemeral file cache memory doesn’t cause the alert to misfire.
Dashboards: Fix autoscaling metrics joins when series churn.
Distributor: Fix pooling buffer reuse logic when -distributor.max-request-pool-buffer-size is set.
Ingester: Fix issue where active series requests error when encountering a stale posting.
Ingester: Fix race condition in per-tenant TSDB creation.
Ingester: Fix race condition in exemplar adding.
Ingester: Fix race condition in native histogram appending.
Ingester: Fix bug in concurrent fetching where a failure to list topics on startup would cause an invalid topic ID (0x00000000000000000000000000000000).
Ingester: Fix data loss bug in the experimental ingest storage when a Kafka Fetch is split into multiple requests and some of them return an error.
Ingester: Fix bug where chunks could have one unnecessary zero byte at the end.
OTLP: Support integer exemplar value type.
OTLP receiver: Preserve colons and combine multiple consecutive underscores into one when generating metric names in suffix adding mode (-distributor.otel-metric-suffixes-enabled).
Prometheus: Fix issue where negation of native histograms (e.g. -some_native_histogram_series) did nothing.
Prometheus: Always return unknown hint for first sample in non-gauge native histograms chunk to avoid incorrect counter reset hints when merging chunks from different sources.
Prometheus: Ensure native histograms counter reset hints are corrected when merging results from different sources.
PromQL: Fix issue where functions such as rate() over native histograms could return incorrect values if a float stale marker was present in the selected range.
PromQL: round now removes the metric name again.
PromQL: Fix issue where metric might not be a counter, name does not end in _total/_sum/_count/_bucket annotation would be emitted even if rate or increase did not have enough samples to compute a result.
Querier: Fix the behavior of binary operators between native histograms and floats.
Querier: Fix stddev+stdvar aggregations to always ignore native histograms, and to treat infinity consistently.
Query-frontend: Fix issue where sharded queries could return annotations with incorrect or confusing position information.
Query-frontend: Fix issue where downstream consumers may not generate correct cache keys for experimental error caching.
Query-frontend: Support X-Read-Consistency-Offsets on labels queries too.
Ruler: Fix issue when using the experimental -ruler.max-independent-rule-evaluation-concurrency feature, where the ruler could panic as it updates a running ruleset or shutdowns.