Version 2.0 release notes
Grafana Labs is excited to announce the release of Grafana Enterprise Metrics (GEM) 2.0. GEM 2.0 is built on top of Grafana Mimir 2.0, which is the most scalable, most performant open source time series database in the world. To learn more about the Grafana Mimir project, see our announcement blog post as well as the Grafana Mimir documentation.
GEM 2.0 inherits all of the highlights of Grafana Mimir 2.0, including easy deployment, native multi-tenancy, high availability, durable long-term metrics storage, and exceptional query performance. GEM includes additional functionality that enterprise users need to operate a time series database at their organization, including simplified tenant management, fine-grained access controls, and out-of-the-box self-monitoring and usage data.
GEM 2.0 is a major version upgrade from the previous version of GEM, 1.7, and is a major improvement in usability and functionality over its predecessor. However, some of these improvements introduce breaking changes for those upgrading from GEM 1.7. When deciding which breaking changes to introduce, we did our best to weigh the one-time pain of upgrading against long-term benefits such as a simplified configuration experience and a reduced maintenance burden. For more information about upgrading, see Upgrade Considerations.
Although GEM releases have historically been based on an arbitrary commit hash of Cortex, going forward we will base GEM releases on a named release of Grafana Mimir. Doing so makes it simpler to understand which changes from Grafana Mimir are present in GEM.
Features and enhancements
The majority of features and enhancements in GEM 2.0 are inherited from Grafana Mimir 2.0. Therefore, start by reading about the features and enhancements in the Grafana Mimir 2.0 release notes, as well as the Grafana Mimir 2.0 documentation.
The following list highlights the features and enhancements in Grafana Enterprise Metrics 2.0:
The following features were released in Grafana Mimir 2.0, but have some GEM 2.0 specific implications:
- GEM 2.0 inherits Grafana Mimir 2.0’s experimental support for cross-tenant alerting and recording rules. This support integrates with GEM’s access control model to ensure that recording and alerting rules can only be created and executed after proper authorization. For more information, see Cross tenant alerting and recording rule federation.
- In Grafana Mimir 2.0, we open sourced and promoted to stable the split-and-merge compaction algorithm that was initially released as an experimental closed-source feature in GEM 1.6. Both Grafana Mimir 2.0 and GEM 2.0 utilize the split-and-merge compaction strategy by default. The closed-source time-sharding compaction strategy introduced in GEM 1.4 has been removed because it is redundant. The split-and-merge compactor contains the vertical scaling capability the time-sharding algorithm introduced and adds horizontal scaling as well. For more information on how to transition from the time-sharded compactor to the split-and-merge compactor, see the migration guide. To learn more about the new split-and-merge compactor, see Compaction in the Grafana Mimir 2.0 docs.
- In Grafana Mimir 2.0, we open sourced the experimental query-sharding feature that was introduced in GEM 1.6. This feature has been promoted to stable, and new documentation about how to enable and configure it is in the Query sharding section of the Grafana Mimir 2.0 docs.
We’ve released version 3 of the admin-api, which removes the ability to hard delete tenants, access policies, and tokens via the admin-api. Instead, when users no longer want a tenant, access policy, or token, they are expected to soft delete it by marking it “inactive”. Moving to soft deletes allows us to eliminate several race conditions and cache invalidation problems that caused unexpected behavior with hard deletes. For more information about the v3 API, see the Admin-API documentation.
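To illustrate the difference between hard and soft deletes, here is a minimal in-memory sketch. It is not GEM's implementation and does not call the admin-api: the `Tenant` and `TenantStore` names and the `"active"`/`"inactive"` status strings are hypothetical, chosen only to mirror the description above.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    status: str = "active"  # hypothetical values: "active" or "inactive"


class TenantStore:
    """Hypothetical in-memory store used only to illustrate soft deletes."""

    def __init__(self):
        self._tenants = {}

    def create(self, name):
        if name in self._tenants:
            raise ValueError(f"tenant {name!r} already exists")
        self._tenants[name] = Tenant(name)
        return self._tenants[name]

    def get(self, name):
        return self._tenants[name]

    def soft_delete(self, name):
        # The record is kept and only its status changes, so anything that
        # still holds a reference to the tenant can observe the new state
        # instead of finding that the record has vanished.
        tenant = self._tenants[name]
        tenant.status = "inactive"
        return tenant

    def active_names(self):
        return sorted(t.name for t in self._tenants.values() if t.status == "active")


store = TenantStore()
store.create("team-a")
store.create("team-b")
store.soft_delete("team-a")
print(store.active_names())         # only tenants still marked active
print(store.get("team-a").status)   # the soft-deleted record is still readable
```

Because the record survives, a lookup for a soft-deleted tenant returns a definite "inactive" answer rather than "not found", which is one plausible reason this model is easier to cache consistently.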
For those running GEM on Kubernetes, we’ve released a new version of the GEM Helm chart alongside our GEM 2.0 release. The chart includes multiple changes, most notably that it was renamed to mimir-distributed. There is now one Helm chart for installing both OSS Grafana Mimir and Grafana Enterprise Metrics, and users set the enterprise.enabled flag to choose between the two. For more detail, see the Helm chart README.
We’ve made two improvements to GEM’s self-monitoring functionality:
- We added several panels to the Compactor dashboard. They display the time since the last successful compaction run and indicate when too much time has elapsed. This can help flag a malfunctioning or under-resourced compactor.
- GEM can now emit exemplars for its self-monitoring counter and histogram metrics, making it easier to jump from self-monitoring metrics to representative traces. For information on how to enable exemplars, see the exemplars section of the self-monitoring documentation.
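The staleness check behind the new Compactor panels can be sketched as follows. This is only an illustration of the idea: the function names and the threshold are hypothetical, and the actual dashboard queries are not reproduced here.

```python
import time


def seconds_since_last_compaction(last_success_ts, now=None):
    """Return the seconds elapsed since the last successful compaction run.

    last_success_ts is a Unix timestamp in seconds. `now` can be passed
    explicitly to make the calculation deterministic.
    """
    now = time.time() if now is None else now
    return max(0.0, now - last_success_ts)


def compaction_is_stale(last_success_ts, threshold_seconds, now=None):
    """Flag the compactor when too much time has passed since its last success.

    threshold_seconds is a hypothetical, operator-chosen value.
    """
    return seconds_since_last_compaction(last_success_ts, now) > threshold_seconds
```

For example, with a last success at timestamp 1000 and the current time at 4600, one hour has elapsed, so a 30-minute threshold flags the compactor while a 2-hour threshold does not.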
The per-tenant limit compactor_blocks_retention_period can now be set via the admin-api. For more information, see Setting per-tenant resource usage limits. However, we encourage those using the admin-api to manage per-tenant resource usage limits via runtime configuration instead, because we intend to deprecate the ability to manage resource limits via the admin-api in the future.
Upgrade considerations
GEM 2.0 is a major version upgrade and therefore introduces several breaking changes relative to GEM 1.7. The majority of the breaking changes are concentrated in GEM’s configuration, where we removed a significant number of configuration flags (454 to be exact) that were no longer needed. We also renamed or updated the defaults for numerous other flags whose legacy names or default values no longer made sense. This work allowed us to reduce and simplify GEM’s configuration surface and therefore make it more approachable to users.
To make the GEM 1.7 to GEM 2.0 upgrade as easy as possible, we’ve added functionality to mimirtool for automating the changes users need to make to their configuration.
We’ve paired this tooling with a detailed migration guide that walks through all required changes needed to go from GEM 1.7 to GEM 2.0 in a step-by-step manner. The guide also includes specific instructions on how to perform the upgrade using the Helm chart.
Once you upgrade your GEM backend to version 2.0, we recommend that you upgrade your Grafana Enterprise GEM plugin to v3.4.0 or later. This is needed to get the improved Compactor dashboard panels mentioned in the “Features and enhancements” section. It also ensures the plugin works with the newest version (v3) of the admin-api.
Bug fixes
- Fixed a bug where authentication using the wildcard (*) username would fail when tenants had been deleted via the v3 admin API endpoints.
The following list contains the most important bug fixes in Grafana Enterprise Metrics 2.0.0:
- Fixed a bug that caused GEM to report an incorrect license expiration timestamp in the metric grafana_labs_license_expiry_timestamp if multiple valid licenses existed in local storage and object storage.
- Enterprise configuration extensions now appear in GEM’s /config endpoint. Before the fix, only configuration values present in the upstream OSS were returned.
- Transient errors while attempting to authenticate user requests to GEM now return an HTTP 500 status code instead of an HTTP 401 code. The 500 code ensures that clients like Prometheus or the Grafana Agent retry their remote write request rather than dropping it.
- Transient errors are no longer cached by GEM, ensuring that GEM does not repeatedly return an HTTP 500 status code to the client due to a one-off failure.
- When using a label-based access policy with multiple label selectors, GEM now ensures that label values are filtered properly. Before the bug fix, users with an LBAC policy with multiple label selectors could see label values they did not have access to. Metric names and metrics data were already filtered properly and unaffected by this bug.
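The label-value filtering described in the last item can be sketched as follows. This is a simplified model, not GEM's code: it assumes that selectors contain only equality matchers and that a series is visible when it matches any one of the policy's selectors. The function names and sample data are hypothetical.

```python
def matches(selector, labels):
    """Return True if every equality matcher in the selector holds for the series."""
    return all(labels.get(name) == value for name, value in selector.items())


def visible_label_values(series, selectors, label_name):
    """Collect values of label_name only from series the policy can see.

    Assumption: a series is visible if it matches ANY selector in the policy.
    """
    values = set()
    for labels in series:
        if label_name in labels and any(matches(sel, labels) for sel in selectors):
            values.add(labels[label_name])
    return sorted(values)


series = [
    {"__name__": "up", "env": "dev", "team": "a"},
    {"__name__": "up", "env": "prod", "team": "b"},
    {"__name__": "up", "env": "staging", "team": "c"},
]
policy_selectors = [{"env": "dev"}, {"team": "b"}]

print(visible_label_values(series, policy_selectors, "env"))
print(visible_label_values(series, policy_selectors, "team"))
```

In this sketch the value `staging` never appears in the result, because the only series carrying it matches neither selector. Leaking such a value is the kind of behavior the fix addresses.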
For the full list of bug fixes, see the Changelog.
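The two fixes for transient authentication errors listed above can be illustrated with a small sketch. It is a hypothetical model rather than GEM's implementation: `TransientAuthError`, `AuthCache`, and the `lookup` callable are names invented for this example. The sketch shows a transient failure mapping to HTTP 500 rather than 401, and only definitive results being cached.

```python
class TransientAuthError(Exception):
    """Hypothetical error for a temporary failure, such as an unreachable backend."""


class AuthCache:
    """Caches definitive authentication results only; transient errors are never stored."""

    def __init__(self, lookup):
        # lookup: callable(token) -> bool, may raise TransientAuthError
        self._lookup = lookup
        self._cache = {}

    def status_for(self, token):
        if token in self._cache:
            return 200 if self._cache[token] else 401
        try:
            valid = self._lookup(token)
        except TransientAuthError:
            # A 5xx response tells the client to retry; nothing is cached,
            # so the next request triggers a fresh lookup.
            return 500
        self._cache[token] = valid
        return 200 if valid else 401
```

With a lookup that fails once and then recovers, the first request returns 500, the retry returns 200, and later requests are served from the cache without another lookup.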