Important: This documentation is about an older version. It's relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.

Grafana Enterprise Metrics downloads

Releases

v2.4.2 – April 21st 2023

  • Binary (Linux AMD64)

    • Download
    • SHA256: 760bfbdeae9af42f4d976665a16e0a06fb1a357d236c20429c77ad951bd54a86
  • Docker image: run docker pull grafana/enterprise-metrics:v2.4.2 (digest: sha256:589072e2e9802c820ad3a281c09f46266dd42c3b98fed07ab741d4ee454fcf9f)

  • License: Grafana Labs license
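
The SHA256 value listed for each binary can be checked against the downloaded file before it is installed. The following is a minimal sketch of such a check using only the Python standard library; the local file name in the usage comment is an assumption, and the helper names are illustrative rather than part of GEM.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so a large binary is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published: str) -> bool:
    """Compare a file's digest with the SHA256 value listed for a release."""
    return sha256_of(path) == published.strip().lower()


# Checksum listed above for the v2.4.2 Linux AMD64 binary.
V2_4_2_SHA256 = "760bfbdeae9af42f4d976665a16e0a06fb1a357d236c20429c77ad951bd54a86"

# Hypothetical usage; the local file name is an assumption:
# matches_published_checksum(Path("enterprise-metrics-linux-amd64"), V2_4_2_SHA256)
```

A mismatch means the file should be downloaded again rather than executed.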

Changelog

  • [BUGFIX] Fix CVE-2023-24538 by upgrading to Go 1.19.8.

Upstream Grafana Mimir details

v2.4.1 – February 17th 2023

  • Binary (Linux AMD64)

    • Download
    • SHA256: 4b60541e77ad83a1e2e1f0b7ad50a4865c54e14c27173da90ca1169bc0128cdf
  • Docker image: run docker pull grafana/enterprise-metrics:v2.4.1 (digest: sha256:86bc899a450e1f052e2dd4d7fd55435711068cfefa6d5da541a1fa59feaa69e6)

  • License: Grafana Labs license
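
Each Docker image is listed with a digest, which allows pulling an exact image by content instead of by tag (Docker accepts references of the form `repository@sha256:...`). The sketch below only builds and validates such a reference string; the function name is hypothetical, and the digest is the one listed above for v2.4.1.

```python
def pinned_image_reference(repository: str, digest: str) -> str:
    """Build a `repository@sha256:<hex>` reference for use with `docker pull`."""
    prefix = "sha256:"
    hex_part = digest[len(prefix):]
    # A SHA256 digest is 64 lowercase hexadecimal characters.
    is_hex = all(ch in "0123456789abcdef" for ch in hex_part)
    if not digest.startswith(prefix) or len(hex_part) != 64 or not is_hex:
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{repository}@{digest}"


reference = pinned_image_reference(
    "grafana/enterprise-metrics",
    "sha256:86bc899a450e1f052e2dd4d7fd55435711068cfefa6d5da541a1fa59feaa69e6",
)
# The resulting string can be passed to `docker pull` in place of the tag.
print(reference)
```

Pinning by digest guards against a tag being moved to a different image after it was reviewed.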

Changelog

  • [BUGFIX] Fix issue where authentication caches were not sized correctly, resulting in poor performance.

Upstream Grafana Mimir details

v2.4.0 – November 14th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: 9e23df3eee5a2c359b7af1dbe74a1edb776293e735ceacf1982d4b53f3709d2b
  • Docker image: run docker pull grafana/enterprise-metrics:v2.4.0 (digest: sha256:1f56acfb6c9ddbb5d6e961401ba55963ae51752889a1b5536b840837df8f44be)

  • License: Grafana Labs license

Changelog

  • [CHANGE] CarbonAPI is now being used instead of MetricTank as the default native query engine for the Graphite querier.
  • [CHANGE] Enterprise metrics docker image no longer requests the CAP_NET_BIND_SERVICE capability as the default HTTP port was changed from 80 to 8080.
    • If you set -server.http-listen-port or -server.grpc-listen-port to a value lower than 1024, you need to modify your configuration:
      • When using Docker, provide the flag --cap-add net_bind_service.

      • When using the mimir-distributed Helm chart, make sure that all the GEM components have the following additional securityContext setting in their respective values file sections:

        securityContext:
          sysctls:
            - name: net.ipv4.ip_unprivileged_port_start
              value: "0"  # might be set to the lowest listen port number as well
  • [FEATURE] Added a new flag -graphite.querier.cache-ttl to the Graphite querier to configure the TTL of cached metric names and aggregation configs.
  • [FEATURE] Added optional rate limiting capabilities to the Graphite querier.
    • This can be configured using the following flags:
      • -graphite.querier.rate-limit-enabled
      • -graphite.querier.rate-limit-qps
      • -graphite.querier.tenant-rate-limit-qps
      • -graphite.querier.heavy-rate-limit-qps
  • [ENHANCEMENT] Ruler: Add <prometheus-http-prefix>/api/v1/status/buildinfo endpoint.
  • [ENHANCEMENT] Update all build images to use Go 1.19.2.
  • [BUGFIX] Fix CVE-2022-44643
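
Whether the v2.4.0 port change affects a deployment depends on the configured listen ports: only ports below 1024 need the extra capability or sysctl setting. The sketch below is a hypothetical helper (not part of GEM) that, given the values of -server.http-listen-port and -server.grpc-listen-port, reports the lowest privileged port, which is the value the changelog suggests as an alternative to "0" for net.ipv4.ip_unprivileged_port_start.

```python
from typing import Iterable, List, Optional

# On Linux, ports below 1024 are privileged by default.
PRIVILEGED_PORT_LIMIT = 1024


def privileged_ports(listen_ports: Iterable[int]) -> List[int]:
    """Return the configured listen ports that fall in the privileged range."""
    return sorted(p for p in listen_ports if 0 < p < PRIVILEGED_PORT_LIMIT)


def unprivileged_port_start(listen_ports: Iterable[int]) -> Optional[int]:
    """Lowest privileged listen port, or None when no change is needed."""
    low = privileged_ports(listen_ports)
    return low[0] if low else None


# New default HTTP port plus an example gRPC port: nothing to change.
print(unprivileged_port_start([8080, 9095]))
# Keeping the old HTTP port 80 requires the capability or the sysctl.
print(unprivileged_port_start([80, 9095]))
```

A result of None means neither --cap-add net_bind_service nor the securityContext setting is required.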

Upstream Grafana Mimir details

v2.3.2 – February 17th 2023

  • Binary (Linux AMD64)

    • Download
    • SHA256: 8eaec00a4f470e86beb3909c9f98cbdf892847811173c4477f268f08f448fa5b
  • Docker image: run docker pull grafana/enterprise-metrics:v2.3.2 (digest: sha256:dfba678a8b13647634dc9fa021ff1e8d9f23741d2e8590552c5ad22bedd59c81)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Fix issue where authentication caches were not sized correctly, resulting in poor performance.

Upstream Grafana Mimir details

v2.3.1 – November 14th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: 99b853bae92acd4227b8142ae00dd7b5a32a29df788da170a198e4231eea443a
  • Docker image: run docker pull grafana/enterprise-metrics:v2.3.1 (digest: sha256:d697519012b4f8307ea3f39774235e99d0e5f9c498c7e93685551597179340b3)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Fix CVE-2022-44643

Upstream Grafana Mimir details

v2.3.0 – September 28th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: d8bded3f288da30868fd6ef3d0c6aeac982a55f6d1f627f48297f90bdeeece02
  • Docker image: run docker pull grafana/enterprise-metrics:v2.3.0 (digest: sha256:0cb46f23551037c8f9df40572d5a09876a04cb59536ff7a06eb558c2e1bf558e)

  • License: Grafana Labs license

Changelog

  • [CHANGE] Gateway: Dial timeout now defaults to 5s instead of 30s.
  • [CHANGE] Gateway: Dialing gRPC proxy backends during startup now blocks until the connection is established.
  • [FEATURE] Gateway: Add support for TSDB block upload routes.
  • [FEATURE] Admin client: common config block introduced in Mimir now configures Admin Client in GEM too.
  • [ENHANCEMENT] Gateway: the CLI flag -gateway.request.limit has been added for configuring request limiter middleware.
  • [ENHANCEMENT] Update all build images to use Go 1.18.6.
  • [ENHANCEMENT] Update all images to use Alpine 3.16.2.
  • [ENHANCEMENT] Gateway: Dial timeout is now configurable via -gateway.proxy.*.dial-timeout.
  • [BUGFIX] Gateway: Expose /distributor/ring endpoint on the distributors.
  • [BUGFIX] LBAC: some query limits would not be applied for requests that use LBAC.

Upstream Grafana Mimir details

v2.2.0 – July 21st 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: c18de5eec921e44bb1aca3aa7f08054226639a846ebab02da9781fcf584e24b9
  • Docker image: run docker pull grafana/enterprise-metrics:v2.2.0 (digest: sha256:5165f84eeb399c1701757efc5a3f9219422bc43935bf995ea7e3d31417b2d6cb)

  • License: Grafana Labs license

Changelog

  • [CHANGE] Ruler: /api/v1/rules* and /prometheus/rules* configuration endpoints are removed in favour of /prometheus/config/v1/rules*. Requests through the gateway are unaffected.
  • [CHANGE] The remote subquerier for the Graphite query proxy is no longer optional
    • The following CLI flags (and their respective YAML config options) have been removed:
      • -graphite.querier.enable-remote-subquerier
      • -graphite.querier.use-remote-results
  • [CHANGE] The YAML config options for the datadog.api have been broken out into datadog.read_api and datadog.write_api
  • [ENHANCEMENT] Admin-client: added experimental support for refreshing authentication cache entries before they expire. When enabled, a cache entry that is retrieved with a remaining time to live of -auth.cache.refresh.refresh-ttl or less is refreshed and its time to live is extended.
    • The following CLI flags (and their respective YAML config options) have been added:
      • -auth.cache.refresh.buffer
      • -auth.cache.refresh.concurrency
      • -auth.cache.refresh.enabled
      • -auth.cache.refresh.refresh-ttl
      • -auth.cache.refresh.retry-interval
  • [ENHANCEMENT] Gateway: Rewrite requests to deleted ruler configuration endpoints to use supported endpoints.
  • [BUGFIX] Docs: Make config category labels consistent across command-line help and generated documentation.
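
The refresh rule described for the authentication cache in v2.2.0 can be illustrated with a small sketch. This is not GEM's implementation: the entry type, function names, and the synchronous refresh are assumptions made for illustration, and the buffer, concurrency, and retry settings are ignored. Only the condition (remaining time to live less than or equal to the refresh TTL) comes from the changelog.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """Hypothetical cache entry; GEM's internal types are not shown here."""
    token: str
    expires_at: float  # absolute expiry time, in seconds


def should_refresh(entry: CacheEntry, now: float, refresh_ttl: float) -> bool:
    """True when a retrieved entry has `refresh_ttl` or less left to live."""
    remaining = entry.expires_at - now
    # An expired entry is a cache miss, not a refresh candidate.
    return 0 < remaining <= refresh_ttl


def on_hit(entry: CacheEntry, now: float, refresh_ttl: float, ttl: float) -> CacheEntry:
    """Return the entry to keep after a hit, extending its TTL if refreshed."""
    if should_refresh(entry, now, refresh_ttl):
        return CacheEntry(entry.token, now + ttl)
    return entry
```

With a refresh TTL of 10 seconds, an entry that expires at t=100 is left alone when read at t=50 and is extended when read at t=95.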

Upstream Grafana Mimir details

v2.1.0 – June 2nd 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: 1fc9830d632d77b0cce719c4c7841b74028136d99c14331884212b3e247f0d2c
  • Docker image: run docker pull grafana/enterprise-metrics:v2.1.0 (digest: sha256:d02650b34c77cb5130b23790c958f658de8a5634f4d66f24dcb631ae7ba34b99)

  • License: Grafana Labs license

Changelog

  • [FEATURE] Ruler: Added support for expression remote evaluation.
    • The following CLI flags (and their respective YAML config options) have been added:
      • -ruler.query-frontend.address
      • -ruler.query-frontend.auth-token
      • -ruler.query-frontend.tls-enabled
      • -ruler.query-frontend.tls-ca-path
      • -ruler.query-frontend.tls-cert-path
      • -ruler.query-frontend.tls-key-path
      • -ruler.query-frontend.tls-server-name
      • -ruler.query-frontend.tls-insecure-skip-verify
  • [ENHANCEMENT] Self-monitoring: Emit OOM kill and page fault metrics as part of self-monitoring.
  • [BUGFIX] Ruler API: Ruler Limits are now enforced during rule group creation.
  • [BUGFIX] Authentication: Expose internal errors during authentication only in logs, not to clients.

Upstream Grafana Mimir details

v2.0.1 – April 14th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: b1527c0b1405b0cb2bb40fbc828093b619fccbe03e830b90b2a61766c1f3701a
  • Docker image: run docker pull grafana/enterprise-metrics:v2.0.1 (digest: sha256:30c80aa0612aed4e0bab24f9e5c817a112f0bbdfa7b51404a069474d706ceaee)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Authentication: Only include active tenants when resolving the wildcard tenant (*).

Upstream Grafana Mimir details

No changes since GEM v2.0.0:

v2.0.0 – April 13th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: 9244138e9e3a46ae53db31da4e67c263c1c482d41802a3bbf50c600bc2497be0
  • Docker image: run docker pull grafana/enterprise-metrics:v2.0.0 (ID: sha256:43ed80839bd0cb1d799087d5591a8873cfaead182683055bbb8aa207efcf8a5f, Repo digest: sha256:338bbcf64ea051cc3911908b977ae3b7bb8ed65342e7a2f8df3f781aa0f5e61a)

  • License: Grafana Labs license

Changelog

  • [CHANGE] Admin-API: enable leader election by default
  • [CHANGE] Change default value of instrumentation.enabled to true
  • [CHANGE] Graphite Querier: The gRPC server is now registered to enable subquerier requests. This requires using the flag EnableRemoteSubquerier.
  • [CHANGE] Graphite Querier: The remote read query is now the default behavior. Also, the previous implementation has been removed.
  • [CHANGE] Admin-API: Change auth.type default from trust to enterprise
  • [CHANGE] Limits: The max_series_per_query limit has been removed from the Admin API and runtime configuration and is no longer enforced by GEM during queries.
  • [CHANGE] Graphite: The Graphite Querier and Graphite Write Proxy have been removed from single binary mode (the all target). They can still be run using the graphite-querier and graphite-write-proxy targets, respectively.
  • [CHANGE] Query-frontend and Graphite Querier: migrated memcached backend client to the same one used in other components (memcached config and metrics are now consistent across all services).
    • The following CLI flags (and their respective YAML config options) have been added:
      • -graphite.querier.metric-name-cache.backend (set it to memcached)
      • -graphite.querier.aggregation-cache.backend (set it to memcached)
    • The following CLI flags (and their respective YAML config options) have been changed:
      • -graphite.querier.metric-name-cache.memcached.hostname and -graphite.querier.metric-name-cache.memcached.service: use -graphite.querier.metric-name-cache.memcached.addresses instead
      • -graphite.querier.aggregation-cache.memcached.hostname and -graphite.querier.aggregation-cache.memcached.service: use -graphite.querier.aggregation-cache.memcached.addresses instead
    • The following CLI flags (and their respective YAML config options) have been renamed:
      • -graphite.querier.metric-name-cache.background.write-back-concurrency renamed to -graphite.querier.metric-name-cache.memcached.max-async-concurrency
      • -graphite.querier.metric-name-cache.background.write-back-buffer renamed to -graphite.querier.metric-name-cache.memcached.max-async-buffer-size
      • -graphite.querier.metric-name-cache.memcached.batchsize renamed to -graphite.querier.metric-name-cache.memcached.max-get-multi-batch-size
      • -graphite.querier.metric-name-cache.memcached.parallelism renamed to -graphite.querier.metric-name-cache.memcached.max-get-multi-concurrency
      • -graphite.querier.metric-name-cache.memcached.timeout renamed to -graphite.querier.metric-name-cache.memcached.timeout
      • -graphite.querier.metric-name-cache.memcached.max-item-size renamed to -graphite.querier.metric-name-cache.memcached.max-item-size
      • -graphite.querier.metric-name-cache.memcached.max-idle-conns renamed to -graphite.querier.metric-name-cache.memcached.max-idle-connections
      • -graphite.querier.aggregation-cache.background.write-back-concurrency renamed to -graphite.querier.aggregation-cache.memcached.max-async-concurrency
      • -graphite.querier.aggregation-cache.background.write-back-buffer renamed to -graphite.querier.aggregation-cache.memcached.max-async-buffer-size
      • -graphite.querier.aggregation-cache.memcached.batchsize renamed to -graphite.querier.aggregation-cache.memcached.max-get-multi-batch-size
      • -graphite.querier.aggregation-cache.memcached.parallelism renamed to -graphite.querier.aggregation-cache.memcached.max-get-multi-concurrency
      • -graphite.querier.aggregation-cache.memcached.timeout renamed to -graphite.querier.aggregation-cache.memcached.timeout
      • -graphite.querier.aggregation-cache.memcached.max-item-size renamed to -graphite.querier.aggregation-cache.memcached.max-item-size
      • -graphite.querier.aggregation-cache.memcached.max-idle-conns renamed to -graphite.querier.aggregation-cache.memcached.max-idle-connections
    • The following CLI flags (and their respective YAML config options) have been removed:
      • -graphite.querier.aggregation-cache.default-validity: new setting is hardcoded to 7 days
      • -graphite.querier.aggregation-cache.memcached.circuit-breaker-consecutive-failures: feature removed
      • -graphite.querier.aggregation-cache.memcached.circuit-breaker-interval: feature removed
      • -graphite.querier.aggregation-cache.memcached.circuit-breaker-timeout: feature removed
      • -graphite.querier.aggregation-cache.memcached.consistent-hash: new setting is always enabled
      • -graphite.querier.aggregation-cache.memcached.update-interval: new setting is hardcoded to 30s
      • -graphite.querier.metric-name-cache.default-validity and -frontend.memcached.expiration: new setting is hardcoded to 7 days
      • -graphite.querier.metric-name-cache.memcached.circuit-breaker-consecutive-failures: feature removed
      • -graphite.querier.metric-name-cache.memcached.circuit-breaker-interval: feature removed
      • -graphite.querier.metric-name-cache.memcached.circuit-breaker-timeout: feature removed
      • -graphite.querier.metric-name-cache.memcached.consistent-hash: new setting is always enabled
      • -graphite.querier.metric-name-cache.memcached.update-interval: new setting is hardcoded to 30s
    • The following metrics have been changed:
      • cortex_cache_dropped_background_writes_total{name} changed to thanos_memcached_operation_skipped_total{name, operation, reason}
      • cortex_cache_value_size_bytes{name, method} changed to thanos_memcached_operation_data_size_bytes{name}
      • cortex_cache_request_duration_seconds{name, method, status_code} changed to thanos_memcached_operation_duration_seconds{name, operation}
      • cortex_cache_fetched_keys{name} changed to thanos_cache_memcached_requests_total{name}
      • cortex_cache_hits{name} changed to thanos_cache_memcached_hits_total{name}
      • cortex_memcache_request_duration_seconds{name, method, status_code} changed to thanos_memcached_operation_duration_seconds{name, operation}
      • cortex_memcache_client_servers{name} changed to thanos_memcached_dns_provider_results{name, addr}
      • cortex_memcache_client_set_skip_total{name} changed to thanos_memcached_operation_skipped_total{name, operation, reason}
      • cortex_dns_lookups_total changed to thanos_memcached_dns_lookups_total
      • For all metrics the value of the “name” label has changed from frontend.memcached to frontend-cache.
      • The above-mentioned metrics are now also available with name=metric-name and name=aggregations for caches used by the Graphite Querier.
    • The following metrics have been removed:
      • cortex_cache_background_queue_length{name}
  • [CHANGE] Compactor: -compactor.compaction-strategy option removed. The only compactor that can now be used is the “split and merge” compactor.
  • [CHANGE] Graphite: Enabled distributed subqueries by default and renamed remote_write YAML flags.
    • -graphite.querier.use-remote-results and -graphite.querier.enable-remote-subquerier now default to true. This means by default subqueries will be distributed across queriers.
    • remote_write YAML flags have been renamed:
      • keepalive has been renamed to keep_alive
      • maxidleconns has been renamed to max_idle_conns
      • maxconns has been renamed to max_conns
      • skiplabelvalidation has been renamed to skip_label_validation.
  • [FEATURE] Admin-API Deletion Markers:
    • Update status field in tenants
    • Add status field to access policies and tokens
    • Add new Admin API v3 endpoints with soft-deletion of entities
      • /admin/api/v3/accesspolicies
      • /admin/api/v3/clusters
      • /admin/api/v3/features
      • /admin/api/v3/licenses
      • /admin/api/v3/tenants
      • /admin/api/v3/tokens
    • List endpoints only return entities in active status
    • Update HTTP authentication layer to only authorize requests of active entities
    • Update storage cache logic to only store the object’s latest version
    • Add v3 endpoints to gateway routes
  • [FEATURE] Graphite Write Proxy: Added -graphite.remote-write-proxy.enabled, -graphite.remote-write-proxy.write-endpoint and -graphite.write-proxy.skip-label-validation to enhance the internal series write performance of the graphite writer. It’s recommended to enable this flag on every installation as soon as possible because it will become a default configuration in future releases.
  • [FEATURE] Ruler: Added federated rule groups support.
    • Exposed cortex_ruler_sync_unauthorized_groups metric to track the number of skipped rule groups during storage synchronizations.
  • [FEATURE] Divide configuration parameters into categories “basic”, “advanced”, and “experimental”. Only flags in the basic category are shown when invoking -help, whereas -help-all will include flags in all categories (basic, advanced, experimental).
  • [FEATURE] Datadog: Added experimental support for ingesting and querying Datadog metrics by adding a Datadog translation layer on top of GEM.
  • [FEATURE] Gateway: Forward requests to deprecated and removed endpoints in Mimir 2.0 (grafana/mimir#763) to their non-legacy equivalents.
  • [ENHANCEMENT] Update all build images to use Go 1.17.8.
  • [ENHANCEMENT] Admin-API: Allow the max_global_exemplars_per_user limit to be set via the Admin API.
  • [ENHANCEMENT] Admin-API: Enable compactor_blocks_retention_period to be set on a per-tenant basis via the Admin API.
  • [ENHANCEMENT] Querier: Apply Label Based Access Policy (LBAC) rules to exemplar endpoints.
  • [ENHANCEMENT] Federation frontend: Add bearer_token configuration for proxy targets.
  • [ENHANCEMENT] Self-monitoring: Add support for emitting exemplars as part of self-monitoring metrics.
  • [ENHANCEMENT] Federation frontend: Return a richer error when a downstream data source fails.
  • [BUGFIX] Graphite: no need to configure Mimir’s queryable when starting only -target=graphite-querier.
  • [BUGFIX] Graphite: When configured with enterprise authentication, requests sent to the Cortex remote read API now forward authorization headers if present.
  • [BUGFIX] LBAC: Filter label values using LBAC policies correctly.
  • [BUGFIX] Authentication: HTTP 500 errors are now returned for transient errors while attempting to authenticate user requests.
  • [BUGFIX] Authentication: Do not cache transient errors while attempting to authenticate user requests.
  • [BUGFIX] Config: Enterprise configuration extensions now appear in the /config endpoint
  • [BUGFIX] Admin: Validate the access policy name used for token generation.
  • [BUGFIX] Admin: Fixed a cosmetic issue that could report an incorrect license expiration timestamp in the metric grafana_labs_license_expiry_timestamp if multiple valid licenses exist in local storage and object storage.
  • [BUGFIX] Gateway: All Alertmanager endpoints are correctly proxied to the alertmanager backend proxy. Previously, only the /alertmanager endpoint was proxied. Users were able to authenticate but not access the alerts UI page at /alertmanager/#/alerts.
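
When upgrading to v2.0.0, the renamed Graphite querier cache flags listed above have to be updated in existing command lines. The sketch below is a hypothetical migration helper, not a GEM tool. It covers only the renames from the list whose names actually differ, and it does not handle the removed flags or the hostname/service flags that are replaced by the addresses flag.

```python
# Flag prefixes for the two Graphite querier caches named in the changelog.
CACHE_PREFIXES = (
    "-graphite.querier.metric-name-cache.",
    "-graphite.querier.aggregation-cache.",
)

# Old suffix -> new suffix, as listed in the v2.0.0 changelog.
RENAMED_SUFFIXES = {
    "background.write-back-concurrency": "memcached.max-async-concurrency",
    "background.write-back-buffer": "memcached.max-async-buffer-size",
    "memcached.batchsize": "memcached.max-get-multi-batch-size",
    "memcached.parallelism": "memcached.max-get-multi-concurrency",
    "memcached.max-idle-conns": "memcached.max-idle-connections",
}


def rename_flag(arg: str) -> str:
    """Rewrite one `-flag=value` argument if its flag was renamed."""
    name, sep, value = arg.partition("=")
    for prefix in CACHE_PREFIXES:
        if name.startswith(prefix):
            suffix = name[len(prefix):]
            if suffix in RENAMED_SUFFIXES:
                return prefix + RENAMED_SUFFIXES[suffix] + sep + value
    # Flags outside the rename list are returned unchanged.
    return arg


old_args = [
    "-graphite.querier.metric-name-cache.memcached.batchsize=100",
    "-graphite.querier.aggregation-cache.memcached.max-idle-conns=16",
    "-target=graphite-querier",
]
new_args = [rename_flag(a) for a in old_args]
```

Keeping the mapping as data makes it easy to review against the changelog line by line.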

Upstream Grafana Mimir details

v1.7.1 – November 14th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: e2f0a224ae485a16422ac69c526a73e8118f4f93ec7e69e9e7e840e9d3fba1b6
  • Docker image: run docker pull grafana/enterprise-metrics:v1.7.1 (digest: sha256:84576bd0bab9beb98f6c93e6b9d91dc4efc3e5434747c43ca2bd84863219c8c6)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Fix CVE-2022-44643

v1.7.0 – January 6th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: 262ca08136dc28eee92a6494981ee8de7a7dd859c7a3ea5afa53ff99a56eb42d
  • Deb (Linux AMD64)

    • Download
    • SHA256: 881d2166929022bd96f6a86efc181845883d76b0b6be319882806eb06a29fb94
  • RPM (Linux AMD64)

    • Download
    • SHA256: 307bb036a5452c62ff13dde175801c6312440c7fbdae5bead1bf907000049584
  • Docker image: run docker pull grafana/metrics-enterprise:v1.7.0 (digest: sha256:286ce03b3dcd50c7924ee6860d58b2bd7986c9548cc6fe6207d23b0212883c33)

  • License: Grafana Labs license

Changelog

  • [FEATURE] Admin-API: Added support for Azure Storage
  • [ENHANCEMENT] Federation Frontend: Propagate requests’ bearer token when it is present.
  • [ENHANCEMENT] Federation Frontend: Support TLS configuration for targets.

v1.6.2 – January 6th 2022

  • Binary (Linux AMD64)

    • Download
    • SHA256: f1327147a3e70991073272b61b580c86717afdef4d663198281746c91ca3937a
  • Deb (Linux AMD64)

    • Download
    • SHA256: b9e18f4f80eae18a5cc7a0a01da896c31f633567af75c611fc9d21b82b49aa14
  • RPM (Linux AMD64)

    • Download
    • SHA256: ad45ade184b408e49834e71118af08180cbfe8e9e5b71fde2adade374fd73f85
  • Docker image: run docker pull grafana/metrics-enterprise:v1.6.2 (digest: sha256:48fef5ef7a339d766274a37448e1c3745fde53ec0e2f4eab1a8a093a786d41d2)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Updating GEM from v1.5.0 (or older) to v1.6+ no longer invalidates tenant limits set via the API.

v1.6.1 – November 18th 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 64f307e1f6f72cb8557c6d2ef0c8eb2aa939ddfa358bf53c2705fb50e4c0f253
  • Deb (Linux AMD64)

    • Download
    • SHA256: 4651dddfa0627b675a4f9202372ecc0c5d7bc6aff9237ebbac292f2867d305d2
  • RPM (Linux AMD64)

    • Download
    • SHA256: 15ecfdd0ebcb26c017f1cb63191054096a14b770441a269506af0835f7a8b58a
  • Docker image: run docker pull grafana/metrics-enterprise:v1.6.1 (digest: sha256:66f9eb4cee53df7b95860b1d094cae1dca88e1724de3695fec0449f92fe1db90)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Admin-API: Make sure that read-path limits inherit defaults from global limits.

v1.6.0 – November 15th 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 90011b1fe6ec6ca1c3eb45ce4b26e0143d4cc2b7c660e28b6f115f9c852e125b
  • Deb (Linux AMD64)

    • Download
    • SHA256: dd237bb2649ec253877616493f1cd5f900ec0f4ec89d487d710f3fe4a46a8d82
  • RPM (Linux AMD64)

    • Download
    • SHA256: 007f589f0ebfe1498dfeb1d2779b39460c58d45778d4ea0c35c70714d6367973
  • Docker image: run docker pull grafana/metrics-enterprise:v1.6.0 (digest: sha256:1e01fe4d792b53b9a4d37c38a612c2027582d6d7248f567ed31e2ed6102c035d)

  • License: Grafana Labs license

Changelog

  • [CHANGE] Admin-client: Rename the “default” auth method to “trust”.
  • [CHANGE] License: Deprecated flag -bootstrap.license.path has been removed. The new flag to use for specifying a license is -license.path.
  • [CHANGE] Ruler: endpoints for listing rules (/api/v1/rules, /api/v1/rules/{namespace}) now return HTTP status code 200 and an empty map when there are no rules instead of an HTTP 404 and plain text error message.
  • [CHANGE] Query-frontend: added sharded label to cortex_query_seconds_total metric.
  • [CHANGE] Query-frontend: changed the flag name for controlling query sharding total shards from -querier.total-shards to -frontend.query-sharding-total-shards.
  • [CHANGE] Flag -querier.parallelise-shardable-queries has been renamed to -query-frontend.parallelize-shardable-queries
  • [CHANGE] Querier/ruler: Option -querier.ingester-streaming has been removed. Querier/ruler now always use streaming method to query ingesters.
  • [CHANGE] Limits: Option -ingester.max-samples-per-query is now deprecated. YAML field max_samples_per_query is no longer supported. It required -querier.ingester-streaming option to be set to false, but since -querier.ingester-streaming is removed (always defaulting to true), the limit using it was removed as well.
  • [CHANGE] Limits: Set the default max number of inflight ingester push requests (-ingester.instance-limits.max-inflight-push-requests) to 30000 in order to prevent clusters from being overwhelmed by request volume or temporary slow-downs.
  • [CHANGE] Update Go version to 1.16.9.
  • [CHANGE] Admin-API: Require that tenant updates include the status field.
  • [FEATURE] Querier: Added label names cardinality endpoint <prefix>/api/v1/cardinality/label_names that is disabled by default. Can be enabled/disabled via the CLI flag -querier.cardinality-analysis-enabled or its respective YAML config option. Configurable on a per-tenant basis.
  • [FEATURE] Querier: Added label values cardinality endpoint <prefix>/api/v1/cardinality/label_values that is disabled by default. Can be enabled/disabled via the CLI flag -querier.cardinality-analysis-enabled or its respective YAML config option. Configurable on a per-tenant basis.
  • [FEATURE] Compactor: added support for a new compaction strategy -compactor.compaction-strategy=split-and-merge. When the split-and-merge compactor is used, source blocks for a given tenant are grouped into -compactor.split-groups number of groups. Each group of blocks is then compacted separately, and is split into -compactor.split-and-merge-shards shards (configurable on a per-tenant basis). Compaction of each tenant's shards can be horizontally scaled. The number of compactors that work on jobs for a single tenant can be limited by using the -compactor.compactor-tenant-shard-size parameter, or the per-tenant compactor_tenant_shard_size override.
  • [FEATURE] Query Frontend: Updated experimental query sharding for the blocks storage. You can now enable query sharding for blocks storage (-store.engine=blocks) by setting -query-frontend.parallelize-shardable-queries to true. The following additional config options and exported metrics have been added:
    • New config options:
      • -frontend.query-sharding-total-shards: The number of shards to use when doing parallelisation via query sharding.
      • -frontend.query-sharding-max-sharded-queries: The max number of sharded queries that can be run for a given received query. 0 to disable limit.
      • -blocks-storage.bucket-store.series-hash-cache-max-size-bytes: Max size - in bytes - of the in-memory series hash cache in the store-gateway.
      • -blocks-storage.tsdb.series-hash-cache-max-size-bytes: Max size - in bytes - of the in-memory series hash cache in the ingester.
    • New exported metrics:
      • cortex_bucket_store_series_hash_cache_requests_total
      • cortex_bucket_store_series_hash_cache_hits_total
      • cortex_frontend_query_sharding_rewrites_succeeded_total
      • cortex_frontend_sharded_queries_per_query
    • Renamed metrics:
      • cortex_frontend_mapped_asts_total to cortex_frontend_query_sharding_rewrites_attempted_total
    • Modified metrics:
      • added sharded label to cortex_query_seconds_total
    • When query sharding is enabled, the following querier config must be set on query-frontend too:
      • -querier.max-concurrent
      • -querier.timeout
      • -querier.max-samples
      • -querier.at-modifier-enabled
      • -querier.default-evaluation-interval
      • -querier.active-query-tracker-dir
      • -querier.lookback-delta
    • Sharding can be dynamically controlled per request using the Sharding-Control: 64 header. (0 to disable)
    • Sharding can be dynamically controlled per tenant using the limit query_sharding_total_shards. (0 to disable)
    • Added sharded_queries count to the “query stats” log.
    • The number of shards is adjusted to be compatible with the number of compactor shards used by the split-and-merge compactor. The querier can use this to avoid querying blocks that cannot have series in a given query shard. This only works when using the split-and-merge compactor.
  • [FEATURE] Graphite: Added -graphite.querier.remote-read-enabled and -graphite.querier.query-address to enhance the internal query performance of the graphite querier. It’s recommended to enable this flag on every installation as soon as possible because it will become a default configuration in future releases.
  • [FEATURE] Ingester: Enable snapshotting of in-memory TSDB on disk during shutdown via -blocks-storage.tsdb.memory-snapshot-on-shutdown.
  • [FEATURE] Query-Frontend: Added -query-frontend.cache-unaligned-requests option to cache responses for requests that do not have step-aligned start and end times. This can improve speed of repeated queries, but can also pollute cache with results that are never reused.
  • [ENHANCEMENT] Admin-client: Make the cluster_name configuration optional.
  • [ENHANCEMENT] Admin-API: Add new Admin API v2 endpoints that replace the term ‘instance’ used in version v1 with the term ‘tenant’:
    • /admin/api/v2/accesspolicies
    • /admin/api/v2/clusters
    • /admin/api/v2/features
    • /admin/api/v2/licenses
    • /admin/api/v2/tenants
    • /admin/api/v2/tokens
  • [ENHANCEMENT] LBAC: Optimize filtering when using single selector in LBAC policy by passing matchers to downstream querier.
  • [ENHANCEMENT] Distributor: reduce latency when HA-Tracking by doing KVStore updates in the background.
  • [ENHANCEMENT] Compactor: when sharding is enabled, skip already planned compaction jobs if the tenant doesn’t belong to the compactor instance anymore.
  • [ENHANCEMENT] Compactor: Blocks cleaner will ignore users that it no longer “owns” when sharding is enabled, and user ownership has changed since last scan.
  • [ENHANCEMENT] Query federation: improve performance in MergeQueryable by memoizing labels.
  • [ENHANCEMENT] Querier / store-gateway: optimized regex matchers.
  • [ENHANCEMENT] Query-frontend: added cortex_query_frontend_non_step_aligned_queries_total to track the total number of range queries with start/end not aligned to step.
  • [ENHANCEMENT] Compactor: added -compactor.compaction-jobs-order support to configure which compaction jobs should run first for a given tenant (in case there are multiple ones). Supported values are: smallest-range-oldest-blocks-first (default), newest-blocks-first (not supported by default compaction strategy).
  • [ENHANCEMENT] Add option (-querier.label-values-max-cardinality-label-names-per-request) to configure the maximum number of label names allowed to be queried in a single <prefix>/api/v1/cardinality/label_values API call.
  • [ENHANCEMENT] Make distributor inflight push requests count include background calls to ingester.
  • [ENHANCEMENT] Store-gateway: added an in-memory LRU cache for chunks attributes. Can be enabled setting -blocks-storage.bucket-store.chunks-cache.attributes-in-memory-max-items=X where X is the max number of items to keep in the in-memory cache. The following new metrics are exposed:
    • cortex_cache_memory_requests_total
    • cortex_cache_memory_hits_total
    • cortex_cache_memory_items_count
  • [ENHANCEMENT] Store-gateway: log index cache requests to tracing spans.
  • [ENHANCEMENT] Ingester: reduce CPU and memory utilization if remote write requests contain a large number of “out of bounds” samples.
  • [ENHANCEMENT] Ingester: reduce CPU and memory utilization when querying chunks from ingesters.
  • [ENHANCEMENT] Querier: when fetching data for specific query-shard, we can ignore some blocks based on compactor-shard ID, since sharding of series by query sharding and compactor is the same. Added metrics:
    • cortex_querier_blocks_found_total
    • cortex_querier_blocks_queried_total
    • cortex_querier_blocks_with_compactor_shard_but_incompatible_query_shard_total
  • [ENHANCEMENT] Querier / ruler: reduce CPU usage, latency and peak memory consumption.
  • [ENHANCEMENT] Overrides Exporter: Add max_fetched_chunks_per_query limit to the default and per-tenant limits exported as metrics.
  • [BUGFIX] License: Fixed initialization of AWS subscription manager so it creates a cluster object if not present when running GEM as AWS Marketplace product.
  • [BUGFIX] Admin-API: Change the way per-instance limits are stored to avoid breaking changes between versions.
  • [BUGFIX] Self-monitoring: Ensure system rules adhere to the sharding configuration of the rulers.
  • [BUGFIX] Graphite: fixed invalid label error when querying metrics with dashes in the tags.
  • [BUGFIX] Authentication: Fix caching behavior to ensure tokens are eventually removed from the cache.
  • [BUGFIX] Authentication: Enforce that instances must exist even when using wildcard access policies.
  • [BUGFIX] Admin-API: Expose metrics cortex_admin_api_clients and cortex_admin_client_is_leader for leader election correctly.
  • [BUGFIX] Limits: Fix the way cortex_limits_admin_store_last_update_timestamp_seconds is set to emit a correct UNIX timestamp.
  • [BUGFIX] Alertmanager: don’t replace user configurations with blank fallback configurations (when enabled), particularly during scaling up/down instances when sharding is enabled.
  • [BUGFIX] Query-frontend: Ensure query_range requests handled by the query-frontend return JSON formatted errors.
  • [BUGFIX] Query-frontend: don’t reuse cached results for queries that are not step-aligned.
  • [BUGFIX] Querier: fixed UserStats endpoint. When zone-aware replication is enabled, MaxUnavailableZones param is used instead of MaxErrors, so setting MaxErrors = 0 doesn’t make the Querier wait for all Ingesters responses.
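
Several query-frontend entries above refer to range queries whose start/end are "not aligned to step". As a minimal sketch, assuming "step-aligned" means that both boundaries are whole multiples of the step, the check could look like the following. The function names and the millisecond units are illustrative assumptions, not GEM's implementation.

```python
from typing import Tuple


def is_step_aligned(start_ms: int, end_ms: int, step_ms: int) -> bool:
    """Return True when both range boundaries are whole multiples of the step."""
    if step_ms <= 0:
        raise ValueError("step must be positive")
    return start_ms % step_ms == 0 and end_ms % step_ms == 0


def align_to_step(start_ms: int, end_ms: int, step_ms: int) -> Tuple[int, int]:
    """Round both boundaries down to the nearest multiple of the step."""
    if step_ms <= 0:
        raise ValueError("step must be positive")
    return (start_ms // step_ms) * step_ms, (end_ms // step_ms) * step_ms
```

Under this assumption, a query that is not step-aligned evaluates at different timestamps than an aligned one, which is one plausible reason cached results for one should not be reused for the other.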

v1.5.1 – September 21st 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: b167c2e5d1152f4229f2a35aa02dba21ec2151037c60ec69d994475b9e0b27f0
  • Deb (Linux AMD64)

    • Download
    • SHA256: 07b98379a54b62995233507f08fb6b11c90f2b8f3faa507ba84ab21fa128b434
  • RPM (Linux AMD64)

    • Download
    • SHA256: 7af2f92a5897a2a3b1b7d7eb0c4ed8d8537cee9f884b65e50067b84de43fb02f
  • Docker image: run docker pull grafana/metrics-enterprise:v1.5.1 (digest: sha256:079ed9d61a7ab0953afbfa76de8ab2d38d44ac17e630446bab4084b4aba0c2e4)

  • License: Grafana Labs license
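
Each download above is listed with a SHA256 checksum. A minimal sketch of how a downloaded file could be checked against the published value, using only the Python standard library (the helper names and the file path in the usage note are hypothetical):

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the checksum copied from this page."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, `matches_published_checksum("./metrics-enterprise-linux-amd64", "<SHA256 from this page>")` should return `True` for an intact download; the file name here is a placeholder for whatever name the download was saved under.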

Changelog

  • [ENHANCEMENT] Add ADFS compatibility to our OIDC auth.
  • [BUGFIX] Ruler: Use predictable names for Ruler WALs ensuring they are used after crashes and cleaned up.

v1.5.0 – August 24th 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: e1dbb7640ad49509f22182c4a732b3b9b28c57f860ed2860718d33670fbd4fbe
  • Deb (Linux AMD64)

    • Download
    • SHA256: 549207728b7e023109a375f41b403cc73749344e08e285e238081d6ddafd3bc5
  • RPM (Linux AMD64)

    • Download
    • SHA256: f2116c99cf835f10562a67ae052411c82bd1b3f61470d9f4926f6e243fc35227
  • Docker image: run docker pull grafana/metrics-enterprise:v1.5.0 (digest: sha256:b0d98ffe49df461a524743a49dca26952a59c9c007231035e52f0a06e5003fff)

  • License: Grafana Labs license

Changelog

  • [CHANGE] Alertmanager: allowed to configure the experimental receivers firewall on a per-tenant basis. The following CLI flags (and their respective YAML config options) have been changed and moved to the limits config section:
    • -alertmanager.receivers-firewall.block.cidr-networks renamed to -alertmanager.receivers-firewall-block-cidr-networks
    • -alertmanager.receivers-firewall.block.private-addresses renamed to -alertmanager.receivers-firewall-block-private-addresses
  • [CHANGE] Memberlist: Expose default configuration values to the command line options. Note that setting these explicitly to zero will no longer cause the default to be used. If the default is desired, then do not set the option. The following are affected:
    • -memberlist.stream-timeout
    • -memberlist.retransmit-factor
    • -memberlist.pull-push-interval
    • -memberlist.gossip-interval
    • -memberlist.gossip-nodes
    • -memberlist.gossip-to-dead-nodes-time
    • -memberlist.dead-node-reclaim-time
  • [CHANGE] Authentication: Access Policy names passed via a JWT token in the OIDC auth flow will be downcased before being matched against Access Policies in GEM. This improves interoperability between GEM and other systems since GEM only allows lowercase characters in Access Policy names.
  • [CHANGE] Change default value of -server.grpc.keepalive.min-time-between-pings from 5m to 10s and -server.grpc.keepalive.ping-without-stream-allowed to true.
  • [CHANGE] Changed -alertmanager.storage.type default value from configdb to local.
  • [CHANGE] Changed -ruler.storage.type default value from configdb to local.
  • [CHANGE] Cortex chunks storage has been deprecated and it’s now in maintenance mode: all Cortex users are encouraged to migrate to the blocks storage. No new features will be added to the chunks storage. The default Cortex configuration still runs the chunks engine; please check out the blocks storage doc on how to configure Cortex to run with the blocks storage.
  • [CHANGE] Dependency: update go-redis from v8.2.3 to v8.9.0.
  • [CHANGE] Deprecated the bootstrap target in favor of the tokengen target.
  • [CHANGE] Enable strict JSON unmarshal for pkg/util/validation.Limits struct. The custom UnmarshalJSON() will now fail if the input has unknown fields.
  • [CHANGE] Graphite: proxy no longer generates generic metrics metadata. This helps to reduce ingestion rate as counted by Cortex and used for limits.
  • [CHANGE] Ingester: Change default value of -ingester.active-series-metrics-enabled to true. This incurs a small increase in memory usage, between 1.2% and 1.6% as measured on ingesters with 1.3M active series.
  • [CHANGE] License: Flag -bootstrap.license.path has been deprecated in favor of -license.path.
  • [CHANGE] Memberlist: the memberlist_kv_store_value_bytes has been removed due to values no longer being stored in-memory as encoded bytes.
  • [CHANGE] Querier / ruler: Change -querier.max-fetched-chunks-per-query configuration to limit the maximum number of chunks that can be fetched in a single query. The number of chunks fetched by ingesters AND long-term storage combined should not exceed the value configured on -querier.max-fetched-chunks-per-query.
  • [CHANGE] Querier / ruler: deprecated -store.query-chunk-limit CLI flag (and its respective YAML config option max_chunks_per_query) in favour of -querier.max-fetched-chunks-per-query (and its respective YAML config option max_fetched_chunks_per_query). The new limit specifies the maximum number of chunks that can be fetched in a single query from ingesters and long-term storage: the total number of actual fetched chunks could be 2x the limit, being independently applied when querying ingesters and long-term storage.
  • [CHANGE] Query-frontend: Enable query stats by default. They can still be disabled with -frontend.query-stats-enabled=false.
  • [CHANGE] Removed configdb support from Ruler and Alertmanager backend storages.
  • [CHANGE] Removed log_messages_total metric.
  • [CHANGE] Removed query sharding for the chunks storage. Query sharding is now only supported for blocks storage.
  • [CHANGE] Renamed metric deprecated_flags_inuse_total as deprecated_flags_used_total.
  • [CHANGE] Renamed metric experimental_features_in_use_total as experimental_features_used_total.
  • [CHANGE] Some files and directories on local disk now have stricter permissions, and are only readable by owner, but not group or others.
  • [CHANGE] The example Kubernetes manifests (stored at k8s/) have been removed due to a lack of proper support and maintenance.
  • [CHANGE] Update Go version to 1.16.6.
  • [FEATURE] Added flag -debug.block-profile-rate to enable goroutine blocking events profiling.
  • [FEATURE] Alertmanager: Added -alertmanager.max-config-size-bytes limit to control size of configuration files that Cortex users can upload to Alertmanager via API. This limit is configurable per-tenant.
  • [FEATURE] Alertmanager: Added -alertmanager.max-templates-count and -alertmanager.max-template-size-bytes options to control number and size of templates uploaded to Alertmanager via API. These limits are configurable per-tenant.
  • [FEATURE] Alertmanager: Added rate-limits to notifiers. Rate limits used by all integrations can be configured using -alertmanager.notification-rate-limit, while per-integration rate limits can be specified via -alertmanager.notification-rate-limit-per-integration parameter. Both shared and per-integration limits can be overwritten using overrides mechanism. These limits are applied on individual (per-tenant) alertmanagers. Rate-limited notifications are failed notifications. It is possible to monitor rate-limited notifications via new cortex_alertmanager_notification_rate_limited_total metric.
  • [FEATURE] Alertmanager: support negative matchers, time-based muting - upstream release notes.
  • [FEATURE] Allow for reporting CPU time usage to AWS Marketplace metering service in case GEM is running as AWS Marketplace container product.
  • [FEATURE] Collect and store CPU time usage reports in Admin store, which can later be used to submit to metering services, such as the AWS Marketplace API.
  • [FEATURE] Querier/Ruler: Added new -querier.max-fetched-chunk-bytes-per-query flag. When Cortex is running with blocks storage, the max chunk bytes limit is enforced in the querier and ruler and limits the size of all aggregated chunks returned from ingesters and storage as bytes for a query.
  • [FEATURE] Querier: Added new -querier.max-fetched-series-per-query flag. When Cortex is running with blocks storage, the max series per query limit is enforced in the querier and applies to unique series received from ingesters and store-gateway (long-term storage).
  • [FEATURE] Query Frontend: Add cortex_query_fetched_chunks_total per-user counter to expose the number of chunks fetched as part of queries. This metric can be enabled with the -frontend.query-stats-enabled flag (or its respective YAML config option query_stats_enabled).
  • [FEATURE] Query Frontend: Add cortex_query_fetched_series_total and cortex_query_fetched_chunks_bytes_total per-user counters to expose the number of series and bytes fetched as part of queries. These metrics can be enabled with the -frontend.query-stats-enabled flag (or its respective YAML config option query_stats_enabled).
  • [FEATURE] Query Frontend: Add experimental query sharding for the blocks storage. You can now enable query sharding for blocks storage (-store.engine) by setting -querier.parallelise-shardable-queries to true.
  • [FEATURE] Ruler Storage: S3 header extensions were added to the new ruler storage S3 config block.
  • [FEATURE] Ruler: Add new -ruler.query-stats-enabled which when enabled will report the cortex_ruler_query_seconds_total as a per-user metric that tracks the sum of the wall time of executing queries in the ruler in seconds.
  • [FEATURE] When GEM runs as an AWS Marketplace container product, the Go runtime variable GOMAXPROCS is automatically set to match the container CPU quota if Kubernetes CPU resource limits are set.
  • [FEATURE] Alertmanager: The experimental sharding feature is now considered complete. Detailed information about the configuration options can be found here for alertmanager and here for the alertmanager storage. To use the feature:
    • Ensure that a remote storage backend is configured for Alertmanager to store state using -alertmanager-storage.backend, and flags related to the backend. Note that the local and configdb storage backends are not supported.
    • Ensure that a ring store is configured using -alertmanager.sharding-ring.store, and set the flags relevant to the chosen store type.
    • Enable the feature using -alertmanager.sharding-enabled.
    • Note the prior addition of a new configuration option -alertmanager.persist-interval. This sets the interval between persisting the current alertmanager state (notification log and silences) to object storage. See the configuration file reference for more information.
  • [ENHANCEMENT] Add Cassandra support.
  • [ENHANCEMENT] Add timeout for waiting on compactor to become ACTIVE in the ring.
  • [ENHANCEMENT] Added tenant_ids tag to tracing spans
  • [ENHANCEMENT] Added option -distributor.excluded-zones to exclude ingesters running in specific zones both on write and read path.
  • [ENHANCEMENT] Added zone-awareness support to alertmanager for use when sharding is enabled. When zone-awareness is enabled, alerts will be replicated across availability zones.
  • [ENHANCEMENT] Admin-API: Add a new endpoint for returning product and feature information at /admin/api/v1/features
  • [ENHANCEMENT] Admin-API: Allow admin-api to operate for read-only request when no license is present.
  • [ENHANCEMENT] Alertmanager: Added -alertmanager.max-alerts-count and -alertmanager.max-alerts-size-bytes to control max number of alerts and total size of alerts that a single user can have in Alertmanager’s memory. Adding more alerts will fail with a log message and incrementing cortex_alertmanager_alerts_insert_limited_total metric (per-user). These limits can be overridden by using per-tenant overrides. Current values are tracked in cortex_alertmanager_alerts_limiter_current_alerts and cortex_alertmanager_alerts_limiter_current_alerts_size_bytes metrics.
  • [ENHANCEMENT] Alertmanager: Added -alertmanager.max-dispatcher-aggregation-groups option to control max number of active dispatcher groups in Alertmanager (per tenant, also overrideable). When the limit is reached, Dispatcher produces log message and increases cortex_alertmanager_dispatcher_aggregation_group_limit_reached_total metric.
  • [ENHANCEMENT] Alertmanager: Cleanup persisted state objects from remote storage when a tenant configuration is deleted.
  • [ENHANCEMENT] Authentication: OIDC integration now supports a JWT with multiple roles. When present, these roles will be rolled up into a “virtual” access policy that provides metrics read access to the union of instances contained in those roles.
  • [ENHANCEMENT] Blocks storage: support ingesting exemplars and querying of exemplars. Enabled by setting new CLI flag -blocks-storage.tsdb.max-exemplars=<n> or config option blocks_storage.tsdb.max_exemplars to positive value.
  • [ENHANCEMENT] Distributor: Added distributors ring status section in the admin page.
  • [ENHANCEMENT] Etcd: Added username and password to etcd config.
  • [ENHANCEMENT] Expose CPU quota information (number of cores, cgroup quota) as Prometheus metrics.
  • [ENHANCEMENT] Expose error counters and timestamps of CPU usage reporting as Prometheus metrics when AWS Marketplace metering is enabled.
  • [ENHANCEMENT] Expose value of GOMAXPROCS as Prometheus metrics.
  • [ENHANCEMENT] Facilitate running GEM Docker image as a non-root user. Usage is documented in the Kubernetes deployment documentation.
  • [ENHANCEMENT] Ingester: Added option -ingester.ignore-series-limit-for-metric-names with comma-separated list of metric names that will be ignored in max series per metric limit.
  • [ENHANCEMENT] Ingester: added option -ingester.readiness-check-ring-health to disable the ring health check in the readiness endpoint.
  • [ENHANCEMENT] License: Added flag -license.type that is used to specify that the APP is running through AWS Marketplace.
  • [ENHANCEMENT] License: Implemented /licenses endpoint that responds with static list of licenses that replaces default implementation if the APP is running through AWS Marketplace.
  • [ENHANCEMENT] License: Implemented logic to check if AWS Marketplace subscription is active instead of checking license file if the APP is running through AWS Marketplace.
  • [ENHANCEMENT] Memberlist: expose configuration of memberlist packet compression via -memberlist.compression=enabled.
  • [ENHANCEMENT] Memberlist: optimized receive path for processing ring state updates, to help reduce CPU utilization in large clusters.
  • [ENHANCEMENT] Node-API: Added TSDB block metadata to the exportable debug archive.
  • [ENHANCEMENT] Node-API: Register a new endpoint for fetching a compressed debug file containing config and version information at /node/api/v1/debug-export.
  • [ENHANCEMENT] Node-API: Register a new endpoint for fetching version information about the nodes at /node/api/v1/version.
  • [ENHANCEMENT] Querier can now use the LabelNames call with matchers, if matchers are provided in the /labels API call, instead of using the more expensive MetricsForLabelMatchers call as before. This can be enabled by setting the -querier.query-label-names-with-matchers-enabled flag once the ingesters are updated to this version. In the future this is expected to become the default behavior.
  • [ENHANCEMENT] Reduce memory used by streaming queries, particularly in ruler.
  • [ENHANCEMENT] Ring, query-frontend: Avoid using automatic private IPs (APIPA) when discovering IP address from the interface during the registration of the instance in the ring, or by query-frontend when used with query-scheduler. APIPA is still used as a last resort, with logging indicating its usage.
  • [ENHANCEMENT] Ruler: added rule_group label to metrics cortex_prometheus_rule_group_iterations_total and cortex_prometheus_rule_group_iterations_missed_total.
  • [ENHANCEMENT] Scanner: add support for DynamoDB (v9 schema only).
  • [ENHANCEMENT] Scanner: retry failed uploads.
  • [ENHANCEMENT] Storage: Added the ability to disable Open Census within GCS client (e.g. -gcs.enable-opencensus=false).
  • [ENHANCEMENT] Store-gateway: added -store-gateway.sharding-ring.wait-stability-min-duration and -store-gateway.sharding-ring.wait-stability-max-duration support to store-gateway, to wait for ring stability at startup.
  • [ENHANCEMENT] Wildcard Datasource: Wildcard “*” datasources are now supported in datasource urls for GEM. This allows an action to have access to all instances in all access policies associated with the provided token. If that set of instances includes a wildcard “*”, then access is expanded to all instances in the cluster.
  • [ENHANCEMENT] Added instrumentation to Redis client, with the following metrics:
    • cortex_rediscache_request_duration_seconds
  • [ENHANCEMENT] Include additional limits in the per-tenant override exporter. The following limits have been added to the cortex_overrides metric:
    • max_fetched_series_per_query
    • max_fetched_chunk_bytes_per_query
    • ruler_max_rules_per_rule_group
    • ruler_max_rule_groups_per_tenant
  • [ENHANCEMENT] License Manager: Added functionality to regularly check the local license file and sync it to the license storage backend.
    • Added metrics grafana_labs_license_syncs_total and grafana_labs_license_sync_failures_total.
  • [ENHANCEMENT] Ring: allow experimental configuration of disabling of heartbeat timeouts by setting the relevant configuration value to zero. Applies to the following:
    • -distributor.ring.heartbeat-timeout
    • -ring.heartbeat-timeout
    • -ruler.ring.heartbeat-timeout
    • -alertmanager.sharding-ring.heartbeat-timeout
    • -compactor.ring.heartbeat-timeout
    • -store-gateway.sharding-ring.heartbeat-timeout
  • [ENHANCEMENT] Ring: allow heartbeats to be explicitly disabled by setting the interval to zero. This is considered experimental. This applies to the following configuration options:
    • -distributor.ring.heartbeat-period
    • -ingester.heartbeat-period
    • -ruler.ring.heartbeat-period
    • -alertmanager.sharding-ring.heartbeat-period
    • -compactor.ring.heartbeat-period
    • -store-gateway.sharding-ring.heartbeat-period
  • [ENHANCEMENT] Alertmanager: introduced new metrics to monitor operation when using -alertmanager.sharding-enabled:
    • cortex_alertmanager_state_fetch_replica_state_total
    • cortex_alertmanager_state_fetch_replica_state_failed_total
    • cortex_alertmanager_state_initial_sync_total
    • cortex_alertmanager_state_initial_sync_completed_total
    • cortex_alertmanager_state_initial_sync_duration_seconds
    • cortex_alertmanager_state_persist_total
    • cortex_alertmanager_state_persist_failed_total
  • [ENHANCEMENT] Memberlist: introduced new metrics to aid troubleshooting tombstone convergence:
    • memberlist_client_kv_store_value_tombstones
    • memberlist_client_kv_store_value_tombstones_removed_total
    • memberlist_client_messages_to_broadcast_dropped_total
  • [ENHANCEMENT] Ruler: added new metrics for tracking total number of queries and push requests sent to ingester, as well as failed queries and push requests. Failures are only counted for internal errors, but not user-errors like limits or invalid query. This is in contrast to existing cortex_prometheus_rule_evaluation_failures_total, which is incremented also when query or samples appending fails due to user-errors.
    • cortex_ruler_write_requests_total
    • cortex_ruler_write_requests_failed_total
    • cortex_ruler_queries_total
    • cortex_ruler_queries_failed_total
  • [BUGFIX] Graphite: Fix handling of consolidateBy and make aggregation method part of aggregation cache key.
  • [BUGFIX] Alertmanager: fix Alertmanager status page if clustering via gossip is disabled or sharding is enabled.
  • [BUGFIX] Authentication: fix handling of missing instances, or when instance has no matching access policy, by properly returning a 401 instead of crashing.
  • [BUGFIX] Compactor: fixed panic while collecting Prometheus metrics.
  • [BUGFIX] Graphite: Apply the max-points-per-req-hard limit correctly.
  • [BUGFIX] Graphite: Fix race in index.json API endpoint which led to incomplete results.
  • [BUGFIX] HA Tracker: when cleaning up obsolete elected replicas from KV store, tracker didn’t update number of clusters per user correctly.
  • [BUGFIX] Ingester: fix issue where runtime limits erroneously override default limits.
  • [BUGFIX] Ingester: fixed infrequent panic caused by a race condition between TSDB mmap-ed head chunks truncation and queries.
  • [BUGFIX] Ingester: fixed ingester stuck on start up (LEAVING ring state) when -ingester.heartbeat-period=0 and -ingester.unregister-on-shutdown=false.
  • [BUGFIX] Invalidate cached authentication tokens when they are deleted from object storage.
  • [BUGFIX] Make multiple Get requests instead of MGet on Redis Cluster.
  • [BUGFIX] Memberlist: fix to setting the default configuration value for -memberlist.retransmit-factor when not provided. This should improve propagation delay of the ring state (including, but not limited to, tombstones). Note that if the configuration is already explicitly given, this fix has no effect.
  • [BUGFIX] Purger: fix Invalid null value in condition for column range caused by nil value in range for WriteBatch query.
  • [BUGFIX] Querier: Fix issue where samples in a chunk might get skipped by batch iterator.
  • [BUGFIX] Querier: fix queries failing with “at least 1 healthy replica required, could only find 0” error right after scaling up store-gateways until they’re ACTIVE in the ring.
  • [BUGFIX] Query-frontend: Fix 401s during query_range requests when enterprise authentication is used. The workaround involving disabling enterprise authentication on the querier can now be removed.
  • [BUGFIX] Ruler: Fix bug in rule forwarding with remote write which could cause filling up the disk because it was not truncated.
    • New flags called -ruler.remote-write.wal-truncate-frequency, -ruler.remote-write.min-wal-time and -ruler.remote-write.max-wal-time have been added.
  • [BUGFIX] Ruler: Honor the evaluation delay for the ALERTS and ALERTS_FOR_STATE series.
  • [BUGFIX] Ruler: fix /ruler/rule_groups endpoint not working when used with object store.
  • [BUGFIX] Ruler: fix startup in single-binary mode when the new ruler_storage is used.
  • [BUGFIX] Ruler: fixed counting of PromQL evaluation errors as user-errors when updating cortex_ruler_queries_failed_total.
  • [BUGFIX] Store-gateway: when blocks sharding is enabled, do not load all blocks in each store-gateway in case of a cold startup, but load only blocks owned by the store-gateway replica.
  • [BUGFIX] Upgrade Prometheus. TSDB now waits for pending readers before truncating Head block, fixing the chunk not found error and preventing wrong query results.
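
Two of the authentication entries above describe lowercasing access policy names from the JWT before matching, and rolling multiple roles up into a "virtual" access policy covering the union of their instances. A minimal sketch of that combination follows. It assumes each access policy name maps to a set of instance names; the `role_instances` mapping, the function name, and the choice to ignore unknown roles are hypothetical and do not reflect GEM's internal data model.

```python
from typing import Dict, Iterable, Set


def virtual_policy_instances(
    jwt_roles: Iterable[str],
    role_instances: Dict[str, Set[str]],
) -> Set[str]:
    """Union the instances of every role named in the JWT.

    Role names are lowercased before lookup, mirroring the changelog note
    that access policy names only contain lowercase characters.
    Roles with no matching access policy are ignored (an assumption).
    """
    instances: Set[str] = set()
    for role in jwt_roles:
        instances |= role_instances.get(role.lower(), set())
    return instances
```

With policies `{"team-a": {"prod", "dev"}, "team-b": {"dev", "staging"}}`, a token carrying the roles `Team-A` and `team-b` would resolve to the three instances `prod`, `dev`, and `staging` in this sketch.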

v1.4.2 – July 21st 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 69682495d5995e04616894294b5af8661a03155a01a99beff93e0ea9b36a5007
  • Deb (Linux AMD64)

    • Download
    • SHA256: f6f09f334d0b577245309af2ec3429febc11ac3d196a5f0b5f2cd391a4147cd6
  • RPM (Linux AMD64)

    • Download
    • SHA256: 04e9062bafd0298d3402d9051bafe54cb6871ab28de1df7101c505e7d631a4af
  • Docker image: run docker pull grafana/metrics-enterprise:v1.4.2 (digest: sha256:385b563669a5ba4a459f833a2c356884b757de719e43369ead0c5dc59cb11d94)

  • License: Grafana Labs license

Changelog

  • [SECURITY] Prevent path traversal attack from users able to control the HTTP header X-Scope-OrgID. (CVE-2021-36157)
    • Users only have control of the HTTP header when GEM is configured with flags -auth.type=default and -tenant-federation.enabled=false
  • [SECURITY] Update build image to use Go 1.16.6. (CVE-2021-34558) #1874
  • [BUGFIX] Ruler: Register remote write metrics correctly. #1814
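
The path traversal entry above concerns a tenant ID taken from the X-Scope-OrgID header. To illustrate the class of problem, here is a minimal sketch, assuming the tenant ID is used as a single path segment under a per-tenant base directory. The validation rules, function names, and directory layout are hypothetical and are not the actual fix shipped in this release.

```python
import posixpath


def is_safe_tenant_id(tenant_id: str) -> bool:
    """Reject values that could change directories when used as a path segment."""
    if tenant_id in ("", ".", ".."):
        return False
    return "/" not in tenant_id and "\\" not in tenant_id


def tenant_dir(base: str, tenant_id: str) -> str:
    """Build the per-tenant directory, refusing unsafe tenant IDs."""
    if not is_safe_tenant_id(tenant_id):
        raise ValueError("unsafe tenant ID: %r" % tenant_id)
    return posixpath.join(base, tenant_id)
```

Without such a check, a header value like `../../etc` joined onto a base directory and then normalized would resolve to a location outside that base.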

Upstream Cortex details

  • Cortex Hash: 2210ebb7052a9efb99d0e4dc53043a3f5d806d00

v1.4.1 – June 29th 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: e1dd56442d1d2fd8cdf224938207fda4845eeb3f610e5e11a920e23de43adb5a
  • Deb (Linux AMD64)

    • Download
    • SHA256: 664140413d7d47e4a37d9aa435b4ebe9607ed47cdfae4e0631d02cd209f63076
  • RPM (Linux AMD64)

    • Download
    • SHA256: 4bb9c8e17819a63a7e7bc38a4366e4047f7315cfd680524524529a91fb45d9c2
  • Docker image: run docker pull grafana/metrics-enterprise:v1.4.1 (digest: sha256:d1d17bfe2ec984b093b9da1ab8cdea1f764f24f16b38557d719254c4e64c9f9a)

  • License: Grafana Labs license

Changelog

  • [BUGFIX] Update the GEM build image to use Alpine 3.14, python 3.9 and gsutil 4.52.

Upstream Cortex details

  • Cortex Hash: 98dd0c4d69576fdfaf2b9bfd7aa475e835e11429

v1.4.0 – June 28th 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 9237645e6e2c046d46035c64c74cd5b146312b19cfc30d684b058a67d89c9f13
  • Deb (Linux AMD64)

    • Download
    • SHA256: de8e197e0ca8420cfe296fee2ba37891e72e7396afcf54a26f91cccafc146b9b
  • RPM (Linux AMD64)

    • Download
    • SHA256: a11f2eb10d5ba375a2480e85f94d1c82d63f142858349562d11b99321a40a8c6
  • Docker image: run docker pull grafana/metrics-enterprise:v1.4.0 (digest: sha256:ff38e0544d805bfd1450a1f033ed79585252a4444d247e0e4c649625619215ab)

  • License: Grafana Labs license

Changelog

  • [CHANGE] Breaking: Verify token issuer when using OIDC authentication. Includes a breaking change for users of OIDC authentication. #1571
    • Before this change the configuration of OIDC authentication required the OIDC provider’s jwks_uri to be set in the configuration flag auth.admin.oidc.url. This flag has been deprecated.
    • A new flag named auth.admin.oidc.issuer-url has been added, and it must be set to the URL of the OIDC provider. For example: -auth.admin.oidc.issuer-url=https://accounts.google.com
    • Note: This is not simply a rename of the old flag; you also need to update the value. The defined issuer is required to provide the OIDC discovery endpoint (/.well-known/openid-configuration).
  • [CHANGE] Breaking: The GEM/GEL Ruler can now be accessed by access policies with rules read/write permissions, which are no longer metrics/logs specific #1366 & #1403
    • Before this change, there were metric rule specific permissions metrics:rules:read and metrics:rules:write.
    • The data representation for this change in object storage is backwards compatible, so no change is needed for existing access policies using the new rules.
    • The JSON representation for these rules is not backwards compatible, and so any JSON interactions with the API that specified the strings metrics:rules:read or metrics:rules:write must be updated to the strings rules:read and rules:write respectively.
    • This breaking change applies to the GEM Plugin as well, so please update to version v3.0.X.
  • [CHANGE] Remove enterprise_features config block entirely. #1453
  • [CHANGE] Alertmanager: deprecated -alertmanager.storage.* CLI flags (and their respective YAML config options) in favour of -alertmanager-storage.*. This change doesn’t apply to alertmanager.storage.path and alertmanager.storage.retention.
  • [CHANGE] Blocks storage: removed the config option -blocks-storage.bucket-store.index-cache.postings-compression-enabled, which was deprecated. Postings compression is always enabled.
  • [CHANGE] GEM now fails fast on startup if it is unable to connect to the ring backend.
  • [CHANGE] Querier / ruler: deprecated -store.query-chunk-limit CLI flag (and its respective YAML config option max_chunks_per_query) in favor of -querier.max-fetched-chunks-per-query (and its respective YAML configuration option max_fetched_chunks_per_query). The new limit specifies the maximum number of chunks that can be fetched in a single query from ingesters and long-term storage: the total number of chunks that are actually fetched, in the worst case, can be twice the limit because the limit is applied to ingesters as well as long-term storage.
  • [CHANGE] Query frontend: removed the configuration option -querier.compress-http-responses, which was deprecated. Instead, use -api.response-compression-enabled.
  • [CHANGE] Runtime-config / overrides: removed the config options -limits.per-user-override-config (use -runtime-config.file) and -limits.per-user-override-period (use -runtime-config.reload-period), both deprecated.
  • [FEATURE] Add embedded recording rules to the Enterprise Ruler to support building dashboards and alerts from internal metrics written directly to GEM itself via a distributor. #1459
    • To enable or disable the feature, use the -instrumentation.enabled flag or associated enabled setting on the instrumentation configuration block. The feature is disabled by default.
  • [FEATURE] Add the ability to write internal metrics directly to GEM itself via a distributor. #1281
    • To configure, enable, or disable the feature, use the -instrumentation.enabled flag and the other associated flags, or the instrumentation configuration block:
      instrumentation:
        enabled: false
        flush_period: 15s
        write_timeout: 10s
        distributor_client:
          address: dns:///:9095
          connect_timeout: 5s
          tls_enabled: false
          tls_cert_path:
          tls_key_path:
          tls_ca_path:
          tls_server_name:
          tls_insecure_skip_verify:
      The feature is disabled by default.
  • [FEATURE] Self-monitoring: expose filesystem usage metrics to source the disk utilization panel in the self-monitoring resource dashboards #1618
  • [FEATURE] Add an experimental GEM component federation-frontend, which can be used to federate queries between multiple GEM clusters. #1274
  • [FEATURE] Querier: Added new -querier.max-fetched-series-per-query flag. When GEM is running with blocks storage, the max series per query limit is enforced in the querier and applies to unique series received from ingesters and store-gateway (long-term storage).
  • [FEATURE] Querier/Ruler: Added new -querier.max-fetched-chunk-bytes-per-query flag. When GEM is running with blocks storage, the max chunk bytes limit is enforced in the querier and ruler and limits the size of all aggregated chunks returned from ingesters and storage as bytes for a query.
  • [ENHANCEMENT] Introduce configuration parameter to limit how many points we process per query. #1292
  • [ENHANCEMENT] Added API endpoints through which a user can post / get their storage schemas / aggregations. #1389
  • [ENHANCEMENT] Admin-API: Listing mutable resources now includes a comma separated list of versions for those resources in the ETag header #1419
  • [ENHANCEMENT] Admin-API: Updating a mutable resource now allows a wildcard value ("*") to be passed as the If-Match header, which allows updating whatever version is current #1449
  • [ENHANCEMENT] The /config HTTP endpoint now also returns GEM specific options alongside regular Cortex configuration. #1380
  • [BUGFIX] Fix LBAC regular expression matchers #1305
  • [BUGFIX] Validate all fields of JWT tokens used for auth, except the issuer. #1500
  • [BUGFIX] Ruler: ensure the S3 rule storage flags properly map to the upstream flags. #1460
  • [BUGFIX] Admin-API: rejecting update requests when access policies have empty scopes or realms. #1447
  • [BUGFIX] Updated licenses are now persisted to object storage, fixing the responses from the license API which would show old license information. #1568
  • [BUGFIX] OAuth: Don’t use default access policy when an invalid JWT claim is provided. #1635
  • [BUGFIX] Authentication: Invalidate cached authentication tokens when they are deleted from object storage. #1703
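
The If-Match wildcard behaviour described above can be pictured with a small check. This is only an illustrative sketch, not GEM's implementation: the function name `if_match_allows` is hypothetical, and it assumes resource versions are integers sent inside double quotes (for example "27"), as described in the v1.3.0 notes.

```python
from typing import Optional


def if_match_allows(if_match: Optional[str], current_version: int) -> bool:
    """Return True when an If-Match header value permits an update.

    Assumption: versions are integers wrapped in double quotes, e.g. '"27"'.
    """
    if if_match is None:
        return False  # mutations are expected to carry the header
    value = if_match.strip()
    if value in ("*", '"*"'):
        return True  # wildcard: accept whatever version is current
    return value == f'"{current_version}"'


if __name__ == "__main__":
    print(if_match_allows('"27"', 27))  # matching version
    print(if_match_allows('"26"', 27))  # stale version
    print(if_match_allows("*", 27))     # wildcard
```

The wildcard is convenient for scripted updates that do not read the resource first, at the cost of losing the protection against overwriting a concurrent change.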

Upstream Cortex details

v1.3.1 – July 21st 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 6592ffe2258a44b008c03abe4e52645c7a612bfb7f3d1f5dead44dbc7929904a
  • Deb (Linux AMD64)

    • Download
    • SHA256: d292cb0de1a4ef05b7ffd5c7faa3d9647c91a189cab5daee6362e8f931338be7
  • RPM (Linux AMD64)

    • Download
    • SHA256: 074e9cda3c4c3f74ecf5b45ddcd82c3fc2adc83f93afaaa1f9735eba1854373a
  • Docker image: run docker pull grafana/metrics-enterprise:v1.3.1 (digest: sha256:e03a7ae061d5f617490812a6f45c6362fdc9ef79010555a207ebee2174ef9b23)

  • License: Grafana Labs license

Changelog

  • [SECURITY] Prevent path traversal attack from users able to control the HTTP header X-Scope-OrgID. (CVE-2021-36157)
    • Users only have control of the HTTP header when GEM is configured with flags -auth.type=default and -tenant-federation.enabled=false
  • [SECURITY] Update build image to use Go 1.16.6. (CVE-2021-34558) #1874
  • [BUGFIX] Update the GEM build image to use Alpine 3.14, python 3.9 and gsutil 4.52. #1781
  • [BUGFIX] Ruler: Register remote write metrics correctly. #1814
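
To make the path traversal risk concrete, the sketch below shows one way a tenant ID taken from the X-Scope-OrgID header could be checked before it is used to build a storage path. It is a hypothetical illustration, not the fix shipped in GEM or Cortex; the function name `is_safe_tenant_id` and the exact rules are assumptions.

```python
def is_safe_tenant_id(tenant_id: str) -> bool:
    """Reject tenant IDs that could escape a per-tenant storage prefix.

    Hypothetical rules: no empty IDs, no "." or "..", no path separators.
    """
    if tenant_id in ("", ".", ".."):
        return False
    return "/" not in tenant_id and "\\" not in tenant_id


if __name__ == "__main__":
    for candidate in ("team-a", "..", "../other-tenant", "a/b"):
        print(candidate, is_safe_tenant_id(candidate))
```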

Upstream Cortex details

  • Cortex Hash: 64592254fe91c86e903882947a58d572a316884d

v1.3.0 – April 26th 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 478528db0a22918eeafb1b6f93387d28d0ae6163dd592771a9e9d9f302c3a40d
  • Deb (Linux AMD64)

    • Download
    • SHA256: cfe0ebe4928a4cff007c1f8c86eebc8c73484bfa763c9679c2d3dad7a4c51388
  • RPM (Linux AMD64)

    • Download
    • SHA256: 7278036942905d3341c3e3a9aaadad30dda9fc18e5e3ad86ee092d4e03e77d72
  • Docker image: run docker pull grafana/metrics-enterprise:v1.3.0

  • License: Grafana Labs license

Changelog

  • [SECURITY] Alertmanager: Fix a local file disclosure vulnerability when -experimental.alertmanager.enable-api is used (CVE-2021-31231):
    • The HTTP Basic auth password_file can be used as an attack vector to send any file content via a webhook.
    • The Alertmanager templates can be used as an attack vector to send any file content because the Alertmanager can load any text file specified in the templates list.
  • [CHANGE] Admin API: Concurrent requests to the same resource are no longer allowed. If two requests are issued to create, update, or delete the same resource, then the first one to achieve a lock executes and the second one returns a conflict error. This is handled per process. To enforce this behavior on multiple processes, use leader election. #1186
  • [CHANGE] Admin API: all errors encountered during the processing of HTTP requests are converted to GRPC errors in order to determine the correct HTTP status to return. This enforces consistency for leader election, because some requests are handled internally, and others are forwarded to other instances. #1217
  • [CHANGE] Admin API: all mutation operations (PUT/DELETE) now require an If-Match header to be set (an integer enclosed in double quotes, such as "27") to verify that the correct version of the resource is being modified and to protect against race conditions. You can find the current version of a resource in the ETag header that is returned when that resource is read (via GET) or updated (via PUT).
  • [FEATURE] Admin API: you can set per-instance resource limits via the Admin API. This is enabled by default. #1173
    • You can enable or disable this feature by using the -admin-api.limits.enabled flag, and set the refresh period by using the -admin-api.limits.refresh-period flag. You can also configure this feature by using the admin_api configuration block:
      yaml
      admin_api:
        limits:
          enabled: true
          refresh_period: 1m
  • [ENHANCEMENT] Upgrade build image to use Go 1.16.3. #1294
  • [ENHANCEMENT] Admin client: Add cortex_admin_client_is_leader gauge metric to determine when the client considers itself the leader. #1175
  • [ENHANCEMENT] Admin API: update an access policy via the Admin API using a PUT request. #1139
  • [ENHANCEMENT] Admin API: Update an instance via the Admin API using a PUT request. #1180
  • [ENHANCEMENT] Gateway: Forward /multitenant_alertmanager/ring and /ruler/ring routes to the alertmanager and ruler proxy backends. #1144
  • [BUGFIX] Graphite: Fix aggregation cache to generate cache keys using correct input data. #963
  • [BUGFIX] Authentication: Fix issue where all requests would trigger a panic if authentication is enabled but no admin client is configured. An error is now printed instead. #1106
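
The per-process locking rule for concurrent Admin API requests can be sketched as follows. This is a simplified model rather than GEM's code: the class `ResourceLocks` is hypothetical, and the use of HTTP status 409 for the conflict error is an assumption.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict, Tuple


class ResourceLocks:
    """Per-process locks keyed by resource name (illustrative only)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[str, threading.Lock] = defaultdict(threading.Lock)

    def run_exclusive(
        self, resource: str, operation: Callable[[], str]
    ) -> Tuple[int, str]:
        # Look up (or create) the lock for this resource under a guard lock.
        with self._guard:
            lock = self._locks[resource]
        # Do not wait: if another request holds the lock, report a conflict.
        if not lock.acquire(blocking=False):
            return 409, "conflict: resource is being modified"  # assumed status
        try:
            return 200, operation()
        finally:
            lock.release()


if __name__ == "__main__":
    locks = ResourceLocks()

    def first_request() -> str:
        # A second request arriving while the first still holds the lock.
        print(locks.run_exclusive("accesspolicies/a", lambda: "second"))
        return "first"

    print(locks.run_exclusive("accesspolicies/a", first_request))
```

Because the locks live in process memory, two separate processes would not see each other's locks, which is why the note above points to leader election for multi-process deployments.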

Upstream Cortex details

v1.2.1 – April 27 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: c00f80ceb5994542ec0527e9d1a6a481dbb472c8fdb0318b12142a59b6b32ec4
  • Deb (Linux AMD64)

    • Download
    • SHA256: 741477bbf0d1d4191e413b4f0db96098920df37e27f9a5598b994a6791b0aef3
  • RPM (Linux AMD64)

    • Download
    • SHA256: 28ce6fe43f93bd158d415e03b2ce8bbdf01e0fde1e699f3486b359167d8efb5f
  • Docker image: run docker pull grafana/metrics-enterprise:v1.2.1

  • License: Grafana Labs license

Changelog

  • [SECURITY] Alertmanager: Fix a local file disclosure vulnerability when -experimental.alertmanager.enable-api is used (CVE-2021-31231):
    • The HTTP Basic auth password_file can be used as an attack vector to send any file content via a webhook.
    • The Alertmanager templates can be used as an attack vector to send any file content because the Alertmanager can load any text file specified in the templates list.

v1.2.0 – March 10 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 702208cb7b440b44a30a7ba9bbe34e7a1bbd19a632435a92cdd608cb232593c8
  • Deb (Linux AMD64)

    • Download
    • SHA256: a3e5140bf38f6479693608bbaf9bdcb7824795c1a662a70007905310cf35f862
  • RPM (Linux AMD64)

    • Download
    • SHA256: 2de19fd38ed129bc66ee4f76804bd2633c996afaa43900a306cb536d672e8909
  • Docker image: run docker pull grafana/metrics-enterprise:v1.2.0

  • License: Grafana Labs license

Changelog

  • [CHANGE] Gateway: Remove purger proxy configuration, which is not a supported target for blocks clusters.
  • [CHANGE] Auth: Override authentication flags have been renamed:
    • The auth.override-admin-token flag has been changed to auth.override.token.
    • The auth.override-admin-token-file flag has been changed to auth.override.token-file.
  • [FEATURE] Gateway: Improve the gateway target to support unique TLS configurations and write timeouts for each backend.
    • New fields have been added to allow for configuration:
      yaml
      gateway:
        proxy:
          default:
            tls:
              tls_cert_path: <string>
              tls_key_path: <string>
              tls_ca_path: <string>
              tls_insecure_skip_verify: <bool>
          distributor:
            read_timeout: <duration>
            write_timeout: <duration>
            tls:
            ...
  • [FEATURE] Compactor: Introduced time-sharding compaction strategy.
  • [ENHANCEMENT] Distributor: Wrap remote writes in distributor to sample and log them as business intelligence events.
  • [ENHANCEMENT] Metrics emitted for TLS certificate expiration now reflect certificates being reloaded.
  • [ENHANCEMENT] Remove the Graphite Auto Complete Index and use Cortex index instead.
  • [ENHANCEMENT] Add Graphite API endpoint /metrics/index.json.
  • [ENHANCEMENT] Call Cortex Distributor over gRPC from Graphite Write Proxy (formerly Graphite Distributor)
  • [ENHANCEMENT] Admin API: Add feature to elect an admin-api leader instance to handle all mutation requests. Requests to non-leader instances are forwarded to the leader instance.
    • New fields have been added to allow for configuration:
      yaml
      admin_api:
        leader_election:
          enabled: <bool>
          ring:
            kvstore: <kv.Config>
            heartbeat_period: <duration>
            heartbeat_timeout: <duration>
            tokens_observe_period: <duration>
            instance_interface_name: <[]string>
          client_config: <grpcclient.Config>
  • [BUGFIX] LBAC: Fix issue where debug logs would not print the selector and instead print selector="unsupported value type".
  • [BUGFIX] Admin-Client: Warning logs are no longer created on resource creation.
  • [BUGFIX] Ruler: Fix issue where invalid remote-write URLs cause a panic.
  • [BUGFIX] Querier: Apply label access filters on multi tenant access policies.
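
When upgrading, the renamed override authentication flags have to be updated wherever GEM is launched. The following is a minimal sketch of rewriting a list of command-line arguments; the helper `migrate_args` is hypothetical, and only the two renames listed above are applied.

```python
from typing import List

# Old flag name -> new flag name, taken from the v1.2.0 notes above.
RENAMED_FLAGS = {
    "auth.override-admin-token": "auth.override.token",
    "auth.override-admin-token-file": "auth.override.token-file",
}


def migrate_args(args: List[str]) -> List[str]:
    """Return a copy of args with the renamed flags replaced."""
    migrated = []
    for arg in args:
        # Split the leading dashes from the "name=value" remainder.
        dashes = arg[: len(arg) - len(arg.lstrip("-"))]
        name, sep, value = arg[len(dashes):].partition("=")
        # Only rewrite real flags (arguments that start with a dash).
        if dashes and name in RENAMED_FLAGS:
            name = RENAMED_FLAGS[name]
        migrated.append(f"{dashes}{name}{sep}{value}")
    return migrated


if __name__ == "__main__":
    print(migrate_args(["-auth.override-admin-token=secret", "-target=gateway"]))
```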

Upstream Cortex details

v1.1.3 – April 27 2021

  • Binary (Linux AMD64)

    • Download
    • SHA256: 0c2a549552ac2cf406837df4d6823a88bb5089f84d175a5b16d2710dd0ce7f3a
  • Deb (Linux AMD64)

    • Download
    • SHA256: 1814df03f6573deaefbc87de75777873d9d6f724efce74a59e0cae06734c69fb
  • RPM (Linux AMD64)

    • Download
    • SHA256: b2a0fb67aed10a46d1e4e5f3e5db6e77b4e0cb9b167bdb15a6997ae2878d085c
  • Docker image: run docker pull grafana/metrics-enterprise:v1.1.3

  • License: Grafana Labs license

Changelog

  • [SECURITY] Alertmanager: Fix a local file disclosure vulnerability when -experimental.alertmanager.enable-api is used (CVE-2021-31231):
    • The HTTP Basic auth password_file can be used as an attack vector to send any file content via a webhook.
    • The Alertmanager templates can be used as an attack vector to send any file content because the Alertmanager can load any text file specified in the templates list.

v1.1.2 – January 20 2021

Changelog

  • [BUGFIX] Querier: fix default value incorrectly overriding -querier.frontend-address in single-binary mode.

v1.1.1 – January 14 2021

Changelog

  • [BUGFIX] Ruler: Minimize gaps on rule evaluations with stale input and enabled ruler evaluation delay.

v1.1.0 – January 12 2021

Changelog

  • [CHANGE] Admin-API: Resources must not be both prefixed and suffixed with the __ characters. If any of your existing resources use this naming pattern, they must be deleted and recreated with a new name before upgrading.

  • [CHANGE] Graphite: Allow storage schema and storage aggregation configs to be defined per tenant.

  • [CHANGE] Admin-Client: Instance management client calls no longer use object storage Iter calls when retrieving the latest version of a resource.

  • [CHANGE] Graphite: Add API endpoints to explore the available Graphite functions.

  • [CHANGE] Admin: The selectors for label policies are now provided as PromQL label strings instead of typed objects.

    • Deprecated:

      json
      "label_policies": [
        {
          "selector": [
            {
              "name": "env",
              "value": "dev",
              "type": "EQ"
            }
          ]
        }
      ]
    • New:

      json
      "label_policies": [
        {
          "selector": "{env=\"dev\"}"
        }
      ]
  • [CHANGE] Admin: Operations with an ADMIN scope are no longer restricted to operating on clusters they have as a configured realm.

  • [CHANGE] Deprecate enterprise_features config section in favor of the Cortex config extension.

    • Deprecated:

      yaml
      enterprise_features:
        ruler_s3_request_headers:
          file: <string>
          poll_interval: <duration>
        ruler_remote_write:
          enabled: <bool>
          wal_dir: <string>
    • New:

      yaml
      ruler:
        storage:
          s3:
            header_map_file_path: <string>
            header_map_poll_interval: <duration>
        remote_write:
          enabled: <bool>
          wal_dir: <string>
  • [FEATURE] Ruler: Alerts can now be correctly forwarded to the Alertmanager with enterprise authentication enabled by setting the basic authentication username to __alertmanager__ and the password to an API token with access to every instance.

  • [FEATURE] Queries: LBAC enforcement has been added for queries and label value requests.

    • When GEM is run using the default authentication mode, LBAC policies are specified using the X-Prom-Label-Policy HTTP header in the format: X-Prom-Label-Policy: <tenant-id>:urlEscaped(<prometheus label selector>). For example, a policy that only allows metrics with the label env equal to dev for tenant test-instance could be specified with the following header: X-Prom-Label-Policy: test-instance:%7Benv=%22dev%22%7D. To specify multiple policies, either set the header multiple times or set the header once with a single string of multiple policies separated by an unescaped comma.
  • [FEATURE] Admin API: add label_policies field, which contains an array of label matchers to the access policy realm JSON.

    json
    {
      "realms": [
        {
          "instance": "<string>",
          "cluster": "<string>",
          "label_policies": [
            {
              "selector": [
                {
                  "type": "<enum: EQ | NEQ | RE | NRE>",
                  "name": "<string>",
                  "value": "<string>"
                }
              ]
            }
          ]
        }
      ]
    }
  • [FEATURE] Admin: Add target tokengen to generate tokens for the default or a custom access policy.

  • [FEATURE] Admin: Added a default __admin__ access policy that has an ADMIN scope. This policy can be disabled by adding the following to the GEM configuration file:

    yaml
    admin_client:
      disable_default_admin_policy: true
  • [FEATURE] Querier: Queries can be federated across multiple tenants. The tenant IDs involved must be specified in the X-Scope-OrgID request header, separated by a | character.

  • [FEATURE] Add gateway target that can be configured to proxy requests to microservices and can be used to load balance remote_write requests to the distributors.

  • [ENHANCEMENT] AdminAPI: Add scope for read only admin access, admin:read.

  • [ENHANCEMENT] AdminAPI: Add separate set of scopes for alerts and rules.

    • alerts:read
    • alerts:write
    • logs:rules:read
    • logs:rules:write
    • metrics:rules:read
    • metrics:rules:write
  • [ENHANCEMENT] Reduce allocations in Graphite Ingester, when ingesting untagged Graphite metrics.

  • [ENHANCEMENT] Serve Graphite /metrics/find requests by keeping track of all recent metrics in an in-memory index on the Ingesters to reduce latency.

  • [ENHANCEMENT] Add auxiliary Graphite API endpoints to explore tags and obtain auto-complete suggestions for the Grafana query editor.

  • [ENHANCEMENT] Admin API: add ClusterKind support for Logs & Traces.

  • [ENHANCEMENT] Admin API: add scopes for Logs.

  • [ENHANCEMENT] Admin: The bootstrap target no longer needs to be run before being able to start GEM with enterprise features. Every target will now try to perform bootstrapping on startup if it has not already been done. Failure to bootstrap will not prevent GEM running, but enterprise features will not be available.

  • [ENHANCEMENT] Add grafana_labs_license_expiry_timestamp metric to expose GEM license expiration as a UNIX timestamp, in seconds.

  • [BUGFIX] Graphite: Fixing a bug in the request parsing of GET requests on the auto-complete endpoints.

  • [BUGFIX] Graphite: Return status 200 when ingesting datapoints results in out-of-order, out-of-bounds, or duplicate-sample errors, to prevent an indefinite loop.

  • [BUGFIX] Ruler: Fix issue where remote-write rule groups are created then immediately deleted when a rule group name contains the / delimiter character.
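
The move from typed selector objects to PromQL label strings, and the X-Prom-Label-Policy header format, can be sketched together. This is an illustrative helper rather than part of GEM: the function names are hypothetical, and the mapping of EQ, NEQ, RE, and NRE to the PromQL operators =, !=, =~, and !~ is an assumption based on standard Prometheus matcher syntax.

```python
from typing import Dict, List
from urllib.parse import quote

# Assumed mapping from the deprecated matcher types to PromQL operators.
OPERATORS = {"EQ": "=", "NEQ": "!=", "RE": "=~", "NRE": "!~"}


def selector_to_promql(matchers: List[Dict[str, str]]) -> str:
    """Convert a deprecated typed selector into a PromQL label string."""
    parts = []
    for m in matchers:
        # Escape backslashes and double quotes inside the label value.
        value = m["value"].replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{m["name"]}{OPERATORS[m["type"]]}"{value}"')
    return "{" + ",".join(parts) + "}"


def label_policy_header(tenant_id: str, selector: str) -> str:
    """Build an X-Prom-Label-Policy value: <tenant-id>:<url-escaped selector>."""
    # "=" is left unescaped to match the example in the notes above.
    return f"{tenant_id}:{quote(selector, safe='=')}"


if __name__ == "__main__":
    old = [{"name": "env", "value": "dev", "type": "EQ"}]
    selector = selector_to_promql(old)
    print(selector)
    print(label_policy_header("test-instance", selector))
```

Commas inside a selector are percent-encoded by this helper, which keeps them distinct from the unescaped comma used to separate multiple policies in one header.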

Upstream Cortex changes

v1.0.2 – October 16 2020

Changelog

  • [CHANGE] Update vendored Cortex from v1.4.0 to v1.4.0-21bad5.
  • [BUGFIX] Fix potential panic due to writing into a closed chan in the graphite query executor.
  • [ENHANCEMENT] Admin: Access policy create operations now enforce valid instance/cluster names for the realms configured on the access policy.
  • [ENHANCEMENT] Add -version flag to GEM.
  • [FEATURE] Add config options to rate limit the LIST methods of buckets.
  • [FEATURE] Adds the Graphite /render API endpoint, which can be used to query metrics with the Graphite query language.
  • [FEATURE] Add config options to specify and poll files to inject arbitrary HTTP headers in requests to S3 for the admin and blocks client.
    yaml
    blocks_storage:
      s3:
        header_map_file_path: <path to header file>
        header_map_poll_interval: <duration string>
    admin_client:
      storage:
        s3:
          header_map_file_path: <path to header file>
          header_map_poll_interval: <duration string>
  • [FEATURE] Adds the Graphite /metrics/find API endpoint, which can be used to obtain lists of metrics matching a given pattern (Grafana query editor auto-complete, dashboard variable population, etc).
  • [FEATURE] Add a default access policy option for OpenID Connect tokens.

Upstream Cortex details

v1.0.1 – October 06 2020

Upstream Cortex details

  • Cortex Hash: 23554ce028c090a4a3413ac0e35e5e1dc9fa929f
  • Cortex Version: 1.4.0

Changelog

  • [CHANGE] Update vendored Cortex to v1.4.0.

v1.0.0 – September 17 2020

Upstream Cortex details

  • Cortex Hash: bb5fcc929832f7bd2a6c2df348b387abcb8b961e
  • Cortex Version: 1.4.0-rc.0

Changelog

  • [BUGFIX] Make config field names consistent.
  • [CHANGE] Use Go 1.14.9 to build the project and cut build-image@v0.1.3.

v1.0.0-rc.2 – September 15 2020

Upstream Cortex details

  • Cortex Hash: c3a344784a0c8ce70ef2521f543033dee3dce6c6
  • Cortex Version: 1.3.1

Changelog

  • [BUGFIX] Admin API: Fix panic on start up for admin-api target.

v1.0.0-rc.1 – September 04 2020

Upstream Cortex details

  • Cortex Hash: 4f6e1e5c48ccad2c1988cf1d36ca522ae0c805ed
  • Cortex Version: 1.3.1

Changelog

  • [CHANGE] Admin-Client: The storage backend for the admin client no longer defaults to s3. Instead, no default is set, and the admin client will not start up unless a storage backend is configured.
  • [CHANGE] The following features will no longer be active unless GEM is started with access to a valid license.
    • Admin API
    • Ruler S3 auth headers
    • Ruler API to configure remote write rule groups

v0.6.3 – August 20 2020

Upstream Cortex details

  • Cortex Hash: 2bda7b94
  • Cortex Version: 1.2.1

Changelog

  • [CHANGE] Auth: Removed the auth.enable flag and added the auth.type flag with default and enterprise options.
  • [FEATURE] Admin API: Add list endpoint for stored licenses.

v0.6.2 – August 04 2020

Upstream Cortex details

  • Cortex Hash: 6db67a4efbbf62b1133fa037a95382a21f752bbf
  • Cortex Version: 1.2.1

Changelog

  • [CHANGE] Ruler: S3 Headers are no longer protected by a license.