<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Operating GEM on Grafana Labs</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/</link><description>Recent content in Operating GEM on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/enterprise-metrics/v2.6.x/operations/index.xml" rel="self" type="application/rss+xml"/><item><title>Compactor</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/compactor/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/compactor/</guid><content><![CDATA[&lt;h2 id=&#34;monitoring-compactor-health&#34;&gt;Monitoring compactor health&lt;/h2&gt;
&lt;p&gt;Grafana Enterprise Metrics emits several metrics related to compactor health.
The following queries are useful to get a high-level view of compactor
activity. For users with self-monitoring enabled, please see the &lt;code&gt;GEM system monitoring / compactor&lt;/code&gt; dashboard, which includes panels built from these queries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Successful compactor jobs run per hour&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(increase(cortex_compactor_runs_completed_total[1h]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This value should be relatively stable when viewed over a long enough period of
time, for example hours or days.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Failed compactor jobs run per hour&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(increase(cortex_compactor_runs_failed_total[1h]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Restarting the compactor process interrupts in-progress
compaction jobs. This increases the value of
&lt;code&gt;cortex_compactor_runs_failed_total&lt;/code&gt;, but it is not cause for concern as long
as these restarts are expected. If the compactor crashes, this metric
is not incremented, so monitor compactor process crashes
separately.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Number of blocks per tenant&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum by (user) (cortex_bucket_blocks_count - cortex_bucket_blocks_marked_for_deletion_count)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This value should be relatively stable over a long enough period of time, for
example several days. If the compactor is lagging behind, the block count increases over
time.&lt;/p&gt;
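&lt;p&gt;If you evaluate these queries often, one option is to precompute them with Prometheus-style recording rules. The following is a minimal sketch rather than a shipped configuration: the group name and the &lt;code&gt;gem:&lt;/code&gt; record names are hypothetical, and only the expressions come from the preceding queries.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  # Hypothetical group name.
  - name: gem_compactor_health
    interval: 1m
    rules:
      # Successful compactor runs over the last hour.
      - record: gem:compactor_runs_completed:increase1h
        expr: sum(increase(cortex_compactor_runs_completed_total[1h]))
      # Failed compactor runs over the last hour (expected restarts also count).
      - record: gem:compactor_runs_failed:increase1h
        expr: sum(increase(cortex_compactor_runs_failed_total[1h]))
      # Blocks per tenant, excluding blocks already marked for deletion.
      - record: gem:bucket_blocks:count_by_user
        expr: sum by (user) (cortex_bucket_blocks_count - cortex_bucket_blocks_marked_for_deletion_count)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;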
&lt;h2 id=&#34;monitoring-bucket-index-health&#34;&gt;Monitoring bucket index health&lt;/h2&gt;
&lt;p&gt;Before you enable the &lt;a href=&#34;/docs/mimir/latest/operators-guide/architecture/bucket-index/&#34;&gt;bucket index&lt;/a&gt;, verify the index health
by monitoring the
&lt;code&gt;cortex_bucket_index_last_successful_update_timestamp_seconds&lt;/code&gt; metric. This
metric tracks the last successful bucket index update per tenant. Use the following
query to determine the index age for each tenant:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;time() - cortex_bucket_index_last_successful_update_timestamp_seconds&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The maximum index age should generally line up with the value of the
&lt;code&gt;-compactor.cleanup-interval&lt;/code&gt; flag.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Some jitter is added to the cleanup interval so that
compactor replicas do not all run at the same moment every time the interval
elapses. The cleanup also takes some time to perform. As a result,
the index age can be slightly older than the cleanup interval, which
is not cause for concern. We recommend alerting when
the index age exceeds (2 * &lt;code&gt;-compactor.cleanup-interval&lt;/code&gt;) &#43; 5 minutes.&lt;/p&gt;&lt;/blockquote&gt;
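&lt;p&gt;The recommended threshold can be expressed as a Prometheus-style alerting rule. The following is a sketch under stated assumptions: it assumes a cleanup interval of &lt;code&gt;15m&lt;/code&gt; (substitute your own value), the group name, alert name, and severity label are hypothetical, and the annotation assumes the metric carries the same &lt;code&gt;user&lt;/code&gt; label as the blocks-per-tenant query.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  # Hypothetical group name.
  - name: gem_bucket_index
    rules:
      # Hypothetical alert name.
      - alert: GEMBucketIndexStale
        # Threshold: (2 * 900s) &amp;#43; 300s = 2100s, assuming a 15m cleanup interval.
        expr: time() - cortex_bucket_index_last_successful_update_timestamp_seconds &amp;gt; 2100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: &amp;#34;Bucket index for tenant {{ $labels.user }} has not been updated recently.&amp;#34;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;With a different cleanup interval, recompute the threshold in seconds using the same formula.&lt;/p&gt;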
]]></content><description>&lt;h2 id="monitoring-compactor-health">Monitoring compactor health&lt;/h2>
&lt;p>Grafana Enterprise Metrics emits several metrics related to compactor health.
The following queries are useful to get a high-level view of compactor
activity. For users with self-monitoring enabled, please see the &lt;code>GEM system monitoring / compactor&lt;/code> dashboard, which includes panels built from these queries.&lt;/p></description></item><item><title>Gateway</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/gateway/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/gateway/</guid><content><![CDATA[&lt;h1 id=&#34;gateway&#34;&gt;Gateway&lt;/h1&gt;
&lt;p&gt;The Grafana Enterprise Metrics gateway is a service target. It can proxy requests to other Grafana Enterprise Metrics microservices. You can also use it for client-side load balancing of requests proxied to the distributors.&lt;/p&gt;
&lt;h2 id=&#34;configuration&#34;&gt;Configuration&lt;/h2&gt;
&lt;p&gt;The gateway has its own configuration block in the Grafana Enterprise Metrics configuration files.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;gateway:
  proxy:
    default: &amp;lt;backend_proxy_config&amp;gt;
    [ admin_api: &amp;lt;backend_proxy_config&amp;gt; ]
    [ alertmanager: &amp;lt;backend_proxy_config&amp;gt; ]
    [ compactor: &amp;lt;backend_proxy_config&amp;gt; ]
    [ distributor: &amp;lt;backend_proxy_config&amp;gt; ]
    [ graphite: &amp;lt;backend_proxy_config&amp;gt; ]
    [ ingester: &amp;lt;backend_proxy_config&amp;gt; ]
    [ query_frontend: &amp;lt;backend_proxy_config&amp;gt; ]
    [ ruler: &amp;lt;backend_proxy_config&amp;gt; ]
    [ store_gateway: &amp;lt;backend_proxy_config&amp;gt; ]&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;You can also use flags to configure the gateway. Each flag is the path to the equivalent configuration field joined by the period (&lt;code&gt;.&lt;/code&gt;) character and with underscores (&lt;code&gt;_&lt;/code&gt;) replaced with dashes (&lt;code&gt;-&lt;/code&gt;).
For example, use the flag &lt;code&gt;--gateway.proxy.store-gateway.url=&amp;lt;store-gateway url&amp;gt;&lt;/code&gt; to configure the store-gateway backend proxy URL.&lt;/p&gt;
&lt;h3 id=&#34;backend_proxy_config&#34;&gt;&amp;lt;backend_proxy_config&amp;gt;&lt;/h3&gt;
&lt;p&gt;A &lt;code&gt;&amp;lt;backend_proxy_config&amp;gt;&lt;/code&gt; section specifies the URL of the backend to be proxied.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;url: &amp;lt;url&amp;gt; | default = &amp;lt;gateway.proxy.default.url&amp;gt;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
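&lt;p&gt;To make the schema concrete, the following sketch sets a default backend and overrides only the store-gateway. The hostnames and ports are hypothetical. Based on the default shown above, backends without their own entry are assumed to fall back to the &lt;code&gt;default&lt;/code&gt; URL.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;gateway:
  proxy:
    # Used for every backend that has no entry of its own.
    default:
      url: http://gem.example.svc.cluster.local:8080
    # Only the store-gateway backend is overridden here.
    # Equivalent flag form: --gateway.proxy.store-gateway.url=&amp;lt;store-gateway url&amp;gt;
    store_gateway:
      url: http://gem-store-gateway.example.svc.cluster.local:8080&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;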
&lt;h2 id=&#34;client-side-load-balancing&#34;&gt;Client-side load balancing&lt;/h2&gt;
&lt;p&gt;If you use a backend proxy URL beginning with &lt;code&gt;dns:///&lt;/code&gt;, it creates a gRPC proxy with client-side round-robin load balancing instead of the default HTTP reverse proxy.
To configure client-side load balancing for requests to the distributors, set the &lt;code&gt;gateway.proxy.distributor.url&lt;/code&gt; to &lt;code&gt;dns:///&amp;lt;distributor service&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The preceding DNS URL contains three &lt;code&gt;/&lt;/code&gt; characters, which means that it uses the default DNS authority. For details about DNS URLs, refer to &lt;a href=&#34;https://tools.ietf.org/html/rfc4501&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;RFC 4501&lt;/a&gt;.&lt;/p&gt;&lt;/blockquote&gt;
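&lt;p&gt;A configuration along the following lines enables the gRPC proxy for the distributors only. This is a sketch: the hostnames and ports are hypothetical, and it assumes that the DNS name resolves to the individual distributor addresses (for example, through a headless Service in Kubernetes) so that there are multiple endpoints to balance across.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;gateway:
  proxy:
    # Other backends keep the default HTTP reverse proxy.
    default:
      url: http://gem.example.svc.cluster.local:8080
    # The dns:/// prefix (three slashes) selects the gRPC proxy
    # with client-side round-robin load balancing.
    distributor:
      url: dns:///gem-distributor-headless.example.svc.cluster.local:9095&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;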
&lt;p&gt;Client-side load balancing helps ensure that distributors are evenly loaded with requests.
Prometheus remote-write clients use HTTP persistent connections, also known as HTTP keep-alive, to reuse a single TCP connection for multiple requests and responses, which reduces latency for subsequent requests.&lt;/p&gt;
&lt;p&gt;Kubernetes Services balance connections, not requests: the initial TCP connection is made to a random endpoint, but once the connection is established, the same remote-write client talks to the same distributor for the lifetime of that connection. This can lead to uneven load across your distributors and worse cluster performance overall.&lt;/p&gt;
&lt;p&gt;The Grafana Enterprise Metrics gateway solves this problem by exposing an HTTP server that receives the client requests and using gRPC to talk to the distributors.
The gRPC proxy maintains a list of endpoints returned from the DNS lookup and keeps a persistent connection to each one. The proxy also performs per-request client-side load balancing across the endpoints, which keeps the benefits of persistent connections without the uneven load described in the preceding paragraph.&lt;/p&gt;
]]></content><description>&lt;h1 id="gateway">Gateway&lt;/h1>
&lt;p>The Grafana Enterprise Metrics gateway is a service target. It can proxy requests to other Grafana Enterprise Metrics microservices. You can also use it for client-side load balancing of requests proxied to the distributors.&lt;/p></description></item><item><title>OAuth integration</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/oauth/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/oauth/</guid><content><![CDATA[&lt;h1 id=&#34;oauth-integration&#34;&gt;OAuth integration&lt;/h1&gt;
&lt;p&gt;Grafana Enterprise Metrics supports the &lt;a href=&#34;https://openid.net/specs/openid-connect-core-1_0.html&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;&lt;strong&gt;OpenID Connect (OIDC)&lt;/strong&gt;&lt;/a&gt; core standard to validate tokens. This allows you to integrate GEM with an existing OAuth token provider at your organization.&lt;/p&gt;
&lt;p&gt;To support OIDC, provide the URL of the OIDC provider (issuer) in the &lt;code&gt;auth.admin.oidc.issuer_url&lt;/code&gt; setting. The provider must expose the OIDC Discovery endpoint (also known as the &amp;ldquo;well-known endpoint&amp;rdquo;) at &lt;code&gt;&amp;lt;issuer-url&amp;gt;/.well-known/openid-configuration&lt;/code&gt;, as described &lt;a href=&#34;https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderConfig&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;in the OpenID Connect Discovery specification&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A JWT is included as the password in HTTP basic authentication or as part of a bearer token in bearer authentication. The bearer token should have two parts separated by a &lt;code&gt;:&lt;/code&gt;. The first part is the tenant ID. The second part is the JWT.&lt;/p&gt;
&lt;p&gt;The JWT is validated against the OIDC provider specified above. If it is valid, an access policy name is extracted. The regular expression in &lt;code&gt;auth.admin.oidc.access_policy_regex&lt;/code&gt; is run against each value in the JWT claim field specified in &lt;code&gt;auth.admin.oidc.access_policy_claim&lt;/code&gt;, which can be either a single string or a list of strings.&lt;/p&gt;
&lt;p&gt;A sub-match has to be present to extract the access policy. If the value in the JWT claim field is a string, then only the first sub-match is used. If it is a list of strings, then the first sub-match for each entry is used. To use the whole claim field, you can use the regular expression &lt;code&gt;(.*)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The regular expression syntax is &lt;a href=&#34;https://github.com/google/re2/wiki/Syntax&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;RE2&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;configuration&#34;&gt;Configuration&lt;/h2&gt;
&lt;p&gt;To use OIDC, specify the &lt;code&gt;auth.type&lt;/code&gt; as &lt;code&gt;enterprise&lt;/code&gt;. Here is an example authentication section:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;auth:
  type: enterprise
  admin:
    oidc:
      issuer_url: https://accounts.authprovider.com/realms/example
      access_policy_claim: &amp;#34;sub&amp;#34;
      access_policy_regex: &amp;#34;pref-([0-9]&amp;#43;)-.*&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Here is an example payload section of a valid JWT:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;JSON&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-json&#34;&gt;{
  &amp;#34;sub&amp;#34;: &amp;#34;pref-1234567890-abc&amp;#34;,
  &amp;#34;name&amp;#34;: &amp;#34;John Doe&amp;#34;,
  &amp;#34;admin&amp;#34;: true
}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The extracted access policy is &lt;code&gt;1234567890&lt;/code&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; OpenID Connect (OIDC) converts the encoded access policies to lowercase (&lt;code&gt;downcase&lt;/code&gt;). For example, if your OpenID system has an access policy called &lt;code&gt;Team1&lt;/code&gt;, then you need to create an access policy in GEM called &lt;code&gt;team1&lt;/code&gt; so the integration works.&lt;/p&gt;&lt;/blockquote&gt;
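&lt;p&gt;As a variation on the preceding configuration, the following sketch uses the &lt;code&gt;(.*)&lt;/code&gt; expression to take the whole claim value as the access policy name. The claim name &lt;code&gt;team&lt;/code&gt; is hypothetical; use whichever claim your provider issues.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;auth:
  type: enterprise
  admin:
    oidc:
      issuer_url: https://accounts.authprovider.com/realms/example
      # Hypothetical claim name.
      access_policy_claim: &amp;#34;team&amp;#34;
      # The single sub-match spans the entire claim value.
      access_policy_regex: &amp;#34;(.*)&amp;#34;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;With this configuration, a claim value of &lt;code&gt;Team1&lt;/code&gt; would be expected to map to a GEM access policy named &lt;code&gt;team1&lt;/code&gt;, following the lowercase conversion described in the preceding note.&lt;/p&gt;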
&lt;h2 id=&#34;multiple-access-policies&#34;&gt;Multiple access policies&lt;/h2&gt;
&lt;p&gt;You can provide an array of strings in the JWT claim field. If this array includes only one item, then the behavior is the same as when providing a string in this field. When multiple access policies are provided as a list in the JWT claim field, they are aggregated into a &amp;ldquo;virtual&amp;rdquo; access policy. This &amp;ldquo;virtual&amp;rdquo; access policy provides metric read access to the union of all tenants contained in the original access policies. For example, given the preceding configuration and the following JWT:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;JSON&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-json&#34;&gt;{
  &amp;#34;sub&amp;#34;: [&amp;#34;pref-1234567890-abc&amp;#34;, &amp;#34;pref-9876543210-xyz&amp;#34;],
  &amp;#34;name&amp;#34;: &amp;#34;John Doe&amp;#34;,
  &amp;#34;admin&amp;#34;: true
}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The resulting access policy would be an aggregate of &lt;code&gt;1234567890&lt;/code&gt; and &lt;code&gt;9876543210&lt;/code&gt;. If &lt;code&gt;1234567890&lt;/code&gt; provided read and write access to &lt;code&gt;tenant-1&lt;/code&gt;, and &lt;code&gt;9876543210&lt;/code&gt; provided read and write access to &lt;code&gt;tenant-2&lt;/code&gt; and &lt;code&gt;tenant-3&lt;/code&gt;, the resulting virtual access policy would provide read-only access to &lt;code&gt;tenant-1&lt;/code&gt;, &lt;code&gt;tenant-2&lt;/code&gt;, and &lt;code&gt;tenant-3&lt;/code&gt;. This generated access policy is cached for the period specified in &lt;code&gt;auth.cache.ttl.duration&lt;/code&gt;, which defaults to &lt;code&gt;10m&lt;/code&gt;.&lt;/p&gt;
]]></content><description>&lt;h1 id="oauth-integration">OAuth integration&lt;/h1>
&lt;p>Grafana Enterprise Metrics supports the &lt;a href="https://openid.net/specs/openid-connect-core-1_0.html" target="_blank" rel="noopener noreferrer">&lt;strong>OpenID Connect (OIDC)&lt;/strong>&lt;/a> core standard to validate tokens. This allows you to integrate GEM with an existing OAuth token provider at your organization.&lt;/p></description></item><item><title>Remote-write rule forwarding</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/remote-write-rule-forwarding/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/remote-write-rule-forwarding/</guid><content><![CDATA[&lt;h1 id=&#34;remote-write-rule-forwarding&#34;&gt;Remote-write rule forwarding&lt;/h1&gt;
&lt;p&gt;Grafana Enterprise Metrics (GEM) allows for forwarding metrics evaluated from the
&lt;a href=&#34;/docs/mimir/latest/operators-guide/architecture/components/ruler/&#34;&gt;Ruler&lt;/a&gt; to any Prometheus
remote-write compatible backend.&lt;/p&gt;
&lt;p&gt;This works by loading rule groups into the Ruler with an extra config field
as shown in the example below:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;console&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-console&#34;&gt;# A regular Grafana Mimir rule group
groups:
  - name: group_one
    interval: 5m
    rules:
      - expr: &amp;#39;rate(prometheus_remote_storage_samples_in_total[5m])&amp;#39;
        record: &amp;#39;prometheus_remote_storage_samples_in_total:rate5m&amp;#39;
  # A rule group whose results are forwarded using remote write
  - name: group_two
    interval: 1m
    rules:
      - expr: &amp;#39;rate(prometheus_remote_storage_samples_in_total[1m])&amp;#39;
        record: &amp;#39;prometheus_remote_storage_samples_in_total:rate1m&amp;#39;
    remote_write:
      - url: &amp;#39;http://user:pass@example.com/api/v1/push&amp;#39;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;In the above example, when &lt;code&gt;group_two&lt;/code&gt; is loaded into Grafana Enterprise Metrics, the Ruler
evaluates the expression &lt;code&gt;rate(prometheus_remote_storage_samples_in_total[1m])&lt;/code&gt; every &lt;code&gt;1m&lt;/code&gt;
and forwards the generated metric, named &lt;code&gt;prometheus_remote_storage_samples_in_total:rate1m&lt;/code&gt;,
to &lt;code&gt;example.com&lt;/code&gt;. Meanwhile, &lt;code&gt;group_one&lt;/code&gt; continues to work as usual: its evaluated
metric &lt;code&gt;prometheus_remote_storage_samples_in_total:rate5m&lt;/code&gt; is stored within the same GEM
tenant that is running the Ruler.&lt;/p&gt;
&lt;h3 id=&#34;configuration&#34;&gt;Configuration&lt;/h3&gt;
&lt;h4 id=&#34;rule-storage&#34;&gt;Rule storage&lt;/h4&gt;
&lt;p&gt;Remote-write rules are compatible with the following rule storage backends:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Azure Blob Storage&lt;/li&gt;
&lt;li&gt;GCS&lt;/li&gt;
&lt;li&gt;S3&lt;/li&gt;
&lt;li&gt;Swift&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following backends are not supported:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;local filesystem&lt;/li&gt;
&lt;li&gt;ConfigDB&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;write-ahead-log-wal&#34;&gt;Write-ahead log (WAL)&lt;/h4&gt;
&lt;p&gt;When a rule group is configured with a remote-write config, GEM buffers the generated metrics in a write-ahead log (WAL) before forwarding them to the remote-write endpoint. This increases reliability in case either GEM or the remote endpoint crashes. If GEM crashes, it reads from the WAL and continues to forward metrics to the configured backend from the last sent timestamp. If the remote endpoint crashes, GEM continues to retry requests until the endpoint is available again. If multiple rule groups are configured to send to the same remote-write endpoint, GEM uses a common WAL for the metrics generated by those rule groups.&lt;/p&gt;
&lt;p&gt;The WAL is truncated at the interval specified by the &lt;code&gt;ruler.remote-write.wal-truncate-frequency&lt;/code&gt; setting. WAL entries older than the time specified in the &lt;code&gt;ruler.remote-write.max-wal-time&lt;/code&gt; setting are removed. WAL entries younger than the time specified in the &lt;code&gt;ruler.remote-write.min-wal-time&lt;/code&gt; setting are not removed.&lt;/p&gt;
&lt;p&gt;By default, the WAL is stored in the &lt;code&gt;wal&lt;/code&gt; folder in the GEM binary working directory.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;console&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-console&#34;&gt;$ ls
enterprise-metrics-binary   wal/&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The directory can be configured as shown:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;ruler:
  remote_write:
    enabled: true
    wal_dir: /tmp/wal
    min_wal_time: 1h
    max_wal_time: 5h
    wal_truncate_frequency: 1h&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
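&lt;p&gt;To illustrate the shared-WAL behavior described above, the following sketch of a rule file defines two rule groups that forward to the same remote-write endpoint, so GEM would be expected to buffer both in a common WAL. The group names and recording rules are hypothetical; the URL repeats the placeholder endpoint used earlier on this page.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  # Both groups below forward to the same endpoint.
  - name: shared_endpoint_group_a
    interval: 1m
    rules:
      - expr: sum(up)
        record: sum_up
    remote_write:
      - url: &amp;#34;http://user:pass@example.com/api/v1/push&amp;#34;
  - name: shared_endpoint_group_b
    interval: 5m
    rules:
      - expr: count(up)
        record: count_up
    remote_write:
      - url: &amp;#34;http://user:pass@example.com/api/v1/push&amp;#34;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;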
&lt;h4 id=&#34;example&#34;&gt;Example&lt;/h4&gt;
&lt;p&gt;The following is a complete example of the preceding configuration options, using a ruler with sharding enabled and S3 as its rule storage backend:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;ruler:
  external_url: localhost:9090
  rule_path: &amp;#34;/tmp/rules&amp;#34;
  storage:
    type: s3
    s3:
      endpoint: minio:9000
      access_key_id: cortex
      secret_access_key: supersecret
      bucketnames: &amp;#34;gem-ruler&amp;#34;
      insecure: true
      s3forcepathstyle: true
  poll_interval: 10s
  enable_api: true
  enable_sharding: true
  ring:
    kvstore:
      store: memberlist
  remote_write:
    enabled: true
    wal_dir: /tmp/wal
    min_wal_time: 1h
    max_wal_time: 5h
    wal_truncate_frequency: 1h&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;loading-remote-write-groups&#34;&gt;Loading remote-write groups&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;mimirtool&lt;/code&gt; is compatible with Prometheus rule files that contain the remote-write rule group syntax. You can download the latest version of &lt;code&gt;mimirtool&lt;/code&gt; &lt;a href=&#34;https://github.com/grafana/mimir/releases&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;from the Grafana Mimir releases page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can also use the &lt;code&gt;mimirtool&lt;/code&gt; Docker image: &lt;code&gt;docker pull grafana/mimirtool:latest&lt;/code&gt;&lt;/p&gt;
&lt;h4 id=&#34;example-usage&#34;&gt;Example usage&lt;/h4&gt;
&lt;p&gt;Once you have GEM running with remote-write rule groups enabled, you can load remote-write rule groups using the following procedure.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Save the following file to your workspace:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;rules.yaml:&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  - name: remote_write_group
    interval: 5m
    rules:
      - expr: &amp;#34;sum(up)&amp;#34;
        record: &amp;#34;sum_up&amp;#34;
    remote_write:
      - url: &amp;#34;http://user:pass@example.com/api/v1/push&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;Run the following command with &lt;code&gt;mimirtool&lt;/code&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;Bash&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-bash&#34;&gt;mimirtool rules sync \
--rule-files=rules.yaml \
--id=&amp;lt;tenant-name&amp;gt; \
--address=&amp;lt;gem-url&amp;gt; \
--key=&amp;lt;valid-gem-write-token&amp;gt;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
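&lt;p&gt;After the sync, you may want to confirm that the rule group is stored for the tenant. The following is a minimal sketch that wraps &lt;code&gt;mimirtool rules list&lt;/code&gt; with the same connection flags used above. The variable names &lt;code&gt;GEM_URL&lt;/code&gt;, &lt;code&gt;GEM_TENANT&lt;/code&gt;, &lt;code&gt;GEM_TOKEN&lt;/code&gt;, and &lt;code&gt;MIMIRTOOL&lt;/code&gt; are assumptions made for this example, and the token is assumed to have permission to read rules.&lt;/p&gt;

```shell
# Sketch: list the rule groups stored for a tenant after `mimirtool rules sync`.
# GEM_URL, GEM_TENANT, and GEM_TOKEN are hypothetical variable names chosen for
# this example. MIMIRTOOL lets you point at a different binary or wrapper.
list_rule_groups() {
  "${MIMIRTOOL:-mimirtool}" rules list \
    --address="${GEM_URL}" \
    --id="${GEM_TENANT}" \
    --key="${GEM_TOKEN}"
}

# Usage (requires a reachable GEM cluster):
#   GEM_URL=<gem-url> GEM_TENANT=<tenant-name> GEM_TOKEN=<valid-gem-token> list_rule_groups
```

&lt;p&gt;If the sync succeeded, the output should include &lt;code&gt;remote_write_group&lt;/code&gt;.&lt;/p&gt;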
]]></content><description>&lt;h1 id="remote-write-rule-forwarding">Remote-write rule forwarding&lt;/h1>
&lt;p>Grafana Enterprise Metrics (GEM) allows for forwarding metrics evaluated from the
&lt;a href="/docs/mimir/latest/operators-guide/architecture/components/ruler/">Ruler&lt;/a> to any Prometheus
remote-write compatible backend.&lt;/p>
&lt;p>This works by loading rule groups into the Ruler with an extra config field
as shown in the example below:&lt;/p></description></item><item><title>Cluster query federation</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/cluster-query-federation/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/cluster-query-federation/</guid><content><![CDATA[&lt;h1 id=&#34;cluster-query-federation&#34;&gt;Cluster query federation&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; Cluster query federation is an experimental feature. As such, the
configuration settings, command line flags, or specifics of the implementation
are subject to change.&lt;/p&gt;&lt;/blockquote&gt;
&lt;h2 id=&#34;overview&#34;&gt;Overview&lt;/h2&gt;
&lt;p&gt;Since version 1.4, Grafana Enterprise Metrics (GEM) includes the optional
component &lt;code&gt;federation-frontend&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This component lets you aggregate data from multiple GEM clusters in a single
PromQL query. The underlying target clusters
are queried using the Prometheus &lt;a href=&#34;https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;&lt;code&gt;remote_read&lt;/code&gt;&lt;/a&gt; API and &lt;a href=&#34;https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Labels
API&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The component itself does not require any other components of Grafana Mimir,
so you can run it on its own. A common use case is aggregating
the data from two GEM clusters that are running in different regions:&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;cluster-fed-architecture.png&#34;
  alt=&#34;Cluster federation architecture&#34;/&gt;&lt;/p&gt;
&lt;h2 id=&#34;configuration&#34;&gt;Configuration&lt;/h2&gt;
&lt;p&gt;A minimal configuration of the &lt;code&gt;federation-frontend&lt;/code&gt; has to disable
authentication, because the federation frontend forwards the Basic
authentication and Bearer token that is supplied by its clients to the underlying target
clusters. Also, to start the federation frontend, configure the target to be
&lt;code&gt;federation-frontend&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You need to configure a list of target clusters within the
&lt;code&gt;federation.proxy_targets&lt;/code&gt; block; currently, there are no equivalent CLI flags
available. Each entry requires a &lt;code&gt;name&lt;/code&gt; that contains an identifier that will
be exposed using the &lt;code&gt;__cluster__&lt;/code&gt; label in the query results and a &lt;code&gt;url&lt;/code&gt; that
points to a Prometheus compatible API. For GEM, use the URL
&lt;code&gt;http://&amp;lt;gem-host&amp;gt;/prometheus&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Optionally, you can configure each &lt;code&gt;proxy_target&lt;/code&gt; to have Basic auth
credentials, which override the user-supplied ones.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; When you configure Basic auth via the &lt;code&gt;proxy_target&lt;/code&gt;
configuration, those credentials take precedence over the client-supplied
ones. Unless you take other preventive action, any client that can reach the federation
frontend can perform queries by using those credentials.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;In the following example, two clusters in two different regions are queried via
the federation frontend:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;multitenancy_enabled: false # The federation frontend does not do any authentication itself
target: federation-frontend # Run the federation frontend only

federation:
  proxy_targets:
    - name: us-west
      url: http://gem-us-west/prometheus
    - name: us-east
      url: http://gem-us-east/prometheus&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;aggregate-metrics-from-a-local-gem-cluster-and-grafana-cloud-metric-stack&#34;&gt;Aggregate metrics from a local GEM cluster and a Grafana Cloud Metrics stack&lt;/h3&gt;
&lt;p&gt;The federation frontend allows you to get an aggregated view of metrics stored
in a local GEM cluster and a hosted Grafana Cloud Metrics stack. With the
following configuration, you can query both of the clusters as though they were
one:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;federation:
  proxy_targets:
    - name: own-data-center
      url: http://gem/prometheus
    - name: grafana-cloud
      url: https://prometheus-us-central1.grafana.net/api/prom
      basic_auth:
        username: &amp;lt;tenant-id&amp;gt;
        password: &amp;lt;token&amp;gt;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; This gives any client that can reach the federation frontend
access to your metrics data in Grafana Cloud Metrics without further
authentication.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;By using the authentication credentials of the local GEM cluster, you can
execute a query against both clusters. To do so, set the access policy&amp;rsquo;s token
as a variable for subsequent commands:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;$ export API_TOKEN=&amp;#34;the long token string you copied&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;$ curl -s -u &amp;#34;&amp;lt;tenant-id&amp;gt;:$API_TOKEN&amp;#34; -G --data-urlencode &amp;#34;query=count(up) by (__cluster__)&amp;#34; http://federation-frontend/prometheus/api/v1/query | jq&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;JSON&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-json&#34;&gt;{
  &amp;#34;status&amp;#34;: &amp;#34;success&amp;#34;,
  &amp;#34;data&amp;#34;: {
    &amp;#34;resultType&amp;#34;: &amp;#34;vector&amp;#34;,
    &amp;#34;result&amp;#34;: [
      {
        &amp;#34;metric&amp;#34;: {
          &amp;#34;__cluster__&amp;#34;: &amp;#34;own-data-center&amp;#34;
        },
        &amp;#34;value&amp;#34;: [1623344524.382, &amp;#34;10&amp;#34;]
      },
      {
        &amp;#34;metric&amp;#34;: {
          &amp;#34;__cluster__&amp;#34;: &amp;#34;grafana-cloud&amp;#34;
        },
        &amp;#34;value&amp;#34;: [1623344524.382, &amp;#34;4&amp;#34;]
      }
    ]
  }
}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h2 id=&#34;limitations-of-cluster-query-federation&#34;&gt;Limitations of cluster query federation&lt;/h2&gt;
&lt;p&gt;This &lt;em&gt;experimental feature&lt;/em&gt; comes with some limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No result caching in the federation frontend&lt;/li&gt;
&lt;li&gt;No support for alerting/ruler on a federation level&lt;/li&gt;
&lt;li&gt;No support for metric metadata endpoint&lt;/li&gt;
&lt;li&gt;No support for exemplars&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If your use case is blocked by one of those limitations, then feel free to reach out
through our support channels with a feature request.&lt;/p&gt;
]]></content><description>&lt;h1 id="cluster-query-federation">Cluster query federation&lt;/h1>
&lt;blockquote>
&lt;p>&lt;strong>NOTE:&lt;/strong> Cluster query federation is an experimental feature. As such, the
configuration settings, command line flags, or specifics of the implementation
are subject to change.&lt;/p></description></item><item><title>Self-monitoring</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/self-monitoring/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/self-monitoring/</guid><content><![CDATA[&lt;h1 id=&#34;self-monitoring&#34;&gt;Self monitoring&lt;/h1&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; Self-monitoring is an experimental feature. As such, the configuration settings, command line flags, or
specifics of the implementation are subject to change.&lt;/p&gt;
&lt;h2 id=&#34;overview&#34;&gt;Overview&lt;/h2&gt;
&lt;p&gt;Since version 1.4, Grafana Enterprise Metrics (GEM) includes the ability to directly record self-monitoring metrics to
allow you to easily monitor the health and stability of GEM itself. The metrics GEM collects about itself are written to
a built-in &lt;code&gt;__system__&lt;/code&gt; tenant. The metrics written can be queried as usual using tokens created under the
built-in &lt;code&gt;__system__&lt;/code&gt; access policy. Since version 1.8, GEM directly records &lt;a href=&#34;../exemplars/&#34;&gt;exemplars&lt;/a&gt;
as part of self-monitoring metrics.&lt;/p&gt;
&lt;p&gt;The way self-monitoring works ensures that any metrics available from GEM via &lt;code&gt;/metrics&lt;/code&gt; endpoints will be available
directly in GEM without needing to be scraped by an external process. While these metrics would ordinarily need to be
scraped using &lt;a href=&#34;https://prometheus.io/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Prometheus&lt;/a&gt; or the
&lt;a href=&#34;/docs/grafana-cloud/agent/&#34;&gt;Grafana Agent&lt;/a&gt;, with self-monitoring they will be available after
following the quick setup described below.&lt;/p&gt;
&lt;p&gt;This feature provides a simple, out-of-the-box way to monitor GEM itself with a minimum amount of
configuration or extra dependencies. To get the maximum value of this feature, we recommend you install
&lt;a href=&#34;/grafana/plugins/grafana-metrics-enterprise-app/&#34;&gt;GEM&amp;rsquo;s Grafana plugin&lt;/a&gt;, which automatically
provisions a set of dashboards that use the self-monitoring metrics. The dashboards are in line with Grafana
Labs&amp;rsquo; best practices for understanding GEM system health. Self-monitoring is compatible with plugin versions &amp;gt;= 3.0.4
(which require Grafana 8). Grafana 7.5 users should use version 2.1.1.&lt;/p&gt;
&lt;h2 id=&#34;configuration&#34;&gt;Configuration&lt;/h2&gt;
&lt;p&gt;The sections below describe the steps needed to set up self-monitoring.&lt;/p&gt;
&lt;h3 id=&#34;single-binary-mode&#34;&gt;Single binary mode&lt;/h3&gt;
&lt;p&gt;Self-monitoring is enabled by default in single binary mode, so no action is necessary.&lt;/p&gt;
&lt;h3 id=&#34;microservices-mode&#34;&gt;Microservices mode&lt;/h3&gt;
&lt;p&gt;In order to use self-monitoring in microservices mode, you&amp;rsquo;ll need a hostname that you can use to
address the gRPC port (9095 by default) of each of the GEM distributors. This could be a load balancer that balances
between each distributor, a DNS &lt;code&gt;A&lt;/code&gt; record that includes IPs for each distributor, or a Kubernetes service that balances
between each gRPC port of the distributor pods. For the purposes of this example, we&amp;rsquo;ll assume that you are using a
Kubernetes service and GEM is running in a namespace called &lt;code&gt;enterprise-metrics&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Add the following section to your GEM configuration file used by each GEM pod or process.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;instrumentation:
  distributor_client:
    address: dns:///distributor.enterprise-metrics.svc.cluster.local:9095&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Or&lt;/strong&gt;, add the following command line flag to the arguments passed to each GEM pod or process:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;-instrumentation.distributor-client.address=&#39;dns:///distributor.enterprise-metrics.svc.cluster.local:9095&#39;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The configuration above gives you system health metrics for the entire GEM cluster. To better understand GEM
behavior, you also need to understand resource usage at a per-tenant level. To get the self-monitoring metrics
needed for this (and to populate the &amp;ldquo;Per Tenant Usage&amp;rdquo; dashboards provisioned by the GEM plugin),
you must also deploy the &lt;a href=&#34;overrides-exporter/&#34;&gt;overrides-exporter&lt;/a&gt; component.&lt;/p&gt;
&lt;h3 id=&#34;exemplars&#34;&gt;Exemplars&lt;/h3&gt;
&lt;p&gt;Since GEM 1.8, self-monitoring has the ability to directly record &lt;a href=&#34;../exemplars/&#34;&gt;exemplars&lt;/a&gt;.
However, recording of the exemplars under the &lt;code&gt;__system__&lt;/code&gt; tenant is still controlled by the same
&lt;a href=&#34;../../tenant-management/limits/&#34;&gt;limits&lt;/a&gt; applied to all other tenants. This means that recording of
exemplars for the &lt;code&gt;__system__&lt;/code&gt; tenant is disabled by default (as it is for all tenants) and must be enabled using the
runtime configuration file or enabled globally.&lt;/p&gt;
&lt;p&gt;Since the &lt;code&gt;__system__&lt;/code&gt; tenant is built into GEM itself and immutable, limits for it (such as enabling exemplars)
cannot be set using the Admin API. Instead, if you wish to emit exemplars for the &lt;code&gt;__system__&lt;/code&gt; tenant you must override
the &lt;code&gt;max_global_exemplars_per_user&lt;/code&gt; setting for the &lt;code&gt;__system__&lt;/code&gt; tenant using
the &lt;a href=&#34;/docs/mimir/latest/operators-guide/configuring/about-runtime-configuration/&#34;&gt;runtime configuration file&lt;/a&gt; or
enable exemplars globally.&lt;/p&gt;
&lt;p&gt;Here is an example of using the runtime configuration file:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
  __system__:
    max_global_exemplars_per_user: 300000&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;verification&#34;&gt;Verification&lt;/h3&gt;
&lt;p&gt;After you&amp;rsquo;ve deployed the configuration changes above, you&amp;rsquo;ll need to verify that self-monitoring is working correctly.
We&amp;rsquo;ll learn how to query the self-monitoring metrics later, but to verify they&amp;rsquo;re working we can check a simple counter
incremented when self-monitoring metrics are emitted.&lt;/p&gt;
&lt;p&gt;Pick a single pod or process that is part of your GEM cluster. For this example, we&amp;rsquo;ll assume that you have picked an
ingester.&lt;/p&gt;
&lt;p&gt;Make a &lt;code&gt;curl&lt;/code&gt; request to the &lt;code&gt;/metrics&lt;/code&gt; endpoint of the ingester.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;$ curl -s &amp;#39;http://ingester-01.example.com/metrics&amp;#39; | grep &amp;#39;cortex_self_monitoring_pushes_total&amp;#39;
# HELP cortex_self_monitoring_pushes_total Number of successes pushing self-monitoring metrics
# TYPE cortex_self_monitoring_pushes_total counter
cortex_self_monitoring_pushes_total 15&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; If you are running GEM in a Kubernetes cluster, individual pods might not be directly accessible from outside
the Kubernetes cluster. In this case you can make the request from another pod running in the Kubernetes cluster, or you
can make use of the
&lt;a href=&#34;https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;kubectl port-forward&lt;/a&gt;
command.&lt;/p&gt;
&lt;p&gt;If the metric above is &lt;code&gt;0&lt;/code&gt; or doesn&amp;rsquo;t exist, check the logs for each GEM component looking for errors or warnings
related to pushing metrics to a distributor.&lt;/p&gt;
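&lt;p&gt;The check above can be scripted. The following is a minimal sketch, not part of GEM itself: a hypothetical helper named &lt;code&gt;self_monitoring_pushes_ok&lt;/code&gt; reads the &lt;code&gt;/metrics&lt;/code&gt; output on standard input and succeeds only when the counter exists and is greater than zero. It assumes &lt;code&gt;awk&lt;/code&gt; is available.&lt;/p&gt;

```shell
# Read Prometheus text-format metrics on stdin. Exit 0 only if
# cortex_self_monitoring_pushes_total is present and greater than zero.
self_monitoring_pushes_ok() {
  awk '
    # Match sample lines only ("# HELP" and "# TYPE" lines start with "#").
    # The sample value is the last field on the line.
    $1 ~ /^cortex_self_monitoring_pushes_total([{]|$)/ { total += $NF; found = 1 }
    END { exit ((found && total > 0) ? 0 : 1) }
  '
}

# Usage (host name taken from the example above):
#   curl -s 'http://ingester-01.example.com/metrics' | self_monitoring_pushes_ok \
#     && echo "self-monitoring is pushing metrics"
```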
&lt;h2 id=&#34;querying&#34;&gt;Querying&lt;/h2&gt;
&lt;p&gt;In order to query self-monitoring metrics directly, you&amp;rsquo;ll need to
&lt;a href=&#34;../../admin-api/&#34;&gt;create a token&lt;/a&gt; associated with the &lt;code&gt;__system__&lt;/code&gt; access policy. The steps
below assume you have already done this and copied down the token. The following examples further assume that your GEM
cluster is available at the host &lt;code&gt;gem.example.com&lt;/code&gt; over HTTPS.&lt;/p&gt;
&lt;p&gt;First, set the token as a variable to use for the subsequent commands.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;$ export API_TOKEN=&amp;#34;the long token string you copied&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Next, we&amp;rsquo;ll make a request to the Prometheus query endpoint of GEM looking for a particular metric, in this case
&lt;code&gt;grafana_metrics_enterprise_build_info&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;$ curl -s -u &amp;#34;__system__:$API_TOKEN&amp;#34; &amp;#34;https://gem.example.com/prometheus/api/v1/query?query=grafana_metrics_enterprise_build_info&amp;#34; | jq
{
  &amp;#34;status&amp;#34;: &amp;#34;success&amp;#34;,
  &amp;#34;data&amp;#34;: {
    &amp;#34;resultType&amp;#34;: &amp;#34;vector&amp;#34;,
    &amp;#34;result&amp;#34;: [
      {
        &amp;#34;metric&amp;#34;: {
          &amp;#34;__name__&amp;#34;: &amp;#34;grafana_metrics_enterprise_build_info&amp;#34;,
          &amp;#34;branch&amp;#34;: &amp;#34;gem-release-1.4&amp;#34;,
          &amp;#34;goversion&amp;#34;: &amp;#34;go1.16.3&amp;#34;,
          &amp;#34;instance&amp;#34;: &amp;#34;ingester-01:80&amp;#34;,
          &amp;#34;revision&amp;#34;: &amp;#34;ccd12b7a&amp;#34;,
          &amp;#34;target&amp;#34;: &amp;#34;ingester&amp;#34;,
          &amp;#34;version&amp;#34;: &amp;#34;v1.4.1&amp;#34;
        },
        &amp;#34;value&amp;#34;: [
          1622833381.751,
          &amp;#34;1&amp;#34;
        ]
      },
      {
        &amp;#34;metric&amp;#34;: {
          &amp;#34;__name__&amp;#34;: &amp;#34;grafana_metrics_enterprise_build_info&amp;#34;,
          &amp;#34;branch&amp;#34;: &amp;#34;gem-release-1.4&amp;#34;,
          &amp;#34;goversion&amp;#34;: &amp;#34;go1.16.3&amp;#34;,
          &amp;#34;instance&amp;#34;: &amp;#34;distributor-01:80&amp;#34;,
          &amp;#34;revision&amp;#34;: &amp;#34;ccd12b7a&amp;#34;,
          &amp;#34;target&amp;#34;: &amp;#34;distributor&amp;#34;,
          &amp;#34;version&amp;#34;: &amp;#34;v1.4.1&amp;#34;
        },
        &amp;#34;value&amp;#34;: [
          1622833381.751,
          &amp;#34;1&amp;#34;
        ]
      },

      &amp;lt;...snip...&amp;gt;
    ]
  }
}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;As you can see, querying self-monitoring metrics with GEM is the same process as querying any other type of metrics.&lt;/p&gt;
&lt;h2 id=&#34;implementation&#34;&gt;Implementation&lt;/h2&gt;
&lt;p&gt;Though you don&amp;rsquo;t &lt;em&gt;need&lt;/em&gt; to be familiar with how self-monitoring works at a technical level, it&amp;rsquo;s detailed below in the
hopes that it&amp;rsquo;s useful.&lt;/p&gt;
&lt;h3 id=&#34;gathering&#34;&gt;Gathering&lt;/h3&gt;
&lt;p&gt;Self-monitoring metrics are gathered internally the same way metrics exposed via the &lt;code&gt;/metrics&lt;/code&gt;
endpoint are: they are registered with a
Prometheus &lt;a href=&#34;https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#Registerer&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Registerer&lt;/a&gt;
on application start up. The metrics are updated during the normal course of running the application and periodically (
every 15 seconds by default) flushed directly to a distributor. Any metric available from the &lt;code&gt;/metrics&lt;/code&gt; endpoint of a
GEM component will also be available in the self-monitoring system.&lt;/p&gt;
&lt;p&gt;The metrics are written to the distributor over its gRPC interface. This allows the self-monitoring system control over
the exact tenant the metrics are stored under. This enables it to cleanly separate system metrics (under
the &lt;code&gt;__system__&lt;/code&gt; tenant) from user data.&lt;/p&gt;
&lt;h3 id=&#34;injected-labels&#34;&gt;Injected labels&lt;/h3&gt;
&lt;p&gt;Normally, when metrics are scraped by Prometheus, labels
are &lt;a href=&#34;https://prometheus.io/docs/concepts/jobs_instances/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;automatically added by Prometheus&lt;/a&gt;
that identify where the metrics came from. Since self-monitoring metrics are not scraped by any external system, labels
are automatically added internally to help identify which component the metrics came from.&lt;/p&gt;
&lt;p&gt;The following labels are added to metrics emitted by the self-monitoring system.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;instance&lt;/code&gt;: this label is made up of the node or host name a component is running on in combination with the HTTP port
used. For example, a value for this label in a GEM cluster running on Kubernetes might be &lt;code&gt;ingester-1:80&lt;/code&gt;
or &lt;code&gt;querier-5bf6ddccd7-hzbtn:80&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;target&lt;/code&gt;: this label is made up of a comma-separated list of the targets a GEM process is running as
(&lt;code&gt;ingester&lt;/code&gt;, &lt;code&gt;querier&lt;/code&gt;, etc.) or &lt;code&gt;all&lt;/code&gt; in single binary mode.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
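&lt;p&gt;The injected labels can be used in queries like any other label. As a sketch, the hypothetical helper below runs an instant query against the &lt;code&gt;__system__&lt;/code&gt; tenant. It assumes &lt;code&gt;API_TOKEN&lt;/code&gt; holds a token for the &lt;code&gt;__system__&lt;/code&gt; access policy and that GEM is reachable at &lt;code&gt;gem.example.com&lt;/code&gt;, as in the querying examples above. The &lt;code&gt;CURL&lt;/code&gt; override is an assumption added for this example.&lt;/p&gt;

```shell
# Run an instant PromQL query against the __system__ tenant.
# API_TOKEN and the gem.example.com host follow the querying examples above.
# CURL can be overridden to use a different client binary or wrapper.
system_query() {
  "${CURL:-curl}" -s -u "__system__:${API_TOKEN}" -G \
    --data-urlencode "query=$1" \
    "https://gem.example.com/prometheus/api/v1/query"
}

# Usage: count the reporting processes per target (ingester, querier, ...).
#   system_query 'count by (target) (grafana_metrics_enterprise_build_info)' | jq
```

&lt;p&gt;Passing the query with &lt;code&gt;--data-urlencode&lt;/code&gt; avoids having to escape parentheses and spaces in the URL by hand.&lt;/p&gt;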
&lt;h3 id=&#34;system-tenant-and-access-policy&#34;&gt;System tenant and access policy&lt;/h3&gt;
&lt;p&gt;In order to cleanly separate self-monitoring data from user data, GEM comes with a built-in &lt;code&gt;__system__&lt;/code&gt; tenant
and &lt;code&gt;__system__&lt;/code&gt; access policy. All self-monitoring data is written to the &lt;code&gt;__system__&lt;/code&gt; tenant. The self-monitoring
data may be queried using tokens associated with the &lt;code&gt;__system__&lt;/code&gt; access policy. Because these are built into GEM itself,
they cannot be removed. However, writing self-monitoring metrics to the system tenant can be turned off using the
flag &lt;code&gt;-instrumentation.enabled=false&lt;/code&gt; or the associated configuration setting.&lt;/p&gt;
&lt;h3 id=&#34;recording-rules&#34;&gt;Recording rules&lt;/h3&gt;
&lt;p&gt;In order to use self-monitoring metrics to power associated self-monitoring dashboards, the GEM ruler also includes
built-in recording rules. These recording rules perform aggregations of self-monitoring metrics the same way the ruler
aggregates other metrics. Because these recording rules are built into GEM itself, they cannot be removed. However,
they can be turned off using the same flag that enables or disables self-monitoring, &lt;code&gt;-instrumentation.enabled=false&lt;/code&gt;, or
the associated configuration setting.&lt;/p&gt;
&lt;h3 id=&#34;overhead&#34;&gt;Overhead&lt;/h3&gt;
&lt;p&gt;Self-monitoring metrics are stored in GEM itself. Like any other metrics, they consume space in object storage. When
enabled in microservices mode, each GEM component (ingester, querier, etc.) emits &lt;strong&gt;approximately 2000 series per
component&lt;/strong&gt;. GEM then duplicates these series in the ingesters according to the replication factor.&lt;/p&gt;
&lt;p&gt;To understand how many series will be written under the &lt;code&gt;__system__&lt;/code&gt; tenant as part of self-monitoring, you can use
the following formula:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;2000 * $NUMBER_OF_GEM_PROCESSES * $REPLICATION_FACTOR&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Because these series are written to GEM in the same way as other series, the compactor deduplicates them in
object storage, which reduces the space required. To understand how many series will end up in object storage via
the &lt;code&gt;__system__&lt;/code&gt; tenant, you can use the following formula:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;2000 * $NUMBER_OF_GEM_PROCESSES&lt;/code&gt;&lt;/p&gt;
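&lt;p&gt;As a rough worked example of the two formulas above, assume a hypothetical deployment of 20 GEM processes with a
replication factor of 3. These values are illustrative only, and the 2000-series figure is itself an approximation:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;2000 * 20 * 3 = 120000&lt;/code&gt; series written under the &lt;code&gt;__system__&lt;/code&gt; tenant&lt;/p&gt;
&lt;p&gt;&lt;code&gt;2000 * 20 = 40000&lt;/code&gt; series in object storage after compaction&lt;/p&gt;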
]]></content><description>&lt;h1 id="self-monitoring">Self monitoring&lt;/h1>
&lt;p>&lt;strong>NOTE&lt;/strong> Self-monitoring is an experimental feature. As such, the configuration settings, command line flags, or
specifics of the implementation are subject to change.&lt;/p></description></item><item><title>Exemplars</title><link>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/exemplars/</link><pubDate>Wed, 11 Mar 2026 13:04:24 +0000</pubDate><guid>https://grafana.com/docs/enterprise-metrics/v2.6.x/operations/exemplars/</guid><content><![CDATA[&lt;h1 id=&#34;about-exemplars-in-gem&#34;&gt;About exemplars in GEM&lt;/h1&gt;
&lt;p&gt;An exemplar is a specific trace representative of a repeated pattern of data in a given time interval. It helps you identify higher cardinality metadata from specific events within time series data. To learn more about exemplars and how they can help you isolate and troubleshoot problems with your systems, see &lt;a href=&#34;/docs/grafana/latest/basics/exemplars/&#34;&gt;Introduction to exemplars&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Grafana Enterprise Metrics can store exemplars in memory. Exemplar storage in GEM is implemented similarly to exemplar storage in Prometheus: exemplars for all series are kept in memory in a fixed-size circular buffer.&lt;/p&gt;
&lt;p&gt;Use the &lt;a href=&#34;../../config/reference/#limits_config&#34;&gt;limits_config&lt;/a&gt; property to control the size of the circular buffer, expressed as a number of exemplars. For reference, an exemplar with just a &lt;code&gt;traceID=&amp;lt;jaeger-trace-id&amp;gt;&lt;/code&gt; label uses roughly 100 bytes of memory in the in-memory exemplar storage. If exemplar storage is enabled, GEM also appends the exemplars to the write-ahead log (WAL) for local persistence, for the duration of the WAL.&lt;/p&gt;
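&lt;p&gt;As a rough sizing sketch based on the 100-byte figure above, assume a hypothetical buffer of 100000 exemplars that each carry only a trace ID. The buffer size here is an illustrative value, not a default or a recommendation:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;100000 * 100 bytes = 10000000 bytes&lt;/code&gt; (about 10 MB)&lt;/p&gt;
&lt;p&gt;Exemplars that carry more or longer labels use more memory than this estimate.&lt;/p&gt;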
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;prereq_exemplars/&#34;&gt;Before you begin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;enable_exemplars/&#34;&gt;Enable exemplars in GEM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;view_exemplars/&#34;&gt;View exemplar data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="about-exemplars-in-gem">About exemplars in GEM&lt;/h1>
&lt;p>An exemplar is a specific trace representative of a repeated pattern of data in a given time interval. It helps you identify higher cardinality metadata from specific events within time series data. To learn more about exemplars and how they can help you isolate and troubleshoot problems with your systems, see &lt;a href="/docs/grafana/latest/basics/exemplars/">Introduction to exemplars&lt;/a>.&lt;/p></description></item></channel></rss>