Federate queries across clusters

NOTE: Cross-cluster query federation is an experimental feature. As such, the configuration settings, command line flags, and specifics of the implementation are subject to change.

Overview

Starting with version 1.4, Grafana Enterprise Metrics (GEM) includes the optional federation-frontend component, which combines data from multiple GEM clusters into a single PromQL query. The federation-frontend queries the underlying target clusters using the following methods:

Mimir query sharding adapted for the target clusters. This is also known as cluster sharding.
The Prometheus remote_read API
The Metadata APIs

You can run this component on its own because it doesn’t require any other GEM components. A common use case is combining the data from two GEM clusters that run in different regions.

Configure the federation-frontend

The federation-frontend does not perform any authentication itself, so you must disable multitenancy (multitenancy_enabled: false) when you configure it. It forwards the Basic authentication credentials and Bearer tokens supplied by its clients to the underlying target clusters.

Additionally, to start the federation-frontend, you must set target to federation-frontend.

You must configure the list of target clusters in the federation.proxy_targets block of the YAML configuration file. There are no equivalent CLI flags. Each entry requires a name, which is an identifier exposed through the __cluster__ label in query results, and a url that points to a GEM instance. For GEM clusters that use the default Prometheus HTTP prefix, use the following URL: http://<gem-host>/prometheus.

Optionally, you can configure each proxy_target to have Basic auth credentials, which override the user-supplied ones.

Warning
When you configure Basic auth in a proxy_target, the credentials configured there take precedence over the client-supplied ones. Without other preventive action, any client that can reach the federation-frontend can run queries using those credentials.

In the following example, two clusters in two different regions are queried via the federation-frontend:

multitenancy_enabled: false # The federation-frontend does not do any authentication itself
target: federation-frontend # Run the federation-frontend only

federation:
  proxy_targets:
    - name: us-west
      url: http://gem-us-west/prometheus
    - name: us-east
      url: http://gem-us-east/prometheus

Note
In GEM version 2.16 or higher, cross-cluster query federation supports only target clusters that run GEM version 2.16 or higher. If the federation-frontend is configured with a target cluster that runs an older version of GEM, queries fail with the following error: the federation-frontend only works with GEM proxy targets; the proxy target "xxx" is not a GEM cluster.

Configure cross-cluster sharding

Cross-cluster sharding shards a query across clusters, runs the resulting queries on the individual clusters, and then combines the results on the federation-frontend. Compared to using remote read for all queries, this approach improves the performance of cross-cluster queries by distributing more of the computation and transferring less data over the network. It also takes advantage of all the query acceleration techniques available in the target GEM cluster, including query result caching, splitting, and sharding.

The federation-frontend component enables cross-cluster sharding by default. You can turn it on or off with the federation.cluster_sharding_enabled YAML setting or the -federation.cluster-sharding-enabled CLI flag. For example:

federation:
  cluster_sharding_enabled: true

Not all queries can be evaluated with cross-cluster sharding. If the federation-frontend receives a query that cannot be evaluated with cross-cluster sharding, it automatically falls back to using remote read for that query.

Combine metrics from a local GEM cluster and Grafana Cloud Metrics stack

The federation-frontend allows you to get a combined view of metrics stored in a local GEM cluster and a hosted Grafana Cloud Metrics stack. With the following configuration, you can query both of the clusters as though they were one:

federation:
  proxy_targets:
    - name: own-data-center
      url: http://gem/prometheus
    - name: grafana-cloud
      url: https://prometheus-us-central1.grafana.net/api/prom
      basic_auth:
        username: <tenant-id>
        password: <token>

Warning
This gives any client that can reach the federation-frontend access to your metrics data in Grafana Cloud Metrics without further authentication.

By using the authentication credentials of the local GEM cluster, you can execute a query against both clusters. To do so, set the access policy’s token as a variable for subsequent commands:

$ export API_TOKEN="the long token string you copied"

$ curl -s -u "<tenant-id>:$API_TOKEN" -G --data-urlencode "query=count(up) by (__cluster__)" http://federation-frontend/prometheus/api/v1/query | jq

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__cluster__": "own-data-center"
        },
        "value": [1623344524.382, "10"]
      },
      {
        "metric": {
          "__cluster__": "grafana-cloud"
        },
        "value": [1623344524.382, "4"]
      }
    ]
  }
}

Configure partial results

By default, the federation-frontend returns an error for any query when one or more of the target clusters returns an error or does not respond.

Alternatively, the federation-frontend can compute partial results based on the successful responses from target clusters and ignore the responses from target clusters that return an error or do not respond.

You can enable partial results with the following CLI flags and configuration options:

For range and instant queries: -federation.partial-queries-enabled CLI flag or federation.partial_queries_enabled YAML configuration file option
For metadata queries, which are label names, label values and series queries: -federation.partial-metadata-enabled CLI flag or federation.partial_metadata_enabled YAML configuration file option
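
As a minimal sketch of the YAML route, the following command writes a configuration file that enables partial results for both query types. The file name is hypothetical, and the proxy targets are reused from the two-region example earlier on this page. Only the two partial_* keys are specific to this section.

```shell
# Write a federation-frontend configuration with partial results enabled.
# The file name is an assumption made for this example.
cat > federation-frontend.yaml <<'EOF'
multitenancy_enabled: false
target: federation-frontend

federation:
  partial_queries_enabled: true   # range and instant queries
  partial_metadata_enabled: true  # label names, label values, and series queries
  proxy_targets:
    - name: us-west
      url: http://gem-us-west/prometheus
    - name: us-east
      url: http://gem-us-east/prometheus
EOF
```

With this file in place, a query still succeeds when one of the two regions is unreachable, and the result contains only the data from the region that responded.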

You can also enable or disable partial results for an individual request by setting an HTTP header on the query request:

For range and instant queries: X-Partial-Queries-Enabled: true to enable partial results, X-Partial-Queries-Enabled: false to disable partial results
For metadata queries: X-Partial-Metadata-Enabled: true to enable partial results, X-Partial-Metadata-Enabled: false to disable partial results
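
The headers above can be sketched as two small shell helpers around curl. The helper names, the FEDERATION_URL variable, and its default host are assumptions for illustration. The /api/v1/query path matches the earlier curl example, and /api/v1/labels is the standard Prometheus label names endpoint, assumed here to be served under the same prefix. Authentication is omitted for brevity.

```shell
# Hypothetical base URL; the default matches the host used in the earlier curl example.
FEDERATION_URL="${FEDERATION_URL:-http://federation-frontend/prometheus}"

# Run an instant query and ask for partial results on this request.
# $1 is the PromQL expression.
query_allow_partial() {
  curl -s -G \
    -H "X-Partial-Queries-Enabled: true" \
    --data-urlencode "query=$1" \
    "$FEDERATION_URL/api/v1/query"
}

# List label names and ask the federation-frontend not to return
# partial results on this request.
label_names_strict() {
  curl -s -G \
    -H "X-Partial-Metadata-Enabled: false" \
    "$FEDERATION_URL/api/v1/labels"
}
```

For example, query_allow_partial 'count(up) by (__cluster__)' | jq prints the per-cluster counts for whichever target clusters responded.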

For range and instant queries, a warning is included in the response from the federation-frontend and displayed in Grafana if only a subset of target clusters’ responses are used.

Enabling partial queries has the same effect regardless of whether the query is evaluated with remote read or cross-cluster sharding. When cross-cluster sharding sends multiple requests to each target cluster for a single query, the federation-frontend uses either all of a target cluster's responses or none of them. If any request to a cluster fails, the successful responses from that cluster are discarded too. For example, if the query is sum(foo) + sum(bar), and a cluster returns a successful response for sum(foo) but not for sum(bar), then the successful response for sum(foo) from that cluster is discarded as well.

For metadata queries, no warning is included in the response from the federation-frontend if only a subset of target clusters’ responses are used.

Limitations of cross-cluster query federation

This experimental feature has the following limitations:

No result caching in the federation-frontend
No support for alerting or the ruler at the federation level
No support for endpoints other than range queries, instant queries, and metadata queries. Metadata queries consist of label names queries, label values queries, and series queries.
No support for exemplars

If one of these limitations blocks your use case, reach out through our support channels with a feature request.
