Important: This documentation describes an older version and applies only to the release noted. Many of the features and functions have since been updated or replaced. Please view the current version.

Enterprise

Cluster query federation

NOTE: Cluster query federation is an experimental feature. As such, the configuration settings, command line flags, or specifics of the implementation are subject to change.

Overview

Since version 1.4, Grafana Enterprise Metrics (GEM) includes the optional component federation-frontend.

The goal of this component is to let you aggregate data from multiple GEM clusters in a single PromQL query. It queries the underlying target clusters using the Prometheus remote_read API and Labels API.

The component itself does not require any other components of Grafana Mimir, so you can run it on its own. A common use case is aggregating data from two GEM clusters that run in different regions:

Cluster federation architecture

Configuration

A minimal configuration of the federation-frontend has to disable authentication, because the federation frontend forwards the Basic authentication credentials and Bearer tokens supplied by its clients to the underlying target clusters. In addition, to start the federation frontend, set the target to federation-frontend.

You need to configure a list of target clusters within the federation.proxy_targets block; currently, there are no equivalent CLI flags available. Each entry requires a name, which is an identifier exposed through the __cluster__ label in query results, and a url that points to a Prometheus-compatible API. For GEM, use the URL http://<gem-host>/prometheus.
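The shape of a proxy_targets entry can be illustrated with a small pre-flight check. This is a minimal sketch, not GEM's own validation: the function name validate_proxy_targets is hypothetical, the entries are plain Python dictionaries rather than parsed YAML, and the rule that url must be an absolute http(s) URL is an assumption made for illustration.

```python
from urllib.parse import urlparse


def validate_proxy_targets(targets):
    """Return a list of problems; an empty list means every entry looks well-formed.

    Hypothetical helper: it only checks that each entry has a name and an
    absolute http(s) url, which is an assumption, not GEM's actual validation.
    """
    problems = []
    for i, target in enumerate(targets):
        name = target.get("name")
        url = target.get("url")
        if not name:
            problems.append(f"entry {i}: missing name")
        if not url:
            problems.append(f"entry {i}: missing url")
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"entry {i}: url {url!r} is not an absolute http(s) URL")
    return problems


targets = [
    {"name": "us-west", "url": "http://gem-us-west/prometheus"},
    {"name": "us-east", "url": "gem-us-east/prometheus"},  # scheme left out on purpose
]
for problem in validate_proxy_targets(targets):
    print(problem)
```

Running the sketch reports only the second entry, because its url lacks a scheme.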

Optionally, you can configure each proxy_target to have Basic auth credentials, which override the user-supplied ones.

Warning: When you configure Basic auth in a proxy_target, those credentials take precedence over the client-supplied ones. Unless you take other preventive action, any client that can reach the federation frontend can run queries with those credentials.

In the following example, two clusters in two different regions are queried via the federation frontend:

yaml
multitenancy_enabled: false # The federation frontend does not do any authentication itself
target: federation-frontend # Run the federation frontend only

federation:
  proxy_targets:
    - name: us-west
      url: http://gem-us-west/prometheus
    - name: us-east
      url: http://gem-us-east/prometheus
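Because the federation frontend forwards client credentials to the target clusters, a client sends them exactly as it would to a single cluster. The following sketch builds such a request with Python's standard library without sending it. It assumes that the federation frontend is reachable at http://federation-frontend and serves the Prometheus instant query API under /prometheus/api/v1/query; the function name build_instant_query and the placeholder credentials are hypothetical.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_instant_query(base_url, promql, username=None, password=None, bearer_token=None):
    """Build (but do not send) an instant query request for the federation frontend.

    Assumption: the frontend serves the Prometheus HTTP API under /prometheus.
    """
    params = urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/prometheus/api/v1/query?{params}"
    request = Request(url)
    if bearer_token is not None:
        request.add_header("Authorization", f"Bearer {bearer_token}")
    elif username is not None:
        raw = f"{username}:{password or ''}".encode("utf-8")
        request.add_header("Authorization", "Basic " + base64.b64encode(raw).decode("ascii"))
    return request


request = build_instant_query(
    "http://federation-frontend",  # assumed host name
    "count(up) by (__cluster__)",  # one result per configured target cluster
    username="<tenant-id>",
    password="<token>",
)
print(request.full_url)
```

To send the request, pass it to urllib.request.urlopen. Grouping by __cluster__ in the query returns one series per entry in proxy_targets.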

Aggregate metrics from a local GEM cluster and Grafana Cloud Metrics stack

The federation frontend allows you to get an aggregated view of metrics stored in a local GEM cluster and a hosted Grafana Cloud Metrics stack. With the following configuration, you can query both of the clusters as though they were one:

yaml
federation:
  proxy_targets:
    - name: own-data-center
      url: http://gem/prometheus
    - name: grafana-cloud
      url: https://prometheus-us-central1.grafana.net/api/prom
      basic_auth:
        username: <tenant-id>
        password: <token>

Warning: This gives any client that can reach the federation frontend access to your metrics data in Grafana Cloud Metrics without further authentication.

By using the authentication credentials of the local GEM cluster, you can execute a query against both clusters. To do so, set the access policy’s token as a variable for subsequent commands:

$ export API_TOKEN="the long token string you copied"
$ curl -s -u "<tenant-id>:$API_TOKEN" -G --data-urlencode "query=count(up) by (__cluster__)" http://federation-frontend/prometheus/api/v1/query | jq
json
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__cluster__": "own-data-center"
        },
        "value": [1623344524.382, "10"]
      },
      {
        "metric": {
          "__cluster__": "grafana-cloud"
        },
        "value": [1623344524.382, "4"]
      }
    ]
  }
}
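A response like the one above can be reduced to one value per cluster by reading the __cluster__ label of each sample. The following is a minimal sketch using only the standard library; the function name counts_by_cluster is hypothetical, and the sample body repeats the response shown above.

```python
import json


def counts_by_cluster(response_body):
    """Map each __cluster__ label value to the sample value of an instant-vector response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']!r}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return {
        sample["metric"].get("__cluster__", ""): float(sample["value"][1])
        for sample in data["result"]
    }


sample_body = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__cluster__": "own-data-center"}, "value": [1623344524.382, "10"]},
      {"metric": {"__cluster__": "grafana-cloud"}, "value": [1623344524.382, "4"]}
    ]
  }
}
"""
print(counts_by_cluster(sample_body))
```

Prometheus-compatible APIs return sample values as strings, which is why the sketch converts them with float.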

Limitations of cluster query federation

This experimental feature comes with some limitations:

  • No result caching in the federation frontend
  • No support for alerting/ruler on a federation level
  • No support for metric metadata endpoint
  • No support for exemplars

If one of these limitations blocks your use case, reach out through our support channels with a feature request.