Important: This documentation is about an older version. It is relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.

Enterprise

Grafana Enterprise Metrics (GEM) allows for forwarding metrics evaluated from the Ruler to any Prometheus remote-write compatible backend.

This works by loading rule groups into the Ruler with an extra config field as shown in the example below:

yaml
# A regular Cortex rule group
groups:
  - name: group_one
    interval: 5m
    rules:
      - expr: 'rate(prometheus_remote_storage_samples_in_total[5m])'
        record: 'prometheus_remote_storage_samples_in_total:rate5m'
  - name: group_two
    interval: 1m
    rules:
      - expr: 'rate(prometheus_remote_storage_samples_in_total[1m])'
        record: 'prometheus_remote_storage_samples_in_total:rate1m'
    # Extra field: forward this group's evaluated metrics to a remote-write endpoint
    remote_write:
      - url: 'http://user:pass@example.com/api/v1/push'

In the above example, when group_two is loaded into Grafana Enterprise Metrics, the Ruler module evaluates the expression rate(prometheus_remote_storage_samples_in_total[1m]) every 1m and forwards the generated metric, named prometheus_remote_storage_samples_in_total:rate1m, to example.com. Meanwhile, group_one continues to work as a regular rule group: the evaluated metric prometheus_remote_storage_samples_in_total:rate5m is stored within the same Cortex instance that is running the Ruler.
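As a quick way to see which groups in a rule file will forward their metrics, the following sketch prints the name of every group that carries a group-level remote_write field. It is only an illustration: the function name remote_write_groups is hypothetical, and the awk patterns assume the two-space indentation used in the example above rather than parsing YAML in general.

```shell
#!/bin/sh
# Write the example rule file from this section to a temporary location.
rules_file="$(mktemp)"
cat > "$rules_file" <<'EOF'
groups:
  - name: group_one
    interval: 5m
    rules:
      - expr: 'rate(prometheus_remote_storage_samples_in_total[5m])'
        record: 'prometheus_remote_storage_samples_in_total:rate5m'
  - name: group_two
    interval: 1m
    rules:
      - expr: 'rate(prometheus_remote_storage_samples_in_total[1m])'
        record: 'prometheus_remote_storage_samples_in_total:rate1m'
    remote_write:
      - url: 'http://user:pass@example.com/api/v1/push'
EOF

# Hypothetical helper: print each group name that has a remote_write field.
# Relies on the example's indentation; it is not a general YAML parser.
remote_write_groups() {
  awk '
    /^  - name:/         { group = $3 }
    /^    remote_write:/ { print group }
  ' "$1"
}

remote_write_groups "$rules_file"   # prints: group_two
```

Only group_two is listed, which matches the behavior described above: group_one has no remote_write field, so its metric stays in the local instance.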

Configuration

Rule Storage

Remote write rules are compatible with the following backends:

  • Azure Blob Storage
  • GCS
  • S3
  • Swift

The following backends are not supported:

  • local filesystem
  • ConfigDB

Write-ahead log (WAL)

When a rule group is configured with a remote-write config, GEM buffers the generated metrics in a write-ahead log (WAL) before forwarding them to the remote-write endpoint. This increases reliability in case either the GEM instance or the remote endpoint crashes. If the GEM instance crashes, it reads from the WAL on restart and continues to forward metrics to the configured backend from the last sent timestamp. If the remote endpoint crashes, GEM continues to retry requests until the endpoint is available again.

If multiple rule groups are configured to send to the same remote-write endpoint, the GEM instance uses a common WAL for the metrics generated by those rule groups.

The WAL is truncated at the interval specified by the ruler.remote-write.wal-truncate-frequency setting. WAL entries older than the time specified in the ruler.remote-write.max-wal-time setting are removed. WAL entries younger than the time specified in the ruler.remote-write.min-wal-time setting are not removed.
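The retention rules above can be read as a decision based on the age of each WAL entry. The following is a minimal sketch, not GEM's implementation: the function name wal_entry_action is hypothetical, and the thresholds assume a min-wal-time of 1h and a max-wal-time of 5h. This document does not state what happens to entries whose age falls between the two thresholds, so the sketch reports that case as undetermined rather than guessing.

```shell
#!/bin/sh
# Assumed thresholds, in seconds (1h and 5h); adjust to match your settings.
MIN_WAL_TIME=3600
MAX_WAL_TIME=18000

# Hypothetical helper: classify a WAL entry by its age in seconds.
wal_entry_action() {
  age="$1"
  if [ "$age" -gt "$MAX_WAL_TIME" ]; then
    echo "remove"         # older than max-wal-time: removed
  elif [ "$age" -lt "$MIN_WAL_TIME" ]; then
    echo "keep"           # younger than min-wal-time: not removed
  else
    echo "undetermined"   # between the thresholds: not specified here
  fi
}

wal_entry_action 600     # 10 minutes old, prints: keep
wal_entry_action 7200    # 2 hours old, prints: undetermined
wal_entry_action 21600   # 6 hours old, prints: remove
```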

By default, the WAL is stored in the wal folder in the GEM binary working directory.

console
$ ls
metrics-enterprise-binary   wal/

The WAL directory and its retention settings can be configured as shown:

yaml
ruler:
  remote_write:
    enabled: true
    wal_dir: /tmp/wal
    min_wal_time: 1h
    max_wal_time: 5h
    wal_truncate_frequency: 1h

Example

The following is a complete example of the configuration options described above, using a ruler with sharding enabled and S3 as its rule storage backend:

yaml
ruler:
  external_url: localhost:9090
  rule_path: "/tmp/rules"
  storage:
    type: s3
    s3:
      endpoint: minio:9000
      access_key_id: cortex
      secret_access_key: supersecret
      bucketnames: "gem-ruler"
      insecure: true
      s3forcepathstyle: true
  poll_interval: 10s
  enable_api: true
  enable_sharding: true
  ring:
    kvstore:
      store: memberlist
  remote_write:
    enabled: true
    wal_dir: /tmp/wal
    min_wal_time: 1h
    max_wal_time: 5h
    wal_truncate_frequency: 1h

Loading remote-write groups

As of v0.3.1, cortextool is compatible with Prometheus rule files that contain the remote-write rule group syntax. You can download the latest version of cortextool from the grafana/cortex-tools project.

You can also use the Docker image of cortextool: docker pull grafana/cortex-tools:latest

Example usage

Once you have GEM running with remote-write rule groups enabled, you can load remote-write rule groups using the following procedure.

  1. Save the following file to your workspace:

rules.yaml:

yaml
groups:
  - name: remote_write_group
    interval: 5m
    rules:
      - expr: 'sum(up)'
        record: 'sum_up'
    remote_write:
      - url: 'http://user:pass@example.com/api/v1/push'

  2. Run the following command with cortextool:

console
$ cortextool rules sync \
--rule-files=rules.yaml \
--id=<instance-name> \
--address=<gem-url> \
--key=<valid-gem-write-token>
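
After the sync completes, you may want to confirm that the rule group was stored. The following sketch lists the stored rule group names with cortextool rules list, supplying the connection details through the CORTEX_ADDRESS, CORTEX_TENANT_ID, and CORTEX_API_KEY environment variables instead of flags. The default values and the verify_rules function name are hypothetical placeholders; replace them with your own GEM URL, instance name, and token.

```shell
#!/bin/sh
# Hypothetical placeholder values; override them in your environment.
export CORTEX_ADDRESS="${CORTEX_ADDRESS:-http://localhost:8080}"
export CORTEX_TENANT_ID="${CORTEX_TENANT_ID:-my-instance}"
export CORTEX_API_KEY="${CORTEX_API_KEY:-replace-with-token}"

# Hypothetical helper: list stored rule groups if cortextool is installed.
verify_rules() {
  if ! command -v cortextool >/dev/null 2>&1; then
    echo "cortextool not found in PATH" >&2
    return 1
  fi
  # Retrieve the names of the rule groups currently stored for this instance.
  cortextool rules list
}

verify_rules || echo "listing skipped or failed: check cortextool and credentials"
```

If the sync succeeded, the output should include remote_write_group.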