Control metrics costs via Adaptive Metrics
Adaptive Metrics consists of a recommendations service that generates recommended rules for aggregation, and an aggregations service that implements those rules. You can interact with both of these services via an HTTP API, a CLI tool, or both.
You can also use the Adaptive Metrics application plugin, which is available from the Apps menu.
Note: The API is in an early stage of development and subject to change.
Caution: The CLI is deprecated and will be removed in the near future.
Supported metrics formats
While Grafana Cloud accepts metrics data in a variety of formats, Adaptive Metrics is only compatible with a subset of these formats:
Metrics format | Supported? | Notes |
---|---|---|
Prometheus | Yes | Fully supported. However, if you do not send metric metadata, few recommendations will be generated. Metric metadata is sent by default in newer versions of Prometheus and the Grafana Agent, but will not be sent if intentionally disabled or if running an older version where the default is to not send. |
OpenTelemetry | Yes | Recommendations are limited because metadata is not sent. |
Influx Line protocol | Yes | Recommendations are limited because metadata is not sent. |
Datadog | No | |
Graphite | No | |
Check if you are sending metadata for your metrics
To check whether you are sending metrics metadata, send a request to the HTTP API metadata endpoint:
curl -u "$METRICS_INSTANCE_ID:$API_KEY" "https://<cluster>.grafana.net/prometheus/api/v1/metadata"
Note: Adaptive Metrics uses Prometheus metrics metadata stored in your Grafana Hosted Metrics instance to ensure recommendations are safe to apply mathematically. For example, for a counter-type metric, recommendations by Adaptive Metrics ensure that counter resets are considered during aggregation. If metrics metadata is not available for a metric, and Adaptive Metrics is unable to infer a metric’s type from its name or usage patterns, no recommendation will be produced for that metric. If you are using a metrics format other than Prometheus, metrics metadata is not preserved. As a result, there are fewer recommendations for those metrics.
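As a rough way to see which metric types have metadata stored, the response from the metadata endpoint can be summarized by type. The following is a minimal sketch, not an official tool: summarize_metadata_types is a hypothetical helper name, and it assumes the standard Prometheus metadata response shape, in which each entry carries a "type" field. It uses a simple text match rather than a full JSON parser, so treat the counts as approximate.

```shell
# Hypothetical helper: read a Prometheus /api/v1/metadata response on stdin
# and print one "<type> <count>" line per metric type found.
# Assumption: each metadata entry contains a "type" field such as
# "counter" or "gauge", as in the standard Prometheus response format.
summarize_metadata_types() {
  grep -oE '"type"[[:space:]]*:[[:space:]]*"[a-z]+"' \
    | sed -E 's/.*"([a-z]+)"$/\1/' \
    | sort \
    | uniq -c \
    | awk '{print $2, $1}'
}

# Example usage (requires network access and valid credentials):
#   curl -s -u "$METRICS_INSTANCE_ID:$API_KEY" \
#     "https://<cluster>.grafana.net/prometheus/api/v1/metadata" \
#     | summarize_metadata_types
```

If the function prints nothing, the response contained no typed metadata entries, which would be consistent with metadata not being sent.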
CLI workflow
Understand the high-level workflow with the CLI:
- Download recommendations of what metrics to aggregate.
- Use those recommendations to create your own set of aggregation rules.
- Upload that set of aggregation rules.
The CLI also enables you to view, edit, and delete existing aggregation rules that have already been applied.
Use the Adaptive Metrics CLI
Adaptive Metrics provides a CLI tool.
Before you begin
To use the CLI tool, gather the following information:
- URL: In the form https://<your-grafana-cloud-prom-url>.grafana.net/. To find your URL value, go to your grafana.com account and check the Details page of your hosted Prometheus endpoint.
- TENANT: The numeric instance ID where Adaptive Metrics is set up. To find your TENANT value, go to your grafana.com account and check the Details page of your hosted Prometheus endpoint for Username / Instance ID.
- TOKEN: A token from a Grafana Cloud Access Policy. Make sure the access policy has the metrics:read and metrics:write scopes for the stack ID where you have enabled Adaptive Metrics.
Download the Adaptive Metrics CLI:
Go to the URL that is based on the build that corresponds to your platform:
https://dl.grafana.com/files/adaptive-cli/adaptive-cli.linux.amd64
SHA256 Sum:
4c2618c98e23126c3fa0a9fb8e12ee732502d6ca48cd1b39095bad70b952baf7
https://dl.grafana.com/files/adaptive-cli/adaptive-cli.linux.arm64
SHA256 Sum:
c1dbe7d21d8d0b17ee4d6ef9c33da64c7b397d9d4087498aa94217f2c8ded220
https://dl.grafana.com/files/adaptive-cli/adaptive-cli.darwin.amd64
SHA256 Sum:
68a4048bfb28714b145de565a61ab627e053a64f784aa6c146f0552ccd6d36fa
https://dl.grafana.com/files/adaptive-cli/adaptive-cli.darwin.arm64
SHA256 Sum:
ce05641a40dd61bba2600e067aed3265eef5ffb5f9398929ec106ed71f2c31b4
Depending on your operating system, you may have to run chmod +x ./adaptive-cli.<your-distribution> to change the file permissions on the CLI and make it executable.
Launch the CLI using the following command:
./adaptive-cli.<your-distribution> --user $TENANT --url $URL --password $TOKEN
In the previous command, substitute the values of $TENANT, $URL, and $TOKEN. For more information, see Before you begin.
Use the show recommendations command to pull down the most recently generated recommendations from the recommendations service.
For built-in help documentation about the CLI tool, launch the tool in interactive mode (adding the --repl flag) and then type --help.
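Before making the downloaded binary executable, you may want to confirm that it matches the SHA256 sum listed for your platform. The following is a minimal sketch, assuming sha256sum (common on Linux) or shasum (common on macOS) is installed. verify_sha256 is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper: succeed (exit 0) only if FILE has the EXPECTED SHA256 sum.
# Usage: verify_sha256 FILE EXPECTED
verify_sha256() {
  local file="$1"
  local expected="$2"
  local actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    # Fallback for systems that ship shasum instead of sha256sum.
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  [ "$actual" = "$expected" ]
}

# Example usage with the linux.amd64 sum listed above:
#   verify_sha256 ./adaptive-cli.linux.amd64 \
#     4c2618c98e23126c3fa0a9fb8e12ee732502d6ca48cd1b39095bad70b952baf7 \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```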
Example aggregation rule
Each aggregation rule looks similar to this:
{
  "metric": "agent_request_duration_seconds_sum",
  "drop_labels": [
    "container",
    "instance",
    "method",
    "namespace",
    "pod",
    "provider",
    "status_code",
    "ws"
  ],
  "aggregations": [
    "sum:counter"
  ]
}
In the preceding example:
- metric is the name of the metric to be aggregated.
- drop_labels is an array of the labels that will be removed by the aggregations service.
- aggregations is an array of the aggregation types to calculate for this metric.
You can use an aggregation rule file to define multiple rules simultaneously.
The following example rule file is an array of one or more aggregation rules:
[
  {
    "metric": "agent_request_duration_seconds_sum",
    "drop_labels": ["namespace", "pod"],
    "aggregations": ["sum:counter"]
  },
  {
    "metric": "prometheus_request_duration_seconds_sum",
    "drop_labels": ["container", "instance", "ws"],
    "aggregations": ["sum:counter"]
  }
]
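When a rule file grows, a quick summary of which labels each rule drops can make it easier to review. The following is a minimal sketch, assuming jq is installed and that the file is a JSON array of rules in the format shown above. summarize_rules is a hypothetical helper name.

```shell
# Hypothetical helper: print one line per rule in the form
# "<metric>: <comma-separated dropped labels>".
# Assumption: RULES_FILE is a JSON array of rule objects; a rule without
# drop_labels is printed with an empty label list.
# Usage: summarize_rules RULES_FILE
summarize_rules() {
  jq -r '.[] | "\(.metric): \((.drop_labels // []) | join(","))"' "$1"
}

# Example usage:
#   summarize_rules rules.json
```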
Apply aggregation rules
After you add (create aggregations), modify (edit aggregations), or delete (delete aggregations) an aggregation rule, the CLI’s show aggregations command reflects the change. Use this command to get the most current picture of which aggregation rules are active in your environment.
There is a delay between uploading new aggregation rules and those metrics aggregations taking effect in your environment. In most cases, the delay is approximately 5-10 minutes, but we currently have no mechanism to let you know precisely when new aggregations take effect.
To check, query any metric whose aggregation rule you added or changed, and look at the value of the __dropped_labels__ label. Once this value reflects the changes you’ve made, your updated aggregation rules are live in your environment.
We currently limit how often new aggregation rules can be applied. Although you can upload as many new versions of your aggregation rules as you like, those updates are only applied once every 10 minutes. If you make multiple updates in quick succession, the system applies your first received (oldest) update. Then, 10 minutes later, the most recently received update is applied. The intermediate updates never get applied.
Adaptive Metrics API
The Adaptive Metrics CLI is a wrapper around an API. You can use the underlying API directly if you choose. This API is under active development and is subject to change.
List recommendations
Download our recommendations for metrics to aggregate using the following command. URL, TENANT, and TOKEN are the values described in Before you begin.
curl -u "$TENANT:$TOKEN" "$URL/aggregations/recommendations"
TOKEN must belong to an access policy with the metrics:read scope.
You can use an optional verbose flag to retrieve more information about each recommendation:
curl -u "$TENANT:$TOKEN" "$URL/aggregations/recommendations?verbose=true"
List current recommendations configuration
Download the current configuration of the recommendations service using the following command:
curl -u "$TENANT:$TOKEN" "$URL/aggregations/recommendations/config"
TOKEN must belong to an access policy with the metrics:read scope.
The only tunable parameter exposed by the recommendations service is the keep_labels parameter. This parameter allows the user to define a comma-separated list of labels that they never want recommended for aggregation. This can be useful at organizations where certain labels are always expected on metrics, regardless of whether or not those labels have been recently queried.
An example response from the /recommendations/config endpoint would look as follows:
{
  "keep_labels": ["instance", "pod"]
}
The preceding response indicates that the recommendations service has been configured to never recommend aggregating the instance or pod labels.
Update recommendations configuration
Upload new recommendations configuration using the following command:
curl -u "$TENANT:$TOKEN" --request POST --data @config.json "$URL/aggregations/recommendations/config"
TOKEN must belong to an access policy with the metrics:write scope.
This command uses the same endpoint described in List current recommendations configuration and expects the same JSON format.
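The config.json file referenced in the upload command can be written by hand or generated. The following is a minimal sketch of generating it: write_keep_labels_config is a hypothetical helper name, and it assumes label names contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Hypothetical helper: write a recommendations config file containing the
# given labels in the keep_labels array.
# Assumption: label names contain no characters that need JSON escaping.
# Usage: write_keep_labels_config OUTPUT_FILE [LABEL...]
write_keep_labels_config() {
  local out="$1"
  shift
  local list=""
  local label
  for label in "$@"; do
    # Append each label as a quoted JSON string, comma-separated.
    list="${list:+$list, }\"$label\""
  done
  printf '{\n  "keep_labels": [%s]\n}\n' "$list" > "$out"
}

# Example usage, followed by the upload command shown above:
#   write_keep_labels_config config.json instance pod
#   curl -u "$TENANT:$TOKEN" --request POST --data @config.json \
#     "$URL/aggregations/recommendations/config"
```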
List currently applied aggregation rules
Download your existing aggregation rules:
curl -u "$TENANT:$TOKEN" "$URL/aggregations/rules"
TOKEN must belong to an access policy with the metrics:read scope.
Upload new aggregation rules
Uploading new aggregation rules is a multi-step process:
- Fetch the currently applied rules.
- Modify rules locally.
- Upload rules back.
Fetch the currently applied rules
Use this command:
curl -u "$TENANT:$TOKEN" -D headers.txt "$URL/aggregations/rules" > rules.json
TOKEN must belong to an access policy with the metrics:read scope.
The preceding command uses the same endpoint described in List currently applied aggregation rules, but adds a -D headers.txt argument.
The -D headers.txt argument stores the response headers in a file called headers.txt.
This step is required if you want to then upload a new rule file, for example if you want to update the existing aggregation rules you have in place. The information in these headers ensures there are no update collisions. An update collision is the scenario where multiple users try to edit the rules file at the same time and overwrite one another’s changes.
Modify the rules locally
Use your editor of choice to modify the rules.json file downloaded in the prior step.
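If you prefer to script the local edit rather than use an editor, a rule can be added or replaced with jq. The following is a minimal sketch, assuming jq is installed and that rules.json is a JSON array of rules. upsert_rule is a hypothetical helper name. It removes any existing rule for the same metric and appends the new one, printing the result to stdout.

```shell
# Hypothetical helper: add RULE_JSON to the rules array in RULES_FILE,
# replacing any existing rule that has the same "metric" value.
# The updated array is printed to stdout; RULES_FILE is not modified.
# Usage: upsert_rule RULES_FILE 'RULE_JSON'
upsert_rule() {
  jq --argjson rule "$2" \
    'map(select(.metric != $rule.metric)) + [$rule]' "$1"
}

# Example usage (write to a new file, not to the file being read):
#   upsert_rule rules.json \
#     '{"metric":"agent_request_duration_seconds_sum","drop_labels":["pod"],"aggregations":["sum:counter"]}' \
#     > rules.new.json
```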
Upload rules back
The API supports uploading an entire rules file.
Warning: THIS ACTION WILL OVERWRITE YOUR EXISTING RULE FILE. If you prefer to append to your existing rules, you must use the CLI instead.
To upload your modified rules.json file from the previous step, use the following shell script:
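#!/usr/bin/env bash
# Usage: ./rules_upload.sh <rules_file.json>
TMPFILE=$(mktemp)
trap 'rm "$TMPFILE"' EXIT
# Turn the saved ETag response header into an If-Match request header.
cat headers.txt | grep -i '^etag:' | sed 's/^ETag:/If-Match:/i' > "$TMPFILE"
curl --request POST --header @"$TMPFILE" --data-binary @"$1" -u "$TENANT:$TOKEN" "$URL/aggregations/rules"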
TOKEN must belong to an access policy with the metrics:write scope.
The cat headers.txt pipeline reads the headers.txt file created in the previous curl call that pulled down the existing aggregation rules, extracts the ETag header, and writes it to a temporary file as an If-Match header. The headers.txt file itself is not modified.
The curl --request POST command uploads your new rules file along with that If-Match header.
Save the shell script as rules_upload.sh.
To run that script, use the following command:
./rules_upload.sh <your_new_rules_file.json>
Replace <your_new_rules_file.json> with the name of the rules file you wish to upload.
Note: If, upon trying to POST the new rules file, you see the error "the Etag supplied in the 'If-Match' header does not match the Etag of the rules you are trying to replace", the headers you provided are either missing or stale. To fix this, re-fetch the rules file and headers, being careful to look for any changes that may have been introduced since your last edits. For more information on Etag headers, see Etag.
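To see what the upload script sends, it can help to inspect the If-Match header that is derived from headers.txt before uploading. The following is a minimal sketch of that conversion as a standalone function. etag_to_if_match is a hypothetical helper name, and the sample ETag value in the usage comment is made up for illustration.

```shell
# Hypothetical helper: print the If-Match request header derived from the
# ETag response header stored in HEADERS_FILE (as written by curl -D).
# Carriage returns are stripped because HTTP header lines end in CRLF.
# Usage: etag_to_if_match HEADERS_FILE
etag_to_if_match() {
  grep -i '^etag:' "$1" \
    | tr -d '\r' \
    | sed -E 's/^[Ee][Tt][Aa][Gg]:/If-Match:/'
}

# Example usage:
#   etag_to_if_match headers.txt
# If headers.txt contains a line such as   etag: "abc123"
# the function prints                      If-Match: "abc123"
```

If the function prints nothing, headers.txt contains no ETag header, and an upload would fail with the error described in the preceding note.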
Note: After you configure aggregation rules, the active series count might increase temporarily. Aggregated and unaggregated series will be considered active at the same time. After a short period of time, the unaggregated series will no longer be considered active, and you will see a net reduction in active series.
Aggregation service: requirements on sample age
We can only aggregate raw samples that are relatively recent. Grafana Cloud rejects samples for metrics being aggregated that arrive more than 90 seconds late. That is, if the difference between the wall clock time at which a sample arrives at Grafana Cloud and the timestamp on that sample (which indicates when it was collected) is greater than 90 seconds, Grafana Cloud rejects that sample.
If Grafana Cloud rejects samples for this reason, you will see an increase in sample-too-old-for-aggregation or aggregator-sample-too-old errors on the Discarded Metrics Samples panel of your billing dashboard.
This sample age requirement only applies to samples that belong to metrics that are being aggregated.
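The age check described above is a simple comparison of two timestamps. The following sketch illustrates the rule only; it is not how Grafana Cloud implements it. sample_too_old and MAX_SAMPLE_AGE_SECONDS are hypothetical names, and both arguments are assumed to be Unix timestamps in seconds.

```shell
# Maximum allowed delay between collection and arrival, per the text above.
MAX_SAMPLE_AGE_SECONDS=90

# Hypothetical helper: succeed (exit 0) if a sample collected at SAMPLE_TS
# and arriving at ARRIVAL_TS would be too old to aggregate.
# Usage: sample_too_old SAMPLE_TS ARRIVAL_TS   (Unix seconds)
sample_too_old() {
  local sample_ts="$1"
  local arrival_ts="$2"
  [ $(( $arrival_ts - $sample_ts )) -gt "$MAX_SAMPLE_AGE_SECONDS" ]
}

# Example usage: check a sample timestamp against the current time.
#   if sample_too_old "$SAMPLE_TS" "$(date +%s)"; then
#     echo "sample would be rejected"
#   fi
```

A sample that is exactly 90 seconds old passes this check, because the text specifies a delay greater than 90 seconds.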
Why this happens
To compute an aggregation, we must wait for all raw samples associated with that metric to arrive. We don’t know how many samples will arrive, nor can we wait indefinitely on those samples, because the longer we wait, the longer the delay before the data is queryable or visible in dashboards.
If a sample arrives after our configured waiting time, it does not get taken into account during the computation of the aggregated value. Because our metrics database is immutable once the aggregation has been computed, we cannot update the aggregated value to reflect this late arriving data point.
Troubleshooting
If you encounter issues querying a metric that has been aggregated, see Troubleshoot your aggregated metrics query. For any other questions or feedback, contact your Customer Success Manager or file a support request.
Security warning when running the CLI on macOS
If you try to run the CLI on macOS and get a security warning that it can’t be opened because Apple cannot check it, perform the following steps:
- Open System Settings.
- Navigate to Privacy & Security.
- Scroll down to Security.
- Locate the option to run the CLI.