Segment rules in Adaptive Metrics

Note

Rule segmentation for Adaptive Metrics is currently in private preview. Grafana Labs offers support on a best-effort basis, and breaking changes might occur prior to the feature being made generally available.

You can use rule segmentation to extend the capabilities of Adaptive Metrics. With rule segmentation, the service provides recommendations on a per-team, rather than a per-metric, basis. This allows each team to take ownership of and optimize their metrics separately.

How rule segmentation works

The following overview describes rule segmentation in Adaptive Metrics.

Use of Kubernetes namespaces

Grafana Cloud deploys all workloads on Kubernetes, and there is a one-to-many relationship between teams and Kubernetes namespaces. This means that each namespace belongs to only one team, whereas each team can own multiple namespaces.

When scraping metrics from a workload, the scraper assigns the namespace label to each scraped metric. Each metric belongs to a specific team based on its namespace label.

In rule segmentation, you define one segment for each team. This allows each team to define the aggregation rules for their metrics in a single rule set, without affecting other teams. Each segment definition has a regular expression that matches either a list of namespaces or a namespace naming pattern that the team uses.

The following example segment is from Mimir, which hosts Grafana Cloud metrics.

json
{
  "id": "<unique ID>",
  "name": "mimir",
  "selector": "{namespace=~\"(adaptive-metrics|mimir-cluster-1|mimir-cluster-2)\"}"
}

Use of Git-ops

To enable history retrieval and auditing, Adaptive Metrics stores its configurations in Git. To configure aggregation rules outside of Git, Adaptive Metrics uses the grafana-adaptive-metrics Terraform provider. Refer to the Terraform Provider repository for more information.

The Terraform provider comes with a resource type to define Adaptive Metrics segments called grafana-adaptive-metrics_segment. Because each team has a corresponding segment, segments are split into separate files with the following file structure.

  • aggregation_rules.tf
  • rules-mimir.json
  • rules-loki.json
  • rules.json
  • segments.json

The segments.json file contains the segment definitions. For example:

json
[
  {
    "fallback_to_default": false,
    "name": "mimir",
    "selector": "{namespace=~\"(adaptive-metrics|mimir-cluster-1|mimir-cluster-2)\"}"
  },
  {
    "fallback_to_default": false,
    "name": "loki",
    "selector": "{namespace=~\"(loki-cluster-1|loki-cluster-2)\"}"
  }
]
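
To illustrate how these definitions partition incoming series, the following minimal sketch maps a namespace label value to the segment that owns it. The function name segment_for_namespace is hypothetical, the namespace lists are copied from the segments.json example above, and the hard-coded case statement only approximates the idea; it is not how the service evaluates selectors.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the segment that owns a namespace.
# The namespaces mirror the segments.json example; this is an illustration,
# not the service's matching code.
segment_for_namespace() {
  case "$1" in
    adaptive-metrics|mimir-cluster-1|mimir-cluster-2) echo "mimir" ;;
    loki-cluster-1|loki-cluster-2) echo "loki" ;;
    *) echo "default" ;;  # no segment matches: the default rules apply
  esac
}

segment_for_namespace "mimir-cluster-2"   # prints: mimir
segment_for_namespace "kube-system"       # prints: default
```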

The rules.json file contains the default fallback rules for Adaptive Metrics if a series doesn’t match any defined segments. This file uses the same format as the per-team rules files. For example:

json
[
  {
    "metric": "node_network_protocol_type",
    "drop_labels": ["agent_hostname", "instance"],
    "aggregations": ["sum:counter"]
  }
]

The rules-*.json files contain one rule set for each team, which is only applied to that team’s metrics. For example:

json
[
  {
    "metric": "activity_tracker_free_slots",
    "drop_labels": ["instance", "pod"],
    "aggregations": ["sum:counter"]
  }
]

The Terraform configuration in the aggregation_rules.tf file imports these JSON files and builds a segmented rule set based on their content. For example:

hcl
locals {
  # Default rules, applied when a series doesn't match any segment.
  rules_json = jsondecode(file("${path.module}/rules.json"))

  # Per-team rule files, for example rules-mimir.json.
  all_rules_files = fileset(path.module, "rules-*.json")

  # Segment definitions, one per team.
  segments = jsondecode(file("${path.module}/segments.json"))

  # Map each segment name, taken from the file name, to its decoded rules.
  segmented_rules_json = {
    for file in local.all_rules_files :
    regex("rules-(.+)\\.json", file)[0] => jsondecode(file("${path.module}/${file}"))
  }
}

resource "grafana-adaptive-metrics_ruleset" "all_rules" {
  rules = local.rules_json
}

resource "grafana-adaptive-metrics_segment" "all_segments" {
  for_each = { for segment in local.segments : segment.name => segment }

  name                = each.value.name
  selector            = each.value.selector
  fallback_to_default = each.value.fallback_to_default
}

resource "grafana-adaptive-metrics_ruleset" "segmented_rules" {
  for_each = local.segmented_rules_json

  segment = grafana-adaptive-metrics_segment.all_segments[each.key].id
  rules   = each.value
}

Get started with rule segmentation

If you have already defined a set of aggregation rules and recommendation exemptions that apply to all teams in your organization, follow these steps.

  1. Identify which exemptions correspond to each team so that you can later assign these exemptions to the corresponding segment. To simplify this process, consider copying the set of exemption rules defined in the default segment to all teams and allowing each team to independently remove or edit any rules that don’t apply.
  2. After you identify the label values for each team, use the Segment API to define each segment. To apply the rules from the default segment if no matching rule applies to an incoming series, set the fallback_to_default setting to true.
  3. After creating each segment, define the corresponding set of exemptions for the recommendations service.
  4. After the recommendations service runs, apply the generated recommendations one by one, and then set the fallback_to_default setting to false to ignore the default segment.

As a best practice, migrate segments with a smaller total volume of series first, and then gradually proceed with the teams that have a larger volume of series. Additionally, to avoid errors, consider automating this process as much as possible.
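
Steps 2 and 4 can be scripted along the following lines. This is a minimal sketch, not an official tool: segment_payload is a hypothetical helper, the loki segment is borrowed from the earlier example, and $TENANT, $KEY, and $URL are assumed to hold your credentials and API base URL, as in the curl examples in this document. The curl calls are left as comments so that the snippet runs without contacting the API.

```shell
#!/bin/sh
# segment_payload NAME SELECTOR FALLBACK
# Hypothetical helper: prints the JSON body for a segment.
# FALLBACK must be the literal true or false.
segment_payload() {
  # Escape backslashes and double quotes so the selector is a valid JSON string.
  escaped=$(printf '%s' "$2" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"name": "%s", "selector": "%s", "fallback_to_default": %s}\n' \
    "$1" "$escaped" "$3"
}

selector='{namespace=~"(loki-cluster-1|loki-cluster-2)"}'

# Step 2: create the segment and keep falling back to the default rules.
segment_payload loki "$selector" true
# curl -u "$TENANT:$KEY" -X POST -H "Content-Type: application/json" \
#   -d "$(segment_payload loki "$selector" true)" \
#   "$URL/aggregations/rules/segments"

# Step 4: after applying the recommendations, stop using the default segment.
# Replace <segment ID> with the id returned when the segment was created.
segment_payload loki "$selector" false
# curl -u "$TENANT:$KEY" -X PUT -H "Content-Type: application/json" \
#   -d "$(segment_payload loki "$selector" false)" \
#   "$URL/aggregations/rules/segments?segment=<segment ID>"
```

Generating the payload in one place keeps the name and selector identical between the create and update calls, so only the fallback_to_default value changes during the migration.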

About the Segment API

The segment entity has the following fields:

  • id: The unique segment identifier generated during creation. Use this identifier in API calls to reference a segment.
  • name: The name of the segment. Use this field to identify each segment without needing to refer to the id or selector fields. The default segment has a value of default.
  • selector: The selector value used to match a time series to a segment. Because this selector is evaluated for every incoming series in the write path, the values allowed for this field are restricted to minimize the performance impact. Consider the following restrictions when defining segments:
    • Only one label matcher is allowed. For example, {team="billing"}.
    • All segments must reference the same label name. Creating a segment with a selector that has a different label name results in an error.
    • Only equality matchers, for example, {team="billing"}, or multi-literal regular expression matchers, for example, {team=~"(alerting|alerting-dev)"}, are allowed.
  • fallback_to_default: Defines whether the default segment rules are applied if no matching rule is defined in this segment for incoming time series. This setting also applies to exemptions. Use this field in a migration scenario to move rules from the default segment to a custom segment.
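
The selector restrictions above can be checked locally before calling the API. The following sketch is only a pre-flight check under stated assumptions: valid_selector and selector_label are hypothetical helper names, the set of characters accepted in label values is a guess, and regular expression values are assumed to be wrapped in parentheses as in the examples in this document. The service performs its own validation, which may differ.

```shell
#!/bin/sh
# valid_selector SELECTOR
# Succeeds if SELECTOR has exactly one matcher that is either an equality
# matcher or a multi-literal regular expression matcher.
valid_selector() {
  lit='[A-Za-z0-9_.:-]+'            # assumed character set for literal values
  name='[A-Za-z_][A-Za-z0-9_]*'     # label name
  printf '%s\n' "$1" | grep -Eq \
    "^\{${name}(=\"${lit}\"|=~\"\(${lit}(\|${lit})*\)\")\}\$"
}

# selector_label SELECTOR
# Prints the label name, so scripts can confirm that every segment uses the
# same label name.
selector_label() {
  printf '%s\n' "$1" | sed -n 's/^{\([A-Za-z_][A-Za-z0-9_]*\)=.*$/\1/p'
}

valid_selector '{team="billing"}' && echo "accepted"
valid_selector '{team="a",env="b"}' || echo "rejected: more than one matcher"
valid_selector '{team=~"bill.*"}' || echo "rejected: not a multi-literal regex"
selector_label '{team=~"(alerting|alerting-dev)"}'   # prints: team
```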

Segment rules using the API

Use the Segment API to manage rule segments.

List rule segments

  • Method: GET
  • Endpoint: /aggregations/rules/segments
  • Scope: adaptive-metrics-segments:read
  • Description: Retrieves a list of all rule segments configured by the user. Returns an array of segments.

Example response:

json
[
  {
    "id": "01J35VCQXJHNF68C3JV3C91T0G",
    "name": "Development",
    "selector": "{env=\"dev\"}",
    "fallback_to_default": true
  }
]

Create rule segments

  • Method: POST
  • Endpoint: /aggregations/rules/segments
  • Scope: adaptive-metrics-segments:write
  • Description: Creates a rule segment. The selector value must be a label selector with no more than one label matcher, and only equality matchers or multi-literal regular expression matchers are accepted. Additionally, the API returns a 400 error if the label name in the new segment’s selector doesn’t match the label name used by the segments that already exist.

Example payload:

json
{
  "name": "Staging",
  "selector": "{env=\"staging\"}",
  "fallback_to_default": false
}

Note

After you create a segment, all of the endpoints for listing and editing rules, including recommendations and exemptions, accept an optional URL query parameter named segment. This parameter specifies the rule segment to operate on. If the parameter is omitted or empty, the API uses the default segment.
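
As a small illustration of the segment parameter, the following sketch builds the rules URL with and without it. The helper name rules_url and the example.com base URL are placeholders, not part of the API; the /aggregations/rules path and the segment ID are taken from the examples in this document.

```shell
#!/bin/sh
# rules_url BASE [SEGMENT_ID]
# Hypothetical helper: prints the rules endpoint for a segment.
rules_url() {
  if [ -n "${2:-}" ]; then
    printf '%s/aggregations/rules?segment=%s\n' "$1" "$2"
  else
    # No segment parameter: the API operates on the default segment.
    printf '%s/aggregations/rules\n' "$1"
  fi
}

rules_url "https://example.com" "01J35VC3XPZD6GBB0QXJ86KS1E"
# prints: https://example.com/aggregations/rules?segment=01J35VC3XPZD6GBB0QXJ86KS1E
rules_url "https://example.com"
# prints: https://example.com/aggregations/rules
```

Quote the resulting URL when you pass it to curl, because the ? character is otherwise subject to shell pattern matching.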

Update rule segments

  • Method: PUT
  • Endpoint: /aggregations/rules/segments?segment=<segment ID>
  • Scope: adaptive-metrics-segments:write
  • Description: Updates a rule segment.

Example payload:

json
{
  "name": "Staging",
  "selector": "{env=\"staging\"}",
  "fallback_to_default": false
}

Delete rule segments

  • Method: DELETE
  • Endpoint: /aggregations/rules/segments?segment=<segment ID>
  • Scope: adaptive-metrics-segments:delete
  • Description: Deletes a rule segment. Specify the segment to delete with the segment query parameter.

Note

To delete a segment, you must first remove all aggregation rules and exemptions associated with that segment. Otherwise, an error occurs.

List segmented aggregation rules

  • Method: GET
  • Endpoint: /aggregations/segmented_rules
  • Description: Lists aggregation rules distributed by segment.

Example response:

json
[
  {
    "segment": {...},
    "rules": [...]
  },
  {
    "segment": {...},
    "rules": [...]
  },
  {
    "segment": {...},
    "rules": [...]
  },
  {
    "segment": {"name": "default"},
    "rules": [...]
  }
]

List segmented recommendations

  • Method: GET
  • Endpoint: /aggregations/segmented_recommendations
  • Description: Lists recommendations distributed by segment.

Example response:

json
[
  {
    "segment": {...},
    "recommendations": [...]
  },
  {
    "segment": {"name": "default"},
    "recommendations": [...]
  }
]

List segmented exemptions

  • Method: GET
  • Endpoint: /v1/recommendations/segmented_exemptions
  • Description: Lists exemption rules distributed by segment.

Example response:

json
[
  {
    "segment": {...},
    "exemptions": [...]
  },
  {
    "segment": {"name": "default"},
    "exemptions": [...]
  }
]

Best practices for segmenting rules

Follow these best practices for segmenting rules in Adaptive Metrics.

Define one segment per team

While it’s possible to have more than one segment for the same team, this approach is not recommended. Those segments would need to replicate the same number of aggregation rules and exemptions, adding an unnecessary layer of complexity by requiring these sets of rules to remain in sync.

If a single team has more than one label value assigned, define a single selector using a multi-literal regular expression. For example, {namespace=~"(team_dev|team_staging|team)"}.
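
A selector of this form can be generated from a list of label values. The sketch below assumes a hypothetical helper named build_selector; it only joins the values with | and does not escape regular expression metacharacters, so it is suitable for plain literal values only.

```shell
#!/bin/sh
# build_selector LABEL VALUE...
# Hypothetical helper: prints a multi-literal regular expression selector.
build_selector() {
  # Require a label name and at least one value.
  [ "$#" -ge 2 ] || return 1
  label=$1
  shift
  values=$(printf '%s|' "$@")   # joins with '|' and leaves one trailing '|'
  printf '{%s=~"(%s)"}\n' "$label" "${values%|}"
}

build_selector namespace team_dev team_staging team
# prints: {namespace=~"(team_dev|team_staging|team)"}
```

Keeping the list of values in one place, for example in version control, makes it easier to regenerate the selector when a team gains a new label value.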

Migrate one segment at a time

Incorrectly defining segments during migration can affect the ratio of aggregated series. As a result, it’s recommended to migrate one segment at a time and to check during the process that the percentage of total series versus aggregated series doesn’t change.

Begin with the segments that have the least potential for negative impact. Then, refine the process before proceeding with the segments that have a higher volume of time series.

It’s also recommended to use an automation tool to minimize risks during the migration.

Regularly update segment definitions

If a new label value is assigned to a team, update its associated segment as soon as possible. Otherwise, the default segment is applied to time series with that label value. It’s also recommended to use an automation tool for this process.

Example

The following example defines a segment for the machine learning team. In this example, migration isn’t required.

You can identify the machine learning team’s time series using the namespace label. There are three possible values associated with the team: ml-dev, ml-staging, and ml.

The following steps show how to define both the segment and its associated rules using the API.

  1. Create the machine-learning segment.

    bash
    curl -u "$TENANT:$KEY" -X POST -H "Content-Type: application/json" -d '{
      "name": "ml",
      "selector": "{namespace=~\"(ml-dev|ml-staging|ml)\"}"
    }' "$URL/aggregations/rules/segments"

    The output of this command shows the value of the new segment, including the unique identifier in the id field.

  2. Apply segment recommendations. After you create the segment, the recommendations service starts generating per-segment recommendations that you can apply. For example, if the segment id for the machine-learning team is 01J35VC3XPZD6GBB0QXJ86KS1E, you can download the recommendations list for that segment with the following command:

    bash
    curl -u "$TENANT:$KEY" -X GET "$URL/aggregations/recommendations?segment=01J35VC3XPZD6GBB0QXJ86KS1E" -o ml.recommendations.json

    After you download the recommended rules, you can apply them by referencing the corresponding segment when you update the rules. For example:

    bash
    curl -u "$TENANT:$KEY" -X POST "$URL/aggregations/rules?segment=01J35VC3XPZD6GBB0QXJ86KS1E" -d @ml.recommendations.json -H "Content-Type: application/json"

  3. Create per-segment exemptions. You can define a set of exemptions on a per-segment basis. For example:

    bash
    curl -u "$TENANT:$KEY" -X POST "$URL/recommendations/exemptions?segment=01J35VC3XPZD6GBB0QXJ86KS1E" -d @ml.exemptions.json -H "Content-Type: application/json"

  4. Update the segment, as required. For example:

    bash
    curl -u "$TENANT:$KEY" -X PUT -H "Content-Type: application/json" -d '{
      "name": "machine-learning",
      "selector": "{namespace=~\"(ml-dev|ml-staging|ml|machine-learning)\"}"
    }' "$URL/aggregations/rules/segments?segment=01J35VC3XPZD6GBB0QXJ86KS1E"