Grafana Cloud

Create policies

Create, view, and edit policies to define the criteria that evaluate traces and decide which to keep (sample) and which to discard (drop).

Policies work in an or relationship by default. This means that if a trace matches the criteria of any of the defined policies, it is sampled.

Policies are evaluated in no particular order. If any top-level policy evaluates a trace as sample, the trace is kept. Note that as soon as you submit a policy, only traces that at least one policy evaluates as sample are kept; all other traces are dropped.

When you started with Adaptive Traces, you already created a probabilistic policy that helps reduce the overall volume of data.

In this section, we walk through how to build on that with some more examples.

  1. Create a status code policy. Status code policies target error detection and debugging.

  2. Create an and policy that combines sub-policies, such as string attribute, probabilistic, and latency sub-policies. And policies help you create very specific rules that combine multiple criteria to capture only the most relevant traces.

Note

If you are creating a string_attribute policy type and using regular expressions (regex) in string attributes, you must explicitly enable regex matching in the configuration.
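
As a sketch of what that might look like, the following string_attribute policy matches any service whose name starts with payment-. The enabled_regex_matching field is an assumption borrowed from the OpenTelemetry tail sampling processor, whose policy format the examples on this page resemble. The policy name and the pattern are hypothetical, so confirm the exact field name in the policy editor before relying on it.

```json
{
  "name": "payment-services-regex",
  "type": "string_attribute",
  "string_attribute": {
    "key": "service.name",
    "values": [
      "payment-.*"
    ],
    "enabled_regex_matching": true
  }
}
```

Without the regex setting, the values are presumably compared as literal strings, so a pattern such as payment-.* would only match a service with exactly that name.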

Create a status code policy

Status code policies sample traces based on their status code (Ok, Error, or Unset). They are useful for capturing traces that contain errors.

In this example, you create a status code policy to sample traces containing errors.

  1. Navigate to Adaptive Telemetry > Adaptive Traces.

  2. Click Create New Policy.

  3. Enter a policy name.

  4. Choose a policy type; in this case status code.

    In the Body, make sure the status code is set to error.

  5. Toggle Auto-expire to set an expiry time for the policy.

  6. Click Submit.

    You can immediately see the impact of Adaptive Traces on the volume of stored traces.
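
For reference, the Body of such a policy might look like the following sketch. It mirrors the status_code sub-policy used in the example scenario later on this page; the policy name is hypothetical.

```json
{
  "name": "sample-errors",
  "type": "status_code",
  "status_code": {
    "status_codes": [
      "ERROR"
    ]
  }
}
```

Because top-level policies are combined with or, a policy like this keeps error traces in addition to whatever your probabilistic policy already samples.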

Create an and policy

And policies allow you to get more specific by combining multiple policies using a logical AND operation.

For example, you could apply a different probabilistic sampling rate to traces containing spans with a particular service name.

  1. Navigate to Adaptive Traces.

  2. Click Create New Policy.

  3. Enter a policy name.

  4. Choose a policy type; in this case and.

    In the Body, update the name, key, and values for the string attribute sub-policy, and the name and sampling percentage for the probabilistic sub-policy.

    {
      "name": "test-policy-1",
      "type": "and",
      "and": {
        "and_sub_policy": [
          {
            "name": "important-service-policy",
            "type": "string_attribute",
            "string_attribute": {
              "key": "service.name",
              "values": [
                "service-name"
              ]
            }
          },
          {
            "name": "twenty-percent",
            "type": "probabilistic",
            "probabilistic": {
              "sampling_percentage": 20
            }
          }
        ]
      }
    }

  5. Toggle Auto-expire to set an expiry time for the policy.

  6. Click Submit.

    You can immediately see the impact of Adaptive Traces on the volume of stored traces.

    For more information, refer to the section on Monitor Adaptive Traces.

Set policy expiration

When you create or edit a policy, you can set it to automatically expire after a specified date and time. This is useful for temporary policies, such as capturing extra traces during a deployment or incident investigation, without leaving the policy active indefinitely.

To set a policy to auto-expire, toggle Auto-expire when creating or editing a policy, then choose an expiration date and time.

When a policy expires:

  • The policy is automatically removed and stops evaluating traces.
  • Traces that were previously sampled by the expired policy are no longer affected going forward.
  • The expiration is recorded in the policy history as “Policy expired”.
  • Expiration times are stored in UTC.

Note

If you need the policy again after it expires, you must create a new one. Expired policies cannot be reactivated.

View policy history

You can view an audit trail of changes made to any policy. The policy history tracks when a policy was created, edited, expired, or deleted, and who made each change.

To view the history of a policy, complete the following steps.

  1. Navigate to Adaptive Traces > Policies.
  2. Click a policy name to open the policy drawer.
  3. Select the Policy history tab.

The history displays a chronological list of events, including:

  • Policy created: the policy was first created manually.
  • Applied recommendation by [user]: a user applied a recommendation that created or updated the policy.
  • Auto-applied recommendation: the system automatically applied a recommendation (for example, anomaly detection).
  • Edited by [user]: a user modified the policy.
  • Policy expired: the policy reached its auto-expire date.
  • Deleted by [user]: a user manually deleted the policy.

Example scenario

The following scenario highlights how Adaptive Traces can be used to ingest only the traces that matter.

Imagine you run a popular online gaming platform, where users can purchase subscriptions. These transactions are high-value and critical to your business. You need to ensure a smooth and reliable experience for paying users, while also maintaining overall platform stability.

Challenges

You have a large and active user base, generating a huge volume of trace data.

Payment transactions are critical: any error can lead to revenue loss and customer dissatisfaction. Your platform comprises various microservices, such as payment processing, and identifying performance issues across them is essential to a smooth gaming experience.

How Adaptive Traces can help

  1. Create a probabilistic policy to capture a representative sample of traces from your services while maintaining a manageable data volume.

    JSON
    {
      "name": "probabilistic-game-servers",
      "type": "probabilistic",
      "probabilistic": {
        "sampling_percentage": 1
      }
    }
  2. Configure an and policy that combines a string attribute policy with a status code policy to capture all traces with error status codes from the payment service. This ensures immediate visibility into any transaction failures.

    JSON
    {
      "name": "<policy-name>",
      "type": "and",
      "and": {
        "and_sub_policy": [
          {
            "name": "payment-service",
            "type": "string_attribute",
            "string_attribute": {
              "key": "service.name",
              "values": [
                "payment-service"
              ]
            }
          },
          {
            "name": "error",
            "type": "status_code",
            "status_code": {
              "status_codes": [
                "ERROR"
              ]
            }
          }
        ]
      }
    }
  3. Create an and policy that combines a latency policy with a string attribute policy to capture high-latency traces, for example, traces exceeding 500 ms, from your payment service.

    JSON
    {
      "name": "<policy-name>",
      "type": "and",
      "and": {
        "and_sub_policy": [
          {
            "name": "high-latency-payment",
            "type": "latency",
            "latency": {
              "threshold_ms": 500
            }
          },
          {
            "name": "payment-service-string-attribute",
            "type": "string_attribute",
            "string_attribute": {
              "key": "service.name",
              "values": [
                "payment-service"
              ]
            }
          }
        ]
      }
    }

For another practical example, refer to the OpenTelemetry documentation.