Create policies

Policies define the criteria used to evaluate traces and decide which to keep (sample) and which to discard (drop). You can create, view, and edit policies.

Policies work in an or relationship by default. This means that if a trace matches the criteria of any of the defined policies, it is sampled.

The policies you create are evaluated in no particular order. If any top-level policy evaluates a trace as sample, the trace is kept. Note that as soon as you submit a policy, only traces that at least one policy evaluates as sample are kept; all other traces are dropped.
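The or relationship between top-level policies can be sketched in a few lines of Python. This is an illustrative model only, not how Adaptive Traces is implemented: the `Trace` dictionary, the `has_error` and `is_slow` helpers, and the `decide` function are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, List

Trace = Dict[str, Any]
Policy = Callable[[Trace], bool]


def has_error(trace: Trace) -> bool:
    # Toy stand-in for a status code policy: match traces marked ERROR.
    return trace.get("status") == "ERROR"


def is_slow(trace: Trace) -> bool:
    # Toy stand-in for a latency policy: match traces slower than 500 ms.
    return trace.get("duration_ms", 0) > 500


def decide(trace: Trace, policies: List[Policy]) -> str:
    # "or" relationship: one matching top-level policy is enough to keep the trace.
    return "sample" if any(policy(trace) for policy in policies) else "drop"


policies = [has_error, is_slow]
print(decide({"status": "ERROR", "duration_ms": 40}, policies))  # sample
print(decide({"status": "OK", "duration_ms": 900}, policies))    # sample
print(decide({"status": "OK", "duration_ms": 40}, policies))     # drop
```

The last call shows the consequence described above: a trace that matches none of the policies is dropped.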

When you started with Adaptive Traces, you already created a probabilistic policy that helps reduce the overall volume of data.

In this section, we walk through how to build on that with some more examples.

  1. Create a status code policy. Status code policies target error detection and debugging.

  2. Create an and policy that combines sub-policies, for example, string attribute, probabilistic, or latency sub-policies. And policies help you create very specific rules that combine multiple criteria to capture only the most relevant traces.

Create a status code policy

Status code policies sample traces based on their status code (Ok, Error, or Unset). They are useful for capturing traces that contain errors.

In this example, you create a status code policy to sample traces containing errors.

  1. Navigate to Administration > Cost Management > Traces Cost Management > Adaptive Traces.

  2. Click Create New Policy.

  3. Enter a policy name.

  4. Choose a policy type; in this case status code.

    In the Body, make sure the status code is set to error.

  5. Toggle Auto-expire to set an expiry time for the policy.

  6. Click Submit.

    You are immediately able to see the impact of Adaptive Traces on the volume of stored traces.
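The Body for this policy is not shown in the steps above. As a sketch, assuming it follows the same shape as the `status_code` sub-policy in the example scenario later on this page, the following Python builds such a body and prints it as JSON. The policy name `sample-errors` and the helper `status_code_policy` are hypothetical, and the uppercase spellings `OK` and `UNSET` are assumed by analogy with the `ERROR` value used in that scenario.

```python
import json
from typing import Any, Dict, List

# Assumed spellings; only "ERROR" appears in this page's examples.
VALID_STATUS_CODES = {"OK", "ERROR", "UNSET"}


def status_code_policy(name: str, status_codes: List[str]) -> Dict[str, Any]:
    # Reject values outside the three status codes named in this section.
    unknown = set(status_codes) - VALID_STATUS_CODES
    if unknown:
        raise ValueError(f"unknown status codes: {sorted(unknown)}")
    return {
        "name": name,
        "type": "status_code",
        "status_code": {"status_codes": status_codes},
    }


body = status_code_policy("sample-errors", ["ERROR"])
print(json.dumps(body, indent=2))
```

Compare the printed JSON with the Body shown in the policy editor before you submit, because the editor's template is the authoritative shape.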

Create an and policy

And policies allow you to get more specific by combining multiple policies using a logical AND operation.

For example, you could apply a different probabilistic sampling rate to traces containing spans with a particular service name.

  1. Navigate to Administration > Cost Management > Traces Cost Management > Adaptive Traces.

  2. Click Create New Policy.

  3. Enter a policy name.

  4. Choose a policy type; in this case and.

    In the Body, update the name, key, and values of the string attribute sub-policy, and the name and sampling percentage of the probabilistic sub-policy.

    {
      "name": "test-policy-1",
      "type": "and",
      "and": {
        "and_sub_policy": [
          {
            "name": "important-service-policy",
            "type": "string_attribute",
            "string_attribute": {
              "key": "service.name",
              "values": [
                "service-name"
              ]
            }
          },
          {
            "name": "twenty-percent",
            "type": "probabilistic",
            "probabilistic": {
              "sampling_percentage": 20
            }
          }
        ]
      }
    }
  5. Toggle Auto-expire to set an expiry time for the policy.

  6. Click Submit.

    You are immediately able to see the impact of Adaptive Traces on the volume of stored traces.

    For more information, refer to the section on Monitor Adaptive Traces.
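To make the AND semantics concrete, the following Python sketch evaluates the `test-policy-1` body from this section against a simplified trace. It is an illustration under stated assumptions, not the Adaptive Traces implementation: a trace is modeled as a flat dictionary of attributes (real traces contain many spans), the `matches` function is hypothetical, and the sampler's random choice is passed in as `draw`, a number in the range 0 to 100, so that the result is reproducible.

```python
from typing import Any, Dict

and_policy: Dict[str, Any] = {
    "name": "test-policy-1",
    "type": "and",
    "and": {
        "and_sub_policy": [
            {
                "name": "important-service-policy",
                "type": "string_attribute",
                "string_attribute": {"key": "service.name", "values": ["service-name"]},
            },
            {
                "name": "twenty-percent",
                "type": "probabilistic",
                "probabilistic": {"sampling_percentage": 20},
            },
        ]
    },
}


def matches(policy: Dict[str, Any], trace: Dict[str, str], draw: float) -> bool:
    kind = policy["type"]
    if kind == "and":
        # Every sub-policy must match for the and policy to match.
        subs = policy["and"]["and_sub_policy"]
        return all(matches(sub, trace, draw) for sub in subs)
    if kind == "string_attribute":
        cfg = policy["string_attribute"]
        return trace.get(cfg["key"]) in cfg["values"]
    if kind == "probabilistic":
        # draw stands in for the sampler's random number in [0, 100).
        return draw < policy["probabilistic"]["sampling_percentage"]
    raise ValueError(f"unsupported policy type: {kind}")


print(matches(and_policy, {"service.name": "service-name"}, draw=5.0))   # True
print(matches(and_policy, {"service.name": "service-name"}, draw=50.0))  # False
print(matches(and_policy, {"service.name": "other-service"}, draw=5.0))  # False
```

In this model, roughly 20% of traces from `service-name` match the policy, and traces from other services never match it, although another top-level policy may still keep them.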

Example scenario

The following scenario highlights how Adaptive Traces can be used to ingest only the traces that matter.

Imagine you run a popular online gaming platform, where users can purchase subscriptions. These transactions are high-value and critical to your business. You need to ensure a smooth and reliable experience for paying users, while also maintaining overall platform stability.

Challenges

You have a large and active user base, generating a huge volume of trace data.

Payment transactions are crucial, as any error can lead to revenue loss and customer dissatisfaction. Your platform comprises various microservices, such as payment processing, so you need to identify performance issues across services to ensure a smooth gaming experience.

How Adaptive Traces can help

  1. Create a probabilistic policy to capture a representative sample of traces from your services while maintaining a manageable data volume.

    {
      "name": "probabilistic-game-servers",
      "type": "probabilistic",
      "probabilistic": {
        "sampling_percentage": 1
      }
    }
  2. Configure an and policy that combines a string attribute sub-policy with a status code sub-policy to capture all traces with an error status code from the payment service. This ensures immediate visibility into any transaction failures. The top-level name field is omitted here; add your own.

    {
      "type": "and",
      "and": {
        "and_sub_policy": [
          {
            "name": "payment-service",
            "type": "string_attribute",
            "string_attribute": {
              "key": "service.name",
              "values": [
                "payment-service"
              ]
            }
          },
          {
            "name": "error",
            "type": "status_code",
            "status_code": {
              "status_codes": [
                "ERROR"
              ]
            }
          }
        ]
      }
    }
  3. Create an and policy that combines a latency sub-policy with a string attribute sub-policy to capture high-latency traces, for example, traces exceeding 500 ms, from your payment service. The top-level name field is omitted here; add your own.

    {
      "type": "and",
      "and": {
        "and_sub_policy": [
          {
            "name": "high-latency-payment",
            "type": "latency",
            "latency": {
              "threshold_ms": 500
            }
          },
          {
            "name": "payment-service-string-attribute",
            "type": "string_attribute",
            "string_attribute": {
              "key": "service.name",
              "values": [
                "payment-service"
              ]
            }
          }
        ]
      }
    }

For another practical example, refer to the OpenTelemetry documentation.