Grafana Cloud

Evaluation API

The evaluation control plane API manages evaluators, rules, and judge provider discovery.

Note

AI Observability is internally referred to as “Sigil” in some configuration and naming. For example, environment variables use the SIGIL_EVAL_ prefix, while the API routes documented here use the generic /api/v1/ prefix.

Evaluator endpoints

MethodPathDescription
GET/api/v1/evaluatorsList all evaluators.
POST/api/v1/evaluatorsCreate an evaluator.
GET/api/v1/evaluators/{id}Get an evaluator by ID.
PUT/api/v1/evaluators/{id}Update an evaluator.
DELETE/api/v1/evaluators/{id}Delete an evaluator.

Rule endpoints

MethodPathDescription
GET/api/v1/eval-rulesList all evaluation rules.
POST/api/v1/eval-rulesCreate a rule.
GET/api/v1/eval-rules/{id}Get a rule by ID.
PUT/api/v1/eval-rules/{id}Update a rule.
DELETE/api/v1/eval-rules/{id}Delete a rule.

Judge provider endpoints

MethodPathDescription
GET/api/v1/judge-providersList discovered judge providers.

Score ingest

TransportEndpoint
HTTPPOST /api/v1/scores:export

The score ingest endpoint accepts externally computed evaluation scores. Scores are idempotent — re-submitting the same score ID is a no-op.

Evaluator types

LLM judge

JSON
{
  "kind": "llm_judge",
  "config": {
    "provider": "openai",
    "model": "gpt-4o-mini",
    "system_prompt": "You are a quality evaluator.",
    "user_prompt": "Rate this response:\n{{assistant_response}}",
    "max_tokens": 100,
    "temperature": 0.0,
    "timeout_ms": 30000
  }
}

JSON schema

JSON
{
  "kind": "json_schema",
  "config": {
    "schema": {
      "type": "object",
      "required": ["answer"],
      "properties": {
        "answer": { "type": "string" }
      }
    },
    "target": "response"
  }
}

The optional target field sets the text to evaluate: response (default), input, or system_prompt.

Regex

JSON
{
  "kind": "regex",
  "config": {
    "pattern": "^\\d+$",
    "reject": false,
    "target": "response"
  }
}

Heuristic

JSON
{
  "kind": "heuristic",
  "config": {
    "version": "v2",
    "target": "response",
    "rules": {
      "and": [
        { "not_empty": "assistant_response" },
        { "min_length": { "field": "assistant_response", "value": 10 } }
      ]
    }
  }
}

Rule selectors

SelectorDescription
user_visible_turnAssistant text responses without tool calls.
all_assistant_generationsAny assistant output.
tool_call_stepsGenerations containing tool calls.
errored_generationsGenerations with call_error.

Rule alert integration

Rules can include an alert_rule_uids field that stores the UIDs of Grafana alert rules created from the evaluation rule. You can create alerts directly from the rule editor in the plugin UI by setting a pass-rate threshold and contact point. Alerts are created as non-provisioned, so you can edit them in the Grafana Alerting UI afterward.