---
title: "Data handling and privacy | Grafana Cloud documentation"
description: "Understand what data AI Observability collects, how it's stored, and how to control data retention."
---

# Data handling and privacy

Grafana AI Observability captures generation data that your SDKs export. This article explains what data is collected, how it’s stored, and how to control retention.

## Understand collected data

AI Observability stores the generation data your SDK sends, including:

- Conversation IDs and generation IDs.
- Model provider and name.
- System prompts, input messages, and output messages.
- Tool definitions, tool calls, and tool results.
- Token usage and timing data.
- Agent names and computed version hashes.
- Metadata and tags you attach.
- Evaluation scores and feedback.

AI Observability also receives OpenTelemetry metrics and traces from your agents via the collector.

## Control your data

You control what data AI Observability receives by configuring your SDK:

- **Embedding capture** is off by default. Enable it only for debugging because it may include sensitive input data.
- **Raw artifacts** (full provider request/response) are off by default.
- **Metadata and tags** are application-defined — include only what’s useful for observability.
- **System prompts and messages** are captured as-is. If your prompts contain sensitive data, consider filtering or redacting that data in your application before export.
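
As one way to filter content before export, the following minimal sketch masks email addresses and bearer tokens in message text. It is not part of any AI Observability SDK: the `redact_text` and `redact_messages` helpers, the `role`/`content` message shape, and the two patterns are all assumptions made for illustration, so adapt them to the data your application actually handles.

```python
import re

# Hypothetical patterns for illustration; extend them to match the kinds of
# sensitive data your own prompts may contain.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BEARER_RE = re.compile(r"\bbearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE)


def redact_text(text: str) -> str:
    """Mask email addresses and bearer tokens in a single string."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    text = BEARER_RE.sub("Bearer [REDACTED_TOKEN]", text)
    return text


def redact_messages(messages: list[dict]) -> list[dict]:
    """Return redacted copies of the messages; the input list is not modified.

    Assumes each message is a dict with an optional string "content" field.
    This shape is an assumption, not the SDK's actual message type.
    """
    redacted = []
    for message in messages:
        copy = dict(message)
        if isinstance(copy.get("content"), str):
            copy["content"] = redact_text(copy["content"])
        redacted.append(copy)
    return redacted
```

In this sketch you would call `redact_messages` on your messages immediately before handing them to the SDK, so the unredacted text never leaves your application.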

## Instrument coding agents privately

AI Observability SDKs can instrument coding agents, and ready-made plugins integrate with the most popular ones, including Claude Code, Codex, Cursor, GitHub Copilot, OpenCode, and Pi.

By default, plugins capture only metadata — model, token usage, tool names, and timing. Your conversations, prompts, tool calls, and tool results stay local. Sending that content to AI Observability is an explicit opt-in.

To enable conversation capture for a plugin, set the content capture mode in `~/.config/sigil/config.env`:

```bash
SIGIL_CONTENT_CAPTURE_MODE=full
```

The supported modes are `metadata_only` (default), `no_tool_content`, and `full`. Refer to [Instrument coding agents](/docs/grafana-cloud/machine-learning/ai-observability/guides/instrument-coding-agents) for the per-plugin setup steps.
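
If you script your workstation setup, a small helper like the following sketch can set the mode while rejecting unsupported values. It assumes the config file holds plain `KEY=VALUE` lines; the `set_capture_mode` function is hypothetical and is not shipped with the plugins.

```python
from pathlib import Path

# Mode names as listed in this article.
SUPPORTED_MODES = ("metadata_only", "no_tool_content", "full")
KEY = "SIGIL_CONTENT_CAPTURE_MODE"


def set_capture_mode(config_path: Path, mode: str) -> None:
    """Write KEY=mode to the env file, replacing any existing entry.

    Assumes the file contains simple KEY=VALUE lines (an assumption about
    the file format). Other lines are preserved in their original order.
    """
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported mode {mode!r}; expected one of {SUPPORTED_MODES}")

    lines = config_path.read_text().splitlines() if config_path.exists() else []
    # Drop any previous setting so the key appears exactly once.
    kept = [line for line in lines if not line.strip().startswith(f"{KEY}=")]
    kept.append(f"{KEY}={mode}")

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text("\n".join(kept) + "\n")
```

For example, `set_capture_mode(Path.home() / ".config" / "sigil" / "config.env", "no_tool_content")` would update the file referenced above. Writing the file by hand works just as well; the helper only adds validation.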

## Understand storage

Generation data is stored in two tiers:

- **Hot storage (MySQL)**: recent generation metadata and payloads for fast queries.
- **Cold storage (object storage)**: compacted, compressed payloads for long-term retention.

For self-hosted deployments, you control the storage infrastructure and retention policies. For Grafana Cloud, data handling follows Grafana Cloud’s standard data processing agreements.

## Configure retention

Configure the compactor retention period to control how long hot data is kept before compaction. After compaction, data in object storage follows your storage lifecycle policies.

## Understand online evaluation privacy

When you use LLM judge evaluators, generation content (messages, prompts, and tool calls) is sent to the configured judge provider for scoring. The judge provider processes this data according to its own terms of service, so choose judge providers that meet your organization’s data handling requirements.

## Next steps

- [Security and access controls](/docs/grafana-cloud/machine-learning/ai-observability/privacy-and-security/security)
