Configure the AI Observability SDK

All AI Observability SDKs share the same configuration model. This article covers the available options for generation export, authentication, batching, and telemetry.

Generation export

| Parameter | Default | Description |
| --- | --- | --- |
| protocol | grpc | Transport protocol. Options: http, grpc, none (instrumentation-only). |
| endpoint | varies by protocol | AI Observability API address. HTTP default: http://localhost:8080/api/v1/generations:export. gRPC default: localhost:4317. |
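
To make the protocol-dependent default concrete, here is a minimal Python sketch of how an endpoint could be resolved from the two parameters. The function name `resolve_endpoint` and the mapping `DEFAULT_ENDPOINTS` are hypothetical and not part of any SDK; only the protocol names and default addresses come from the options listed above.

```python
# Illustrative only: resolve_endpoint and DEFAULT_ENDPOINTS are hypothetical
# names, not SDK API. The default values are the ones documented above.
DEFAULT_ENDPOINTS = {
    "http": "http://localhost:8080/api/v1/generations:export",
    "grpc": "localhost:4317",
}


def resolve_endpoint(protocol="grpc", endpoint=None):
    """Return the export endpoint, or None when export is disabled."""
    if protocol == "none":
        # Instrumentation-only mode: nothing is exported.
        return None
    if protocol not in DEFAULT_ENDPOINTS:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    # An explicitly configured endpoint wins over the protocol default.
    return endpoint or DEFAULT_ENDPOINTS[protocol]
```

An explicit `endpoint` overrides the default in this sketch, which matches the usual meaning of a default value but is not confirmed SDK behavior.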

Authentication

| Mode | Required fields | Description |
| --- | --- | --- |
| none | (none) | No authentication. Suitable for local development. |
| tenant | tenantId | Injects X-Scope-OrgID header. Use for self-hosted multi-tenant deployments. |
| bearer | bearerToken | Injects Authorization: Bearer <token> header. Use with proxy patterns. |
| basic | tenantId, basicPassword | Injects Authorization: Basic header. Recommended for Grafana Cloud. |
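
The header each mode injects can be sketched as follows. This is an illustration, not the SDK's implementation: `auth_headers` is a hypothetical helper, and the sketch assumes that basic mode uses `tenantId` as the Basic-auth username and `basicPassword` as the password, encoded in the standard `user:password` Base64 form.

```python
import base64


def auth_headers(mode="none", tenant_id=None, bearer_token=None, basic_password=None):
    """Build the HTTP headers for one authentication mode (hypothetical helper)."""
    if mode == "none":
        return {}
    if mode == "tenant":
        if not tenant_id:
            raise ValueError("tenant mode requires tenantId")
        return {"X-Scope-OrgID": tenant_id}
    if mode == "bearer":
        if not bearer_token:
            raise ValueError("bearer mode requires bearerToken")
        return {"Authorization": f"Bearer {bearer_token}"}
    if mode == "basic":
        if not tenant_id or not basic_password:
            raise ValueError("basic mode requires tenantId and basicPassword")
        # Assumption: tenantId acts as the username of the Basic credentials.
        credentials = f"{tenant_id}:{basic_password}".encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        return {"Authorization": f"Basic {token}"}
    raise ValueError(f"unknown auth mode: {mode!r}")
```

Validating the required fields up front, as the sketch does, surfaces a missing credential at configuration time instead of as a rejected export request.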

Batching and retry

| Parameter | Default | Description |
| --- | --- | --- |
| batchSize | 100 | Maximum generations per export batch. |
| flushInterval | 1s | How often the SDK flushes queued generations. |
| queueSize | 2000 | Maximum number of queued generations before the SDK drops new ones. |
| maxRetries | 5 | Number of retry attempts for transient failures. |
| initialBackoff | 100ms | Initial retry delay. |
| maxBackoff | 5s | Maximum retry delay. |
| payloadMaxBytes | 16 MB | Maximum payload size per export request. |
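
The following Python sketch shows how these parameters could interact. It is a model under stated assumptions, not the SDK's code: the names `backoff_schedule` and `GenerationQueue` are hypothetical, and the sketch assumes the retry delay doubles on each attempt until it reaches `maxBackoff`. The source only states the initial and maximum delays, so the growth factor is a guess.

```python
from collections import deque


def backoff_schedule(max_retries=5, initial_backoff=0.1, max_backoff=5.0):
    """Delay in seconds before each retry, assuming the delay doubles per attempt."""
    return [
        min(initial_backoff * 2 ** attempt, max_backoff)
        for attempt in range(max_retries)
    ]


class GenerationQueue:
    """Bounded queue that rejects new generations once it is full."""

    def __init__(self, queue_size=2000, batch_size=100):
        self._items = deque()
        self.queue_size = queue_size
        self.batch_size = batch_size
        self.dropped = 0  # count of generations rejected because the queue was full

    def enqueue(self, generation):
        if len(self._items) >= self.queue_size:
            # Queue is full: the new generation is dropped, older ones are kept.
            self.dropped += 1
            return False
        self._items.append(generation)
        return True

    def next_batch(self):
        """Remove and return up to batch_size generations, oldest first."""
        count = min(self.batch_size, len(self._items))
        return [self._items.popleft() for _ in range(count)]
```

Under the doubling assumption, the default settings give delays of 0.1, 0.2, 0.4, 0.8, and 1.6 seconds, so the 5s cap only takes effect if `maxRetries` is raised above 6.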

OpenTelemetry metrics

The SDK emits these OpenTelemetry metrics:

| Metric | Type | Description |
| --- | --- | --- |
| gen_ai.client.operation.duration | Histogram | LLM call duration. |
| gen_ai.client.token.usage | Histogram | Token consumption per call. |
| gen_ai.client.time_to_first_token | Histogram | Streaming time to first token. |
| gen_ai.client.tool_calls_per_operation | Histogram | Tool calls per generation. |

Embedding capture

Embedding capture is off by default. Enable it only for debugging, because captured inputs may contain sensitive data.

| Parameter | Default | Description |
| --- | --- | --- |
| captureInput | false | Capture embedding input content. |
| maxInputItems | 20 | Maximum embedding inputs to capture. |
| maxTextLength | 1024 | Maximum text length per input. |
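
As a rough illustration of how the two limits bound what gets recorded, the sketch below applies them to a list of input strings. `capture_embedding_inputs` is a hypothetical function, and the sketch assumes that over-limit inputs are truncated rather than rejected and that `maxTextLength` counts characters; neither detail is stated above.

```python
def capture_embedding_inputs(inputs, capture_input=False,
                             max_input_items=20, max_text_length=1024):
    """Return the embedding inputs that would be recorded (hypothetical helper)."""
    if not capture_input:
        # Capture is off by default, so nothing is recorded.
        return []
    # Assumption: keep the first max_input_items inputs and cut each one
    # to at most max_text_length characters.
    return [text[:max_text_length] for text in inputs[:max_input_items]]
```

With the defaults enabled, a request carrying 25 inputs of 2,000 characters each would record 20 inputs of 1,024 characters in this model.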

Raw artifacts

Raw artifacts capture the unprocessed provider request and response. They are off by default.

Enable raw artifacts with the option for your language:

  • Go: WithRawArtifacts() option
  • Python: raw_artifacts=True
  • TypeScript: rawArtifacts: true
  • Java: .setRawArtifacts(true)
  • .NET: .WithRawArtifacts()
