Grafana Cloud

Configure the AI Observability SDK

All AI Observability SDKs share the same configuration model. This article covers the available options for generation export, authentication, batching, and telemetry.

Generation export

Parameter | Default | Description
protocol | grpc | Transport protocol. Options: http, grpc, none (instrumentation-only).
endpoint | varies by protocol | AI Observability API address. HTTP default: http://localhost:8080/api/v1/generations:export. gRPC default: localhost:4317.
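
The protocol-dependent defaults can be summarized as a small lookup. This is an illustrative sketch only: `resolve_endpoint` and `DEFAULT_ENDPOINTS` are hypothetical names, not part of any SDK, and the sketch assumes that `none` means no generations are exported.

```python
from typing import Optional

# Defaults copied from the parameter table; the names are hypothetical.
DEFAULT_ENDPOINTS = {
    "http": "http://localhost:8080/api/v1/generations:export",
    "grpc": "localhost:4317",
}


def resolve_endpoint(protocol: str = "grpc", endpoint: Optional[str] = None) -> Optional[str]:
    """Return the export endpoint for a protocol, or None when export is disabled."""
    if protocol == "none":
        # Assumed meaning of instrumentation-only: nothing is exported.
        return None
    if protocol not in DEFAULT_ENDPOINTS:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    # An explicitly configured endpoint wins over the protocol default.
    return endpoint or DEFAULT_ENDPOINTS[protocol]
```

Calling `resolve_endpoint()` with no arguments yields the gRPC default, because `grpc` is the default protocol.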

Authentication

Mode | Required fields | Description
none | (none) | No authentication. Suitable for local development.
tenant | tenantId | Injects X-Scope-OrgID header. Use for self-hosted multi-tenant deployments.
bearer | bearerToken | Injects Authorization: Bearer <token> header. Use with proxy patterns.
basic | tenantId, basicPassword | Injects Authorization: Basic header. Recommended for Grafana Cloud.

For basic mode, tenantId is your Grafana Cloud instance ID and basicPassword is a Grafana Cloud Access Policy Token with the sigil:write scope. Refer to Create an API key for setup instructions.
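
To make the four modes concrete, the following sketch builds the headers each mode would inject. It is not the SDK's implementation: `auth_headers` and its parameter names are hypothetical, and it assumes basic mode uses tenantId as the username and basicPassword as the password, as in standard HTTP Basic authentication.

```python
import base64
from typing import Dict, Optional


def auth_headers(
    mode: str = "none",
    tenant_id: Optional[str] = None,
    bearer_token: Optional[str] = None,
    basic_password: Optional[str] = None,
) -> Dict[str, str]:
    """Build the HTTP headers described for each authentication mode."""
    if mode == "none":
        return {}
    if mode == "tenant":
        if not tenant_id:
            raise ValueError("tenant mode requires tenant_id")
        return {"X-Scope-OrgID": tenant_id}
    if mode == "bearer":
        if not bearer_token:
            raise ValueError("bearer mode requires bearer_token")
        return {"Authorization": f"Bearer {bearer_token}"}
    if mode == "basic":
        if not tenant_id or not basic_password:
            raise ValueError("basic mode requires tenant_id and basic_password")
        # Assumed layout: base64("<tenantId>:<basicPassword>").
        credentials = f"{tenant_id}:{basic_password}".encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    raise ValueError(f"unknown auth mode: {mode!r}")
```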

Batching and retry

Parameter | Default | Description
batchSize | 100 | Maximum generations per export batch.
flushInterval | 1s | How often the SDK flushes queued generations.
queueSize | 2000 | Maximum number of queued generations before the SDK drops new ones.
maxRetries | 5 | Number of retry attempts for transient failures.
initialBackoff | 100ms | Initial retry delay.
maxBackoff | 5s | Maximum retry delay.
payloadMaxBytes | 16 MB | Maximum payload size per export request.
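
The following sketch shows how these parameters could interact. It is not the SDK's code: `backoff_schedule_ms` and `BoundedQueue` are hypothetical names, and the doubling of the delay between retries (with no jitter) is an assumption, since the growth factor is not documented here.

```python
from collections import deque
from typing import Deque, List


def backoff_schedule_ms(max_retries: int = 5, initial_ms: int = 100, max_ms: int = 5000) -> List[int]:
    """Delay before each retry in milliseconds, assuming the delay doubles each time."""
    return [min(initial_ms * (2 ** attempt), max_ms) for attempt in range(max_retries)]


class BoundedQueue:
    """Queue that rejects new items once full, mirroring the queueSize description."""

    def __init__(self, queue_size: int = 2000) -> None:
        self._items: Deque[object] = deque()
        self._queue_size = queue_size
        self.dropped = 0

    def enqueue(self, item: object) -> bool:
        if len(self._items) >= self._queue_size:
            self.dropped += 1  # the new item is dropped, not an older one
            return False
        self._items.append(item)
        return True

    def next_batch(self, batch_size: int = 100) -> List[object]:
        """Remove and return up to batch_size items for one export request."""
        count = min(batch_size, len(self._items))
        return [self._items.popleft() for _ in range(count)]
```

Under these assumptions the default settings give retry delays of 100, 200, 400, 800, and 1600 ms, so the 5s cap only matters when maxRetries is raised or initialBackoff is increased.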

OpenTelemetry setup

The SDK emits OpenTelemetry spans and metrics internally, but does not create OTel providers. Your application must configure a TracerProvider and MeterProvider before creating the Sigil client. Without this setup, all traces and metrics are silently lost.

Set the OTLP endpoint and optional auth headers through environment variables. The OTel SDK exporters read them automatically:

Bash
# Option A — Direct to Grafana Cloud (no collector needed):
export OTEL_EXPORTER_OTLP_ENDPOINT="https://<your-otlp-gateway-url>"   # from Grafana Cloud portal
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <base64(instance_id:cloud_api_token)>"

# Option B — Via local Alloy / OTel Collector:
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"

Refer to Send data using the OTLP endpoint to find your stack-specific OTLP gateway URL.
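
The `<base64(instance_id:cloud_api_token)>` placeholder in Option A stands for the base64 encoding of the two values joined by a colon. A minimal sketch for computing the full header value follows; `otlp_basic_auth_header` is a hypothetical helper name.

```python
import base64


def otlp_basic_auth_header(instance_id: str, cloud_api_token: str) -> str:
    """Return a value suitable for OTEL_EXPORTER_OTLP_HEADERS."""
    credentials = f"{instance_id}:{cloud_api_token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return f"Authorization=Basic {encoded}"
```

Some OpenTelemetry SDK versions may expect the space after `Basic` to be percent-encoded as `%20` in this environment variable, so check the behavior of the exporter version you use.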

Python
from opentelemetry import trace, metrics
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter

resource = Resource.create({"service.name": "my-agent"})

tp = TracerProvider(resource=resource)
tp.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(tp)

mp = MeterProvider(resource=resource, metric_readers=[
    PeriodicExportingMetricReader(OTLPMetricExporter())
])
metrics.set_meter_provider(mp)

# ... create Sigil client and use it ...

sigil.shutdown()
tp.shutdown()
mp.shutdown()

This example requires opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http.

Go
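import (
    "context"
    "log"
    "go.opentelemetry.io/contrib/exporters/autoexport"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    sdkmetric "go.opentelemetry.io/otel/sdk/metric"
    "go.opentelemetry.io/otel/sdk/resource"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

ctx := context.Background()
// Describe the service; "my-agent" is a placeholder name.
res, err := resource.New(ctx, resource.WithAttributes(attribute.String("service.name", "my-agent")))
if err != nil {
    log.Fatal(err)
}

// autoexport selects the exporters from the OTEL_* environment variables.
traceExp, err := autoexport.NewSpanExporter(ctx)
if err != nil {
    log.Fatal(err)
}
tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(traceExp), sdktrace.WithResource(res))
otel.SetTracerProvider(tp)
defer tp.Shutdown(ctx)

metricReader, err := autoexport.NewMetricReader(ctx)
if err != nil {
    log.Fatal(err)
}
mp := sdkmetric.NewMeterProvider(sdkmetric.WithReader(metricReader), sdkmetric.WithResource(res))
otel.SetMeterProvider(mp)
defer mp.Shutdown(ctx)

// ... create Sigil client and use it ...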

TypeScript
import { metrics } from "@opentelemetry/api";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import {
  MeterProvider,
  PeriodicExportingMetricReader,
} from "@opentelemetry/sdk-metrics";
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http";

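// `resource` is assumed to be an OpenTelemetry Resource describing your service.
// Span processors are passed to the constructor; newer SDK versions no longer
// provide addSpanProcessor().
const tp = new NodeTracerProvider({
  resource,
  spanProcessors: [new BatchSpanProcessor(new OTLPTraceExporter())],
});
tp.register();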

const mp = new MeterProvider({
  resource,
  readers: [
    new PeriodicExportingMetricReader({ exporter: new OTLPMetricExporter() }),
  ],
});
metrics.setGlobalMeterProvider(mp);

// ... create Sigil client and use it ...

await sigil.shutdown();
await tp.shutdown();
await mp.shutdown();

Java
// Use OpenTelemetry autoconfigure:
import io.opentelemetry.sdk.autoconfigure.AutoConfiguredOpenTelemetrySdk;

AutoConfiguredOpenTelemetrySdk.initialize();

// ... create Sigil client and use it ...

C#
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Trace;

using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .AddSource("github.com/grafana/sigil/sdks/dotnet")
    .AddOtlpExporter()
    .Build();

using var meterProvider = Sdk.CreateMeterProviderBuilder()
    .AddMeter("github.com/grafana/sigil/sdks/dotnet")
    .AddOtlpExporter()
    .Build();

// ... create Sigil client and use it ...

OpenTelemetry metrics

The SDK emits these OpenTelemetry metrics:

Metric | Type | Description
gen_ai.client.operation.duration | Histogram | LLM call duration.
gen_ai.client.token.usage | Histogram | Token consumption per call.
gen_ai.client.time_to_first_token | Histogram | Streaming time to first token.
gen_ai.client.tool_calls_per_operation | Histogram | Tool calls per generation.

Embedding capture

Embedding capture is off by default. Enable it for debugging only because it may expose sensitive data.

Parameter | Default | Description
captureInput | false | Capture embedding input content.
maxInputItems | 20 | Maximum embedding inputs to capture.
maxTextLength | 1024 | Maximum text length per input.
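
As a rough illustration of how the two limits combine, the sketch below applies them to a list of inputs. `capture_embedding_inputs` is a hypothetical function, and it assumes that excess inputs are dropped from the end and that text length is counted in characters; the SDK may behave differently.

```python
from typing import List, Sequence


def capture_embedding_inputs(
    inputs: Sequence[str],
    capture_input: bool = False,
    max_input_items: int = 20,
    max_text_length: int = 1024,
) -> List[str]:
    """Return the inputs that would be captured under the given limits."""
    if not capture_input:
        # Capture is off by default, so nothing is recorded.
        return []
    # Keep at most max_input_items inputs, each cut to max_text_length characters.
    return [text[:max_text_length] for text in inputs[:max_input_items]]
```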

Raw artifacts

Raw artifacts capture the unprocessed provider request and response. They are off by default.

Enable them with the option for your language:

  • Go: WithRawArtifacts() option
  • Python: raw_artifacts=True
  • TypeScript: rawArtifacts: true
  • Java: .setRawArtifacts(true)
  • .NET: .WithRawArtifacts()

Next steps