---
title: "Grafana Cloud AI Observability | Grafana Cloud documentation"
description: "OpenTelemetry-native observability with distributed tracing for your entire AI stack including LLMs, vector databases, GPU infrastructure, Model Context Protocol, and AI evaluations with Grafana and Grafana Cloud"
---

# AI Observability

OpenTelemetry-native AI observability with distributed tracing across your entire AI stack. Monitor and visualize the real-time performance of LLMs, vector databases, GPUs, and Model Context Protocol (MCP) servers.

* * *

## Overview

Grafana Cloud AI Observability monitors your AI stack end to end, covering LLMs, agents, vector databases, MCP servers, and GPU infrastructure, so you can track and optimize performance, cost, and quality in one place.

### GenAI observability

- **Performance tracking:** Monitor LLM response times, throughput, and availability across providers
- **Cost management:** Real-time spend tracking, cost optimization, and budget management for LLM usage
- **Token analytics:** Track consumption patterns, efficiency metrics, and usage optimization opportunities
- **User interactions:** Inspect user interactions, prompts, and completions to understand application behavior and performance
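
To illustrate how token analytics relate to cost management, the following minimal sketch derives spend from token counts. It is not the integration's implementation: the `Usage` type, the `example-model` name, and the per-1K-token prices are hypothetical values chosen for illustration, and real provider pricing differs.

```python
from dataclasses import dataclass

# Hypothetical per-1K-token prices in USD (assumption, not real pricing).
PRICES_PER_1K = {
    "example-model": {"input": 0.50, "output": 1.50},
}


@dataclass
class Usage:
    """Token counts reported for a single LLM request."""
    model: str
    input_tokens: int
    output_tokens: int


def request_cost_usd(usage: Usage) -> float:
    """Cost of one request: tokens / 1000 * price per 1K, summed over input and output."""
    price = PRICES_PER_1K[usage.model]
    input_cost = usage.input_tokens / 1000 * price["input"]
    output_cost = usage.output_tokens / 1000 * price["output"]
    return input_cost + output_cost


def total_cost_usd(usages: list[Usage]) -> float:
    """Aggregate spend across many requests."""
    return sum(request_cost_usd(u) for u in usages)
```

Keeping input and output prices separate matters because providers commonly bill the two at different rates, so a single blended rate would misattribute cost for prompt-heavy or completion-heavy workloads.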

### GenAI evaluations

- **Quality assessment:** Automated hallucination detection, factual accuracy verification, and content quality scoring
- **Safety monitoring:** Continuous toxicity detection, bias assessment, and compliance tracking for responsible AI
- **Evaluation scoring:** Confidence levels, quality gates, and automated quality assurance workflows
- **Problem identification:** Detailed analysis and categorization of AI model issues and failure patterns

### GenAI agent observability

- **Invocation tracking:** Monitor total agent invocations, usage distribution by source, and percentage breakdown across your agentic AI systems
- **Cost management:** Real-time tracking of total agent costs in USD, per-agent cost breakdown, and cost attribution for budget optimization
- **Performance monitoring:** Track 95th percentile operation duration, average latency by agent and provider, and operation throughput rates
- **Logs and debugging:** Integrated agent logs with OpenTelemetry trace and span ID correlation for distributed tracing and root cause analysis
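
As a rough illustration of the performance metrics listed above, this sketch computes the average and 95th percentile operation duration per agent from raw samples. It assumes a nearest-rank percentile definition and a hypothetical input of `(agent_name, duration_seconds)` pairs; the dashboards may calculate percentiles differently, for example from histogram buckets.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Multiply before dividing to avoid floating-point error in the rank.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_by_agent(samples):
    """Group (agent_name, duration_seconds) pairs and summarize each agent."""
    grouped = defaultdict(list)
    for agent, duration in samples:
        grouped[agent].append(duration)
    return {
        agent: {
            "count": len(durations),
            "avg": sum(durations) / len(durations),
            "p95": percentile(durations, 95),
        }
        for agent, durations in grouped.items()
    }
```

Reporting the 95th percentile alongside the average is useful because a small number of slow operations can leave the average nearly unchanged while still affecting a meaningful share of requests.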

### VectorDB observability

- **Query performance:** Monitor similarity search response times, throughput, and query optimization
- **Database operations:** Track insert, update, and delete operations across different vector database providers
- **Resource utilization:** Monitor memory usage, storage efficiency, and infrastructure scaling needs
- **Index management:** Track index building, optimization, and maintenance for optimal search performance

### MCP observability

- **Protocol health:** Track session management, connection stability, and protocol compliance metrics
- **Tool analytics:** Monitor tool usage patterns, performance, and availability across your AI ecosystem
- **Transport monitoring:** Analyze communication performance across HTTP, WebSocket, and other transport layers
- **Integration insights:** Track tool invocation patterns, payload analysis, and system reliability

### GPU observability

- **Performance monitoring:** Track GPU utilization, compute efficiency, and processing throughput
- **Thermal management:** Monitor temperatures and cooling systems to prevent thermal throttling
- **Resource optimization:** Analyze memory usage, power consumption, and multi-GPU coordination
- **Infrastructure health:** Monitor hardware status, driver stability, and predictive maintenance metrics

## Explore

[Introduction  
\
Learn how Grafana Cloud AI Observability can help you improve the performance of your AI stack.](./introduction/)

[Setup Guide  
\
Install the AI Observability integration and configure OpenTelemetry for your AI applications.](./setup/)

[GenAI Monitoring  
\
Monitor and evaluate your generative AI applications with comprehensive observability and quality assessment capabilities.](./genai/)

[VectorDB Observability  
\
Track vector database performance, query response times, and operational metrics across services and environments.](./vectordb-observability/)

[MCP Observability  
\
Monitor Model Context Protocol usage, tool analytics, and transport performance.](./mcp-observability/)

[GPU Observability  
\
Track GPU utilization, temperature, memory usage, and hardware performance metrics across your infrastructure.](./gpu-observability/)
