---
title: "AI/LLM observability overview | Grafana Labs"
description: "Monitor LLMs, vector databases, and GPU infrastructure"
---

## What you get

| Component                        | What it monitors                                | What it helps you do                 |
|----------------------------------|-------------------------------------------------|--------------------------------------|
| **GenAI / LLMs**                 | Response times, token usage, costs, error rates | Track LLM performance and spending.  |
| **GenAI evaluations**            | Hallucination detection, toxicity, bias         | Ensure AI output quality and safety. |
| **Vector databases**             | Query performance, operations, resource usage   | Optimize RAG pipelines.              |
| **MCP (Model Context Protocol)** | Tool analytics, session health                  | Monitor AI agent integrations.       |
| **GPU infrastructure**           | Utilization, temperature, memory, power         | Prevent GPU bottlenecks.             |

## Questions answered

| With AI Observability, you can answer…                        |
|---------------------------------------------------------------|
| How much are we spending on LLM tokens this month?            |
| Is our AI model hallucinating or producing toxic content?     |
| Why is our RAG pipeline returning slow or irrelevant results? |
| Are our GPUs being underutilized or thermal throttling?       |
| Which AI agents are using tools most frequently?              |

## Problems solved

| Problem                               | Solution                                     |
|---------------------------------------|----------------------------------------------|
| LLM costs unpredictable and untracked | Real-time cost tracking per model/provider   |
| No visibility into AI model quality   | Automated hallucination and safety detection |
| RAG pipeline lacks transparency       | Vector DB query metrics                      |
| GPU resources wasted                  | Utilization and temperature monitoring       |
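
Per-model cost tracking comes down to multiplying token counts by a price and grouping the result by provider and model. The following is a minimal sketch of that arithmetic, not the product's implementation: the provider and model names, the prices in `PRICE_PER_1K`, and the record fields (`provider`, `model`, `input_tokens`, `output_tokens`) are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens. Real prices vary by
# provider and model and change over time.
PRICE_PER_1K = {
    ("example-provider", "example-model-small"): {"input": 0.0005, "output": 0.0015},
    ("example-provider", "example-model-large"): {"input": 0.01, "output": 0.03},
}


def request_cost(provider, model, input_tokens, output_tokens):
    """Return the estimated USD cost of a single LLM request."""
    price = PRICE_PER_1K[(provider, model)]
    return (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]


def cost_by_model(requests):
    """Sum estimated cost per (provider, model) over an iterable of request records."""
    totals = defaultdict(float)
    for r in requests:
        key = (r["provider"], r["model"])
        totals[key] += request_cost(
            r["provider"], r["model"], r["input_tokens"], r["output_tokens"]
        )
    return dict(totals)


if __name__ == "__main__":
    # Assumed record shape: one dict per request.
    sample = [
        {"provider": "example-provider", "model": "example-model-small",
         "input_tokens": 2000, "output_tokens": 1000},
        {"provider": "example-provider", "model": "example-model-large",
         "input_tokens": 1000, "output_tokens": 1000},
    ]
    for (provider, model), usd in cost_by_model(sample).items():
        print(f"{provider}/{model}: ${usd:.4f}")
```

Keying the totals by `(provider, model)` mirrors the per-model/provider breakdown in the table above and keeps the same model name from two providers separate.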

## End-to-end AI stack visibility

![AI stack pipeline showing User Request through AI Agent, LLM, Vector DB, to GPU with metrics at each stage](grafana-cloud-ai-stack-1.svg "AI stack pipeline showing User Request through AI Agent, LLM, Vector DB, to GPU with metrics at each stage")
