GenAI observability

GenAI Observability provides complete monitoring for Large Language Model (LLM) applications, including performance metrics, token usage tracking, cost analysis, and user interaction patterns.

Overview

The GenAI Observability dashboard is the central monitoring view for LLM applications, offering insights into:

  • Request monitoring - Volume, success rates, and response times
  • Cost analysis - Real-time spend tracking and optimization insights
  • Token usage - Consumption patterns and efficiency metrics
  • Performance analytics - Model comparisons and trend analysis
  • Error tracking - Detailed failure analysis and categorization

Key features

Performance monitoring

  • Request volume and frequency tracking
  • Success and failure rate analysis
  • Response time distribution and latency monitoring
  • Throughput analysis and concurrent request handling
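As a rough illustration of the metrics these panels visualize, the sketch below computes a success rate and a p95 latency from a batch of request records. The records and field names are hypothetical sample data, not the dashboard's actual data model.

```python
import math

# Hypothetical request records; real data would come from your telemetry pipeline.
requests = [
    {"status": "success", "latency_ms": 320},
    {"status": "success", "latency_ms": 410},
    {"status": "error",   "latency_ms": 1200},
    {"status": "success", "latency_ms": 290},
]

# Success rate: fraction of requests that completed without error.
success_rate = sum(r["status"] == "success" for r in requests) / len(requests)

# p95 latency: the latency value below which 95% of requests fall
# (nearest-rank method on the sorted latencies).
latencies = sorted(r["latency_ms"] for r in requests)
idx = max(0, math.ceil(0.95 * len(latencies)) - 1)
p95 = latencies[idx]

print(f"success rate: {success_rate:.0%}, p95 latency: {p95} ms")
```

In practice these aggregations run in your metrics backend rather than application code, but the arithmetic is the same.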

Cost optimization

  • Real-time cost calculation and tracking
  • Cost per request analysis and optimization
  • Model cost comparison and recommendations
  • Budget tracking and cost alert capabilities
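To make the cost-per-request idea concrete, here is a minimal sketch of deriving request cost from token counts. The model name and per-1K-token prices are hypothetical placeholders, not real rates; check your provider's pricing page for actual values.

```python
# Hypothetical per-1K-token prices (USD); substitute your provider's real rates.
PRICES_PER_1K = {
    "model-a": {"input": 0.0015, "output": 0.002},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single LLM request from its token counts."""
    prices = PRICES_PER_1K[model]
    return (input_tokens / 1000) * prices["input"] + \
           (output_tokens / 1000) * prices["output"]

cost = request_cost("model-a", input_tokens=1200, output_tokens=400)
print(f"estimated cost: ${cost:.4f}")
```

Summing this quantity over a time window gives the real-time spend figure; grouping it by model gives the cost-comparison view.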

Token analytics

  • Input and output token consumption tracking
  • Token efficiency analysis and optimization
  • Usage pattern identification and trends
  • Model-specific token cost analysis
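The per-model aggregation behind these token views can be sketched as follows. The sample records and the output-to-input ratio used as an efficiency signal are illustrative assumptions, not the dashboard's exact formula.

```python
from collections import defaultdict

# Hypothetical token-usage records; real data would come from instrumentation.
records = [
    {"model": "model-a", "input_tokens": 900,  "output_tokens": 300},
    {"model": "model-a", "input_tokens": 1100, "output_tokens": 500},
    {"model": "model-b", "input_tokens": 400,  "output_tokens": 600},
]

# Aggregate input and output token totals per model.
totals = defaultdict(lambda: {"input": 0, "output": 0})
for r in records:
    totals[r["model"]]["input"] += r["input_tokens"]
    totals[r["model"]]["output"] += r["output_tokens"]

# Output-to-input ratio as a rough per-model efficiency signal:
# high ratios mean long completions relative to the prompts sent.
ratios = {model: t["output"] / t["input"] for model, t in totals.items()}
print(ratios)
```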

Getting started