Kong AI Gateway Dashboard

This dashboard visualizes the following Kong AI Gateway metrics:

Requests per Second (RPS): The number of requests processed per second.
Tokens per Second (TPS): The number of tokens processed per second.
Latency: The delay in processing requests, broken down by component:
- LLM Provider: Latency from the Large Language Model provider.
- Cache Fetch: Latency to retrieve data from the cache.
- Cache Embeddings: Latency related to creating embeddings for the cache.
- RAG Fetch: Latency to retrieve data for Retrieval-Augmented Generation (RAG).
- RAG Embeddings: Latency related to creating embeddings for RAG.
Cost: The monetary cost associated with API usage.
Tokens: The total number of tokens consumed.
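
Per-second figures such as RPS and TPS are typically derived from cumulative counters by dividing the increase between two readings by the elapsed time. The sketch below illustrates that arithmetic only; it is not how the dashboard itself is implemented. The `CounterSample` type, the `per_second_rate` helper, and the sample readings are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """One reading of a cumulative counter (hypothetical helper type)."""
    timestamp_s: float  # seconds since an arbitrary reference point
    value: float        # cumulative count, e.g. requests or tokens


def per_second_rate(earlier: CounterSample, later: CounterSample) -> float:
    """Average per-second increase of the counter between two readings."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.value - earlier.value
    if delta < 0:
        # A drop means the counter was reset (for example after a restart),
        # so only the amount accumulated since the reset is counted.
        delta = later.value
    return delta / elapsed


# Hypothetical readings taken 60 seconds apart.
rps = per_second_rate(CounterSample(0.0, 1200), CounterSample(60.0, 1500))
tps = per_second_rate(CounterSample(0.0, 90000), CounterSample(60.0, 120000))
print(rps)  # 5.0
print(tps)  # 500.0
```

Treating a drop in the counter as a reset keeps the rate non-negative across restarts, at the cost of slightly undercounting the interval in which the reset happened.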
