





Grafana LLM app (public preview)

This Grafana application plugin centralizes access to LLMs across Grafana.

It is responsible for:

  • storing API keys for LLM providers
  • proxying requests to LLMs with auth, so that other Grafana components need not store API keys
  • providing Grafana Live streams of streaming responses from LLM providers (namely OpenAI)
  • providing LLM-based extensions to Grafana's extension points (e.g. 'explain this panel')
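The proxying responsibility in the list above can be pictured with a small sketch. It only illustrates the idea and is not the plugin's implementation: the `ProxiedRequest` and `ProviderSettings` types and the `withAuth` function are hypothetical names. The one fact it relies on is that OpenAI-style APIs accept the key in an `Authorization: Bearer` header.

```typescript
// Hypothetical types for illustration; the plugin's real types may differ.
interface ProxiedRequest {
  path: string;
  headers: Record<string, string>;
  body: string;
}

interface ProviderSettings {
  baseUrl: string;
  apiKey: string; // stored once, by the plugin only
}

// Attach credentials on the way out, so the caller never needs to hold the key.
function withAuth(req: ProxiedRequest, settings: ProviderSettings): ProxiedRequest & { url: string } {
  const base = settings.baseUrl.replace(/\/+$/, '');
  const path = req.path.replace(/^\/+/, '');
  return {
    ...req,
    url: `${base}/${path}`,
    headers: { ...req.headers, Authorization: `Bearer ${settings.apiKey}` },
  };
}
```

The calling component passes a request with no credentials; only the proxy layer ever reads `apiKey`.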

Future functionality will include:

  • support for more LLM providers, including the ability to choose your own at runtime
  • rate limiting of requests to LLMs, for cost control
  • token and cost estimation
  • RBAC to only allow certain users to use LLM functionality

Note: The Grafana LLM app plugin is currently in public preview. Grafana Labs offers support on a best-effort basis, and there might be breaking changes before the feature is generally available.

For users

Grafana Cloud: the LLM app plugin is installed for everyone, but LLM features are disabled by default. To enable LLM features, select "Enable OpenAI access via Grafana" in plugin configuration.

OSS or Enterprise: install and configure this plugin with your OpenAI-compatible API key to enable LLM-powered features across Grafana.

This includes new functionality inside Grafana itself, such as explaining panels, or in plugins, such as automated incident summaries, AI assistants for flame graphs and Sift error logs, and more.

All LLM requests are routed through this plugin, which ensures that the correct API key is used and that requests are routed appropriately.

For plugin developers

This plugin is not designed to be called directly. Instead, use the convenience functions in the @grafana/llm package, which communicate with this plugin if it is installed.

Working examples can be found in the @grafana/llm README and in DevSandbox.tsx.

First, add the latest version of @grafana/llm to your dependencies in package.json:

  "dependencies": {
    "@grafana/llm": "0.8.0"
  }

Then, in your components, you can use the llms object from @grafana/llm like so:

import React, { useState } from 'react';
import { useAsync } from 'react-use';
import { scan } from 'rxjs/operators';

import { llms } from '@grafana/llm';
import { Button, Input, Spinner } from '@grafana/ui';

const MyComponent = (): JSX.Element | null => {
  const [input, setInput] = useState('');
  const [message, setMessage] = useState('');
  const [reply, setReply] = useState('');

  const { loading, error } = useAsync(async () => {
    // Do nothing unless the LLM plugin is installed and enabled.
    const enabled = await llms.openai.enabled();
    if (!enabled) {
      return false;
    }
    if (message === '') {
      return;
    }
    // Stream the completions. Each element is the next stream chunk.
    const stream = llms.openai
      .streamChatCompletions({
        model: 'gpt-3.5-turbo',
        messages: [
          { role: 'system', content: 'You are a cynical assistant.' },
          { role: 'user', content: message },
        ],
      })
      .pipe(
        // Accumulate the stream chunks into a single string.
        scan((acc, delta) => acc + delta, '')
      );
    // Subscribe to the stream and update the state for each returned value.
    return stream.subscribe(setReply);
  }, [message]);

  if (error) {
    // TODO: handle errors.
    return null;
  }

  return (
    <div>
      <Input
        value={input}
        onChange={(e) => setInput(e.currentTarget.value)}
        placeholder="Enter a message"
      />
      <br />
      <Button type="submit" onClick={() => setMessage(input)}>
        Submit
      </Button>
      <br />
      <div>{loading ? <Spinner /> : reply}</div>
    </div>
  );
};
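To make the scan step in the component above concrete: each stream element is a small text delta, and scan emits the running concatenation after every delta. The sketch below reproduces that accumulation over a plain array, without RxJS. `accumulateDeltas` is a hypothetical helper written for this illustration and is not part of @grafana/llm.

```typescript
// Returns the value the accumulator holds after each delta arrives,
// mirroring scan((acc, delta) => acc + delta, '').
function accumulateDeltas(deltas: string[]): string[] {
  const states: string[] = [];
  let acc = '';
  for (const delta of deltas) {
    acc += delta;
    states.push(acc);
  }
  return states;
}

// Each entry is what setReply would receive, in order.
const replyStates = accumulateDeltas(['Graf', 'ana ', 'rocks']);
// replyStates is ['Graf', 'Grafana ', 'Grafana rocks']
```

Because every emitted value is the full text so far, the component can pass it straight to setReply without keeping its own buffer.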

Installing LLM on Grafana Cloud:

For more information, visit the docs on plugin installation.

Change log




  • Settings: differentiate between disabled and not configured (#350)


  • Breaking: use base and large model names instead of small/medium/large (#334)
  • Breaking: remove function calling arguments from @grafana/llm package (#343)
  • Allow customisation of mapping between abstract model and provider model, and default model (#337, #338, #340)
  • Make the model field optional for chat completions & chat completion stream endpoints (#341)
  • Don't preload the plugin to avoid slowing down Grafana load times (#339)
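As a rough illustration of the abstract model names in the entries above, the sketch below resolves `base` or `large` to a provider model, with optional overrides and a default for requests that omit the model. Everything in it is an assumption made for illustration: the `resolveModel` helper, the settings shape, and the provider model names are hypothetical and are not taken from the plugin.

```typescript
type AbstractModel = 'base' | 'large';

// Hypothetical settings shape, for illustration only.
interface ModelSettings {
  defaultModel: AbstractModel;
  mapping: Partial<Record<AbstractModel, string>>; // user overrides
}

// Placeholder provider model names; real names depend on the provider.
const FALLBACK_MAPPING: Record<AbstractModel, string> = {
  base: 'provider-small-model',
  large: 'provider-large-model',
};

// The model is optional: when omitted, the configured default is used.
function resolveModel(settings: ModelSettings, requested?: AbstractModel): string {
  const model = requested ?? settings.defaultModel;
  return settings.mapping[model] ?? FALLBACK_MAPPING[model];
}
```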


  • Fix handling of streaming requests made via resource endpoints (#326)


  • Initial backend support for abstracted models (#315)


  • Fix panic with stream EOF (#308)


  • Added a displayVectorStoreOptions flag to optionally display the vector store configs


  • Add mitigation for side channel attacks


  • Refactors repo into monorepo together with frontend dependencies
  • Creates developer sandbox for developing frontend dependencies
  • Switches CI/CD to GitHub Actions


  • Fix bug where resource calls to OpenAI would fail for Grafana managed LLMs


  • Fix additional UI bugs
  • Fix issue where health check returned true even if LLM was disabled


  • Fix UI issues around OpenAI provider introduced in 0.6.1


  • Store Grafana-managed OpenAI opt-in in ML cloud backend DB (Grafana Cloud only)
  • Updated Grafana-managed OpenAI opt-in messaging (Grafana Cloud only)
  • UI update for LLM provider selection


  • Add Grafana-managed OpenAI as a provider option (Grafana Cloud only)


  • Allow Qdrant API key to be configured in config UI, not just when provisioning


  • Fix issue where temporary errors were cached, causing /health to fail permanently.


  • Add basic auth to VectorAPI


  • Add 'Enabled' switch for vector services to configuration UI
  • Added instructions for developing with example app
  • Improve health check to return more granular details
  • Add support for filtered vector search
  • Improve vector service health check


  • Add Go package providing an OpenAI client to use the LLM app from backend Go code
  • Add support for Azure OpenAI. The plugin must be configured to use OpenAI and provide a link between OpenAI model names and Azure deployment names
  • Return streaming errors as part of the stream, with objects like {"error": "<error message>"}
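Because errors now arrive in-band, a consumer of the stream has to check each message for an `error` field. A minimal sketch, assuming each message arrives as a JSON string and that non-error messages carry their text in a `content` field. That field name and the `collectStream` helper are hypothetical; only the {"error": "<error message>"} shape comes from the entry above.

```typescript
interface StreamResult {
  text: string;
  error?: string;
}

function collectStream(rawMessages: string[]): StreamResult {
  let text = '';
  for (const raw of rawMessages) {
    const msg: unknown = JSON.parse(raw);
    if (typeof msg !== 'object' || msg === null) {
      continue; // ignore anything that is not a JSON object
    }
    const fields = msg as Record<string, unknown>;
    const err = fields['error'];
    if (typeof err === 'string') {
      // Stop at the first error and report it, keeping the text received so far.
      return { text, error: err };
    }
    const content = fields['content']; // hypothetical field name
    if (typeof content === 'string') {
      text += content;
    }
  }
  return { text };
}
```

Returning the partial text alongside the error lets a UI show what was generated before the failure.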


  • Improve health check endpoint to include status of various features
  • Change path handling for chat completions streams to put separate requests into separate streams. Requests can now pass a UUID as the suffix of the path; this remains backwards compatible with older versions of the frontend code.
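The idea of one stream per request can be sketched as follows. The base path and the `streamPathFor` helper are hypothetical; only the use of a UUID as the path suffix comes from the entry above.

```typescript
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Builds a per-request stream path by appending the request's UUID.
// Without a UUID the bare path is returned, which is assumed here to be
// what older frontend code would send.
function streamPathFor(basePath: string, requestId?: string): string {
  const trimmed = basePath.replace(/\/+$/, '');
  if (requestId === undefined) {
    return trimmed;
  }
  if (!UUID_PATTERN.test(requestId)) {
    throw new Error(`not a UUID: ${requestId}`);
  }
  return `${trimmed}/${requestId}`;
}
```

Two requests with different UUIDs get different paths, so their chunks cannot interleave on a shared stream.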


  • Expose vector search API to perform semantic search against a vector database using a configurable embeddings source


  • Support proxying LLM requests from Grafana to OpenAI