
How to monitor AI agent applications on Amazon Bedrock AgentCore with Grafana Cloud

2025-11-26 · 7 min read

Today’s AI agents have grown increasingly sophisticated, moving into production environments and becoming integral parts of engineering workflows. But these agents can also be black boxes for engineers, which makes observability more critical than ever. 

Without proper monitoring, you’re often left feeling like you’re flying blind as you try to debug agent failures, understand performance bottlenecks, and track costs. We want to put our users back in control, so in this tutorial you’ll learn how to deploy an AI agent on Amazon Bedrock AgentCore with full observability powered by OpenTelemetry and Grafana Cloud.

More specifically, you’ll learn how to:

1. Deploy AI agents on AWS Bedrock AgentCore for managed, scalable production runtime

2. Instrument agents with OpenTelemetry using OpenLit for automatic, zero-code observability

3. Monitor agent performance in Grafana Cloud with AI Observability dashboards

4. Debug production issues using distributed tracing

5. Optimize costs by tracking token usage and model performance

Note: This post focuses on AI application observability. Stay tuned for the second part of this guide, which will focus on AI observability at the infrastructure layer.

What is Amazon Bedrock AgentCore?

Amazon Bedrock AgentCore is a managed service that simplifies deploying and running AI agents in production. Think of it as a serverless runtime for your AI agents. You provide the agent code, and AWS handles the infrastructure, scaling, and execution environment.

Key benefits include:

  • Managed infrastructure: No need to provision servers or manage Kubernetes clusters
  • Amazon Bedrock integration: Native access to foundation models like Llama 3, Claude, and others
  • Container-based deployment: Package your agent with all dependencies using Docker
  • Enterprise-ready: Built-in security, IAM integration, and compliance features

AgentCore is particularly powerful for orchestration frameworks like CrewAI, LangGraph, or Strands, where coordinating multiple agents or complex workflows is necessary.

Why use OpenTelemetry for AI agents?

AI agents can be notoriously difficult to debug. A single user query might trigger:

  • Multiple LLM API calls
  • Tool invocations and external API requests
  • Multi-step reasoning chains
  • Retry logic and error handling

When something goes wrong (or worse, when performance silently degrades), you need visibility into every step. To address this, we recommend OpenTelemetry (OTel), the industry-standard observability framework, which provides unified instrumentation for distributed applications and infrastructure.

For AI agents specifically, OpenTelemetry helps you answer critical questions:

  • Which LLM calls are slowest?
  • How many tokens am I consuming per request?
  • Where are errors occurring in my agent workflow?
  • What’s the end-to-end latency for user requests?
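
To make this concrete, here's an illustrative sketch of the kind of span attributes an instrumented LLM call can carry, loosely following the OpenTelemetry GenAI semantic conventions. The names and values below are examples for orientation, not output copied from a real trace:

JSON
{
  "name": "chat meta.llama3-8b-instruct-v1:0",
  "attributes": {
    "gen_ai.system": "aws.bedrock",
    "gen_ai.request.model": "meta.llama3-8b-instruct-v1:0",
    "gen_ai.usage.input_tokens": 412,
    "gen_ai.usage.output_tokens": 286
  },
  "duration_ms": 1840
}

Aggregated across requests, attributes like these answer the questions above: sum token counts for cost, group durations by model for latency, and filter by span status for errors.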

And while OpenTelemetry is powerful, manually instrumenting every LLM call and agent step is tedious and error-prone. This is where OpenLit shines.

OpenLit provides automatic instrumentation for AI frameworks:

  • Zero code changes required; wrap your Python command with openlit-instrument
  • Automatically capture LLM calls (OpenAI, Anthropic, Bedrock, etc.)
  • Support for agent frameworks (CrewAI, LangChain, LlamaIndex)
  • Export OpenTelemetry-compatible data to any OTLP backend
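
The tutorial below uses the openlit-instrument CLI wrapper, but if you prefer explicit initialization, OpenLit also exposes a Python API you can call at startup. A minimal sketch (the endpoint value is a placeholder; OpenLit also picks up the standard OTEL_* environment variables if you omit these arguments):

Python
import openlit

# One call at startup auto-instruments supported LLM SDKs and agent
# frameworks, and exports traces and metrics over OTLP.
openlit.init(
    application_name="crewai_agent",  # reported as the service name
    environment="production",
    otlp_endpoint="https://your-grafana-cloud-otlp-endpoint",  # placeholder
)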

Tutorial: deploy and monitor a CrewAI agent

To illustrate how this works, let’s build a complete example: a research assistant agent powered by CrewAI and Meta’s Llama 3, deployed on AWS Bedrock AgentCore, with full observability in Grafana Cloud.

Prerequisites

Before starting, ensure you have:

  1. Python 3.12+ installed
  2. AWS CLI configured with credentials:

Bash
  aws configure

You’ll need permissions for:

   - Bedrock AgentCore
   - Amazon ECR (Elastic Container Registry)
   - Bedrock model access (specifically meta.llama3-8b-instruct-v1:0)

  3. Grafana Cloud account (If you don’t have one, you can sign up for our forever-free tier now.)
  4. AgentCore CLI installed:

Bash
  python -m venv .venv && source .venv/bin/activate
  pip install bedrock-agentcore-starter-toolkit

Step 1: Create a CrewAI agent

Let’s create an example AI agent using CrewAI:

Python
import os
from bedrock_agentcore import BedrockAgentCoreApp
from crewai import Agent, Task, Crew, Process

# Initialize AgentCore runtime
app = BedrockAgentCoreApp()

# Define a simple research assistant agent
researcher = Agent(
    role="Research Assistant",
    goal="Provide helpful, accurate answers, with concise summaries.",
    backstory=("You are a knowledgeable research assistant who answers clearly "
               "and cites facts when relevant."),
    # Use Llama 3 8B via AWS Bedrock
    llm="bedrock/meta.llama3-8b-instruct-v1:0",
    verbose=False,
    max_iter=2
)

@app.entrypoint
def invoke(payload: dict):
    """AgentCore entrypoint. Expects {'prompt': ''}"""

    user_message = payload.get("prompt", "Hello!")
    task = Task(
        description=user_message,
        agent=researcher,
        expected_output="A helpful, well-structured response."
    )

    crew = Crew(
        agents=[researcher],
        tasks=[task],
        process=Process.sequential,
        verbose=False,
    )

    result = crew.kickoff()
    return {"result": result.raw}

if __name__ == "__main__":
    app.run()

Key components:

  • BedrockAgentCoreApp: Integrates CrewAI with the Amazon Bedrock AgentCore runtime
  • Agent definition: Single agent with a research assistant role using Llama 3
  • @app.entrypoint: Decorator that marks the function as the agent’s entry point
  • Crew orchestration: CrewAI manages task execution and agent coordination

The agent accepts JSON input like {"prompt": "your question"} and returns a JSON response.
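
Before deploying, you can sanity-check the entrypoint locally. The sketch below assumes the AgentCore runtime's local defaults (port 8080 and an /invocations path); verify these against your toolkit version:

Bash
# Terminal 1: run the agent locally
python crewai_agent.py

# Terminal 2: send a test request to the local runtime
curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is OpenTelemetry?"}'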

Step 2: Configure dependencies

Create a requirements.txt that includes:

crewai>=1.0.0
openlit>=1.35
litellm
bedrock-agentcore

Step 3: Configure AgentCore deployment

Run the AgentCore configuration command:

Bash
agentcore configure \
  --deployment-type container \
  --entrypoint crewai_agent.py \
  --name crewai_agent \
  --non-interactive

This generates a .bedrock_agentcore/crewai_agent/ directory with:

  • Dockerfile: Container build configuration
  • agent_config.json: Metadata for AgentCore

Step 4: Add OpenTelemetry configuration

Now comes the observability magic. Edit the generated Dockerfile at .bedrock_agentcore/crewai_agent/Dockerfile and add these environment variables:

dockerfile
# Disable AWS ADOT observability to use OpenLIT exclusively
ENV DISABLE_ADOT_OBSERVABILITY="true"

# OpenTelemetry configuration for Grafana Cloud
ENV OTEL_SERVICE_NAME="my_service"
ENV OTEL_DEPLOYMENT_ENVIRONMENT="my_environment"
ENV OTEL_EXPORTER_OTLP_ENDPOINT="your_grafana_cloud_otlp_endpoint"
ENV OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic%20your_base64_encoded_token"

Important: Replace the OTLP endpoint and headers with your Grafana Cloud credentials:

  1. Sign in to the Grafana Cloud portal and select your Grafana Cloud stack.
  2. Click Configure in the OpenTelemetry section.
  3. In the Password / API Token section, click Generate to create a new API token.
  4. Give the API token a name.
  5. Click Create token.
  6. Click Close without copying the token.
  7. Copy the generated values into the OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS ENV lines in the Dockerfile (see the sketch below for the header format).
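
For reference, the basic-auth header value is your stack's instance ID and the API token joined by a colon and base64-encoded. A quick sketch, where the instance ID and token are placeholders:

Bash
# Illustrative only: substitute your own instance ID and API token
echo -n "123456:glc_your_api_token" | base64
# Use the output in the Dockerfile ENV:
# OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic%20<output>"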

For more information, refer to our guide on manually setting up OpenTelemetry for Grafana Cloud.

Next, ensure the CMD line in the Dockerfile uses OpenLit’s instrumentation wrapper:

dockerfile
# Use OpenLit to automatically instrument the agent
CMD ["openlit-instrument", "python", "-m", "crewai_agent"]

What’s happening here?

  • openlit-instrument wraps your Python command
  • At runtime, OpenLit automatically monitors the CrewAI agent operations
  • Every LLM request and agent task is traced and exported to Grafana via OTLP

Step 5: Build and deploy

Build the Docker image and deploy to AgentCore:

Bash
agentcore launch --local-build

This command will:

  1. Build the Docker image locally with all dependencies
  2. Push the image to Amazon ECR
  3. Deploy the agent to Bedrock AgentCore
  4. Set up IAM execution roles
  5. Configure the runtime environment

The deployment process takes two to five minutes. You’ll see output like:

✓ Pushing to ECR...
✓ Deploying to AgentCore...
✓ Agent deployed successfully!
Agent ID: agt_abc123xyz
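
If you want to confirm the agent is up before invoking it, the starter toolkit includes a status command (verify the exact name with agentcore --help on your version):

Bash
# Show the deployed agent's runtime status and endpoint details
agentcore status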

Step 6: Invoke the agent

Test your deployed agent:

Bash
# Simple test
agentcore invoke '{"prompt": "hi"}'

# Research query
agentcore invoke '{"prompt": "Explain AI Observability"}'

# Complex request
agentcore invoke '{"prompt": "Compare supervised and unsupervised learning with examples"}'

Response:

JSON
{
  "result": "AI observability is the practice of monitoring the behavior, performance, and cost of AI applications..."
}

Step 7: Explore Grafana Cloud AI Observability

Once you have telemetry flowing from your CrewAI agent on AgentCore to Grafana Cloud, you can use the pre-built dashboards from Grafana Cloud AI Observability.

Navigate to Connections, search for AI Observability and open it, then go to GenAI Observability, scroll down, and install the dashboards.

Here’s a breakdown of what you can see in the dashboards:

  • End-to-end latency: Total time from request to response
  • LLM call details: Which model, how many tokens, latency, cost
  • Agent workflow: Task creation, execution, and response formatting
  • Error traces: If something fails, you’ll see the exact step and error message
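
Beyond the pre-built dashboards, you can slice the same telemetry yourself with TraceQL in Grafana's Explore view. For example, a query along these lines surfaces failed spans for a specific model (the attribute name follows the OpenTelemetry GenAI semantic conventions; check the attributes your OpenLit version actually emits):

TraceQL
{ span.gen_ai.request.model = "meta.llama3-8b-instruct-v1:0" && status = error }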

Next steps

The combination of AWS Bedrock AgentCore, OpenTelemetry, and Grafana Cloud provides a production-ready stack for AI agents with enterprise-grade observability. Explore our Grafana Cloud AI Observability documentation to learn more.

Grafana Cloud is the easiest way to get started with metrics, logs, traces, dashboards, and more. We have a generous forever-free tier and plans for every use case. Sign up for free now!