Requirements and expectations

Before you put Grafana Alloy into production, it helps to have a clear picture of where it runs well, how it’s usually deployed, and where people most often get surprised.

Most users want answers to a few basic questions before their first deployment:

  • Will Alloy run in my environment?
  • How should I deploy it the first time?
  • What kinds of constraints or trade-offs should I expect?

The guidance here focuses on the common, supported paths that work well for most users, without diving into every possible edge case.

Design expectations

Grafana Alloy makes telemetry collection explicit and predictable, even when that means exposing trade-offs that other tools try to hide.

A few design choices are worth keeping in mind:

  • Alloy favors explicit configuration over implicit behavior. You define pipelines, routing, and scaling decisions in configuration rather than relying on automatic inference.
  • Alloy exposes deployment and scaling choices instead of masking them. Changes in topology—such as switching from a DaemonSet to a centralized deployment—can affect behavior, and those effects are intentional and visible.
  • Alloy consolidates multiple collectors, but it doesn’t replicate every default or assumption from the collectors it replaces. Similar concepts can behave differently when the underlying goals differ.
  • Alloy prioritizes predictability over “magic” defaults. Understanding how components connect and how work distributes is part of operating Alloy successfully.

Keeping these expectations in mind makes it easier to reason about configuration changes, scaling decisions, and observed behavior in production.

Supported platforms

Alloy runs on the following platforms:

  • Linux
  • Windows
  • macOS
  • FreeBSD

For supported architectures and version requirements, refer to Supported platforms.

For setup instructions, refer to Set up Alloy.

Network requirements

Alloy requires network access for its HTTP server and for sending data to backends.

HTTP server

Alloy runs an HTTP server for its UI, API, and metrics endpoints. By default, it listens on 127.0.0.1:12345.

For more information, refer to HTTP endpoints.
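
If the UI, API, or metrics endpoint needs to be reachable from other machines, change the bind address when you start Alloy. A minimal sketch, assuming a configuration file at /etc/alloy/config.alloy:

```shell
# Listen on all interfaces instead of the default 127.0.0.1
alloy run /etc/alloy/config.alloy --server.http.listen-addr=0.0.0.0:12345
```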

Outbound connectivity

Alloy needs outbound network access to send telemetry to your backends. Ensure firewall and egress rules allow connections to:

  • Remote write or OTLP endpoints for metrics, such as Mimir, Prometheus, or Thanos
  • Log ingestion endpoints, such as Loki, Elasticsearch, or OTLP-compatible backends
  • Trace ingestion endpoints, such as Tempo, Jaeger, or OTLP-compatible backends
  • Profile ingestion endpoints, such as Pyroscope
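
For example, a pair of write components like the following covers outbound connectivity for a simple metrics-and-logs pipeline. The URLs are placeholders; the hosts and ports in your own endpoint URLs are what your egress rules must allow:

```alloy
// Metrics: remote write to a Prometheus-compatible backend (example URL).
prometheus.remote_write "default" {
  endpoint {
    url = "https://mimir.example.com/api/v1/push"
  }
}

// Logs: push to a Loki-compatible backend (example URL).
loki.write "default" {
  endpoint {
    url = "https://loki.example.com/loki/api/v1/push"
  }
}
```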

Cluster communication

When you enable clustering, Alloy nodes communicate over HTTP/2 using the same HTTP server port. Each node must be reachable by other cluster members on the configured listen address.
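
For example, a clustered node started like the following listens on all interfaces so that peers can reach it on the same port the UI and API use. The configuration path and peer addresses are placeholders:

```shell
alloy run /etc/alloy/config.alloy \
  --server.http.listen-addr=0.0.0.0:12345 \
  --cluster.enabled=true \
  --cluster.join-addresses=alloy-0.example.internal:12345,alloy-1.example.internal:12345
```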

Permissions and access

Some Alloy components interact closely with the host, container runtime, or Kubernetes APIs. When that happens, Alloy needs enough access to complete the work.

This requirement most often comes up when collecting:

  • Host-level metrics, logs, traces, or profiles
  • Container or runtime information
  • Data that lives outside the application sandbox

Not every component can run in a fully locked-down environment. When Alloy runs with restricted permissions, certain components might fail or behave unexpectedly.
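
For example, a host metrics pipeline along these lines reads /proc and /sys, so the Alloy process needs those filesystems visible and readable; in a container you typically mount them from the host. The wiring is illustrative and assumes a prometheus.remote_write component labeled default:

```alloy
// Collects node-level metrics by reading /proc and /sys on the host.
prometheus.exporter.unix "host" { }

// Scrapes the exporter and forwards the samples to the configured backend.
prometheus.scrape "host_metrics" {
  targets    = prometheus.exporter.unix.host.targets
  forward_to = [prometheus.remote_write.default.receiver]
}
```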

For information about running as a non-root user, refer to Run as a non-root user.

When you enable a component, check its documented requirements first. Refer to the component reference for component-specific constraints and limitations.

Security

Alloy supports TLS for secure communication. Configure TLS in the tls_config or tls blocks of the components that connect to your backends, or use the --cluster.enable-tls flag for traffic between cluster members. Authentication methods such as basic authentication, OAuth 2.0, and bearer tokens are configured per component.
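
As a sketch, with placeholder URL, credentials, and CA path, a Prometheus remote write endpoint with TLS and basic authentication looks like the following. OpenTelemetry (otelcol) components configure TLS in a tls block inside their client block instead:

```alloy
prometheus.remote_write "secure" {
  endpoint {
    url = "https://mimir.example.com/api/v1/push"

    // Basic authentication; the password is read from the environment.
    basic_auth {
      username = "tenant-1"
      password = sys.env("REMOTE_WRITE_PASSWORD")
    }

    // TLS settings for the connection to the backend.
    tls_config {
      ca_file = "/etc/alloy/ca.crt"
    }
  }
}
```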

Secrets management

Store sensitive values like API keys and passwords outside your configuration files. Alloy supports environment variable references and integrations such as HashiCorp Vault, Kubernetes Secrets, AWS S3, and local files.
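
For example, a token kept in a file that your secrets tooling mounts can be loaded as a secret and referenced from another component, so the value never appears in the configuration itself. The file path and endpoint URL are placeholders:

```alloy
// Load the token from a file and mark it as a secret so it is never displayed in the UI.
local.file "api_token" {
  filename  = "/var/run/secrets/alloy/api_token"
  is_secret = true
}

prometheus.remote_write "authenticated" {
  endpoint {
    url          = "https://prometheus.example.com/api/v1/write"
    bearer_token = local.file.api_token.content
  }
}
```

Components such as remote.kubernetes.secret and remote.vault follow the same pattern: they expose values that other components reference.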

Refer to the component documentation for specific options.

Deployment patterns

Alloy supports edge, gateway, and hybrid deployment patterns. Refer to How Alloy works for guidance on choosing the right pattern for your architecture.

For detailed setup instructions, refer to Deploy Alloy.

Clustering and scaling behavior

Some Alloy behavior depends on how you deploy it, not just on configuration.

Alloy supports clustering to distribute work across multiple instances. Clustering uses a gossip protocol and consistent hashing to distribute scrape targets automatically.

Note

Target auto-distribution requires enabling clustering at both the instance level and the component level. Refer to Clustering for configuration details.
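
As a sketch, assuming Kubernetes pod discovery and a prometheus.remote_write component labeled default: the instance is started with --cluster.enabled=true, and each scrape component opts in through its clustering block:

```alloy
discovery.kubernetes "pods" {
  role = "pod"
}

prometheus.scrape "pods" {
  targets    = discovery.kubernetes.pods.targets
  forward_to = [prometheus.remote_write.default.receiver]

  // Component-level opt-in: scrape targets are split across cluster members.
  clustering {
    enabled = true
  }
}
```

Components without a clustering block keep their full target list on every instance.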

A few things that often surprise users:

  • More Alloy instances means more meta-monitoring metrics.
  • A switch between DaemonSet and centralized deployments can change observed series counts.
  • Scaling clustered collectors changes how targets distribute, even when the target list stays the same.

For resource planning guidance, refer to Estimate resource usage.

Data durability

Alloy uses a Write-Ahead Log (WAL) for metrics to handle temporary backend outages. The WAL buffers data locally and retries sending when the backend becomes available.

For the WAL to persist across restarts, configure persistent storage using the --storage.path flag.

Note

Without persistent storage, Alloy loses buffered data on restart. By default, Alloy stores data in a temporary directory.
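
For example, starting Alloy with a dedicated data directory keeps the WAL across restarts. The path is an assumption; the Linux packages typically use /var/lib/alloy/data:

```shell
alloy run /etc/alloy/config.alloy --storage.path=/var/lib/alloy/data
```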

Push-based pipelines for logs, traces, and profiles have different durability characteristics. Refer to component documentation for more information.

Monitor Alloy

Alloy exposes metrics about its own health and performance at the /metrics endpoint.

Key monitoring capabilities:

  • Internal metrics: Controller and component metrics in Prometheus format
  • Health endpoints: /-/ready and /-/healthy for load balancer checks
  • Debugging UI: Visual component graph and live debugging at /
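
As a starting point, assuming the default listen address and a prometheus.remote_write component labeled default, Alloy can scrape its own /metrics endpoint and ship the results alongside the rest of your metrics:

```alloy
// Scrape Alloy's own HTTP server; /metrics is the default metrics path.
prometheus.scrape "alloy_self" {
  targets    = [{"__address__" = "127.0.0.1:12345"}]
  forward_to = [prometheus.remote_write.default.receiver]
}
```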

Refer to Set up meta-monitoring for configuration examples.

Component capabilities

Each Alloy component has its own capabilities and limits. Before you rely on a component in production, check:

  • Which signal types it accepts and emits: metrics, logs, traces, and profiles
  • Whether the component is stable or still evolving
  • Whether it’s a native Alloy component or wraps upstream OpenTelemetry Collector functionality

Refer to the component reference for this information.

Troubleshoot issues

If something doesn’t behave as expected after deployment:

  1. Review Troubleshooting and debugging.
  2. Check the component documentation.
  3. Revisit deployment patterns and clustering assumptions.

Next steps