Note

Fleet Management is currently in public preview. Grafana Labs offers limited support, and breaking changes might occur prior to the feature being made generally available. For bug reports or questions, fill out our feedback form.

Introduction to Fleet Management

Grafana Fleet Management is a Grafana Cloud feature that enables you to manage collector deployments at scale.

Streamlined configuration management. With a central control hub, you can easily manage configurations across all collectors. Create, activate, deactivate, or adjust pipelines remotely without complex scripts or time-consuming change management processes, all from the Grafana Cloud interface.

Centralized health monitoring. You have complete visibility into the health of your collector fleet. View out-of-the-box dashboards that track metrics for component health, system resources, and deployment infrastructure. Troubleshoot more efficiently and identify issues with individual collectors more quickly.

Cost and collection control. Tailor data collection pipelines to match specific use cases and ensure observability data stays focused on what’s most important. Turn data streams on and off as needed to optimize your costs and reduce data overload.

How it works

Fleet Management is built on the concepts of collectors and configuration pipelines.

Collectors

Collectors are individual Grafana Alloy deployments registered in the Fleet Management service using a remotecfg code block in their local configuration files. Once a collector is registered and running, it appears in the Fleet Management interface, where you can assign it a configuration pipeline and monitor its health. Here is an example of a remotecfg block:

alloy
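// Registers this collector with the Fleet Management service.
// Replace each <PLACEHOLDER> with a quoted string value.
remotecfg {
    url = <SERVICE_URL>
    basic_auth {
        username      = <USERNAME>
        password_file = <PASSWORD_FILE>
    }

    // Identifier for this collector in Fleet Management.
    id             = constants.hostname
    // Attributes that configuration pipelines are matched against.
    attributes     = {"cluster" = "dev", "namespace" = "otlp-dev"}
    // How often the collector polls for remote configuration.
    poll_frequency = "5m"
}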

Configuration pipelines

Configuration pipelines are standalone configuration blocks that are created and assigned remotely to collectors in Fleet Management. Configuration pipelines are composed of a unique name, the components to be loaded and run, and a list of attributes that match collectors with the pipeline. Refer to Create and assign configuration pipelines for more details.

The remote configuration assigned to a collector is made up of one or more configuration pipelines and is determined at runtime. A registered collector polls the Collector API at the interval set by poll_frequency in its remotecfg block, looking for configuration pipelines whose matching attributes fit the collector's attributes. Collectors run their local configuration and remote configuration in parallel. Components loaded by local and remote configurations are isolated from one another to avoid conflicts.

On the collector side, the last loaded configuration is cached in a directory in the collector’s data path. By default, this path is data-alloy/remotecfg/. The cached configuration is used as a fallback to ensure the collector continues working with the last valid configuration in case the connection to the API fails.
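The fallback behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Alloy's implementation: the fetch callable stands in for a request to the Collector API, and the cache file name config.alloy is hypothetical (the source only states that the cache lives in a directory under the data path).

```python
from pathlib import Path
from typing import Callable, Optional


def load_remote_config(fetch: Callable[[], str], cache_file: Path) -> Optional[str]:
    """Return the newest remote configuration, or the cached copy on failure.

    `fetch` is a hypothetical stand-in for the call to the Collector API.
    It is expected to raise OSError (for example ConnectionError) when the
    API cannot be reached.
    """
    try:
        config = fetch()
    except OSError:
        # API unreachable: keep working with the last valid configuration.
        return cache_file.read_text() if cache_file.exists() else None

    # Fetch succeeded: refresh the cache so it can serve as the next fallback.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(config)
    return config
```

A caller would invoke this function once per poll interval, for example with cache_file set to Path("data-alloy/remotecfg/config.alloy").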

Collector health status

In addition to remote configuration capabilities, Fleet Management helps you manage your fleet by reporting the health of your registered collectors. The health status is based on three factors:

  • Has the collector made a GetConfig API request recently?
  • Is the collector reporting an up metric with the collector_id label?
    • The label’s value must match the id from the collector’s remotecfg block.
  • Does the collector have any active alerts?
    • If there are critical alerts, the collector is marked as unhealthy with a red warning icon.
    • If there are alerts, but they are non-critical, the collector is marked with a yellow warning icon.
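As a rough mental model, the three factors above could be combined as in the following Python sketch. Only the red and yellow outcomes come from the list above. The "healthy" and "unknown" labels, the 15-minute freshness window, the severity value "critical", and the order of the checks are assumptions made for illustration, not documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class CollectorState:
    last_get_config: datetime    # time of the most recent GetConfig request
    reports_up_metric: bool      # an `up` series with a matching collector_id exists
    alert_severities: List[str]  # severity labels of the collector's active alerts


def health_status(state: CollectorState, now: datetime,
                  max_age: timedelta = timedelta(minutes=15)) -> str:
    """Combine the three health factors into one status (illustrative only)."""
    if "critical" in state.alert_severities:
        return "red"      # critical alerts mark the collector as unhealthy
    if state.alert_severities:
        return "yellow"   # alerts exist, but none are critical
    recently_polled = now - state.last_get_config <= max_age  # assumed window
    if recently_polled and state.reports_up_metric:
        return "healthy"  # assumed label
    return "unknown"      # assumed label


now = datetime.now(timezone.utc)
state = CollectorState(now - timedelta(minutes=2), True, ["warning"])
print(health_status(state, now))
```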

The Fleet Management service fetches active alerts from the Grafana Prometheus instance. Alerts must exist in your stack’s Prometheus Alertmanager to be discoverable by the health status check. Refer to the Alertmanager documentation for tips on using Mimirtool to configure Alertmanager.

When setting up alerts, make sure the following labels are set so that the alerts are properly linked in Fleet Management:

  • collector_id: The value of this label must match the id argument in the collector’s remotecfg block.
  • severity: The Fleet Management service uses the value of this label to assign a yellow or red health status indicator.
  • source: The value of this label must be set to "fleet-management".
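Before loading alert rules, it can help to check that each rule carries the three labels listed above. The following Python helper is a hypothetical pre-flight check, not part of any Grafana tooling. It takes an alert's labels as a dictionary along with the id from the collector's remotecfg block.

```python
from typing import Dict, List


def missing_fleet_labels(labels: Dict[str, str], collector_id: str) -> List[str]:
    """Return a list of problems that would keep an alert from being linked."""
    problems = []
    if labels.get("collector_id") != collector_id:
        problems.append("collector_id must match the id in the remotecfg block")
    if not labels.get("severity"):
        problems.append("severity must be set")
    if labels.get("source") != "fleet-management":
        problems.append('source must be "fleet-management"')
    return problems


labels = {"collector_id": "host-1", "severity": "critical", "source": "fleet-management"}
print(missing_fleet_labels(labels, "host-1"))
```

An empty list means all three labels are present and consistent.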

Key terms and concepts

Familiarize yourself with the terminology and concepts of Fleet Management.

Collector
Software that pulls or receives telemetry from your applications or infrastructure, processes the data, and sends it to a backend for analysis. Collectors are made up of components defined in a configuration file. Fleet Management works only with Alloy collectors.
Fleet
Aggregate term for multiple collector instances.
Inventory
The list view of your collector fleet.
remotecfg
A block of configuration added to a collector’s local configuration file that registers the collector with the Fleet Management service. Several arguments make up the remotecfg block, including one for attributes. For the complete list of supported arguments, refer to the remotecfg documentation.
Configuration pipeline
A standalone piece of configuration created in Fleet Management and remotely assigned to collectors based on attribute matching. Configuration pipelines are made up of a unique name, a content body, and matching attributes.
Remote configuration
The set of one or more configuration pipelines matched to a collector’s attributes. Remote configurations are run in parallel with a collector’s local configuration.
Custom attribute
Also known as an attribute override or a user-defined attribute. A key-value pair created by users and assigned to collectors through the Fleet Management user interface. Custom attributes are not unique identifiers. The same attribute can be assigned to multiple collectors. Custom attributes cannot begin with collector., which is a reserved namespace. Custom attributes take precedence over system attributes.
System attribute
A key-value pair that is either set by the user in the remotecfg block of a collector’s local configuration or created automatically by Fleet Management. The automatically created attributes are collector.os and collector.version. Users cannot create system attributes beginning with collector. because it is a reserved namespace. System attributes are not unique identifiers. The same attribute can be assigned to multiple collectors. Custom attributes set in the Fleet Management interface take precedence over system attributes.
Matching attribute
Also known as a matcher. Inspired by Alertmanager, this conditional determines which configuration pipelines are applied at collector runtime. A matching attribute has three parts:
  • An attribute name formatted as an unquoted literal or a double-quoted string.
  • An operator. There are four possibilities:
    • =: is equal to
    • !=: is not equal to
    • =~: matches the regular expression
    • !~: does not match the regular expression
  • An attribute value or regular expression formatted as an unquoted literal or a double-quoted string.

Note

Unquoted literals can contain all UTF-8 characters other than whitespace and the following reserved characters: {, }, !, =, ~, ", ', \.
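The rule in the note above can be expressed as a small check. This Python sketch uses exactly the reserved characters listed in the note. Treating an empty string as invalid is an assumption, and the function name is hypothetical.

```python
# Reserved characters from the note: { } ! = ~ " ' \
RESERVED = set('{}!=~"\'\\')


def is_unquoted_literal(text: str) -> bool:
    """Return True if `text` could be written without double quotes."""
    if not text:
        return False  # assumption: an empty value must be quoted
    return not any(ch.isspace() or ch in RESERVED for ch in text)


for candidate in ["windows", "prod-us-east-0", "team one", "a=b"]:
    print(candidate, is_unquoted_literal(candidate))
```

Values that fail this check need to be written as double-quoted strings.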

For example, the matching attribute collector.os!=windows applies a configuration pipeline to all collectors where the operating system is not Windows. Here are some additional examples of matching attributes:

  • namespace=~"dev|staging"
  • team!~"team-.*"
  • profiling_enabled=true
  • arch!=amd64
  • collector.version=~"v1.2.0|v1.3.0"
  • collector.os=~".+"

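The last matching attribute, collector.os=~".+", is a useful regular expression that matches any non-empty value. Because collector.os is created automatically for each collector, this matcher selects every collector.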

Matching attributes use AND logic, so all attributes must match for a collector to use a configuration pipeline. For example, if a configuration pipeline has matching attributes ["os=linux", "env=prod", "cluster=prod-us-east-0"], collectors must have os=linux AND env=prod AND cluster=prod-us-east-0 attributes to receive this pipeline. A collector can carry additional attributes that no pipeline references without affecting the selection process. The collector-pipeline pairing is determined only by the matching attributes assigned in a configuration pipeline.
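The selection logic described above can be sketched in Python. This is a simplified model, not the Fleet Management implementation. It assumes that regular expressions must match the whole value, that a missing attribute behaves like an empty string (both conventions borrowed from Alertmanager-style matchers), and that escape sequences inside quoted strings can be ignored. All function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# name, operator, value; name and value may each be wrapped in double quotes.
_MATCHER = re.compile(r'\s*("?)([^"!=~\s]+)\1\s*(=~|!~|!=|=)\s*("?)(.*?)\4\s*')


def parse_matcher(text: str) -> Tuple[str, str, str]:
    """Split a matcher such as 'env=prod' into (name, operator, value)."""
    m = _MATCHER.fullmatch(text)
    if m is None:
        raise ValueError(f"invalid matching attribute: {text!r}")
    return m.group(2), m.group(3), m.group(5)


def matches(matcher: str, attributes: Dict[str, str]) -> bool:
    """Evaluate a single matcher against a collector's attributes."""
    name, op, value = parse_matcher(matcher)
    actual = attributes.get(name, "")  # assumption: missing means empty
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    hit = re.fullmatch(value, actual) is not None  # assumption: anchored regex
    return hit if op == "=~" else not hit


def pipeline_applies(matchers: List[str], attributes: Dict[str, str]) -> bool:
    """AND logic: every matcher must hold for the pipeline to be assigned."""
    return all(matches(m, attributes) for m in matchers)


def effective_attributes(system: Dict[str, str], custom: Dict[str, str]) -> Dict[str, str]:
    """Custom attributes take precedence over system attributes."""
    return {**system, **custom}


attrs = effective_attributes(
    {"os": "linux", "env": "dev", "cluster": "prod-us-east-0"},  # from remotecfg
    {"env": "prod"},                                             # set in the UI
)
print(pipeline_applies(["os=linux", "env=prod", "cluster=prod-us-east-0"], attrs))
```

In this example the custom attribute env=prod overrides the system value env=dev, so all three matchers hold and the pipeline is assigned.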

Key features

Fleet Management offers these key features to help teams manage large numbers of collector instances.

Inventory

From the Inventory tab, you can see a list of your registered collectors. The collector list shows the operational status, Alloy version, operating system, and attributes for each collector. You can change which columns are displayed in the list and how they’re ordered. Filter the list by attribute or run a search for any of the other list parameters. Select multiple collectors at once to bulk edit custom attributes.

Click on any collector row to see a details view. From here, you can see vital information about your collector, including health metrics, internal logs, alerts, and its assigned attributes and remote configuration.

Remote configuration

The Remote configuration tab is where you can create, edit, activate, and delete configuration pipelines. Create a new configuration pipeline using the Fleet Management template. Build your pipeline, test the pipeline syntax, and then add matching attributes to assign the configuration pipeline to collectors. To learn more, refer to Create and assign configuration pipelines.

You can also view, search, and edit your existing configuration pipelines and their matching attributes. Activate or deactivate a pipeline with a click of the switch.