Streamline your workflows
Follow these tips and strategies for using Grafana Fleet Management to streamline your remote configuration workflows.
Increase efficiency with reusable configuration pipelines
Save time and effort by reusing configuration pipelines across your fleet.
Set environment variables for the varying properties and create a pipeline that reads those variables using the Alloy configuration function sys.env.
Use collector attributes to customize how pipelines are applied.
Configuration properties with different values
Configuring a fleet of collectors can be cumbersome, especially when configuration contexts, such as host credentials or scrape targets, differ from host to host. Rather than creating multiple versions of the same pipeline, you can use environment variables to reuse one standard configuration pipeline across deployments in Fleet Management.
Example: Scrape different targets
For example, if you want to collect metrics on all deployments but you need to scrape different targets, set a target environment variable on each host, refer to the variable in the pipeline, and then roll out the pipeline to your collectors:
prometheus.scrape "example" {
  targets = [
    {"__address__" = sys.env("TARGET_LOCATION")},
  ]
  forward_to = [prometheus.remote_write.staging.receiver]
}
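If some hosts might not define the variable, a fallback value can keep the scrape target valid. The following is a minimal sketch, assuming that sys.env returns an empty string for an unset variable and that the Alloy standard library function coalesce skips empty strings. The fallback address localhost:9100 is a hypothetical placeholder, not a Fleet Management default.

```alloy
prometheus.scrape "example" {
  targets = [
    // Use TARGET_LOCATION when it is set; otherwise fall back to a
    // hypothetical local address.
    {"__address__" = coalesce(sys.env("TARGET_LOCATION"), "localhost:9100")},
  ]
  forward_to = [prometheus.remote_write.staging.receiver]
}
```

Whether a silent fallback is desirable depends on your setup. In some fleets, a failed scrape is a more useful signal of a misconfigured host than metrics from an unintended target.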
Telemetry labels based on collector attributes
When you assign attributes to collectors, you categorize them based on meaningful characteristics, such as team ownership or environment. You might also want to persist those attributes in the labels you apply to your collected data. Rather than creating a new pipeline for each set of labels you want to apply, you can reuse a single pipeline that reads environment variables.
Example: Add telemetry labels for owner and environment
For example, if you’ve assigned attributes to collectors for owner and env and want to export metrics with corresponding labels, set environment variables, such as TEAM_OWNER and ENV, on your hosts, refer to the variables in the pipeline, and then roll it out to your collectors.
discovery.relabel "example" {
  targets = prometheus.exporter.self.example.targets

  rule {
    target_label = "owner"
    replacement  = sys.env("TEAM_OWNER")
  }

  rule {
    target_label = "env"
    replacement  = sys.env("ENV")
  }
}
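The relabeling component above only rewrites targets, so it needs an exporter to supply them and a scrape component to consume its output. The following is a minimal sketch of that wiring. It assumes the prometheus.remote_write component labeled staging from the earlier scrape example already exists in your configuration.

```alloy
// Expose the collector's own metrics as scrape targets.
prometheus.exporter.self "example" { }

// Scrape the relabeled targets so that the owner and env labels
// are attached to every collected series.
prometheus.scrape "example" {
  targets    = discovery.relabel.example.output
  forward_to = [prometheus.remote_write.staging.receiver]
}
```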
Customized pipelines by application workload or technology
Match configuration pipelines to your collectors based on the applications or technologies that are running on the host. Add an attribute to each collector for every workload or technology it hosts. If you have standard telemetry that needs to be collected from all hosts, such as server or database monitoring, assign a universal attribute to every collector and create a single default configuration pipeline that matches it. If you remove an application from a host, remove the attribute from that host’s collector to disable the pipeline for that instance.
Example: Apply the Node.js integration to hosts running your application
Collect metrics from all hosts where your Node.js application is running.
- Add the attribute application=MYAPP to each collector where the application is hosted. You can add a local attribute directly to the remotecfg block of the local configuration or add a remote attribute in the Fleet Management application or with an API request.
- Follow the instructions to create a new configuration pipeline using an integration template and select the Node.js integration.
- Add the matching attribute application=MYAPP to the pipeline.
- Activate the pipeline to begin collecting telemetry from your Node.js applications.
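For the local attribute option in the first step, the remotecfg block might look like the following minimal sketch. The environment variable names FLEET_MANAGEMENT_URL, FLEET_MANAGEMENT_USER, and FLEET_MANAGEMENT_TOKEN are hypothetical; substitute whatever your hosts use to supply the endpoint and credentials.

```alloy
remotecfg {
  // Hypothetical environment variable holding the Fleet Management URL.
  url = sys.env("FLEET_MANAGEMENT_URL")
  id  = constants.hostname

  // Local attribute that the pipeline's matching attribute selects.
  attributes = {"application" = "MYAPP"}

  basic_auth {
    // Hypothetical environment variables holding the credentials.
    username = sys.env("FLEET_MANAGEMENT_USER")
    password = sys.env("FLEET_MANAGEMENT_TOKEN")
  }
}
```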
Example: Apply the MySQL integration to every collector
If all of your machines are running MySQL, use a default configuration pipeline to collect metrics and logs from them.
- Add the attribute default=MYSQL to every collector by selecting all collectors in the Fleet Management application and clicking the bulk edit tool or by making a BulkUpdateCollectorsRequest to the Collector API.
- Follow the instructions to create a new configuration pipeline using an integration template and select the MySQL integration.
- Review the configuration and set up any parameters required. For MySQL, the default configuration expects the connection string to be available in a file at /var/lib/alloy/mysql-secret.
- Add the matching attribute default=MYSQL to the pipeline.
- Activate the pipeline to begin collecting telemetry from your MySQL databases.
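To illustrate how a pipeline can consume the connection string file mentioned in the steps above, here is a minimal sketch using the local.file and prometheus.exporter.mysql components. The component labels are assumptions, and the configuration generated by the integration template may differ.

```alloy
// Read the connection string from the file the integration expects.
local.file "mysql_secret" {
  filename  = "/var/lib/alloy/mysql-secret"
  is_secret = true
}

// Pass the secret connection string to the MySQL exporter.
prometheus.exporter.mysql "default" {
  data_source_name = local.file.mysql_secret.content
}
```

Because the pipeline reads the file at run time, each host can keep its own connection string while the whole fleet shares one pipeline.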
Minimize risk by deploying configuration pipelines in stages
Configuration pipelines are assigned to collectors using matching attributes. Ensure stability in your production environments with staged releases that leverage these attributes.
In the Fleet Management interface, begin by assigning attributes to your collectors based on their deployment characteristics, such as env=PROD, test=GROUP-A, or deploy=BLUE.
You can also automate the assignment of remote attributes with calls to the Collector API.
Once your collectors are categorized, create a configuration pipeline and assign matching attributes, either using the Fleet Management application or the Pipeline API.
Matching attributes are combined with an AND operator, so a pipeline is applied only to collectors that satisfy all of its matching attributes.
Example: Release a new configuration pipeline
For example, you can use a gradual rollout to test a new configuration pipeline:
- Add an env remote attribute with a value of dev, staging, prod-eu, or prod-us to each collector that should receive the pipeline.
- While creating the new pipeline, add a matching attribute using a regular expression that matches your dev collectors: env=~dev.
- When you’re satisfied with the pipeline’s performance, modify the matching attribute to include the staging environment: env=~dev|staging.
- If the pipeline is ready for production, add prod-eu to the matching attribute: env=~dev|staging|prod-eu.
- Confirm there are no issues and then add prod-us to the matching attribute so that the pipeline is deployed across all your environments and production instances: env=~dev|staging|prod-eu|prod-us.
If the pipeline causes a problem at any point during the rollout, deactivate it with a click of the switch in the Remote configuration tab in the Fleet Management application.
Example: Create a new version of a configuration pipeline
With the Grafana Terraform provider or the Pipeline API, you can integrate version-controlled configuration pipelines with Fleet Management. If GitOps is not part of your current observability setup, you can still maintain a version history when testing and rolling out new versions of existing configuration pipelines.
- Create a copy of the current pipeline and name the copy with its version number (for example, integration_linux_node_metrics_v1_3).
- Keep the original pipeline running everywhere while you deploy the new version to dev or staging environments using matching attributes.
- If you’re satisfied with the new version, add the matching attributes to deploy it to the rest of your environments.
- Deactivate the original pipeline by clicking the switch in the Fleet Management application or setting enabled to false in your API request.
If at any point the new version of the pipeline causes problems, you can deactivate it and reactivate the original pipeline.
Other configuration deployment patterns
Matching attributes can be used for other types of configuration pipeline deployments:
- A/B testing. To conduct A/B testing of different configuration pipelines, assign remote attributes to collectors based on which version of the pipeline they should receive (for example, test=GROUP-A and test=GROUP-B) and then add the matching attribute to the corresponding pipeline. Label the collected telemetry by group as well and then evaluate the performance of each pipeline.
- Canary deployments. Implement a canary deployment by assigning a meaningful attribute to each group of collectors and then matching the new version of the pipeline to each group in succession. Once you’re satisfied with the pipeline’s performance on one group, extend the regular expression in the matching attribute to include the next group of collectors, and so on. Refer to the staged rollout example for a sample regular expression.
- Blue-green deployments. You can also assign attributes based on blue-green environments, for example deploy=BLUE and deploy=GREEN, and then add matching attributes to control which configuration pipelines are assigned to each environment.
Reduce costs with data on demand
High-verbosity telemetry, such as info and debug logs or continuous profiles, can become expensive if you’re collecting it all the time.
But when there’s an incident, you might need these signals.
With Fleet Management, you can create configuration pipelines to collect this telemetry but keep the pipelines disabled until you need them.
If you’re sampling traces from your collectors, you can also increase the sampling percentage remotely when it’s time to debug an issue.
Example: Collect on-demand profiles to diagnose excessive resource usage
For example, you can use on-demand continuous profiles to find the cause of excessive resource usage:
- Create a profiling configuration pipeline using the Fleet Management application or the Pipeline API, but leave the pipeline disabled by turning off the UI switch or setting the enabled key to false in the API call.
- When you notice an issue with high resource consumption, enable the profiling pipeline and add matching attributes so that it applies to the correct collectors.
- After finding the offending code, disable the profiling pipeline, remove the matching attributes, and leave the pipeline ready for the next time you need to troubleshoot an incident.
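The contents of such a profiling pipeline depend on your services. As a minimal sketch, assuming a service that exposes pprof endpoints, the pipeline might pair pyroscope.scrape with pyroscope.write. The target address, the service name, and the PYROSCOPE_URL environment variable are all hypothetical.

```alloy
// Destination for collected profiles. The URL comes from a
// hypothetical environment variable set on each host.
pyroscope.write "on_demand" {
  endpoint {
    url = sys.env("PYROSCOPE_URL")
  }
}

// Scrape pprof endpoints from a hypothetical local service.
pyroscope.scrape "on_demand" {
  targets = [
    {"__address__" = "localhost:6060", "service_name" = "my-service"},
  ]
  forward_to = [pyroscope.write.on_demand.receiver]
}
```

While the pipeline is disabled, none of these components run on the collector, so no profiles are collected until you enable it.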
Example: Automate data collection during an incident response
Consider automating higher-resolution data collection as part of your incident response:
- Create configuration pipelines that collect the “must gather” data specified in your runbooks, and leave them disabled.
- When an incident is declared, the process triggers an API call that enables the necessary pipelines and sets matching attributes.
- Closing the incident triggers another API call to disable the pipelines and remove matching attributes.
Example: Debug an issue with more tracing data
Increase your sampling percentage to get more tracing data when you need to debug an issue.
Create a configuration pipeline that collects traces and samples 10% of them using probabilistic sampling.
otelcol.processor.probabilistic_sampler "default" {
  // Keep 10% of traces
  sampling_percentage = 10

  output {
    traces = [otelcol.processor.batch.default.input]
  }
}
Match the pipeline to your collectors using attributes and activate it to begin receiving sampled traces.
If an issue occurs, return to the Fleet Management interface and edit the configuration pipeline to increase the sampling percentage to 100% so you can see all tracing data.
otelcol.processor.probabilistic_sampler "default" {
  // Keep 100% of traces
  sampling_percentage = 100

  output {
    traces = [otelcol.processor.batch.default.input]
  }
}
When the issue is resolved, edit the pipeline again to return the sampling percentage to 10%.
Maximize security with scalable Alloy components
You can enforce the principle of least privilege and minimize attack radius by rotating credentials for your hosts with remote configuration in Fleet Management.
Add the Alloy remote.vault component to your configuration pipelines to retrieve and rotate secrets using the Key/Value v2 secrets engine.
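As a minimal sketch of this pattern, the following pipeline reads remote write credentials from Vault and re-reads them periodically so that rotated values are picked up. The token file path, the mount name secret, the secret location prometheus/remote_write, and the endpoint, username, and password keys are all hypothetical. Argument names can also vary between Alloy versions, so check the remote.vault reference for the version you run.

```alloy
// Token used to authenticate with Vault, read from a hypothetical path.
local.file "vault_token" {
  filename  = "/var/lib/alloy/vault-token"
  is_secret = true
}

remote.vault "remote_write" {
  server = sys.env("VAULT_ADDR")
  path   = "secret"                  // hypothetical KV v2 mount
  key    = "prometheus/remote_write" // hypothetical secret location

  // Re-read the secret periodically so that rotated values are picked up.
  reread_frequency = "1h"

  auth.token {
    token = local.file.vault_token.content
  }
}

prometheus.remote_write "default" {
  endpoint {
    // Values from Vault are secrets; convert the ones that are not
    // sensitive before using them where a plain string is expected.
    url = convert.nonsensitive(remote.vault.remote_write.data.endpoint)

    basic_auth {
      username = convert.nonsensitive(remote.vault.remote_write.data.username)
      password = remote.vault.remote_write.data.password
    }
  }
}
```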