
How relabeling in Prometheus works

21 Mar 2022 10 min read

Relabeling is a powerful tool that allows you to classify and filter Prometheus targets and metrics by rewriting their label set.

The purpose of this post is to explain the value of Prometheus’ relabel_config block, the different places where it can be found, and its usefulness in taming Prometheus metrics. Much of the content here also applies to Grafana Agent users.

For reference, here’s our guide to Reducing Prometheus metrics usage with relabeling.

So without further ado, let’s get into it!

Prometheus labels

Labels are sets of key-value pairs that allow us to characterize and organize what’s actually being measured in a Prometheus metric.

For example, when measuring HTTP latency, we might use labels to record the HTTP method and status returned, which endpoint was called, and which server was responsible for the request.

Each unique combination of key-value label pairs is stored as a new time series in Prometheus, so labels largely determine the cardinality of our data. For that reason, values drawn from an unbounded set (user IDs or request IDs, for example) should not be used as label values.
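To get a feel for how quickly cardinality grows, here is a small back-of-the-envelope sketch. The label names and the number of distinct values per label are made up for illustration; the only point is that the upper bound on series is the product of the distinct values of each label.

```python
from math import prod

# Hypothetical number of distinct values per label on one HTTP latency metric.
distinct_values = {"method": 4, "status": 6, "endpoint": 20, "server": 10}

def max_series(values_per_label):
    """Upper bound on time series: product of the distinct values per label."""
    return prod(values_per_label.values())

print(max_series(distinct_values))                         # 4800
# Adding a single unbounded-looking label multiplies everything again.
print(max_series({**distinct_values, "user_id": 10_000}))  # 48000000
```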

Internal labels

But what about metrics with no labels? Prometheus also provides some internal labels for us. These begin with two underscores and are removed after all relabeling steps are applied; that means their values will not be available afterwards unless we explicitly copy them to other labels.

Some of these special labels available to us are

Label name            Description
__name__              The scraped metric’s name
__address__           host:port of the scrape target
__scheme__            URI scheme of the scrape target
__metrics_path__      Metrics endpoint of the scrape target
__param_<name>        Value of the first URL parameter called <name> passed to the target
__scrape_interval__   The target’s scrape interval (experimental)
__scrape_timeout__    The target’s timeout (experimental)
__meta_               Prefix of the special labels set by the Service Discovery mechanism
__tmp                 Special prefix used to temporarily store label values before discarding them
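Several of these labels together describe where a target gets scraped. The following is a rough sketch of how they combine into a scrape URL; the function name, the address, and the module parameter are hypothetical, and this is a mental model rather than Prometheus’ actual code.

```python
from urllib.parse import urlencode

def scrape_url(labels):
    """Assemble a scrape URL from a target's internal labels (simplified model)."""
    prefix = "__param_"
    params = {k[len(prefix):]: v for k, v in labels.items() if k.startswith(prefix)}
    url = f'{labels["__scheme__"]}://{labels["__address__"]}{labels["__metrics_path__"]}'
    return url + ("?" + urlencode(params) if params else "")

# Hypothetical target, e.g. one produced by service discovery.
target = {
    "__scheme__": "http",
    "__address__": "10.0.0.5:9100",
    "__metrics_path__": "/metrics",
    "__param_module": "http_2xx",
}
print(scrape_url(target))  # http://10.0.0.5:9100/metrics?module=http_2xx
```

Because these are ordinary labels during target relabeling, rewriting __address__ or __metrics_path__ in a relabeling step changes what actually gets scraped.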

So now that we understand what the input is for the various relabel_config rules, how do we create one? And what can they actually be used for?

Stages of application

One source of confusion around relabeling rules is that they can be found in multiple parts of a Prometheus config file.

# A list of scrape configurations.
scrape_configs:
  - job_name: "some scrape job"
    ...

    # List of target relabel configurations.
    relabel_configs:
      [ - <relabel_config> ... ]

    # List of metric relabel configurations.
    metric_relabel_configs:
      [ - <relabel_config> ... ]

# Settings related to the remote write.
remote_write:
  - url: https://remote-write-endpoint.com/api/v1/push
    ...

    # List of remote write relabel configurations.
    write_relabel_configs:
      [ - <relabel_config> ... ]

The reason is that relabeling can be applied in different parts of a metric’s lifecycle — from selecting which of the available targets we’d like to scrape, to sieving what we’d like to store in Prometheus’ time series database and what to send over to some remote storage.

First off, the relabel_configs key can be found as part of a scrape job definition. These relabeling steps are applied before the scrape occurs and only have access to labels added by Prometheus’ Service Discovery. They allow us to filter the targets returned by our SD mechanism, as well as manipulate the labels it sets.

Once the targets have been defined, the metric_relabel_configs steps are applied after the scrape and allow us to select which series we would like to ingest into Prometheus’ storage.

Finally, the write_relabel_configs block applies relabeling rules to the data just before it’s sent to a remote endpoint. This can be used to filter metrics with high cardinality or route metrics to specific remote_write targets.

The base <relabel_config> block

A <relabel_config> consists of seven fields. These are:

  • source_labels
  • separator (default = ;)
  • target_label
  • regex (default = (.*))
  • modulus
  • replacement (default = $1)
  • action (default = replace)

A Prometheus configuration may contain an array of relabeling steps; they are applied to the label set in the order they’re defined in. Omitted fields take on their default value, so in practice most steps only spell out a few of these fields.
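Since the order of the steps matters, here is a minimal sketch of what "applied in order" means. The step functions and label names are hypothetical stand-ins for real relabeling rules: each one takes a label set and returns a new one, or None when the target or series gets dropped.

```python
# Hypothetical steps: each takes a label dict and returns a new dict,
# or None to signal that the target or series was dropped.
def copy_namespace_to_env(labels):
    out = dict(labels)
    out["env"] = labels.get("namespace", "")
    return out

def drop_staging(labels):
    return None if labels.get("env") == "staging" else labels

def apply_steps(labels, steps):
    """Apply steps in order; stop as soon as one of them drops the label set."""
    for step in steps:
        labels = step(labels)
        if labels is None:
            return None
    return labels

target = {"namespace": "staging", "job": "api"}
print(apply_steps(target, [copy_namespace_to_env, drop_staging]))  # None
# Reversed order: the drop rule runs before "env" exists, so nothing is dropped.
print(apply_steps(target, [drop_staging, copy_namespace_to_env]))
```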

source_labels and separator

Let’s start off with source_labels. It expects an array of one or more label names, which are used to select the respective label values. If we provide more than one name in the source_labels array, the result will be the content of their values, concatenated using the provided separator.

As an example, consider the following two metrics

my_custom_counter_total{server="webserver01",subsystem="kata"} 192  1644075044000
my_custom_counter_total{server="sqldatabase",subsystem="kata"} 147  1644075044000

The following relabel_config

source_labels: [subsystem, server]
separator: "@"

would extract these values.

kata@webserver01
kata@sqldatabase
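In Python terms, the extraction step behaves roughly like the sketch below. The function name is ours, and a label that is missing from the label set is treated as an empty string.

```python
def extract(labels, source_labels, separator=";"):
    """Join the values of source_labels; a missing label counts as an empty string."""
    return separator.join(labels.get(name, "") for name in source_labels)

series = [
    {"server": "webserver01", "subsystem": "kata"},
    {"server": "sqldatabase", "subsystem": "kata"},
]
for labels in series:
    print(extract(labels, ["subsystem", "server"], separator="@"))
# kata@webserver01
# kata@sqldatabase
```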

regex

The regex field expects a valid RE2 regular expression and is used to match the value extracted from the combination of the source_labels and separator fields. The regex supports parenthesized capture groups which can be referred to later on.

This block would match the two values we previously extracted

source_labels: [subsystem, server]
separator: "@"
regex: "kata@(.*)"

However, this block would not match the previous labels and would abort the execution of this specific relabel step

source_labels: [subsystem, server]
separator: "@"
regex: "(.*)@redis"

The default regex value is (.*), so if not specified, it will match the entire input.
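One detail worth keeping in mind is that Prometheus anchors the regex at both ends, so it has to match the whole extracted value and not just a part of it. The sketch below mimics that with Python’s re.fullmatch; note that Python’s re module is not RE2, so this only approximates the behavior for simple patterns like the ones in this post.

```python
import re

def matches(regex, value):
    """Anchored match: the regex must cover the entire value."""
    return re.fullmatch(regex, value) is not None

print(matches("kata@(.*)", "kata@webserver01"))   # True
print(matches("(.*)@redis", "kata@webserver01"))  # False
print(matches("kata@web", "kata@webserver01"))    # False, a prefix is not enough
print(matches("(.*)", "kata@webserver01"))        # True, the default matches anything
```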

replacement

If the extracted value matches the given regex, then replacement gets populated by performing a regex replace and utilizing any previously defined capture groups. Going back to our extracted values, and a block like this

source_labels: [subsystem, server]
separator: "@"
regex: "(.*)@(.*)"
replacement: "${2}/${1}"

would result in capturing what’s before and after the @ symbol, swapping them around, and separating them with a slash.

webserver01/kata
sqldatabase/kata

The default value of the replacement is $1, so it will expand to the first capture group from the regex, or to the entire extracted value if no regex was specified.
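Here is a small Python sketch of that expansion. Python uses a different syntax for group references, so the hypothetical expand helper below translates $1 and ${1} style references by hand; named groups and escaped dollar signs are deliberately left out.

```python
import re

# Matches ${name} or $name references inside a replacement template.
_REF = re.compile(r"\$(?:\{(\w+)\}|(\w+))")

def expand(regex, value, replacement="$1"):
    """Return the expanded replacement, or None if the regex does not match."""
    m = re.fullmatch(regex, value)
    if m is None:
        return None

    def lookup(ref):
        name = ref.group(1) or ref.group(2)
        if name.isdigit() and int(name) <= m.re.groups:
            return m.group(int(name)) or ""
        return ""  # unknown or named groups expand to nothing in this sketch

    return _REF.sub(lookup, replacement)

print(expand("(.*)@(.*)", "kata@webserver01", "${2}/${1}"))  # webserver01/kata
print(expand("(.*)", "kata@webserver01"))                    # kata@webserver01
print(expand("(.*)@redis", "kata@webserver01"))              # None
```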

target_label

If the relabel action results in a value being written to some label, target_label defines to which label the replacement should be written. For example, the following block would set a label like {env="production"}

replacement: "production"
target_label: "env"
action: "replace"

While, continuing with the previous example, this relabeling step would write the replacement value to a label named my_new_label

- source_labels: [subsystem, server]
  separator: "@"
  regex: "(.*)@(.*)"
  replacement: "${2}/${1}"
  target_label: "my_new_label"
  action: "replace"

resulting in

{my_new_label="webserver01/kata"}
{my_new_label="sqldatabase/kata"}

modulus

Finally, the modulus field expects a positive integer and is only used by the hashmod action. The relabel_config step will use this number to populate the target_label with the result of the MD5(extracted value) % modulus expression.

Available actions

We’ve come a long way, but we’re finally getting somewhere. Now what can we do with those building blocks? How can they help us in our day-to-day work?

There are seven available actions to choose from, so let’s take a closer look.

keep/drop

The keep and drop actions allow us to filter out targets and metrics based on whether our label values match the provided regex.

Let’s go back to our previous example

my_custom_counter_total{server="webserver01",subsystem="kata"} 192  1644075074000
my_custom_counter_total{server="sqldatabase",subsystem="kata"} 14700  1644075074000

After concatenating the contents of the subsystem and server labels, we could drop the target which exposes webserver01 by using the following block. Keep in mind that the regex has to match the whole concatenated value, not just a prefix of it.

- source_labels: [subsystem, server]
  separator: "@"
  regex: "kata@webserver01"
  action: "drop"

Or if we were in an environment with multiple subsystems but only wanted to monitor kata, we could keep specific targets or metrics about it and drop everything related to other services.

- source_labels: [subsystem, server]
  separator: "@"
  regex: "kata@(.*)"
  action: keep

In many cases, here’s where internal labels come into play.

You can, for example, only keep specific metric names.

- source_labels: [__name__]
  regex: "my_custom_counter_total|my_custom_counter_sum|my_custom_gauge"
  action: keep

Or if you’re using Prometheus' Kubernetes service discovery you might want to drop all targets from your testing or staging namespaces.

- source_labels: [__meta_kubernetes_namespace]
  regex: "testing|staging"
  action: drop
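To make the filtering semantics concrete, here is a minimal Python sketch of keep and drop applied to a list of label sets. The helper names are ours, and Python’s re module stands in for RE2.

```python
import re

def _value(labels, source_labels, separator):
    return separator.join(labels.get(n, "") for n in source_labels)

def keep(items, source_labels, regex, separator=";"):
    """Keep only the label sets whose extracted value fully matches the regex."""
    return [l for l in items if re.fullmatch(regex, _value(l, source_labels, separator))]

def drop(items, source_labels, regex, separator=";"):
    """Drop the label sets whose extracted value fully matches the regex."""
    return [l for l in items if not re.fullmatch(regex, _value(l, source_labels, separator))]

series = [
    {"server": "webserver01", "subsystem": "kata"},
    {"server": "sqldatabase", "subsystem": "kata"},
]
src = ["subsystem", "server"]
print([s["server"] for s in drop(series, src, "kata@webserver01", "@")])  # ['sqldatabase']
print(len(drop(series, src, "kata@webserver", "@")))  # 2, a partial match drops nothing
print(len(keep(series, src, "kata@(.*)", "@")))       # 2
```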

labelkeep/labeldrop

The labelkeep and labeldrop actions allow for filtering the label set itself.

In the previous example, we may not be interested in keeping track of the subsystem label anymore.

The following relabeling would remove all {subsystem="<name>"} labels but keep other labels intact.

- regex: "subsystem"
  action: labeldrop

Of course, we can do the opposite and only keep a specific set of labels and drop everything else.

- regex: "subsystem|server|shard"
  action: labelkeep

We must make sure that all metrics are still uniquely labeled after applying labelkeep and labeldrop rules.
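The uniqueness caveat is easy to trip over, so here is a short sketch of it. In this simplified model, labeldrop and labelkeep match the regex against label names, and the hypothetical all_unique helper checks whether two series have collapsed into the same label set.

```python
import re

def labeldrop(labels, regex):
    """Remove every label whose name fully matches the regex."""
    return {k: v for k, v in labels.items() if not re.fullmatch(regex, k)}

def labelkeep(labels, regex):
    """Keep only the labels whose name fully matches the regex."""
    return {k: v for k, v in labels.items() if re.fullmatch(regex, k)}

def all_unique(label_sets):
    return len({tuple(sorted(l.items())) for l in label_sets}) == len(label_sets)

series = [
    {"server": "webserver01", "subsystem": "kata"},
    {"server": "sqldatabase", "subsystem": "kata"},
]
print(all_unique([labeldrop(s, "subsystem") for s in series]))  # True
# Dropping "server" leaves two identical label sets, which is a collision.
print(all_unique([labeldrop(s, "server") for s in series]))     # False
print(labelkeep(series[0], "subsystem|server|shard"))
```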

replace

Replace is the default action for a relabeling rule if we haven’t specified one; it allows us to overwrite the value of a single label with the contents of the replacement field.

As we saw before, the following block will set the env label to the replacement provided, so {env="production"} will be added to the labelset.

- action: replace
  replacement: production
  target_label: env

The replace action is most useful when you combine it with other fields.

Here’s another example:

- action: replace
  source_labels: [__meta_kubernetes_pod_name,__meta_kubernetes_pod_container_port_number]
  separator: ":"
  target_label: address

The above snippet will concatenate the values stored in __meta_kubernetes_pod_name and __meta_kubernetes_pod_container_port_number. The extracted string would then be written out to the target_label and might result in {address="podname:8080"}.
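Putting the fields together, a replace step with the default replacement ($1) can be modeled roughly as follows. The function name is hypothetical, and the sketch assumes the regex has at least one capture group, which holds for the default (.*).

```python
import re

def replace_default(labels, source_labels, target_label, separator=";", regex="(.*)"):
    """Model of action: replace using the default replacement, $1."""
    value = separator.join(labels.get(n, "") for n in source_labels)
    m = re.fullmatch(regex, value)
    if m is None:
        return dict(labels)  # no match: the label set is left unchanged
    out = dict(labels)
    out[target_label] = m.group(1)
    return out

target = {
    "__meta_kubernetes_pod_name": "podname",
    "__meta_kubernetes_pod_container_port_number": "8080",
}
result = replace_default(
    target,
    ["__meta_kubernetes_pod_name", "__meta_kubernetes_pod_container_port_number"],
    "address",
    separator=":",
)
print(result["address"])  # podname:8080
```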

hashmod

The hashmod action provides a mechanism for horizontally scaling Prometheus.

The relabeling step calculates the MD5 hash of the concatenated label values modulo a positive integer N, resulting in a number in the range [0, N-1].

An example might make this clearer. Consider the following metric and relabeling step

my_custom_metric{name="node",val="42"} 100

- action: hashmod
  source_labels: [name, val]
  separator: "-"
  modulus: 8
  target_label: __tmp_hashmod

The result of the concatenation is the string “node-42” and the MD5 of the string modulo 8 is 5.

$ python3
>>> import hashlib
>>> m = hashlib.md5(b"node-42")
>>> int(m.hexdigest(), 16) % 8
5

So ultimately {__tmp_hashmod="5"} would be appended to the metric’s label set.

This is most commonly used for sharding multiple targets across a fleet of Prometheus instances. The following rule could be used to distribute the load between 8 Prometheus instances, each responsible for scraping the subset of targets that end up producing a certain value in the [0, 7] range, and ignoring all others.

- action: keep
  source_labels: [__tmp_hashmod]
  regex: 5
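The sharding idea can be sketched in a few lines of Python. One assumption to flag: to our understanding, Prometheus only uses the last 8 bytes of the MD5 digest before taking the modulus, which gives the same result as the full digest when the modulus is a power of two (as with 8 here) but can differ otherwise. The target addresses below are made up.

```python
import hashlib

def hashmod(labels, source_labels, modulus, separator=";"):
    value = separator.join(labels.get(n, "") for n in source_labels)
    digest = hashlib.md5(value.encode("utf-8")).digest()
    # Assumption: mirror Prometheus by using only the last 8 bytes of the digest.
    return int.from_bytes(digest[8:], "big") % modulus

def targets_for_shard(targets, shard, shards):
    """Targets that the Prometheus instance numbered `shard` would keep."""
    return [t for t in targets if hashmod(t, ["__address__"], shards) == shard]

# Hypothetical fleet of 20 targets split across 8 Prometheus instances.
targets = [{"__address__": f"10.0.0.{i}:9100"} for i in range(1, 21)]
assigned = [targets_for_shard(targets, shard, 8) for shard in range(8)]
print(sum(len(a) for a in assigned))  # 20, every target lands on exactly one shard
```

Each instance would run with the same hashmod step and a keep step whose regex is its own shard number, so no coordination between instances is needed.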

labelmap

The labelmap action is used to map one or more label pairs to different label names.

Any label pairs whose names match the provided regex will be copied with the new label name given in the replacement field, by utilizing group references (${1}, ${2}, etc).

The replacement field defaults to just $1, the first capture group of the regex, so it’s sometimes omitted.

Here’s an example. If we’re using Prometheus’ Kubernetes SD, our targets would temporarily expose some labels such as:

    __meta_kubernetes_node_name: The name of the node object.
    __meta_kubernetes_node_provider_id: The cloud provider's name for the node object.
    __meta_kubernetes_node_address_<address_type>: The first address for each node address type, if it exists.
…
    __meta_kubernetes_namespace: The namespace of the service object.
    __meta_kubernetes_service_external_name: The DNS name of the service. (Applies to services of type ExternalName)
    __meta_kubernetes_service_name: The name of the service object.
    __meta_kubernetes_service_port_name: Name of the service port for the target.
…
    __meta_kubernetes_pod_name: The name of the pod object.
    __meta_kubernetes_pod_ip: The pod IP of the pod object.
    __meta_kubernetes_pod_container_init: true if the container is an InitContainer
    __meta_kubernetes_pod_container_name: Name of the container the target address points to.
…

Labels starting with double underscores will be removed by Prometheus after relabeling steps are applied, so we can use labelmap to preserve them by mapping them to a different name.

- action: labelmap
  regex: "__meta_kubernetes_(.*)"
  replacement: "k8s_${1}"
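As a rough model of what this step does, the sketch below copies every label whose name matches the regex to a new name. The replacement is written in Python’s own group syntax (\1 where the Prometheus config says ${1}), and the label values are invented.

```python
import re

def labelmap(labels, regex, replacement=r"\1"):
    """Copy labels whose name fully matches regex to the expanded new name."""
    out = dict(labels)  # the original labels are kept, the copies are added
    for name, value in labels.items():
        m = re.fullmatch(regex, name)
        if m:
            out[m.expand(replacement)] = value
    return out

target = {
    "__meta_kubernetes_namespace": "default",
    "__meta_kubernetes_pod_name": "api-7d9f",
    "job": "kubernetes-pods",
}
mapped = labelmap(target, "__meta_kubernetes_(.*)", r"k8s_\1")
print(mapped["k8s_namespace"], mapped["k8s_pod_name"])  # default api-7d9f
```

The __meta_ labels are still present right after the step; they only disappear once relabeling has finished, while the k8s_ copies survive.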

Common use cases for relabeling in Prometheus

Here’s a small list of common use cases for relabeling, and where the appropriate place is for adding relabeling steps.

  • When you want to ignore a subset of applications, use relabel_configs
  • When splitting targets between multiple Prometheus servers, use relabel_configs + hashmod
  • When you want to ignore a subset of high cardinality metrics, use metric_relabel_configs
  • When sending different metrics to different endpoints, use write_relabel_configs

Learn more

That’s all for today! Hope you learned a thing or two about relabeling rules and that you’re more comfortable with using them. For more information, check out our documentation and read more in the Prometheus documentation.
