Cardinality

Cardinality refers to the number of unique combinations of key/value pairs, such as label names and label values, that exist for a given metric series or log stream. For more information on cardinality, refer to the What are cardinality spikes and why do they matter? blog post.

Because writes to a time-series database (TSDB) are in series, high cardinality doesn’t make a big difference to performance at ingest. However, cardinality can have a major impact on querying: the higher the cardinality, the more items a query has to iterate over.

Traces collection and metrics

Tempo’s server-side metrics generation extends trace collection by creating Prometheus-based metrics that track values such as:

Total span call counts
Span latency histograms
Total span size count

The metrics-generator creates metrics that define the relationships between services using edges and nodes. Each of these metrics is queryable using a set of Prometheus labels (key/value pairs).

Each new value for a label increases the number of active series associated with a metric. (To learn more about active series, refer to the Trace active series documentation.)

This is also known as an increase in cardinality. The number of active series generated for a metric grows with the number of labels on that metric and with the number of distinct values each label takes.

In an unmodified instance of the metrics-generator, a small number of labels are added automatically. Because labels like span_kind and status_code have only a few valid values, the number of active series produced for each metric depends mostly on the number of service names and span names associated with trace spans.
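
To make the relationship between label values and active series concrete, the following minimal sketch counts the distinct label combinations in a handful of spans. The span data, the label keys `service` and `span_name`, and the helper `count_active_series` are illustrative assumptions, not part of Tempo.

```python
from typing import Iterable, Mapping, Sequence

# Label names assumed for illustration; span_kind and status_code are the
# labels named in the text, "service" and "span_name" are assumed keys.
LABELS = ("service", "span_name", "span_kind", "status_code")


def count_active_series(
    spans: Iterable[Mapping[str, str]],
    labels: Sequence[str] = LABELS,
) -> int:
    """Count the unique label-value combinations across the given spans."""
    return len({tuple(span.get(name, "") for name in labels) for span in spans})


# Hypothetical spans: the first two share every label value, so they
# contribute to the same series.
spans = [
    {"service": "checkout", "span_name": "POST /pay",
     "span_kind": "SPAN_KIND_SERVER", "status_code": "STATUS_CODE_OK"},
    {"service": "checkout", "span_name": "POST /pay",
     "span_kind": "SPAN_KIND_SERVER", "status_code": "STATUS_CODE_OK"},
    {"service": "checkout", "span_name": "POST /pay",
     "span_kind": "SPAN_KIND_SERVER", "status_code": "STATUS_CODE_ERROR"},
    {"service": "cart", "span_name": "GET /cart",
     "span_kind": "SPAN_KIND_SERVER", "status_code": "STATUS_CODE_OK"},
]

print(count_active_series(spans))  # 3
```

Four spans produce three series here because only new combinations of label values create a new series.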

The metrics-generator can also be configured to add extra labels to metrics, using span attribute key/value pairs that are mapped directly to these labels. Refer to the custom span attribute documentation for more information.

Be careful when configuring custom attributes: the greater the number of values seen in a specific attribute, the greater the number of active series produced. For more information about active series, refer to the active series documentation.

Let’s say that you are adding a custom attribute that includes unique customer IDs as a metrics label. If you have 100 customers, this could multiply the number of active series generated by up to 100 (for example, going from 25,000 active series to 2.5 million). Always consider which attributes are useful as labels for querying metrics, as well as how much cardinality they add.
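
The arithmetic in this example can be sketched as a worst-case estimate. The helper `worst_case_series` and the label name `customer_id` are hypothetical. The result is an upper bound, reached only if every existing series is seen with every value of each new label.

```python
import math
from typing import Mapping


def worst_case_series(base_series: int, new_label_values: Mapping[str, int]) -> int:
    """Upper bound on active series after adding labels.

    Each new label can multiply the series count by at most the number of
    distinct values that label takes.
    """
    return base_series * math.prod(new_label_values.values())


# 25,000 existing series, one new label with 100 distinct customer IDs.
print(worst_case_series(25_000, {"customer_id": 100}))  # 2500000
```

Adding a second custom label multiplies the bound again, which is why a few high-cardinality attributes can dominate the total.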

Dry-running the metrics-generator

A good practice, before fully enabling the metrics-generator, is to run it in dry-run mode. In dry-run mode, the metrics-generator generates metrics but doesn’t collect them, so nothing is written to a metrics storage database. The metrics_generator.disable_collection override is defined for this use case.

To get an estimate, set the override to true, and run the metrics-generator as you would normally. Then, check tempo_metrics_generator_registry_active_series for the estimated number of active series for your setup.
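
One way to check the metric is through the Prometheus HTTP query API. The sketch below assumes that Tempo's own metrics are scraped into a Prometheus-compatible server. The base URL, the `tenant` label in the sample payload, and the helper names are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

METRIC = "tempo_metrics_generator_registry_active_series"


def total_from_response(payload: dict) -> float:
    """Sum the sample values of an instant-query (vector) response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return sum(float(sample["value"][1]) for sample in payload["data"]["result"])


def fetch_active_series(base_url: str) -> float:
    """Query a Prometheus-compatible server (base_url is an assumption)."""
    query = urllib.parse.urlencode({"query": METRIC})
    with urllib.request.urlopen(f"{base_url}/api/v1/query?{query}", timeout=10) as resp:
        return total_from_response(json.load(resp))


# Offline example using a hand-written response in the Prometheus
# instant-query format; the label values are made up.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"tenant": "team-a"}, "value": [1700000000, "1200"]},
            {"metric": {"tenant": "team-b"}, "value": [1700000000, "300"]},
        ],
    },
}
print(total_from_response(sample))  # 1500.0
```

Against a live server, `fetch_active_series("http://localhost:9090")` would return the same kind of total, assuming Prometheus listens at that address.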

If your active series limit is already reached and tempo_metrics_generator_registry_active_series no longer reflects true demand, use the tempo_metrics_generator_registry_active_series_demand_estimate metric instead. This metric uses HyperLogLog estimation to approximate the actual cardinality even when limits are in effect.
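
For readers unfamiliar with HyperLogLog, the following is a minimal, self-contained sketch of the general technique. It is not Tempo's implementation: the hash function, the register count, and the class name are assumptions chosen for illustration. The idea is that a fixed-size array of registers can approximate the number of distinct items seen, so the estimate keeps growing even when the series themselves are not stored.

```python
import hashlib
import math


class HyperLogLog:
    """Textbook HyperLogLog with 2**precision registers (illustrative only)."""

    def __init__(self, precision: int = 12) -> None:
        self.p = precision
        self.m = 1 << precision          # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Take a 64-bit hash; the top p bits pick a register.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = position of the leftmost 1 bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, valid for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(50_000):
    hll.add(f"series-{i}")   # hypothetical series identifiers
print(round(hll.estimate()))  # approximately 50,000
```

Memory use is fixed by the register count regardless of how many items are added, and adding the same item again never changes the estimate.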