Multidimensional SLI example

This guide explains what multidimensional SLIs are, how they work, and provides a basic example.

Before you begin, read the SLI availability examples to understand how SLIs are defined in Grafana SLO.

Note

The SLI query result must return a ratio between 0 and 1, where 1 means 100% of events were successful. This is required to evaluate whether the SLI meets the SLO target.

In the availability and latency SLI examples, the SLIs were formed using one of the following formulas:

# ratio of successful event rates formula
Success rate = rate of successful events over a period
               /  
               rate of total events over a period

# ratio of successful events formula
Success rate = number of successful events over a period
               /  
               total number of events over a period

However, these formulas are simplified. A Prometheus query can return multiple series, one per distinct combination of label values (also called dimensions). Therefore, the final SLIs wrap both the numerator and the denominator in sum(...) to aggregate the results from all series.

The formulas then look more like:

Success rate = sum(rate of successful events over a period)
               /  
               sum(rate of total events over a period)


Success rate = sum(number of successful events over a period)
               /  
               sum(total number of events over a period)

Here, sum(...) aggregates all potential dimensions (all distinct label values) in the numerator and denominator before the final ratio calculation.

This type of SLI is referred to as a roll-up SLI (or aggregated SLI). The following is an example using the Ratio query builder:

Screenshot of the graph result of an SLI ratio
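
The roll-up aggregation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Grafana SLO's implementation: the series, the `instance` label, and the event counts are hypothetical. It shows that the numerators and denominators are summed across all series first, and the ratio is taken once at the end.

```python
# Hypothetical event counts over one evaluation window, one entry per series.
# The "instance" label and all numbers are made up for illustration.
series = [
    {"labels": {"instance": "a"}, "success": 990, "total": 1000},
    {"labels": {"instance": "b"}, "success": 5, "total": 10},
]


def rollup_sli(series):
    """Mimic sum(successful) / sum(total): aggregate first, then divide once."""
    success = sum(s["success"] for s in series)
    total = sum(s["total"] for s in series)
    return success / total


def mean_of_ratios(series):
    """Unweighted average of per-series ratios, shown only for contrast."""
    return sum(s["success"] / s["total"] for s in series) / len(series)


sli = rollup_sli(series)
print(f"roll-up SLI:    {sli:.4f}")
print(f"mean of ratios: {mean_of_ratios(series):.4f}")
```

Because the sums are taken before dividing, each series contributes in proportion to its event volume. In this sketch the low-traffic series barely moves the roll-up SLI, whereas an unweighted average of per-series ratios would be pulled down sharply by it.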

Multidimensional probe example

Multidimensional SLIs (SLIs evaluated across multiple label dimensions) use sum by (<labels>) in both the numerator and denominator, producing multiple ratio series. For example:

Success rate = sum by (probe) (rate of successful probe executions)
               /  
               sum by (probe) (rate of total probe executions)

Dimension                  Success rate per dimension
{probe="NorthVirginia"}    0.9
{probe="Spain"}            1
{probe="Tokyo"}            0.95

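The final SLI aggregates these dimensions into a single ratio, equal to its equivalent roll-up SLI. For example, if each probe runs the same number of executions, the final SLI result is 0.95.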

In the final SLI calculation, all dimensions are aggregated, making it act as a roll-up SLI for SLO compliance.
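
As a rough sketch of the grouping above, the following Python mimics `sum by (probe)` over a handful of series and then aggregates the groups into the final ratio. The `job` label and the execution counts are hypothetical, chosen so that the per-probe ratios match the table and every probe runs the same number of executions. It is not how Grafana SLO evaluates the query internally.

```python
from collections import defaultdict

# Hypothetical probe execution counts over one window. Each probe reports
# two series (one per made-up "job"), so grouping has to add them together.
series = [
    {"probe": "NorthVirginia", "job": "api", "success": 45, "total": 50},
    {"probe": "NorthVirginia", "job": "web", "success": 45, "total": 50},
    {"probe": "Spain", "job": "api", "success": 50, "total": 50},
    {"probe": "Spain", "job": "web", "success": 50, "total": 50},
    {"probe": "Tokyo", "job": "api", "success": 48, "total": 50},
    {"probe": "Tokyo", "job": "web", "success": 47, "total": 50},
]


def sum_by(series, label):
    """Mimic `sum by (label)`: add success and total counts per label value."""
    groups = defaultdict(lambda: {"success": 0, "total": 0})
    for s in series:
        groups[s[label]]["success"] += s["success"]
        groups[s[label]]["total"] += s["total"]
    return dict(groups)


groups = sum_by(series, "probe")

# One ratio per dimension: this is the multidimensional view.
per_probe = {probe: g["success"] / g["total"] for probe, g in groups.items()}

# Aggregating every dimension gives the roll-up ratio used for compliance.
rollup = sum(g["success"] for g in groups.values()) / sum(
    g["total"] for g in groups.values()
)

for probe, ratio in per_probe.items():
    print(f'{{probe="{probe}"}} {ratio:.2f}')
print(f"roll-up: {rollup:.2f}")
```

With a different traffic mix the per-probe ratios could stay the same while the roll-up value shifts toward the busiest probe, since the roll-up is weighted by execution count.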

Continuing with the previous example, use the Grouping function to define dimensions per probe:

Screenshot of the SLO wizard graph result of a multidimensional SLI

Note that you can create multidimensional SLIs using either option in the Grafana SLO wizard: Ratio or Advanced.

How multidimensional SLIs work

For SLO compliance evaluation, the SLI calculation for multidimensional and roll-up SLIs is exactly the same.

However, Grafana SLO provides additional functionality to handle multidimensional SLIs:

  1. Fast and slow burn alerts per dimension. When enabled, Grafana SLO triggers fast-burn or slow-burn alerts whenever an individual dimension consumes the error budget quickly or slowly, respectively.

    The SLO dashboard displays the list of multidimensional alerts

    Note

    Multidimensional alerts are not triggered for overall SLO consumption, but only for the consumption of a particular dimension.

    To be alerted when the overall error budget is consumed, create a separate roll-up SLO whose query does not group by any dimension (that is, it uses sum(...) rather than sum by (<labels>)).

  2. Per-dimension SLO dashboard filtering. The SLO dashboard allows filtering results and visualizing SLI consumption for each dimension.

    A screenshot of an SLO dashboard that displays SLI consumption per probe
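
To illustrate why per-dimension alerts and overall alerts can disagree, here is a minimal Python sketch based on the common definition of burn rate: the observed error rate divided by the error rate the SLO target allows. The SLO target, the alert threshold, and the roll-up value are assumptions made for this example. The actual alert conditions and time windows used by Grafana SLO are not described in this guide and are not reproduced here.

```python
SLO_TARGET = 0.95      # hypothetical objective, not taken from this guide
BURN_THRESHOLD = 1.5   # hypothetical threshold, for illustration only


def burn_rate(sli, target):
    """Observed error rate divided by the error rate the target allows."""
    return (1 - sli) / (1 - target)


# Per-probe success rates from the table in the probe example.
per_probe_sli = {"NorthVirginia": 0.9, "Spain": 1.0, "Tokyo": 0.95}

# Roll-up value, assuming every probe runs the same number of executions.
rollup_sli = 0.95

# Dimensions whose own burn rate exceeds the (hypothetical) threshold.
alerting_probes = [
    probe
    for probe, sli in per_probe_sli.items()
    if burn_rate(sli, SLO_TARGET) > BURN_THRESHOLD
]

# The same check applied to the aggregated SLI.
rollup_alerting = burn_rate(rollup_sli, SLO_TARGET) > BURN_THRESHOLD

print("dimensions over threshold:", alerting_probes)
print("roll-up over threshold:", rollup_alerting)
```

Under these assumptions only the NorthVirginia dimension exceeds the threshold, while the aggregated SLI does not. This is the kind of situation that per-dimension alerts are meant to surface, and the reverse case is why a separate roll-up SLO is needed for overall error budget alerts.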