SLI example for latency
This guide provides examples of how to define latency SLIs using different Prometheus metric types. The basic SLO example for demonstration purposes is as follows:
SLI category | SLI description | Time window | Target |
---|---|---|---|
Latency | Requests respond within 2 seconds | 28d | 99% |
The SLI in this example includes all requests, and the SLO defines the target percentage.
When possible, avoid using percentiles in SLIs, such as 95th percentile latency with a 99% target, to maintain simplicity and consistency across SLO types. Refer to Building good SLOs—CRE life lessons from Google Cloud for more on this topic.
Before you begin, read the SLI availability examples to understand how SLIs are defined in Grafana SLO:
Note
The SLI query result must return a ratio between 0 and 1, where 1 means 100% of events were successful. This is required to evaluate whether the SLI meets the SLO target.

Probe latency (using Prometheus Gauge)
This example uses the `probe_duration_seconds` metric from Synthetic Monitoring probes to verify public latency. For details on how Synthetic Monitoring probes work, see the SLI availability examples using probes.
Metric | Type | Description |
---|---|---|
probe_duration_seconds | Gauge | How long the probe took to complete in seconds |
In the Grafana SLO wizard, you can create SLIs using two options:
- Ratio query builder: Enter counter metrics for successful and total events.
- Advanced: Enter the ratio SLI query directly.
Because `probe_duration_seconds` is not a counter metric, choose the Advanced option to create the SLI query.
SLIs are defined as ratio-like queries, either as the ratio of successful events or the ratio of successful event rates:
# ratio of successful event rates formula
Success rate = rate of successful events (over a period)
/
rate of total events (over a period)
# ratio of successful events formula
Success rate = number of successful events (over a period)
/
total number of events (over a period)
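When both rates cover the same period, the two formulas give the same ratio, because the period length cancels out. The following Python sketch illustrates this with hypothetical event counts and a simplified rate (events divided by window length). Prometheus `rate()` also handles counter resets and extrapolation, which this sketch ignores.

```python
import math

# Hypothetical values for one evaluation window (not taken from real data).
WINDOW_SECONDS = 300
successful_events = 1985
total_events = 2000

# Ratio of successful events.
ratio_of_events = successful_events / total_events

# Ratio of successful event rates, using a simplified rate: events per second.
successful_rate = successful_events / WINDOW_SECONDS
total_rate = total_events / WINDOW_SECONDS
ratio_of_rates = successful_rate / total_rate

# Both formulas agree (up to floating-point rounding).
same_ratio = math.isclose(ratio_of_events, ratio_of_rates)
print(round(ratio_of_events, 4), round(ratio_of_rates, 4), same_ratio)
```

With these numbers the SLI is about 0.9925, which would meet the 99% target from the example SLO.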
With gauge metrics, you can implement the ratio of successful events formula as follows:
# number of successful probe requests over the rate interval
sum(
  count_over_time(
    (probe_duration_seconds{job="<JOB_NAME>"} < 2)[$__rate_interval:]
  )
)
/
# number of total probe requests over the rate interval
sum(
  count_over_time(
    probe_duration_seconds{job="<JOB_NAME>"}[$__rate_interval:]
  )
)
Here’s the breakdown of the numerator query:
# number of successful probe requests over the rate interval
sum(
  count_over_time(
    (probe_duration_seconds{job="<JOB_NAME>"} < 2)[$__rate_interval:]
  )
)
- `probe_duration_seconds{job="<JOB_NAME>"} < 2`: Returns probe latency samples. The `< 2` comparison filters samples where latency is within the SLI threshold (less than two seconds). The result keeps a sample for each success and no sample for each failure.
- `[$__rate_interval:]`: Evaluates the previous expression over the past `$__rate_interval`. Because `count_over_time` works only on range vectors, the query uses a subquery (`[:]`) to produce a range vector containing the samples from that period.
- `count_over_time(...)`: Counts the samples in that range vector, which is the number of successful probe requests.
- `sum(...)`: Finally, aggregates across all series (dimensions).
The numerator is then divided by the total number of probe requests over the same interval using a similar query:
/
# number of total probe requests over the rate interval
sum(
  count_over_time(
    probe_duration_seconds{job="<JOB_NAME>"}[$__rate_interval:]
  )
)
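To make the numerator and denominator concrete, here is a minimal Python sketch that mimics the same steps on in-memory data. The sample values and series labels are hypothetical, and the local `count_over_time` helper only imitates the PromQL function of the same name. It is not how Prometheus evaluates the query.

```python
# Hypothetical probe_duration_seconds samples (in seconds) collected during
# one interval, keyed by an illustrative series label.
samples_by_series = {
    'probe="London"': [0.8, 1.2, 2.4, 0.9],
    'probe="Tokyo"': [1.9, 2.0, 0.7, 3.1],
}
THRESHOLD_SECONDS = 2.0


def count_over_time(samples):
    """Count the samples in a range, imitating PromQL count_over_time."""
    return len(samples)


# Numerator: keep samples under the threshold, count per series, then sum.
successful = sum(
    count_over_time([v for v in samples if v < THRESHOLD_SECONDS])
    for samples in samples_by_series.values()
)

# Denominator: count every sample per series, then sum.
total = sum(count_over_time(samples) for samples in samples_by_series.values())

sli = successful / total
print(successful, total, sli)
```

Note that the sample at exactly 2.0 seconds does not count as a success, because the comparison is strictly less than the threshold.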
Alternatively, the numerator can use `bool` and `sum_over_time`:
# number of successful probe requests over the rate interval
# `bool` returns a binary 0/1 series and `sum_over_time` sums 1s for successes
sum(
  sum_over_time(
    (probe_duration_seconds{job="<JOB_NAME>"} < bool 2)[$__rate_interval:]
  )
)
/
# number of total probe requests over the rate interval
sum(
  count_over_time(
    probe_duration_seconds{job="<JOB_NAME>"}[$__rate_interval:]
  )
)
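Both numerators should yield the same number of successes. The short Python sketch below compares the two approaches on a hypothetical list of samples: dropping the failing samples and counting what remains, versus mapping every sample to 1 or 0 and summing.

```python
# Hypothetical probe durations (seconds) for a single series in one interval.
samples = [0.8, 1.2, 2.4, 0.9, 2.0]
THRESHOLD_SECONDS = 2.0

# Filtering comparison (`< 2`): failing samples are dropped, then counted.
filtered = [v for v in samples if v < THRESHOLD_SECONDS]
count_of_filtered = len(filtered)

# `bool` comparison (`< bool 2`): every sample becomes 1 or 0, then summed.
flags = [1 if v < THRESHOLD_SECONDS else 0 for v in samples]
sum_of_flags = sum(flags)

print(count_of_filtered, sum_of_flags)
```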
Probe latency (using Histogram)
This example uses the `probe_all_duration_seconds` histogram metric, which requires a different SLI query.
Metric | Type | Description |
---|---|---|
probe_all_duration_seconds | Histogram | How long the probe took to complete in seconds |
Prometheus histogram metrics store samples based on their value (latency in this case) and expose additional series:
- `*_count`: Returns all samples for all latencies.
- `*_bucket`: Returns samples per configured bucket. The buckets for this metric are `0`, `0.005`, `0.01`, `0.025`, `0.05`, `0.1`, `0.25`, `0.5`, `1`, `2.5`, `5`, `10`, and `+Inf`.
You can use a histogram metric to return the number of successful samples if the metric includes a bucket for the specific SLI threshold.
However, `probe_all_duration_seconds` does not include a bucket for `2s`, so it cannot be used to filter histogram samples at that threshold. For alternatives, refer to Handle a threshold not available as a bucket.
This example uses a different threshold (`2.5s`) for demonstration purposes. Use the Ratio query builder option to build the SLI query as follows:
Ratio query builder | Value | Description |
---|---|---|
Success metric | probe_all_duration_seconds_bucket{job="<JOB_NAME>", le="2.5"} | Number of probe requests that completed within 2.5s |
Total metric | probe_all_duration_seconds_count{job="<JOB_NAME>"} | Total number of probe requests |
Grouping | (leave empty) | Creates a single SLI dimension. See the multidimensional SLI example. |
Click Run queries to generate the final SLI ratio query.

The auto-generated SLI implements the ratio of successful event rates formula:
Success rate = rate of successful events (over a period)
/
rate of total events (over a period)
The SLI query returns a ratio between 0 and 1, where 1 means 100% of events were successful.
To learn why the auto-generated SLI is formed this way and how it works, refer to the breakdown of the ratio SLI query of the HTTP availability example.
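Classic Prometheus histogram buckets are cumulative: the bucket with `le="2.5"` counts every observation less than or equal to 2.5 seconds, and the `+Inf` bucket matches `*_count`. The Python sketch below illustrates why dividing the two gives the SLI ratio. The durations and the reduced set of bucket bounds are hypothetical, and the sketch works on raw counts rather than on rates.

```python
INF = float("inf")

# Illustrative subset of bucket upper bounds (the `le` label values).
BUCKET_BOUNDS = [0.5, 1, 2.5, 5, INF]


def cumulative_bucket_counts(durations, bounds):
    """Return {le: number of observations <= le}, as cumulative buckets do."""
    return {le: sum(1 for d in durations if d <= le) for le in bounds}


# Hypothetical probe durations in seconds.
durations = [0.3, 0.9, 1.4, 2.2, 2.5, 3.0, 7.5, 12.0]
buckets = cumulative_bucket_counts(durations, BUCKET_BOUNDS)

successful = buckets[2.5]  # plays the role of *_bucket{le="2.5"}
total = buckets[INF]       # equals *_count
sli = successful / total
print(buckets, sli)
```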
Handle a threshold not available as a bucket
It is common for your SLI threshold to not match an existing histogram bucket, as in this example:
- The SLI searches for responses under `2` seconds.
- But the available buckets are configured for `1` and `2.5`, not `2`.
In this case, `probe_all_duration_seconds_bucket{job="<JOB_NAME>", le="2"}` does not work, and you should consider other approaches:
- Add a bucket for your threshold: If you control the instrumentation, update the histogram metric to include a bucket for the exact SLI threshold.
- Use a fallback metric: Check whether a latency gauge metric is available, as in the previous Gauge example.
- Approximate using the nearest bucket: Use the nearest higher or lower bucket. Document this clearly and adapt your SLO settings, because the SLO no longer matches the intended SLI threshold.
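To get a feel for the effect of approximating with the nearest bucket, the following Python sketch compares the intended 2-second threshold with the neighboring `1` and `2.5` buckets. The durations are hypothetical, and the size of the gap depends entirely on how your real latencies are distributed.

```python
# Hypothetical probe durations in seconds.
durations = [0.3, 0.9, 1.4, 2.2, 2.5, 3.0, 7.5, 12.0]
total = len(durations)


def ratio_within(threshold):
    """Share of observations less than or equal to the threshold."""
    return sum(1 for d in durations if d <= threshold) / total


intended = ratio_within(2)        # the threshold the SLO describes
lower_bucket = ratio_within(1)    # nearest lower bucket: stricter, undercounts
upper_bucket = ratio_within(2.5)  # nearest higher bucket: looser, overcounts

print(lower_bucket, intended, upper_bucket)
```

Because buckets are cumulative, the lower bucket can never report more successes than the intended threshold, and the higher bucket can never report fewer. Choosing the lower bucket gives a conservative SLI, while the higher bucket gives an optimistic one.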