Query acceleration with bloom filters
Note
Query acceleration using bloom filters is enabled as a public preview for select large-scale Grafana Cloud customers that are ingesting more than 75 TB of logs a month. Limited support and no SLA are provided.
Loki leverages bloom filters to speed up queries by reducing the amount of data Loki needs to load from the store and iterate through. Loki is often used to run “needle in a haystack” queries; these are queries where a large number of log lines are searched, but only a few log lines match the query. Some common use cases are searching all logs tied to a specific trace ID or customer ID.
For example, the following query looks for a trace ID across a whole cluster over the past 24 hours:
{cluster="prod"} | traceID="3c0e3dcd33e7"
Without accelerated filtering, Loki downloads all the chunks for all the streams matching {cluster="prod"} for the last 24 hours and iterates through each log line in the chunks, checking if the structured metadata key traceID with value 3c0e3dcd33e7 is present.
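With accelerated filtering, Loki can skip most of the chunks and process only those that the bloom filters indicate might contain the structured metadata pair. A bloom filter can report a pair as possibly present when it is not, but it never reports a pair as absent when it is present, so skipped chunks cannot contain a match.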
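The idea of skipping chunks can be illustrated with a toy model. The following Python sketch is not Loki's implementation. The `BloomFilter` class, the `build_chunk_filter` and `query` helpers, the in-memory chunk layout, and the filter parameters (1024 bits, 4 hashes) are all assumptions made for illustration. It builds one small bloom filter per chunk from the chunk's structured metadata pairs, then consults the filter before scanning any lines.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: k hash positions over an m-bit array (illustrative only)."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_chunk_filter(lines):
    """Record every structured metadata key=value pair of a chunk in one filter."""
    bloom = BloomFilter()
    for line in lines:
        for key, value in line["metadata"].items():
            bloom.add(f"{key}={value}")
    return bloom


def query(chunks, key, value):
    """Return matching lines and the number of chunks that had to be scanned."""
    matches, scanned = [], 0
    for chunk in chunks:
        if not chunk["filter"].might_contain(f"{key}={value}"):
            continue  # the filter rules this chunk out, so it is never read
        scanned += 1
        matches.extend(
            line for line in chunk["lines"] if line["metadata"].get(key) == value
        )
    return matches, scanned
```

In this sketch, calling `query(chunks, "traceID", "3c0e3dcd33e7")` scans only the chunks whose filter reports the pair as possibly present. A false positive costs one unnecessary chunk scan but never produces a wrong result, because every candidate chunk is still checked line by line.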
To learn how to write queries to use bloom filters, refer to Query acceleration.
For more information about the underlying components, refer to the Bloom filters topic in the Loki documentation.