New Feature in Loki v1.3: the Query Frontend
Recently, Loki v1.3.0 was released. It included many changes, but I’d like to talk about one in particular: the query frontend. This new component in the Loki architecture is a drop-in addition. What does that mean? Loki can run with or without it.
In fact, the query frontend both produces and consumes the Loki API, meaning to a consumer there’s no difference. If we were to model this in code, we’d say the frontend takes a Loki API as an argument and returns another implementation of the same API.
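To make that idea concrete, here is a minimal sketch in Python. It is not Loki's actual code: `QueryAPI`, `querier`, and `with_frontend` are hypothetical names, and the real Loki API is an HTTP API that is far richer than a single function.

```python
from typing import Callable, List

# Hypothetical stand-in for the Loki API: a function from a query string
# to a list of log lines. This is an assumption made for illustration.
QueryAPI = Callable[[str], List[str]]


def querier(query: str) -> List[str]:
    """Pretend downstream implementation of the API (hypothetical)."""
    return [f"result for {query}"]


def with_frontend(downstream: QueryAPI) -> QueryAPI:
    """Take an API as an argument and return another implementation of it."""

    def frontend(query: str) -> List[str]:
        # A real frontend would split, queue, and retry here.
        # This sketch simply forwards the call unchanged.
        return downstream(query)

    return frontend


# Callers cannot tell the wrapped API from the original one.
api = with_frontend(querier)
```

Because `api` and `querier` have the same shape, a consumer can be pointed at either one without changes, which is what makes the frontend a drop-in addition.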
Why have another component to fulfill behavior that’s already implemented? To answer this, we’ll have to take a look at a few problems the query frontend alleviates.
Parallelization
Loki serves a few main types of queries, ranging from filter queries (such as fetch the logs from application X and filter for log lines matching Y) to metric queries (like calculate the rate of HTTP 400 status codes in my service during the last hour).
Ultimately, both types of queries require sifting through potentially large volumes of log data and filtering it. Luckily, these operations can be parallelized by time fairly easily. This is one of the frontend’s jobs: to split incoming requests into smaller time ranges, run them in parallel, and recombine the results.
Splitting queries enables speedups of up to the split factor. If we query for the count of occurrences of X in a 24-hour period, the frontend can translate that into the sum of occurrences in 24 one-hour periods. Each of these smaller queries is sent off to the downstream Loki API, and the frontend recombines the results. When you’re operating on high-throughput log streams, this is an invaluable tool.
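The 24-hour count example can be sketched in a few lines of Python. This is only an illustration of the split, run in parallel, and recombine idea, not the frontend's implementation: `split_range`, `count_split`, `count_events`, and the `EVENTS` data are all made up for the example, and a thread pool stands in for the downstream queriers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

HOUR = 3600  # seconds

# Toy data (an assumption): timestamps, in seconds, of log lines matching X.
EVENTS = [10, 3600, 3601, 50000, 86399]


def count_events(start: int, end: int) -> int:
    """Hypothetical downstream query: count matches in [start, end)."""
    return sum(1 for t in EVENTS if start <= t < end)


def split_range(start: int, end: int, interval: int) -> List[Tuple[int, int]]:
    """Cut the half-open range [start, end) into chunks of at most `interval`."""
    return [(s, min(s + interval, end)) for s in range(start, end, interval)]


def count_split(
    count_fn: Callable[[int, int], int], start: int, end: int, interval: int
) -> int:
    """Run one sub-query per chunk in parallel and sum the partial counts."""
    chunks = split_range(start, end, interval)
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda chunk: count_fn(*chunk), chunks))
    return sum(partials)


# One 24-hour query becomes 24 one-hour queries whose counts are added up.
total = count_split(count_events, 0, 24 * HOUR, HOUR)
```

Counts recombine with a plain sum because the chunks are half-open and do not overlap, so no log line is counted twice.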
Scheduling & Denial of Service
In addition to improving performance, the new query frontend also provides protection from denial-of-service attacks using per-tenant query scheduling. Loki is designed from the ground up to be multi-tenant. Tenants can be companies, teams within a company, or any other arbitrary division your use case dictates. The frontend is aware of this and uses a set of per-tenant queues internally to organize incoming queries. It picks queries from these queues fairly and distributes them to the downstream Loki API – in our case, the querier components.
Consider a problematic tenant sending too many queries: Other tenants’ queries may continue unhindered as the problematic tenant simply puts pressure on its own queue, but not others’. This has served us well, ensuring that under load, tenant requests are allocated fairly to downstream queriers.
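Here is a rough sketch of per-tenant queues in Python. It assumes a simple round-robin rotation between tenants, which is one way to pick "fairly" and not necessarily the policy the frontend uses. The `TenantQueues` class and the tenant names are hypothetical.

```python
from collections import OrderedDict, deque
from typing import Deque, Optional, Tuple


class TenantQueues:
    """One FIFO queue per tenant, served round-robin (an assumed policy)."""

    def __init__(self) -> None:
        # Insertion order doubles as the rotation order between tenants.
        self._queues: "OrderedDict[str, Deque[str]]" = OrderedDict()

    def enqueue(self, tenant: str, query: str) -> None:
        self._queues.setdefault(tenant, deque()).append(query)

    def dequeue(self) -> Optional[Tuple[str, str]]:
        """Return the next (tenant, query) pair, or None if nothing is queued."""
        if not self._queues:
            return None
        tenant, queue = next(iter(self._queues.items()))
        query = queue.popleft()
        # Take the tenant out of the rotation, then put it at the back
        # only if it still has queries waiting.
        del self._queues[tenant]
        if queue:
            self._queues[tenant] = queue
        return tenant, query


queues = TenantQueues()
for q in ("q1", "q2", "q3"):
    queues.enqueue("noisy", q)   # a tenant flooding the frontend
queues.enqueue("quiet", "q4")    # a tenant with a single query

order = []
while (item := queues.dequeue()) is not None:
    order.append(item)
```

In this run the quiet tenant's only query is served second, right after the noisy tenant's first one, instead of waiting behind all three. The backlog stays in the noisy tenant's own queue.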
What’s Next
For the sake of brevity, I won’t go into detail about a few smaller things the frontend implements, such as automatic retries and step alignment for metric queries, and the fact that it uses a pull (instead of push) model with downstream queriers.
This brings us to what’s next for the Loki frontend:
- Result caching: Loki already uses multiple tiers of caches, and result caching is the next addition. This is heavily inspired by the Cortex feature of the same name, and will allow us to calculate and store entire query results for reuse without recalculation. This sounds simple, but the devil is in the details. The big challenge here is being able to combine partial results and then distribute only the difference as a smaller unit of work to downstream queriers. Cyril Tovena and I have been discussing this, but its implementation is nascent.
- Query sharding: This is a technique for mapping an incoming query into an equivalent but more parallelizable form. A simple example is average. Instead of calculating an average in one location and needing to iterate through all of the data there, we can rewrite the query as a sum divided by a count, which yields the same result as the average but can be heavily parallelized.
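The average example from the query sharding item can be sketched in Python as follows. The shard layout and the function names (`shard_partial`, `merged_average`) are invented for illustration. Since sharding is described above as future work, this shows only the arithmetic, not how Loki will implement it.

```python
from typing import List, Tuple

# Toy data (an assumption): values held by three unevenly sized shards.
shards: List[List[float]] = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]


def shard_partial(values: List[float]) -> Tuple[float, int]:
    """Work that each shard can do independently: a sum and a count."""
    return sum(values), len(values)


def merged_average(all_shards: List[List[float]]) -> float:
    """Combine the per-shard (sum, count) pairs into one overall average."""
    partials = [shard_partial(values) for values in all_shards]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count


# Averaging the per-shard averages gives a different answer when shards
# are uneven, which is why the rewrite carries sums and counts instead.
average_of_averages = sum(sum(v) / len(v) for v in shards) / len(shards)
```

Sums and counts from different shards can be added together safely, whereas per-shard averages cannot be merged without knowing how many values each one covers.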
Getting Started
Be sure to check out the readme we’ve put together on how to get started, try out the query frontend for yourself, and let us know what you think.








