Grafana Loki 2.7 has arrived!
With it comes an experimental feature we are rather excited about: a redesigned index based on the Prometheus TSDB index. While we are still in the early stages, this enhancement, which we previewed at ObservabilityCON 2022, brings a smaller storage footprint, better query performance, and much more that we will dive into below!
What are the problems we’re solving?
First, a short history of how we got here. When Grafana Loki first debuted around four years ago, multiple storage systems were required: log “chunks” used object storage, but chunk indices utilized key-value stores, such as Cassandra. Enter the boltdb shipper, which removed the need for a separate index store and allowed Loki to run on a single object store.
This approach held for a while. But we are always striving to make Loki simpler to use, more cost-effective to operate, and more capable of serving larger logging footprints. As we operated Loki at scale at Grafana Labs (a typical day at the time of writing is about 15TB of logs and 500,000 unique streams), we noticed opportunities to improve in these areas. Three themes stood out:
- Handling volume. Simply put, we don’t want to put any limits on our users’ ability to send, store, or retrieve logs. We want Loki users to operate confidently at scale, without worrying about curtailing their logging. We set out to increase the number of streams Grafana Loki could handle and to minimize the amount of CPU required for querying.
- Query performance. We noticed that certain types of queries, such as range scans and metadata queries, were slower than they needed to be to give users a high-quality query experience.
- Cost considerations. When you use Grafana Loki, you’re paying infrastructure costs to store and query your logs. While object storage has come a long way and has gotten fairly inexpensive, the cost still adds up at volume.
With an eye toward tackling these problems, our TSDB index was born!
What benefits does Loki’s TSDB index provide?
Loki’s TSDB index is directly inspired by the Prometheus TSDB storage format. Loki fundamentally operates by storing labeled data as efficiently as possible while providing fast lookups to log streams, so this format is a natural technological match for the value Loki offers its users.
The Loki team has worked tirelessly to implement this feature and has been running it internally for some time now. Here are some of the benefits we’ve noticed:
For the operator: Improved resource utilization. One benefit that jumps off the page is the reduction in object storage footprint, which directly translates to money saved for the user. When operating Loki with the TSDB index internally, we’ve seen a 75% reduction in the size of our indices, with a much lower, and healthier, compression ratio.
The TSDB index also enables smarter query planning. That has fantastic benefits for the person running the query, and it also lets us provision compute resources more accurately at query time.
For the end user: More performant querying. At peak performance, we’ve seen that Loki can now scan log lines at up to 400GB/second, roughly 4x faster than before. The improved query planning, combined with a few other enhancements, makes queries much more responsive, and users can get to their vital log data faster.
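To put those figures in perspective, here is a rough back-of-the-envelope calculation using only numbers quoted in this post: about 15TB of logs per day, a peak scan rate of 400GB/second, and a roughly 4x speedup. It assumes decimal units, that the peak rate is sustained for the entire scan, and that the query has to read every byte, none of which will hold exactly for real queries. Treat it as an illustration of scale rather than a benchmark.

```python
# Figures quoted in the post. Decimal units (1 TB = 1000 GB) are an assumption.
DAILY_LOGS_GB = 15 * 1000      # about 15TB of logs in a typical day
PEAK_SCAN_GB_PER_S = 400       # peak scan rate reported with the TSDB index
SPEEDUP = 4                    # "roughly 4x faster than before"


def scan_seconds(volume_gb, rate_gb_per_s):
    """Time to scan a volume at a constant rate (an idealized assumption)."""
    return volume_gb / rate_gb_per_s


# Idealized time to scan one full day of logs at the peak rate.
tsdb_seconds = scan_seconds(DAILY_LOGS_GB, PEAK_SCAN_GB_PER_S)

# The implied earlier rate is the peak rate divided by the quoted speedup.
previous_seconds = scan_seconds(DAILY_LOGS_GB, PEAK_SCAN_GB_PER_S / SPEEDUP)

print(f"TSDB index: {tsdb_seconds:.1f}s, before: {previous_seconds:.1f}s")
```

Under these assumptions, a brute-force scan of a full day of logs drops from minutes to well under a minute.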
How to use Grafana Loki’s TSDB index
It is easy to get up and running with Grafana Loki’s TSDB index. Simply update the tsdb_shipper and schema_config sections of your configuration YAML file like so:
tsdb_shipper:
  active_index_directory: /data/tsdb-index
  cache_location: /data/tsdb-cache
  index_gateway_client:
    server_address: dns:///index-gateway.<namespace>.svc.cluster.local:9095
  query_ready_num_days: 7
  shared_store: gcs
schema_config:
  configs:
    - from: 2022-10-24
      store: boltdb-shipper  # <---
      object_store: gcs
      schema: v11
      index:
        prefix: index_
        period: 24h
    - from: 2022-11-30
      store: tsdb  # <---
      object_store: gcs
      schema: v12  # <---
      index:
        prefix: tsdb_index_  # <---
        period: 24h  # <---
In this example, we are telling Loki to use the TSDB index beginning on November 30, 2022, and the boltdb shipper before that. Loki is smart enough to work out which index applies behind the scenes. In the case of querying, for example, the user action stays the same, while the experience is much improved!
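To make the date-based switch more concrete, here is a minimal Python sketch of how a list of schema periods can be mapped onto a query's time range. This is not Loki's actual implementation: the `PERIODS` list is a simplified, hypothetical mirror of the schema_config example above, and `period_for` and `split_range` are illustrative names. It also assumes whole-day boundaries for simplicity.

```python
from datetime import date, timedelta

# Hypothetical, simplified mirror of the schema_config entries in the example.
PERIODS = [
    {"from": date(2022, 10, 24), "store": "boltdb-shipper", "schema": "v11"},
    {"from": date(2022, 11, 30), "store": "tsdb", "schema": "v12"},
]


def period_for(day):
    """Return the latest period whose start date is on or before `day`."""
    active = None
    for period in sorted(PERIODS, key=lambda p: p["from"]):
        if period["from"] <= day:
            active = period
    if active is None:
        raise ValueError(f"no schema period covers {day}")
    return active


def split_range(start, end):
    """Split an inclusive [start, end] day range into per-store segments."""
    ordered = sorted(PERIODS, key=lambda p: p["from"])
    segments = []
    for i, period in enumerate(ordered):
        seg_start = max(start, period["from"])
        if i + 1 < len(ordered):
            # A period ends the day before the next one begins.
            seg_end = min(end, ordered[i + 1]["from"] - timedelta(days=1))
        else:
            seg_end = end
        if seg_start <= seg_end:
            segments.append((seg_start, seg_end, period["store"]))
    return segments


# A query spanning the cutover touches both index types.
for seg in split_range(date(2022, 11, 28), date(2022, 12, 2)):
    print(seg)
```

A query that falls entirely on one side of the cutover yields a single segment, while one that spans November 30 is split in two, which is why the user-facing query does not need to change.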
Please note that this feature is experimental. We are still in the early stages, but are very excited to share it. As the community tries out the TSDB index, and as we extend our own usage of it at Grafana Labs, we will continue to share optimizations, enhanced documentation, and other resources to help users fully realize the power that the TSDB index unleashes.
Other notable enhancements to Loki 2.7
A number of other enhancements are included in this Loki 2.7 release, such as:
- Promtail support for max stream limit
- Better support for Azure blob storage
And if you’re interested in learning more about the full Grafana LGTM stack (Loki for logs, Grafana for visualization, Tempo for traces, Mimir for metrics), you can watch our ObservabilityCON 2022 session “LGTM: Scale observability with Mimir, Loki, and Tempo” on demand.
Thank you to all of the Grafana Loki users and contributors who help grow the Loki project!