Grafana Loki 2.2 released: Multi-line logs, crash resiliency, and performance improvements
I imagine everyone is long since tired and bored with their Loki 2.1 end of year/holiday gift, so I’m here today to bring some really exciting news. Loki 2.2 is released!!!
Lots of new features are in this release, but worth celebrating in particular is that the single most requested feature for Loki has been added: support for multi-line logs!
Another major milestone was reached for Loki in 2.2:
Amazing work by Owen to finally bring a write-ahead log (WAL) to Loki’s ingester! This means sudden crashes of a Loki process should not result in losing any logs. We have been running Loki with the WAL enabled on all our clusters for several months now, and our use and abuse of Loki flushed out a few interesting bugs along the way.
We always run our clusters with replication, but it still feels awesome to have a WAL further protecting everyone’s logs.
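For readers new to the idea, here is a minimal sketch of how a write-ahead log protects in-memory data. It is a toy illustration in Python, not Loki's actual implementation: the `TinyWAL` class, the JSON-lines file format, and the `demo` helper are all assumptions made up for this example. The key point is the ordering: each entry is forced to disk before it is accepted in memory, so a restart can rebuild state by replaying the file.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy write-ahead log (hypothetical, not Loki's code)."""

    def __init__(self, path):
        self.path = path
        self.entries = []  # in-memory state a crash would otherwise lose
        self._replay()
        self._file = open(path, "a", encoding="utf-8")

    def _replay(self):
        # On startup, rebuild in-memory state from whatever reached disk.
        if not os.path.exists(self.path):
            return
        good_bytes = 0
        with open(self.path, "rb") as f:
            for raw in f:
                if not raw.endswith(b"\n"):
                    break  # torn final record from a crash mid-write
                try:
                    record = json.loads(raw)
                except ValueError:
                    break  # corrupt record: stop replaying here
                self.entries.append(record)
                good_bytes += len(raw)
        # Drop any torn tail so new appends start on a clean boundary.
        with open(self.path, "r+b") as f:
            f.truncate(good_bytes)

    def append(self, stream, line):
        record = {"stream": stream, "line": line}
        self._file.write(json.dumps(record) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())  # force the record to disk first
        self.entries.append(record)    # only then accept it in memory

    def close(self):
        self._file.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    wal = TinyWAL(path)
    wal.append('{app="api"}', "request started")
    wal.append('{app="api"}', "request finished")
    wal.close()  # stand-in for a crash: the in-memory list is simply discarded
    recovered = TinyWAL(path)  # a fresh process replays the log
    lines = [entry["line"] for entry in recovered.entries]
    recovered.close()
    return lines
```

Calling `demo()` writes two entries, throws away the in-memory copy, and recovers both lines from disk. A real WAL also has to deal with checkpointing, segment rotation, and bounding replay time, which this sketch ignores.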
Performance improvements and fixes
2.2 also contains an impressive list of performance improvements and fixes. We’ve been operating clusters with the new 2.0 features for four or more months now and have found a number of ways to make things faster and more efficient.
Optimizations around JSON parsing, label hints, and other query path optimizations saved us about 30 cores on our queries! Not everyone may see such a significant improvement, but I’m quite sure everyone will see some!
Cyril Tovena also set his magical pprof skills loose on Loki’s write path, and many improvements were found, such as decreasing our distributor memory consumption by 8x!
There were more than 200 PRs merged between 2.1 and 2.2, and all of those were painstakingly sorted and arranged in the Changelog. I encourage everyone to check it out, and more importantly, always read the Upgrade guide to make sure your upgrade process is smooth.
Thanks to Danny for PR 3280, which expands the JSON parser to be able to select specific JSON elements as well as access elements inside an array. Also, Danny’s first PR for Loki, 3126, while less impressive than 3280, is my favorite because it was the PR that set events in motion for him to end up on the Loki squad. Congrats Danny!
Kavi set out to solve one of the biggest pain points our Docker driver users have experienced: PR 2898 fixes the Docker driver blocking on shutdown when Loki was unavailable to receive logs. More exciting for me, though, is PR 3083, which adds a Promtail target that can listen on Google Pub/Sub topics, allowing us to set up log sinks for our Google Cloud logs and ingest them into Loki.
Last but not least for my highlight list: Cyril generously spent some of the time he normally uses playing games when he boots into Windows, to instead add Windows event log support into Promtail in PR 3246. Your gaming sacrifice is sincerely appreciated.
We hope the significant performance improvements and new features in 2.2 were worth the wait. We are extremely excited for what’s to come in 2021!
The easiest way to get started with Loki is Grafana Cloud, and we’ve recently added a free plan that includes 50GB of logs and upgraded our paid plans. If you’re not already using Grafana Cloud, sign up today for free and see which plan meets your use case.