Prometheus v2.11 Released

Published: 11 Jul 2019 by Ganesh Vernekar

Since graduating within the CNCF last August, Prometheus has moved to a six-week release schedule. The latest release, v2.11, arrived on July 9. It includes a new option to compress WAL records using Snappy, query performance improvements, the option to use Alertmanager API v2, and more.

What’s New in Prometheus v2.11

You can download the latest version here.

Change in Metric Names

prometheus_tsdb_wal_reader_corruption_errors has been renamed to prometheus_tsdb_wal_reader_corruption_errors_total. Dashboards and alerting rules that reference the old name need to be updated.

Additionally, a new metric, prometheus_http_requests_total, has been added. It is a counter of HTTP requests made to the Prometheus API.

Option to Use Alertmanager API v2

For users who have upgraded to the latest version of Alertmanager, this release adds a configuration option (api_version in the Alertmanager section of the Prometheus configuration) that makes Prometheus send alerts to the new API v2 endpoint instead of the default v1 endpoint.

InitContainers in Kubernetes Service Discovery

The new release adds initContainers to Kubernetes service discovery. The original feature request argued that scraping them would be useful because they may expose troubleshooting information, help users detect problems, and show how long initialization takes.

Compression of WAL Records

This feature, built by Chris Marchbanks, provides the option to compress WAL records using Snappy, which significantly reduces the size of the WAL. It is off by default and is enabled with the --storage.tsdb.wal-compression flag.

Query Performance Improvement

I worked on this enhancement, which makes iteration and search in the HashForLabels and HashWithoutLabels functions more efficient. It avoids copying slice data while iterating over labels, and uses a faster algorithm. We use the same query engine inside Cortex and noticed that these functions were taking longer than they should. Once I dug in and optimized them, we saw some of our heavy queries take 30-40% less time. This is a good example of Cortex developments flowing upstream as enhancements!
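
To illustrate the kind of algorithmic change involved, here is a minimal Python sketch, not the actual Prometheus code (which is written in Go). It assumes that both the label set and the list of wanted names are sorted, so that one merge-style pass can replace a nested search. The name hash_for_labels, the separator byte, and the use of BLAKE2 as the hash are stand-ins chosen for this example.

```python
import hashlib

# Separator between names and values (an assumption for this sketch);
# 0xff never occurs in valid UTF-8, so it cannot collide with label text.
SEP = b"\xff"


def hash_for_labels(labels, names):
    """Hash only the labels whose name appears in `names`.

    `labels` is a list of (name, value) pairs and `names` a list of
    strings; both must be sorted by name. Two indexes advance through
    the lists in step, so each element is visited at most once.
    """
    h = hashlib.blake2b(digest_size=8)
    i = j = 0
    while i < len(labels) and j < len(names):
        name, value = labels[i]
        if names[j] < name:
            j += 1  # wanted name is absent from this series
        elif name < names[j]:
            i += 1  # label is not one we were asked to keep
        else:
            h.update(name.encode() + SEP + value.encode() + SEP)
            i += 1
            j += 1
    return h.hexdigest()
```

With this approach, two series that differ only in labels outside `names` hash to the same value, which is what grouping in an aggregation such as sum by (job) needs.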

Allocations in PromQL Aggregations Reduced

Thomas Jackson, who worked on this enhancement, reports that his benchmarks on aggregation showed that this reduced allocations by ~5%, with a ~10% time improvement.

Remote-Write Allocation Improvements

Chris Marchbanks improved allocations for remote write by removing an unneeded temp variable, only allocating pendingSamples once per shard, allocating the send slice far fewer times, using a mask rather than always copying samples into a temporary slice, and allocating a Snappy buffer per shard rather than creating one per request.

Regexp Matchers for Set Lookups

Goutham Veeramachaneni and I are mentoring a Google Summer of Code student, Alec Wang, who worked on this optimization of queries that use a regexp for set lookups. This makes queries from Grafana dashboards faster, because Grafana commonly sends multi-value selections as a regexp that is just an alternation of literal values, such as a|b|c.
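
The idea can be sketched in a few lines of Python. This is an illustrative sketch rather than the Prometheus implementation: the function names find_set_matches and make_matcher and the exact list of metacharacters are assumptions made for this example. A pattern that contains nothing but literal values separated by | is turned into a set lookup, and anything else falls back to a fully anchored regexp match.

```python
import re

# Characters that give a pattern regexp meaning beyond plain alternation.
_META = set(".*+?()[]{}^$\\")


def find_set_matches(pattern):
    """Return the literal alternatives of `pattern`, or None if the
    pattern is not a plain list such as "api|web|db"."""
    if not pattern or any(ch in _META for ch in pattern):
        return None
    return pattern.split("|")


def make_matcher(pattern):
    """Build a predicate for label values, preferring a set lookup."""
    values = find_set_matches(pattern)
    if values is not None:
        wanted = frozenset(values)
        return lambda v: v in wanted  # hash lookup, no regexp engine
    compiled = re.compile(pattern)
    # Label matchers are anchored, so require a full match.
    return lambda v: compiled.fullmatch(v) is not None
```

For a matcher such as job=~"api|web|db", make_matcher("api|web|db") checks each candidate value against a three-element set, while a pattern like "api.*" still goes through the regexp engine.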

Fix Unsafe Snapshots

A collaborative effort by Krasi Georgiev and Bartek Płotka, this fixes a bug in taking Prometheus snapshots that include the head block.

Release Notes

Check out the release notes for a complete list of new features, enhancements, and bug fixes.

Download

Head to the download page for download links and instructions.