New in Prometheus v2.19.0: Memory-mapping of full chunks of the head block reduces memory usage by as much as 40%

Published: 10 Jun 2020 RSS

The just-released Prometheus v2.19.0 introduces the new feature of memory-mapping full chunks of the head (in-memory) block from disk, which reduces memory usage and also makes restarts faster. I will be talking about this feature in this blog post.

The memory-mapping

The head block is the in-memory part of the time series database (TSDB) embedded in Prometheus. It stores the last 1–3 hours of data in memory. The data is flushed to disk every 2 hours to form persistent blocks. In the head block, we store the samples in the form of compressed chunks; each chunk consists of up to 120 samples, and when we create a new chunk, the old one is said to be “full.”

A full chunk is not modified once a new chunk is cut for the series. We use this information to flush chunks to disk whenever they are full and only store references to the chunks (file index and offset in file) in memory. We memory-map the file on disk, which means the operating system loads the chunks into memory when needed. With this, depending on the scrape interval, the majority of the sample data stays on disk while the memory usage is reduced overall.

Benchmark results

In the benchmarks that we run on PRs on GitHub by deploying actual Prometheus instances, we saw 20–40% reduction in memory usage. The difference in reduction depends on the churn: With high churn, some of the series will never fill a chunk up to 120 samples and hence will take more memory throughout.

The maximum reduction is seen for short scrape intervals and low churn, as chunks get full more quickly. If you have a very long scrape interval, like 1 minute or more, you will see only a very minimal effect on memory consumption, as it takes longer to fill up a chunk.

As a side effect of memory-mapping chunks, during the replay of the Write Ahead Log (WAL) we can discard all samples within the time range that is already covered in the memory-mapped chunks. This showed a reduction of 20–30% replay time in the benchmark.

Here are some graphs:

Decent memory reduction with high churn (10-20%)

Decent memory reduction with high churn (10-20%)

Massive memory reduction with little to no churn (30-40%)

Massive memory reduction with little to no churn (30-40%)

The improvements are not limited to Prometheus. Other projects in the Prometheus ecosystem like Cortex and Thanos, which import the Prometheus TSDB, also benefit from this change. Here is a graph from Cortex with block storage, where not only the memory usage is reduced, but so are the spikes in memory every 2 hours.

Cortex with block storage

Some things to keep in mind

It is natural to assume that you can now reduce the resource allocation for Prometheus as it’s using less memory. But it cannot be ignored that the chunks are loaded into memory when required. So if you run heavy queries that would touch a lot of series simultaneously for the past few hours of data, then Prometheus is going to take up a little more memory than the ideal reduction. Plan the resources keeping this in mind.

Learn more about Prometheus

Here is the full changelog for Prometheus v2.19.0 and binaries along with it. Upgrade now to see memory-mapping of head chunks in action!

For more about what’s new in Prometheus, check out the on-demand recording of Goutham Veeramachaneni’s GrafanaCONline talk, Prometheus: What the future holds.

Related Posts

Today we're releasing Cortex 1.1, the first (minor) release since Cortex went GA in March. We've improved reliability and performance -- read more to find out how.
At GrafanaCONline, Grafana Labs Senior Software Engineer Ed Welch detailed how he gets the most out of his Nissan Leaf battery using Grafana, Cortex and Loki. Also learn about creating a Raspberry Pi-based desktop Kubernetes cluster.
Without isolation, queries in Prometheus could only see a fraction of the samples ingested in a scrape. Here's how Prometheus v2.17 fixed all of that.

Related Case Studies

Hiya migrated to Grafana Cloud to cut costs and gain control over its metrics

To scale Prometheus, says Senior Software Engineer Jake Utley, Grafana Cloud was ‘the most in line with what we wanted to accomplish.’

"We wanted the ability to look at our own information and understand it from top to bottom."
– Dan Sabath, Senior Software Engineer, Hiya

How Gojek is leveraging Cortex to keep up with its ever-growing scale

Gojek’s Lens monitoring system has 40+ tenants, for which Cortex handles about 1.2 million samples per second.

"The goal is to make sure that whenever a new service or team is created, they automatically get onboarded to the monitoring platform."
– Ankit Goel, Product Engineer, Gojek

DigitalOcean gains new insight with Grafana visualizations

The company relies on Grafana to be the consolidated data visualization and dashboard solution for sharing data.

"Grafana produces beautiful graphs we can send to our customers, works with our Chef deployment process, and is all hosted in-house."
– David Byrd, Product Manager, DigitalOcean