Cortex v1.0 released: The highly scalable, fast Prometheus implementation is generally available for production use
We’re happy to announce that Cortex v1.0 has been released! The horizontally scalable, durable, and fast Prometheus implementation is now generally available for production use.
At Grafana Labs, we’ve been using Cortex in production for almost three years, including to power the Prometheus backend for the Grafana Cloud managed logging and metrics platform. We’re confident that for large organizations looking for an “enterprise-ready” Prometheus system that can deploy at high scale and reliability, Cortex is the answer.
In fact, over the past year, adoption has hit the mainstream, with Cortex – a Cloud Native Computing Foundation Sandbox project – helping numerous enterprises adopt Prometheus. And the number of stargazers on GitHub has shot up.
Highlights of v1.0
The v1.0 release of Cortex brings numerous changes to make Cortex easy to use and production ready. These changes include:
- Production documentation detailing the steps necessary to build a production-ready Cortex deployment.
- Turn-key Grafana dashboards and ready-made Prometheus alerts, the very same we use internally at Grafana Labs to run multiple production Cortex clusters at massive scale.
- Stability and backwards compatibility guarantees to give you the confidence you need to rely on Cortex.
- An easy-to-use, single-process “airplane” mode for getting started and experimenting with Cortex.
The Cortex v1.0 release can be found on our GitHub page.
Grafana Labs and Cortex
Since the project was started in 2016, the focus has been on delivering an infinitely scalable solution. Grafana Labs got involved in 2018 through the acquisition of Kausal. Some recent highlights of our engineering work include:
- Deduplicating metrics from HA Prometheus pairs to enable high availability on the client side.
- Blazing fast PromQL queries, allowing you to run real-time queries against tens of billions of data points.
- How we (ab)used Consul to improve efficiency and scale.
- Introducing the Grafana Agent to reduce resource usage and enable different deployment models when sending metrics to Cortex.
- We’ve used gossip to reduce dependencies and make Cortex easier to use.
Over the past three years, Cortex has been used in production both at Grafana Labs and at companies like Gojek. A decacorn startup, Gojek serves millions of users across Southeast Asia with its mobile wallet, GoPay, and more than 20 services on its super app. To support multiple markets, the systems team at Gojek focused on building an infrastructure for speed, reliability, and scale. By 2019, the team realized it needed a new monitoring system that could keep up with Gojek’s ever-growing technology organization, which led them to Cortex. A year later, Gojek’s Lens monitoring system has 40+ tenants, for which Cortex handles about 1.2 million samples per second. For more on Gojek’s use of Cortex, read the full case study.
I want to give a big thank-you to the 97 contributors to Cortex, and in particular a shout-out to Bryan Boreham, Ken Haines, Thor Hansen, Jacob Lisi, Chris Marchbanks, Marco Pracucci, Peter Štibraný, Sandeep Sukhani, Goutham Veeramachaneni, Ganesh Vernekar, and Julius Volz for their work over the past few years. We wouldn’t have been able to do this without you.