Company: Builder.ai
Industry: Software & Technology
Builder.ai is an AI-powered, composable software platform designed to unlock human potential by empowering a wide range of users—from entrepreneurs and students to enterprises and SMBs—to bring their digital ideas to life without the need to navigate technical complexities.
Challenge
Builder.ai faced a challenge in monitoring its Builder Developer Surface, a virtual desktop solution for developers. The platform operates over 20 Kubernetes clusters across six geographic regions and two cloud providers, generating a vast amount of metrics and logs. Builder.ai initially relied on an in-house stack using Grafana, Prometheus, and Elastic, but this setup proved difficult to manage: it demanded heavy maintenance, Prometheus suffered from persistent memory issues, and the team had limited capacity to keep the stack running.
Solution
Adopting Grafana Cloud as a centralized observability platform enabled Builder.ai to replace its in-house solution. The transition was swift, thanks to the team’s existing knowledge of Grafana and guidance from the Grafana Labs professional services team. Within two weeks, the team achieved centralized visibility across clusters and no longer had to manage and scale the underlying observability infrastructure.
Impact
- Centralized visibility: Consolidated more than 20 self-hosted dashboards into a single, centralized view.
- Ease of deployment: Completed migration within two weeks, leveraging Grafana’s pre-configured tools, professional services, and Helm charts.
- Cost control: Leveraged recommendations from Grafana Cloud Adaptive Metrics to reduce metrics costs.
- Custom dashboards: Developed targeted views for specific use cases, such as monitoring virtual desktop health checks across clusters and alerting the team when a cluster reaches capacity.
- Efficient log management: Utilized Grafana Cloud Logs for cost-effective log storage, enabling insights into cluster performance and debugging without high storage costs.
- Automated health checks: Built a health-check API to automate issue detection and resolution, minimizing manual interventions.
- Long-term trend analysis: Shifted focus from immediate data to longer-term insights, enhancing Builder.ai’s ability to address persistent issues and optimize cluster health.
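As an illustration of the health-check and capacity-alerting items above, the following is a minimal sketch and not Builder.ai's actual implementation. The `ClusterHealth` fields, the `clusters_needing_attention` helper, the cluster names, and the 90% capacity threshold are all hypothetical, chosen only to show how per-cluster reports could be reduced to a short list of alerts.

```python
from dataclasses import dataclass


@dataclass
class ClusterHealth:
    """Hypothetical health report for one cluster (all fields are assumptions)."""
    name: str
    region: str
    desktops_running: int
    desktops_capacity: int
    failed_checks: int


def clusters_needing_attention(reports, capacity_threshold=0.9):
    """Return (cluster name, reason) pairs for clusters that should raise an alert."""
    alerts = []
    for r in reports:
        # Treat a cluster with no declared capacity as full, so it is surfaced.
        if r.desktops_capacity:
            utilization = r.desktops_running / r.desktops_capacity
        else:
            utilization = 1.0
        if r.failed_checks > 0:
            alerts.append((r.name, "health checks failing"))
        elif utilization >= capacity_threshold:
            alerts.append((r.name, "near capacity"))
    return alerts


# Illustrative data only; names and numbers are made up.
reports = [
    ClusterHealth("cluster-a", "region-1", 45, 50, 0),
    ClusterHealth("cluster-b", "region-2", 10, 50, 0),
    ClusterHealth("cluster-c", "region-3", 20, 50, 2),
]
alerts = clusters_needing_attention(reports)
for name, reason in alerts:
    print(f"{name}: {reason}")
```

In a real deployment, logic like this would more likely live in alert rules evaluated against the collected metrics than in application code; the sketch only makes the decision rule explicit.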
“With Grafana Cloud, you can actually see what’s going on and you can see your costs. I can’t recommend Adaptive Metrics enough from a cost control point of view, because it does affect, ultimately, how many metrics you end up paying for. And again… it’s literally the press of a button.”
James Dobson, Technical Lead in Developer Surface Platform