Webinar

Builder.ai - Observability Journey

Company: Builder.ai
Industry: Software & Technology

Builder.ai is an AI-powered, composable software platform designed to unlock human potential by empowering a wide range of users—from entrepreneurs and students to enterprises and SMBs—to bring their digital ideas to life without the need to navigate technical complexities.

Challenge

Builder.ai struggled to monitor its Builder Developer Surface, a virtual desktop solution for developers. The platform runs on more than 20 Kubernetes clusters across six geographic regions and two cloud providers, generating a large volume of metrics and logs. Builder.ai initially relied on an in-house stack built on Grafana, Prometheus, and Elastic, but this setup proved difficult to manage: it demanded heavy maintenance, Prometheus had persistent memory issues, and the team had limited capacity to keep the stack running.

Solution

Adopting Grafana Cloud as a centralized observability platform enabled Builder.ai to seamlessly replace its in-house solution. The transition was swift, thanks to the team’s existing knowledge of Grafana and guidance from the Grafana Labs professional services team. Within two weeks, the team achieved centralized visibility across clusters, while eliminating the burden of having to manage and scale their underlying infrastructure.

Impact

  • Centralized visibility: Consolidated more than 20 self-hosted dashboards into a single, centralized view.
  • Ease of deployment: Completed migration within two weeks, leveraging Grafana’s pre-configured tools, professional services, and Helm charts.
  • Cost control: Leveraged recommendations from Grafana Cloud Adaptive Metrics to reduce metrics costs.
  • Custom dashboards: Developed targeted views for specific use cases, such as monitoring virtual desktop health checks across clusters and alerting the team when a cluster reaches capacity.
  • Efficient log management: Utilized Grafana Cloud Logs for cost-effective log storage, enabling insights into cluster performance and debugging without high storage costs.
  • Automated health checks: Built a health-check API to automate issue detection and resolution, minimizing manual interventions.
  • Long-term trend analysis: Shifted focus from immediate data to longer-term insights, enhancing Builder.ai’s ability to address persistent issues and optimize cluster health.
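
As a rough illustration of the health-check and capacity-alerting ideas above, the following Python sketch flags clusters that are near capacity or failing health checks. It is not Builder.ai's implementation: the ClusterStatus fields, the function name, and both thresholds are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterStatus:
    """Hypothetical per-cluster snapshot; field names are assumptions."""
    name: str
    desktops_running: int
    desktops_capacity: int
    healthy_checks: int
    total_checks: int


def clusters_needing_attention(statuses, capacity_threshold=0.9, min_health=0.95):
    """Return (cluster name, reasons) for every cluster that breaches a threshold.

    Both thresholds are illustrative defaults, not values from the case study.
    """
    alerts = []
    for s in statuses:
        reasons = []
        # A cluster reporting zero capacity is treated as full.
        usage = s.desktops_running / s.desktops_capacity if s.desktops_capacity else 1.0
        if usage >= capacity_threshold:
            reasons.append("near capacity")
        # A cluster reporting no checks at all is treated as unhealthy.
        health = s.healthy_checks / s.total_checks if s.total_checks else 0.0
        if health < min_health:
            reasons.append("failing health checks")
        if reasons:
            alerts.append((s.name, reasons))
    return alerts


if __name__ == "__main__":
    sample = [
        ClusterStatus("cluster-a", 95, 100, 20, 20),
        ClusterStatus("cluster-b", 10, 100, 18, 20),
        ClusterStatus("cluster-c", 10, 100, 20, 20),
    ]
    for name, reasons in clusters_needing_attention(sample):
        print(name, ", ".join(reasons))
```

In a real deployment, the snapshots would presumably come from the metrics already collected for each cluster, and the alerts would be routed through the team's existing alerting setup rather than printed.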

With Grafana Cloud, you can actually see what’s going on and you can see your costs. I can’t recommend Adaptive Metrics enough from a cost control point of view, because it does affect, ultimately, how many metrics you end up paying for. And again… it’s literally the press of a button.

James Dobson, Technical Lead in Developer Surface Platform


Your guides

Utsav Preet
Associate Director of Engineering
Builder.ai
James Dobson
Technical Lead
Builder.ai