With roughly 100 million global active users, Roblox understands that its success relies on healthy and scalable infrastructure. And that means having a healthy and scalable observability solution.

Over the past two years, the Roblox team has made significant changes to its observability platform – including telemetry metrics, logging, and tracing – resulting in memorable migration stories, some hard-earned lessons, and a great vision for the future. Join Director of Engineering Xiaofeng Han and Principal Engineer Ying Dai as they cover the journey of evolving observability and engineering practices at Roblox, collaborating with Grafana Labs while adopting Grafana for visualizing metrics and logs and Grafana Tempo for traces, and changing the culture at Roblox for the better.

Xiaofeng Han

Director of Engineering at Roblox

Xiaofeng Han is currently the head of observability at Roblox. He leads a team of 20+ engineers working on a wide range of monitoring, alerting, and debugging infrastructure. He is passionate about bringing intelligence into the observability domain to realize the full potential of observability data and the insights it holds. Before Roblox, Xiaofeng worked at Google for 10 years, where he led the tools and infrastructure teams supporting Google's SearchAds org. Xiaofeng obtained his Ph.D. in Computer Science from the University of Delaware. He has authored or co-authored more than 10 highly referenced research papers on engineer efficiency and optimization in wireless networks. Xiaofeng lives in San Jose with his wife and 14-year-old son Grant. He enjoys various outdoor activities and building robots with his son.

Ying Dai

Principal Engineer at Roblox

Ying Dai is currently a principal engineer at Roblox leading the Telemetry team, with the mission of building a unified telemetry platform for all Robloxians that is scalable, secure, and easy to use, and that provides actionable insights into how we are operating.