The value of performance baselines

Without a performance baseline, every test result is just a number with no context. You might know that your API responded in 350 ms, but is that good or bad? A baseline gives you the answer by establishing a known-good benchmark that you measure future changes against.

A performance baseline captures three key metrics under a realistic load:

  • Response time (latency): How long it takes for your system to respond to a request. Averages hide outliers, so baselines use percentiles instead. The p95 value means 95% of requests completed within that time, and the p99 value means 99% did, so only the slowest 1% of requests took longer. If your p95 is 200 ms but your p99 is 2 s, about 1 in 100 requests takes 2 s or longer, and those users are having a significantly worse experience.
  • Throughput (requests per second): The number of completed requests your system handles per second. Throughput tells you about capacity. If throughput plateaus while the number of virtual users keeps increasing, your system has most likely hit a bottleneck.
  • Error rate: The percentage of requests that return an error. A healthy baseline typically has an error rate near zero. A sustained increase over the baseline signals a likely regression.
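
To make the three metrics concrete, here is a minimal sketch of how they could be computed from raw request samples. The `Sample` type, the `summarize` function, and the nearest-rank percentile method are assumptions chosen for illustration. A load testing tool may use a different percentile method and report slightly different values.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One completed request (hypothetical shape, for illustration only)."""
    latency_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples, duration_s):
    """Compute the three baseline metrics for one test run."""
    latencies = [s.latency_ms for s in samples]
    errors = sum(1 for s in samples if not s.ok)
    return {
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "throughput_rps": len(samples) / duration_s,
        "error_rate_pct": 100 * errors / len(samples),
    }
```

For example, calling `summarize(samples, duration_s=60)` on the samples from a 60-second run returns a dictionary that you could store as the baseline record.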

These three metrics together tell a complete story. Latency tells you how fast, throughput tells you how much, and error rate tells you how reliably your system performs. Recording these values under controlled conditions gives you a benchmark you can use to detect regressions, validate deployments, and set meaningful SLOs.
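
As a rough illustration of regression detection, the sketch below compares a new run against a stored baseline. The function name `find_regressions`, the dictionary keys, and both thresholds (a 10% relative tolerance and a 0.5 percentage point allowance for error rate) are assumptions, not recommended values. Choose tolerances that match the normal run-to-run variation in your own tests.

```python
def find_regressions(baseline, current, tolerance_pct=10.0, error_margin_pct=0.5):
    """Return the names of metrics that are worse than the baseline allows.

    Thresholds are illustrative assumptions; tune them for your system.
    """
    factor = tolerance_pct / 100
    regressions = []

    # Latency regresses when it rises beyond the relative tolerance.
    for key in ("p95_ms", "p99_ms"):
        if current[key] > baseline[key] * (1 + factor):
            regressions.append(key)

    # Throughput regresses when it falls beyond the relative tolerance.
    if current["throughput_rps"] < baseline["throughput_rps"] * (1 - factor):
        regressions.append("throughput_rps")

    # Error rate uses an absolute margin, because a relative change
    # is not meaningful when the baseline is near zero.
    if current["error_rate_pct"] > baseline["error_rate_pct"] + error_margin_pct:
        regressions.append("error_rate_pct")

    return regressions
```

An empty list means the run stayed within tolerance on all three metrics. A check like this could gate a deployment, with the same thresholds serving as a starting point for SLO targets.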

In the next milestone, you design a load profile that simulates realistic traffic so your baseline reflects real-world conditions.

