Baseline complete

What you accomplished:

Designed a realistic load profile with ramp-up, steady state, and ramp-down
Added checks that validate response correctness on every request
Ran a test to observe actual p95 latency, throughput, and error rate
Set thresholds based on measured values with appropriate headroom
Validated that your baseline test passes consistently
Stored results in Grafana Cloud k6 for historical comparison
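
As a recap, the pieces listed above usually come together in the options object of a k6 script. The sketch below is illustrative only: the durations, VU targets, and threshold limits are hypothetical placeholders, not the values from your run. It is written as a plain object so it can be inspected outside k6; in a real script you would export it as `export const options`.

```javascript
// Sketch only: every number here is a hypothetical placeholder.
// In a k6 script this would be written as `export const options = { ... }`.
const options = {
  stages: [
    { duration: '1m', target: 20 }, // ramp-up to 20 VUs
    { duration: '3m', target: 20 }, // steady state at 20 VUs
    { duration: '1m', target: 0 },  // ramp-down to 0 VUs
  ],
  thresholds: {
    // Fail the run if p95 latency reaches 500 ms (assumed limit).
    http_req_duration: ['p(95)<500'],
    // Fail the run if 1% or more of requests error out (assumed limit).
    http_req_failed: ['rate<0.01'],
    // Fail the run if 1% or more of checks fail (assumed limit).
    checks: ['rate>0.99'],
  },
};

console.log(JSON.stringify(options, null, 2));
```

Replace the placeholder limits with the values you derived from your own measured baseline.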

Skills unlocked

Skill	You can now
Load profile design	Create realistic traffic patterns with ramping VUs
Correctness checks	Validate responses are correct, not just fast
Threshold setting	Turn observed metrics into automated pass/fail criteria
Baseline management	Compare results over time and detect regressions

Examine your baseline

Look at your baseline results in Grafana Cloud k6 and consider:

Threshold headroom. How much buffer did you leave above your observed p95? Tighter thresholds catch regressions sooner. Loose thresholds avoid false failures from normal jitter.
Steady state stability. Were metrics steady during steady state, or did they drift? Drift can mean warm-up, a slow leak, or shifting dependencies.
Check failures. Did any checks fail during the baseline? If so, decide whether the threshold on the checks metric should tolerate that failure rate, or whether the failures point to a problem to fix first.
Confidence. If you ran this test three times, would you expect the same outcome? If not, list what might vary (data, time of day, cold caches, shared environments).
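
The review questions above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming you have copied a few numbers out of your results by hand. The helper names (`p95Threshold`, `relativeDrift`, `runSpread`) and all sample values are hypothetical and are not part of k6 or Grafana Cloud k6.

```javascript
// Hypothetical helpers for reviewing a baseline; none of these are k6 APIs.

// Turn an observed p95 (ms) plus a headroom fraction into a k6 threshold expression.
function p95Threshold(observedP95Ms, headroomFraction) {
  const limit = Math.round(observedP95Ms * (1 + headroomFraction));
  return `p(95)<${limit}`;
}

// Compare the mean of the last third of steady-state samples with the first third.
// A value well above 0 suggests upward drift (for example a slow leak).
function relativeDrift(samples) {
  const third = Math.floor(samples.length / 3);
  if (third === 0) throw new Error('need at least 3 samples');
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const early = mean(samples.slice(0, third));
  const late = mean(samples.slice(-third));
  return (late - early) / early;
}

// Spread of p95 across repeated runs, relative to the fastest run.
function runSpread(p95PerRun) {
  const min = Math.min(...p95PerRun);
  const max = Math.max(...p95PerRun);
  return (max - min) / min;
}

// Made-up example values:
console.log(p95Threshold(400, 0.25)); // p(95)<500
console.log(relativeDrift([100, 100, 100, 100, 110, 110]));
console.log(runSpread([400, 420, 410]));
```

If the spread between runs is larger than the headroom you chose, the threshold is likely to fail on normal variation alone, which is a sign to widen the headroom or stabilize the environment first.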

You will reuse this same review habit after every code change or infrastructure update.

What’s next

Your baseline answers “what does normal look like?” In the wrap-up module, “Apply the pattern to your own service” is a short transfer checklist for running the same workflow against a URL your team owns (still in a non-production environment).

After that, the next question is what happens when things are not normal. Stress, spike, and soak testing are the usual follow-on topics; use the k6 test types guide when you are ready to design those runs.

Excellent work. You now have something most teams don’t: an automated performance test that tells you whether your system is healthy without anyone having to interpret the results manually.

Think about what that means. Before, you had a test that produced numbers. Now, you have a test that produces a verdict: pass or fail. That’s the difference between a manual inspection and an automated quality gate.

Because k6 exits with a non-zero code when a threshold fails, your baseline test can run in a CI/CD pipeline and catch performance regressions before they reach production. It can also run on a schedule, alerting you when infrastructure changes affect performance. And it gives you a documented benchmark that the whole team can reference.

This is the foundation. Stress, spike, and soak tests each answer a different question about your system under load, and they all assume you have a baseline to compare against.