What is Experiment Design?
Experiment design is the discipline of structuring product tests so that results are reliable and actionable. It covers everything from forming a testable hypothesis to choosing the right test format to determining when you have enough data to make a decision.
Good experiment design separates signal from noise. Without it, teams run A/B tests that prove nothing because the sample was too small, the metric was wrong, or too many variables changed at once.
Why Experiment Design Matters
Every feature decision carries risk. Experiment design lets PMs reduce that risk by testing assumptions before committing engineering resources. A well-designed experiment can save months of wasted development by killing a bad idea in two weeks.
It also builds a culture of evidence over opinion. When the VP says "users want feature X" and the experiment says otherwise, good experiment design gives you the credibility to push back.
How to Design Good Experiments
Start with a hypothesis: "If we add social proof to the pricing page, trial-to-paid conversion will increase by 5% (relative)." The hypothesis must name a metric, a direction, and a magnitude; it must be specific and falsifiable.
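A quantified lift target also determines how much traffic the test needs. Here is a minimal sketch of the standard two-proportion sample-size calculation; the 10% baseline conversion rate is a hypothetical figure, and the 5% is read as a relative lift:

```python
from math import sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect a relative lift in a
    conversion rate with a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return int(numerator / (p2 - p1) ** 2) + 1

# Hypothetical 10% baseline trial-to-paid conversion, 5% relative lift:
n = sample_size_per_variant(0.10, 0.05)  # tens of thousands per arm
```

Even a modest relative lift on a 10% baseline requires tens of thousands of users per arm, which is why low-traffic products often fall back on fake door tests or surveys instead of A/B tests.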
Choose your method. A/B tests work for UI changes with high traffic. Fake door tests work for validating demand before building. Surveys work for qualitative signals.
Set your success criteria before launching. Define what "success" looks like numerically. This prevents post-hoc rationalization where you move goalposts to match results.
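A pre-registered decision rule can be written down explicitly before launch. This is a sketch, not a prescribed method: the counts and thresholds are hypothetical, and the significance check is a simple one-sided two-proportion z-test.

```python
from math import sqrt
from statistics import NormalDist

def ship_decision(conv_a, n_a, conv_b, n_b, alpha=0.05, min_lift=0.05):
    """Apply success criteria fixed before launch: ship only if the variant
    beats control by the pre-registered relative lift AND the difference is
    statistically significant (one-sided two-proportion z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 1 - NormalDist().cdf(z)
    observed_lift = (p_b - p_a) / p_a
    return p_value < alpha and observed_lift >= min_lift

# Hypothetical results: control converts 1000/10000, variant 1120/10000
decision = ship_decision(1000, 10000, 1120, 10000)
```

Because both thresholds are fixed up front, a result that is significant but below the minimum lift still reads as "do not ship", which is exactly the goalpost-moving the pre-registration prevents.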
Experiment Design in Practice
Booking.com runs thousands of concurrent experiments. Their key insight: most experiments fail. They design for fast failure by keeping experiments small, measuring early indicators, and automatically shutting down experiments that show negative impact.
Netflix uses interleaving experiments for their recommendation algorithm, where users see results from both the control and variant mixed together. Because every user contributes a preference signal, this requires far smaller sample sizes than traditional A/B splits.
Common Pitfalls
- Testing too many variables. Change one variable at a time, or use a proper multivariate design; otherwise you cannot attribute the result to any single change.
- Peeking at results early. Checking results daily and stopping when they look good inflates false positive rates. Fix the sample size in advance, or use a sequential testing method designed for continuous monitoring.
- Ignoring novelty effects. New features often see a temporary spike. Run experiments long enough for the novelty to wear off.
- No learning documentation. Record every experiment's hypothesis, result, and decision. This becomes your team's institutional knowledge.
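The cost of the peeking pitfall above can be made concrete with a small A/A simulation, where there is no true effect and every "significant" result is a false positive. All parameters here are hypothetical, and daily conversion counts are drawn from a normal approximation to the binomial:

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_aa(days=14, daily_n=1000, p=0.10, sims=2000, seed=42):
    """Run A/A experiments (no true effect) and count how often a two-sided
    z-test at alpha=0.05 fires when checked daily (peeking) versus only
    once at the end of the experiment."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)
    sigma = sqrt(daily_n * p * (1 - p))
    peek_hits = final_hits = 0
    for _ in range(sims):
        ca = cb = n = 0
        peeked = False
        for _ in range(days):
            # normal approximation to each day's binomial conversion count
            ca += rng.gauss(daily_n * p, sigma)
            cb += rng.gauss(daily_n * p, sigma)
            n += daily_n
            p_pool = (ca + cb) / (2 * n)
            se = sqrt(p_pool * (1 - p_pool) * (2 / n))
            if abs((cb - ca) / n / se) > z_crit:
                peeked = True  # a daily check would have "stopped the test"
        peek_hits += peeked
        final_hits += abs((cb - ca) / n / se) > z_crit  # final day only
    return peek_hits / sims, final_hits / sims

peek_rate, final_rate = simulate_aa()
```

In this simulation the single final check stays near the nominal 5% false positive rate, while fourteen daily checks push it several times higher, even though nothing changed between the arms.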
Related Concepts
Experiment design underpins hypothesis-driven development and is executed through methods like A/B testing, multivariate testing, and fake door tests. Results are analyzed using product analytics. For a structured approach to discovery, see continuous discovery.