Statistical significance
Updated 2026-04-19
Statistical significance tells you that an observed difference between variants is unlikely to have arisen by chance. The standard threshold is 95% confidence, i.e. p < 0.05.
P-value
The probability of observing a difference at least as large as the one you saw, assuming there is actually no difference between variants. Lower values mean stronger evidence that the effect is real.
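To make the definition concrete, here is a minimal sketch of how a p-value is computed for the most common A/B-test case, two conversion rates. The function name and the example counts are illustrative, not from any particular tool; it uses the standard pooled two-proportion z-test with a normal approximation, which is reasonable at typical A/B sample sizes.

```python
from math import sqrt, erfc

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    Normal approximation; fine for the large samples typical of A/B tests.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return erfc(abs(z) / sqrt(2))  # two-sided p-value

# Hypothetical experiment: 500/10,000 conversions vs 560/10,000
p = two_proportion_p_value(500, 10_000, 560, 10_000)
```

A 0.6-point absolute lift on these sample sizes lands just above p = 0.05, which is exactly the kind of borderline result the thresholds below are about.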
Common mistakes
- Peeking and stopping the test as soon as it crosses significance (a form of p-hacking)
- Ignoring base rates: a large relative lift on a tiny baseline is often noise
- Running many tests or metrics without correcting the significance threshold
- Interpreting a non-significant result as proof of 'no effect'; absence of evidence is not evidence of absence
Practical threshold
95% confidence plus a pre-committed minimum sample size. Use lower confidence for low-stakes changes and higher confidence for high-stakes ones.
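The minimum sample size can be estimated up front from the baseline rate and the smallest lift you care about. A rough per-variant calculation under the usual normal approximation (function name and the 5%-baseline example are illustrative):

```python
from statistics import NormalDist

def min_sample_size(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate per-variant sample size for a two-sided two-proportion test.

    base_rate: baseline conversion rate (e.g. 0.05 for 5%)
    relative_lift: smallest relative lift worth detecting (e.g. 0.10 for +10%)
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return int(n) + 1

# Detecting a 10% relative lift on a 5% baseline at 95% confidence / 80% power
n = min_sample_size(0.05, 0.10)
```

Note how quickly the requirement grows: halving the detectable lift roughly quadruples the required sample, which is why small-lift tests on low-traffic pages rarely reach significance honestly.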
What to do with this
- Match the confidence level to the stakes: 99% on checkout pricing, 90% on button microcopy
- Default to 95% for most tests; it balances speed and reliability for typical CRO work
- Don't confuse significance with importance: a statistically significant 2% lift might not be worth the engineering cost
- Report both the point estimate and the confidence interval; ranges communicate uncertainty better than single numbers
- Pre-register your significance threshold in the test hypothesis; changing it after seeing the data is p-hacking
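The "report both the point estimate and the confidence interval" advice above can be sketched in a few lines. This uses the unpooled normal-approximation interval for the difference of two proportions; the function name and example counts are illustrative:

```python
from math import sqrt

def lift_with_ci(conv_a, n_a, conv_b, n_b, z=1.96):
    """Absolute lift (variant minus control) and its ~95% confidence interval.

    Unpooled standard error, normal approximation; z=1.96 gives 95% coverage.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    lift = p_b - p_a
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return lift, (lift - z * se, lift + z * se)

lift, (lo, hi) = lift_with_ci(500, 10_000, 560, 10_000)
# Report the range, not just the point: "+0.6 pp lift, 95% CI roughly
# [-0.02 pp, +1.22 pp]". The interval crossing zero says more than "p > 0.05".
```

An interval that barely crosses zero and an interval centered far from zero both show up as "not significant" or "significant", but they call for very different decisions; that is what the single number hides.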