What to test
Updated 2026-04-18
There are hundreds of things you could test. Most aren't worth the effort. The highest-leverage tests sit at the top of a short list (market, offer, headline) and decay from there. This page is that checklist, ordered by expected impact.
Tier 1. Offer-level tests (5–200% lift possible)
Price
- Higher price vs. lower
- Three-tier vs. single-tier
- Monthly vs. annual pricing
- Pay-in-full vs. installments
Offer structure
- With bonuses vs. without
- One bonus vs. a stack of five
- Digital-only vs. digital + physical component
Guarantee
- 30-day vs. 90-day vs. 1-year money-back
- Standard vs. better-than-money-back
- Conditional (outcome-based) vs. unconditional
Urgency / scarcity
- Hard deadline vs. open enrollment
- Cohort-based vs. rolling
- Bonus expiration vs. price increase as the urgency driver
Tier 2. Headline + hook (5–50% lift)
- Outcome-focused ("How to hit quota without cold calls")
- Specificity variant ("The 7 emails that…" vs. "Emails that…")
- Problem-named vs. benefit-named
- Story-driven vs. claim-driven
- Mechanism-led ("the 14-minute process") vs. benefit-led
- Length variants (short vs. long headlines)
Tier 3. Landing page structure (5–30% lift)
- Long form vs. short form
- Video vs. text hero
- Single-CTA page vs. multiple CTAs throughout
- Above-the-fold with video vs. image vs. animation
- Social proof early vs. late
- FAQ section vs. no FAQ
- Testimonial placement: throughout vs. one dedicated section
Tier 4. Ad creative (paid channels; 10–100% CTR lift possible)
- UGC-style vs. polished production
- Founder face-on-camera vs. no face
- Static vs. video
- Problem-first hook vs. benefit-first
- Testimonial-based vs. brand-voice
- Different opening 3 seconds on video
- Caption / subtitle styles
Tier 5. Email
- Subject line variants
- Plain text vs. HTML
- Length (short vs. long)
- Sent time and day
- From name (founder vs. brand)
- Single CTA vs. multiple CTAs
- Story-led opening vs. benefit-led opening
Tier 6. Audience / targeting (paid)
- Lookalike 1% vs. 5% vs. 10%
- Interest-based vs. lookalike
- Broad targeting (let the algorithm decide) vs. narrow targeting
- Different seed audiences for lookalikes
- Retargeting windows (7-day vs. 30-day)
Tier 7. Form / checkout
- Number of fields
- Single-page vs. multi-step checkout
- Address fields now vs. after purchase
- Mobile form design
- Guest checkout vs. account required
- Payment methods offered
Tier 8. Small design / copy
- Button text
- Button color (usually noise, but sometimes matters)
- Hero image variants
- Font family for body copy
- Sub-headlines
- Pricing display (strike-through vs. none, value anchoring)
The "impact estimate" filter
Before running a test, estimate: if this wins, how much lift would it produce?
- Expected lift > 20%: run it, priority
- Expected lift 10–20%: run it, normal queue
- Expected lift 5–10%: only if cheap to run
- Expected lift < 5%: skip
If you're consistently running tests in the < 5% range, you're optimizing the wrong things. Go back up the tier list.
Test one variable, not three
The temptation: "let's test a new headline, new image, and new button all at once and compare to the old." Result: you learn nothing about which element actually drove the change.
Test one thing at a time. If you've already found a winning combination through isolated tests, then test the stacked combination against the old stack, but even then, understand you're testing systems, not variables.
Sequential vs. concurrent
You can only run one test per page at a time (if you run two, they contaminate each other). But you can run tests on different pages simultaneously. A mature operation runs 3–6 independent tests in parallel across different funnels.
The "we already know what works" trap
The moment a team says "we know our headline is best, no need to test," conversion stops improving. Markets shift. Audiences shift. What won in Q1 often loses in Q4. The control is always subject to being dethroned.
Multi-armed bandit vs. A/B
Advanced: multi-armed bandit algorithms dynamically shift traffic toward winners during the test. Good for long-running optimization where you value ongoing performance over clean A/B comparisons. Most teams are better off with straight A/B until they've exhausted obvious tests.
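To make the bandit idea concrete, here is a minimal Thompson-sampling sketch (one common bandit algorithm; the simulated 3% vs. 6% conversion rates are made up for illustration):

```python
import random

def thompson_pick(successes, failures):
    """Pick the arm with the highest draw from its Beta(s+1, f+1) posterior.

    As evidence accumulates, draws from the better-converting arm win more
    often, so traffic drifts toward the winner mid-test -- unlike a fixed
    50/50 A/B split.
    """
    draws = [random.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=lambda i: draws[i])

random.seed(0)  # reproducible simulation
true_rates = [0.03, 0.06]          # hypothetical: variant 1 converts 2x better
successes, failures = [0, 0], [0, 0]
for _ in range(5000):
    arm = thompson_pick(successes, failures)
    if random.random() < true_rates[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

# Most traffic ends up on the better variant
print([successes[i] + failures[i] for i in range(2)])
```

The trade-off is exactly the one stated above: you earn more conversions during the test, but the losing arm gets starved of traffic, so the final comparison is noisier than a clean A/B split.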
Related: Scientific testing · Controls + challengers · Measurement