Controls + challengers
Updated 2026-04-18
In direct response, the "control" is whichever ad, email, headline, or landing page is currently winning. The "challenger" is the hopeful replacement. Every mature direct-response operation runs controls and challengers as an ongoing discipline. Without the framework, testing is random. With it, testing becomes a compounding engine.
The definitions
- Control: the version currently running at full volume. The thing you're trying to beat.
- Challenger: a new version designed to outperform the control.
- Test: a controlled comparison in which the challenger gets a meaningful portion of traffic against the control.
- Winner: a challenger that beats the control with statistical significance.
- New control: once a challenger wins, it becomes the control. Start again.
The lifecycle
- A challenger is designed based on a hypothesis about why it'll beat control
- It runs against the control at, say, a 30/70 split
- Traffic accumulates until statistical significance
- If challenger wins: it becomes the new control; a new challenger is designed
- If it loses: it's retired; a new challenger is designed
- The loop never stops
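The loop above can be sketched as a few lines of Python. All the names here are illustrative, and the pass/fail test is a stand-in for a real significance test:

```python
def run_lifecycle(control, challengers, beats_control):
    """Run the control/challenger loop over a queue of challengers.

    `beats_control` is a placeholder for a real statistical test: it
    takes (challenger, control) and returns True if the challenger won.
    Winners become the new control; losers are retired.
    """
    history = [control]
    for challenger in challengers:
        if beats_control(challenger, control):
            control = challenger  # winner becomes the new control
        # either way, the next challenger in the queue runs
        history.append(control)
    return control, history

# Toy example: each candidate is (name, conversion_rate); the "test"
# just compares raw rates, ignoring significance for brevity.
control = ("headline-A", 0.021)
queue = [("headline-B", 0.019), ("headline-C", 0.026), ("headline-D", 0.024)]
winner, history = run_lifecycle(control, queue, lambda ch, co: ch[1] > co[1])
print(winner[0])  # headline-C
```

Note that headline-D loses even though it beats the *original* control: once headline-C wins, every later challenger has a higher bar to clear.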
Hypothesis-driven challengers
A challenger without a hypothesis is just randomness. Every challenger should be based on a belief about why it might do better:
- "Current headline is benefit-led; I hypothesize a problem-led version will pull in problem-aware prospects who currently bounce."
- "Current landing page has a 30-minute VSL; I hypothesize a 6-minute version will convert mobile traffic better."
- "Current offer has a $2,000 stack; I hypothesize doubling it to $4,000 with two more bonuses raises perceived value enough to offset the added cost."
If you can't articulate the hypothesis, the test isn't worth running.
How much traffic to give the challenger
Common splits:
- 50/50, the classical A/B test. Fair, and fastest to statistical significance. Use when you believe the challenger has a strong chance of winning big.
- 70/30 (control/challenger), a hedge. Limits exposure to the new version if it's worse. Use for risky challengers.
- 80/20, more conservative. Slower to significance. Use when the control is performing well and you want to preserve its volume.
- 90/10, low-exposure testing for very risky changes. Takes long to conclude but limits downside.
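A split like the ones above is just weighted random assignment. A minimal sketch, using stdlib only; production systems usually hash a stable user ID instead so that a returning visitor always lands in the same bucket:

```python
import random

def assign_variant(weights, rng=random):
    """Assign one visitor to a variant according to split weights.

    `weights` maps variant name -> share of traffic (shares sum to 1.0).
    """
    r = rng.random()
    cumulative = 0.0
    for variant, share in weights.items():
        cumulative += share
        if r < cumulative:
            return variant
    return variant  # guard against float rounding at the boundary

# 70/30 hedge split over 10,000 simulated visitors
split = {"control": 0.70, "challenger": 0.30}
counts = {"control": 0, "challenger": 0}
rng = random.Random(42)
for _ in range(10_000):
    counts[assign_variant(split, rng)] += 1
print(counts)  # roughly 7000 / 3000
```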
How long to run a test
- At least one full week (captures day-of-week variation)
- Until you hit statistical significance (usually 95% confidence)
- Not so long that markets shift during the test
- Don't stop early because the challenger is "clearly winning" after 3 days; that's often noise
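The 95%-confidence check above is commonly computed as a two-proportion z-test. A textbook sketch with the stdlib (the traffic numbers are made up for illustration):

```python
import math

def z_test_two_proportions(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    Returns the p-value; declare significance at 95% confidence
    when p < 0.05. Uses the standard pooled-proportion formula.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value via the standard normal CDF (math.erf)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Control: 200 conversions / 10,000 visitors (2.0%)
# Challenger: 260 conversions / 10,000 visitors (2.6%)
p = z_test_two_proportions(200, 10_000, 260, 10_000)
print(f"p = {p:.4f}, significant at 95%: {p < 0.05}")
```

This also shows why stopping at day 3 is dangerous: at a fraction of the sample, the same 0.6-point gap would not clear the 0.05 threshold.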
Multiple challengers at once
You can run 3 challengers against 1 control simultaneously. 40% to control, 20% each to 3 challengers. Faster learning, but each challenger gets less traffic, slowing individual significance.
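The slowdown is easy to quantify: each arm's time to a given sample size scales inversely with its traffic share. A quick sketch, with illustrative traffic and sample numbers:

```python
import math

def days_to_sample(daily_traffic, share, target_n):
    """Days for one variant to accumulate `target_n` visitors."""
    return math.ceil(target_n / (daily_traffic * share))

daily = 5_000    # total daily visitors (assumed)
target = 15_000  # illustrative per-arm sample target

# One challenger at 30% vs. three challengers at 20% each
print(days_to_sample(daily, 0.30, target))  # 10
print(days_to_sample(daily, 0.20, target))  # 15
```

So the three-challenger setup tests more ideas per cycle but adds 50% to each individual challenger's time to conclude, under these assumed numbers.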
When the challenger is a total rewrite
Sometimes you test not a single variable but a whole new approach: a completely different sales letter, a different funnel, a different offer. Treat it as a challenger, but expect longer test cycles and weigh the trade-off carefully:
- If challenger wins: big lift, but you've learned less about why
- If challenger loses: you've learned the whole new system is worse, but you don't know which part
These tests are worth running periodically; big rewrites produce the biggest wins. But they aren't a substitute for ongoing isolated-variable testing.
The "beat control" tournament
High-performing teams treat control-beating as a formal internal game:
- Anyone on the team can propose a challenger
- Challengers are queued; the best-hypothesis ones run first
- Wins are publicly credited to the person who designed the challenger
- Leaderboard of lifetime control-beats per person
This creates healthy internal competition, pushes creative variation, and compounds organizational knowledge.
Control decay
Controls get worse over time without anyone changing anything:
- Audience fatigue, same prospects seeing the same creative
- Market sophistication, prospects evolving past your current framing
- Channel changes, algorithms update, deliverability shifts
- Competitive response, competitors copying your control
Even without a winning challenger, you need to keep testing. A control that's been "the winner" for 18 months is probably decaying; you just haven't fielded a challenger that beats it yet.
The "control graveyard"
Maintain a record of every past control. When a category of challenger repeatedly loses, you know you've saturated that space. When a challenger wins, archive the old control with full documentation. This history is one of your most valuable assets.
When to retire a control completely
- Performance has declined 30%+ from peak
- The market framing has fundamentally shifted
- Your offer has materially changed
- A completely new approach is outperforming by 50%+ even in early testing
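The first criterion is the only mechanical one, and it can be checked automatically. A minimal sketch with hypothetical conversion rates; the other three signals need human judgment, not a formula:

```python
def should_retire(current_rate, peak_rate, decline_threshold=0.30):
    """Flag a control whose performance fell 30%+ from its peak."""
    decline = (peak_rate - current_rate) / peak_rate
    return decline >= decline_threshold

print(should_retire(0.014, 0.021))  # True: down ~33% from peak
print(should_retire(0.018, 0.021))  # False: down ~14%
```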
Don't retire a control just because you're tired of it. Retire it because the data says so.
Related: Scientific testing · What to test · Measurement