How to A/B Test a Landing Page Without Wasting Traffic
Most founders who A/B test landing page variants waste their traffic on tests that were doomed before they started. The hypothesis was weak, the sample size was too small, or the test got called early because one variant looked like it was winning on day two.
This tutorial walks through how to run a test that actually tells you something. No fluff, no theory dumps, just the steps in order.
Before you test anything, check if you can test at all
A/B testing needs traffic. A lot of it. If your landing page gets 200 visitors a month and converts at 3%, you're getting 6 conversions monthly. You cannot detect a meaningful difference between two variants with numbers that small. You'd need to run the test for years, and by then your product, audience, and market have all changed.
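Rough rule: you want at least 1,000 conversions per variant to have a realistic chance of detecting a 10% relative lift. So if your page converts at 5%, you need at least 20,000 visitors per variant, or 40,000 total. Per test. Treat that as a floor: at the 95% significance and 80% power used in Step 2, a standard calculator asks for closer to 30,000 visitors per variant.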
If you don't have that traffic, stop reading about A/B testing. Do these instead:
- Run session recordings to see where people get stuck
- Talk to 5 users on a call
- Make obvious fixes based on heatmap data
For a deeper look at when heatmaps beat tests, see our Hotjar vs Google Analytics breakdown.
Step 1: Pick a hypothesis that can actually move the needle
A bad hypothesis: "I think a green button will convert better than a blue button."
A good hypothesis: "Visitors are bouncing because the headline doesn't match the ad they clicked. If I rewrite the headline to mirror the ad copy, signups will increase."
The difference is the second one has a reason behind it. You're not guessing at button colors. You looked at your data, found a gap, and have a theory about why fixing the gap will help.
Where good hypotheses come from:
- Heatmaps showing people don't scroll past a section
- Session recordings of users hovering on the CTA then leaving
- Form analytics showing field abandonment
- Customer interviews where prospects misunderstood your value prop
- Above-the-fold issues you spotted in audits (here's a list)
Write the hypothesis down in this format: "If I change X, then Y will happen, because Z." If you can't fill in Z, you don't have a hypothesis. You have a guess.
Step 2: Calculate the sample size before you launch
This is the step everyone skips. Then they wonder why their test "didn't work."
Plug your numbers into a free calculator (Evan Miller's is the standard). You need:
- Current conversion rate (your baseline)
- Minimum detectable effect (the smallest lift you'd care about)
- Statistical significance threshold (use 95%)
- Power (use 80%)
The calculator tells you visitors per variant. Multiply by your number of variants. That's your traffic budget.
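If you want to sanity-check a calculator's output, here is a minimal sketch of the textbook normal-approximation formula for comparing two conversion rates. The function name and parameters are illustrative, and this is not necessarily the exact formula any particular online calculator uses, so expect results to differ by a few percent.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided test
    comparing two conversion rates (normal approximation)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)             # rate you hope the variant hits
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% significance
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Example: 5% baseline, smallest lift you care about is 10% relative.
n = visitors_per_variant(baseline=0.05, relative_mde=0.10)
print(f"{n:,} visitors per variant, {2 * n:,} for a two-variant test")
```

Try halving the minimum detectable effect and watch the requirement roughly quadruple. That is why small changes are so expensive to test.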
If the result is "you need 80,000 visitors per variant" and you get 5,000 visitors a month, a two-variant test needs 160,000 visitors in total and will take 32 months. Don't run it. Either pick a bigger change to test (one that should produce a larger lift) or skip testing and just ship the change.
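The same arithmetic can be scripted so you check feasibility before you build anything. This is a rough sketch that assumes every visitor to the page enters the test and that traffic is split evenly across variants. The function name is made up for illustration, and the numbers reuse the 20,000-visitors-per-variant example from earlier with 5,000 visitors a month.

```python
from math import ceil


def weeks_to_finish(visitors_per_variant, variants, monthly_visitors):
    """Whole weeks needed to collect the traffic budget, assuming all
    page traffic enters the test and is split evenly across variants."""
    total_needed = visitors_per_variant * variants
    daily_visitors = monthly_visitors * 12 / 365   # average day
    return ceil(total_needed / daily_visitors / 7)


weeks = weeks_to_finish(20_000, variants=2, monthly_visitors=5_000)
print(f"{weeks} weeks (about {weeks * 7 / 365 * 12:.0f} months)")
```

Rounding up to whole weeks keeps the estimate compatible with running tests in full-week blocks.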
Step 3: Test one thing at a time, but test something bold
Small changes need huge sample sizes to detect. A button color swap might produce a 2% relative lift, which means you need a massive amount of traffic to see it.
Big changes (new headline, restructured hero section, completely different value proposition) tend to produce bigger swings, which you can detect with less traffic.
So test bold variants. Rewrite the entire above-the-fold section. Try a totally different CTA approach. Compare a long-form page against a short one. The goal is to learn something useful, not to optimize button radius.
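Test one variable at a time so you know what caused the change. A bold variant can still be one variable if every edit in it serves the same idea, such as a hero section rewritten around a new value proposition. But if you swap the headline AND the CTA AND the hero image for unrelated reasons, and conversions go up 15%, you have no idea which change did the work.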
Step 4: Set up the test correctly
You need a tool that splits traffic randomly, tracks conversions, and doesn't introduce flicker. Options:
- Built-in builder tools: Some landing page builders include A/B testing. Check our 2026 builder comparison to see which ones do.
- Dedicated testing tools: VWO, Convert, AB Tasty. More features, separate cost.
- Google Optimize replacements: Optimize shut down in 2023. People moved to GrowthBook, PostHog experiments, or Optimizely.
- Roll-your-own: Two pages, split traffic 50/50 at the ad platform level. Works fine for top-of-funnel tests.
Whatever you pick, verify the setup before launch:
- Open the page in a fresh incognito window 10 times. Confirm you see both variants in roughly equal numbers.
- Convert once on each variant. Confirm both conversions register.
- Check that returning visitors stay on the same variant (sticky assignment).
- Confirm there's no flash of the original content before the variant loads.
That last one matters. Flickering kills tests. Visitors see the original page for a half second, then it swaps to the variant. Your test results are now measuring "people who didn't bounce from the flicker" instead of variant performance.
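If you go the roll-your-own route, the random split and sticky assignment from the checklist above can both come from one deterministic hash. The sketch below is an illustration, not any testing tool's actual method. It assumes you already have a stable `visitor_id` (for example from a first-party cookie), and the experiment name is a placeholder.

```python
import hashlib


def assign_variant(visitor_id, experiment="headline-test", variants=("A", "B")):
    """Deterministically map a visitor to a variant.

    The same visitor_id always hashes to the same bucket, so returning
    visitors stay on the variant they first saw (sticky assignment).
    """
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]


# Sticky: repeated calls for one visitor never change the answer.
assert assign_variant("visitor-123") == assign_variant("visitor-123")

# Roughly even: count assignments across many hypothetical visitor IDs.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"visitor-{i}")] += 1
print(counts)
```

Because the assignment needs nothing but the ID, you can run it on the server before the page is rendered. The visitor then receives the right variant directly, and there is no original content to flash first.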
Step 5: Run the test for the full sample size, no peeking
Here's the rule almost everyone breaks: do not look at results until you hit the sample size you calculated in Step 2.
Why? Because conversion rates fluctuate wildly early in a test. On day two, variant B might be 40% ahead. On day five, it's tied. On day twelve, variant A is winning. If you call the test early, you're calling random noise.
Also: run the test in full-week multiples. Traffic on Tuesday looks different from traffic on Sunday. Run for 1, 2, 3, or 4 weeks. Not 10 days.
If your sample size says you need 3 weeks of traffic but the calendar says you need to ship in 2 days, the right call is usually to ship without testing. A test that ends early is worse than no test, because it gives you false confidence in a result that's noise.
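You can see the cost of peeking in a small simulation. The sketch below runs A/A tests, where both "variants" are identical, so every declared winner is a false positive. It compares checking only at the end against stopping the first time any of 10 evenly spaced peeks looks significant. All parameters (400 experiments, 1,000 visitors per arm, a 10% conversion rate) are arbitrary choices for illustration, and the exact percentages depend on the seed.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)   # two-sided 95% threshold, about 1.96


def looks_significant(conv_a, conv_b, n):
    """Pooled two-proportion z-test with n visitors in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:          # no conversions yet (or all converted): nothing to test
        return False
    return abs(conv_a - conv_b) / n / se > Z_CRIT


def simulate(experiments=400, visitors=1_000, peeks=10, rate=0.10, seed=1):
    """Run A/A tests (identical variants) and count false 'winners'."""
    rng = random.Random(seed)
    step = visitors // peeks
    peek_calls = final_calls = 0
    for _ in range(experiments):
        conv_a = conv_b = 0
        flagged_at_any_peek = False
        for n in range(1, visitors + 1):
            conv_a += rng.random() < rate   # True counts as 1 conversion
            conv_b += rng.random() < rate
            if n % step == 0 and looks_significant(conv_a, conv_b, n):
                flagged_at_any_peek = True
        peek_calls += flagged_at_any_peek
        final_calls += looks_significant(conv_a, conv_b, visitors)
    return peek_calls / experiments, final_calls / experiments


peek_rate, final_rate = simulate()
print(f"false winners when stopping at any peek: {peek_rate:.1%}")
print(f"false winners when checking only at the end: {final_rate:.1%}")
```

The end-only rate should land near the nominal 5%, while the stop-at-any-peek rate is typically several times higher. Nothing about the page changed between the two; only the stopping rule did.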
Step 6: Interpret results honestly
When the test completes, ask:
- Did the winning variant hit statistical significance (95%+)?
- Did the result match your hypothesis, or did you get a surprise?
- Is the lift big enough to matter? A 1% lift on a low-traffic page is a rounding error.
If you got a winner, ship it. If you got a tie, ship whichever is simpler to maintain. If you got a loss, congratulations, you learned something. Document why you think it lost. That's data for your next hypothesis.
Don't ship variants that "lost but might have won with more traffic." That's gambling, not testing.
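Most testing tools report significance for you, but it helps to know what is behind the number. Here is a minimal sketch of a pooled two-proportion z-test, one common way to compare two conversion rates. Your tool may use a different method, and the conversion counts below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(conv_a, visitors_a, conv_b, visitors_b):
    """p-value from a pooled two-proportion z-test."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical results: A converts 500 of 10,000, B converts 560 of 10,000.
p = two_sided_p_value(500, 10_000, 560, 10_000)
relative_lift = (560 / 10_000) / (500 / 10_000) - 1
print(f"relative lift: {relative_lift:.0%}, p-value: {p:.3f}")
print("significant at 95%" if p < 0.05 else "not significant at 95%")
```

In this example variant B shows a 12% relative lift, yet the p-value comes out just above 0.05, so it does not clear the 95% bar. That is exactly the "lost but might have won with more traffic" situation: treat it as a tie, not a win.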
Common ways founders waste traffic
A quick list of mistakes I see constantly:
- Testing without a hypothesis. Just trying random changes. You'll get random results.
- Calling tests early. Looking at day 3 and shipping the leader.
- Testing tiny changes on low-traffic pages. Math says you can't detect the lift.
- Running multiple tests on the same page simultaneously. They interfere with each other.
- Ignoring segment differences. Mobile and desktop often behave differently. A variant that wins overall might be tanking mobile.
- Not accounting for ad campaign changes. If your traffic source mix shifts mid-test, your data is contaminated.
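The segment problem is easy to miss because the overall number hides it. The sketch below uses made-up results to show a variant that wins overall while losing on mobile. The data and helper names are hypothetical.

```python
# Hypothetical results: (conversions, visitors) per variant and device.
results = {
    "desktop": {"A": (300, 5_000), "B": (400, 5_000)},
    "mobile":  {"A": (200, 5_000), "B": (150, 5_000)},
}


def relative_lift(a, b):
    """Relative lift of B over A, each given as (conversions, visitors)."""
    rate_a = a[0] / a[1]
    rate_b = b[0] / b[1]
    return rate_b / rate_a - 1


def combine(variant):
    """Sum conversions and visitors for one variant across all segments."""
    conversions = sum(seg[variant][0] for seg in results.values())
    visitors = sum(seg[variant][1] for seg in results.values())
    return conversions, visitors


lifts = {name: relative_lift(seg["A"], seg["B"]) for name, seg in results.items()}
lifts["overall"] = relative_lift(combine("A"), combine("B"))

for name, lift in lifts.items():
    print(f"{name:>8}: {lift:+.0%}")
```

Here B is up 10% overall but down 25% on mobile. Keep in mind that each segment has fewer visitors than the whole test, so segment-level differences are noisier and deserve the same significance check as the headline result.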
For more on CTA-specific tests, our CTA examples post has hypothesis ideas worth stealing.
What to do if you don't have enough traffic
Most early-stage products don't have testing-grade traffic. That's fine. Here's the order of operations:
- Fix obvious problems first. Slow load times, broken mobile layout, confusing headline. These don't need tests.
- Use qualitative tools. Heatmaps, recordings, user interviews. Lower data requirements, faster insight.
- Test only the biggest decisions. Headline, offer, page structure. Skip button colors forever.
- Batch your learnings. Ship a redesigned page based on qualitative research, then compare the new page against the old at the campaign level over a month.
That last approach (a before/after comparison) is less rigorous than a real A/B test, because the two pages never run at the same time, but it's better than guessing, and it works at low traffic volumes.
Ready to test smarter, not more?
PagePulse audits your landing page and tells you which elements are most likely costing you conversions, so when you do run a test, you're testing the thing that actually matters. Drop in your URL and get a prioritized list of hypotheses worth testing, ranked by expected impact. No more burning a month of traffic on a button color.