The more you push your current bestseller, the harder it becomes to create the next one.

Not because your new products are bad. But because the test is rigged. Your incumbent gets the best placement, the most ad budget, the most social proof, and the most historical conversion data – so it wins every “experiment” by sheer exposure. Financially, that’s how brands stall: one hero SKU becomes a bottleneck, and your customer file stops evolving.

The invisible trap: the exposure flywheel

Here’s what’s really happening inside most catalogs:

  • The hero SKU gets more impressions (paid + onsite).

  • More impressions create higher click-through and more purchases.

  • That performance signal convinces teams to give it even more placement.

  • Platforms reinforce it because the conversion history is strongest there.

That loop looks like “product-market fit.” But it’s often just path dependence.

By the time you’re at $20M, $50M, $100M, your incumbent hero isn’t just a product – it’s a distribution advantage. And distribution advantages don’t reveal new winners. They protect old ones.

A story you’ll recognize

A brand has a hero SKU that drives a big chunk of revenue. Over the last 12–18 months, they’ve launched six new products that should have become the next hero.

What happened?

  • New SKUs were “tested” via small email callouts and a few scattered ads.

  • The hero remained pinned top of collections, featured in cart upsells, and led every retargeting campaign.

  • Whenever a new SKU underperformed, the team concluded: “Not a winner.”

  • So the hero got even more budget.

Then CAC started creeping up. Growth slowed. And suddenly leadership is asking, “Why can’t we find another hero?”

Because they never actually ran a fair test.

The old bestseller was born on a level playing field that no longer exists.

When you first found it, you likely tested multiple products with similar visibility and let customers decide. Years later, the audience has been primed to choose the incumbent again and again – and the algorithms have been trained to agree.

You can’t rerun the original test, because you changed the experiment.

5 principles that change how you think about “bestsellers”

A bestseller is not necessarily the best product – it’s the best-distributed product.

If one SKU has 60% of paid + onsite impressions, it will win most A/B tests by default. You’re not measuring “product quality.” You’re measuring exposure.
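The exposure effect is easy to see in numbers. A minimal sketch (SKU names and figures are invented for illustration): by raw purchases the incumbent looks dominant, but per impression the challenger actually converts better.

```python
# Hypothetical catalog data: the hero gets 10x the impressions.
catalog = {
    "hero":       {"impressions": 600_000, "purchases": 9_000},
    "challenger": {"impressions": 60_000,  "purchases": 1_200},
}

def conversion_rate(sku: dict) -> float:
    """Exposure-normalized performance: purchases per impression."""
    return sku["purchases"] / sku["impressions"]

# By raw purchases, the hero "wins" 9,000 to 1,200.
# Per impression, the challenger converts at 2.0% vs the hero's 1.5%.
rates = {name: conversion_rate(sku) for name, sku in catalog.items()}
```

A leaderboard sorted by raw purchases crowns the hero; the same data normalized by exposure tells the opposite story.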

Your catalog has an algorithmic bias – even before Meta touches it.

Onsite placement, collection sorting, cross-sell modules, email hero blocks – all of it concentrates attention on the incumbent. Then Meta/Google sees conversion history and amplifies the same winner. It’s bias stacked on bias.

“Cheap CAC” is often just “easy-to-convert because everyone has seen it.”

The incumbent’s CAC looks better because the market has been trained on it. That doesn’t mean it’s your best long-term acquisition product. It might simply be the most familiar.

The cost of discovering a new hero is a learning tax.

When you force a fair test, short-term CAC often rises ~10–15% because you’re buying truth, not leaderboard points. If your team can’t tolerate that temporary inefficiency, you will never discover new winners.

You should declare a new hero only if it passes cohort guardrails – not if it wins day-1 ROAS.

The goal isn’t to crown what converts fastest today. The goal is to crown what creates the best customers: strong contribution margin, low returns, fast payback, and strong 60–90 day repeat.
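Those guardrails can be written down as an explicit checklist a challenger must clear before it's declared a hero. A sketch, with placeholder thresholds – the real cutoffs should come from your own cohort data:

```python
# Illustrative guardrails; every threshold here is an assumption.
GUARDRAILS = {
    "min_contribution_margin": 0.35,  # margin after COGS, shipping, returns
    "max_return_rate": 0.10,
    "max_payback_days": 60,           # time to recover acquisition cost
    "min_repeat_rate_90d": 0.20,      # 60-90 day repeat purchase rate
}

def passes_guardrails(cohort: dict) -> bool:
    """True only if the challenger's cohort clears EVERY guardrail -
    one strong metric (e.g. day-1 ROAS) is never enough on its own."""
    return (
        cohort["contribution_margin"] >= GUARDRAILS["min_contribution_margin"]
        and cohort["return_rate"] <= GUARDRAILS["max_return_rate"]
        and cohort["payback_days"] <= GUARDRAILS["max_payback_days"]
        and cohort["repeat_rate_90d"] >= GUARDRAILS["min_repeat_rate_90d"]
    )
```

The point of the all-or-nothing check: a product that converts fast but returns often, or pays back slowly, fails even if its ROAS screenshot looks great.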

The real takeaway

If you keep pushing the incumbent, you aren’t just growing revenue – you’re compounding bias. You’re training your team and your algorithms to believe only one product can win. That’s how brands get stuck with one hero, rising CAC, and a portfolio that never evolves.

The fix isn’t a better A/B test. It’s a fair test: equal exposure, isolated budgets, enough time to learn, and a clear set of cohort guardrails before you crown anyone.

Equalize exposure – or you’ll only ever crown the incumbent.

– Alex from RetentionX

Frequently Asked Questions

What’s the best “fair test” setup you’ve seen work in practice? Is it a fixed budget per contender, equal onsite placement, and a minimum runtime – or do you also lock the creative and audience to avoid “incumbent advantage” sneaking back in?

The cleanest setup is: fixed budgets, equal exposure, and a minimum runtime – plus controlling the biggest sources of bias. That usually means locking placements (same real estate), using comparable creative formats, and keeping the audience broad enough that you’re not just retargeting people already primed for the incumbent. I also like isolating to a “challenger” campaign or catalog set so the algorithm can’t drift back to the historical winner. The goal is to buy a learning window, not to win week one.
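That setup can be expressed as a simple plan generator. A sketch under the assumptions above – the field names and campaign labels are hypothetical, not any platform's actual API:

```python
# Hedged sketch of a fair-test plan: equal fixed budgets per contender,
# locked placements, broad cold audiences, isolated challenger campaign.
def fair_test_plan(contenders: list, total_budget: float, min_days: int = 21) -> list:
    per_sku = total_budget / len(contenders)  # equal exposure by construction
    return [
        {
            "sku": sku,
            "daily_budget": round(per_sku / min_days, 2),
            "campaign": "challenger_isolated",  # keep the algorithm from drifting to the incumbent
            "placement": "locked_equal",        # same real estate for every contender
            "audience": "broad_cold",           # avoid retargeting primed buyers
        }
        for sku in contenders
    ]
```

Usage: `fair_test_plan(["sku_a", "sku_b"], total_budget=8400)` gives each contender the same daily budget over the full learning window, so no SKU can win on exposure alone.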

How do you convince a performance team to tolerate worse-looking CAC/ROAS during the fairness window without them killing the test early? Do you define a separate KPI (learning per dollar, cohort quality, add-to-cart rate) for “challenger” experiments?

Yes – you need a separate scorecard. Challenger tests should be judged on signal quality and downstream indicators, not day-1 ROAS. I’ll pre-commit to a budget and time window, then look at metrics like add-to-cart rate, conversion rate on cold traffic, return/discount risk, and early cohort signals (30/60-day repeat intent where possible). If you let “leaderboard KPIs” decide, the incumbent always wins. Treat it like R&D: you’re investing in the next profit engine, not optimizing the current one.

How long is “long enough”?

Long enough to exit the noise and let the algorithm learn – usually at least 1–2 full purchase cycles for your category. Practically, that’s often 2–4 weeks for many DTC brands, longer if consideration is high or AOV is big. I also like setting a minimum number of conversions/impressions per contender rather than just time. If you stop before the sample is meaningful, you didn’t run a test – you ran a bias confirmation.

Do you see this bias more on platforms (Meta, Shopping/PMax) or onsite (homepage/collections/recommendations)? If you had to pick one place to “reset fairness,” where would you start for the biggest impact?

It shows up in both, but platforms amplify it because they’re trained on historical conversion data and will keep finding the “safe” winner. If I had to start in one place, I’d start on paid media isolation, because that’s where budgets and learning scale fastest. Then I’d align onsite placement so you’re not sabotaging the test at the point of decision. The best results come when both sides match: equal traffic + equal shelf space.

What’s your rule for choosing which challengers deserve a fair test? If you level the field for everything, you burn budget. How do you pre-qualify candidates (reviews, margin, repeat potential, product stickiness) before giving them an isolated push?

I pre-qualify on three dimensions: unit economics (CM1 margin + return risk), product role (does it create repeat or ladder customers?), and creative/story clarity (can you explain it fast?). If a product is margin-weak or high-return, it’s a bad candidate no matter how “cool” it is. Then I pick a small set of contenders (2–5) and run fair tests with caps. The point is to create new heroes that are profitable, not just popular.
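Those three dimensions translate into a simple pre-qualification filter with a cap on the field. A sketch – the thresholds and field names are assumptions, not a fixed standard:

```python
# Hypothetical pre-qualification on the three dimensions described above:
# unit economics, product role, and creative/story clarity.
def prequalify(candidates: list, max_contenders: int = 5) -> list:
    qualified = [
        c for c in candidates
        if c["cm1_margin"] >= 0.40      # unit economics: CM1 floor (assumed)
        and c["return_rate"] <= 0.12    # unit economics: return risk cap (assumed)
        and c["drives_repeat"]          # product role: creates repeat/ladder customers
        and c["story_clarity"] >= 4     # 1-5 score: can you explain it fast?
    ]
    # Cap the field (2-5 contenders) so you don't burn budget leveling everything.
    return sorted(qualified, key=lambda c: c["cm1_margin"], reverse=True)[:max_contenders]
```

A margin-weak or high-return product never makes the list, no matter how "cool" it is – the filter runs before any budget is spent.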