I see a lot of teams spending serious money on paid search, paid social, and other media. The uncomfortable question is always the same: which part of that spend is actually creating new revenue - and which part is just getting credit for conversions that would have happened anyway?
That’s what incrementality work is for. Not more reporting. Not another attribution tweak. A cleaner answer to what I should keep funding, what I should cut, and where the next dollar should go.
Why incrementality testing matters
When I talk to CEOs and founders about incrementality testing, it usually comes down to three outcomes: proving the true ROI of major channels and tactics, stopping spend on conversions that were already likely to happen, and reallocating budget with confidence instead of gut feel (or polished slides that don’t survive scrutiny).
Most teams only turn to this when something starts to feel “off.” Customer acquisition cost keeps rising while conversion rates look fine. Pipeline growth plateaus even as media spend goes up. Brand, retargeting, and performance channels fight over the same deals. Attribution reports get noisier after privacy changes, tracking breaks, or model updates. In B2B, the situation gets worse when lag time distorts performance reporting.
Traditional measurement tells me where clicks and form fills came from. Marketing incrementality asks a sharper question: what changed because of marketing that would not have changed otherwise?
In practice, I treat “attributed conversions” and “incremental conversions” as two different numbers. Attributed conversions are what ad platforms and analytics tools report. Incremental conversions are what an incrementality test tries to isolate using a test group and a holdout. The gap between them - the incremental lift - is what actually grows the business.
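To make the gap concrete with hypothetical numbers: if platforms attribute 1,000 conversions to a channel, but a holdout suggests only 400 of them would disappear if the channel paused, the incremental count is 400 - the other 600 were likely happening anyway, with or without the spend.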
You’ll hear a lot of adjacent terms (iROAS, geo-lift tests, conversion lift studies, causal inference, statistical significance, data contamination). Under the jargon, the idea stays simple: I want to treat marketing like any other investment and show that it moves a number that matters.
What is marketing incrementality?
Marketing incrementality is the share of outcomes that happened because of marketing, not just near marketing.
If I paused a campaign, how many signups, demo requests, or closed-won deals would disappear? That difference is the incremental lift from that activity - often described in advertising as conversions or revenue “above what would have occurred without the ads.”
A concrete example: imagine I’m running paid search on brand terms and a large retargeting program on social. Leads from both can look great on paper. But a holdout test may show that a large portion of “retargeting” conversions are people who already searched the brand and were already on their way to submit a form. In that case, retargeting didn’t create the intent; it captured it and took credit. This is why it helps to understand what brand search in B2B measures (and what it does not).
Incrementality testing helps separate cannibalization from genuine new demand.
Incrementality in one line
The share of conversions that only happened because the marketing existed.
Once I have that share, I can talk less about vanity metrics and more about incremental conversions, incremental revenue, and incremental ROAS (iROAS). iROAS is simply incremental revenue divided by ad spend - focusing on revenue the test indicates was caused by the campaign, not all revenue that got attributed to it. If you want a deeper definition and examples, see incrementality testing.
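A quick hypothetical: if a test indicates $300,000 of incremental revenue on $100,000 of ad spend, iROAS is 3.0 - regardless of how much total revenue the platform attributed to the campaign.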
The core question: what would have happened anyway?
Every incrementality test tries to answer one question:
What would performance look like if this campaign or channel had not run?
I never directly observe that alternate world, so I have to estimate it. That estimate is the counterfactual, or baseline.
A baseline is not “zero marketing.” It’s the organic demand and momentum that exist even if I change nothing: existing customers who would renew or expand anyway, prospects already mid-funnel from sales outreach or previous marketing, and brand searches driven by reputation built over time. On top of that, the world keeps moving - seasonality, competitor activity, and broader market shifts can all affect outcomes during a test window.
That’s why incrementality testing compares two comparable timelines: a baseline (holdout) and a treatment (exposed) group observed over the same period. The distance between those lines - after accounting for normal noise - is the incremental lift. Without a clean holdout, enough volume, and a stable test window, the “what would have happened anyway” part turns into guesswork.
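To see why noise matters, here is a minimal simulation with hypothetical numbers - a 100-signup daily baseline and a true 8% effect - showing how a short or noisy window can blur the gap between the two lines:

```python
import random

random.seed(7)

# Hypothetical daily signup counts over the same four-week window.
# Both groups share the same organic baseline; the exposed group
# gets an extra ~8% from the campaign being measured.
BASELINE_RATE = 100   # expected daily signups with no campaign
TRUE_LIFT = 0.08      # the effect we are trying to recover
DAYS = 28

holdout = [random.gauss(BASELINE_RATE, 10) for _ in range(DAYS)]
exposed = [random.gauss(BASELINE_RATE * (1 + TRUE_LIFT), 10) for _ in range(DAYS)]

observed_lift = sum(exposed) / sum(holdout) - 1
print(f"Observed lift over the window: {observed_lift:.1%}")
# With enough days and volume this converges on ~8%; with a short,
# noisy window it can land anywhere from roughly 3% to 13%.
```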
Incrementality vs attribution
I don’t treat attribution and incrementality as rivals. They answer different questions.
Attribution assigns credit across touchpoints in a journey (last click, first touch, position-based, data-driven models). It’s useful for day-to-day optimization inside channels, understanding which keywords, audiences, and creatives show up in journeys, and deciding where deeper tests might be worth running. For a refresher on the mechanics, see Attribution models. In B2B specifically, it’s also worth reviewing the case against last-click in B2B.
Attribution also has blind spots. It works only on observed paths, not on what would have happened without ads. It often over-credits retargeting and brand, where causality is frequently weaker. And view-through conversions inside walled gardens can inflate perceived impact. If you’re trying to make sense of multi-touch influence over time, how to interpret assisted conversions in long B2B cycles can help.
Marketing mix modeling (MMM) adds a top-down view using historical time series across online and offline spend. It can help with higher-level budget splits and forecasting, but it’s usually less direct when I want to know whether a specific campaign moved the needle last month. If you want a concise overview, see Media Mix Modeling.
Incrementality tests fill that gap with experiments.
| Question | Attribution | Incrementality testing | MMM |
|---|---|---|---|
| Which keyword drove this specific lead | Strong | Limited | Not suitable |
| Should I keep funding this retargeting strategy | Partial view | Strong | Weaker for this use |
| How much should I spend by major channel next quarter | Helpful input | Strong on tested pieces | Strong at high level |
| What did a TV or podcast burst do to signups | Weak | Good with geo-lift/holdout | Strong |
| What happens if I cut brand search spend by 30% | Weak | Good with experiments | Good for scenarios |
When measurement is mature, I prefer to blend them: attribution for tactical optimization, incrementality for causal lift, and MMM for longer-term mix decisions.
The foundation: test vs control groups
Every credible incrementality test relies on two groups that look the same at the start: a test (treatment) group that sees the marketing I want to measure, and a control (holdout) group that does not. In other words, an incrementality test is any structured experiment where I split an audience, market, or account list into test and control, expose only the test group to the campaign, and measure the lift in the KPI.
The control group has to be “untouched” by the specific activity, while everything else stays as equal as possible. In practice, that means I’m disciplined about randomization (when possible), stable targeting rules that prevent leaks into control, and a fixed measurement window so both groups are observed consistently.
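As one way to keep assignment disciplined, a deterministic hash of the unit ID works at user, account, or geo level. A minimal sketch (the salt and split ratio here are illustrative):

```python
import hashlib

def assign_group(unit_id: str, test_share: float = 0.5,
                 salt: str = "retargeting-q3") -> str:
    """Deterministically assign a user/account/geo ID to test or control.

    Hashing the ID with a per-experiment salt keeps assignment stable
    across platforms and re-runs, which is what prevents leakage
    between groups mid-test.
    """
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # uniform value in [0, 1]
    return "test" if bucket < test_share else "control"

print(assign_group("account-001"))  # same input always yields the same group
```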
When a test result looks suspicious - especially when it’s unexpectedly tiny or negative - I usually inspect the basics before I debate strategy. Can people in the control group still see the ads? Are both groups exposed to the same external events? Am I applying inclusion and exclusion logic consistently? If not, the test is at risk of data contamination, often caused by audience overlap or messy targeting boundaries.
A clean test vs control setup is intentionally boring. That’s the point: predictable beats clever when I’m trying to prove causal lift.
How to design and run an incrementality test
I try to keep incrementality testing repeatable and practical - especially in B2B, where volumes can be lower and sales cycles longer. The steps below are the core process I use, whether I’m measuring a single campaign or a broader channel tactic.
1. Choose a hypothesis and KPIs. I start with a falsifiable statement and one or two primary KPIs leadership cares about (for example: demo requests, qualified opportunities, or revenue). A hypothesis might look like: “Running paid social retargeting to high-intent website visitors increases demo requests versus holding out a comparable group.” If your org debates what “qualified” even means, align that first - for example, using a shared definition like what qualified means in B2B.
2. Choose a methodology that fits the channel. In most cases, I’m choosing among a few families: randomized controlled trials (user-level where possible), platform conversion lift studies, geo-lift (matched market) tests, or causal inference approaches that approximate a control using historical patterns. For smaller budgets or low volume, I usually lean toward simpler designs - geo or account-level holdouts, time-based pauses, or focusing on higher-volume upstream KPIs - because otherwise noise can drown out the effect.
3. Define test and control groups - and protect them. I split at user, account, or geo level depending on what I’m measuring. The exact split depends on risk tolerance, but the real goal is comparability and isolation: the holdout must remain meaningfully unexposed to the tactic I’m testing.
4. Plan for sample size and duration (power). Sample size needs depend on baseline conversion rate, the lift I want to be able to detect, and the confidence level I’m aiming for. Lower baseline rates and smaller expected lifts require larger audiences or longer tests. If I can’t realistically reach enough conversions, I either extend the duration, broaden the eligible audience, or move the primary KPI earlier in the funnel for that iteration. (A minimal sketch of this calculation follows this list.)
5. Run, analyze, and decide. During the test, I keep major variables steady (budget, targeting rules, and measurement windows) and avoid overlapping experiments that hit the same segment. At the end, I compare outcomes between test and control and make an explicit decision: scale, reshape, or cut - then document what I learned so the next test gets easier.
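For step 4, the standard normal-approximation formula for a two-proportion test gives a quick feasibility check. A minimal sketch (the baseline rate and target lift are assumptions you supply, not measured values):

```python
from statistics import NormalDist

def sample_size_per_group(baseline_rate: float, expected_lift: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-proportion test,
    using the normal-approximation formula."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + expected_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return int(n) + 1

# Hypothetical B2B numbers: a 2% demo-request rate, hoping to detect a 15% lift.
print(sample_size_per_group(0.02, 0.15))  # -> roughly 37,000 per group
```

Numbers like these are why low-volume B2B programs often have to test upstream KPIs or run longer windows.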
Calculating lift and iROAS
I keep the math straightforward and transparent. A common way to express incremental lift is the relative difference in conversion rates:
Incremental lift % = ((Test conversion rate / Control conversion rate) - 1) * 100
To translate that into incremental conversions:
Incremental conversions = (Test rate - Control rate) * Number of people in test group
And to tie it to spend and revenue using iROAS:
Incremental revenue = Incremental conversions * Average deal value
iROAS = Incremental revenue / Ad spend
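As a minimal sketch, here is the same math in code. All inputs are hypothetical - two groups of 50,000, an $8,000 average deal, and $250,000 of spend:

```python
def incrementality_results(test_conversions: int, test_size: int,
                           control_conversions: int, control_size: int,
                           avg_deal_value: float, ad_spend: float) -> dict:
    """Compute lift, incremental conversions, incremental revenue,
    and iROAS exactly as in the formulas above."""
    test_rate = test_conversions / test_size
    control_rate = control_conversions / control_size
    lift_pct = (test_rate / control_rate - 1) * 100
    incremental_conversions = (test_rate - control_rate) * test_size
    incremental_revenue = incremental_conversions * avg_deal_value
    return {
        "lift_pct": round(lift_pct, 1),
        "incremental_conversions": round(incremental_conversions, 1),
        "incremental_revenue": round(incremental_revenue),
        "iroas": round(incremental_revenue / ad_spend, 2),
    }

# Hypothetical test: 600/50,000 in test vs 500/50,000 in control.
print(incrementality_results(600, 50_000, 500, 50_000,
                             avg_deal_value=8_000, ad_spend=250_000))
# -> 20% lift, 100 incremental conversions, $800k incremental revenue, iROAS 3.2
```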
If lift is positive and statistically meaningful, I treat it as evidence the activity is contributing incremental value. If results are flat or inconclusive, that’s still useful: it tells me I should reduce, redesign, or test a different lever rather than keep paying for credit.
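To check whether a result is more than noise, a pooled two-proportion z-test is one simple option. A sketch using the same hypothetical numbers as above:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_pvalue(test_conv: int, test_n: int,
                          control_conv: int, control_n: int) -> float:
    """Two-sided p-value for the difference between two conversion
    rates, using the pooled two-proportion z-test."""
    p_test = test_conv / test_n
    p_control = control_conv / control_n
    pooled = (test_conv + control_conv) / (test_n + control_n)
    se = sqrt(pooled * (1 - pooled) * (1 / test_n + 1 / control_n))
    z = (p_test - p_control) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same hypothetical test: 600/50,000 vs 500/50,000.
print(two_proportion_pvalue(600, 50_000, 500, 50_000))  # ~0.002, comfortably significant
```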
Running incrementality tests in-house
I’ve seen B2B teams run incrementality testing fully in-house, but it works best when a few prerequisites are in place.
First, I need experiment design and execution skills: someone comfortable with the mechanics (control groups, sample sizes, confidence intervals) and operators who can implement clean splits across ad platforms, CRM, and analytics without leaking exposure into holdouts.
Second, I need analysis and reporting capability: access to reasonably raw data across platforms and CRM, a way to join spend to outcomes, and the ability to explain lift and iROAS in plain language to stakeholders who are tired of black-box metrics.
Third, I need governance and alignment: agreement with sales leadership on which outcomes “count,” guardrails for pausing and scaling based on results, and a simple way to prioritize which tests get run and when.
Timing varies widely by team and data quality. In practice, I usually think in terms of “first credible test” rather than a perfect system: the risk isn’t only doing it wrong, it’s waiting so long that I keep spending into uncertainty. This is also where a finance lens helps - for a CFO-friendly framing, see A finance-ready framework to bring clarity, discipline, and defensibility to marketing spend.
Common pitfalls and challenges in incrementality testing
Incrementality testing is powerful, and it’s also easy to get wrong in ways that look like “strategy issues” but are actually design or execution issues.
The most common failures I run into are data contamination (control users still see ads), audience overlap (the same users or accounts end up in multiple tests), insufficient sample size (so everything is inconclusive), and mid-test changes (creative, landing pages, targeting) that muddy causality. Seasonality and promotions can also distort results if the test window crosses major demand swings. And in B2B, short tests can be especially misleading when the sales cycle is long.
When results aren’t trusted internally, it’s rarely because incrementality “doesn’t work.” It’s usually because the test couldn’t hold the counterfactual steady enough to be interpretable.
The role of a unified data platform in incrementality
Most of the pain in marketing incrementality isn’t statistics - it’s messy data.
To run incrementality testing at scale, I need a unified data foundation, even if it’s a lightweight version: consistent feeds from ad platforms, web analytics, and CRM; shared definitions for key metrics like “qualified opportunity” or “pipeline created”; a way to connect spend to outcomes across channels and geos; and a place to store experiment metadata so results can be compared over time.
In day-to-day terms, that means spend and delivery data flow in from ad platforms while CRM and product systems supply conversions and revenue. A processing layer standardizes naming and maps campaigns to business-relevant groupings so experiment results are repeatable instead of rebuilt in spreadsheets each time.
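As a lightweight illustration of that processing layer, mapping raw campaign names to shared groupings can start as a simple rule table (all names and rules below are made up):

```python
# Map raw campaign names from different ad platforms onto shared,
# business-relevant groupings so experiment results line up across sources.
CHANNEL_RULES = [
    ("brand", "Brand search"),
    ("retarget", "Retargeting"),
    ("prospect", "Prospecting"),
]

def classify_campaign(raw_name: str) -> str:
    name = raw_name.lower()
    for keyword, grouping in CHANNEL_RULES:
        if keyword in name:
            return grouping
    return "Unmapped"  # surface these for cleanup instead of guessing

print(classify_campaign("2024_Q3_Brand_Search_US"))    # -> Brand search
print(classify_campaign("li_retargeting_highintent"))  # -> Retargeting
```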
Once that foundation exists, reporting becomes less fragile. Lift and iROAS calculations can be generated consistently, and when leadership asks why a budget decision was made, I can point to a documented test and the underlying data - rather than a debate about attribution settings.
Conclusion: build or accelerate?
Incrementality testing answers the budget question attribution alone can’t settle: which parts of marketing actually create new business, and which parts mainly capture demand that was already on its way.
I can build this capability in-house, but it requires comfort with experimentation, enough data plumbing to trust the numbers, and leadership support for holdouts and occasionally uncomfortable findings. The payoff is tighter measurement and a culture where “show me the lift” becomes normal.
If I’m deciding where to start, I keep it simple: pick one meaningful spend area, choose a KPI that leadership truly cares about, run a test with a defensible holdout, and calculate lift and iROAS. Then I reallocate based on what I learned and repeat with the next channel.