The High Cost Of Bad Measurement: Why Randomized Geo Experiments Are The Gold Standard

The number-one job of a marketer is to invest budget wisely to drive sales. That inherently requires accurately measuring the performance of that spending. 

Yet most advertisers still rely on flawed measurement methods that systematically overstate performance and misallocate resources.

The measurement crisis

Even the smallest Fortune 500 companies generate roughly $10 billion in revenue, meaning they likely spend at least $1 billion on advertising. Whether you’re spending billions or mere millions, the stakes are too high to rely on half-measures when optimizing ROI.

Attribution modeling, matched-market tests, synthetic control methods and other quasi-experimental approaches dominate the measurement landscape despite well-documented limitations, including susceptibility to bias and a tendency to overfit toward pleasing results.

Scientific guidance is clear: quasi-experiments – where test and control groups aren’t assigned strictly at random – should only be used when randomized controlled trials (RCTs) are infeasible or unethical.

In advertising measurement, RCTs are rarely unethical and almost always feasible. Marketers often settle for lesser standards either because they underestimate how inferior those methods are for causal inference or because of misplaced concerns about the cost and complexity of proper experimentation. The real risk isn’t in running robust tests; it’s in wasting money or cutting high-performing channels based on misleading conclusions from bad measurement.

The geographic experiment solution

Geographic experiments using Nielsen’s Designated Market Areas (DMAs) offer a remarkably straightforward and effective way to measure true incremental return on ad spend (iROAS). 

Unlike user-level experiments, which have been compromised by privacy changes and were never as accurate as claimed, geo experiments are independent of media platforms and provide deterministic results without personal data or expensive infrastructure like user ID graphs, tracking pixels or clean rooms.

The method is elegant in its simplicity:

  • Randomly assign all 210 US DMAs to test and control groups
  • Run your advertising campaign only in the test DMAs
  • Measure sales lift by counting transactions by geography from CRM data or sales panels
  • Calculate iROAS by dividing incremental revenue by campaign cost
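
For concreteness, here is a minimal end-to-end sketch of those four steps in Python, using simulated transactions in place of real CRM data; all names, seeds and dollar figures are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed makes the split reproducible

# Step 1: randomly assign all 210 DMAs to test and control.
dmas = pd.DataFrame({"dma": range(210)})
dmas["group"] = rng.permutation(["test"] * 105 + ["control"] * 105)

# Step 2 happens in the ad platforms: run the campaign in test DMAs only.

# Step 3: measure sales by geography. Simulated here; in practice,
# aggregate CRM or sales-panel transactions by DMA.
baseline = rng.normal(1_000_000, 50_000, size=210)
lift = np.where(dmas["group"] == "test", 0.01, 0.0)  # planted 1% lift
dmas["revenue"] = baseline * (1 + lift)

# Step 4: iROAS = incremental revenue / campaign cost. With equal-sized,
# randomized groups, control revenue is the counterfactual for test.
per_group = dmas.groupby("group")["revenue"].sum()
incremental = per_group["test"] - per_group["control"]
campaign_cost = 500_000  # hypothetical spend
print(f"iROAS: {incremental / campaign_cost:.2f}")
```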

This methodology is transparent, replicable, unbiased and explainable. It works across virtually all media channels – TV, digital display, social, search, audio, out-of-home – making it ideal for validating channel performance and calibrating marketing mix models.

Achieving balance and statistical power with DMAs

A common apprehension about using DMAs in randomized experiments is that they may lack statistical power and balance, given that DMAs vary in size and characteristics and there are only 210 of them in the US. 

But in practice, several established techniques can overcome these limitations, achieving fine balance between test and control groups and pushing minimum detectable effect sizes below 1% at 95% confidence.
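
To make that power claim concrete, here is a back-of-the-envelope minimum detectable effect (MDE) calculation for a simple two-arm split; the coefficient of variation is a hypothetical placeholder for what pre-period normalization can achieve.

```python
from scipy.stats import norm

def relative_mde(n_per_arm, cv, alpha=0.05, power=0.8):
    """Smallest relative lift a two-sample z-test can detect.
    cv: coefficient of variation of the DMA-level outcome after
    pre-period normalization (assumed here, not measured)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-sided significance
    z_power = norm.ppf(power)
    return (z_alpha + z_power) * cv * (2 / n_per_arm) ** 0.5

# 105 test vs. 105 control DMAs, hypothetical 2% residual CV:
print(f"MDE: {relative_mde(105, 0.02):.2%}")  # roughly 0.8%
```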

Key techniques include:

  • Covariate-constrained randomization (re-randomization): Generates thousands of potential random assignments (e.g., 10,000 draws) and selects one that meets pre-specified balance criteria. This approach offers precise control over balance while preserving the core principle of randomization (see the first sketch after this list).
  • Post-stratification and covariate adjustment: Starts with a basic random assignment, then uses statistical adjustment methods like regression or ANCOVA in the analysis phase to correct for any imbalances. This doesn’t alter the design but can recover power lost to imbalance (see the second sketch after this list).
  • Multi-armed, stepped cluster randomized trial (CRT): Instead of a single test group, this design staggers multiple test arms over time. Pretreatment periods from test markets can be used to enrich the control group, naturally improving statistical power and balance. For advertisers with large transaction volumes, this enables detection of effects as small as 0.5%.
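
The first sketch illustrates covariate-constrained randomization: draw many candidate splits, keep those within a pre-specified balance threshold on pre-period sales, then pick one of the qualifying draws at random. The threshold and the simulated sales are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
pre_sales = rng.lognormal(13, 0.5, size=210)  # simulated DMA-level pre-period sales

threshold = 0.01  # pre-specified: accept splits within 1% relative imbalance
acceptable = []
for _ in range(10_000):  # the "10,000 draws" from the bullet above
    assign = rng.permutation([True] * 105 + [False] * 105)
    gap = abs(pre_sales[assign].mean() - pre_sales[~assign].mean())
    if gap / pre_sales.mean() < threshold:
        acceptable.append(assign)

# Choosing randomly among qualifying draws preserves randomization;
# if none qualify, widen the threshold or increase the number of draws.
chosen = acceptable[rng.integers(len(acceptable))]
print(f"{len(acceptable)} of 10,000 draws met the balance criterion")
```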

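The second sketch shows covariate adjustment in the analysis phase: an ANCOVA-style regression of in-flight sales on the treatment indicator plus pre-period sales. The data are simulated, with a planted 1% lift.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
treat = rng.permutation([1] * 105 + [0] * 105)
pre = rng.lognormal(13, 0.5, size=210)  # simulated pre-period sales per DMA
post = pre * (1 + 0.01 * treat) + rng.normal(0, 0.01 * pre.mean(), size=210)

# Including pre-period sales absorbs between-market variance,
# recovering power lost to chance imbalance in the raw split.
X = sm.add_constant(np.column_stack([treat, pre]))
fit = sm.OLS(post, X).fit()
print(f"Estimated lift per DMA: {fit.params[1]:,.0f}")
print("95% CI:", fit.conf_int()[1])
```
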
These and other methods are covered in detail in an open-source guide on GitHub, “How To Design a Geographic Randomized Controlled Trial,” which includes visual diagrams, step-by-step frameworks and over a dozen annotated Python code snippets.

Real-world impact

In one case study, a Fortune 100 brand used the multi-armed, stepped CRT method to evaluate one of its largest digital channels. The CMO had lost faith in their attribution analytics and was preparing to cut the channel’s budget in half.

The geo experiment revealed that the channel was a substantial driver of new customer acquisitions, primarily through offline sales that attribution models missed. Cutting the spend would have jeopardized over a billion dollars in annual revenue.

Implementation guidelines

To run effective geo experiments:

  • Start with major channels: Validate your largest investments first, where gains matter most.
  • Randomize properly: Avoid cherry-picking; random assignment is essential for valid results.
  • Include all DMAs: Experiments limited to just a few markets lack national generalizability.
  • Normalize for pre-periods: Compare each DMA to its own baseline to control for market-size differences (see the sketch after this list).
  • Predefine decision rules: Decide how results will inform budgets before seeing the data.
  • Consider timing: Most experiments need 4-6 weeks of exposure to detect meaningful effects.
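
To illustrate the pre-period normalization guideline, the sketch below indexes each DMA’s in-flight sales to its own baseline before comparing groups; the market sizes, noise level and planted lift are all simulated.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "dma": range(210),
    "group": rng.permutation(["test"] * 105 + ["control"] * 105),
    "pre_sales": rng.lognormal(13, 1, size=210),  # widely varying market sizes
})
df["flight_sales"] = df["pre_sales"] * rng.normal(1.0, 0.02, size=210)
df.loc[df["group"] == "test", "flight_sales"] *= 1.01  # planted 1% lift

# Indexing to each DMA's own baseline removes size differences, so
# big and small markets contribute comparably to the estimate.
df["sales_index"] = df["flight_sales"] / df["pre_sales"]
by_group = df.groupby("group")["sales_index"].mean()
print(f"Estimated lift: {by_group['test'] - by_group['control']:.2%}")
```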

The path forward

As AI increasingly drives marketing decisions, training these systems on accurate performance data is critical. Feeding flawed attribution into AI models will only magnify bias and misallocation.

The carpenter’s adage, “Measure twice, cut once,” applies well here. Before making million-dollar media decisions, invest in measurement that truly captures incremental impact. Your marketing effectiveness – and perhaps your job – may depend on it.

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Follow Central Control and AdExchanger on LinkedIn.

For more articles featuring Rick Bruner, click here.
