The 5 Pillars Of Running Experiments At Scale

“Brand Aware” explores the data-driven digital ad ecosystem from the marketer’s point of view.

Today’s column is written by Sachin Puri, vice president of growth marketing at McAfee.

Running experiments is not a competitive advantage anymore, but rather table stakes in marketing. Marketers are always running experiments at some level – some more, some less.

Are you satisfied with how your team runs experiments? Do you have proper KPIs to measure and accelerate your test methodologies? Are you still struggling to get proper funding to run experiments? What type of talent are you hiring?

Having run experiments for many years, I have learned that there are five key pillars to running experiments at scale.

1. Always start with customer hypotheses

One of the most vital steps of a test-and-learn methodology is to start with a customer hypothesis. For example, perhaps you are targeting price-sensitive buyers via display ads with a discounted price promotion and have a hypothesis that buyers who click on the ads are ready to buy.

If this is true, you might want to test whether taking users straight to checkout would generate better conversion. If the test shows a statistically significant lift, then the hypothesis was valid – or, technically speaking, the null hypothesis is rejected – and you should consider rolling out the winner quickly.

You can learn a lot about customer behavior from this approach and develop hypotheses for future testing. For this example, successive hypotheses can be developed about the location and treatment of the buy button (above or below the fold), the call to action (sticky or static) and its appearance (high- or low-contrast color).

Iterative testing until the point of diminishing returns is key to the growth of your digital marketing investments and performance. And since user behavior evolves, you should retest the original hypothesis periodically to confirm the results still hold or need amending.
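For the checkout example above, the significance check can be as simple as a two-proportion z-test on control versus variant conversion rates. Here is a minimal Python sketch; the visitor and conversion counts, and the helper function itself, are hypothetical and purely illustrative.

    # A minimal sketch of checking whether a conversion-rate lift is statistically
    # significant, using a two-proportion z-test. All numbers are hypothetical.
    from statistics import NormalDist

    def two_proportion_z_test(conv_a, visitors_a, conv_b, visitors_b):
        """Return (relative lift, two-sided p-value) for variant B vs. control A."""
        p_a = conv_a / visitors_a                                # control conversion rate
        p_b = conv_b / visitors_b                                # variant conversion rate
        p_pool = (conv_a + conv_b) / (visitors_a + visitors_b)   # pooled rate under the null
        se = (p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b)) ** 0.5
        z = (p_b - p_a) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        return (p_b - p_a) / p_a, p_value

    # Hypothetical example: control lands on a product page, variant goes straight to checkout
    lift, p = two_proportion_z_test(conv_a=480, visitors_a=12000, conv_b=564, visitors_b=12000)
    print(f"lift: {lift:.1%}, p-value: {p:.3f}")  # consider rolling out only if p < 0.05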

2. Integrate testing impact into the financial forecast

One of the fastest ways to get buy-in and funding for running experiments is to bake lift from A/B testing into the financial forecast. For example, if you are planning to run multiple conversion rate lift tests to promote a product, I’d recommend estimating and committing to delivering a certain level of product sales explicitly in the financial forecasts. You might face some initial resistance, but as your program starts to deliver the results, you’ll find growing interest from the broader organization to pressure-test and learn more about your methodology and process.

And how do you estimate the business impact? One simple funnel-based framework that I often use includes:

Business impact estimate = experiment velocity (number of hypotheses tested per month) * win rate (number of successful hypotheses / total hypotheses tested) * average lift per winner

This framework helps me plan and estimate the business impact. Over time, you can develop a dashboard to track, benchmark and diagnose each metric to fine-tune and scale the program.
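As a rough illustration of how the framework rolls up into a forecastable number, here is a minimal Python sketch. The velocity, win rate, lift and baseline sales figures are hypothetical assumptions, and treating winning lifts as additive against a single baseline is a simplification.

    # A minimal sketch of the funnel-based estimate described above.
    # All inputs are hypothetical assumptions for illustration only.
    experiments_per_month = 8           # experiment velocity: hypotheses tested per month
    win_rate = 0.25                     # successful hypotheses / total hypotheses tested
    avg_lift_per_winner = 0.03          # average conversion-rate lift per winning test
    baseline_monthly_sales = 1_000_000  # baseline product sales ($) touched by the tests

    winners_per_month = experiments_per_month * win_rate
    monthly_lift = winners_per_month * avg_lift_per_winner
    incremental_sales = baseline_monthly_sales * monthly_lift

    print(f"Estimated incremental sales per month: ${incremental_sales:,.0f}")
    # Estimated incremental sales per month: $60,000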

3. Define clear KPIs and a measurement framework

All experiments must have clear success criteria, with associated KPIs and a measurement framework, to ensure teams know what they are solving for and how they will measure and define the winner.

It’s fairly common and advisable to have secondary KPIs to learn about audience behavior more broadly. However, secondary KPIs shouldn’t determine the outcome of the experiment; they should inform future hypotheses and testing. For example, if an experiment aimed to improve conversion rate, a resulting decline in average order value (AOV) or associated revenue shouldn’t render the test a failure; it could inform future tests to improve AOV or revenue.
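One lightweight way to keep primary and secondary KPIs from blurring is to declare them up front in the test plan, so only the primary KPI decides the winner and secondary KPIs simply feed future hypotheses. The Python sketch below is a hypothetical illustration; the field names and thresholds are assumptions, not a prescribed template.

    # A minimal sketch of declaring success criteria up front. The primary KPI
    # alone defines the winner; secondary KPIs are observational only.
    from dataclasses import dataclass, field

    @dataclass
    class ExperimentPlan:
        hypothesis: str
        primary_kpi: str                 # the single metric that defines the winner
        min_detectable_lift: float       # smallest lift worth acting on
        significance_level: float = 0.05
        secondary_kpis: list = field(default_factory=list)  # inform future tests only

    checkout_test = ExperimentPlan(
        hypothesis="Ad clickers are ready to buy; skipping the product page lifts conversion",
        primary_kpi="conversion_rate",
        min_detectable_lift=0.05,
        secondary_kpis=["average_order_value", "revenue_per_visitor"],
    )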

Thus, it’s important that data and analytics teams are involved from the hypothesis generation stage to ensure test design and data capture are structured to enable effective measurement. Often, I have seen the data team engaged after the fact, which may not allow for an accurate and scalable read of experiments. I recommend that marketing leaders integrate data and analytics teams as part of a core experiments team to facilitate robust measurement and successive hypothesis generation.

4. Run a tight ship with discipline and agility to achieve scale

I have learned that discipline and a defined process are key to successfully running experiments at scale. Experimentation can’t be run as a side activity with an undefined or ad hoc process.

Defined steps and processes allow consistent tracking, follow-through and critical path resolution for testing and rollout. For example, some of my teams start with an ideation workshop, followed by a prioritization and impact assessment step. Once the prioritized tests are identified, the team moves them to a test design step, followed by a test flight step that runs until statistical significance is achieved. From there, the team moves the test to the measurement and analysis step. Eventually, the test moves to results, socialization and rollout, which some of my teams call “banking,” as in the business impact is in the bank.
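If it helps to make the stages explicit, here is a minimal Python sketch of tracking an experiment through them. The stage names mirror the paragraph above; the tracking structure itself is a hypothetical illustration, not a prescribed tool.

    # A minimal sketch of moving each experiment through the stages described
    # above, one step at a time, so no step gets skipped.
    from enum import Enum, auto

    class Stage(Enum):
        IDEATION = auto()
        PRIORITIZATION_AND_IMPACT = auto()
        TEST_DESIGN = auto()
        TEST_FLIGHT = auto()              # runs until statistical significance
        MEASUREMENT_AND_ANALYSIS = auto()
        BANKING = auto()                  # results socialized, winner rolled out

    def advance(current: Stage) -> Stage:
        """Move an experiment to the next stage; stages cannot be skipped."""
        stages = list(Stage)
        idx = stages.index(current)
        if idx == len(stages) - 1:
            raise ValueError("Experiment already banked")
        return stages[idx + 1]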

It’s vital that teams follow all steps for each experiment, even if it’s a quick conversation, as it builds organizational muscle and repeatability of test results at scale. My teams usually run agile scrum (a process used in agile software development) with daily standup meetings to facilitate regular test tracking, team discussion and process adoption.

One important step that is often skipped is the socialization and documentation of learnings. I recommend regular socialization and celebration of test results to ensure audience and performance learnings are shared across the broader organization.

5. Hire diverse talent and build the right culture

This is one of the most important and, at times, hardest parts of experimentation, as the talent required spans curious hackers, data geeks, researchers, engineers, behavioral scientists, marketers and so on. Hiring and retaining such diverse talent is a challenge in and of itself, but the biggest challenge is to build chemistry among them and a culture of empowerment, wherein the team is self-motivated to make decisions that balance human behavior and data.

Depending on the organization, cultural evolution may require top-down change, bottom-up organic change or both. In either case, it’s extremely important for leaders to coach and enable their teams to feel empowered in decision-making. Some techniques that I have found work well include participating in daily standups, leaving titles at the door and letting teams and data drive the decisions. It is also helpful to constructively challenge the team about the hypothesis being tested and, most importantly, to celebrate wins, as that builds confidence and a sense of shared achievement for the team.

What’s next, and where to start?

Start by attending and observing some experimentation team meetings and assessing against this checklist: Does the team start with a customer hypothesis? What has been the business impact and the associated test velocity, win rate and lift? How does the team decide the winners, and has it diligently followed a consistent process? Look for team dynamics and any talent gaps.

To assess the culture, pressure-test and observe carefully how the team makes decisions. Empowerment is critical to scaling experimentation and sustaining its velocity.

If you find that the team still looks to its managers to make decisions, that’s your clue and your starting point.

Follow Sachin Puri (@spuri79), McAfee (@McAfee) and AdExchanger (@adexchanger) on Twitter.
