Out-Of-View Impressions Can Be Valuable

"Data-Driven Thinking" is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Matt Scharf, manager of display media operations and analytics at Adobe.

Contrary to popular belief, out-of-view impressions aren’t worthless.

As controversial as that sounds, impressions that are never seen can hold significant value to marketers. It’s not because they help drive conversions – they don’t, of course – but because they can help marketers develop control groups to evaluate media with more methodological rigor and less heavy lifting.

Adobe’s Project Iceberg, which identified out-of-view impressions at the event level (impression level for each cookie) and removed them from attribution consideration, revealed another compelling benefit that actually gives credence to the presence of out-of-view impressions. We used the Data Workbench capability within Adobe Analytics Premium to uncover the naturally occurring test and control groups in the data. We call it Project Iceberg 2.0.

If you've ever been involved in a study leveraging a test and control group, often referred to as "media effectiveness," "true lift study," "view-through analysis" or "incrementality study," you'd know it takes a lot of planning, coordination, media spending, analysis and waiting for results just to make important future decisions based on findings from a static period of time. That pain can be alleviated with an always-on control and test group predicated on viewability and a data platform allowing for easy analysis.

The Traditional Method

The traditional way to develop a control group requires marketers to carve out a portion of their budget – often 10% to 20% – and enable the ad server to serve banners from a nonprofit organization, such as the Red Cross. This is the standard approach in the industry and requires marketers to knowingly limit their media effectiveness by serving ads that aren't their own.

And after a month or two, and sometimes longer, you can evaluate the results by comparing how your test and control groups convert against the same KPI while ensuring no cross-contamination between the groups. The idea is to see a resulting lift of the test group – the people exposed to your company's message – over the control group, the people not exposed to the message. This validates that display media works and quantifies how much incrementality it drives. It also demonstrates the causal link between display impressions and the resulting view-through conversions.
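
For readers who haven't run one of these studies, the underlying lift calculation is straightforward. The Python sketch below uses entirely made-up numbers – they are not results from any Adobe study – to show how the exposed and hold-out groups are typically compared against the same conversion KPI.

```python
# Hypothetical counts -- purely illustrative, not from any real study.
test_users = 1_000_000        # users exposed to the brand's ads
test_converters = 12_000      # of those, how many hit the KPI (e.g., purchased)

control_users = 200_000       # users shown the PSA (e.g., Red Cross) instead
control_converters = 2_000    # conversions among the unexposed group

test_cvr = test_converters / test_users            # 1.20%
control_cvr = control_converters / control_users   # 1.00%

# Incremental lift: how much better the exposed group converts
# relative to the baseline set by the control group.
lift = (test_cvr - control_cvr) / control_cvr
print(f"Test CVR: {test_cvr:.2%}, Control CVR: {control_cvr:.2%}, Lift: {lift:.1%}")
# -> Test CVR: 1.20%, Control CVR: 1.00%, Lift: 20.0%
```

A rigorous study would also check whether that difference is statistically significant rather than noise, which is part of why these studies take time.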

[Graph 1]

The New Way

Now, instead of chalking up these out-of-view impressions to an industry flaw, marketers can leverage this data to their advantage. Viewability measurement is now at our fingertips.

If you're buying media on exchange or network inventory at scale, you're likely serving millions or billions of impressions. At the same time, you are inadvertently developing large, naturally occurring and mutually exclusive test and control groups. The test group is made up of users exposed to viewable impressions and nothing else. The control group is made up of users exposed only to out-of-view impressions.
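
To make that concrete, here is a minimal pandas sketch of how those two groups could be pulled out of an impression-level log. The data and column names (user_id, viewable) are hypothetical placeholders, not the actual Project Iceberg schema, and the article's own analysis was done with filters in Data Workbench rather than code.

```python
import pandas as pd

# Hypothetical impression log: one row per served impression.
# Column names are illustrative, not the actual Project Iceberg schema.
impressions = pd.DataFrame({
    "user_id":  ["a", "a", "b", "b", "c", "d"],
    "viewable": [True, True, False, True, False, False],
})

# For each user (cookie), summarize viewability across every impression served.
per_user = impressions.groupby("user_id")["viewable"].agg(["any", "all"])

# Test group: every impression the user was served was viewable.
test_group = per_user.index[per_user["all"]]
# Control group: the user was served impressions, but none were ever in view.
control_group = per_user.index[~per_user["any"]]
# Users with a mix of viewable and out-of-view impressions fall into neither
# group, which is what keeps the two groups mutually exclusive.

print("Test:", list(test_group))        # -> ['a']
print("Control:", list(control_group))  # -> ['c', 'd']
```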

[Graph 2]

The results: mutually exclusive and naturally occurring test and control groups, just like the traditional approach. You could argue that it's more advanced than the traditional way because previous studies didn't consider whether the impression was actually viewable. There's a chance that some studies understated the value of display media because the test group included people who were served only out-of-view impressions and never actually saw an ad. Or half of the impressions in those test groups may have been out of view. Either scenario could have diluted the test group's performance and skewed any data regarding true frequency.

[Graph 3]

With Project Iceberg 2.0, the definition of our test and control groups depends on whether an impression was viewable or not, and whether users were exposed to only viewable impressions or only out-of-view impressions.

Comparing The Methods

The traditional approach to establishing a control group is done by removing cookies from your addressable audience so they cannot see your company's ad impressions. In the alternate method, the control group includes users only exposed to out-of-view impressions. In both methods, the control group never saw your company's ad.

[Graph 4]

The credibility of this type of study rests on the test and control groups being clean, uncontaminated, mutually exclusive, randomly assigned and made up of similar people. Both approaches can achieve this, whether you control it directly in the traditional way or find it in the data with the new way. One comes at a cost and requires setup and time. The other is naturally occurring and free, but you're subject to how the chips (impressions) fall. If you want a 20% control group the traditional way, you can engineer that. By using viewability as the determining factor, the size of your control group will be a function of your viewability rates and impression scale.
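
To get a feel for that dependence, here is a rough back-of-envelope sketch in Python. It assumes, purely for illustration, that each impression lands out of view independently at a fixed rate – a simplification of real delivery – and the viewability rate used is hypothetical, not a figure from the study.

```python
# Back-of-envelope sketch: what share of reached users ends up in the
# "out-of-view only" control group? Assumes each impression is out of view
# independently with probability (1 - viewability_rate) -- a simplification.
viewability_rate = 0.50   # hypothetical campaign viewability
for frequency in (1, 3, 5, 10):
    share_all_out_of_view = (1 - viewability_rate) ** frequency
    print(f"frequency {frequency:>2}: ~{share_all_out_of_view:.1%} of users never see a viewable ad")
# frequency  1: ~50.0%
# frequency  3: ~12.5%
# frequency  5: ~3.1%
# frequency 10: ~0.1%
```

Under that simplified assumption, the out-of-view-only control group can be sizable at low frequency but shrinks quickly as frequency rises, which is worth checking before relying on it.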

What It Can Help You Answer

Some of the difficult questions this can help answer: What's the value of a view-through conversion? Would retargeted users convert anyway without the help of display? Is the time to convert shorter for someone exposed to a display impression? Does that differ by frequency?
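
As an illustration of how those questions might be attacked once the groups exist, the sketch below summarizes a hypothetical conversion-level dataset by group and frequency bucket. The column names (group, frequency, hours_to_convert) are assumptions for the example; the article's own analysis used Data Workbench filters and metrics rather than code.

```python
import pandas as pd

# Hypothetical conversion records for users already assigned to the
# "test" (viewable-only) or "control" (out-of-view-only) group.
# All values and column names are illustrative.
conversions = pd.DataFrame({
    "user_id":          ["a", "c", "e", "f", "g", "h"],
    "group":            ["test", "control", "test", "control", "test", "control"],
    "frequency":        [3, 2, 6, 1, 2, 4],          # impressions served to the user
    "hours_to_convert": [18.0, 72.0, 9.5, 60.0, 24.0, 55.0],
})

# Bucket frequency so sparse high-frequency users don't dominate.
conversions["freq_bucket"] = pd.cut(
    conversions["frequency"], bins=[0, 2, 5, 100], labels=["1-2", "3-5", "6+"]
)

# Median time to convert, split by group and frequency bucket.
summary = (
    conversions
    .groupby(["group", "freq_bucket"], observed=True)["hours_to_convert"]
    .median()
)
print(summary)
```

Comparing medians by group indicates whether exposed users convert faster, and slicing by frequency bucket shows whether that effect changes with the number of impressions.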

The industry has taken a long time to introduce viewability solutions and we have further to go before we fully account for payment standardization and measurement limitations. But the progress we've made thus far could be enough for an industry shift in the options available to marketers to execute on these types of media effectiveness studies.

Notable Caveats

Project Iceberg 2.0 only considers the evaluation of display media. It doesn't account for user exposure and influence from other marketing channels, such as search or email, though that is a capability that could be layered on.

SQL was not used to analyze the data. I applied filters and metrics in Adobe Data Workbench to the dataset to tease out the test and control groups that inherently existed in the underlying data.

Follow Matt Scharf (@Matt_Scharf), Adobe (@Adobe) and AdExchanger (@adexchanger) on Twitter.

9 Comments

  1. Hi Matt,

    Nice work. We considered doing this with our non-viewable inventory also, but were concerned about the risk of bias - people who see non-viewable impressions being different to people who don't (they look at different sites, they are less or more competed for in the auction, they scroll faster etc..). It stands to reason that this bias is exacerbated when we compare people who exclusively see viewable or non-viewable impressions. So far we have not been able to prove to our satisfaction that this approach is bias-free and have concerns about recommending it to our clients.

    May I ask how you have proved the absence of bias? Have you been able to make a valid comparison to a randomised holdout experiment? Have you looked at the distribution of a few suitable control variables?

    Cheers,

    Andrew

    Reply
  2. Matt Scharf

    Excellent point – thanks, Andrew. We went through the same considerations. A couple of validation exercises were to look at the distribution of the control group across our media inventory, to see if any sites over-indexed in their representation in the control group (indicating a bias), and to compare whether the test and control groups had similar behavior on the site – for example, did both groups reach similar pages onsite? There are data points that may warrant removal more than others in an analysis like this, to exclude any outliers depending on the campaign's features. Like all methodologies, there are some potential shortfalls, such as the risk of audience bias you suggest, but I'm excited to see how this methodology can be iterated on since we're still at the early stages of considering alternative options like this one.

    Reply
  3. Matt Anthony

    Andrew -

    a great catch. Nice to see there are others out there paying attention to the nitty-gritty of causal inference.

    The statistically correct way to solve this is still through causal-inference modeling ... but you have a two-stage model going on here. Treatment assignment here is still confounded by the fact that those who viewed the ad (which is out of the initial page display area) absolutely differentiate themselves from the "controls," who by definition did not scroll far enough to bring the ad slot into view. So the outcome of the analysis suggested here is actually a study on the causal effect on purchasing created by those who scroll down far enough within the page to bring the ad into view versus those who do not ... Additionally, the test and control here by definition were all served the ad, so you have a population bias remaining such that the effect is only relevant within the universe targeted by the ad. What if that was the wrong universe?

    This is a start, but far from complete.

    Reply
  4. Andrew's point is the key here. Whilst the idea of not having to plan an experiment and manufacture a control group is attractive, the advantage of doing so is that you control the methodology by which the control group is selected and have the best chance of eliminating bias. You also have the ability to control all other factors - bids, sites, viewability etc - and keep them identical across the test and control group.

    Once you go through the necessary validation exercises etc, are you saving time and effort over the traditional method? This of course still leaves you with the advantage mentioned near the beginning, that you have continual confirmation of the effectiveness of ads, as opposed to results from a static period of time. Which probably makes it worth it. IF you can take every step to eliminate bias.

    Matt Anthony - I'm not sure what your final point is, in the context of evaluating this new method against a more traditional experiment: "the test and control here by definition were all served the ad so you have a population bias remaining such that the effect is only relevant within the universe targeted by the ad. What if that was the wrong universe?" Does the same not apply to any traditional experiment? If I take an audience and show 10% an ad for the Red Cross and 90% an ad for my company, any conclusion I draw will only be relevant to the audience targeted. I don't see this as any particular disadvantage of this new method over any other.

    Reply
  5. Glad to know I'm not the only skeptic! I am pretty concerned about the extent of bias, and I don't feel like I could recommend this to a client without a lot more evidence - when we're measuring relatively small effects, we owe it to ourselves to be a little pedantic. The fact that the two groups were seen on the same sites is hardly surprising when we put them there, and certainly isn't enough evidence that they are the same.

    It doesn't seem a big stretch to get into the habit of comparing the incidence of conversions among control audiences and those that see exclusively non-viewable ads. Of course that won't prove an absence of bias, but it could prove its presence.

    Reply
  6. I would agree with Andrew. The methodology is very creative, and introduces an interesting perspective on how to transform a "waste" into something useful. Unfortunately, by doing so, the bias is inherent.

    The bias comes from several directions. The ad placements themselves are different and thus induce bias: ads appearing at the top of the page will have a higher viewability rate than those appearing below the fold, which require consumers to scroll, so they are overrepresented in one of the groups. Consumers who scroll are also different from those who don't – some just browse passively, while others engage with the content. And the type of content each consumer engages with is different.

    We have a lot of experience with the creation of test and control methodologies. When the insight gained is counterintuitive (and thus likely more valuable), you want to ensure you have a test methodology that is as clean as possible. These insights drive large investment decisions, so discovering that the methodology may have driven part of the answer would create great concern.

    As with vaccine testing, the only way to ensure causality is by random selection of your test subjects.

    Reply
  7. Emilio Lapiello

    I agree with the skepticism on this methodology because of the bias it clearly introduces.
    By definition, measuring ad lift using the difference between a treated and a control group is valid only if the sole difference between those groups is exposure to the ads.
    Exposed and hold-out groups must be statistically identical – I like to say they have to be ‘otherwise identical’, where the ‘otherwise’ refers to the ad exposure – and the best way to guarantee they are identical is by randomly assigning a subset of the target audience to a control group.
    The additional point I would make is that, even if it turned out this methodology is unbiased (e.g., by chance the treated and control groups end up being statistically identical), the measurement described in the article would only provide the incremental impact of viewable impressions.
    As we all know, not all impressions will ever be viewable. The next time we go to market with our campaign, several impressions will not be viewed and therefore won't be effective. This ‘waste’ is part of any ad campaign, and it should be included in the ad effectiveness we are trying to measure.

    Reply
  8. Yishay S.

    Hi Matt, this is a very interesting way to have an ongoing pulse for a display campaign. I'm wondering if the group of users that were exposed only to out-of-view impressions is big enough. Assuming you are running a well-executed targeted campaign, you should reach users many times (some out of view and some in view). In your diagram it seems like the groups are equal, but in reality I am not sure this is the case. From your experience, what was the percentage of users not exposed to any viewable impression?

    Reply
  9. A big issue I see here is that if you have one user that sees 5 of the same ad, 100% in view, but for .96 seconds each time, that user has never seen an ad according to IAB standards. However, a second user could see 1 ad for 1.02 seconds and be counted as in-view. While this is an exaggeration of what's likely, to count users who never "saw" an ad based on IAB viewability criteria is not a true control. If you remember the 300x600 ad unit in Yahoo mail, we found that was only "in view" 32% of the time across a number of campaigns we looked at. It was the only ad on the page, though! It clearly was viewed, but just not counted under the criteria.

    Reply
