This is the second in a series of interviews with vendors combating the problem of ad fraud. Other companies participating in this series include White Ops, DoubleVerify, Forensiq, Moat, PubChecker and Telemetry. Read last week's interview with Videology.
When it comes to detecting and preventing ad fraud, Integral Ad Science is the middleman. That is, the company provides technologies and services to all constituents in the ad-buying ecosystem.
It has account management teams that help clients implement and run its fraud-detection and prevention tools. These teams also act as intermediaries between buyers and sellers.
“If we see a publisher violate certain things like serve a whole host of ads to out-of-geography requirements or violate certain keywords in the campaign or brand safety policy settings, our account team will reach out proactively and try to rectify the problem,” said David Hahn, the company’s SVP of product management.
On the buy side, Integral works with the major demand-side platforms (DSPs) like Turn, DoubleClick Bid Manager and AppNexus, as well as with “several hundred brand advertisers” that include, according to Hahn, more than half of the Fortune 100. “We cover a fairly large footprint in the digital ecosystem,” he said. “All in all, we’re doing across all our products north of 2.5 to 3 billion impressions per day.”
On the sell side, Integral integrates with major networks, premium publisher exchanges and supply-side platforms to help police policies and increase media yield.
Integral originated in 2009 with a brand safety product, one that enabled advertisers to stop creative from serving if it posed a brand safety risk. The next year, it began getting into the fraud and viewability space.
This includes a metric called TRue Advertising Quality (TRAQ), designed to provide both advertisers and publishers with a sort of credit score, allowing both parties to determine the value of various media assets.
“We’re very much a real-time actionable solution so we’ll pick up something going wrong with the campaign within a day,” Hahn said. “Some of the traditional verification systems provide a weekly or campaign wrap-up report at the end of the week. And you realize you blew 5% or 10% of your media to the wrong geography, or to some illegal download media page, or to some fraudulent impressions.”
Hahn spoke with AdExchanger.
AdExchanger: What are the biggest misconceptions around fraud?
DAVID HAHN: Just because something is lousy quality doesn’t mean it’s fraud. If there are 20 ads on a page and 19 of them sit at the bottom and never come into view, that’s not fraud. That’s poor-quality media.
The show that runs on your television at 3 a.m. serves ads. But not a lot of people see those ads because they’re sleeping. Same with the web. There’s a ton of lousy quality out there, but that doesn’t mean it’s fraud.
Why does this need to be distinguished?
Even really poor quality has some value to somebody out there. It might not be a lot of value, but even an ad that has a 20% probability of being seen by a human carries some value.
Let’s say AdExchanger goes out, buys traffic and some of it is fraudulent. Does that mean every single impression on AdExchanger is fraudulent? No, it doesn’t. It means a certain percentage of its impressions are fraudulent. However, certain vendors will wipe out the entire site.
We had the same issue with brand safety in 2009, when certain vendors would look at subdomain or domain level, and they’d label Huffington Post as pornographic because of the 10% or 5% of pages that were somewhat salacious. That’s not fair to an advertiser or a publisher.
Why are poorly placed media and fraudulent media conflated?
Because the definition of fraud isn’t fully baked yet. That’s what the TOGI (Traffic of Good Intent, fielded by the IAB) task force is trying to work through: the actual definition of fraud. We’re a lot closer now than we were a year and a half ago when TOGI was set up.
What’s your definition of fraud?
The definition of a fraudulent ad is probably an ad that never has the opportunity of being seen by a human.
What type of ad fraud do you commonly see?
There are two broad buckets we focus on. The first is non-human behavior. This type of fraud occurs when a machine has been taken over by a bot, and that bot has given the machine instructions to serve ads behind the scenes so no human will ever see them. There are tons of these botnets out there generating millions of ad impressions on a daily basis that never have the opportunity of being seen by a human.
There’s another type of fraud that doesn’t have the opportunity to be seen by humans. That doesn’t mean it’s served to a bot; it means the publisher partaking in this type of fraud is knowingly trying to defraud an advertiser. That type of fraud includes stuffing 1x1 pixels all over a page and serving a bunch of ads into those 1x1 pixels.
There’s impression stuffing, where you layer seven, eight, nine or 10 impressions on top of each other in an ad slot so only the one on top gets seen and those below it don’t. In the video space, we see similar behavior, where video players are stuffed into 1x1 iframes, or videos loop one right after the other without being shown to users.
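The hidden-ad patterns Hahn describes – 1x1 pixel stuffing and impressions layered in one slot – can be sketched as a simple filter over impression measurement records. This is an illustrative sketch only; the record fields (`width`, `z` and so on) are assumptions, not Integral’s actual data model.

```python
def flag_suspect_impressions(impressions):
    """Return IDs of impressions that can never be seen by a human."""
    suspects = set()
    top_per_slot = {}  # (page, x, y) -> impression with the highest z so far
    for imp in impressions:
        # Pixel stuffing: the ad is rendered into a 1x1 (or zero-size) frame.
        if imp["width"] <= 1 or imp["height"] <= 1:
            suspects.add(imp["id"])
            continue
        # Impression stuffing: several ads layered in one slot; only the
        # top layer (highest z-index) has a chance of being seen.
        key = (imp["page"], imp["x"], imp["y"])
        top = top_per_slot.get(key)
        if top is None:
            top_per_slot[key] = imp
        elif imp["z"] > top["z"]:
            suspects.add(top["id"])   # the previous top ad is now buried
            top_per_slot[key] = imp
        else:
            suspects.add(imp["id"])   # this ad sits below the current top
    return suspects
```

A real detector would also need to handle off-screen positioning, CSS visibility tricks and nested iframes, but the principle is the same: flag any impression with no physical opportunity to be viewed.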
We see some instances where there’s masking or misrepresentation of the actual domain or the actual site the ad is being bought on.
How do you weed this stuff out?
Given the scale we deal with in the industry, you can’t rely on manual processes to find fraudsters. They’re too smart and they move too quickly, so you need to leverage tools to help you identify and rid your exchange, network or campaign of fraud.
The second piece is we need to reduce the incentive to commit fraud. Currently, the way we measure performance online is fundamentally flawed. [The industry uses] correlation-based models: Was the last touch associated with this conversion? If so, the publisher gets credit.
But just because I saw the ad last doesn’t mean it’s the cause of my conversion.
So what needs to change?
We need to move to causality as a performance indicator and not correlation. Some of the things we work on with our buy-side clients is how to derive causality for these campaigns. If I’m being measured on last touch, I have an easy way to game that system. That is exactly how the fraudsters are winning.
How does this gaming work?
Let’s say you have three publishers on a campaign. Publisher one serves 100,000 impressions and it’s a direct premium publisher with almost no fraud on the campaign. Publisher two serves 500,000 impressions and half are fraudulent. Publisher three serves a million impressions and three quarters are fraudulent.
If you’re using last-touch or last-click attribution, chances are publisher three, because it’s serving so many more ads, will wind up with some type of correlation-based conversion, and chances are a lot of those last-touch credits will be derived from the fraudulent impressions it’s serving.
If you’re calculating attribution based on causality as opposed to correlation, any impressions served by publishers two and three that were fraudulent would be automatically eliminated from the possibility of converting. Meaning publisher three would only have 250,000 impressions that could potentially count toward attribution, versus a million.
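The three-publisher example above works out as follows. The numbers come straight from the interview; the function is a minimal sketch of the arithmetic, not Integral’s attribution model.

```python
def eligible_impressions(served, fraud_rate):
    """Impressions still eligible for attribution after fraud filtering."""
    return int(served * (1 - fraud_rate))

# (impressions served, share of those that are fraudulent)
publishers = {
    "publisher_1": (100_000, 0.0),     # direct premium, ~no fraud
    "publisher_2": (500_000, 0.5),     # half fraudulent
    "publisher_3": (1_000_000, 0.75),  # three quarters fraudulent
}

for name, (served, rate) in publishers.items():
    print(name, eligible_impressions(served, rate))
# publisher_3 drops from 1,000,000 served to 250,000 eligible impressions.
```

Under last-touch attribution all 1,000,000 of publisher three’s impressions can claim a conversion; filtering fraud first caps its claim at 250,000, removing the payoff for serving fraudulent volume.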
Can you give me a real-world example of when correlation-based metrics failed?
In one example, we saw a DSP client of ours start optimizing around what it thought was a viewable impression. In fact, the impression was being served by a bot that made the DSP think it was viewable. And the vendor the client was using – not us – was measuring it as in-view and optimizing around it.
However, it was fraud. The performance of the campaign never improved at all, but they thought they were doing a good job optimizing around viewability. They were really optimizing around fraud. To that end, if you just look at correlation-based metrics, you’ll never derive true performance around a campaign.
How does Integral differentiate its fraud-detection and prevention methodologies?
We look at behavioral patterns to identify infected users and infected machines. We can differentiate whether the signals come from a bot or from a human on an infected machine.
We also look at a lot of impression signals because of our tag-based products. We see north of two to three and a half billion impressions per day across all of our products. So our footprint is vast when it comes to the ad-supported web, and leveraging that footprint to see these impression-level signals further improves both the accuracy and granularity of our detection.
How rare is impression-level fraud detection?
I think it’s pretty rare. How do I say this politically correctly? I think there’s a level of marketing that creates some confusion. That applies to the people who throw out large numbers, claiming that 40-50% of our industry is fraudulent, but it also applies to certain people whose technology can’t differentiate between the domain level and the impression level. We’ve been doing impression level for well over a year now. I don’t know who else does it versus who else claims to be doing it.
Why is having a vast footprint important?
If you’re looking at a small slice of the puzzle, it’s hard for you to figure out what the whole puzzle looks like. We see almost the entire puzzle. If we see certain things in one part of the puzzle and [certain things in] another part, we can connect those pieces easily because of our scale. We can come up with a reliable way of saying: “This doesn’t look like a normal transaction and must be fraudulent.”
Some of the buying and selling platforms have built their own fraud-detection technologies. But they’re only looking at their map of the universe. We sit across most of the major platforms. If something happens on platform one, platform two won’t get caught out by it. By the time it gets there, we’ve updated our systems and platform two has benefited from a macro view of the ecosystem.
Are ad verification and prevention part of the same solution, or separate?
I’d say they’re fairly separate. The disciplines are different and we’ve invested in them as individual businesses to make sure we’re best-in-breed for all of them.
What are the differences?
The person developing our model to detect pornography or offensive language is a different type of data scientist from the one detecting fraud, which requires behavioral analysis. The back end is where the skill sets differ. But our solutions are not designed to do verification. We see verification as a passive category; our brand safety solution actually blocks the ad from serving.
Do agencies and partners use more than one fraud vendor?
Most agencies have a list of preferred vendors and no agency will select one and say this is the de facto fraud vendor we use. They have a recommended list and let the teams make a decision. But once a team has made a decision, they’ll use one vendor. Most of the advertisers we work with use us exclusively for fraud detection.
And on the sell side, it varies. Certain publishers have one or two vendors – I don’t see much more than two usually – in their stack, and they take both those data points and come up with their own evaluation.
The sell side starts with more than one vendor integrated and moves over time to a single vendor. Agencies have a few recommended, and the teams will pick which one they want to use exclusively.