In The Fight Against Mobile Ad Fraud, Science Trumps Spam

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Ivan Zalamea, data scientist at PlaceIQ.

In digital advertising, fraud is a reality everyone must learn to deal with and adapt to. The mobile ecosystem is no exception, and it’s increasingly sprinkled with low-quality location data.

While the majority of mobile location data is of good quality, digital contamination brings unfortunate results, including spammy requests, industrywide hesitation and subpar ROI for brands.

At this critical time, when mobile is not only booming but also being combined with other powerful mediums, such as television, marketers need the best location data available. That’s where data scientists come in. Their roles are varied, but all data scientists share at least one common goal: to ensure the quality of data.

My background is in astrophysics, where we use connections and correlations to reveal the mysteries of the universe. The universe is vast, but to study it, we zero in on individual elements that make up the whole. Similarly, in the era of big data, it’s not the size of the data that matters most, but the actionable pieces of information that can be derived from that data. Understanding the many parts and layers involved makes it possible to create a coherent picture of the validity of ad requests.

The ad request – an invitation to display a banner ad within an app or on the browser of a smartphone – is one of the basic building blocks of location analytics. Unfortunately, there are countless excuses and incentives for taking a shortcut and responding to these requests with poor-quality location data that results in spammy ad requests.

What Does Fraud Look Like?

Long story short: Spammy ad requests can and must be avoided, and it doesn’t take a rocket scientist to rid location data of fraud. Sometimes all you need is enough determination and patience to let nonhuman actors show themselves.

Suppose an ad publisher sends a large number of ad requests that qualify for an audience of soccer moms. How many soccer moms from a particular suburb are using their smartphones on a random Wednesday evening? Heuristics and historical records suggest the number should be between 10,000 and 12,000. If the publisher’s numbers are dramatically different from that estimate, it’s a good indication that it’s sending spammy data.
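A minimal sketch of that sanity check, in Python. The 10,000 to 12,000 range comes from the example above; the function name and the `tolerance` parameter, which decides how far outside the range counts as "dramatically different," are assumptions made for illustration.

```python
def volume_looks_plausible(observed, low=10_000, high=12_000, tolerance=0.5):
    """Return True if the observed audience size is near the expected range.

    `tolerance` is a hypothetical knob: 0.5 widens the range by 50 percent
    on each side before a publisher's numbers are treated as suspect.
    """
    lower_bound = low * (1 - tolerance)
    upper_bound = high * (1 + tolerance)
    return lower_bound <= observed <= upper_bound


# A publisher reporting 11,000 soccer moms passes; 250,000 does not.
plausible = volume_looks_plausible(11_000)
inflated = volume_looks_plausible(250_000)
```

A failed check is only an indication, not proof. In practice it would be one of many tests applied to the same publisher.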

The number of questions or strategies that can be used to expose fraud is technically unlimited because any large-scale, non-human-generated ad requests will invariably fail most tests.

Human behavior, on the other hand, is diverse and difficult to understand at the individual level. Luckily, when you’re looking at a group of people as a collective, their actions tend to follow certain patterns that can be described mathematically. For example, some elements of a group can be expected to conform to an equation where quantity Y is proportional to a fixed power of quantity X, in which case we say that Y follows a power law with respect to X.
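Written out, a power law takes the form below. The constants $c$ and $k$ are generic symbols introduced here for illustration; they are not values from any particular data set.

```latex
Y = c\,X^{k}
\quad\Longrightarrow\quad
\log Y = \log c + k \log X
```

Taking logarithms turns the curve into a straight line with slope $k$, which is why power laws are usually inspected on log-log axes.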

In the case of the suburban soccer moms, it’s precisely one of those power laws that allows ad requests to be classified as spammy or true. Below is an example of a test that can be performed to determine if a location data set is valuable or spammy. The points in red on the right side of the figure are well above the expected value, which is determined using the power law, denoted by the black line. A data scientist would flag the red dots as failing one test and remove them from subsequent analytics pipelines.
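One way such a test could be sketched in Python: fit the power law on trusted traffic, then flag points that sit far above the fitted line. The column does not say which quantities sit on the axes or how the threshold is chosen, so `xs`, `ys`, the sample numbers and `max_ratio` are all hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(c) + k * log(x); returns (c, k)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k


def flag_outliers(xs, ys, c, k, max_ratio=3.0):
    """Return indices of points more than max_ratio times above c * x**k."""
    return [
        i for i, (x, y) in enumerate(zip(xs, ys))
        if y > max_ratio * c * x ** k
    ]


# Hypothetical trusted baseline that follows y = 1000 * x**-0.5 exactly.
baseline_xs = [1, 2, 4, 8, 16, 32]
baseline_ys = [1000 * x ** -0.5 for x in baseline_xs]
c, k = fit_power_law(baseline_xs, baseline_ys)

# Hypothetical new traffic: the last point is far above the expected value.
new_xs = [1, 4, 16, 32]
new_ys = [1050, 480, 260, 2000]
flagged = flag_outliers(new_xs, new_ys, c, k)
```

Fitting on a trusted baseline rather than on the data under test keeps a large batch of spam from dragging the line toward itself.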

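[Figure: ad requests compared with the expected value from the power law (black line); the red points on the right lie well above it.]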

Spam Rises

The statistical properties of ad requests lay the groundwork for letting fraud reveal itself. The weeding process can then be improved by including geographical information for the region where ad requests originate. For example, based purely on statistical measures, ad requests originating from within a few blocks of a college football stadium may look suspicious to a data analyst when volume suddenly spikes. But once the analyst factors in additional variables, such as the number of businesses near the stadium or the fact that it’s game day, the higher volume doesn’t look unusual anymore. The most effective filter for fraudulent ad requests is knowledge of both the statistical properties of ad request traffic and contextual information about the time and place where the requests originate.
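The stadium example can be sketched as a statistical threshold that context is allowed to raise. Everything here is an assumption made for illustration: the function name, the multipliers and the `spike_ratio` cutoff are not taken from any real pipeline.

```python
def is_suspicious(observed, baseline, context_multipliers=(), spike_ratio=3.0):
    """Flag a volume spike unless known context explains it.

    baseline: typical request volume for the area (hypothetical figure).
    context_multipliers: expected uplift factors, e.g. 4.0 for game day.
    spike_ratio: how far above the expected volume counts as suspicious.
    """
    expected = baseline
    for multiplier in context_multipliers:
        expected *= multiplier
    return observed > spike_ratio * expected


# Near the stadium, 15,000 requests against a baseline of 2,000 looks wrong...
without_context = is_suspicious(15_000, 2_000)
# ...until a hypothetical 4x game-day uplift is factored in.
with_context = is_suspicious(15_000, 2_000, context_multipliers=(4.0,))
```

The same spike is flagged on statistics alone and cleared once the game-day factor is applied.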

Location data analytics, much like astrophysics, offers valuable insight into human behavior. Poor data quality, however, impairs that insight, which is why spam prevention should be taken seriously across the mobile ad ecosystem.

Follow PlaceIQ (@PlaceIQ) and AdExchanger (@adexchanger) on Twitter.
