
Can Publishers Enable A New Chapter For Modeled Data?


“The Sell Sider” is a column written for the sell side of the digital media community.

Today’s column is written by Alessandro De Zanche, an audience and data strategy consultant.

The use of modeled data at scale is evolving rapidly due to legal and technical limitations.

It is frequently misrepresented and mis-sold, but there would be no need for ambiguous sales tactics if modeled data were used as intended. Marketers shouldn’t leverage modeled data for targeting, for example, and expect to engage only with male users, when in reality they are reaching a segment of users with a higher probability of being male than an untargeted audience.
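To make that distinction concrete, a modeled “male” segment is best read as a lift over the baseline audience, not as a guarantee about any individual. The sketch below uses invented numbers (they do not come from the column) to show the arithmetic:

```python
# Toy illustration: a modeled "male" segment is probabilistic, not
# deterministic. All rates below are invented for the example.

def segment_lift(segment_rate: float, baseline_rate: float) -> float:
    """How many times more likely a segment member is to carry the
    modeled attribute than a randomly targeted user."""
    return segment_rate / baseline_rate

# Suppose 49% of the untargeted audience is male, while the modeled
# segment is 70% male. Targeting the segment reaches men ~1.43x more
# often -- but 30% of impressions still land on non-male users.
lift = segment_lift(0.70, 0.49)
print(round(lift, 2))  # 1.43
```

The point of the exercise: the segment improves the odds, it does not eliminate the miss rate, and the sales pitch should say so.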

Besides a better-informed sales pitch, is modeled data good data? It depends on several factors, including the type of data and its source, as well as how it is used and implemented in an advertising and marketing context.

Quality modeled data is extremely difficult to produce. I once worked for a market research company that produced several different data products. My team took traditional research assets and products – including consumer panels across dozens of countries – and connected them to digital advertising and marketing ecosystems for activation, such as targeting, personalization and reporting.

The ‘seed’ and the ‘universe’

Imagine a single source panel representing a country’s population with “always-on” capture of deterministic data, generated from each individual panelist’s activity across TV and digital – including demographic and household information, offline behaviors, print media consumption and purchase data. Our mission was to take the deep knowledge and insights provided by that limited, deterministic data source – the “seed” – and create reach through modeling.

High-quality modeled audience data requires a huge universe of cookies, device IDs and their relative data points to be used as a “canvas” that will be enriched with the knowledge being extracted from a deterministic seed. It is relatively “easy” to build or obtain that seed, compared to the more difficult task of gaining access to a high-quality, granular pool of millions of IDs, collected in a consistent, fully compliant and informative way, while also building a relationship with the user based on transparency and value exchange (which exceeds legal requirements).

From a technical perspective, another key aspect is that each cookie or device ID within the “universe” needs to have enough data points attached to it. Moreover, these data points shouldn’t themselves be modeled attributes; modeling on top of modeled data – rather than raw data – is not best practice. The more granular and raw the data points, the more effective the models, and the more robust the lookalike segments built from them.
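The seed-and-universe mechanic described above can be sketched in a few lines. This is a deliberately simplified illustration, not any vendor’s actual pipeline: the feature vectors, IDs and the centroid-similarity scoring are all invented stand-ins for the raw data points and the (far more sophisticated) models the column alludes to.

```python
# Minimal sketch of lookalike expansion: learn a profile from a small
# deterministic "seed" panel, then score a larger ID "universe" whose
# members carry the same raw (not modeled) data points.
from math import sqrt

def centroid(vectors):
    """Average feature vector of the deterministic seed."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def similarity(a, b):
    """Cosine similarity between two raw feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def lookalikes(seed, universe, top_n):
    """Rank every universe ID by closeness to the seed centroid and
    keep the top_n as the modeled (lookalike) segment."""
    c = centroid(list(seed.values()))
    ranked = sorted(universe, key=lambda uid: similarity(universe[uid], c),
                    reverse=True)
    return ranked[:top_n]

# Seed: a small panel with deterministic data points.
seed = {"p1": [5.0, 1.0, 0.2], "p2": [4.0, 0.8, 0.1]}
# Universe: a large ID pool with the same raw data points attached.
universe = {"u1": [4.5, 0.9, 0.15],   # close to the seed profile
            "u2": [0.2, 5.0, 3.0],    # very different behavior
            "u3": [4.8, 1.1, 0.25]}   # close to the seed profile
print(lookalikes(seed, universe, top_n=2))  # ['u1', 'u3']
```

Note what the sketch depends on: the universe IDs must already carry granular raw features comparable to the seed’s. If those features were themselves modeled, the similarity scores would compound error, which is exactly why modeling on modeled data is discouraged.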

An unworkable problem


It is becoming increasingly difficult to create massive pools of user IDs with granular log-level data, especially for the middlemen that have no direct contact with audiences, and companies are falling by the wayside.

Last Thursday, nonprofit Privacy International filed a complaint with data protection authorities in France, Ireland and the UK against Acxiom, Criteo, Equifax, Experian, Oracle, Quantcast and Tapad “to protect individuals from the mass exploitation of their data in contravention of the General Data Protection Regulation.”

Each of these companies collects data from millions of users. They are not household names, in the sense that 98% of the user base does not know them or have a clue about what data is collected and how it is used; almost none of these companies have any direct relationship with the user. They are increasingly under scrutiny as stricter privacy regulations are implemented across the world.

The UK Information Commissioner has also weighed in on the privacy element of modeled data: “If you’re targeting people on the basis of inferred data, that is personal data. The use of lookalike audiences should be made transparent to individuals.”

A high-quality open ecosystem

With much uncertainty and fewer sources for large data pools, we are left with the big platforms – Facebook, Google and Amazon – which are not immune from privacy and compliance-related controversies.

But these platforms lack either the context (Google and Facebook) or the variety of data needed to create rounded profiles of real people that go beyond buying behaviors (Amazon).

This void creates a unique opportunity and moves the spotlight to publishers, which have a unique combination of granular proprietary data, context and, crucially, a transparent and honest relationship with the individual. What would be missing is scale, which makes an even stronger case for media brands’ alliances.

Media brands have the potential to evolve not only as a source of quality first-party data but as a sophisticated ecosystem for effective data modeling, including becoming a reference point for advertisers wishing to enrich their data in a context that keeps transparency, fairness and the user at its core.

Follow Alessandro De Zanche (@fastbreakdgtl) and AdExchanger (@adexchanger) on Twitter.
