“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by Leslie Wood, chief research officer at Nielsen Catalina Solutions.
Say the phrase “data modeling” in a room full of ad folks and watch them quickly become preoccupied with answering urgent emails.
Modeling has a reputation for being too complex — or too simplistic. It’s seen as overly theoretical by some and as a cheap trick by others, like a sleight of hand to cover deficiencies.
In short, it’s not well understood. But – spoiler alert – all audience segments are modeled to some degree, whether it's 10% or 50% or 90%. This is for several reasons, but the most important one is scale. If all advertising were delivered based only on recorded behavior, there would be a lot of good and useful data left on the table.
With modeling, it’s possible to identify potential buyers who look like recorded buyers across hundreds of variables and are very likely to make a purchase. Since modeling is an important piece of creating audience segments, advertisers need to understand how it affects the quality of the segments they're using to deliver their content.
How Data Models Are Used In Advertising And Life
To use a practical example of a data model that most people interact with daily, consider Netflix or Hulu. If you were to log in and find a home screen listing every available TV show or movie, you’d likely find yourself overwhelmed, struggling to navigate a dense user experience as you scroll past programs that are unrelated or uninteresting to you.
However, thanks to data models, Hulu and Netflix can personalize each user’s home-screen experience, complete with “pick up where you left off” and “recommended for you” sections based on past viewing behavior. Sure, the recommendations might not be perfect every time, but the beauty of a data model is that the more data fed into it and the more questions users answer, the more predictive and useful it becomes.
Every day, advertisers and their agencies must decide which audience segments they want to deliver their messaging to. Whether using a self-service programmatic platform or negotiating directly with a premium publisher, the question is the same: Do we want to reach a precise, targeted audience or a broader audience for the lowest cost?
While every campaign has a different strategy, the recent shift toward zero-based budgeting has the pendulum swinging toward more personalized advertising. And more personalized advertising means a more precise and targeted audience. This shift means less modeling is being applied to large-scale data sets, but modeling is still used often, and it's important to understand what goes into a model.
When Data Models Are Important
When selecting a target for a brand, it’s important to start with the creative message. Does the creative have general appeal, or will it only resonate with a specific group of consumers? If it’s very focused, then it becomes important that the message only reaches the right target – the target that will respond well to the creative. In this case, modeling is very important, as is understanding what data is supporting both the modeled and unmodeled parts of the target. If the creative has a general appeal and doesn’t require such targeted delivery, there may be enough data to reach the right audience without much modeling.
What Makes A Good Model
Modeling has two key parts: the model and the data.
Models today are trained on one set of data – called the seed data – and then the model is applied to the rest of the data. This seed data needs to be high-quality, meaning that it’s recent, covers a long enough time span to establish a pattern and is representative of the audience you’re trying to reach.
For example: If you’re selling pet food, you are better off with seed data that tells you whether someone owns a pet or previously bought pet food than with data that tells you only that he or she watched a YouTube video about pets. It’s also important to understand what data is being used for the modeled data set: How many variables are the same as the seed data?
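To make the seed idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular vendor builds segments: it averages the seed buyers into a single profile and ranks everyone else by cosine similarity to that profile, a deliberately simple stand-in for the machine-learning models used in practice. The three features, the IDs and the function names are all hypothetical.

```python
import math

def centroid(vectors):
    """Average the seed vectors column by column into one profile."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def lookalike_scores(seed, candidates):
    """Rank candidates by similarity to the average seed profile, best first."""
    profile = centroid(list(seed.values()))
    scores = {cid: cosine(vec, profile) for cid, vec in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical features: [owns_pet, bought_pet_food, watched_pet_video]
seed_buyers = {
    "s1": [1, 1, 0],
    "s2": [1, 1, 1],
    "s3": [1, 0, 0],
}
# People with no recorded purchase, described by the same variables.
unlabeled = {
    "u1": [1, 1, 0],  # owns a pet and bought pet food
    "u2": [0, 0, 1],  # only watched a pet video
    "u3": [0, 0, 0],  # no pet-related signal at all
}

ranked = lookalike_scores(seed_buyers, unlabeled)
for person_id, score in ranked:
    print(person_id, round(score, 3))
```

In this toy data, the person who owns a pet and bought pet food ranks above the person who only watched a video, which mirrors the point about seed quality. The sketch also only works because the unlabeled records carry the same variables as the seed.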
The second aspect is the model. Most modeling approaches are pretty good at producing a lookalike model. But many models only look at or model one behavior at a time. This is the common approach, but it means many models end up looking alike.
Instead, models should be built on the foundation of an entire category or group of brands, so that they can find the nuances between the brands and discern the distinctions between different audience segments. It’s also critical that a model can select the right variables to use. Most models can do this, thanks to advancements in machine learning.
So, to get back to the original question: Are audience segments mostly real or mostly modeled? And does it matter?
The answer: Most of the data sets used for advertising are modeled to some degree, probably close to 100% of them. But can you still have high-quality segments that have been modeled? Absolutely.
There's a lot that can go wrong with a data model if the underlying data, analysis and decisioning aren't of good quality. But, at the end of the day, everybody models.