Six Questions Marketers Need To Ask About Data Quality

Data-driven advertising requires good data. But lots of bad data and questionable data practices can harm a marketing campaign.

Marketers need to know when to use their own data, and when to rely on partners. They need to understand the trade-offs between cost, accuracy and scale. They need to know where their data came from, and how to test it cheaply. And they need to know how to evaluate multiple data sources.

AdExchanger talked to brands, agencies and consultants about the questions marketers should ask to ensure they’re using data effectively and accurately.

Question One: How Is The Segment Created?

Finding out how segments are created is arguably the most important question of the bunch. When a marketer is targeting “auto intenders” or “beauty buyers” or “people who visit coffee shops,” they need to know how that segment is built and whether it was created using their own data or that of a third party.

“Third-party data can be very valuable when it’s segmented very carefully,” said Ana Milicevic, principal and co-founder at Sparrow Advisers, a boutique data-focused consultancy.

“If someone is targeting ‘auto intenders,’ they may not think about what it signifies,” she said. “Is it someone buying a car this weekend? Or someone interested in cars in general? If you don’t have this defined, it’s very easy to lump together widely defined segments.”

Data providers can use different methods to come up with segments. Some data can be “totally probabilistic, based on assumptions you never asked about,” warned Oleg Korenfeld, Mediavest Spark’s ad tech/platform EVP.

“On the other side,” he said, “you can know the list of the email addresses where they were created and matched against a database, like a supermarket loyalty card. That’s as deterministic as it gets, without any cookies involved.”

Other segments are created using modeling, which can improve scale, but reduce quality.

“We want to know now exactly what percentage of a segment is modeled versus seed data,” said Jonny Silberman, director of digital strategy and innovation at Anheuser-Busch InBev, at LiveRamp’s RampUp conference in San Francisco Tuesday.

Question Two: Is The Data Worth The Cost?

If males are half the population, but it costs three times as much to target them, is buying a gender-based segment even worth it? Sometimes.
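One rough way to reason about that question is to compare what it costs to reach one man with and without the segment. The sketch below is illustrative only: the $2 base CPM is a made-up figure, the function name is hypothetical, and it assumes the gender segment is perfectly accurate and that impressions served to anyone else have no value.

```python
def cost_per_on_target_impression(cpm, on_target_share):
    """Cost of each impression that actually lands on the intended audience.

    cpm: price per 1,000 impressions bought.
    on_target_share: fraction of bought impressions that reach the audience.
    """
    if not 0 < on_target_share <= 1:
        raise ValueError("on_target_share must be in (0, 1]")
    return cpm / 1000 / on_target_share


BASE_CPM = 2.00  # hypothetical untargeted price, not from the article

# Untargeted buy: half the impressions reach men.
untargeted = cost_per_on_target_impression(BASE_CPM, 0.5)

# Gender-targeted buy: three times the price, assumed 100% accurate.
targeted = cost_per_on_target_impression(BASE_CPM * 3, 1.0)

# Price multiple at which the two buys cost the same per man reached.
break_even_multiple = 1 / 0.5

print(f"untargeted: ${untargeted:.4f} per on-target impression")
print(f"targeted:   ${targeted:.4f} per on-target impression")
print(f"break-even price multiple: {break_even_multiple:.1f}x")
```

Under those assumptions the segment has to cost less than twice the untargeted rate to pay for itself on reach alone, so a 3x premium only makes sense if targeting delivers something else, such as better-matched creative or higher conversion rates.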

Spending money on data to serve the right creative can be worth it, especially for brand marketers. “If you are bombarding them with messages they don’t want, because it’s cheaper to do it, you are going to annoy them and they will shut down ads in general,” said Accenture’s Matt Gay, senior manager of the media and entertainment practice.

But for performance marketers, spending on data only makes sense if it improves outcomes.

“You can have the most accurate, amazing data in the world. But if it’s 15 times more expensive than anything else, maybe it’s not worth the squeeze,” said Mindshare Chief Data Officer Rolf Olsen.

Performance-based marketers have the luxury of testing to see if expensive data still drives stronger results. “I look at cost and quality in concert with each other,” said Shutterstock CMO Jeff Weiser, who comes from an analytics background. “If there is going to be a higher cost to acquire better data, it’s got to be justified by a higher ROI.”

Mediavest Spark looks at data’s ability to drive efficiencies. Since media is among “the most expensive things marketers pay for,” using data to buy less media can drive results, Korenfeld said.

“The formula is how many fewer impressions did you buy in order to justify the KPI goals,” he said. “Did you buy 10% less media? If you paid the same amount overall, then the data is wasteful.”
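Korenfeld’s test can be written out as simple arithmetic. The following is a sketch, not Mediavest Spark’s actual model: the function names, CPMs and impression counts are hypothetical, and it assumes both buys hit the same KPI goal and that data is priced per thousand impressions on top of media.

```python
def total_cost(impressions, media_cpm, data_cpm=0.0):
    """Total spend for a buy, with an optional data fee added to the media CPM."""
    return impressions * (media_cpm + data_cpm) / 1000


def data_savings(baseline_impressions, targeted_impressions, media_cpm, data_cpm):
    """Dollars saved by the data-targeted buy versus the untargeted baseline.

    Positive means the data paid for itself; zero or negative means the
    advertiser spent the same or more to reach the same KPI goal.
    """
    baseline = total_cost(baseline_impressions, media_cpm)
    targeted = total_cost(targeted_impressions, media_cpm, data_cpm)
    return baseline - targeted


# Hypothetical campaign: data lets the buyer hit the goal with 10% less media.
cheap_data = data_savings(1_000_000, 900_000, media_cpm=5.00, data_cpm=0.50)
pricey_data = data_savings(1_000_000, 900_000, media_cpm=5.00, data_cpm=1.00)

print(f"$0.50 data CPM saves ${cheap_data:,.2f}")
print(f"$1.00 data CPM saves ${pricey_data:,.2f}")
```

In this made-up case the cheaper data comes out slightly ahead, while the pricier data wipes out the savings from buying 10% fewer impressions, which is the wasteful outcome Korenfeld describes.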

Question Three: What’s The Trade-Off Between Scale And Accuracy?

Bad data sometimes proliferates because crappy data can drive results for an advertiser.

A small, high-quality data segment may work for an email marketing campaign, but is way too small for a media campaign. So data providers futz with data to add more scale, juicing results. Brands need to be aware of lookalike modeling or any other tactics used to gain scale.

“There is always a balance between pure reach and ability to target that makes conversations around data quality difficult to have,” Milicevic said. “If you create a stringent segment of people like women in their 30s who bought a magazine in the past 14 days in these four ZIPs, you realize that’s 30 people. It’s a valuable segment, but doesn’t have reach or scale.”

Buyers reflexively want the most accurate segment, Korenfeld said, “But you lose scale that way.”

Transparency is the best remedy. Going back to question one, if marketers know how the segment is created, they can determine the trade-off between accuracy and scale that makes sense for their brand.

Question Four: Can I Test This Segment Without Buying Media?

Traditionally, advertisers test data by buying media against a segment. But media is expensive.

“While we have healthy budgets, we can only test out a few segments every year,” said Anheuser-Busch InBev’s Silberman. “What makes sense for us is brand health or offline sales lift, and that means we need to do long and expensive studies for our campaigns.”

If marketers don’t want to spend money to test a segment, they can try to validate the data against another data segment they have in their DMP or CRM and see if there are any head-scratching results. (Unfortunately, this method doesn’t work as well for a CPG like Anheuser-Busch, which doesn’t sell directly to customers.)

“You don’t need to start with in-market testing,” Shutterstock’s Weiser said. “You can append to the CRM database, and check the match rate. To the extent that it can be matched, does it have correlations to the rest of the database?”

A lack of correlations indicates bad data, Weiser said. Another red flag would be correlations that don’t make sense, like an outside data set that suggests a marketer has a wealthy customer base when internal data suggests the opposite.
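The check Weiser describes might look something like the sketch below. Everything in it is an assumption made for illustration: the customer IDs, the fields (internal annual spend, vendor-estimated income) and the function names are hypothetical, and Pearson correlation is just one possible way to measure whether appended data lines up with what a marketer already knows.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a field never varies")
    return cov / sqrt(var_x * var_y)


def validate_append(crm, vendor):
    """Return (match_rate, correlation) for vendor data appended to CRM records.

    crm and vendor map a shared customer ID to one numeric attribute each.
    """
    matched = sorted(set(crm) & set(vendor))
    match_rate = len(matched) / len(crm) if crm else 0.0
    correlation = pearson([crm[k] for k in matched], [vendor[k] for k in matched])
    return match_rate, correlation


# Hypothetical data: internal annual spend vs. vendor-estimated income (in $000s).
crm_spend = {"a": 100, "b": 200, "c": 300, "d": 400}
vendor_income = {"a": 40, "b": 55, "c": 90, "z": 120}

rate, r = validate_append(crm_spend, vendor_income)
print(f"match rate: {rate:.0%}, correlation with internal data: {r:.2f}")
```

A low match rate, a correlation near zero, or a strong correlation pointing the wrong way would each be a reason to question the segment before spending on media. What counts as "low" is a judgment call the article does not quantify.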

Data that doesn’t make sense can be chucked before going through the expense of testing it with media.

Question Five: How Often Is The Data Refreshed?

Some data – like demographic information or interests – doesn’t change much over time and marketers can use it without worrying it will decay. But other types of data decay quickly. “You are going to want to update SKU-level or transaction-level data more frequently than a lifestyle category,” Weiser said.

Take someone on the cusp of a big purchase, like a car. A consumer enters that phase in a matter of weeks or months, so predictive models of “car intenders” refreshed every year won’t drive results. Pun intended.

“Particularly in the world of behavioral segments, there can be three months out of a three- or four-year cycle where your signals are clear,” Mindshare’s Olsen said.

Brands can run into problems when activating their slow-moving CRM data in a media environment.

“Lots of brand marketing was built with an annual plan or a quarterly plan,” noted Howard Bass, partner and global media and entertainment advisory leader at EY. “Brands need to move to a more near-real-time data exchange. In the digital media ecosystem you’ve got to rethink the rhythms.”

Question Six: Where Did This Data Come From, And What Has It Been Combined With?

“No piece of data has a virgin birth,” Shutterstock’s Weiser said. It’s captured, then extracted, transferred and loaded into a database, queried with SQL and transformed in Excel. At each of the steps, “there is a little bit of telephone that can happen to data elements.”

For instance, matching data to cookies or device IDs can degrade data quality.

“You might combine a bunch of data points, but the match rates are so low you end up with a data set that’s not valuable,” Mediavest Spark’s Korenfeld noted.

Conversely, having a data set that plays well with other data sets improves quality.

“We talk about how well a data set integrates with other data sets,” Mindshare’s Olsen said. “If you have to merge three to four data sets to get a clean read on the viewability rate or ad fraud, there is a significant level of complexity in the integration of that data set.”

Call it “the unsexy part of analytics,” as Accenture’s Gay does, but data organization, matching and cleansing impact results.

And every marketer should ask how data has been combined when bringing in new data or analyzing existing data. “If you don’t understand how the data is built from the ground up, it can lead to very misleading conclusions,” Gay added.

Bringing The Answers To All The Questions Together

Using data in media today requires discipline around quality, but also an acceptance that things will sometimes still get messy. “We are in the early innings,” Gay said.

As digital matures, data quality will likely improve but retain certain flaws.

“The digital land is geometrically more complicated [than TV], because you can get much more granular with data,” Gay said. “We will never get to perfect. It’s going to be an evolution with degrees and shades of gray.”

Ryan Joe contributed reporting.
