Distinguishing Good Data From The Bad

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Nish Desai, senior director of technology, operations and partnerships at Xaxis.

Marketers have ever-growing streams of data and signals they can use to activate and optimize their advertising campaigns. But using this data to execute well is only half the battle.

To hit their mark, brand marketers must make their data both reliable and extensible to as many places as possible so they can pair it with partners’ first-party data, then fill any gaps with properly vetted third-party data. This means they need to ensure its proper collection, storage and upkeep.

Brands should investigate data collection methodologies to determine whether the data sets are complete, consistent and representative of the segments they are trying to reach. Properly vetting data requires human judgment, not just a technological solution.

Key considerations

What are my objectives? Before building a data set, brands need to ask themselves why they are building it. Where will it be used and for what purpose? This may seem counterintuitive – starting at what seems like the end – but understanding the reason for the data set’s creation will help ensure that the data that goes into it is correct.

What data is available and where did it come from? Understanding how the available data was collected can help determine how it will be used and how much value it will bring to the data set.

Where and how is the data stored? Data can be stored in house or by a partner. Often, centralizing the data in a data lake may be in the brand’s best interest.

Knowing how your data is stored is as important as knowing where it is stored. Data must be current to deliver the most value, so knowing how often it is refreshed is vital.

As privacy regulations emerge in more markets, it is essential that all data is collected and stored in a privacy-compliant manner.

What data is missing? Once brands have identified what data is available, they’ll likely find gaps that need to be filled. Ask potential partners direct questions and listen carefully to the answers. Vague and unclear definitions are a warning sign.

How is the data collected? To determine how a partner knows its first-party data is accurate, a brand may ask how phone numbers, ZIP codes (preferably plus six digits) or email addresses are collected and tested. It is a positive sign if the information comes from users’ self-declared registration data and there are 100,000 users in the data pool who match the segment desired by the brand. The confidence score for that kind of deterministic first-party registration data is generally higher than 90%. When users proactively state who they are and what they like, and consistently use a login, confidence that the data held by a platform or publisher is sound goes up.

But brands should be cautious if the publisher or platform touts a complicated methodology used to deduce someone’s identity. For data derived by probabilistic means, the confidence level is nearly always below 85% and is often closer to 50%.

How are the data sets structured? Is the data merged with data from other parties? Understanding how data is structured after it is collected is extremely important. A brand needs to ensure that any partner data is in an apples-to-apples configuration with its own so that the two can be easily merged. If other parties are involved in sourcing the data, the brand may need to ask about their collection methods as well.
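One way to make the apples-to-apples check concrete is to compare field names and types before attempting a merge. The sketch below is illustrative only: the function name `schema_mismatches`, the field names and the type labels are hypothetical, and a real comparison would also need to cover value formats and definitions.

```python
def schema_mismatches(brand_schema: dict, partner_schema: dict) -> dict:
    """Compare two {field_name: type_label} mappings (hypothetical layout).

    Returns the fields the partner lacks, the fields only the partner has,
    and the shared fields whose type labels disagree.
    """
    brand_fields = set(brand_schema)
    partner_fields = set(partner_schema)
    shared = sorted(brand_fields & partner_fields)
    return {
        "missing_from_partner": sorted(brand_fields - partner_fields),
        "only_in_partner": sorted(partner_fields - brand_fields),
        "type_conflicts": {
            field: (brand_schema[field], partner_schema[field])
            for field in shared
            if brand_schema[field] != partner_schema[field]
        },
    }


# Example schemas; all names and types here are made up for illustration.
brand = {"email_sha256": "str", "zip": "str", "age": "int"}
partner = {"email_sha256": "str", "zip": "int", "interests": "list"}
report = schema_mismatches(brand, partner)
```

Any entry in `type_conflicts`, such as a ZIP code stored as text on one side and as a number on the other, is a prompt for a conversation with the partner rather than something to patch silently.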

How is the data kept current? Any data based on interests or other attributes that can change over time should be refreshed periodically. Knowing how and how often these attributes are refreshed is key.
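A freshness rule like this can be checked mechanically once each record carries a refresh timestamp. The following is a minimal sketch under assumptions: the `last_refreshed` field, the `stale_records` helper and the 90-day threshold are all hypothetical, and the right window depends on how quickly the attribute in question changes.

```python
from datetime import datetime, timedelta, timezone


def stale_records(records, max_age_days, now=None):
    """Return the records whose 'last_refreshed' timestamp is older than the cutoff.

    'records' is a list of dicts with a timezone-aware 'last_refreshed'
    datetime (a hypothetical layout chosen for this example).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in records if r["last_refreshed"] < cutoff]


# Illustrative data; the IDs and dates are made up.
sample = [
    {"id": "a", "last_refreshed": datetime(2024, 1, 20, tzinfo=timezone.utc)},
    {"id": "b", "last_refreshed": datetime(2023, 10, 1, tzinfo=timezone.utc)},
]
stale = stale_records(sample, max_age_days=90,
                      now=datetime(2024, 1, 31, tzinfo=timezone.utc))
```

With a fixed reference date, the check is repeatable, which makes it easy to ask a partner to run the same rule on its side and compare results.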

How is the data kept clean? If personally identifying information (PII) is collected, understanding how it is sanitized is essential. Is a clean room used to ensure that PII is removed? If values are being hashed or encrypted, understanding how this is done will help ensure that the brand is complying with the requisite privacy standards and industry best practices.
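To illustrate the kind of answer a brand might expect when it asks how values are hashed, here is a minimal sketch of one common pattern: normalize an email address, then take its SHA-256 digest. The function names are hypothetical, and the normalization rules (trim and lowercase) are an assumption; each platform documents its own requirements, and both sides must apply identical rules for hashed values to match.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim surrounding whitespace and lowercase the address (assumed rules)."""
    return email.strip().lower()


def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of the normalized address."""
    normalized = normalize_email(email)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two spellings of the same address produce the same digest.
digest_a = hash_email("  Jane.Doe@Example.com ")
digest_b = hash_email("jane.doe@example.com")
```

Hashing alone does not make data anonymous, since common email addresses can be guessed and hashed for comparison, which is one reason the question of how the hashing is done matters.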

Don’t forget the other huge added benefit to making sure data is good: It helps brands better prepare for when third-party cookies are phased out.

Digital marketers preparing for that day know they have to build and test their first-party data stores to be as ready as possible. They need to have the best data they can and build on that to mix, match and build segments within the social platforms, Google’s Privacy Sandbox and Ads Data Hub, and to match with publishers’ data warehouses. Marketers and their partners need to use best practices in gathering and maintaining their data to keep data stores clean, accurate and current.
