What’s Your Level Of Confidence In The Data?

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Kathy Leake, CEO at Qualia.

As cross-screen marketing continues to evolve, the industry’s approach to the underlying data science will also mature. The amount of time consumers spend on each platform is rising: Over the past four years alone, time spent on smartphones grew 394%, on tablets 1,721% and on desktops 37%, according to comScore.

With such rampant, saturated usage, the need for more sophisticated data options continues to increase. Data scientists may debate which kinds of data are most useful to the processes and practices of cross-screen, but detecting, interpreting and optimizing true connections between devices is what matters.

During the cross-screen conversation, marketers have recently begun asking vendors whether they use probabilistic or deterministic data.

But that’s the wrong question. Marketers should be asking: “What is your level of confidence in the data?”

Understanding Data Differences

When marketers discuss data types, they are usually talking about either deterministic or probabilistic data. Deterministic data generally indicates a clear connection between devices. For example, login data links a particular user login to a given device and can be obtained through partnerships with companies that collect it when users log in directly to websites, apps or online services.

Probabilistic data is based on informed inference. We assume a connection between devices based on certain user activities, but the data does not directly contain connection information. The data used to infer a connection might include a sighting of a device coming from an IP address, a sighting of a device visiting a website or even a sighting of a device on an app. This is where the level of data confidence becomes vital.
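
To make the distinction concrete, here is a minimal Python sketch of the two kinds of links. Everything in it is an assumption made for illustration: the function names, the record layouts and, above all, the scoring rule, which uses the overlap of observed IP addresses (a Jaccard ratio) as a stand-in for whatever model a vendor actually uses. It is not a description of any particular vendor’s method.

```python
from collections import defaultdict
from itertools import combinations


def deterministic_links(logins):
    """Link devices that share a login.

    logins: iterable of (user_login, device_id) pairs (hypothetical layout).
    Returns {(device_a, device_b): 1.0}; the connection is observed, not inferred.
    """
    devices_by_login = defaultdict(set)
    for login, device in logins:
        devices_by_login[login].add(device)

    links = {}
    for devices in devices_by_login.values():
        for a, b in combinations(sorted(devices), 2):
            links[(a, b)] = 1.0
    return links


def probabilistic_links(sightings):
    """Infer links between devices seen on the same IP addresses.

    sightings: iterable of (device_id, ip_address) pairs (hypothetical layout).
    Returns {(device_a, device_b): confidence}, where confidence is the share of
    IP addresses the two devices have in common (an assumed scoring rule).
    """
    ips_by_device = defaultdict(set)
    for device, ip in sightings:
        ips_by_device[device].add(ip)

    links = {}
    for a, b in combinations(sorted(ips_by_device), 2):
        shared = ips_by_device[a] & ips_by_device[b]
        if shared:
            union = ips_by_device[a] | ips_by_device[b]
            links[(a, b)] = len(shared) / len(union)
    return links
```

A login shared by two devices yields a link with a confidence of 1.0, while two devices that merely appear on some of the same IP addresses receive a score below that. The score is the number a marketer would want a vendor to explain.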

The Problem With Either/Or

The potential power of either data set is clear, but there are issues with both. Many view deterministic data as the gold standard because it is measured with no uncertainty, but it has a big weakness: It does not scale, because there are far fewer opportunities to directly measure connections. That kind of weakness won’t fly in light of the enormous opportunity cross-screen represents. We need scale, and when it comes down to it, relying only on deterministic data is inadequate.

The problem with using probabilistic data alone and relying on inferences instead of measurements: lots of noise. The subjectivity associated with interpreting what any given action might mean or truly indicate lowers your confidence in the veracity of that connection to significantly less than 100%.

The Power Of Combination

Given the imperfection of data and the relative shortcomings of each of these data types, a truly powerful data graph blends deterministic and probabilistic data. The key is to embrace the holistic approach and use cross-device analysis to better understand data signals, extend reach and ultimately derive a clearer picture of where to attribute success within the mix.

It’s possible to blend what we do know with what we can confidently model to raise the degree of certainty across the board for the whole data set, maintaining data integrity and scale at the same time.
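
One simple way to picture that blend is sketched below in Python. The function name, the input layout and the 0.6 cutoff are all hypothetical choices made for illustration, not a recommended setting or any vendor’s actual approach: measured links are kept at full confidence, and modeled links are added only when their score clears a chosen threshold.

```python
def blend_links(deterministic, probabilistic, min_confidence=0.6):
    """Combine measured and modeled device links into one graph.

    deterministic: iterable of (device_a, device_b) pairs that were directly observed.
    probabilistic: {(device_a, device_b): confidence} for inferred pairs.
    min_confidence: assumed cutoff below which an inferred link is discarded.
    Returns {(device_a, device_b): (confidence, source)}.
    """
    blended = {}

    # Keep only the inferred links we are reasonably confident in.
    for pair, score in probabilistic.items():
        if score >= min_confidence:
            blended[pair] = (score, "probabilistic")

    # A measured link always overrides an inferred one for the same pair.
    for pair in deterministic:
        blended[pair] = (1.0, "deterministic")

    return blended
```

Raising the cutoff trades scale for certainty, and lowering it does the reverse, which is why the confidence level attached to each link matters more than the label on the data.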

Follow Qualia (@qualia) and AdExchanger (@adexchanger) on Twitter.

 
