“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by John Shomaker, CEO at AdJuggler.
The New York Times recently ran a column by Gary Marcus and Ernest Davis, two professors in NYU’s computer science department, questioning the merits and hype of so-called “big data.”
The term big data reflects the exponential increase in data that many industries, governments and individuals now collect, as well as the supposed insights yielded by these heaps of data. Pervasive, real-time data collected from online interactions, social media, ecommerce and mobile devices, together with the ever-shrinking costs of database software and storage, is what drives the frequent assertions that 90% of all known human knowledge has been captured in the past decade. Is this data, as the professors imply, more noise than signal?
The noise observation is understandable. Arguably, we’re still in the first phase of big data: collection. Google, already the epicenter of data collection with its photographs of your front door, recently acquired Nest as an Internet of Everything play so that it can collect data inside your house as well. And, just weeks ago, Facebook received permission to acquire Oculus VR, ostensibly to track your virtual life now that your real life is fully on display.
But whether it’s Google, Walmart, Bank of America, Procter & Gamble or the NSA, the data scientists, industry leaders and even professors cannot help but lag in their ability to construct meaningful analytic methods for understanding all of this data. The flow of data is absolutely crushing, and, in some industries, the potential analytic value is questionable. Undoubtedly, macroeconomic forecasting, disease correlation and the weather industry are awash in data, yet predictive insights remain less than perfect.
But I’m still a believer. In most industries, the leaders differentiate themselves by building, pricing and servicing products that reflect the attitudes and behaviors of their customers, who are increasingly described as a portfolio of segments or individuals. Digital advertising is at the forefront of big data, operating a rapid-fire and increasingly intelligent digital dialogue with consumers as they learn, shop, buy, share and recommend.
In the quest to swim past the data itself and find real insight, I see five scenarios where big data offers a tangible improvement over smaller data sets:
- Replacing less rigorous analyses
Prior to the growth in data, many analyses, segmentation profiles and operating metrics were based on small sample sizes, surveys or focus groups. Today, larger data sets allow for richer results that carry more statistical weight, marginalizing findings that are purely qualitative.
- Finding a needle in the haystack
The breadth of the data sets themselves, meaning the range of attributes captured, is also much larger. Yes, this leads to a huge risk of noise, but, in certain applications such as detecting cybercrime or terrorism, considering obscure attributes can yield important, yet unforeseen, results.
- Getting to the big picture
One of the biggest computing and operating challenges of the last 30 years has been system fragmentation and data “islands.” For companies with multiple divisions, country organizations, products and customer touch points, big data finally enables a comprehensive view of the customer through their entire life cycle.
- The fourth dimension: time
Historically, most data sets and data analyses were limited to one-time snapshots and didn’t accurately reflect changes over time. By capturing lots of data, and capturing it by day, hour or second, our understanding becomes more complete and our trend-based predictions more accurate.
- Real time
Just as more data is collected in real time across an expanded set of online user connections, we and the machines can learn and respond in real time as well. In the world of securities trading or online advertising, it is commonplace to respond to a live trade or consumer in mere milliseconds, adjusting pricing and offers based on the individual and the entire pool of users.
Big data is relative. There’s no agreed-upon size that defines it. But as the cost of data collection, storage and analysis approaches zero, organizations are motivated to find new ways for data to yield truly unique insights, organizational differentiation and entirely new business models.