"Data-Driven Thinking" is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by Juan M. Huerta, senior data scientist at PlaceIQ.
It has been said that we are in an era of big data. Not a day goes by without hearing about a new type of sensor, wearable device, innovative data source, creative data-visualization app or a promising new tool or technology to help us make sense of it all.
Enterprises and consumers alike generate both the supply and demand for once unthinkable quantities and varieties of data. The pressure to embrace it is strong.
At the same time, brands and marketers are taking notice and adopting big data strategies to better understand and approach the consumer. They know that when pieced together, the picture this data reveals is fresh, compelling and full of valuable insights.
So how do we extract those precious signals from all the noise, like the proverbial needle in the haystack? How do we avoid the seemingly unavoidable pitfalls? Where do we start?
Luckily, plenty of thinking and hard-won lessons are available on how to distinguish the signal from the noise. Here are several key points to remember when tackling big data.
1. Data brings information: Data does not, however, equate to information. Information is extracted from data, and is a measurable and valuable asset. Data is just its carrier.
2. Not all data is created equal: Data should be constantly vetted, monitored and analyzed for quality. Most importantly, you should always know how much information your data is providing. Know your data.
3. Look for patterns at the intersections of diverse data streams: As more data and information are brought together, an ever bigger and more complex picture emerges. Understanding the underlying processes that drive the data is crucial.
4. Data should always be blended judiciously: A well-known example of the problems that can arise when naively combining and grouping data is Simpson's paradox, in which a trend that holds within every segment can reverse once the segments are pooled. Simple cross-segmental tallying can therefore produce contradictory and misleading results.
5. It's all about the hypothesis: Big data is still supported by old methods of inquiry and discovery. Hypothesis formulation requires creativity and domain familiarity. No shortcut here.
6. Be aware: When building your hypotheses, you must be aware of the selective attention problem. One well-known experiment is the "invisible gorilla," a truly humbling demonstration that it is possible, and human, to miss the proverbial trees for the forest. As a matter of fact, the harder we try, the more likely we are to miss.
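To make the fourth point concrete, here is a minimal sketch of Simpson's paradox in Python. Everything in it is an assumption made for illustration: the two campaigns "A" and "B", the "desktop" and "mobile" segments, and the conversion and impression counts are invented and do not come from any real campaign. The sketch compares conversion rates segment by segment and then again after pooling the segments.

```python
# Hypothetical (conversions, impressions) counts, invented for illustration only.
counts = {
    "A": {"desktop": (81, 87), "mobile": (192, 263)},
    "B": {"desktop": (234, 270), "mobile": (55, 80)},
}


def segment_rates(data):
    """Conversion rate of each campaign within each segment."""
    return {
        campaign: {seg: conv / imp for seg, (conv, imp) in segments.items()}
        for campaign, segments in data.items()
    }


def pooled_rates(data):
    """Conversion rate of each campaign after naively summing all segments."""
    rates = {}
    for campaign, segments in data.items():
        total_conv = sum(conv for conv, _ in segments.values())
        total_imp = sum(imp for _, imp in segments.values())
        rates[campaign] = total_conv / total_imp
    return rates


by_segment = segment_rates(counts)
pooled = pooled_rates(counts)

# Compare the campaigns within each segment, then on the pooled totals.
for seg in ("desktop", "mobile"):
    winner = max(by_segment, key=lambda c: by_segment[c][seg])
    print(f"{seg}: A={by_segment['A'][seg]:.1%} B={by_segment['B'][seg]:.1%} -> {winner} wins")

overall_winner = max(pooled, key=pooled.get)
print(f"pooled: A={pooled['A']:.1%} B={pooled['B']:.1%} -> {overall_winner} wins")
```

With these made-up counts, campaign A converts better in both segments, yet campaign B looks better once the segments are tallied together. The reversal comes from the uneven mix: most of A's impressions fall in the segment that converts poorly, while most of B's fall in the one that converts well. Which view is the right one depends on the question being asked, which is why the underlying process matters.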
So regardless of the metaphor du jour, big data is still just an opportunity for fundamental inquiry and discovery, only with a bigger picture to study.