What are the key challenges online marketers face in collecting data and making real-time decisions based on that information?
The data sets in many of these applications are very large. Marketers are looking at every single click that happens on a site, every event that's generated, every interaction or customer behavior, and that really can't be done with traditional systems.
A big reason Hadoop is driving a lot of marketing is that it allows companies to break down the silos. It's much less rigid in terms of schema requirements. You don't have to define a schema for every table in advance, maintain that schema and keep it up to date all the time. Traditionally, if you look at enterprises, they can have 10,000 Oracle databases with data spread across all these systems, and that prevents them from having a single view of the customer that spans all of that customer's activities.
So one large MapR customer, on the retail side of financial services, went from taking months to understand customers to minutes. I'll give you an example. They want a query that shows all the consumers that have been skiing in California. In the past, there'd be many separate systems and you'd have to get some information from one system, get another analyst to email a spreadsheet from another, and there'd be much more manual processing that could take months. With MapR, that takes several minutes to do because now they have a single system that has all that information and the power to actually run such a query.
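As a rough illustration of the single-view idea described above, here is a minimal Python sketch. It is not MapR's or Hadoop's API, and it is not the customer's actual query. The three sources, their field names and the sample records are all hypothetical. The point is only that records with different shapes can sit side by side, keyed by customer, and be filtered in one pass without a schema defined in advance.

```python
from collections import defaultdict

# Hypothetical records from three formerly separate systems. Field names and
# values are invented for illustration; the sources do not share a schema.
card_transactions = [
    {"customer_id": 1, "merchant_category": "ski_resort", "state": "CA"},
    {"customer_id": 2, "merchant_category": "grocery", "state": "CA"},
    {"customer_id": 3, "merchant_category": "ski_resort", "state": "CO"},
]
travel_bookings = [
    {"customer_id": 2, "activity": "skiing", "destination_state": "CA"},
]
crm_profiles = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cy"},
]


def build_customer_view(*sources):
    """Group every record from every source under its customer_id."""
    view = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            view[record["customer_id"]].append((source_name, record))
    return view


def skied_in_california(events):
    """Return True if any record suggests the customer skied in California."""
    for source_name, record in events:
        if (source_name == "cards"
                and record.get("merchant_category") == "ski_resort"
                and record.get("state") == "CA"):
            return True
        if (source_name == "travel"
                and record.get("activity") == "skiing"
                and record.get("destination_state") == "CA"):
            return True
    return False


view = build_customer_view(
    ("cards", card_transactions),
    ("travel", travel_bookings),
    ("crm", crm_profiles),
)
skiers = sorted(cid for cid, events in view.items() if skied_in_california(events))
print(skiers)
```

In a real deployment the grouping and filtering would run as a distributed job over far larger data; the sketch only shows the shape of the question once the silos are gone.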
This allows you to do much more accurate customer targeting because it's not just based on what you bought over the last year, but on what you are looking at right now, what you have been looking at in the last week, and so forth.
For example, one of the large cable companies was able to do ad insertions in video on demand, based on what you're doing with your set-top box, whether you clicked stop, rewind or fast-forward, or skipped certain sections of the show.
Your use cases are all over the map — everything from cybersecurity to sales performance-management systems. Where does marketing fall into the mix?
Marketing is one of the major use cases that our customers use our technology for, across all the verticals and across many channels.
Cisco, for example, developed a 360-degree customer application that’s collecting all the information that it knows about its customers, from the billing system, support system, social media and every interaction point it has with its customers. And it uses that for lead generation. It's helping its own partners identify new sales opportunities and making decisions as to which partner to provide that opportunity data to. It's analyzing the dial-home data, the behavior of its customers on its websites and when using its products. It increased revenue by $40 million just in the first year of deploying MapR.
Another example is one of the world's largest retailers, which makes decisions on pricing based on MapR. It looks at competitors' pricing and social media data to determine which products to stock in which stores. In that example you see marketing spanning all four Ps: product, place, price and promotion.
Also, one of the leading IT vendors uses our product to make decisions on its website as to how to create a customized flow for the user to increase the probability that the customer will buy and increase the amount that they will spend on the site. So every customer is getting a personalized flow on the website, and that's all based on MapR and using machine-learning technologies.
What about ad tech?
Online advertising, or ad tech, is a very large vertical for us and for Hadoop in general. A lot of the early adopters were in the online advertising space. Take a company like Rubicon, which is the largest ad exchange in the US by reach. People are bidding on the exchange, and Rubicon analyzes all of the bids and auctions that are taking place. We're talking 90 billion events every day. It's predicting the price that the next auction is going to close at and making decisions on which ad to show for each slot that's available. So it's matching publishers and advertisers, and to do that in an optimal way, it needs to analyze all the data that's coming out of that system.
The data is about what people are clicking, what they are looking at and how much they are engaging after that initial view or click. That helps Rubicon do better matching. If you look at the amounts of data here, we're talking many petabytes of data analyzed to make a decision. The scale is far too large for a relational database or data warehouse. Also, the type of analysis it does goes beyond a SQL query. And the data changes often.
Meanwhile, as the de facto standard measurement company on the Internet, comScore tracks who is looking at which online properties. It also has a panel of users about whom it collects all online behavioral data. ComScore then analyzes that data and produces information that it provides to advertisers. It's analyzing 1.7 trillion events every month, and it's reaching more than 90 percent of the Internet population now, in almost 200 countries.
So, chances are, if you've done something on your phone or using your browser this morning, then you've generated an event on comScore's system. And that is running on MapR and is being analyzed and aggregated on a MapR cluster.
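To put the volumes quoted above in perspective, a back-of-the-envelope conversion to average events per second may help. The short Python sketch below uses only the two figures from the interview (90 billion events a day on the ad exchange, 1.7 trillion events a month at comScore). The 30-day month is an assumption, and real traffic would be burstier than a flat average suggests.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month, for a rough average only


def events_per_second(events, period_seconds):
    """Average event rate over a period, ignoring peaks and troughs."""
    return events / period_seconds


# Figures quoted in the interview.
exchange_rate = events_per_second(90_000_000_000, SECONDS_PER_DAY)
measurement_rate = events_per_second(
    1_700_000_000_000, DAYS_PER_MONTH * SECONDS_PER_DAY
)

print(f"Ad exchange: about {exchange_rate:,.0f} events per second")
print(f"comScore:    about {measurement_rate:,.0f} events per second")
```

Under these assumptions the ad exchange averages roughly a million events per second and comScore several hundred thousand.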