Stepping Our Way To Real Market Data

“Data Driven Thinking” is a column written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Rob Leathern, CEO of CPM Advisors, an online advertising company.

In the online ad business we often get caught up in flawed comparisons to the financial services industry. We talk about an “exchange” or a “market”, and impressions or clicks are seen as a commodity when they are not. Many people reading this right now are building their companies around the notion of correctly valuing an impression and then getting one of their ads there (or else passing it on to someone else), and yet we’re stuck in a place without the data we need to really make a decision.

The Right Media Exchange (RMX) started out as a platform where participants mostly saw impressions as commodities, bought and sold in bulk by ad networks that had too many or too few of them. Sometimes, with hope springing eternal, the impressions were broken into buckets by “channel”: each publisher or network would classify its sites’ traffic into one of 20 or 30 channels like automotive, technology, business or shopping. The ad networks that were RMX’s first clients would use the channel designation to target their campaigns, and the RMX system would then take care of bidding differentially based on other elements, like user frequency, that were harder for humans to manipulate and value themselves. Of course, it turned out there were other attributes the advertiser side wanted, and they were dutifully added in a fairly ad-hoc fashion: whether the inventory was a website or a desktop application, whether there was unmoderated UGC on the site, and so on. More recent exchange participants were lured by the promise of billions of commodity-like impressions they could bid on, with a few exclusions for things they didn’t want. But they very quickly learned how widely the performance of those impressions varied, and how long it took to home in on the ones that actually did perform.

We intuitively like the idea of buying and selling simply defined entities like stocks or (a la Trading Places) frozen concentrated orange juice futures. These are actually very complex things that have each been broken down and grouped into a single thing: a company, which is a group of hundreds of thousands of individuals, processes, inventory, relationships and assets, is represented by a stock; a future, in this case, represents millions of similar pieces of fruit that have generated juice of a certain grade, delivered at a particular time. The online display ad world, by contrast, is a series of loosely-connected, sub-optimal markets with uneven and inconsistent information about the entities that are being sold.

Today with exchanges, most participants find more success with audience-based targeting (retargeting, behavioral), enabled by a large cookie pool and billions of daily impressions running through, than they do with regular non-cookie-targeted campaigns. In many of the behavioral cases, what site the user is on when they are found is less important to the advertiser than getting the message in front of them. And yet, fairly soon (probably not in 5 months, but probably in less than 5 years), I believe that most online advertising will be bought and sold in some kind of market, a lot of it probably in a real-time auction. That market is where the really BIG campaigns with billions of non-behavioral impressions will be, so it is time to realize that the battle for better information about inventory has begun.

Despite people talking about how advertising online is data-driven, there is not a lot of good, clean data for buyers or sellers. Bits and pieces of data about a user and ad inventory are everywhere, but publisher practices vary. Some publishers have started to better understand the role played by page layout, design and clutter: according to 2009 data from mediageeks, consumers were more likely to respond to ads on pages with fewer ads, which in turn led to higher revenues for publishers. On pages with more than 3 banner ads, ad revenue dropped as much as 40% compared to the ad revenue received from running fewer ads. Ad revenue was highest on pages with the least clutter.

Advertisers, agencies and buyers (like my company CPMa) need to know more about inventory in a standardized, systematic, scalable way and for that to happen, this information needs to be created at the publisher end and retained throughout the buying process whether that is direct, via a marketplace or through an ad network. A lot may have to change for this to happen, but it would create all kinds of new opportunities, and best practices would perforce arise more easily since data would be cleaner and standardized.

What happens now is that, through all the ad network daisy-chaining and iframes, whatever underlying identity of the inventory was available is often lost. On the other hand, some sites sell their inventory to ad networks or aggregators and believe that process hides it somehow, but buyers like us see millions of URLs a month, many of them mapping to sites that say they don’t work with ad networks or exchanges. The irony here is that many of these chance secondary impressions in small volume may lead to more advertising business for these sites, as buyers see value in sites they wouldn’t have the time or inclination to buy directly, clamor for more impressions, and are able to pay a higher price.

The more information that is available about an impression, the greater the chance I can make a good decision about whether or not to buy it. And if I’m stuck buying it and believe I won’t be able to get value out of actually showing one of my ads, I can sell it to someone else and cut my losses. Many smart buyers would like to do this. Perhaps I even want to run a quick secondary auction for the impression if I have enough milliseconds left and a few buyers. But to do this I should pass along to the buyers or prospects some of the same information I got about the inventory. Of course, now there’s a higher chance the ad will be missed by the user because I’m taking an extra 50ms to figure out what to show, but that should be reflected in the price.
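
To make that resale decision concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration, not how any exchange actually works: the names `Bid` and `resell_or_keep` are hypothetical, the auction is second-price, my own estimated value acts as the reserve, and any buyer who cannot answer within the milliseconds I have left is simply ignored.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Bid(NamedTuple):
    """A hypothetical response from one secondary buyer."""
    buyer: str
    cpm: float        # price offered, in dollars per thousand impressions
    latency_ms: int   # how long this buyer takes to respond


def resell_or_keep(
    bids: Sequence[Bid],
    own_value_cpm: float,
    budget_ms: int,
) -> Optional[Tuple[str, float]]:
    """Return (winner, clearing price) if reselling beats keeping, else None.

    Assumed rules: only bids that arrive within budget_ms and exceed my own
    value are eligible; the winner pays the second-highest eligible bid, or
    my own value if there is no other eligible bid.
    """
    eligible = sorted(
        (b for b in bids if b.latency_ms <= budget_ms and b.cpm > own_value_cpm),
        key=lambda b: b.cpm,
        reverse=True,
    )
    if not eligible:
        return None  # nobody beats my own value in time: show my own ad
    winner = eligible[0]
    price = eligible[1].cpm if len(eligible) > 1 else own_value_cpm
    return winner.buyer, price


# Buyer "c" offers the most but is too slow for a 50ms budget.
offers = [Bid("a", 2.0, 20), Bid("b", 1.5, 30), Bid("c", 3.0, 80)]
result = resell_or_keep(offers, own_value_cpm=1.0, budget_ms=50)
# result == ("a", 1.5)
```

The time budget is the interesting part: the slowest buyer may well be the one who values the impression most, which is one way the extra delay ends up reflected in the price.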

Let’s forget about the information about the user we might have via cookies for a moment. It’s nice and we feel really good about buying data and cherry-picking those people who searched for flights in the last 7 days. Great. But what about the larger number of people we don’t know who might be interested in our product? Let’s create a standard way of passing information through about a publisher’s ad spot so it makes the inventory more liquid! Think about creating groupings so that people can build knowledge around them and buy and sell them more easily because they know what they are getting – just like stocks and bonds. It doesn’t mean they know exactly how it will perform ahead of time of course, but at least they know the next time they can connect the dots just as you have a price history for a stock like MSFT or GOOG that always refers to the same thing!
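
The price-history idea only works if the grouping has an identifier that always refers to the same thing, the way a ticker does. A minimal sketch of what that buys you, assuming a hypothetical `PriceHistory` class and a made-up grouping identifier (neither exists in any real system I know of):

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional


class PriceHistory:
    """Observed clearing prices, keyed by a stable inventory grouping ID."""

    def __init__(self) -> None:
        self._prices: Dict[str, List[float]] = defaultdict(list)

    def record(self, grouping_id: str, cpm: float) -> None:
        """Append one observed price for this grouping."""
        self._prices[grouping_id].append(cpm)

    def last(self, grouping_id: str) -> Optional[float]:
        """Most recent price, or None if the grouping was never seen."""
        prices = self._prices.get(grouping_id)
        return prices[-1] if prices else None

    def average(self, grouping_id: str) -> Optional[float]:
        """Mean of all observed prices, or None if never seen."""
        prices = self._prices.get(grouping_id)
        return mean(prices) if prices else None


history = PriceHistory()
grouping = "publisher.example/run-of-men-18-24"  # hypothetical identifier
history.record(grouping, 2.0)
history.record(grouping, 3.0)
# history.last(grouping) == 3.0 and history.average(grouping) == 2.5
```

If the identifier changes every time the inventory passes through another network, every lookup comes back empty and the buyer starts from zero again.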

What would be in this standard publisher-initiated description of inventory? Plenty of ideas are floating around; here are a few of them:

The full URL of the spot
How the publisher classifies this grouping of inventory – if I want to get more of it, how/where do I go – because it may be set up in a certain way on the publisher end like “run of men 18-24”. Notice this is not the same thing as telling me what the demo of the traffic is – it may do that – but it’s really about empowering publishers to later sell this directly to me.
A standardized notion of how much attention an ad spot gets – the data to help here might be: where on the page is it, in pixels? How many spots with ads are there on the page? What is the average refresh rate of the page (how long is an ad on the page on average)?
A standardized way to know what the page is about. Give 5 keywords, or put it into a standardized category (oh wait, we don’t have one of those either!), but obviously people use the URL to do some kind of offline crawl.
Information about where in the site session an impression is, or (dreaming) where in the user’s session
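
As a rough sketch only, the list above could travel with each impression as a small record like the one below. The class name `InventoryDescriptor`, every field name, and the JSON encoding are my own assumptions for illustration; no such standard exists yet, which is the whole point.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InventoryDescriptor:
    """Hypothetical publisher-initiated description of one ad spot."""
    url: str                          # the full URL of the spot
    publisher_grouping: str           # how the publisher sells this inventory
    position_px: Tuple[int, int]      # (x, y) of the spot on the page
    ad_slots_on_page: int             # how many spots on the page carry ads
    avg_seconds_on_page: float        # how long an ad stays up on average
    keywords: Tuple[str, ...]         # up to 5 keywords for the page
    site_session_depth: Optional[int] = None  # impression number in the site session

    def __post_init__(self) -> None:
        if len(self.keywords) > 5:
            raise ValueError("at most 5 keywords per spot")
        if self.ad_slots_on_page < 1:
            raise ValueError("a page with an ad spot has at least 1 ad slot")

    def to_json(self) -> str:
        """Serialize so the record can be passed along the buying chain."""
        return json.dumps(asdict(self), sort_keys=True)


spot = InventoryDescriptor(
    url="http://publisher.example/autos/reviews",   # made-up URL
    publisher_grouping="run of men 18-24",
    position_px=(728, 90),
    ad_slots_on_page=2,
    avg_seconds_on_page=45.0,
    keywords=("cars", "reviews"),
    site_session_depth=3,
)
payload = spot.to_json()
```

Note there is deliberately no price field; the record describes the inventory and leaves pricing to the market it is sold in.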

How would you make sure this information was not gamed by the publisher? Well, read the history of the NYSE (http://www.nyse.com/about/history/timeline_regulation.html) and start to think about some of this stuff: 1869, requires all shares to be registered; 1892, establishes NYSE Clearing House; and so on. There will be firms that will help this to happen; it will be built into adservers and checked and confirmed and verified. I don’t include any kind of pricing information above (especially not the flawed notion of a floor price) because price should be determined by the market the inventory is being sold in. Think of the above as the snapshot 10-Q quarterly financial statement of the little piece of inventory: enough information to make a quick decision, especially when smart buyers can link it up to other information that they will provide themselves, or that a new class of service providers will rise up to provide. I can’t wait to get real market quotes for ad inventory. What about channel conflict and supporting your own ad salesforce? Market-based pricing and market-based buying just mean more chances to promote the company’s sites and audience instead of the logistics and headaches of (arbitrary) pricing and (operationally inefficient) contract management.

There is only going to be more online ad inventory moving forward. The price will not drop to zero because testing it is difficult and takes time and money for advertisers. Publishers need to think how to actively participate in creating new opportunities for buying and selling using centralized markets, and how their ad salesforces’ efforts can and should coexist with these markets. We should all start thinking about building the plumbing to get more information about publisher inventory and pricing distributed throughout the ecosystem. We’ve still got a lot of work to do.

Follow Rob Leathern (@robleathern), CPM Advisors (@cpmadvisors) and AdExchanger.com (@adexchanger) on Twitter.
