Unifying Reporting Across All Programmatic Vendors: No Small Task For Publishers

“The Sell Sider” is a column written by the sell side of the digital media community.

Today’s column is written by Madhura Sengupta, director of ad product technology at Edmunds.com.

Programmatic yield management can easily become a nightmare for publishers. We’ve all tested dozens of demand sources, including ad networks and exchanges such as DoubleClick Ad Exchange (AdX), AOL and Amazon, as well as different ad placements, units and more.

While every publisher tries to optimize and maximize revenue, this usually involves the laborious process of recording and carefully monitoring tons of data on revenue, impressions, discrepancies and fill rates by device and geography for each demand source. In addition, programmatic buying uses real-time bidding (RTB) technology, which changes frequently and requires constant monitoring.

Traditionally, this might translate to a human manually logging into various partner portals, normalizing numbers and managing unwieldy Excel spreadsheets with hundreds of tabs. However, more publishers now use data science and development teams to automate and unify this reporting in one centralized location.

But while there are a few third-party vendors that offer reporting consolidation services, they typically haven’t been able to scale simply because each publisher’s reporting needs are so different.

Choosing The Right Metrics

First and foremost, it is important for a dedicated member of a publisher’s ad operations (ad ops) or yield management team to determine which metrics and dimensions to track. Apart from standard revenue and impression data, there are several other metrics to be mindful of.

Discrepancy by demand source: This measures how many impressions are lost as they “hop” between the publisher ad server and the demand source. While a “normal” discrepancy rate is anywhere between 1% and 10%, every impression lost in cyberspace means lost revenue for a publisher. Anything greater than 10% should draw attention, and anything more than 20% is in the red zone.
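
The check itself is simple arithmetic. Here is a minimal sketch in Python; the demand-source names and impression counts are made up for illustration:

```python
# Minimal sketch: flag demand sources whose impression counts diverge
# from the publisher ad server's counts. All names and numbers here
# are illustrative.

def discrepancy_pct(ad_server_imps, demand_source_imps):
    """Share of ad-server impressions the demand source never recorded."""
    return (ad_server_imps - demand_source_imps) / ad_server_imps * 100

counts = {
    # demand source: (publisher ad server imps, demand source imps)
    "exchange_a": (1_000_000, 960_000),
    "exchange_b": (500_000, 430_000),
    "network_c": (250_000, 190_000),
}

for source, (server_imps, source_imps) in counts.items():
    pct = discrepancy_pct(server_imps, source_imps)
    status = "red zone" if pct > 20 else "investigate" if pct > 10 else "normal"
    print(f"{source}: {pct:.1f}% discrepancy ({status})")
```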

Performance of demand sources based on rCPM, not eCPM: The “r” in rCPM stands for “real,” meaning the use of impression numbers from the publisher’s ad server to determine performance, instead of from the demand source.
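
The distinction is just a change of denominator. A quick sketch with illustrative figures:

```python
# Minimal sketch of the rCPM vs. eCPM distinction: the only difference
# is whose impression count sits in the denominator. Figures are illustrative.

revenue = 4_200.00        # revenue reported by the demand source ($)
source_imps = 1_400_000   # impressions the demand source claims it served
server_imps = 1_600_000   # impressions the publisher ad server counted

ecpm = revenue / source_imps * 1000  # flatters the demand source
rcpm = revenue / server_imps * 1000  # "real" CPM against your own counts

print(f"eCPM: ${ecpm:.2f}, rCPM: ${rcpm:.2f}")
# eCPM: $3.00, rCPM: $2.62 -- ranking partners by eCPM would overvalue this one.
```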

Performance of private marketplace (PMP) deals within your ad exchanges: Many publishers set up PMP deals directly with advertisers to give them priority over the open exchange. This can be beneficial for both sides, but publishers need to monitor the impression and revenue volumes of each deal to ensure that advertisers are spending healthy amounts through this channel. This can be challenging, especially if a publisher uses more than one PMP platform, such as AdX, Rubicon and OpenX.
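
One way to keep an eye on this across platforms is to normalize each platform’s deal-level report into a common table. A minimal sketch using pandas; the schemas, numbers and revenue floor are hypothetical, not any vendor’s actual report format:

```python
import pandas as pd

# Minimal sketch: combine deal-level reports from multiple PMP platforms
# into one table so under-delivering deals stand out. Schemas and numbers
# are illustrative.

adx = pd.DataFrame({"deal_id": ["D1", "D2"], "imps": [90_000, 5_000],
                    "revenue": [810.0, 30.0], "platform": "adx"})
rubicon = pd.DataFrame({"deal_id": ["D3"], "imps": [40_000],
                        "revenue": [520.0], "platform": "rubicon"})

deals = pd.concat([adx, rubicon], ignore_index=True)
deals["cpm"] = deals["revenue"] / deals["imps"] * 1000

# Flag deals spending below an agreed weekly floor (threshold is illustrative).
MIN_WEEKLY_REVENUE = 100.0
print(deals[deals["revenue"] < MIN_WEEKLY_REVENUE])
```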

Latency and ad load times for header-bidding partners: Which demand sources consistently “time out,” failing to return bids within the standard 500-millisecond window? While analytics and cookie syncing often allow exchanges to bid higher, users are being subjected to a plethora of measurement pixels that contribute to slow page load times. It is important for publishers to hold their demand sources accountable for any larger-than-expected latency.
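
Assuming the publisher already logs per-auction response times from its header-bidding wrapper, flagging chronic offenders is straightforward. A minimal sketch with a made-up log format:

```python
from collections import defaultdict

# Minimal sketch: measure how often each header-bidding partner exceeds
# the timeout. The log format is made up for illustration; real response
# times would come from the wrapper's analytics hooks.

TIMEOUT_MS = 500
response_log = [
    {"partner": "bidder_a", "response_ms": 180},
    {"partner": "bidder_b", "response_ms": 620},
    {"partner": "bidder_b", "response_ms": 540},
    {"partner": "bidder_a", "response_ms": 310},
]

stats = defaultdict(lambda: {"total": 0, "timeouts": 0})
for entry in response_log:
    s = stats[entry["partner"]]
    s["total"] += 1
    s["timeouts"] += entry["response_ms"] > TIMEOUT_MS

for partner, s in stats.items():
    print(f"{partner}: timed out {s['timeouts'] / s['total']:.0%} of the time")
```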

Historical bid data from header-bidding partners: With the advent of header bidding, a vast amount of data is available, including win vs. close bids from each partner and the percentage of auctions in which each partner bids. Publishers can use historical bid information to understand the value of their inventory and audience, and price effectively against it.
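
A minimal sketch of deriving bid rate and average bid from such a log; the event schema is hypothetical:

```python
from collections import defaultdict

# Minimal sketch: summarize historical bid behavior per partner.
# The log schema is hypothetical; a bid of None means no bid returned.

bid_log = [
    {"partner": "bidder_a", "bid": 2.10},
    {"partner": "bidder_a", "bid": None},
    {"partner": "bidder_b", "bid": 1.45},
    {"partner": "bidder_b", "bid": 3.00},
]

agg = defaultdict(lambda: {"auctions": 0, "bids": []})
for event in bid_log:
    a = agg[event["partner"]]
    a["auctions"] += 1
    if event["bid"] is not None:
        a["bids"].append(event["bid"])

for partner, a in agg.items():
    bid_rate = len(a["bids"]) / a["auctions"]
    avg_bid = sum(a["bids"]) / len(a["bids"]) if a["bids"] else 0.0
    print(f"{partner}: bids in {bid_rate:.0%} of auctions, avg bid ${avg_bid:.2f}")
```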

Performance by ad unit: Certain ad units and sizes may outperform others. Tracking this data can help inform site layout and design, as well as set pricing by unit.

Performance by site section or page type: Certain sections of a publisher site can have vastly different ad performance based on user behavior. In general, when there are more page views per user session, it is important to have more demand sources to fill each impression. For example, photoflippers or slideshows can garner very different ad performance compared to a long-form editorial page.

Performance by geography: Not all countries are created equal. Instead of blending CPMs across geography, it is important to understand and optimize by region.

Extracting The Data

Once the metrics and dimensions that need to be tracked are determined, publisher ad ops teams can work with development teams to automate the extraction of this data from various sources. There should be at least one full-time developer dedicated to this project, with coordination from an ad ops project manager.

The first step in the development effort is to determine which demand sources have APIs and which metrics are available to pull via API. Note that API integrations are time-consuming to maintain and may not expose all the up-to-date functionality that a reporting interface offers. For example, many metrics in the AdX Query Tool are unavailable through the reporting API.
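
For sources that do expose an API, the pull itself is usually a scheduled HTTP request. A minimal sketch follows; the endpoint, parameters, auth scheme and response shape are hypothetical placeholders, since every demand source’s API differs:

```python
import requests

# Minimal sketch of an automated daily API pull. The URL, parameters,
# auth scheme and response shape are hypothetical placeholders; consult
# each demand source's API documentation for the real interface.

def pull_daily_report(base_url, api_key, report_date):
    resp = requests.get(
        f"{base_url}/reports/daily",  # hypothetical endpoint
        params={"date": report_date, "metrics": "impressions,revenue"},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["rows"]  # hypothetical response shape

# rows = pull_daily_report("https://api.example-exchange.com", "API_KEY", "2017-06-01")
```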

In addition, many demand sources may not have APIs available. In this case, there are a few other options, including pulling from automated daily emailed reports or using a scraping mechanism, such as Selenium, to log into each portal, download the data and feed the information into a data warehouse.
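
The Selenium route might look like the sketch below. Every URL and element ID is a hypothetical placeholder; each portal’s login flow and export page will differ:

```python
import os
from selenium import webdriver
from selenium.webdriver.common.by import By

# Minimal sketch of scraping a partner portal that has no API.
# The URLs and element IDs are hypothetical placeholders.

driver = webdriver.Chrome()
try:
    driver.get("https://portal.example-network.com/login")
    driver.find_element(By.ID, "email").send_keys("ops@publisher.example")
    driver.find_element(By.ID, "password").send_keys(os.environ["PORTAL_PASSWORD"])
    driver.find_element(By.ID, "submit").click()

    # Download the daily report; the resulting CSV would then be parsed
    # and loaded into the data warehouse.
    driver.get("https://portal.example-network.com/reports/daily/export")
finally:
    driver.quit()
```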

Once the raw data is stored in a data warehouse, publishers can layer on visualization tools, such as Tableau or Qlik Sense. Others build their own reporting interfaces with charts and graphs to complement the data tables. Even after the initial development phase is complete, there usually needs to be ongoing maintenance: demand-source APIs change, new metrics become available to report on and publishers constantly evolve the set of partners they work with.

While it is undoubtedly a significant effort to unify reporting across all the various demand sources, it can be well worth it to gain these insights in real time. Instead of using manpower to manually log and track all the data, we can start shifting toward analyzing the insights, setting optimization strategies and, best of all, finding more money under the hood.

Follow Edmunds (@edmunds) and AdExchanger (@adexchanger) on Twitter.

4 Comments

  1. Great article, Madhura. You're right, it is extremely difficult, and publishers need to employ many tech people focused solely on the finicky APIs of their vendors. We recommend a look our way: we offer the economies of scale of hundreds of maintained integrations built up over three years, plus the flexibility to handle any publisher's reporting needs. The benefits you describe are enormous and achievable, with resources that are 100% focused on extracting and unifying reporting.

  2. Great overview Madhura! I'm curious, how have you incorporated the bid data from the header bidders into your reporting? We are working on this issue, and trying to figure out a good method to track bids by partner for each impression.

    Also, in case it's helpful for anyone else: at http://www.SpanishDict.com, we went through an effort to centralize reporting across all our partners, and it proved incredibly valuable, but very time-consuming. In addition to tracking data by geography, ad unit and platform, we also set up our reporting to allow us to run A/B tests, which was a game changer. Details are here: http://blog.pubnation.com/ab-testing-ads/.

  3. Great read, Madhura! The issues you're writing about are incredibly complex and you managed to lay out the challenges, and the processes to overcome them, in an easily digestible way. Just a couple of points I'd like to expand on. Yes, publishers have had two choices for unified reporting:

    1) Invest in reporting consolidation services that cannot scale with the addition of new data sources.
    2) Invest in custom software development in-house to attempt to keep up with the momentum of all-new data sources at scale.

    Clearly, scale is at the crux of the problem in this space. The primary problem is that business intelligence platforms have approached this issue as an IT project. This means every time a new data source comes online, metrics and models must be scoped, programmed and deployed. The reality is that in the new world of marketing, this is not sustainable.

    At Datorama we've taken a different approach to this problem: we've created an infrastructure that scales with the complexity of a data environment. Our solution leverages the power of machine learning to integrate data sources and create a flexible data model, so that the age-old approach you write about can finally be put to bed. This means that our solution's algorithms learn over time and understand how to categorize and normalize data on their own, rather than forcing users to do it manually, which takes a lot of SQL, time and $$$. We've heard from our customers that with Datorama, for the first time, they've been empowered to work with data at the pace of business. It shouldn't come as much of a surprise that, on average, the typical customer integrates 70 data sources, on their own, within their first year using our product.

    Having said all of this, I invite anyone to shoot me a note at darica@datorama.com to discuss this further.

  4. Hi Madhura,

    Echoing Darica's comments, building a solution in-house is quite the investment. Publishers would need a team of dedicated engineers just to collect the data.

    The cost is in collecting and unifying reporting, resources that could otherwise go toward building new properties, sponsorship placements, content and yield. It's no easy task.

    If a solution is built internally, the ongoing maintenance is definitely a cost too; the work doesn't stop once it is built. The publisher's partner stack changes constantly, there are new metrics publishers are held to, such as viewability and engagement, and partners report in different time zones and currencies. Some partners have fragile reporting UIs, APIs or CAPTCHAs that even Selenium cannot crack, or no reporting UI at all, so you need to set up scheduled emails and upload the data into visualization tools like Domo, Tableau, Darica's Datorama or Looker so your dedicated team can view what it is collecting.

    Publishers can automate all of this and fuel these visualization tools with STAQ, and use it independently to run PMP reporting, revenue-at-risk reports, rCPM reporting, pacing and viewability. Or the visualization companies above may let you use their own services to collect the data for you.

    AdExchanger had another article on this: http://adexchanger.com/publishers/match-com-dating-sites-upping-optimization-with-staq/

    The important items to think about when building a unified reporting initiative:

    - Collecting the data: Do you have a dedicated dev team ready to collect data? If not, can your third-party partner automate this for you? You will need to feed any visualization tools you buy.

    - Viewing the data: Do you have a visualization tool? Or will your tech team query the data warehouse as well?

    While implementing a reporting strategy carries a cost, it can save the ops team a fortune once it is in place.

