Category: NPR

ProgrammableWeb : NPR API Architect Headed to Netflix

By , September 16, 2010 9:14 am

This article was originally published to ProgrammableWeb.com on September 16, 2010, upon the announcement that I was leaving NPR to join Netflix.

Daniel Jacobson, responsible for NPR’s trailblazing API, is leaving his post to join Netflix next month. Jacobson will become API Director of Engineering for the movie rental service, looking to support the company’s continued expansion to additional streaming devices. At NPR for over 10 years, Jacobson launched its API in 2008 and recently supported mobile devices that helped NPR’s traffic double in a year.

Jacobson joins Netflix at a time when the company is widely distributing its content to wherever media is played. All of these applications are supported by the Netflix API, which provides the meta-data, such as movie titles, and the ability to authenticate users to their own Netflix accounts. The new role is less about creating an API than about expanding what’s already there. “More of the focal point will be continuing to evolve the APIs for the enterprise needs of the company,” Jacobson said.

While Netflix has been popular with developers, one major reason for an API is internal development, as Jacobson recently wrote in a guest post. “I think it’s a great fit because I think that’s exactly the model that NPR has taken,” Jacobson said. “It’s all about eating your own dog food.”

Before Jacobson launched NPR’s API, the organization had two outlets for its digital content: the website and what Jacobson called a “less-than-optimal mobile site.” Using the same APIs available to developers, NPR built apps for iPhone, Android and iPad, as well as a new mobile site. The result was 100% growth in NPR website traffic, mostly due to the apps. “As we launched apps, we saw additive pageviews. It wasn’t cannibalizing pageviews from the site,” Jacobson said.

NPR was among the first major media organizations to publish an API. When we covered its launch, we noted it was the first talk radio API to provide access to the station’s content. Additionally, we compared it to the New York Times, which had announced but not released an API. The newspaper released its first API three months later.

Jacobson has been a frequent contributor to ProgrammableWeb as a guest author. For reference, here are all six of his blog posts so far:

His posts have provided a transparent view of how he ran the NPR API and we hope to continue to learn from his experience at Netflix.

ProgrammableWeb : Metrics for Content APIs: An NPR Case Study

By , September 10, 2010 1:27 pm

I originally posted this article to ProgrammableWeb.com on September 15, 2010.

This guest post comes from Daniel Jacobson, Director of Application Development for NPR. Daniel leads NPR’s content management solutions, is the creator of the NPR API and is a frequent contributor to the Inside NPR.org blog.

In my previous post, I discussed how companies can make money by using their content APIs to improve internal processes to enable rapid product development and to extend their reach. Doing this successfully, however, also requires a strong plan for capturing appropriate metrics for the API.

At least from NPR’s perspective, the primary goal of the API is to get as many eyeballs on the content as possible. To achieve this goal, there are several ways to track the content as it travels through the API, each of which serves its own role. The following are the four key metric types that NPR is targeting:

  • Request
  • Response
  • Impression
  • Loyalty

Each of these is important in determining the true reach of the API, although their respective values to the overall equation are different. Moreover, each comes with its own challenges in capturing and parsing the data. Below is NPR’s definition of each of these metrics along with some basic data that NPR has (or doesn’t have) so you can see the relevance of each to our API strategy.

Request
The marketplace standard right now for tracking API metrics is based on API requests. While this metric is very useful and important, it is only a segment of the metrics needed to really determine the success of a content API. This is because requests do not translate into actual consumption – they merely create opportunities for consumption. To put it another way, tracking requests reveals information about how developers use the API even though the API itself is really just a means to get content in front of consumers. So, it is critical for producers of content APIs to be able to track how the content is consumed when distributed through the API.

The following chart details the growth of the NPR API over time in terms of API requests.

Response
Although the request metrics tell us what the developer asked for, they do not tell us what was delivered to the developer. Depending on the nature of the API, the response may include multiple items for each request and/or they could include warning and/or error codes and other information that gets returned to the user. A common example of this is an RSS feed which receives a single request but can deliver many stories. If the API captures only request metrics, it is missing the specifics around what was returned to the API developer.

The response data is critical as it tells you what content is potentially available to end-users.

Although NPR received more than 72 million requests to the API in August, it delivered over 1.3 billion stories over that same timeframe. This translates into roughly 18 stories per request. Clearly, by capturing only the request data, you are missing a very important part of the story.
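
As a quick check of that ratio, here is a minimal Python sketch using only the rounded August figures quoted above. The variable names are illustrative, and the exact totals are not given in this post, so the result is approximate.

```python
# Rounded August 2010 figures quoted in the text; exact totals are not published here.
api_requests = 72_000_000
stories_delivered = 1_300_000_000

# Average number of stories returned for each request.
stories_per_request = stories_delivered / api_requests
print(f"{stories_per_request:.1f} stories per request")
```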

Impression
Impressions are the first point in the metrics calculation where actual consumption is captured. By an impression, I mean a page view (or equivalent) where an end-user experiences the content that was delivered by the API. Generally, the way this metric gets captured in APIs is by putting an image beacon in a piece of the content. The beacon renders from the API’s server when it gets presented by the calling app, providing you with information about the content and its consumption every time it is viewed.
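
To make the beacon idea concrete, here is a minimal Python sketch of how an API might append a 1x1 image beacon to a story body before returning it. This is an illustration under stated assumptions, not NPR's implementation: the beacon host, the `storyId` and `apiKey` parameter names, and both function names are hypothetical.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical beacon endpoint; not NPR's actual tracking URL.
BEACON_BASE = "https://metrics.example.org/beacon.gif"


def beacon_tag(story_id: str, api_key: str) -> str:
    """Build a 1x1 <img> tag whose URL identifies the story and the API consumer."""
    query = urlencode({"storyId": story_id, "apiKey": api_key})
    # Escape the URL so the "&" separators are valid inside an HTML attribute.
    src = escape(f"{BEACON_BASE}?{query}", quote=True)
    return f'<img src="{src}" width="1" height="1" alt="" />'


def add_beacon(story_html: str, story_id: str, api_key: str) -> str:
    """Append the beacon to the story body so it loads when the story is rendered."""
    return story_html + beacon_tag(story_id, api_key)


print(add_beacon("<p>Story text</p>", "123", "demo"))
```

Each time a calling application renders the story HTML, the image request reaches the beacon server, which can log it as one impression for that story and consumer.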

This is a very important metric because it is the impression that measures the number of eyeballs that see the content delivered by the API. For example, there are likely some requests whose results are never seen by a user because the calling application never presents them. Additionally, there are other requests whose results the calling application caches and presents to users multiple times. Moreover, because a single request could return multiple items in a response, depending on how the requesting application handles it, there could be many impressions for that single request. As a result, the impression numbers could be substantially higher or lower than the request and/or response totals, depending on how the calling application interacts with the API. Because advertising revenues for many content APIs are dependent on actual consumption numbers (and not server traffic), the impression metric is much more important than the request or response totals.

The above image demonstrates how the NPR News iPhone App accesses the NPR API. In our app, a single API request is made to present the screen on the left. In that request, 25 stories get returned. Each of those stories contains the full story content, including images, audio and full text. The list view of all 25 stories garners a single page view. Clicking through any one of those stories results in the screen on the right, which is the full story page. The full story page garners another page view even though the iPhone app does not make another API request for it. In fact, if I launched the app, went to the Science page and looked at every story page from that list, it would result in 26 page views, all stemming from a single API request.
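
The scenario above can be tallied in a few lines of Python. This is only a sketch of the counting, with a made-up event format (a metric name paired with an amount); it does not reflect how NPR stores its metrics.

```python
from collections import Counter


def tally(events):
    """Sum event amounts per metric type (request, response, impression)."""
    counts = Counter()
    for kind, amount in events:
        counts[kind] += amount
    return counts


# One API request returns 25 stories and renders one list screen.
events = [("request", 1), ("response", 25), ("impression", 1)]
# The user then opens each of the 25 full story pages, with no further API requests.
events += [("impression", 1)] * 25

totals = tally(events)
print(totals["request"], totals["response"], totals["impression"])
```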

Loyalty
Once an impression is realized by the API, the next step is to create some relationships and loyalty around your content. After the user consumes a piece of content, did they carry on to another piece? Or do they have trackable sessions in the system already, perhaps from a different platform (whether delivered from the API or not)? There are several ways to try to make these relationships, but this is quite challenging and NPR is in the very early stages of trying to handle this. Our approach so far has been to use impression-related data mixed with query string parameters and session-related data (such as cookies).

A tangible example of this is content that contains an audio or video asset. Generating an impression on the story is the first step. If the user then clicks on the audio, that click-through should also be attributed to the session attached to the API impression by passing tracking information to the audio URL so the audio piece can be related to the page view. By creating opportunities for the API content to create serendipitous experiences with other content, you are building a stronger, more sellable content API.
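
One way to pass tracking information along with the audio URL is to append query string parameters to it. The Python sketch below shows that idea only; the `sid` and `storyId` parameter names and the example URL are assumptions for illustration, not NPR's actual scheme.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def tag_audio_url(audio_url: str, session_id: str, story_id: str) -> str:
    """Append hypothetical tracking parameters, keeping any existing query string."""
    parts = urlparse(audio_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("sid", session_id), ("storyId", story_id)]
    return urlunparse(parts._replace(query=urlencode(query)))


print(tag_audio_url("https://media.example.org/a.mp3?fmt=mp3", "abc", "42"))
```

When the audio server logs the request, the extra parameters let the click-through be joined back to the session and story recorded by the impression.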

As I mentioned before, capturing data for each of these metrics offers unique challenges. For example, to improve performance on our APIs, NPR uses a suite of caching layers. Moreover, the API has a lot of rights exclusion algorithms and transformations. As a result, it is increasingly difficult to ensure successful tracking of all of the metrics for all of the requests. Tracking impressions from APIs offers unique challenges since much of the content is getting distributed in XML, JSON or something comparable. How do you put a tracking beacon in the content? In which field should it go and how can you be sure that the calling application will consume that field? If you put it in multiple fields to ensure consumption, how do you prevent duplicate hits for a single page view? Finally, assuming that you are successful in accurately tracking metrics for each of the above, how do you convert them into a compelling story, one that offers value to the business?
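
On the duplicate-hit question, one possible heuristic (an assumption for illustration, not a description of what NPR does) is to treat beacon hits for the same session and story that arrive within a short time window as a single page view. The hit format and the two-second window in this Python sketch are hypothetical.

```python
def count_page_views(hits, window_seconds=2.0):
    """Count page views, collapsing near-simultaneous hits for the same session and story."""
    last_seen = {}
    views = 0
    for hit in sorted(hits, key=lambda h: h["ts"]):
        key = (hit["session"], hit["story_id"])
        previous = last_seen.get(key)
        # A hit starts a new page view only if the previous hit is outside the window.
        if previous is None or hit["ts"] - previous > window_seconds:
            views += 1
        last_seen[key] = hit["ts"]
    return views


hits = [
    {"session": "s1", "story_id": "42", "ts": 0.0},   # beacon in the text field
    {"session": "s1", "story_id": "42", "ts": 0.1},   # same view, beacon in the teaser field
    {"session": "s1", "story_id": "42", "ts": 30.0},  # a later, separate view
]
print(count_page_views(hits))
```

A window like this trades accuracy for simplicity: it can undercount rapid legitimate revisits and does nothing for applications that never load the beacon at all.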

I do not want to imply that NPR has solved all of these problems. Rather, we have built systems that do help us capture information about all phases of these metrics. But these systems are not bullet-proof. They do, however, give us more data about the content consumption from the API than merely request-based metrics, allowing us to learn more about how the API helps us achieve our greater goal: to increase the total number of eyeballs on our content.

Presentation : NPR’s Digital Distribution Strategy: OSCON2010

By , July 21, 2010 11:28 pm

This presentation was given at OSCON in July of 2010.

When launching the API at OSCON in 2008, NPR targeted four audiences: the open source community; NPR member stations; NPR partners and vendors; and finally our internal developers and product managers. In its short two-year life, the NPR API has grown tremendously, from only a few hundred thousand requests per month to more than 60M. The API, furthermore, has enabled tremendous growth for NPR in the mobile space while facilitating more than 100% growth in total page views in the last year.

Presentation : NPR’s Digital Distribution and Mobile Strategy

By , June 25, 2010 8:06 pm

This presentation was first published on June 25, 2010.

The NPR API has been the great enabler to achieve rapid development in the mobile space. That is, because we have our rich and powerful API, our mobile team is free to pursue the development of their mobile products without being encumbered by limited internal development resources. The touch-point between the mobile product and our content is fixed which means the mobile team can focus on design and usability for the specific platform.

Presentation : NPR API Usage and Metrics

By , April 19, 2010 1:38 pm

I originally published this post to the Inside NPR.org blog on April 19, 2010.

Daniel Jacobson’s Presentation on NPR API Usage and Metrics

It has been more than a year since my last post about API usage, so this is long overdue.  Needless to say, the NPR API has grown tremendously since its launch in 2008.  The biggest consumer of the API is still NPR by a long shot.  That said, member stations, partners and the general public have been making extensive use of the API. 

Below are metrics on the API followed by a suite of examples of how it is being used.  But first, here are a few qualifiers to the statistics:

– These statistics are about requests to the API, not necessarily what was consumed by or presented to users.

– These numbers were obtained through server logs.  Although we believe they are largely accurate, it is possible that these numbers are off by a couple percentage points in either direction.

– A request to the API can mean many things.  For example, on the NPR News iPhone app, the Top Stories section is populated by one request to the API – that one request returns many stories.  The Flash Player on NPR.org, when playing a single piece of audio, represents one request to the API for exactly one story.  Meanwhile, some requests (such as a story page on WBUR.org) get exactly one story but, because they get cached, result in many page views for that one request.  As a result, metrics around requests are imperfect.  That said, they do offer some information about usage and consumption, even if not the whole story.

The NPR API has grown tremendously since its launch in July 2008.  Most of the requests today are from NPR products, but a sizable portion of the activity is also from public usage and requests from NPR member stations.
Daniel Jacobson/NPR

The big jump in total API requests from July to August is due to the launch of many new products in July.  Among them are the new NPR.org, the NPR.org Flash Player, the NPR News iPhone app, WBUR’s new web site, and Minnesota Public Radio’s new site. Since then, an increasing number of applications have been implemented on the API, including the NPR mobile site and other station sites (like KQED and KPCC), accounting for the steady growth over the last few months.

In the six months of tracking stories delivered, the NPR API has delivered almost five billion stories.  Last month alone, it pushed out over 1.1 billion.
Daniel Jacobson/NPR

The total number of stories requested is a new metric that we started tracking in October, 2009.  This is the number of stories requested, not the number necessarily delivered or consumed by a user.  This metric is based on the numResults parameter (or lack thereof) in the API request.  And to be clear, in March, 2010, the API did deliver over 1.1 billion stories!  Since tracking this metric six months ago, the API has handled almost 5 billion story requests.
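
A stories-requested count of this kind could be derived from logged request URLs roughly as in the Python sketch below. The `numResults` parameter name comes from the text above; the default of 10 used when the parameter is absent, the example URLs, and the function name are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical default for requests that omit numResults; the real default is not stated here.
ASSUMED_DEFAULT = 10


def stories_requested(request_url: str, default: int = ASSUMED_DEFAULT) -> int:
    """Return the number of stories a request asked for, based on numResults."""
    params = parse_qs(urlparse(request_url).query)
    values = params.get("numResults")
    if not values:
        return default
    try:
        return int(values[0])
    except ValueError:
        # Malformed values fall back to the assumed default.
        return default


log_urls = ["/query?id=1007&numResults=25", "/query?id=1001"]
total = sum(stories_requested(url) for url in log_urls)
print(total)
```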

The current distribution of requests by output format is as follows:

  • NPRML – 86%
  • RSS – 5.8%
  • JavaScript Widget – 2.6%
  • PodcastRSS – 1.6%
  • HTML Widget – 1.5%
  • MediaRSS – 0.7%
  • JSON – 0.14%
  • Atom – 0.01%
Daniel Jacobson/NPR

Overwhelmingly, NPRML is the dominant output format delivered by the API.  That said, most of the requests are from NPR and related products.  In the past, prior to NPR building out the new site, iPhone app, etc., the distribution was slightly less NPRML-heavy.  Those models had a distribution of NPRML at about 55% and RSS at about 25%, with MediaRSS, JS and HTML getting reasonable traffic but significantly less than RSS.

Although the API has always been dominated by NPRML requests, this dominance has only been more dramatic with the myriad launches in July 2009.  Prior to those releases, NPRML represented about 55% of the 3.5M requests.  Now NPRML represents about 85% of the 53M requests.
Daniel Jacobson/NPR

As you can see from this chart, the non-NPRML formats have grown a bit over time, especially the JavaScript Widget in recent months.  Prior to August, 2009, NPRML was still very dominant, but its use was substantially closer to the use of the other formats.
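
The approximate figures quoted above (about 55% of 3.5M requests before the July 2009 launches, about 85% of 53M now) give a rough sense of that growth in absolute terms. This Python sketch uses only those rounded numbers, so the results are estimates.

```python
# Rounded figures from the text: total monthly requests and approximate NPRML share (percent).
before_total, before_nprml_pct = 3_500_000, 55
after_total, after_nprml_pct = 53_000_000, 85

# Requests in all non-NPRML formats combined, using integer arithmetic.
before_other = before_total * (100 - before_nprml_pct) // 100
after_other = after_total * (100 - after_nprml_pct) // 100

print(before_other, after_other, round(after_other / before_other, 1))
```

By this estimate, the share of non-NPRML requests fell from about 45% to about 15%, while their absolute volume grew roughly fivefold.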

Although NPRML is overwhelmingly dominant, the other eight output formats do get a lot of requests.  Over the last several months, RSS and JavaScript Widgets, in particular, have seen the most growth.  Surprisingly (to me, at least), ATOM has remained almost non-existent.
Daniel Jacobson/NPR

Abstracting away NPRML, it is much clearer how the other formats have grown over time.  RSS and the JavaScript Widget have really grown tremendously over the last year.  Mix-Your-Own-Podcast launched in December, 2008 and took a while to start its growth. 

To get a sense as to what is creating all of this traffic, I have created a presentation that shows a range of examples on how the NPR API is getting used.  The following presentation is probably missing some interesting and heavily-accessed examples, but it should give you an idea as to how pervasive the API has become. 

Given the growth charts above, along with the introduction of the API to station sites and the launch of the iPad and other upcoming tablets and portable devices, we expect these numbers to continue to climb.

Finally, if you are aware of some interesting implementations using the NPR API, please let us know in the comments here or by emailing us at techcenter at npr dot org.
