Daniel Jacobson's Blog

Category: Article

ProgrammableWeb : Why REST Keeps Me Up At Night

By daniel_jacobson, December 11, 2012 4:43 am

This post first appeared on ProgrammableWeb.com

With respect to Web APIs, the industry has clearly and emphatically landed on REST as the standard way to implement these services. And for good reason… REST, which is generally implemented as a one-size-fits-all solution, is an excellent choice for most companies that wish to expose their content to third parties, mobile app developers, partners, internal teams, etc. There are many tomes about what REST is and how best to implement it, so I won’t go into detail here. But if I were to sum up the value proposition to these companies of the traditional REST solution, I would describe it as:

REST APIs are excellent at handling requests in a generic way, establishing a set of rules that allow a large number of known and unknown developers to easily consume the services that the API offers.

In this model, everyone knows how to behave and it can be incredibly powerful. The API providers establish a set of rules and the API consumers must adhere to those rules to get what they want from the API. It is perfect, right? In many cases, the answer is obviously yes. But in other cases, as our world scales and the number of ways for people to consume digital content and services continues to expand, this one-size-fits-all model is likely to fall short.

The potential shortcomings surface because this model assumes that a key goal of these APIs is to serve a large number of known and unknown developers. The more I talk to people about APIs, however, the clearer it is that public APIs are waning in popularity and business opportunity and that the internal use case is the wave of the future. There are books, articles and case studies cropping up almost daily supporting this view. And while my company, Netflix, may be an outlier because of the scale at which we operate, I believe that we are an interesting model of how things are evolving.

Netflix is currently available on over 800 different device types, including game consoles, mobile phones, TVs, Blu-ray players, tablets, computers, and almost any other device that can stream video. Our API alone handles more than two billion incoming requests on peak days, which translates into almost ten billion real-time outgoing requests from the API to internal dependency services. These numbers are up by about 70x from just two years ago. Most companies do not have that kind of scale, but it is clear that with the continued growth of the device market more companies are resetting their strategies to be less about the public API and more about internal consumption of their own APIs to support device proliferation. When this transition occurs, the API is no longer targeting “a large number of known and unknown developers.” Rather, the key audience is a small number of known developers.

The potential conflict between the internal and public use cases is in the design of the API itself. Keep in mind that the design implications will not be problematic in many scenarios. It becomes a potential problem if the breadth of devices becomes so wide that the variability of features across them becomes substantially harder to manage. It is the breadth of devices that creates a problem for the one-size-fits-all API solutions.

If your target is a small group of teams with whom you have close relationships, the dynamics around the API change. For Netflix, we persisted with the one-size-fits-all REST model for quite a while as more and more devices got added on top of the API. But given our scale, one thing has become increasingly obvious. Our REST API, while very capable of handling the requests from our devices in a generic way, is optimized for none of them. This is the case because our REST API focuses on resources that are meant to be granular representations of the data, from the perspective of the data. The granularity is exactly what allows the API to support a large number of known and unknown developers. Because it sets the rules for how to interface with the data, it also forces all of the developers to adhere to those rules. That means that each device potentially has to work a little harder (or sometimes a lot harder) to get the data needed to create great user experiences because devices are different from each other.

The differences across these devices can be varied and sometimes significant. Here are some examples of variances across devices that may be challenging for one-size-fits-all models:

Different devices may have different memory capacity
Some devices may require a unique or proprietary format or delivery method
Some devices may perform better with a flatter or more hierarchical document model
Different devices have different screen real estate sizes which may impact which data elements are needed
Some devices may perform better having bits streamed across HTTP rather than delivered as a complete document
Different devices allow for different user interaction models, which could influence the metadata fields, delivery method, interaction model, etc.

Just think about the differences between an iPhone and your TV and how they beg for different user experiences. Moreover, the Xbox and the Wii, both of which project to the TV, differ in the way users interact with them as well as in their hardware constraints, both of which may require different APIs to support them. When considering more than 800 different device types, the variance across them becomes overwhelming. And as more manufacturers continue to innovate on these devices, the variance may only broaden.

How do you know if your company is ready to consider alternatives to the one-size-fits-all API model? Here are the ingredients needed to help you make that decision:

Small number of targeted API consumers is the top priority
Very close relationships between these API consumers and the API team
An increasing divergence of needs across the top priority API consumers
Strong desire by the API consumers for more optimized interactions with the API
High value proposition for the company providing the API to make these API consumers as effective as possible

If these ingredients are met, then you have the recipe for needing a new kind of API.

Because of the differences in these devices, Netflix UI teams would often have to do a range of things to get around our REST API to better serve the users of the device. Sometimes, the API team would be required to extend the base service to handle special cases, often resulting in spaghetti code or undocumented features. And because different teams have different needs, in the REST API world, we would often need to delay feature development for some due to the challenges around prioritization. In addition to these kinds of issues, significant performance and/or architectural problems are bound to emerge. For example, these more granular APIs often result in chattier interactions between device and server or chunkier payloads, as I discussed in a previous post on the Netflix Tech Blog.

To solve this issue, it is becoming increasingly common for companies (including Netflix) to think about the interaction model in a different way. Rather than having the API create a set of rather rigid rules and forcing the various devices to follow them, companies are now thinking about ways to let the UI have more control in dictating what is needed from a service in support of their needs. Some are creating custom REST-based APIs to support a specific device or category of devices. Others are thinking about greater granularity in REST resources with more batching of calls. Some are creating orchestration layers, such as ql.io, in their API system to customize the interaction. These are all smart and practical ways around the problem. But with the growing number of devices, the increasing urge for companies to be on as many of them as possible, and the desire for continued innovation across these devices, these various solutions are still somewhat restricted. They are still forcing the developers to adhere to server-side rules and non-optimized payloads in an effort to have a one-size-fits-all solution. These approaches are closer to the flexibility needed in that they are not as rigid as the typical REST-based solution, but when supporting as many devices as Netflix does, we believe they fall short for us.

For Netflix, our goal is to take advantage of the differences of these devices rather than treating them generically. As a result, we are handing over the creation of the rules to the developers who consume the API rather than forcing them to adhere to a generic set of rules applied by the API team. In other words, we have created a platform for API development.
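As a rough illustration of that shift, the sketch below contrasts granular, data-centric resources with small device-specific adapters that compose them on the server. It is a minimal sketch under stated assumptions, not Netflix's implementation: every name, the sample data, and the registration mechanism are hypothetical and were chosen only to show the idea.

```python
# Hypothetical granular resources, modeled "from the perspective of the data".
CATALOG = {
    1: {"title": "Title One", "synopsis": "A long synopsis...", "box_art": "one.jpg"},
    2: {"title": "Title Two", "synopsis": "Another synopsis...", "box_art": "two.jpg"},
}
USER_LISTS = {"user-1": [1, 2]}


def get_list(user_id):
    """Generic resource: the title IDs in a user's list."""
    return USER_LISTS[user_id]


def get_title(title_id):
    """Generic resource: the full record for one title."""
    return CATALOG[title_id]


# Registry of device-specific adapters, written by the teams that consume the API.
ADAPTERS = {}


def adapter(device):
    """Decorator that registers a function as the home-screen adapter for a device."""
    def register(fn):
        ADAPTERS[device] = fn
        return fn
    return register


@adapter("phone")
def phone_home(user_id):
    # Small screen, limited memory: a flat list of titles only.
    return [get_title(t)["title"] for t in get_list(user_id)]


@adapter("tv")
def tv_home(user_id):
    # Large screen: a nested document with artwork, but without the synopsis.
    items = [get_title(t) for t in get_list(user_id)]
    return {"rows": [{"title": i["title"], "box_art": i["box_art"]} for i in items]}


def handle_request(device, user_id):
    """One request per screen: the adapter does the fan-out on the server."""
    return ADAPTERS[device](user_id)
```

With a generic API, each device would call get_list and get_title itself and discard what it does not need. Here each device makes one call and gets a payload shaped for it, and the rules for that payload live with the team that owns the device.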

APIs, Article, Engineering, Technology | ProgrammableWeb, REST

ProgrammableWeb : 7 Ways to Make Your API More Successful

By daniel_jacobson, December 10, 2012 8:35 pm

I originally published this article to ProgrammableWeb.com on December 10, 2012

The purpose of a content API is to make the content available to its audience in the most useful and efficient way possible. To be a useful API, it needs to help the developers make their jobs easier. This could mean a wide range of things, including making it easier to dig into the API, allowing for greater flexibility in the responses, and improving performance and efficiency for both the API and its consumers. Below are seven development techniques (all of which are part of the NPR API) that can help content providers improve the usefulness and efficiency of their APIs on both sides of the track. These techniques played a critical role in the success of the API which now delivers over 700 million stories per month to its users (more stats on the NPR API coming soon on our Inside NPR.org blog).

Be Flexible: Support Multiple Output Formats
Making the API as available and accessible as possible is very important in drawing developers to use it. So providing the content in a range of formats will increase the likelihood that the developer can rely on existing libraries and make as few changes to the code as possible.

The NPR API offers eight different output formats in an effort to improve efficiency for the developers. Above is a graph demonstrating the distribution of requests for each of the formats in July of 2009. As you can see, the majority of requests are to our proprietary XML markup (NPRML). That also means that almost 50% of the requests, or about 20M requests per month, use the other seven formats. In offering these other non-proprietary XML formats, the API is able to support developers that may have existing applications that pull in content in one of these standardized formats, such as MediaRSS or Atom.

To make it even easier for people to use the API, NPR also launched with JavaScript and HTML “widgets”. The other six formats require more sophistication in order to put the content in an application or website. The widgets, however, are pre-designed feeds of NPR content (based on the developer’s selections) that can be easily dropped into a page.

Be Efficient: Handle Partial Response
This concept is starting to get more traction now that Google has announced partial response handling for some of their APIs. NPR’s API also makes extensive use of this feature because it really is tremendously valuable to the provider and the consumer of the API. For example, NPR stories contain a wide variety of fields and assets in the API. If the consumer is forced to handle the complete document, even if they only want a few fields, they have to endure all of the latency issues from the API itself as well as the additional processing power needed to handle the undesired fields.

As a result, NPR incorporated a “fields” parameter (the same parameter name used by Google) that can be used in the query string to limit the resulting document to only the fields of interest. This approach creates documents that are smaller and much more efficient. Overwhelmingly, more requests to the NPR API contain the fields parameter than those that do not (in fact, it isn’t even close).

Here are a few examples of how the same query to the NPR API, returning the same stories, delivers different documents based on the fields parameter (you will need to register for your own NPR API key to execute these queries):

http://api.npr.org/query?id=1001&apiKey=your_api_key

http://api.npr.org/query?id=1001&fields=title&apiKey=your_api_key

http://api.npr.org/query?id=1001&fields=title,teaser,text,image,audio&apiKey=your_api_key
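For developers who assemble these requests in code, a minimal sketch of a URL builder follows. The endpoint and the id, fields and apiKey parameters come from the examples above; the function name and its arguments are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.npr.org/query"


def build_query_url(story_id, api_key, fields=None):
    """Build a query URL, adding the fields parameter only when fields are given."""
    params = {"id": story_id}
    if fields:
        params["fields"] = ",".join(fields)
    params["apiKey"] = api_key
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return BASE_URL + "?" + urlencode(params, safe=",")


print(build_query_url(1001, "your_api_key"))
print(build_query_url(1001, "your_api_key", fields=["title", "teaser"]))
```

Leaving fields out requests the complete document, so the smaller payload is something the caller asks for explicitly.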

An extension of partial response is to allow the developer to specify the number of items they would like in return. Some APIs return a fixed number of results, which can bloat the document just like the extra fields can. The NPR API, to counter this, allows the developer to pass in the number of results desired (with a fixed ceiling for any given request). To dig deeper in the results, we incorporated a “pagination” feature in the API. Here are some examples of how to control the number of stories:

http://api.npr.org/query?id=1001&numResults=5&apiKey=your_api_key

http://api.npr.org/query?id=1001&numResults=5&startNum=6&apiKey=your_api_key
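The relationship between the two parameters can be sketched as a small helper. It assumes that startNum is a 1-based offset, which is inferred from the second example above (a first page of five stories followed by startNum=6) and is not something the post states outright; the function name is hypothetical.

```python
def page_params(page, num_results):
    """Return the numResults/startNum pair for a 1-based page number.

    Assumes startNum counts from 1, so page 2 of 5 results starts at item 6.
    """
    if page < 1 or num_results < 1:
        raise ValueError("page and num_results must be positive")
    return {"numResults": num_results, "startNum": (page - 1) * num_results + 1}


print(page_params(2, 5))
```

The fixed ceiling on numResults mentioned above is enforced by the API itself, so a client helper like this one only needs to compute the offset.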

Give Them Control: Allow for Customizable Output Markup (“Remapping Fields”)
As mentioned in the transform section, if the API can easily serve existing applications that expect specific markup, it potentially increases adoption and improves developer efficiency. To extend that functionality, the NPR API offers a function that we call “Remap” which essentially lets the developer modify the name of one or more XML elements or attributes in the output at request time. This is done in the query string and the API transforms the markup accordingly in real-time. Here are a few examples:

In this example, the remap parameter changes the story title to <specialTitle>:

http://api.npr.org/query?id=1001&remap=list.story.title:specialTitle&apiKey=your_api_key

In this example, the remap parameter changes the story title to <specialTitle> and it changes the image caption to <imageCaption>:

http://api.npr.org/query?id=1001&remap=list.story.title:specialTitle,list.story.image.caption:imageCaption&apiKey=your_api_key

In this example, the remap parameter changes the audio element’s id attribute to be named audioId:

http://api.npr.org/query?id=1001&remap=list.story.audio~id:audioId&apiKey=your_api_key
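To make the remap syntax concrete, here is a minimal sketch of how a server might parse and apply it. This is not NPR's implementation, which the post does not describe. It assumes the dotted path is relative to the document's root element, that a ~ separates an element path from an attribute name, and that rules are comma-separated, all inferred from the three examples above. The sample document and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_remap(spec):
    """Split 'path:newName,path~attr:newName' into (path_parts, attr, new_name) rules."""
    rules = []
    for item in spec.split(","):
        target, new_name = item.split(":", 1)
        path, _, attr = target.partition("~")
        rules.append((path.split("."), attr or None, new_name))
    return rules


def apply_remap(root, spec):
    """Rename elements or attributes in place according to a remap spec."""
    # Resolve every path before renaming anything, so one rule cannot hide
    # the target of another.
    resolved = [
        (root.findall("/".join(parts)), attr, new_name)
        for parts, attr, new_name in parse_remap(spec)
    ]
    for elements, attr, new_name in resolved:
        for element in elements:
            if attr is None:
                element.tag = new_name  # rename the element itself
            elif attr in element.attrib:
                element.set(new_name, element.attrib.pop(attr))  # rename one attribute
    return root


# Simplified stand-in for an API response.
doc = ET.fromstring(
    '<nprml><list><story><title>Hello</title><audio id="7"/></story></list></nprml>'
)
apply_remap(doc, "list.story.title:specialTitle,list.story.audio~id:audioId")
print(ET.tostring(doc, encoding="unicode"))
```

Because the renaming happens at request time on the outgoing document, the stored markup never changes, which is what makes the backward-compatibility use described next possible.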

Another benefit to remap (which we have fortunately not had to use) is that it can be used to handle backward compatibility as the API grows and changes. NPR’s philosophy is to make sure that upgrades do not adversely affect existing functionality. That said, if an element or attribute does need to change, we could execute Apache rewrites for all old API calls and have the remap function applied to have the output match that of the old markup. Alternatively, the developer could simply modify their API call instead of having to change their codebase to match the markup changes. (Although we do not intend to change existing markup, if we do, we would advise developers to upgrade their code accordingly. That said, rather than having the applications fail during the transition, remap could be used to temporarily handle requests until the full codebase can be upgraded.)

Be Fast: Set Up a Comprehensive Caching Architecture
Performance is another critical aspect of APIs when it comes to enticing developers to use them. After all, if the API is sluggish, developers may not want their application to depend on it.

Smart caching of queries and results can really improve the speed of the system. NPR has implemented several layers of caching for the API, as follows:

Base XML – Caching the full document for each item is important to prevent the system from executing disk I/O before doing any transform. We cache the Base XML first in memory and secondarily as XML files to eliminate the need to access our content database.
Full Query Results – When compiling the list of items to be returned for any given story, it is important to cache the full list because popular applications that have many concurrent users (such as NPR Addict) are very likely to execute the same queries and expect the same results. The cached result is a single document containing the full list of all items and the full base XML for each.
Transformed Query Results – The calling application, such as NPR Addict, expects the document to be transformed to fit the application’s needs. So, the results that get cached in Full Query Results may get transformed to MediaRSS while simultaneously removing extraneous fields. Caching the final results that get returned to the calling application enables the fastest performance without compromising the system’s ability to use the other caching layers to produce different versions of the document.
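The three layers above can be sketched as nested lookups, each falling back to the layer beneath it. This is a minimal in-memory sketch, not NPR's architecture: the class and method names are hypothetical, the three callables stand in for the database, the query engine and the output transform, and expiry, invalidation and the file-based second tier are left out.

```python
class LayeredCache:
    """Three in-memory layers: base documents, query results, transformed output."""

    def __init__(self, load_story, run_query, transform):
        self._load_story = load_story  # story_id -> base document (e.g. from the database)
        self._run_query = run_query    # query -> list of story IDs
        self._transform = transform    # (documents, output_format, fields) -> final document
        self.base = {}                 # layer 1: base document per story
        self.queries = {}              # layer 2: full list of base documents per query
        self.transformed = {}          # layer 3: final output per (query, format, fields)

    def get_base(self, story_id):
        if story_id not in self.base:
            self.base[story_id] = self._load_story(story_id)
        return self.base[story_id]

    def get_query(self, query):
        if query not in self.queries:
            self.queries[query] = [self.get_base(i) for i in self._run_query(query)]
        return self.queries[query]

    def get(self, query, output_format, fields=()):
        key = (query, output_format, tuple(fields))
        if key not in self.transformed:
            docs = self.get_query(query)
            self.transformed[key] = self._transform(docs, output_format, fields)
        return self.transformed[key]
```

A repeated request is answered from the third layer with no further work. A request for the same query in a different format or with different fields misses the third layer but reuses the second, so it costs one transform and no database access.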

Give Them Tools: Provide a Query UI with the Documentation
There are two truths about developers and documentation: the former always expects the latter, but seldom uses it. Of course, you cannot have an API without providing comprehensive documentation. That said, offering a simple user interface that helps developers get what they need from the API will increase adoption and make life easier for them.

NPR’s API launched with a tool that we call the Query Generator. This tool exposes more than 6500 query-able IDs, methods for controlling the output format, fields to be returned, date and search restrictions, pagination, and more. Using the interface, the developer can select their options and have the tool create the query string for their API request. The developer can also see the results of that query inline before committing it to their application. Almost exclusively, developers (including the NPR staff) use this tool to create queries, rather than reading the documentation.

Be Open: Eliminate Rate Limiting
Throttling or limiting access to APIs is an inherent disincentive for developers. Moreover, it is actually a detriment to the API provider. After all, the purpose of the API is to grant access to the content. If a given developer can only call the API 5000 times a day, and that developer creates a hugely popular application, the rate-limiting will inherently stifle the developer and the viral nature of the API.

Granted, most APIs use rate-limiting or tiered access levels to allow business people to control the graduation of API users. This seems counter-productive to me though. The better approach is to open access completely, identify those incredibly successful usages, then work with the developer accordingly on a mutually beneficial relationship. This way, applications are given full ability to grow and mature without arbitrary constraints.

Other APIs implement rate-limiting to protect the servers from unexpectedly high load. This is a legitimate risk which, if encountered, can adversely affect the performance of all users. That said, building complicated features into the system, such as rate-limiting, can be much more costly than configuring a scalable server architecture. Moreover, each request to the API will see slight latency increases as a result of the rate-limiting analysis. I know that latency is marginal, but why introduce any additional latency, especially when creating disincentives for developers?

Be Agile: Practice Iterative Development
Building your API over time has several benefits. First, it signals to the developer community that this API is meaningful to the provider and will continue to grow and get supported over time. This sounds trivial, but it is a very important part of the relationship with the community. If developers are not sure about your commitment to the API, are they likely to spend their own time building an application around it?

Another benefit of iterative development is that you do not have to get the API perfect the first time. I will qualify that by saying that, as a matter of principle, any release for an API should be done with the expectation that it will be supported for a long time. This is important because changes to existing API features will break the applications of those that use them. When I say the API doesn’t have to be perfect, I mean it does not have to be complete. New features can (and should) be added over time, extending its capability and making it more attractive for potential developers.

To put it another way, you will not have every detail of the API solved at the initial launch. It is much better to go live with the features that you know well while deferring those that you do not. Trying to cram in tenuous requirements will create headaches for you and for the community down the road. Spend the time necessary on figuring out the features, the supporting markup, the access and error methods, etc. before you commit to an API feature.

APIs, Article, Content Management, COPE, NPR, Technology | ProgrammableWeb

Forbes : Explaining the API Revolution to your CEO

By daniel_jacobson, February 12, 2012 12:01 pm

This article first appeared on Forbes on February 12, 2012

APIs: A Strategy Guide

For most of the past year, I have worked with two brilliant experts on APIs, Daniel Jacobson at Netflix and Greg Brail, CTO of Apigee, to create a book that clearly explains the value of APIs. In researching the book, APIs: A Strategy Guide, we talked to dozens of other smart people who had led the creation of APIs for both internal and external use.

One of the most striking findings was how often API programs were started in secret, nurtured by the true believers in a clandestine way, slipped into production, and then brought to the awareness of senior management after the API was shown to be a success.

This pattern is understandable but ridiculous. Our book is an attempt to obviate this pattern by providing a top-to-bottom reference for people who want to understand the business value of APIs. But we are not naive. We know that the most likely people to read our book are also true believers and innovators who are already open to new ideas. The trick to accelerating the adoption of APIs and reaping the massive value they can create is to convince skeptics. At some point, that means convincing the CEO, who, if sold on the idea, can bring all the resources of the organization to bear. APIs have been proven over and over to be a transformative force. It is time for technology leaders to force the issue. So, get yourself a meeting with your CEO and have a conversation along the following lines.

The Business Basics of APIs

When a CEO looks at the calendar entry for your meeting, he or she will likely think, “My god, this seems a bit technical for me. Shouldn’t the CIO or CTO be handling this?” So your first job is to make sure that the CEO understands that APIs are huge channels for business. Start the conversation by pointing out these facts:

APIs are not experimental: More than half of all the traffic to major companies like Twitter and eBay comes through APIs.

APIs are a channel to new customers and markets: APIs used externally unlock the power of partners to use business assets to extend the reach of your products or services to customers and markets you may not easily reach.
APIs can be private: Much of the talk about APIs emphasizes their public use. Internal APIs should be part of every company’s IT strategy.
APIs promote innovation: Through an API, people who are passionate about a problem can solve it on their own.
APIs are a better way to organize IT: APIs used internally can accelerate innovation by allowing everyone in a company to use each other’s assets without having to wait around for permission.
APIs are not only for huge companies: The technology is standardized and able to be used by companies of all sizes.
APIs create a path to lots of Apps: Apps are powered by APIs. When developers are motivated, they can use APIs and combinations of APIs to create new experiences for end users.
APIs create lots of apps that can lead to lots of customers: Apps are going to be a crucial channel in the next 10 years. There will be trillions of apps in the next decade vs. a billion web browsers in the last.

Then point the CEO to the post by Google engineer and Amazon alumnus Steve Yegge, which points out that APIs are so important to Amazon CEO Jeff Bezos that he threatened to fire anyone who didn’t expose their assets through APIs.

APIs from Twitter, Amazon, Google, and Facebook have been used to create thousands of applications. These victories are being followed by APIs from AT&T, Sears, E*Trade, Alcatel-Lucent, Accuweather, and hundreds of other companies. Point the CEO to the research from John Musser, founder of ProgrammableWeb.com, a site that tracks the growth of APIs.

At this point if your CEO isn’t interested, then there’s not much you can do.

However, the next question your CEO will likely have is, “What is an API?”

Drawing this simple graphic is a good way to start:

API Value Chain

Explain that APIs are a simple way to provide access to some type of business assets. The business asset can be information itself, information about a product or service, or direct access to the product or service.

Point out that to make an API successful, everyone in the value chain must benefit. Make sure you convey you are not suggesting that if you build an API, the world will come rushing to use it.

The value chain takes many forms. The organization that owns the business asset may or may not be the same as the organization that builds the APIs. Different people or organizations may build, distribute, and market the applications. At the end of the chain are end users who get the benefit of the business asset. Often, many APIs are used to create a new experience for end users.

Finding your API Value Proposition

If you are getting anywhere with your conversation, your CEO will likely get impatient with the technology and want to know the payoff for your company. He or she may say, “Great, APIs are a big deal, but what does that have to do with us?”

Here’s where you need to come armed with some sort of experiment that you want to get support for. One of the best ways to start is with an internal use of APIs. A key point of our book, one tirelessly championed by Daniel Jacobson, is that internal use of APIs is going to have the largest impact for most businesses. Remember, Amazon’s Jeff Bezos was going to fire people if they didn’t create internal APIs.

So, what are the business assets? Usually information that would benefit the company if more people had it in their hands. Look at the backlog of requests to IT. Is there an API that could take several items off the list? Have this in mind.

It is much better to build skills internally and master the design, development, testing, and operational processes before putting an API in public. That said, with careful planning and a clear value proposition, it is possible to have your first API be a public one, especially if you are following a well-established method of using APIs.

If you do start with an external public API, make sure you have a long testing cycle and flesh out all business and legal concerns before launching. I recently talked about API design and development with Byron Sebastian of Heroku, the Ruby platform for development that was purchased by Salesforce.com in 2010. He said that all of their APIs are first used extensively internally before going through a long public beta test. “When you launch an API, you want to make sure you are confident it is right, because people will rely on it,” Sebastian said.

At this point, I hope you are successful, because then you will need to answer questions about business models, design, engineering, operations, and marketing, all of which are covered in our book.

This article covers one way to go about educating a CEO about APIs. If you have other approaches, please send them along.

Also, we are going to be expanding the book several times this year. If you have a good example of how APIs have been successful for your organization, please reach out to me.

Dan Woods is CTO and editor of CITO Research, a firm focused on advancing the craft of technology leadership. He consults for many of the companies he writes about. For more stories about how CIOs and CTOs can grow visit CITOResearch.com.

APIs, Article, Netflix, Technology | APIs: A Strategy Guide, Forbes

ReadWriteWeb : Netflix’ Daniel Jacobson: Letting APIs Change Everything

By daniel_jacobson, February 3, 2012 4:20 pm

This article was originally posted by Scott Fulton on ReadWriteWeb on February 3, 2012 as Part II in a series. Part I can be found here.

What we today call the “mobile app” could, in a very short period of time, become known as the portable app, or just the “app.” It tends to use such a simple and straightforward model of interaction that people are starting to prefer using their smartphones for certain tasks, even when their PCs are right in front of them. By this time next year, portable apps originally designed for use on smartphones and tablets may be running on laptops.

The extent to which this changes everything is a topic that no one, not even ReadWriteWeb, has fully fathomed. The Web as we have come to know it will be affected significantly. What users have come to know as Web sites will be willingly and eagerly substituted with Web apps. In Part 2 of our interview with the co-author of APIs: A Strategy Guide, Netflix lead API engineer Daniel Jacobson tells us the one huge difference between an app and a site involves the extent to which they rely on an API. It is part of every app’s DNA.

The First, Painful Steps Toward Multi-Platform

In 2002, as you learned from part 1 of our RWW interview last week, when Jacobson was with NPR, he helped make a critical decision about its information infrastructure, the implications of which his team had not foreseen: “Literally the first thing that we did,” he tells RWW, “is, we built the API and we put the Web site on top of it. So the Web site runs off the API. It’s a little bit of a different interaction model; it doesn’t have to go through the authentication and whatever else, in the same way that external apps do.”

That API later gave NPR the freedom to build apps that run outside the browser, and that use that same API in different ways. So when mobile apps were invented, NPR was among the first publishers to be ready for them. When Netflix saw it needed an architecture that enabled it to reach all its users without it being dependent upon the usage model for any one device, including the Web browser, it hired Jacobson to build it.

A 2005 Netflix demo at a Microsoft convention featured one of that company’s program managers at the time, Darryn Dieken, showing then-President Jim Allchin the prospects of using one underlying technology as the foundation for developing a unified product line across different devices. The technology at the time was code-named “Avalon,” and evolved into what we now call Silverlight.

After showing how a Netflix product selector ran outside the browser but through the Web, in a way people had never seen before at that time, Dieken showed essentially the same selector running inside Windows Vista on a tablet PC. From there, he proceeded to show where else folks would eventually find Netflix.

The demo took the audience inside Windows Media Center, which had just been released for Windows XP and was being vastly updated for Vista. The Media Center plug-in used many of the same presentation techniques and concepts as the stand-alone version, demonstrating the benefits of code reuse.

But when the demo turned its attention to Netflix on a Windows Mobile phone, it became painfully obvious that the benefits of client-side code reuse could only go so far. Yes, there was communication taking place between all these different clients and the server. But the way these interactions were happening was based on leveraging Web site-oriented, forms-based submissions that at one level could be described as an API, but failed to be uniform – one API for many platforms.

The goal of any modern API, Dan Jacobson emphasizes, is “to treat any presentation layer the same. So if you have multiple Web sites, like NPR does (they have NPR Music as well as NPR.org), both of those sites run off of the same interaction model through the API. They’re just presentation layers, the same way as mobile app or Google TV or [NPR] Infinite Radio. Users are going to consume new material in any way that they want to, wherever, whenever; and your goal as publisher is to make sure that you have a presentation layer that serves them wherever that is. And in doing so, the easiest way, the most effective way to date is to leverage APIs, and invest a little bit on having the right talent surrounding it.”

“Publish Everywhere” Doesn’t Have to Be Homogenous

Because presentation layers are so different from one another, he goes on, a business can and should nurture teams of developers with the exclusive skillsets that each of those layers needs – for example, Objective-C developers for iPhone apps. There’s no reason why certain teams can’t specialize. Having a single API that addresses each layer in a standard way, he says, provides all your teams with the flexibility they require to take advantage of the platforms on which they’re focused.

This allowance for specialization tends to move away from the “one Web” way of thinking, the belief that everything will inevitably merge into HTML5. Arguing that API design should not be centered on any single mode of presentation, lest it eventually become obsolete (among other reasons), Jacobson advises that API designers focus on finding ways to symbolize and encode business interactions: the things that businesses do, not the things that Web sites do. Your goal is not to make the browser more efficient or the user experience more immersive. Leave that to the UX designers. As the API engineer, your goal is to enable business.

“That kind of thinking is fundamentally different than, ‘How do I want to structure my content? Do I need to think about what resources can be broken up in which ways and made available in different ways?’” says Jacobson. “For NPR, for example, there are stories, there are assets, different kinds of things in that system. For Netflix, there are users, catalog items. How do you want to structure that material, both in terms of the resource level as well as items underneath it? What are the rights management concerns that go into this, legal constraints internally about what can be published? For Netflix, what can I show users in Latin America that I can’t show to people in Canada? For NPR, it’s, I’m publishing AP photos; whom can’t I present that to, and whom can I? Those kinds of things are really business-oriented decisions that you can’t just flip a switch and say, ‘Make it happen.’ You need to be very thoughtful about what you’re exposing and to whom, and how you’re going to do it so you can get the maximum effectiveness out of it.”
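To make the rights question concrete, here is a minimal sketch of how an API might decide what to expose and to whom. Everything in it is an assumption made for illustration: the `CatalogItem` shape, the `allowed_regions` field, the region codes and the `visible_items` helper are hypothetical and are not drawn from the Netflix or NPR APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogItem:
    """A resource exposed by the API (hypothetical shape)."""
    title: str
    # Regions where the publisher holds the right to show this item.
    allowed_regions: frozenset = field(default_factory=frozenset)


def visible_items(catalog, region):
    """Return only the items that may be shown to a requester in `region`."""
    return [item for item in catalog if region in item.allowed_regions]


# Hypothetical catalog: the rights data travels with each resource.
catalog = [
    CatalogItem("Title A", frozenset({"CA", "BR"})),
    CatalogItem("Title B", frozenset({"BR"})),
    CatalogItem("Title C", frozenset({"CA"})),
]

print([item.title for item in visible_items(catalog, "CA")])
```

The point of the sketch is only that the rule lives in the API, next to the resource, so every presentation layer gets the same answer without re-implementing the business decision.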

It is this concept that may render obsolete the traditional notion of the Web site: the notion that something created once and published everywhere (COPE) must always be the same thing. Done properly, Jacobson says, what is published can and should be integrated with the uniqueness of each device.

“When Web APIs started out, they tended to be more about publishing on all kinds of different platforms. Now I think it’s very much about aggregation, and merging others’ API experiences,” says the Netflix engineer. “One of the interesting things with Netflix, for example: We have branded apps on a wide range of platforms, and if you look at something like AppleTV or Roku or Xbox, or any of these other devices, we’re not the only ones there. There is an aggregation of services where Netflix creates an experience on that platform. We actually integrate with their systems, we’re creating an experience on that site, and then people can access our experience in the way they expect it to be presented.”

APIs, Article, Engineering, Netflix, Technology | ReadWriteWeb

ReadWriteWeb : Netflix Engineer Daniel Jacobson: The API at the Root of Your Business

By daniel_jacobson, January 28, 2012 4:15 pm

This article was originally posted by Scott Fulton on ReadWriteWeb on January 28, 2012

The first place I had ever seen an API actually at work was as part of an operating system. It was a strange OS at that, a permutation of CP/M that used a graphical front end called GEM, which would later be ported to the Atari ST. The definition was explained to me like this: An “interface,” as everyone knows, is a specification for how electrical components interconnect. Well, now it’s possible for an application program – the part that does what users need – to interconnect with the operating system, which does what the computer needs. This way the operating functions don’t have to be built into every program, they can just be handed off to the OS and the connection will look seamless. The principle was called a layer of abstraction. It was 1984, and it was the first time I’d heard the term.

It would be wrong to call the concept “revolutionary,” unless you measure time in units of eons. Nearly three decades after its introduction, only recently have businesses come to realize how widely this architectural principle could be applied. No longer do complex processes have to be bound to precise, policy-intrinsic procedures. If teams can work independently, and computer resources can be devised to suit each team individually, then all that needs to be specified is the exchange of information between them.

So it is that a software designer ends up becoming one of the public faces of the ideal of API architecture as a business tool. Daniel Jacobson is the lead API engineer for Netflix – arguably the largest single consumer of bandwidth on the entire Internet. His O’Reilly book, APIs: A Strategy Guide, co-authored with Apigee CTO Greg Brail and research editor Dan Wood, deals with the implementation of APIs not so much for software’s own exclusive purposes as for realigning and renovating a business’s resources overall.

“APIs should not be geeky ‘science projects,’” reads the first paragraph of Chapter 4. “They are critical business tools. Successful APIs need clear objectives that relate directly to business objectives and track closely to key performance indicators for the business at large.”

More open on the inside

We’ve written here in ReadWriteWeb in the past about the value of APIs in providing transparency and accessibility to businesses, mainly through enabling them to develop mobile apps that connect more directly to their customers. Jacobson has a different perspective, which derives from his experience with Netflix, and earlier as the creator of the API for NPR. It was in 2002 that Jacobson and his NPR team made discoveries that he describes as part logic, part luck.

“At that time, a lot of publishers would be buying these CMSes, off-the-shelf products like Interwoven or Vignette,” Jacobson relates to RWW. “And the flexibility, and the opportunity for thinking in these [new] kinds of ways, was somewhat limited.”

Subsisting sometimes from month to month on public and government funding, NPR didn’t have the budget to go big and invest in a colossal, support-intensive CMS like Vignette – an investment which, at that time, often cost bigger businesses tens if not hundreds of thousands of dollars per year, after including maintenance costs. Faced with no other obvious option, NPR was forced to go it alone, building its own CMS. Recognizing the need to maximize efficiency, Jacobson and his colleagues decided that their system had to be designed from the beginning to be flexible enough to publish to any platform, including those that had not yet been created.

So NPR adopted a design philosophy called COPE: Create Once, Publish Everywhere.

“That was the really fortunate decision that we made… We didn’t think about iPhones and tablets, and things like that, in 2002. But we were thinking that we could imagine a case somewhere down the road where the Web site would need to change again, or we’re going to do another redesign… It was really important for us to have this COPE model, so we can actually capture all the metadata that’s important to us in a very modularized way so that, regardless of what the display is going to look like, we can publish to it very easily. So conceptually, we separated the idea of capturing the data from presenting the data.”
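A small sketch may help show what separating capture from presentation looks like in practice. It is an illustration under stated assumptions, not NPR's system: the `Story` fields and the three render functions are hypothetical names chosen for the example.

```python
import json
from dataclasses import dataclass, asdict
from html import escape


@dataclass
class Story:
    """Content captured once, as modular fields with no display markup."""
    title: str
    teaser: str
    body: str


def render_html(story):
    """One presentation: a Web page fragment."""
    return f"<h1>{escape(story.title)}</h1><p>{escape(story.body)}</p>"


def render_json(story):
    """Another presentation: structured data that an app lays out itself."""
    return json.dumps(asdict(story), sort_keys=True)


def render_headline(story):
    """A third presentation: title and teaser for a small screen."""
    return f"{story.title}: {story.teaser}"


# Created once...
story = Story(title="Example", teaser="A short teaser", body="Full text.")

# ...published to every display, including ones added later.
for render in (render_html, render_json, render_headline):
    print(render(story))
```

Because the captured record carries no layout, a redesign or a new device only means writing another render function; the stored content does not change.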

It was NPR’s first abstraction layer. But it was not yet an API, mainly because the CMS and the database were still tightly bound. To this day, businesses that invested in content management systems around the year 2000 are wrestling with the headaches of data portability, because their CMS is too tightly bound to its database, and the database has become a rusty, misbehaving vault.

The interface as publishing

It was 2007. While NPR had a system that could publish anywhere, the create-once part was giving it problems. The creation was becoming a frightful mess.

“It was that moment in 2007, I think, when we said, we’ll need another abstraction layer to separate out the direct access from the presentation layer to the database, even though we had conceptualized them as being different, that binding to the database was still there. That’s when we created this new abstraction layer of the API, and shortly after that, [we realized] we could open this thing up quickly.”
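The extra abstraction layer Jacobson describes can be sketched roughly as follows. This is a hedged illustration only: the `StoryAPI` class, the `stories` table and the `web_page` function are invented for the example, and an in-memory SQLite database stands in for whatever NPR actually used.

```python
import sqlite3


class StoryAPI:
    """The only component that knows the database schema (hypothetical)."""

    def __init__(self, connection):
        self._db = connection

    def get_story(self, story_id):
        """Return one story as a plain dict, or None if it does not exist."""
        row = self._db.execute(
            "SELECT id, title, body FROM stories WHERE id = ?", (story_id,)
        ).fetchone()
        if row is None:
            return None
        return {"id": row[0], "title": row[1], "body": row[2]}


def web_page(api, story_id):
    """A presentation layer: it calls the API and never issues SQL itself."""
    story = api.get_story(story_id)
    if story is None:
        return "Not found"
    return f"{story['title']}\n\n{story['body']}"


# Stand-in database with a single hypothetical story.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
db.execute("INSERT INTO stories VALUES (1, 'Example', 'Full text.')")

api = StoryAPI(db)
print(web_page(api, 1))
```

With the binding to the database confined to one class, the schema can change without touching any presentation layer, and the same interface is what could later be opened to outside consumers.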

The process of integrating the abstraction layer was entirely internal, and its goals were focused on how NPR could retool itself. But in making that change, the organization realized it could effectively publish the benefits of that abstraction in a way that was entirely in keeping with the goals of its COPE methodology. Dan Jacobson tells us that, in this phase of the project, he incorporated another important ethic, this time straight from the world of broadcasting: Know your audience. More specifically, build each component of the system in tune with the needs of its consumer.

Jacobson’s API project enabled NPR to publish stories and excerpts through its own cross-platform app, entitled “Infinite Radio.”

Jacobson’s book suggests that more businesses either nurture or hire someone who can serve as the technologist for their company, and make it this person’s job to know the audience – to understand how data is being consumed and who is doing the consuming. “And then understand what abstraction layer, like an API, needs to be put in place,” he explains, “to basically be the glue between the capturing of the data and the presentation to its users.”

One term Jacobson often borrows from the software development world and applies to the business world is context. He uses it to mean the breadth of a person’s influence in the company, and there are reasons that influence may be limited. But only through understanding the different contexts of business units, he feels, can a developer build an API that enables them to interoperate.

“Publishers are thinking about how can they create an organization that will put them in a position for this kind of rapid growth,” Dan Jacobson continues. “At Netflix now, we have several hundred devices running off our API. Many publishers of various kinds would love to have that kind of distribution. You need your technologists in a position to have the context and the trust of the superiors, and basically everybody on board with making smart decisions and allowing them to execute. The larger the company sometimes, the more bureaucracy there is, and the more they need to have these discussions. They’re basically, potentially, shackling their people… Here, you’re putting them in a position to make decisions for you.”

Fate has an interesting way of making itself appear coincidental. Had NPR not been so constrained by its own budget limitations, it might never have hired the team that designed its CMS and that implemented COPE in the first place. And it might still be bound by the same tight, complex information architecture that binds so many bigger commercial enterprises to this day.

“I think it’s the confluence of a range of things – the financial restrictions, having good people, good context, good control of the situation, and making smart decisions – and a little bit of luck,” says Jacobson. “We could have made some smart decisions at the time that weren’t quite as lucky down the road. We were very fortunate.”

APIs, Article, Engineering, Netflix, Technology | API, ReadWriteWeb
