Category: Netflix

The Next Web : The future of API design: The orchestration layer

comments Comments Off on The Next Web : The future of API design: The orchestration layer
By , January 18, 2014 9:24 pm

This article was originally published to The Next Web on December 17, 2013

The digital world is expanding at an amazing rate, giving us access to applications and content on myriad connected devices in your homes, offices, cars, pockets and on even on your body. The glue that allows all of this to happen, that connect the companies who provide these services to the devices that you use, is the API.

Because APIs have such a huge responsibility for so many people and companies, it is natural that API design is often one of the industry’s liveliest discussions, touching on a range of topics including resource modeling, payload format, how to version the system, and security.

While these are likely important areas to explore when designing virtually any API, the reality is that a much larger decision needs to be made first. That decision is based on a fundamental question: who are the primary audiences for this API and how can we optimize for those audiences?

This seems like an innocuous enough question, but don’t underestimate its importance or complexity in the growing world of APIs.

Years ago, this question was much simpler

At that time, many emerging APIs were being built as open or public APIs, targeting a large set of unknown developers (LSUDs) external to the providing company.

Because of the (hopefully) vast numbers of external developers using the API representing different use cases, the most sensible way to design the API for this audience is to have the providing API team design it in a very clean, concise, and resource-oriented way that closely represents the data model and/or features of its source(s).

In a previous post, I referred to these as OSFA APIs (or one-size-fits-all APIs). Allowing for such granularity in the modeling means that any developer who wishes to use the API can mix and match the elements in whatever way they choose to satisfy their application without further API team involvement.

The resource-model approach to designing an API can be very powerful, especially for this type of audience. The problem with this approach, however, is that the way that many companies use APIs today is different than described above. While many are still supporting the use cases of LSUDs, more are using their APIs to support a growing mobile or device strategy.

For some of these implementations, the engagement with the developers is different. The audience is a small set of known developers (SSKDs). They may be engineers down the hall from the API team, a contracted company hired to develop an iPhone app, or an engineering team in a partnering company. In all of these cases, however, the API team knows who these people are (at least in the abstract sense).

More importantly, however, the API team and the providing company care about the success of these implementations in a different way than they might care about the applications developed by the LSUDs. In fact, the success of the SSKDs may very well be paramount to the success of the business as a whole, a model that is becoming increasingly more pervasive.

Because of this change in audience and the deep interest in their success, there is great opportunity to change the API design.

For the SSKDs, having granular resource-based APIs that closely represent the data model works, but it just isn’t as optimal as it could be. This is especially the case when you consider the growing number of device types in the world and the fact that more and more companies’ business strategies are dependent on providing value to customers on such devices.

So, all it takes is a couple of devices with diverging needs and/or capabilities, each of great import to the company, for the resource-based API to start to show some warts. Making the API better, more optimized, for each of these target applications is the next logical, and most critical, step.

Enter the Orchestration Layer

An API Orchestration Layer (OL) is an abstraction layer that takes generically-modeled data elements and/or features and prepares them in a more specific way for a targeted developer or application

To address this opportunity, more companies are employing orchestration layers into their API infrastructure. While there are many ways in which to implement this architectural construct, the concept remains the same across all of them.

Below, I will describe a few of the more common patterns that I have seen (and/or been involved in implementing). But first, here are a few key principles that need to be considered when building an OL:

1. Most APIs are designed by the API provider with the goal of maintaining data model purity. When building an OL, be prepared to sometimes abandon purity in favor of optimizations and/or performance.

2. Many APIs are designed by API teams to make it easier for the API team to support. When building an OL, be prepared to potentially add complexity for the API team (or other teams, depending on the way it is implemented).

While this sounds undesirable, the goal here is to dramatically improve efficiency and/or simplicity for other people at some mild cost to the API team. Also keep in mind that such costs can potentially be programmed away over time.

3. It is important to understand the breadth of the audiences for the API.Depending on those constituents, you may only need the OL. In other cases, you may need the OSFA foundation in addition to the OL.

Here are a few examples of how some OLs have been approached:

Device-specific wrappers

This is the most common pattern that I have seen because most companies that are experiencing the distress referenced above already have APIs that they still use, continue to support, and invest in. The result is to continue to offer the granular resources as they always have, but to offer a wrapper tier on top of them – with new endpoints that are tailored to specific developers, devices or device clusters.

In this model, the API team will work more closely with, for example, the iPhone team to write a custom wrapper that handles specific requests and deliver specific payloads that are optimized for the iPhone app. In this model, most often the team to build the endpoints and the wrappers is the API team although that doesn’t have to be the case.

Query-based APIs

In this model, the API team is putting the power in the hands of the requesting developer, although that power is limited. The goal here is to create a more flexible way in which the requester can make requests and tailor payloads without putting additional ongoing burden on the API team, as could be the case with the Device-Specific Wrappers.

This is achieved by breaking down the resource-based APIs and allowing them to be queried against like a database through flexible parameters and payloads that can contract, expand and possibly morph based on what is needed. The benefit here is that once the query language is set, the API team does not need to keep writing wrappers as new implementations are needed for different devices.

The detriment, however, is that the query-based API is still a set of rules to which the developer needs to adhere, although these rules are much more flexible than the resource-based API model.

Experienced-based APIs

resource v experience apis The future of API design: The orchestration layer

This is the model that Netflix has implemented, which in some ways is a blend of the two above. In this model, we basically have device-specific wrappers but they are designed, implemented and owned by the device teams.

A key concept here is that we have put the API team in the position of gathering the data in a generic, reusable way while putting the device teams in the position of owning the data formatting and delivery. After all, the formatting needs evolve in concert with the UI changes so putting that effort in the hands of those closest to the changes eliminates additional steps.

(For more details on how this system operates, see the links at the bottom of the post.)

As I noted, the range of implementations is potentially much more diverse than these three, although these are some of the most consistent and interesting patterns that I have seen. Regardless of how this is achieved, however, the key is for the API team to stop supporting the API as a service that is designed independent of those SSKDs who consume it.

Rather, the API team needs to view the SSKDs as partners in the design with an interest in making the products as great as possible so the end-users can get the best experience possible. The API team has the opportunity to build services that help developers to be better at developing by focusing on optimizing for the developers’ needs rather than how to optimize the time spent supporting the API.

Given the opportunity ahead with the potential number and diversity of connected devices, the effort to provide such optimizations is a small price to pay for the massive upside.

Netflix Does Not Have an Unlimited Vacation Policy

comments Comments Off on Netflix Does Not Have an Unlimited Vacation Policy
By , October 28, 2013 1:02 pm

I am finding it increasingly common, when talking about working at Netflix, for people to tell me that their companies are embracing an unlimited vacation policy similar to the one Netflix has been employing for years. When I hear these kinds of statements, my first thought is that these programs are not likely to have the desired outcome. I think that because Netflix does not have an unlimited vacation policy. Rather, Netflix has a culture of “Freedom and Responsibility” (F&R), which is discussed extensively in our publicly available culture slides.

The culture of F&R extends much further than its application to vacations, sick days, and holidays. F&R is a mindset on how we treat each other and our jobs on a day-to-day basis. It is a simple mantra that basically states that we are all adults, so let’s all behave and treat each other like adults. And if a person is not consistently behaving like an adult, they should not work at Netflix.

So, what does it mean to “behave like an adult”? At the core, Netflix seeks out senior-level people who are seasoned, both in their primary competency as well as in how to work effectively with others. We expect people to live up to the values that we discuss in the culture slides, to be people on which we would want to trust the future of our business. In fact, we are trusting the future of the business on each and every person who works with us. And that is the central point of it all… trust. If we cannot trust you, you should not be working at Netflix. Period.

This kind of trust takes many shapes. One example is that there are no strict silos in our server access rights. Virtually every engineer at the company can easily get root-level access to virtually any system in the production environment. Another manifestation is the fact that virtually everyone in the company knows key strategic initiatives that we are focused on well before these initiatives go public. As a result, everyone in the company is subject to trading windows for our stock options. That level of openness, at a company the size of Netflix, is exceedingly rare. And we can do all of this because we hire (and fire), essentially, for that deep level of trust.

With respect to vacation, that is just another manifestation of our trust as it pertains to F&R. We trust our employees and peers to take care of what they are responsible for. As long as we are being responsible (ie. getting our part of the project done, communicating our situation to others, etc.), then we don’t have much interest or concern around how others budget the rest of their time. We are free to do as we please because we can be trusted to do what we are supposed to do. As a result, tracking people’s time for vacation, sick, holidays, etc. just does not make sense. Track the results of the output, not the amount of time that it takes to execute it (or even which hours of the day were used to complete it).

On a related note, for the companies that do track hours work and audit leave days, many of them do not also track late night hours working on production deployments, site outages, etc. If you are going to track one, you must track the other. But it really makes no sense to track either. Companies trust these people to deploy and maintain their entire digital presence at midnight on a Saturday, but they won’t trust these same people to be responsible with their own time. That makes no sense!

At the core, Netflix does not have an “unlimited vacation” policy, we have a trust policy. If we trust you, then we will trust you and will give you great liberty to accomplish amazing things. If not, we will part ways.

Presentation : The Evolution of Your API and Its Value – Daniel Jacobson, Netflix

comments Comments Off on Presentation : The Evolution of Your API and Its Value – Daniel Jacobson, Netflix
By , October 18, 2013 1:04 pm

This full presentation was at the Mashery Business of APIs Conference in 2013.

Daniel Jacobson, Director of API Engineering at Netflix, discusses the evolution of the company’s API program using Simon Sinek’s ‘Start With Why’ framework. Jacobson stresses the importance of designing an API that answers the company’s philosophical needs before deciding how to best expose the platform to internal or external developers. Jacobson further explains why APIs are critical to the success of developer communities, business partnerships, internal development efficiencies and device proliferation.

Key Points:

  • Ensure system reliability and resiliency by scaling the API with the business
  • The advantage of optimizing systems for rapid innovation and improved product experience
  • Why it may be necessary to consider hardware constraints when designing an API

“We’re focusing on our streaming application — what Netflix is trying to do — and then we’re implementing a tactic, which happens to be an API, to accomplish that.”

The following is the video of the full presentation:

And here are the slides from this presentation:

The following is a short clip of the presentation that Mashery published, highlighting APIs as a Tactic to Advance Business Strategy.

Presentation : Scaling the Netflix API – From Atlassian Dev Den

comments Comments Off on Presentation : Scaling the Netflix API – From Atlassian Dev Den
By , July 16, 2013 3:14 pm

This presentation was originally given at the Atlassian Dev Den in San Franscisco on July 16, 2013

The term “scale” for engineering often is used to discuss systems and their ability to grow with the needs of its users. This is clearly an important aspect of scaling, but there are many other areas in which an engineering organization needs to scale to be successful in the long term. This presentation discusses some of those other areas and details how Netflix (and specifically the API team) addresses them.

Presentation : Netflix API: Keynote at Disney Tech Conference

comments Comments Off on Presentation : Netflix API: Keynote at Disney Tech Conference
By , June 8, 2013 6:15 pm

In June 2013, Disney hosted what I believe was their first internal tech conference, to be hosted in Orlando at Disney World. They asked me to be a keynote speaker at this conference. Below are the slides from that keynote.

This presentation discusses the evolution of Netflix’s API architecture over time. It begins by explaining Netflix’s initial “one-size-fits-all” REST API model and how requests grew exponentially as more devices were supported. This led Netflix to move away from resource-based APIs to experience-based APIs tailored for specific devices and use cases. Key aspects of the new design included reducing “chattiness”, handling variability across devices, and supporting innovation at different rates. The document also discusses Netflix’s approach to versioning APIs and making them more resilient through techniques like circuit breakers and fallbacks.

Panorama Theme by Themocracy