Category: Interview

Interview : Daniel Jacobson on Ephemeral APIs and Continuous Innovation at Netflix

November 17, 2015 4:04 pm

This interview with Jerome Louvel originally appeared on InfoQ on November 17, 2015

Following his talk at the recent “I Love APIs” conference, InfoQ had the opportunity to interview Daniel Jacobson about ephemeral APIs, their relationship to experience-based APIs and when to consider them in your organization.

Daniel leads development of critical systems that are the front door of Netflix, servicing 1,000+ different device types and billions of requests per day. He also manages the Netflix playback experience which accounts for approximately one-third of Internet downstream traffic in North America during peak hours.

InfoQ: What is your current role at Netflix and your day-to-day responsibilities?

Daniel Jacobson: I run the edge engineering team which is responsible for handling all traffic for all devices around the world for signup, discovery and playback. On the playback side we are responsible for the functionality that supports the playback experience. The API side is responsible for handling the traffic directly from devices, fetching data from a broad set of mid-tier data services and then we broker the data back. Both teams are critical to the success of Netflix because nobody can stream if playback is not available and nobody can stream if the API is not available.

InfoQ: Can you explain what Ephemeral APIs are all about and how different they are from the Experience APIs that you have proposed before?

DJ: Experience APIs are trying to handle an optimized response for a given requesting agent. That’s orthogonal to the ephemeral APIs. The experience API is more about the requesting pattern and the payload. Ephemeral APIs are more about the process of iterating and evolving the experience APIs.

Traditionally, APIs get set up to make it easier for the API provider to support, which results in one-size-fits-all APIs. The problem with that approach is that the API ends up being harder to use for a wide array of consumers. In other words, the optimization in that model is to make things easier for the few but harder for the many. For experience APIs, the goal is to focus on the needs of the individual requesters and optimize the APIs for each of them. It means that you are essentially running a wide array of different APIs. This results in a more challenging environment for the API provider to support because the variability is higher, but it allows the API consumers to develop what is best for them and for the performance of their clients. Ultimately, this should translate into a better customer experience.

Ephemerality is part of our story in how we develop our APIs, but not essential for the experience API model. Ephemeral APIs mean that the endpoints and payloads should be able to be terminated and created with ease and flexibility with the expectation that this can happen at any moment and potentially very frequently. If we can support ephemerality, then we can innovate faster and continuously to support the product needs without being a bottleneck.

To give an example, if we are running an A/B test to evaluate a new feature in our SmartTV experience, the UI team working on that feature can iterate on the client code and the APIs without the API team’s involvement. As they develop the test, they may realize that the data needs change or can be optimized, which would result in them killing the endpoints and creating new ones. This can happen dozens of times over the course of the project without the API team getting involved (as long as all of the data elements already exist in the pipeline).
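
To make the idea concrete, here is a minimal, hypothetical Java sketch of what such a device-specific, disposable endpoint might look like. The names (CatalogService, SmartTvHomeRowsEndpoint) are illustrative assumptions, not Netflix's actual code; in practice Netflix's adapters are dynamically loaded scripts, but the shape is the same: a thin layer the UI team owns, composing base Java API calls into exactly the payload its test needs, and which can be deleted when the test ends.

```java
// Illustrative sketch only: an ephemeral, device-specific endpoint created for
// an A/B test. CatalogService and SmartTvHomeRowsEndpoint are hypothetical names.
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

interface CatalogService {                        // base Java API: method calls inside the JVM
    List<Map<String, Object>> topTitlesFor(String profileId, int count);
}

class SmartTvHomeRowsEndpoint {                   // adapter owned by the SmartTV UI team
    private final CatalogService catalog;

    SmartTvHomeRowsEndpoint(CatalogService catalog) {
        this.catalog = catalog;
    }

    /** Returns only the fields this test's UI needs, in the shape it wants. */
    List<Map<String, Object>> handle(String profileId) {
        return catalog.topTitlesFor(profileId, 10).stream()
                .map(t -> Map.of(
                        "id", t.get("id"),
                        "title", t.get("title"),
                        "boxArtUrl", t.get("boxArt")))    // trimmed, device-optimized payload
                .collect(Collectors.toList());
    }
}
```

If the test shows the data needs have changed, the team can delete this class and register a new one without touching the underlying CatalogService.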

InfoQ: What is the best way to find the right granularity for experience-based APIs? Is it mostly based on the device capabilities or on team organization?

DJ: I’ve written a detailed blog post on this topic in the past, which includes the recipe for when experience-based APIs might be a good choice. Basically, it is likely many companies don’t need to go this route because it’s a scale question.

So, if you have a wide array of different interaction models that are diverging and a close relationship with those who are consuming the APIs, those are good indicators that you might want to optimize for this. The proximity to the consumer of the API is key because you have a tighter feedback loop and more understanding of what their individual needs are.

The difference with generic resource-based APIs is that you don’t know who is going to consume the APIs and how they will be consumed. If the consumers are in your organization, and if you understand those nuances, you can create an architecture that is optimized for them all.

Within Netflix, we have created the architecture as a set of Java APIs and all these different device teams can build their own experience-based web APIs that are optimized for their clients. We like to call our system a platform for API development, more than a traditional API.

InfoQ: Do you have a separate API for Netflix mobile app on Android and on iOS?

DJ: In the construct of the platform, we have base Java APIs that are method calls within a JVM. Then, we have an adapter layer that sits on top of that where web APIs can be developed in a device-specific way. So, we have mobile teams developing their corresponding adapters; those involve different endpoints, request patterns, payloads and maybe different protocols.

There used to be more overlap between iOS and Android, but now these experiences are indeed different. There are shared functions across all of this so we built a set of tools to allow for the shareability.
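
As a rough illustration of that divergence plus sharing, here is a hedged Java sketch. The adapter and helper names are hypothetical; the point is simply that two device adapters can shape their payloads differently while reusing a common function.

```java
// Minimal sketch, not Netflix code: two divergent device adapters sharing one helper.
import java.util.Map;

final class SharedFormatting {
    /** A shared function that several device adapters can reuse. */
    static String runtimeLabel(int runtimeSeconds) {
        return (runtimeSeconds / 60) + " min";
    }
}

final class IosAdapter {
    /** iOS experience: nested "display" object, shaped for that UI. */
    Map<String, Object> title(String name, int runtimeSeconds) {
        return Map.of("display", Map.of(
                "name", name,
                "runtime", SharedFormatting.runtimeLabel(runtimeSeconds)));
    }
}

final class AndroidAdapter {
    /** Android experience: flat fields, shaped differently for that UI. */
    Map<String, Object> title(String name, int runtimeSeconds) {
        return Map.of(
                "name", name,
                "runtimeLabel", SharedFormatting.runtimeLabel(runtimeSeconds));
    }
}
```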

InfoQ: Do you rely on an API language to describe Netflix APIs?

DJ: Not at this point. This is something we discuss periodically, but have not pursued yet because of the challenges and costs in maintaining them. Most of the time, if you have language descriptors it means that you are trying to fix things in place, make them consistent for the API consumers. Because our web APIs are ephemeral, the descriptor would also need to be ephemeral, so using one would cost more and not be as helpful.

But another thing is you have many teams building these web APIs with different needs and those teams are iterating on their consumption of the web APIs. This iteration is happening continuously because we are always running A/B tests that require changes to the data being delivered. As the teams iterate, the same person or group is writing and consuming the web API and they are doing the development of both at the same time, which means they already know the nature of the interface, so there is no value.

Most of the discussion about description languages has been at the Java API level, but again, those APIs are changing frequently as well. If we can find a way to describe those APIs consistently at very low cost, we would like to add that to the system, but so far it seems as though the costs of maintenance exceed the benefit.

InfoQ: Do you rely on API tooling to accelerate the development of APIs by device teams?

DJ: We have developed a suite of tools to allow people to manage, deploy, and view the health of their API scripts, and to determine which endpoints are active and which are not. We also have tools to support shareability of code around these scripts and we have tools to inspect the payloads. Also, there are tools that we still need to develop. For example, the difficulty in this world is debuggability and we need to improve in this area.

InfoQ: How does your move to Universal JavaScript for your main web site fit into the experience-based API platform?

DJ: The architecture and API for the web site team is different than most devices because they have a separate tier fronting their API calls. For typical devices, they call directly into the web API but for the web site, they call into their own cluster where they handle the traffic directly and then call into our API cluster to get the data. What’s happening in their cluster and above it is currently outside our view but they are still writing scripts in our adapter layer.

What’s interesting is that we are investigating now if we should apply similar constructs across the breadth of devices or some subsets, and evaluating the cost of doing this more broadly. Some things that we might gain in this approach would be process isolation and an easier path towards debuggability.

InfoQ: What is the place of Groovy and other scripting languages in the Netflix API platform?

DJ: Groovy is the only language in our API environment that people are writing adapter scripts with, but we are looking at other languages. The next one is likely going to be Node.js. Going to another JVM language would be easier, but there hasn’t been enough interest so far. If device teams want to use Scala or other languages, we would need to do more investigation and work to make it happen.

Node.js is not going to run integrated in the JVM, so there is an additional benefit to isolating it into another layer, like we’ve done for the main web site.

InfoQ: How were the device teams able to adapt to such changes in their development flows?

DJ: The cultural change to the company was a lot harder than the technology changes. Even with teams willing to go this route, there were some challenges in getting people to think and operate differently in the new environment. For example, it took some time for them to adapt to writing Groovy and to the functional programming paradigm. But looking back it is definitely a net win.

InfoQ: In your talk, you mentioned an ongoing project to introduce containers at the API adapter layer. Will that effort have impact on the Nicobar open source project?

DJ: As we are investigating containers for the web site layer, we are thinking about how it could be applied to other devices as well. For the container-based model, Nicobar would not be a central player for us. In fact, when we designed Nicobar and the scriptability, it was in part to deploy the scripts in an isolated way. Containers take our original intent to the next level and obviate the need for Nicobar. That said, our system will continue to support the scripting and Nicobar for years to come, so we expect to continue to develop and evolve Nicobar for a while. As Nicobar evolves, it is likely that such changes will be made in the open source project as well.

InfoQ: The Netflix Falcor open source project was announced in August and its usage on Android recently explained. What does it offer and how does it relate to your broader API platform?

DJ: It helps us represent remote data sources as a single domain model through a virtual JSON graph. You code the same way no matter where the data is, whether in memory on the client or over the network on the server. Falcor also handles network communications between devices and servers, and can batch and deduplicate requests to make them more efficient.

Because Falcor is a more efficient data fetching mechanism between devices and servers, it’s going to continue to play a significant role in our platform even as our system evolves into a different architecture.

The main benefits we get out of Falcor are developer efficiency and improved application performance. We get the developer efficiency because the access patterns for the engineers writing the adapters are more consistent. That said, there is a steeper learning curve to use Falcor and it is a more challenging environment to debug.

InfoQ: What are the limitations that you found with AWS Auto Scaling Groups and how does Netflix Scryer help? Will it become open source?

DJ: AWS autoscaling is used widely at Netflix. It’s very useful and powerful. Amazon is responding to metrics like load average, determining that it’s time to add new servers when those metrics pass a certain threshold. Meanwhile, it can take 10 to 20 minutes to bring a new set of servers online. A lot of bad things can happen in a matter of minutes, so that adds risk to our availability profile.

That’s what prompted us to develop Scryer. What Scryer does is look at historical data, incorporate a feedback loop of real-time data, evaluate what the capacity needs will be in the near future, and then add servers in advance of that need. What we see is that response times and latencies are much more level with Scryer because load averages are not spiking and because the cluster can handle the traffic more effectively.
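
Scryer itself is not open source and its actual algorithms are more sophisticated, but the core idea described above can be sketched roughly as follows. Everything here, the names, the per-server capacity, the averaging of past weeks, is an assumption for illustration only: predict the load for an upcoming window from historical traffic in the same window, correct that prediction with a real-time feedback signal, and request servers before the 10-to-20-minute boot lag would hurt you.

```java
// Illustrative sketch of predictive scaling in the spirit of Scryer; not Netflix code.
import java.util.List;

final class PredictiveScaler {
    private static final int REQUESTS_PER_SERVER_PER_MIN = 1_000; // assumed capacity

    /**
     * Decide how many servers to request now so they are online when the
     * predicted load arrives (new servers take roughly 10-20 minutes to boot).
     *
     * @param sameWindowPastWeeks requests/min seen in this time window in past weeks
     * @param currentRpm          requests/min right now (real-time feedback signal)
     * @param predictedRpmNow     what the historical model predicted for right now
     */
    static int desiredServers(List<Double> sameWindowPastWeeks,
                              double currentRpm,
                              double predictedRpmNow) {
        double historicalForecast = sameWindowPastWeeks.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(currentRpm);

        // Feedback loop: if today is running hotter than the model expected,
        // scale the forecast up proportionally (and vice versa).
        double feedback = predictedRpmNow > 0 ? currentRpm / predictedRpmNow : 1.0;
        double forecast = historicalForecast * feedback;

        return (int) Math.ceil(forecast / REQUESTS_PER_SERVER_PER_MIN);
    }
}
```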

While we announced it via a blog post a couple of years ago, there is no plan right now to open source it.

InfoQ: Netflix Engineering is well known for its Chaos Monkey service. Can you tell more about other services that are part of your Simian Army?

DJ: There is a suite of monkeys that do different things. Here are some of these services:

  • Latency Monkey has various degrees of utility and was designed to inject errors and latencies into a service to see how the failure would cascade. That has since evolved into FIT (Failure Injection Testing).
  • Chaos Gorilla is similar to Chaos Monkey but instead of killing individual instances, it is killing AWS availability zones. The idea here is to test high availability across zones by redirecting traffic from a failed zone to a healthy one.
  • Conformity Monkey and Security Monkey make sure that builds conform to certain operational and security guidelines and shut down those that are not conforming.
  • Janitor Monkey cleans up unhealthy or dead instances.
  • Chaos Kong is a recent addition to the army, which simulates an outage in an entire AWS region and pushes traffic to a different region.

InfoQ: Over the years, Netflix has launched many open source projects. What is the best way to know what is available and actively maintained, to take advantage of these contributions?

DJ: As our OSS strategy has evolved, we’ve released around 60 projects in total across a diverse set of categories including UI, cloud and tools. Some of them are more actively managed than others and we try to partition them in our developer website. Supporting the APIs directly, there are a range of tools including Zuul, Nicobar, Hystrix and RxJava.

InfoQ: Should a company new to APIs start with a one-size-fits-all API approach and progressively evolve like Netflix did, or start immediately with finer-grained ephemeral experience APIs?

DJ: If you are brand new to APIs, start with OSFA (one size fits all). There is a question of whether you will ever get to the scale needs that Netflix has. Experience APIs are more of a challenge. I believe that ephemerality should be part of the mindset of each company, regardless.

Going the experience-based API route is a function of opportunity and cost. You are adding more overall cost, but the efficiency and the optimization gains might be worth it. If you only have a few devices, a very small development team, or a wide range of external parties that consume your APIs, the cost of operating this more variable environment would likely not be recovered.

You really need to have a tipping point where the development efficiency of the API consumers is hindered by the fact that they are fighting against the rigid API. In other words, if you have different device teams that have to make inefficient API calls that are different from each other, and they have to compensate by doing additional parsing, error handling, etc., then the cost of all of that added energy can potentially be offset by creating an optimized interaction model. This benefit is only worth it if you have enough developers doing these inefficient activities.

InfoQ: In addition to developer efficiency, are there other benefits that you might be looking for with Experience APIs?

DJ: With an optimized set of APIs, you are building a solution to provide a better experience for the customer, such as improved system performance and improved velocity in getting changes into the product.

If you want to have this kind of ephemerality and optimization, you can’t set it up for public APIs. The experience APIs are excellent tactics but are geared towards private APIs because having a close relationship with a small set of developers allows you to have much more latitude in solving the needs of the API consumers.

InfoQ: What excites you the most right now about the API space?

DJ: We are most excited about things like containers, streaming data, HTTP 2.0, WebSockets and persistent connections, and the tooling and analytics behind supporting a massive-scale API. So we are investigating those kinds of things and experimenting.

Other things are emerging in this space like microservices, continuous integration, continuous deployment, and we are already doing them. At Netflix, we have a distributed architecture with specific functions for each microservice. But successful microservices inevitably grow in scope, potentially causing them to become more of a monolith over time. At that point, it makes sense to start breaking things down again.

InfoQ: Finally, how does continuous deployment relate to ephemeral APIs?

DJ: I often describe my team as being the skinny part of the hourglass that’s pushing data back and forth between the two fat parts. One of the fat parts is all of the API consumers, the UI and device teams. The other fat part is all the distributed server-side microservices. Both of the fat parts are constantly changing (A/B testing, new features, new devices, etc.).

As those change, we need to ensure that data is flowing through the skinny part to support the product and any test that is being performed on the product. We need to change at a faster rate than the rest of the company because we need to handle the changes that many other teams make.

Several years ago we decided the only way to do this was to develop a fully automated deployment pipeline. From a continuous deployment perspective, it was important for us to be able to deploy rapidly, frequently, at low risk and with the ability to quickly roll back. The goal behind all of that is that we should not be the bottleneck to getting product change to the customer.

Like other things my team does, continuous deployment is a means to an end. And the end is continuous innovation. Having an environment that can rapidly and constantly change to meet the needs of the business and the customer ties back to our ephemerality mindset.

Interview : Insert Content Here Podcast, Episode 11: NPR’s COPE and Content APIs

March 15, 2013 7:39 pm

This podcast episode was published on Insert Content Here with Jeff Eaton on March 15, 2013

Daniel Jacobson recaps the history of NPR’s COPE, his work at Netflix, and the future of content APIs.

Host: Jeff Eaton

Guest: Daniel Jacobson, Netflix

Mentioned in this episode:

This episode’s transcript is in progress; the partial transcript below has been auto-generated by Otter.ai.

Jeff Eaton: Hi, I’m Jeff Eaton, senior architect at Lullabot, and your host for Insert Content Here. Today I’m here with Daniel Jacobson, the illustrious Daniel Jacobson. He’s not necessarily someone you may have seen in content circles or nerding out in the CMS world that much, but he’s actually one of the movers and shakers.

His role for quite a few years was Director of Application Development at NPR, where he oversaw the development of COPE – the Create Once Publish Everywhere infrastructure that made all kinds of waves in the content strategy and CMS world. He’s the author of O’Reilly and Associates’ APIs: A Strategy Guide. And now, he’s the director of engineering for the Netflix API. So, he is just all over the place when it comes to content, APIs, and reusable structured content. It’s a pleasure to have you here, Daniel, thank you for joining us.

Daniel Jacobson: Thanks for having me, Jeff. Thanks for that very nice intro.

Jeff Eaton: Well, you know, it’s funny, I was actually reading your blog a couple of weeks ago. And you talked about a presentation that you had done at a conference on the architecture, and you said, “wow”. You know, a bunch of people in the room knew about it and, you know, they were familiar with it.

And it seems like you were almost a little startled by that. Did you expect the work that you were doing on that project to become such a big discussion topic?

Daniel Jacobson: You know, I really didn’t. We had been working — this goes back to the time of NPR — we’d been working on these concepts and implementation since 2002.

I can give you the history on that if you’re interested, but we were just doing what we thought was right. You know, thinking about architecture, thinking about longevity of application development and knowing that there are going to be a host of things that we’re not going to be aware of down the road, so trying to prepare for nimbleness.

So then in 2009, that’s when I first published that blog post on ProgrammableWeb about COPE. I was just taking this step of, “okay, we’ve done all this stuff, in the spirit of sharing, let’s share it”. And it was interesting to see how some people latched onto it.

But what really has surprised me is we’re in 2013 and people are still talking about it. It’s mind-boggling for me. And if you look at the blog post, over half of the comments were from 2012; I don’t know, 70 or 80 comments. So what’s really kind of staggering for me is the fact that we published it in 2009.

We’ve been thinking of it for a number of years and implementing it for that time. But it’s still, or it is now resonating. It’s very weird.

Jeff Eaton: Well, I’ll back up for a second for anybody who might not be familiar: what is COPE? You know, what’s the idea behind it?

Daniel Jacobson: Okay. I’ll give you a little bit of a history on how we arrived at COPE and some of the sensibilities that went into it. COPE stands for Create Once Publish Everywhere. And the idea there is, you want to maximize the leverage of content creation and minimize the effort to distribute. The way that the whole thing really started goes back a little deeper. It was actually in 2002 when we had a series of different ways of publishing content to NPR. We had HTML files. We had a Cold Fusion system with a SQL Server database that published very thin radio pieces to the web. Then I had built this other system, a Java-based system with an Informix database, to capture a little bit richer content and offer an opportunity for co-branding with local stations. In 2002, we looked at this and said, well, this is kind of a mess. We’re spending all this energy publishing in three different ways and we have three different presentation layers. What we really need to do is to collapse them into a system that really leverages our editorial time and gives us an opportunity to distribute to all these different places.

So interestingly, part of the story is that I pitched the idea with my boss at the time, Robert Holt, who’s now at Microsoft. We pitched the idea to actually spend a bunch of time collapsing these systems and building an awesome system that takes care of all of this. At the time the VP, her name was MJ Bear, wanted us to do due diligence and explore all these different avenues, content management systems like TeamSite and Vignette, which are all quite expensive. So we did all that due diligence but I wasn’t convinced that that was the right route. And in fact, I knew we could do a better job by building it in house. The inflection point was that she quit. She left the job and I basically said, “Great, let’s go build it. There’s no impediment anymore; before things settle, let’s get this built before the next VP.” And so we did that. We hunkered down and that’s where we really started thinking about the philosophies of “separate content from display” and “content modularity” and things like that.

So this was back in 2002 and it was partially driven by the idea that everything we’re doing here of collapsing these data systems, we’re also doing a redesign of the presentation layer. If we’re doing that, it’s highly likely that in the future, we’re going to have another presentation layer, either a new one to replace the old one or an additional one. And it’s almost like this keeps happening. It’s cyclical.

We need something great, so let’s throw out the old. And so we said, “all right, this presentation layer is going to go away too, so we really need a decoupling.” And that’s where a lot of these COPE philosophies started to soak in. And actually we launched that CMS in November 2002 and it’s still the centerpiece of NPR today.

Jeff Eaton: Holy cow.

Daniel Jacobson: Yeah, the CMS has lasted 10 or 11 years and we see that as a kind of success. So we take a lot of pride in that.

Jeff Eaton: It seems like that decoupling you talked about, from a pure software development and architecture standpoint, feels like the right thing to do. But you mentioned the idea of teasing apart the core business assets, the content that gets created and managed over time, from the changing, sort of ephemeral presentation. Teasing those things apart feels like it makes good business sense too. It’s not just about architectural purity. It’s a way to make sure that you don’t have to dig up the foundation every time the house needs to get painted.

Daniel Jacobson: Absolutely. And I will say it’s not without some costs because there were certainly some cultural battles that went into those discussions. An example that I can offer is when we were doing this design, what we intentionally said is that the story is the atom of the system. That was another philosophy of what we were doing. And so the story is the atom at the center of the NPR system universe. And we basically said the story is super thin. Generally speaking at NPR, people think of the story as being a radio piece. And we said, “no, radio pieces are not a necessary component of a story; they’re an enhancing and enriching part.” But so are images. So is text, so are pull quotes. So is whatever else you want to imagine that you want to put in there, but they’re not necessary. The only parts that are necessary are basically the title, a unique ID, a date. And I think we said a teaser. That’s it. That’s all you need for a story.

Jeff Eaton: Interesting.

Daniel Jacobson: And from there, there’s a hierarchical data model. Tell me if I’m getting too geeky.

Jeff Eaton: Oh, this is the kind of stuff I love.

Daniel Jacobson: So we had a hierarchical data model where we basically had a story own what we called a “resource”, and a resource was any enhancing component of a story. And a resource is generic. And then the sub-hierarchy underneath that were the particulars, which would be text, images, videos, external links, internal references…

Jeff Eaton: transcripts, I think just went live a little while ago.

Daniel Jacobson: Transcripts are another one for sure. Those were all enhancing, enriching components, but not necessary to have a story on the site. And I know that you and others on this podcast have talked about blobs and chunks. So we tried to be very surgical with this idea: we wanted to have everything be its own entity. We don’t want these blobs of data. In fact, every paragraph in the text is stored separately in its own distinct field. We really thought about how we can manage this stuff in modular ways so, as you were saying, we’re teasing it out so that our business can gain value down the road.
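
As a minimal sketch of the model being described, not NPR's actual schema, the story "atom" carries only the required fields, and everything else (audio, images, text paragraphs, pull quotes) hangs off it as optional, modular resources. All class and field names here are illustrative assumptions.

```java
// Illustrative data model: a thin Story atom plus modular, optional Resources.
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

abstract class Resource {}                                                // generic enhancing component

final class Audio extends Resource { String mp3Url; int durationSeconds; }
final class Image extends Resource { String url; String caption; }
final class TextParagraph extends Resource { int order; String text; }   // one entity per paragraph
final class PullQuote extends Resource { String quote; }

final class Story {
    // The only required fields: unique ID, title, date, and a teaser.
    final long id;
    final String title;
    final LocalDate date;
    final String teaser;

    // Everything else is optional and modular.
    final List<Resource> resources = new ArrayList<>();

    Story(long id, String title, LocalDate date, String teaser) {
        this.id = id;
        this.title = title;
        this.date = date;
        this.teaser = teaser;
    }
}
```

In this shape, the lists discussed below would simply be another aggregation of story references rather than a special "program" concept.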

And the controversial part is, especially in 2002, NPR is a journalism company focused on broadcast. Saying that the radio or the audio piece is not essential was pretty controversial. It took a little massaging inside to get people to understand the power of this model. And it took actually a number of years to really get everybody on board.

Jeff Eaton: I would imagine that some of the decisions that you made earlier, like the story being central even if every single story had a radio component for the first several years, put a stake in the ground just in the nomenclature that you were able to get in there.

Daniel Jacobson: Sure. You know, we really talked about stories and lists. Those were the two core concepts, and you start talking about it and eventually that becomes part of the vernacular and part of the culture. And really, every story can belong to any number of lists, any number of types of lists. Lists are really just aggregating mechanisms. And that cut against the culture as well at NPR, because you think about Morning Edition for March 13th, you think there’s a rundown, there are 15 stories for that day. That’s really just another list. It’s not really a program. It happens to be that this list’s name is Morning Edition…

Jeff Eaton: It’s a particular branded aggregation of stories, not necessarily this thing that is the primary means that people approach and find content, right?

Daniel Jacobson: It’s not a radio program anymore. It’s a list, and you know, you can identify it by titles. But you gotta think about it more generically, and that’s when we started introducing a broader set of topics, such as “News”, “Music”, and things like that. And it really created tremendous opportunity for NPR, actually.

Jeff Eaton: There’ve been a couple of things that have been written about the actual long-term impact of it and what it’s allowed NPR to do in terms of turning around new products that are targeted at emerging devices or platforms without having to go through the same sort of profound, painful rearchitecting of everything that a lot of other companies are having to do. Do you think that this set up NPR for taking advantage of mobile, because of all the work that had already been put in place?

Daniel Jacobson: Unquestionably, yes. Following the story of the content management system creation, I’ve talked a bunch about not building a web publishing tool. So, we felt like we were building a content management system and we thought about it in terms of having the content management editing tool be just another presentation layer. And you can have any number of presentation layers. And so that actually gave a lot of freedom. If the editing tool is just one presentation layer and the website is one, we can have any number of these layers. We can just tack on new presentation layers, some of which have write capabilities and some don’t.

So as that started to mature a little bit back in 2004, RSS started to emerge. It was easy just to spin up another PHP file and have it render RSS feeds; you pass in parameters and it’s going to yield a topic or a program. And then shortly after that, podcasting started to emerge and we could just easily float an MP3 file in the RSS feeds.

And it was about 2007 when the NPR Music site launched. We started to see the fact that we had this single point of failure, an Oracle database, which we were growing out of. We actually needed a different data redundancy model. We needed a cluster of databases to be able to scale it. And NPR was a nonprofit, so we couldn’t afford a million Oracle servers. So, there were a couple of things that had me thinking that we needed another level of abstraction. And that’s when we introduced the API. That gave us an opportunity to basically put the NPR Music development on a different trajectory as well.

Now that everything was, or at least we were moving in the direction of having our sites fueled from the API, we could more easily abstract away the database underneath the API and swap in a cluster of MySQL servers, multiple databases. And so we started thinking of the API in those terms. And then sometime after that, we opened it up publicly.

After we opened it up, we realized, “wow, we should be using that for other presentation layers like mobile sites and iPhones and iPads”, and feeding into the Livio radio devices that carry NPR content. Now it’s going into cars. So all of this strategy started early in 2002; in retrospect, I think we were lucky to have that kind of architecture.

It put us in a great position to just tack on more presentation layers and allow them all to feed off of one central distribution channel, which was the API.

Jeff Eaton: No, it definitely makes sense. Although I think it is interesting that with the real rise of mobile web traffic, and especially apps, this idea is now a given.

Businesses that have content might need this cluster of different ways that people can get to their stuff. I think it feels like that’s broken in a lot of existing sites and a lot of existing workflows and a lot of existing platforms that companies have built out. And I think that sort of corresponds to where the huge spike in interest in the work that’s been done on COPE really took off, because suddenly everyone was feeling just such a tremendous amount of pain around that issue.

What you’re describing is interesting because it’s not the same kind of story: your team had been working on this for a long time, based on deeper needs than simply “we need to go mobile” or something like that.

Daniel Jacobson: Yeah, that’s right. I think we were lucky to have had a series of challenges, like financial challenges, not being able to afford Vignette. We were lucky in that we were doing a series of things all at once. And we knew that we were redesigning and we were likely to have to redesign again. There’s a confluence of things that got us thinking early about it. We had no idea the iPhone was going to go nuts. I mean, if we had that kind of foresight, I should have been betting on the stock.

But we were very lucky and I think we actually made a series of decent decisions thereafter that put us in a good position. When I hear about folks who weren’t quite as lucky to have that confluence of events, they have to go back and retrofit. Well, now the space is much more complicated and you’re already embedded in your web publishing tool. So their rearchitecture at that stage is actually much more expensive and much more painful. It just seems like we did it at the right time. Actually, things were very lucky.

Jeff Eaton: You mentioned that the actual editorial tools, the things that the actual content creation people use is just another presentation layer in that sort of approach. How did that side of things evolve? Cause you know, the structured content approach that you’re describing isn’t necessarily a natural fit or a natural transition for people who aren’t used to saying data modeling is their weekend hobby.

Daniel Jacobson: Data modeling is really at the center of it all. I felt it was critically essential to start with a very clean data model. That was the starting point: how do we imagine this content being stored, thinking about it in those very teased-out ways? So yeah, text is very different than a teaser, which is very different than images.

And everything was really isolated. That’s where the content modularity part comes in. So that’s where we started. And if you have that, well, that’s just your data repository. And then any number of apps can hit against that. And we had the website hitting against that, but we also had this suite of tools that could write to it. They can also access it, but then we started, over time, experimenting with areas of the website writing different user-related things to the database. Basically, it starts with the database. That’s all you have. And then everything else is either reading or writing or both to that database. And we just thought about it in those terms.

Jeff Eaton: Were the user experience aspects of building out those tools something that you were involved with, or was there a different team of people trying to make the tools usable for this way of approaching things?

Daniel Jacobson: Okay. So I’m actually really glad you brought that up because I kind of glossed over that. First I want to be clear and maybe this kind of gives people some hope. When you say the team of people, the total team that I can think of that executed on this entire CMS project back in 2002, it was about four people.

Jeff Eaton: Oh, this is like learning how bacon gets made.

Daniel Jacobson: Yeah. I hope I don’t disappoint anybody.

Jeff Eaton: I think that’ll encourage a lot of people actually.

Daniel Jacobson: Yeah. I mean, it really started with this massive document that I put together that we used for due diligence to send out to the vendors per the VP’s request.

So we had a vision that we kind of pieced together, and that vision encompassed both what the data structure was going to look like and what, conceptually, the CMS or the content capture part was going to look like, as well as directions for the website. We obviously then brought it in house.

We didn’t have a VP anymore. It was me and one other backend developer at the time. And there was one front end developer and there was a designer. And then my boss, so that was the team. We had a suite of editorial folks who contributed meaningfully to this. I don’t want to discount them, but in terms of the engineering, that was it.

So me and the other backend engineer, we wrote all the tools around the content management system. But before we did that, we were heavily informed by the designer. And I was pretty involved in this as well. We did a series of usability tests. We took data from both the usage patterns of our users online and the npr.org users. We took information about the three discrete ways that we published. We had the Cold Fusion system, the Java system, the HTML files.

What are the kinds of things we’re building? We knew that, at that moment, we had a very limited set of assets. It was very audio focused, but we knew that things coming down the road were going to be much more expansive in terms of the available assets: images were probably to come later, and full text. And we were hiring editors on the online side specifically to build out those stories.

So we were informed by all of those things and did a whole series of mockups and clickable prototypes for the editors and sat down with them and said, “okay, well, what do you think of this? What do you think of that?” And then here’s the interesting part. We took all that data and we had to discount a fair amount of it because we thought that they were still thinking about it too much like NPR, right?

Jeff Eaton: Yeah. They were thinking, “Oh, you’re building the new version of what we’re used to.”

Daniel Jacobson: Yes. Right. So, we took on all of that. There were a lot of really great ideas and fundamental things that drove the direction of where we’re going. But again, we had to think, we’re thinking bigger than this. We’re thinking there will be another design, there’s this and that.

So, we needed to discount a portion of our learnings. And, yeah, so all of that kind of boiled into the content management system. And I think you and Karen have talked about this a fair amount, that you can’t just build tools as an engineer would build a tool, right? You’re building your website to have it be meaningfully useful to the website user.

You need the same mindset when you’re building the CMS. You want it to be infused with the sensibilities that will make them effective at what they’re doing. So we tried to take all those sensibilities and make something that they would succeed with, and evolve it over time.

Jeff Eaton: Especially with sites that have an existing infrastructure and stuff like that, it’s very easy to look at those kinds of things and imagine it’s just insurmountable. There are definitely a lot of hurdles, especially since it feels like every year that passes there’s more weight being put on a lot of companies’ web properties. But the idea is that it doesn’t necessarily take an army to do this. It seems like it’s more about having the will inside of the organization to take this kind of path.

Daniel Jacobson: Yeah. I have a couple of thoughts on that. First, generally I agree. I think the one area I would disagree is that the world is different now than 10 years ago. And I think, again, we were lucky to have gone down that route at the time that we did, because like you said, the weight is lighter in 2002 than in 2013. But in every other capacity, I agree. And you need to have commitment, and the commitment not only starts with the commitment to vision, but resourcing it appropriately. If you need these engineers to build this out, hire engineers, and I’m a big believer in the idea that excellent engineers are going to be, some people say, 10 times more effective than average engineers. So having really smart people who are able to execute on these things, you can really tease out what you’re going for. What’s the best way to execute on it? That’s going to pay way more dividends than just hiring a bunch of people and throwing money at it.

Jeff Eaton: What you were describing earlier was a very cross-functional team, all working on these things. You were talking about working in close conjunction with the designers, who were planning out what the new visual appearance of the site was going to be and how it was going to operate, and working with the content creators and the editorial team.

It wasn’t just a matter of wireframes that were thrown over the wall and you guys implemented it, which I think is one of the hardest scenarios for a lot of people who build and implement to find themselves in these days.

Daniel Jacobson: Yeah. It was kind of a scrum before scrum started taking off. I mean, that’s basically what it was. The only way it could have succeeded was very close collaboration with everybody on board with the message.

Jeff Eaton: So I guess coming back to that initial question, is it a little weird now to hear the stuff that you worked on turned into sort of the go-to example slide in everyone’s presentations about structured content and reuse?

Daniel Jacobson: Yeah. It still freaks me out. It’s great. I love seeing it. I love talking to people about it and if there’s a way that I can help people, I love doing that. It is still a little surreal too, however, as I haven’t even been at NPR for the last two and a half years. And, so it’s interesting to still see it coming through the tweet stream once in a while.

Jeff Eaton: I can imagine. Well, you mentioned that you’re now at Netflix actually, and you’re the director of engineering for the Netflix API. How is that different? I mean, it seems like it’s an API and it deals with content, but it is really a big shift.

Daniel Jacobson: Yeah. It’s a huge shift in certain categories and it’s actually quite similar in others. The similarities are, we’re both media companies. But actually Netflix considers itself a technology company in media. I think NPR should be striving, if they are not already, for the same thing: being a technology company producing content for distribution, trying to get on multiple devices, reaching consumers with rich multimedia experiences.

Those kinds of things are similar. The scale is fundamentally different. NPR, I think, is a reasonably decent scale operation, but again, by the time I left it, my team was, including some contractors, around 20, I think. Here the engineering team is about 600.

The scale of the APIs: I think the NPR API is however many millions of requests a day. Maybe it’s a hundred, maybe it’s 50. I don’t remember the exact number. The API here does two and a half billion transactions a day. So what goes into solving those problems? It’s a fundamentally different approach. And so contextually at NPR, even when I left, it was basically one team broken down into different groups, but focused on one pipeline, and that pipeline was pretty interconnected. So, you have the content management system that publishes into a cluster of databases, an API that draws from that cluster of databases, and the API distributes out to any number of destinations. The NPR engineering team was building pretty much all of that.

Jeff Eaton: Wow. 20 people or so!

Daniel Jacobson: Here it’s highly distributed. It’s an SOA model, with lots of engineering teams focusing on specialized tasks. And my team does not really store any data. We don’t really have any editorial tools or anything like that. We’re basically a broker that takes data from other people’s systems and passes it across HTTP over to devices and people. So the core responsibilities for this team are making sure that we have a solid distribution pipe, scaling the system effectively with the growth of the company and growth of the system, and ensuring resiliency. Those are the three key responsibilities I laid out for the team. Whereas at NPR, it was building a lot of features and presentation layers, or managing a CMS.

So yeah, the scale I think is really at the core fundamentally different, and that drives a lot of the differences in other categories.

Jeff Eaton: Yeah. I think, at Netflix, you guys are responsible for what I think is a double-digit percentage of evening internet traffic or something like that.

Daniel Jacobson: Yeah. It’s 33%, right?

Jeff Eaton: That’s definitely a statistic not many people could claim.

Daniel Jacobson: I mean, there are all kinds of different ways that Netflix has massive scale. That’s one of them, the two and a half billion transactions a day, but we’re also on 800 different device types. It’s kind of mind-boggling when you think about some of these numbers.

Jeff Eaton: I am just blown away that there are that many kinds of devices to be on. I mean, I guess it makes sense, but it’s just, it is staggering.

When you think about the device proliferation that we’re seeing, is it really difficult to keep up with?

Daniel Jacobson: Yeah. And the scary part is it’s not done. Your fridge is going to have a screen on it. Why not have Netflix there? Right. Basically this is the beginning of it, actually, in my view.

Jeff Eaton: So how has it been different, just philosophically, from NPR? I think a lot of the case for the API there is about very public distribution of a lot of different information. But with Netflix, it’s different: your API is basically part of a tool chain that allows you to provide a particular service to customers.

Are there ripple effect differences in how the two get approached?

Daniel Jacobson: I actually think you’ve mischaracterized NPR a little bit. Even by the time I left, I was starting to position it differently and I think that’s still the case. So, Javaun Moradi, he’s the PM there now for the API, and the rest of the folks there, I think they focus intently on internal consumption. They still have the public API and still use it as a distribution mechanism to the member stations, among other things. But an overwhelming percentage of the traffic to the API is from NPR properties. It’s not from the general public domain. And then the next category would be the stations, and then the general public.

So I think in that regard, it’s very similar to Netflix, though the percentages and the numbers are very different. So I think whatever their percentage is, 60, 70% at NPR. For Netflix, 99.9-plus percent of the traffic is from Netflix-ready devices. And we still have a public API, but that sees an incredibly small percentage of the traffic.

Jeff Eaton: Then, I guess that is sort of a perception thing then. I mean, I’ve always heard a lot about the context of the public open API being there. And that’s actually a question that I think a lot of people have as well, “if my business doesn’t make sense for putting out a giant fire hose of all of my stuff to the world, how can I really leverage this stuff?” And I think that that’s part of it. You don’t have to think about it as just the all you can eat buffet of our stuff. It can be used for internal purposes too.

Daniel Jacobson: Yeah. I’d go a step further and I’d say it’s the majority. In fact, the overwhelming majority of value from API use will be the internal consumption. And this is from my background at NPR and at Netflix, but also from talking to a lot of people in the space.

I see other companies like Evernote and The Guardian and The New York Times. They all have similar pie charts where the overwhelming consumption is internal. So I actually think that we are seeing a shift in the marketplace towards internal consumption. People are looking at their businesses differently.

It’s like, how can we get on all these different devices? Let’s not worry as much about trying to piggyback on developers in their free time in their garage. And let’s dedicate our resources to building these things on our own. How do we get there? Let’s build an API so we can leverage the distribution rather than building one-offs all the time, whether for individual developers or for dealing with the massive explosion of devices and channels.

Jeff Eaton: What kind of advice would you have for them? What kind of major pitfalls would you tell them to steer clear of?

Daniel Jacobson: Well, definitely steer clear of lots of one-off development and, depending on who you are and what your business is, steer clear of thinking of yourself as not being a software company.

I think that’s, to me, that’s the number one thing. If you can re-imagine whatever your business is and think of yourself as being a technology company, or at least partially a technology company, then you’re going to dedicate the resources to that. And if you’re dedicating resources to that, then you’re gonna have smart people who are thinking about these problems in the right way.

And I don’t think there’s a one size fits all approach for everybody, but I think if you have good people thinking about it, you’re going to end up with highly leverageable content management or distribution channels. You’re going to end up being much more nimble than you were otherwise.

So it’s probably sidestepping your question, but, I don’t know how else to say it because even at Netflix, we have very clearly stepped away from trying to be one size fits all for reaching all of the platforms we’re hitting.

Our REST API used to be a one-size-fits-all model that we used for quite a while, and it felt like the right thing to do. But my view on that is that the API is a tool, and when that tool runs its course, we need to move on to something that has greater pragmatic value. So I wouldn’t be beholden to any given technology. I’d be beholden to smart technologists who you trust to make good decisions.

Jeff Eaton: And it sounds like an important part of that is having a really clear coherent grasp of what it is that you’re trying to accomplish and what the longterm goals are too.

Daniel Jacobson: Absolutely. Yep. That goes right in with the commitment. If you’re committed to this, then think about how this is going to play out in five years and start planning for that.

Jeff Eaton: Well, I want to say thank you very much for joining us. it’s been a pleasure and, I hope that we’ll cross paths again in the future.

Daniel Jacobson: Absolutely. Thanks a lot, Jeff. I really appreciate it. And yeah, definitely.

Jeff Eaton: Thanks for listening to Insert Content Here. If you’d like to catch up on our archives or keep up on our new episodes, visit us at dot com slash ideas slash podcasts slash insert content here.

And you can also visit us directly at InsertContentHere.com.

Extra Story: Daniel Jacobson: Me and another NPR engineer, when we were doing a modeling exercise, we really got caught up on what to call the story, the atom of the content management system, in the database. And we debated about this for way too long. And her stance was let’s call it “page”, as in a web page or some page representation. And I wanted nothing to do with that because I thought that was too bound to a given presentation layer concept. I really wanted something way more generic. So I started throwing out ideas like “object” or something that really didn’t have a whole lot of meaning, but was abstract.

Jeff Eaton: Totally abstract.

Daniel Jacobson: Exactly. Yeah. And we went around this debate time and time again, and ultimately what we decided on was “thing”. So the central table in the system, and I think it still is the case today, is called “thing”. We did that specifically so that we would not have to worry about whether a story is going to end up on a mobile app or on an IP radio, which don’t really have a concept of “page”. It’s just, here’s this “thing”, we’re distributing it out and that’s it.

Interview : Daniel Jacobson Interviewed at Where 2.0 in 2011

April 20, 2011 4:22 pm

This interview was originally posted by O’Reilly Media on April 20, 2011

Daniel Jacobson is the director of API engineering at Netflix. Prior to Netflix, Daniel was at NPR where he created the NPR API as well as the content management system that drives NPR.org, mobile platforms and all other digital presentations for NPR content. Daniel is also on the board of directors for OpenID.
