Untangling an API-first Transformation at Scale. Lessons Learnt at PayPal – Part 2

Key Takeaways

  • Define a business capability taxonomy for your business
  • Adopt an API product mindset when talking and thinking about your APIs
  • Develop a customer-focused process for managing your API portfolio
  • Decouple your service implementation artifacts from your APIs
  • Your API transformation is a culture change, not a one-off project

This is the second in a three-part series that explores how PayPal has adopted a more API-first approach to building platform services. In the first part, we explored the reasons for moving to a more loosely-coupled, API-driven architecture as well as some of the infrastructure needed to make the process work at scale. In this article, we’ll take a closer look at how the portfolio of APIs itself is managed.

Making Every API Investment Count

You’re constantly building new features. In theory, each one of these investments is driven by a rational, well-justified strategy and roadmap.

The reality is messier. We tend to think about roadmaps as continuums of prioritized investments that help us actualize our strategic goals. As we reduce the time horizon, however, a few practical realities emerge. The first is that our immediate execution plans tend to be heavily influenced by a relatively small number of high profile projects and key partner relationships. This is partly because it’s desirable to have a few, tangible goals to rally around, but it’s also because time-sensitive deliverables tend to rise to the top. We may have signed an important deal with a large customer. We may have acquired a company that needs to be integrated. Such and such feature may need to get out before the holiday season. These are all very real projects that have very real dates attached. It turns out our nice, strategically-aligned, philosophically justified roadmap looks a lot more event-driven when practical reality intervenes.

Is this a problem? Well, no, not from the perspective of getting teams to work toward common goals and ensuring we’re focused on the most important priorities for the business. It helps align the organization. The challenge it presents at the tactical level is that we’ve reduced our beautiful platform vision to a small set of narrowly-defined solutions. Workers on the ground – the ones responsible for actually getting stuff done – naturally optimize for their success. Their context becomes the immediate project, which means they tend to make decisions within those solution boundaries.

This is a big challenge in an API-first organization. We’re trying to build a reusable set of capabilities that aren’t overly biased towards any particular partner or project. We want our technology investments to maximize their utility, which means building generic, reusable APIs that can be used across multiple contexts. We need to satisfy not only the demands of the immediate project driving the investment, but all the other projects we can’t see coming. As we’ve just seen, however, the narrow, project-driven success criteria of our roadmap breed tension with the broader goal of maximizing platform services utility. Given these constraints, how can we guard against making narrow, myopic decisions in services design that limit our future business agility?

Identify your Business Capabilities

The definition of your problem space has a big impact on how you think about solving your problem. One way to help ensure you’re always building platform capabilities and not just point solutions is to clearly define the capabilities that compose your business. These become your problem spaces.

Business capabilities represent the core, reusable building blocks that your business needs to support the business processes required to function. By defining your business capability taxonomy, you establish a shared language that can be used by all domains to describe the logical relationships in any given process. This serves as a stable, business-driven (not technology-driven) context in which to discuss solutions that, hopefully, remains relatively consistent over time. It also provides a critical link between how the Business thinks about its investments and how Technology leverages them.

In a small company, the set of capabilities is quite limited. Being highly resource constrained, you may build some core services that differentiate your business and leverage other service providers for generic things like messaging, identity, payments, etc. Managing them isn’t that hard as they fit within the heads of the core brain trust. It’s still very beneficial to write them down and define them clearly, but managing them may be done by a single lead architect or a small group of tightly knit stakeholders. Very little formal process is required or, usually, desirable.

When your organization grows, things change. If you have hundreds of scrum teams spread across multiple business units and product lines, keeping track of what you have and avoiding duplicate investment gets hard. To address this, you need some model or system to track and manage your API portfolio across the extended organization. This itself is a “thing”.

When we started rebuilding our platform at PayPal, we clearly wanted to understand where we were aiming. We spent the better part of a year decomposing various business processes to identify our core business capabilities. As you’ll never fully understand tens of millions of lines of code, it wasn’t perfect, but following the 80/20 rule, it gave us a good starting taxonomy we could use to classify and manage the future APIs we’d build. Perhaps most importantly, it gave us a common language to describe the boundaries of the end state. We used this to engage in a constructive, defensible - and largely consistent - conversation with domain owners on the shape of the APIs and services they were building.

This simple example illustrates how we defined capabilities and assigned them to groups. These groups represented higher level domains containing related functions. Note that this is a very simplified snapshot of the overall model.

In total, we identified over 70 business capabilities across seven or eight high level domains. Over the past few years, we’ve tweaked the model around the edges as we’ve evolved as a business and better understood our various business processes, but the original model has served as a useful baseline throughout.
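As a rough illustration of what such a taxonomy can look like when written down, here is a minimal Python sketch of a two-level model in which domains contain capabilities. It is not PayPal's actual model: the class names, the descriptions, and the "Payments"/"Checkout" entries are hypothetical placeholders, and only the Identity domain with its Account capability is taken from the example discussed later in this article.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Capability:
    """One reusable business building block, described in customer language."""
    name: str
    description: str


@dataclass
class Domain:
    """A higher-level group of related capabilities."""
    name: str
    capabilities: Dict[str, Capability] = field(default_factory=dict)

    def add(self, capability: Capability) -> None:
        # A capability name may appear only once within a domain.
        if capability.name in self.capabilities:
            raise ValueError(f"duplicate capability: {capability.name}")
        self.capabilities[capability.name] = capability


def build_taxonomy() -> Dict[str, Domain]:
    """Build a tiny, illustrative taxonomy (names are assumptions)."""
    identity = Domain("Identity")
    identity.add(Capability("Account", "Who the customer is and how to reach them"))

    payments = Domain("Payments")  # hypothetical domain, for illustration only
    payments.add(Capability("Checkout", "Completing a purchase"))  # hypothetical

    return {domain.name: domain for domain in (identity, payments)}


def find_domain(taxonomy: Dict[str, Domain], capability_name: str) -> Optional[str]:
    """Return the name of the domain that owns a capability, or None."""
    for domain in taxonomy.values():
        if capability_name in domain.capabilities:
            return domain.name
    return None
```

Even a structure this small gives teams one place to ask which domain owns a capability, which is the question the taxonomy exists to answer.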

When doing this, it’s helpful to look for other examples across your industry. In our case, we leaned heavily on the models developed for banking, since we share many of the same industry concepts. In particular, the BIAN model was very helpful. Where our business branched outside of traditional banking, we augmented BIAN with capabilities from other industries. We also had to invent a few that are unique to our business. These, in particular, are quite interesting as they typically represent the most differentiated and defensible capability investments you make.

Throughout the process, we worked with teams across the company to validate our assumptions and get their buy-in. Domain architects were key contributors, although you have to be careful not to let your current implementation architecture overly influence your business architecture definition. Remember that business capability boundaries and language should be driven by customer use cases and the processes required to support them. If your end result doesn’t make sense to somebody outside your organization, you probably need to revisit it and work harder to remove internal implementation and organizational bias. This is a constant struggle.

Shifting Towards an API Product Mindset

Not only should you assume a customer-focused perspective when defining your business capability boundaries, you must do the same when defining your actual APIs. It’s taken a while for the industry to latch onto this, but the truth is that APIs are products. They have customers, they solve some problem, they must be easy to understand, and the experience using them must be great. You may even end up actually selling some of them to customers for real money. For many in Engineering and Product, this is kind of a new idea.

It’s easy to understand why. APIs grew out of engineering’s need to connect systems and desire to improve componentization. They were part of the “how” and their utility beyond the immediate project wasn’t immediately obvious to many – especially product managers. Remember also that our short-term roadmaps tend to define success in terms of relatively narrow project outcomes. It’s a natural tendency to bias API design decisions towards immediate project optimization over generic reusability – which often takes more work. In the heat and pressure of execution, it’s not clear why you should care about contexts outside of the one driving your next paycheck. You have a deadline to meet.

In a sprint, all decisions are tactical.

An API-first strategy recognizes that if you design interfaces well, with an eye towards generic reusability, they become the building blocks on top of which future business processes and products are built. Your investment today will pay you back in the future. The challenge, then, is to overcome the bias towards project optimizing decisions and adopt an API product mindset. Let’s look at some of the attributes and behaviors that encourage this.

Talk Like a Customer

The first thing is to start talking about your APIs from a customer perspective. The set of business capabilities you identified are defined using language that makes sense to somebody looking from the outside in. Internal concepts, code words, and acronyms have no place in your API taxonomy or the documentation that describes them. Using customer-centric language helps decouple your conceptual business capabilities and related APIs from the underlying physical implementation. It makes your APIs more understandable and more usable by somebody outside the domain. That’s essential.

The second imperative is to take naming your APIs and resources very seriously. It seems trivial, but names are extremely important when it comes to understanding your API portfolio. It’s sometimes easy to forget that it’s people that must read and understand APIs. Computers only do what you tell them - names don’t matter. It’s actually humans that must read, understand, conceptualize, and synthesize solutions based on the API vocabulary you provide. If you choose a bunch of cryptic or non-intuitive resource names, you’re increasing the cognitive load and making life much more difficult for your developer customers. The cost of integration just went up for anybody who wants to use them.

Manage Your API Product Portfolio

If you read the first article in this series, you noticed a process step called “Alignment”. This is the point at which you identify the position an API has within the broader capability portfolio. It’s a critical step. It’s where you think about the generalized use cases the API is going to support and choose the best nouns and verbs to represent the resources and actions of your interface.

One best practice is to write down the use cases an API is solving. It’s amazing how, if you force yourself to actually write good customer use cases, the key words that become your resources and actions tend to pop off the page. The endpoint almost writes itself. It also becomes much easier to validate where it fits in the broader API portfolio context.

The challenge is that most product managers aren’t involved in API design and most developers don’t think about their APIs in terms of use cases. Developers tend to think in terms of implementation. If you ask a developer to write a use case, they’ll usually come up with something like:

I need to collect region_2 data to insert in table legacyFoo so that oldService_B can return a decision.

This makes no sense to anybody outside of the immediate domain and it will likely lead to an equally obscure API. Believe me, we’ve seen far worse. Taking a step back and analyzing the business process that led to this requirement typically reveals a use case more like the following:

As a consumer, I need to provide all address information required to successfully complete a payment transaction.

That’s a statement comprehensible to normal humans. It starts to point to a potential resource called “address”. Referring back to our model, we can also infer that address is likely related to the Account capability within the Identity domain. It’s now easy to look at other APIs within that capability to see if this is a new resource or simply an extension to an existing one (or potentially duplicate functionality that we can avoid building altogether).
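The alignment check described above can be sketched in a few lines of Python. This is an illustrative assumption, not PayPal's tooling: the `ApiResource` type, the `align` function, the three outcome labels, and the action names are all hypothetical. The sketch classifies a proposed resource as new, as an extension of an existing resource, or as a duplicate of functionality the portfolio already has.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

PortfolioKey = Tuple[str, str, str]  # (domain, capability, resource name)


@dataclass(frozen=True)
class ApiResource:
    domain: str
    capability: str
    name: str
    actions: FrozenSet[str]

    @property
    def key(self) -> PortfolioKey:
        return (self.domain, self.capability, self.name)


def align(portfolio: Dict[PortfolioKey, ApiResource], proposed: ApiResource) -> str:
    """Classify a proposed resource against the existing portfolio."""
    existing = portfolio.get(proposed.key)
    if existing is None:
        return "new"  # nothing with this name in this capability yet
    if proposed.actions <= existing.actions:
        return "duplicate"  # every requested action already exists
    return "extension"  # same resource, but with additional actions


# Hypothetical portfolio containing a single existing resource.
existing_address = ApiResource(
    "Identity", "Account", "address", frozenset({"create", "get"})
)
portfolio = {existing_address.key: existing_address}
```

In practice the hard part is the human judgment that maps a use case to a domain, capability, and resource name; the lookup itself is trivial once that vocabulary is agreed.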

Get Some Process

Mention the word “process” in a crowded room and people recoil. It’s sort of the antithesis of “fun” and “creative”. The fact of the matter is, processes exist largely to enable scalability of outcomes and consistency of results. While hopes and dreams are nice motivators, you need something tangible to ensure your API portfolio evolves in some sane way. The challenge is making the process fast and efficient as well as value-adding in the eyes of your developer customers. If you succeed, you’ve turned that “process” into an asset that developers seek out to make their lives easier. That is a key success measure of any worthwhile, long-lived process.

The primary way we’ve done this at PayPal is by requiring all APIs to go through a common governance step. We have a small, central, largely autonomous team that serves as the primary API portfolio management group. It’s comprised of technical product managers working in partnership with the primary stakeholders and architects of each business capability domain. They work with each team to help define their general use cases, recommend the best vocabulary, and finalize their API names, resources and endpoints.

At first – big surprise - there was a lot of resistance to this model. We were yet another bureaucratic step standing in the way of getting stuff done. Developers tended to wait until the last minute – often after the implementation was almost done – to engage. This caused much pain on both sides. Over time, the attitude shifted. As teams have gone through it a few times, they have come to understand the value in exposing more understandable endpoints with good documentation. Their customers have fewer problems and their support overhead is reduced.

Today, we see teams engaging in the API portfolio alignment process much earlier in the software development lifecycle. Many proactively seek advice early in the SDLC as they’ve seen how genericizing their approach has led to cleaner interfaces and cleaner implementations. As a bonus, the customer perspective and business architecture modeling influence their thinking about their long-term implementation architectures. It helps keep them aligned with the business. Some teams are going through their second or third iteration of APIs and services as they evolve towards a cleaner, simpler model that helps them serve all their customers more easily. Most importantly, developers have come to appreciate not having to figure all this out on their own. They see the value the process contributes to solving their immediate problem, which encourages them to come back.

Talk to Your Customers

Getting customer feedback, understanding their needs, and validating their satisfaction are, of course, primary behaviors of good product management. When considering APIs as products, your primary customers are developers.

This is not how most people think about APIs. Again, API development is typically driven by a project goal, which is the measure of success. APIs are a means to an end. How do we change behavior such that the experience of developers using your APIs is actually considered?

We’ve done a few things at PayPal to encourage this:

  • During design reviews, we encourage API developers to reach out to their customers, get feedback on their designs, and work with them to make sure their documentation is thorough and understandable.
  • All API contracts are managed in well-known GitHub repos and everybody is encouraged to subscribe, review changes, and contribute pull requests.
  • The API portal contains links to the team Slack channel, so customers can easily get help and ask questions.
  • We publish all pull requests to Slack, so it’s easier for API customers to follow the APIs they’re interested in and be notified when things change.
  • Customer suggestions, bugs, and questions can be submitted through the API portal and they’re automatically tracked in the same GitHub repo that contains the API spec.

All these tools help, but at the end of the day, API developers must have a reason to care about the experience of their customers. Incentives must be aligned and recognized as important by management. Some form of instrumentation is desirable to measure how well each API developer is doing meeting the expectations of their customers. This is an area we’re just beginning to explore as we look for additional ways to reinforce behaviors that lead to great API customer satisfaction. Again, this is a cultural change that takes time. It’s not necessarily built into the DNA of your organization and you’re going to need to experiment with different approaches to building this muscle.

The Role of Product Management

We’ve been talking about APIs as products without really addressing the role of product management. It most likely varies by organization. If you have a very technical Product Management function, chances are they’ll be highly involved and want to take the lead in much of this. They’ll inherently understand the need and the value they can add by leading this conversation. API portfolio management then becomes more of a federated consortium across all the domains, with Product a highly engaged stakeholder. If this is you, consider yourself lucky.

If you don’t have an inherently technical product management team, then forming a central team that can serve this role and provide consistency across the platform may be the only practical, short-term solution. A lot will depend on the dynamics, culture, and maturity level of your organization. Over time, you may also be able to evolve towards a more decentralized model as you strengthen the technical skillset (and align the incentives) of your Product group. We’ve also seen some domains with very motivated and involved product managers, so a hybrid approach may also be appropriate. You need to determine the realistic level of product manager involvement for your organization and tailor your approach to suit.

Decoupling the Logical vs. the Physical

When we talk about APIs, we’re talking about the logical definition for how a service interaction should behave. The service that implements the API is the actual, physical artifact that’s deployed and services requests. It’s easy to assume the relationship between these is always 1:1. You should question that assumption.

What we’ve found is that this relationship can be many:many. There are a few reasons for this:

  • A legacy service that implements an API is being replaced by a new one that honors the same contract. This strangulation approach happens gradually and there is a period of overlap where some endpoints are serviced by the legacy service and others are serviced by the new service. The advantage is a small part of the legacy service can be replaced in every release and the risk of replacing a large, complex service is mitigated.
  • There is substantial implementation overlap between two APIs and developers would like to service both APIs from the same service to save time and development effort.
  • The read vs. write activity for an API is grossly unbalanced and the architectural solution is to split the reads and writes into separate services.
  • There are any number of poor, short-sighted, or org structure-driven architectural decisions that can lead to something other than a 1:1 ratio. Basically, bad design or date-driven decisions sometimes compromise the goal of a fully encapsulated, well-bounded service-to-API relationship.
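To make the many:many relationship concrete, here is a minimal Python sketch of a routing table in which one API is served by two services (a read path on a new service, a write path still on the legacy one) while the legacy service also backs a second API. All API names, paths, and service names are hypothetical, and a real gateway would do far more than an exact-match lookup.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Route(NamedTuple):
    api: str      # logical API the endpoint belongs to
    method: str   # HTTP method
    path: str     # endpoint path
    service: str  # physical service that currently handles it


# Hypothetical snapshot taken partway through a gradual legacy replacement.
ROUTES: List[Route] = [
    Route("addresses", "GET", "/v1/addresses", "address-read-service"),
    Route("addresses", "POST", "/v1/addresses", "legacy-account-service"),
    Route("phones", "GET", "/v1/phones", "legacy-account-service"),
]


def resolve(method: str, path: str) -> str:
    """Return the service that currently handles an endpoint."""
    for route in ROUTES:
        if route.method == method and route.path == path:
            return route.service
    raise LookupError(f"no service registered for {method} {path}")


def services_by_api(routes: List[Route]) -> Dict[str, Set[str]]:
    """Group the implementing services under each logical API."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for route in routes:
        grouped[route.api].add(route.service)
    return dict(grouped)
```

Because callers only ever see the API paths, an endpoint can be moved from the legacy service to its replacement by editing one row, with no change to the contract.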

An obvious implication is that you should probably maintain your API contracts in a different repo than your implementation code. This allows you to easily decouple both the relationship and the lifecycle of the API and the service. It turns out that the API lifecycle is dictated by your API versioning and backward compatibility policy. The lifecycle of your service may be driven by orthogonal concerns such as bug fixes and improvements to operational qualities like performance and availability. Decoupling allows each to evolve somewhat independently, as long as the API contract itself is always honored.
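One thing a separate contract repo makes easy is checking each proposed contract change against a backward compatibility policy before it merges. The Python sketch below assumes a deliberately simplified contract shape (operation name mapped to required request fields and response fields) and a simple policy in which removing an operation, removing a response field, or adding a required request field counts as breaking. Both the shape and the policy are assumptions for illustration; the article does not describe PayPal's actual rules.

```python
from typing import Dict, List, Set

# Assumed contract shape: operation -> {"required_request": {...}, "response": {...}}
Contract = Dict[str, Dict[str, Set[str]]]


def breaking_changes(old: Contract, new: Contract) -> List[str]:
    """List the changes in `new` that would break clients written against `old`."""
    problems: List[str] = []
    for operation, old_shape in old.items():
        new_shape = new.get(operation)
        if new_shape is None:
            problems.append(f"{operation}: operation removed")
            continue
        # Clients may read any field the old response contained.
        for name in sorted(old_shape["response"] - new_shape["response"]):
            problems.append(f"{operation}: response field '{name}' removed")
        # Existing clients do not send newly required fields.
        for name in sorted(new_shape["required_request"] - old_shape["required_request"]):
            problems.append(f"{operation}: new required request field '{name}'")
    return problems


# Hypothetical contract versions for an address resource.
OLD: Contract = {
    "create_address": {"required_request": {"line1", "country"},
                       "response": {"id", "line1", "country"}},
}
NEW: Contract = {
    "create_address": {"required_request": {"line1", "country", "postal_code"},
                       "response": {"id", "line1"}},
}
```

Adding new operations or new optional fields produces no findings under this policy, which is what lets the service behind the contract keep evolving on its own schedule.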

Decoupling also reinforces the notion that your API contract is itself a first class product citizen. It should be driving the implementation (API first), not the other way around. This role reversal is much easier to reinforce when there is clear separation in the process for maintaining each artifact.

The issue of control is important when it comes to process. At PayPal, we maintain very strict control over who can merge changes in the API repos. Every change is portfolio aligned and evaluated against standards to produce a maturity score. The centralized team started off doing most of this work. As we’ve matured, we’ve trained and deputized developers across the organization to operate in a more federated manner (more on that in the next article). Even still, we try to avoid developers merging their own changes and encourage as much cross-domain collaboration and review as possible.

Lastly, there is a constant struggle against Conway’s Law.

organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations

— M. Conway

Developers derive identity from their team’s mission. One of the first things teams like to do after they’re reorganized is create a new service named after their team - slicedbreadserv. That artifact represents their future work and contribution to the organization. Stable, long lived APIs are the goal. Just like any product, deleting them is expensive. Reorgs happen frequently by comparison and are often driven by very different forces. It’s beneficial to decouple the definition of your API platform product(s) from the whims of underlying implementation changes. One of them you can try to control. The other one you most certainly can’t.

This is a Lifestyle

Our experience developing an API portfolio management discipline has taught us a lot. We’ve learned that it is possible to shift thinking and behavior towards a more customer-focused, API product-driven methodology at scale. At times it’s painful, thankless, and hard, but by listening to and embracing the needs of developers on both sides, we’ve managed to continuously refine the program and improve internal customer satisfaction.

There is an oft-repeated adage that developers are lazy. We don’t believe that. Developers work hard when they have an interesting problem to solve. They’re hungry for knowledge and skills that allow them to solve those problems faster and more elegantly. Most of all, they don’t want to solve the same problem twice.

Part of the promise of an API product, design-first approach to system design is that you can solve the same problem for many more people at once. It’s naturally appealing and elegant. Understanding takes time, but it’s powerful once it sinks in. The methodologies and processes we’ve implemented have helped us guide the organization along this path. You’ll never actually get there, but by approaching the problem from the standpoint of a culture change, not a project, the impact should be long lasting.

In the last article, we’ll explore the importance of program management to align the organization, market the concept, provide training, and create various incentives that encourage adoption.

About the Author

Erik Hogan has been learning what it means to be a great Product Manager for almost two decades. Much of that time has been spent understanding how to be a customer champion and trusted leader who can scale. He derives great satisfaction factoring complex problems into reusable parts and giving engineering exactly what they need to execute quickly. Lately, he’s been applying fundamental product concepts of simplicity, storytelling, and focus to effect organizational change - a very different type of "product". Also, his wife still gets mad when he asks for her success criteria.
