
The Past, Present, and Future of API Gateways



Key Takeaways

  • An API gateway began by performing the routing functionality that was originally in the monolith, creating a common facade for the entire application.
  • API gateways evolved further to meet the needs of microservices by incorporating traffic management and real-time service discovery.
  • The cloud-native era requires API gateways to participate in release automation and expose metrics to full cycle development teams.

The Internet has become a ubiquitous part of our lives over the past few decades. The increasing breadth of online services has driven the development of increasingly sophisticated web applications. The evolution of these web applications has driven an evolution in the data center edge -- the hardware and software stack that connects these web applications to the Internet, and the location at which a customer begins interacting with an organization’s business services. The edge has evolved from simple hardware load balancers to a full stack of hardware and software proxies that comprise API gateways, content delivery networks, and load balancers. In this article, we’ll trace the evolution of the data center edge as application architecture and workflows have evolved.

The early Internet

In the mid-1990s, web application architecture was in its infancy. Ruby on Rails, Django, and other web frameworks had not been developed and technologies such as Apache Struts and J2EE were starting to gain traction.

The classic n-tier architecture, consisting of a database tier, application tier, and presentation tier, was the de facto application architecture of this time. The n-tier architecture was horizontally scalable -- more instances of the application tier and presentation tier could be added as traffic increased. (The database was a different challenge.)

Connecting multiple instances of an application or presentation tier to the Internet required the first iteration of the data center edge: a load balancer.

In this era, the load balancer was responsible for routing traffic between different instances of the application, ensuring high availability and scalability. The load balancer was typically a hardware appliance, although the release of HAProxy in 2001 started to popularize the concept of software load balancers.

Web 2.0

Darcy DiNucci coined the term Web 2.0 in 1999 to refer to the evolution of the Internet from a one-way medium to a two-way medium in which users would participate in websites. Instead of being passive consumers of content, users of Web 2.0 websites would actively contribute and engage with each other.

AJAX (Asynchronous JavaScript and XML) development techniques became ubiquitous during this time. By decoupling data interchange from presentation, AJAX created much richer user experiences for end users. This architecture also created much “chattier” clients, as these clients would constantly send and receive data from the web application.

In addition, ecommerce during this era was starting to take off, and secure transmission of credit card information became a major concern for the first time. Netscape introduced Secure Sockets Layer (SSL) -- which later evolved to Transport Layer Security (TLS) -- to ensure secure connections between the client and server.

These shifts in networking -- encrypted communications and many requests over longer lived connections -- drove an evolution of the edge from the standard hardware/software load balancer to more specialized application delivery controllers (ADCs). ADCs included a variety of functionality for so-called application acceleration, including SSL offload, caching, and compression. This increase in functionality meant an increase in configuration complexity. A variety of proprietary configuration standards emerged, e.g. VCL, SSI.  The load balancer was no longer just load balancing!

The Web-scale Era

In the early 2010s, a number of cloud-first companies experienced exponential growth in their user base. The software behind these companies was originally architected as monolithic web applications using easy-to-use web frameworks such as Ruby on Rails and Django. As their user bases swelled to astronomical numbers, these companies found that web-scale problems were indeed a different type of problem that dictated a different architecture. Twitter’s infamous fail whale was perhaps the emblematic example of the challenges faced by these companies as they attempted to build web-scale applications for the first time.

Companies such as Twitter, Facebook, and New Relic started to refactor key pieces of functionality out of their monoliths into independently deployed services. By deploying critical business functionality as services, these organizations were able to independently scale and manage different aspects of their overall application. Traffic to these independent services, however, was still routed through the monolith, so any change to routing required redeploying the entire monolith. This became a bottleneck for the speed of change.

The rise of the API Gateway

These organizations, and many others, shared their innovations and discoveries with the broader technology community. Many of the early engineers at these companies went on to work at other companies or to found startups based on their learnings. One of the learnings from these architectures was fairly obvious -- for the refactored services, the monolith was simply functioning as a router.

This observation sparked the development of early API gateways. An API gateway performed the routing functionality that was in the original monolith, creating a common facade for the entire application. Cross-cutting application-level functionality such as rate limiting, authentication, and routing was centralized in the API gateway. This reduced the amount of duplicative functionality required in each of the individual services.

The Cloud-Native Era: Microservices

Today, we operate in the cloud-native era. Key to the cloud-native era is the proliferation of microservices. Microservices represent another shift in application architecture. Each microservice represents a self-contained business function, and is developed and released independently of the other microservices of an application. By decoupling development cycles, microservices enable organizations to scale their software development processes more efficiently for the cloud.

Given that microservices can be deployed in multiple environments -- virtual machines, bare metal, containers, or as functions -- API gateways play a critical role in routing traffic to the right microservice.

API gateways have evolved to meet the needs of microservices in a few different areas. In particular, modern API gateways:

  • Continue to support cross-cutting application-level concerns such as authentication, rate limiting, publishing APIs, and metrics collection.
  • Have incorporated traffic management functionality that has historically been seen in application delivery controllers. This includes advanced load balancing, caching, and resilience semantics such as automatic retries and timeouts.
  • Support real-time service discovery. Increasingly, microservices are deployed in ephemeral environments such as Kubernetes or serverless environments. Thus, discovering in real-time the network location of every instance of a microservice is critical.
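To illustrate, these traffic-management concerns are typically expressed directly in the gateway’s routing configuration. The sketch below uses a hypothetical Kubernetes-style `Mapping` resource -- the resource kind and field names are assumptions for illustration, modeled loosely on how modern gateways declare routes -- to attach a timeout and automatic retries to a single route:

```yaml
# Hypothetical gateway route for one microservice.
# Resource kind and field names are illustrative, not a specific product's API.
apiVersion: example.io/v1
kind: Mapping
metadata:
  name: catalog-route
spec:
  prefix: /catalog/          # requests under this path...
  service: catalog.default   # ...are routed to the catalog microservice
  timeout_ms: 3000           # fail fast if the service is slow
  retry_policy:
    retry_on: "5xx"          # retry only on server-side errors
    num_retries: 2
```

Because the gateway discovers the network locations of `catalog` instances in real time, the route stays valid even as individual pods or functions come and go.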

Full Cycle Development: The Cloud-Native Workflow

Microservices aren’t just a shift in application architecture. Microservices are also a shift in development workflow. With microservices, individual teams are empowered to deliver software. Most importantly, these teams are responsible for the full software development lifecycle -- from design to development to testing to deployment and release. Some organizations also put these teams on the on-call rotation (“you build it, you run it”). This development model, popularized as full cycle development by Netflix, is a transformational shift in how software is developed and shipped.

This shift in workflow also has implications for the data center edge. Not only do API gateways (and other elements of the edge stack) need to adapt to a microservices architecture, the entire edge needs to be accessible and manageable by full cycle development teams. We’ll discuss two main considerations for edge management in today’s cloud-native era below.

Consideration 1: Scaling release and deployment

Deployment is the process for installing a code update on production infrastructure. Release is the process for actually exposing a code update to real production users. While organizations may treat both deployment and release as one operation (the so-called “release-in-place” model), this exposes actual deployment risk to production users. For example, a missing mandatory configuration parameter could cause an end-user visible outage in the release-in-place model. By separating the two phases, deployment risk is never exposed to an end user.

The data center edge plays a crucial role in release. By controlling the flow of traffic to specific versions of microservices, the edge is responsible for actually releasing an update to end users. Moreover, modern edge services support strategies for incremental release such as canary releases (where a growing percentage of traffic is shifted to the new version over time) or blue/green rollouts.

Full cycle development teams need control over the edge in order to orchestrate release. These controls include routing (which version of a service should receive production traffic) as well as finer-grained controls such as weighted routing (needed for canary releases) and traffic shadowing (create a copy of the traffic to a test version of a service for testing purposes). By giving development teams the ability to manage release and deployment, organizations are able to scale these processes to support even highly complex applications.
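To make these controls concrete, a canary release and a traffic shadow might each be declared as an additional route. As before, this is a sketch with hypothetical resource kinds and field names, not a specific gateway’s schema:

```yaml
# Canary: send 10% of production traffic to v2 of the catalog service.
apiVersion: example.io/v1
kind: Mapping
metadata:
  name: catalog-canary
spec:
  prefix: /catalog/
  service: catalog-v2.default
  weight: 10                  # the remaining 90% continues to v1
---
# Shadow: copy /catalog/ traffic to a test instance; responses are discarded.
apiVersion: example.io/v1
kind: Mapping
metadata:
  name: catalog-shadow
spec:
  prefix: /catalog/
  service: catalog-test.default
  shadow: true
```

Ratcheting the `weight` value upward over successive commits releases the new version incrementally, while the shadow route lets the team observe the new version under real traffic without exposing users to it.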

Consideration 2: Monitoring the spectrum of services

Full cycle development teams frequently have operational responsibility for their microservices. Critical to their success is real-time visibility into the performance of each microservice. The edge provides important insight into the behavior of a microservice by virtue of analyzing all traffic that flows to and from it. This enables the edge to report on metrics such as latency, throughput, and error rates, providing insight into application health.

These metrics need to be correlated with metrics collected elsewhere in the application to present a complete end-to-end view of application behavior. This correlation occurs today by introducing correlation identifiers on request data using a standard such as OpenTracing. Finally, all of these metrics collected by the modern edge stack need to be exposed to full cycle development teams as configurable dashboards.

Edge Policy Management

Given the importance of the edge in modern cloud-native workflows, how do full cycle development teams manage the edge? Unfortunately, all components of the edge stack have traditionally been managed by operations, and operational interfaces are a poor fit for application developers on full cycle development teams. In addition, edge components are often operated in isolation, without a cohesive operational interface. After all, the full cycle developers are not full-time operators; they just need to be able to operate the edge machinery for their specific needs.

Fortunately, the Kubernetes ecosystem can provide guidance and inspiration. In the Kubernetes model, users declare their intent as policies with a common YAML-based configuration language. Unlike traditional REST or UI-based interfaces, the declarative model enables policies to be codified and managed via source control systems (e.g., GitOps). This provides auditability, versioning, and transparency to all users. Moreover, the Kubernetes model supports decentralized policies, where multiple policy files are aggregated into a global policy configuration for the cluster as a whole. API gateways are also adopting this model, as shown by the development of ingress controllers for the Azure Application Gateway and Kong. While these products started with REST APIs for management, the shift to a declarative paradigm is gradually making that approach obsolete. In addition, newer, cloud-native gateways such as the Ambassador Edge Stack have emerged that were designed with this paradigm in mind.

Extending this model of decentralized, declarative configuration to the edge is critical for full cycle development teams. Each team can maintain their own edge policy, independent of other teams. This policy can be managed in source control alongside the code for their microservice. This enables the policy to be managed as part of the actual development process, instead of as a separate, additional configuration that needs to be managed.
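In practice, this might look like each team’s repository carrying its own small edge policy file alongside the service code, with the gateway aggregating all such files cluster-wide. A hypothetical repository layout (directory and file names are illustrative):

```
payments-service/
├── src/                 # service code
├── Dockerfile
└── k8s/
    ├── deployment.yaml  # how the service is deployed
    └── mapping.yaml     # the team's edge policy, versioned with the code
```

A change to the team’s routing, canary weight, or rate limits is then just another pull request, reviewed and rolled back with the same workflow as any code change.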


Over the past few decades, the edge has evolved from a simple load balancer to a full stack of hardware and software that includes load balancers, API gateways, and more. This evolution has been driven by both application architecture and, more recently, application development workflows.

Today’s cloud-native architectures require an edge stack that is accessible to full cycle development teams. By exposing the full functionality of the edge as declarative, decentralized configuration, full cycle development teams can take advantage of the edge to accelerate their development workflows. This includes improving observability as well as decoupling release from deployment. Ultimately this leads to faster iteration times, increased stability of releases, and more value being delivered to customers.

About the Author

Richard Li is cofounder and CEO of Datawire. Datawire provides several popular open source tools to accelerate Kubernetes development, including Telepresence (local development) and the Ambassador API Gateway. Richard is a veteran of multiple technology startups including Duo Security, Rapid7, and Red Hat. He is a recognized Kubernetes and microservices expert and has spoken at numerous conferences including ApacheCon, the Microservices Practitioner Summit, KubeCon, and O’Reilly Velocity. He holds both a BS and MEng in computer science from MIT.
