Key Takeaways
- A proxy is usually deployed in the middle of two isolated networks and is responsible for data transfers from one side to the other.
- First-generation proxies are not powerful enough to meet the needs and requirements of current technology trends.
- Developers are always seeking more functionality without sacrificing the speed and reliability of previous generations of proxies.
- New generations of proxies are required to support customization, extension, and development of new functionality through easy-to-use administration panels, REST interfaces, or programming via built-in scripting support.
- For proxies to be labeled as programmable proxies, they need to fulfill several promises and goals, including support for extending core features, dynamic logic, and more.
- Programmable proxies should provide an easier interface to integrate with external systems or to fit into larger management systems.
A question that often gets asked is "What is a programmable proxy, and why do I need one?" This article tries to answer it from different perspectives. We will start with a brief definition of what a proxy is, then discuss how proxies evolved through different stages, explaining what needs they responded to and what benefits they offered at each stage. Finally, we will discuss several aspects of programmability and summarize why we need a programmable proxy.
What is a Proxy?
A proxy server is usually deployed between two isolated networks and is responsible for transferring data from one side to the other so that they appear as a single network. In its simplest form, a proxy is a gateway between a user and the internet, and it has existed since the birth of computer networks. A proxy is more than a network connector, though: it also enables additional functionality and use cases such as:
- Routing. The proxy forwards data to different destinations according to the characteristics of the data transmitted.
- Load balancing. During forwarding, data is distributed to different destinations to improve throughput and avoid single points of failure. Layer 4 or Layer 7 load balancing is one of the use cases of proxy servers.
- Failover. When forwarding to a given destination fails, the proxy can choose an alternate destination, providing uninterrupted service to the requester.
- Access control. A proxy can decide what traffic is allowed through and what needs to be blocked. Web Application Firewalls (WAF) are a typical example.
- Identity management. Access control is often based on identity information, so proxies often have identity management capabilities as well.
- Network acceleration. A proxy accelerates network access by caching data.
- Metrics collection. A proxy captures traffic statistics and reports summaries to Network Performance Monitoring (NPM) software for network optimization and planning.
- Information security. In addition to access control, proxies can also be used for security auditing, TLS/SSL offloading, and data encryption to meet security requirements.
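Several of the capabilities above, notably routing, load balancing, and failover, boil down to one decision: given a piece of traffic, pick a healthy destination. As a minimal, hedged sketch (the class name, target addresses, and health-marking API are invented for illustration, not taken from any particular proxy):

```python
import itertools

class LoadBalancer:
    """Round-robin target selection with simple failover (illustrative sketch)."""

    def __init__(self, targets):
        self.targets = list(targets)
        self._cycle = itertools.cycle(self.targets)
        self.down = set()  # targets currently marked unhealthy

    def pick(self):
        # Try each target at most once per call, skipping unhealthy ones.
        for _ in range(len(self.targets)):
            t = next(self._cycle)
            if t not in self.down:
                return t
        raise RuntimeError("no healthy upstream available")

    def mark_down(self, target):
        # Failover: exclude a target after a failed forward or health check.
        self.down.add(target)

lb = LoadBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
first, second = lb.pick(), lb.pick()  # round-robin across healthy targets
lb.mark_down("10.0.0.3:8080")         # simulate a failed health check
```

Real proxies layer health checks, weights, and connection pooling on top of this core selection loop, but the routing decision itself is this small.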
Proxies working at layers 4 and 7 of the ISO/OSI model are sometimes referred to as "routing mode" proxies. Most proxy servers are available as open-source software and account for the majority of network infrastructure software, providing specialized functions in different domains, such as proxies for specific protocols, proxies for load balancing, proxies for cache acceleration, and so on.
Proxy Software Evolution
Proxy servers have evolved through different stages of development:
Configuration-file era
This generation of proxy is entirely configuration based. The user sets a number of parameters, configures rules in a configuration file, and then starts the service process to execute those rules.
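To illustrate the model (the file format and section names here are invented, not any particular proxy's syntax): the entire behavior lives in a static file, parsed once at startup, and changing anything means editing the file and restarting the process.

```python
import configparser

# Hypothetical configuration: one [route:*] section per forwarding rule.
CONFIG = """
[route:web]
listen = 0.0.0.0:80
upstream = 10.0.0.5:8080

[route:api]
listen = 0.0.0.0:8081
upstream = 10.0.0.6:9000
"""

def load_routes(text):
    """Parse static forwarding rules; the process must restart to pick up changes."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    return {
        name.split(":", 1)[1]: (cp[name]["listen"], cp[name]["upstream"])
        for name in cp.sections()
        if name.startswith("route:")
    }

routes = load_routes(CONFIG)
```

Everything the proxy can do is fixed by the parameters the configuration schema happens to expose; there is no way to express "if the request looks like X, do Y" beyond what the authors anticipated.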
Configuration DSL era
Static configuration files make it hard to express complex logic, so many proxies introduced thin scripting capabilities on top of configuration files. These are commonly referred to as "configuration languages" or domain-specific languages (DSLs), such as HAProxy's ACLs or Varnish's VCL.
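What a DSL adds over a static file is conditional matching: rules that inspect the traffic and pick a backend. The toy evaluator below is loosely inspired by HAProxy-style ACLs, but the rule syntax (`path_beg`, `hdr_host`) and data shapes are invented for illustration:

```python
def match_acl(rule, request):
    """Evaluate one toy ACL rule, e.g. 'path_beg /api' or 'hdr_host example.com'."""
    op, arg = rule.split(None, 1)
    if op == "path_beg":
        return request["path"].startswith(arg)
    if op == "hdr_host":
        return request["headers"].get("Host") == arg
    raise ValueError(f"unknown ACL operator: {op}")

def route(request, rules):
    """Return the backend of the first matching rule, else a default."""
    for rule, backend in rules:
        if match_acl(rule, request):
            return backend
    return "default_backend"

rules = [
    ("path_beg /api", "api_servers"),
    ("hdr_host static.example.com", "cdn"),
]
req = {"path": "/api/v1/users", "headers": {"Host": "example.com"}}
```

Note the limitation that motivates the next era: each operator must be built into the proxy. Users combine predefined predicates; they cannot define new ones.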
Scripting language era
As the logic becomes more complex, it gets more difficult to express it via configuration languages. At the same time, when the number of distinct configuration languages used in the same network reaches a certain threshold, their management becomes difficult.
Using shell scripts, for example, one can write simple logic, but when shell code reaches a certain level of complexity, it is often required to step up to more structured scripting languages like Perl or Python.
Such languages bring the convenience of scripting and the structural advantages of a full-fledged programming language. Examples of this are OpenResty (Nginx + Lua) and Nginx Plus (Nginx + NJS). This category also includes proxy servers implemented in application programming languages, such as the Node-based StrongLoop Microgateway and the Java-based Spring Cloud Gateway, which often have scripting capabilities of their own.
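The common pattern in this era is a proxy core that exposes named processing phases into which user scripts hook, conceptually similar to OpenResty's per-phase Lua handlers. Sketched here in Python, with the phase names and request/response shapes invented for illustration:

```python
class ScriptableProxy:
    """A proxy core exposing 'request' and 'response' hook phases (sketch)."""

    def __init__(self):
        self.hooks = {"request": [], "response": []}

    def on(self, phase, fn):
        # Users register arbitrary script logic, not just predefined predicates.
        self.hooks[phase].append(fn)

    def handle(self, request):
        for fn in self.hooks["request"]:
            request = fn(request)  # each hook may rewrite or reject the request
            if request.get("blocked"):
                return {"status": 403}
        # Stand-in for the actual upstream call.
        response = {"status": 200, "path": request["path"]}
        for fn in self.hooks["response"]:
            response = fn(response)
        return response

proxy = ScriptableProxy()
proxy.on("request", lambda r: {**r, "blocked": r["path"].startswith("/admin")})
proxy.on("response", lambda r: {**r, "headers": {"X-Proxy": "demo"}})
```

Because hooks are ordinary functions, users get the full structuring tools of the host language (modules, tests, libraries) instead of a fixed rule grammar.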
Cluster era
Scripting languages address the complexity of modularizing and structuring complex logic. A further requirement at this point is to integrate proxies with other administrative control tools, hence the need for a REST or similar interface. Indeed, an external control plane can use that interface to dynamically update the logic in the scripts.
At the same time, the use of proxies has moved from single instances to clusters of proxies. Proxy software like Envoy and the OpenResty-based Kong often supports clustering natively, implemented in a centralized or shared way (via an RDBMS, etc.), while also providing REST administrative interfaces to manage configurations.
For proxies prior to this era, cluster management is generally possible through configuration management tools, which can also expose REST interfaces; for example, Ansible + Nginx can implement capabilities similar to those of cluster-era proxies. However, such schemes require more components, whereas cluster-era proxies remove the burden of managing these moving parts and are therefore preferred.
Cloud era
In the Cloud era, proxies are deployed in a distributed manner. The most common scenario is to deploy one proxy for each application process, following the Sidecar Proxy pattern.
In clustering mode, there are usually different configurations and policies for different upstream services, such as different authentication modes and access control mechanisms. As the number of upstream services grows, their configurations remain logically separated but run physically in the same proxy process. This scheme has some disadvantages: more logic running in the same process brings more complexity. Additionally, different upstream services share resources such as CPU and memory, affecting each other. If the script of one upstream service has a security vulnerability, the configurations of other upstream services may be leaked, resulting in security risks.
In the Cloud era, proxy processes for each upstream service are independent and isolated from each other. The adoption of distributed proxies opened the door to using different rules and policies for different upstream services, that is, to multi-tenant capabilities.
The various upstream services not only have logically independent rules and policies; they are also physically isolated, allowing granular management at the process and interface level. This isolation is a strong requirement in a multi-tenant environment: different upstream services belong to different tenants, and tenants should not affect each other or know each other's configurations.
Service meshes are representative of this era. They are composed of two key architectural components, a data plane and a control plane. Contrary to what the name suggests, a service mesh is not a "mesh of services" but a mesh of proxies that services can plug into to completely abstract the network away. Typical examples are Istio + Envoy and Linkerd + Linkerd proxy.
Created by one of the authors of this article, Pipy is a product of this era and falls under this category. Pipy is an open-source, lightweight, high-performance, modular, programmable network proxy for cloud, edge, and IoT. Pipy is ideal for a variety of use cases ranging from (but not limited to) edge routers, load balancers and proxy solutions, API gateways, static HTTP servers, and service mesh sidecars, among other applications. Pipy is in active development and maintained by full-time committers and contributors. Though still at an early version, it has been battle-tested and is in production use by several commercial clients.
From the discussion above, it is clear that each stage is an improvement over the previous one.
Proxy Software Requirements and their Evolution
Let’s take another look at the evolution of proxies considering how their requirements evolved.
Configuration-file era
The first generation of proxies mainly implemented gateway functionality between users and services and provided basic configurable capabilities. The real-time transmission of massive amounts of data requires high throughput, low latency, and low resource usage. Like all software, proxies are also required to support modularity and extensibility.
Proxy software at this stage was mainly developed in C, as were its extension modules, which get loaded dynamically when the process starts.
To summarize, proxy requirements at this stage were connectivity (network capabilities), ease of use (configuration via files), reliability (a requirement for devices interconnecting networks), high performance, and scalability.
Configuration DSL era
The second generation of proxies brought further improvements in extensibility and flexibility, such as dynamic data acquisition and the ability to make logical decisions based on the acquired data. The introduction of DSL-based scripts further enhanced usability. Support for combinatorial logic and dynamic data retrieval provided flexibility while improving scalability.
Scripting language era
The main improvements of 3rd generation proxies over 2nd generation proxies are manageability, developer friendliness, and programmability.
Developer productivity and the complexity of maintaining massive scripts required this generation of proxies to use a structured scripting language, while keeping the performance, low resource utilization, and other core capabilities of the previous generation.
Scripting capabilities are widely used mainly because it is difficult to develop and maintain extensions in C. Indeed, scripting languages are easier to learn and provide faster turnaround than compiled languages.
The use of structured and modular scripting languages ushered in the era of programmable proxies and required proxies to provide two levels of programmability: C for the development of core modules, and scripts to program dynamic logic. In other words, programmable proxies gave their users the power to develop both core modules and dynamic logic.
Cluster era
The fourth generation of proxies starts with cluster support, which improves manageability.
Thanks to REST interfaces, proxies became part of the network infrastructure implementation and a starting point for infrastructure as code. Besides improving manageability, REST interfaces also simplify administration. External interfaces are an important feature of programmable proxies, with REST being the most common form.
At this point, programmability consists of three layers: programmable core modules, programmable dynamic logic, and programmable external interfaces. The emergence of proxy server clusters reflects a change of perspective on scalability, from functionality extension to resource expansion (users can spread functionality across multiple instances instead of writing a monolithic script). The emergence of REST interfaces provides the technical foundation for self-service and managed services, e.g., a control panel for configuration and control.
Cloud era
The evolution of the fifth generation of proxies is driven by the popularity and rapid development of cloud computing, bringing the requirements of elasticity, self-service, multitenancy, isolation, and metering.
If the fourth generation of proxies was made for system administrators, the fifth generation is made for cloud services. While fully maintaining the characteristics of previous generations of proxy software, fifth-generation proxies became cloud-ready.
With the expansion of cloud computing to the edge, the fifth generation of proxies needs to support heterogeneous hardware, heterogeneous software, and low energy consumption to optimize the integration of the cloud and the edge.
The fifth generation of proxies also shows advances in programmability: from the core modules, dynamic logic, and external interfaces we saw earlier to cloud capabilities, with support for distribution, multi-tenancy, and metering. Metering is a derived requirement of multi-tenancy, which requires isolation on the one hand and, on the other, that resources can be measured at the smallest possible granularity.
Let's summarize the above discussion in tabular form, where rows correspond to specific requirements and columns to proxies at different stages. For each evolution stage, we also provide typical examples of well-known software in parentheses. In each cell, we use stars to indicate whether such capabilities are available and to what extent (1 to 5 stars, with 5 stars for full support and 1 star for basic support).
| SN | Requirement | Configuration (squid, httpd, nginx) | Configuration Language (varnish, haproxy) | Scripting support (nginx+lua, nginx+js) | Clustering (kong, envoy) | Cloud (istio+envoy, linkerd, pipy) | Remarks |
|----|-------------|-------------------------------------|-------------------------------------------|------------------------------------------|--------------------------|-------------------------------------|---------|
| 1 | Connectivity | * * * * * | * * * * * | * * * * * | * * * * * | * * * * * | Connectivity in the cloud era began to use kernel technologies such as iptables and eBPF; previously, proxies were implemented only as user-space processes. |
| 2 | Reliability | * * * * * | * * * * * | * * * * * | * * * * * | * * * * * | Reliability has always been the most fundamental capability of proxy servers. |
| 3 | High Performance | * * * | * * * * | * * * * * | * * * * * | * * * * * | Performance includes throughput, latency, error rate, and deviation from the mean. Common latency metrics are P99, P999, and so on. Early proxy software suffered from a long-tail effect, so its metrics above P99 are not as good as those of later software. Proxies with high-performance scripting often outperform their predecessors when returning the same content, and proxies that avoid long tails are more stable (less deviation from the mean) at the same performance level. |
| 4 | Flexibility | * | * * | * * * | * * * * | * * * * * | Compared with the fourth generation, fifth-generation proxies significantly enhance multi-protocol support, so we give this generation a five-star rating. Moreover, the processing model of the fifth generation can adapt to various protocols and is a better fit than that of the fourth generation. |
| 5 | Scalability | * | * * | * * * | * * * | * * * * | Similarly to flexibility, fifth-generation proxies support multiple protocols and provide easy mechanisms for extending core functionality or developing custom application-layer (Layer 7) protocol extensions, so we give them one star more than the fourth generation. |
| 6 | Hardware Compatibility | * * * * | * * * * | * * * * | * * * * | * * * * | Proxies developed in C or C++ generally have better hardware compatibility and a more active community for migrating to new hardware architectures. Proxies developed in Rust, Go, and Lua have been relatively slow to migrate. |
| 7 | System Compatibility | * * * | * * * | * * * * | * * * * | * * * * * | System compatibility includes two aspects: the operating system and the cloud platform. In terms of operating system compatibility, each generation of proxies is similar. In terms of cloud platform compatibility, however, the fourth and fifth generations do better, and the significant difference between them lies in the fifth generation's ability to support multi-tenancy. |
| 8 | Ease of Management | * * | * * | * * | * * * | * * * * | Ease of management concerns the system operation and administrator roles. The first and second generations mainly use configuration files, with configuration management tools providing automatic and batch management on top of them. The third generation additionally requires managing script files, but in essence there is no significant difference from the first two generations. The fourth generation provides REST interfaces, which greatly improve ease of management. In addition to REST, the fifth generation usually provides a cloud-based control plane to manage proxies, plus multiple external interfaces for other management requirements such as monitoring, auditing, and statistics. |
| 9 | Ease of Use | * | * | * | * * | * * * | The primary users of the first three generations of proxies are ops and sysadmins. In the fourth generation, administrators began to provide some functions to users, and the as-a-service model began to appear. The fifth generation takes more user scenarios into account and provides more tenant capabilities. |
| 10 | Ease of Development | * | * * | * * * | * * * * | * * * * * | Development around proxies includes two aspects: one inside the proxy, to implement functionality; the other outside the proxy, to implement management capabilities. The first three generations provide an interface for internal development; the last two generations provide both internal and external interfaces. The significant improvement of the fifth generation over the fourth is the cloud interface. |
| 11 | Core interface is programmable | * | * | * | * | * | Each generation of proxies provides the ability to extend core interfaces, but these interfaces are too low-level and difficult to master. |
| 12 | Functionality extension is programmable | * | * * | * * * | * * * * | * * * * * | Providing the ability to extend functions more efficiently is part of the process that gets better with each generation of proxies. It is the core metric of programmable proxies. |
| 13 | Protocol extensions are programmable | | | | * * | * * * | The first three generations mainly target a single protocol or a fixed set of protocols. Starting with the fourth generation, users began to seek support for multiple protocols and custom protocols. In the fifth generation, protocol extension is considered a core capability in the design. |
| 14 | Modular scripting | | | * * * | * * * * | * * * * | Third-generation proxies began paying more attention to script structuring. The fourth and fifth generations attempt to allow for more structured programming, such as Envoy's attempt to provide multilingual support through WASM, or Pipy's high-performance JS scripting for better structuring. |
| 15 | Configuration management is programmable | | | | * * | * * * | The first three generations of proxy configuration mainly target ops personnel, and external configuration management tools are based on this premise. The fourth generation supports REST management interfaces. The fifth generation further provides a standard cloud interface for configuration management. |
| 16 | Resource extensions are programmable | * | * | * | * * | * * * * | For the first three generations, proxy capacity expansion mainly meant increasing the number of threads or processes. The fourth generation provides scale-out capability for processes, known as clustering. On top of that, the fifth generation provides horizontal scaling on the one hand and, on the other, capabilities under constrained resources to support more fine-grained metering and billing. That is, it supports not only scaling up but also scaling down, and all of these capabilities expose programming interfaces. |
| 17 | Tenant extension is programmable | | | | | * * * | The cloud emerged around the same time as the fourth generation of proxies, and multi-tenancy, a core feature of the cloud, is not well supported in that generation. The fifth generation is designed on the premise of cloud computing, considering and providing tenants with the possibility of programming their own extensions. |
Rows #11 to #17 in the table above are the specific aspects of a programmable proxy. These aspects also constitute the answer to the question of why to use a programmable proxy:
- The internal functions of the proxy can be extended, including underlying core capabilities, supported protocols, and Layer 7 processing capabilities (forwarding, routing, decision-making, access control, etc.). Layer 7 processing requires a more convenient way of programming, that is, scripting and structured programming.
- Proxies need to provide external interfaces to integrate into larger management systems (such as cloud platforms), including configuration management, resource management, and so on.
- Proxies need to provide extensible capabilities for different roles, including operations, administrators, resource providers, and tenants, all of which require programmability to some extent.
- Also, like any programmable component, a programmable proxy requires accompanying documentation, development manuals, code management, dependency management, build and deployment tools, and preferably a visual development and debugging environment. Only when these requirements are fully met can users efficiently manage network traffic and the services producing it.
Summary
In this article, we tried to provide our answer to what a programmable proxy is. To this aim, we started from the definition of what a proxy is and what its key characteristics are. Then we expanded our discussion to include the evolution proxies have gone through, explaining the features and functionalities that were added at each stage. Finally, we summarized our discussion of proxy features by splitting them into 17 different categories and ranking each generation of proxies. This classification allowed us to identify the key characteristics and attributes required for a proxy to be labeled as programmable.